From david at ar.media.kyoto-u.ac.jp Sun Jul 1 07:50:48 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sun, 01 Jul 2007 20:50:48 +0900 Subject: [Numpy-discussion] Building numpy 1.0.3-2 on Linux 2.6.8 i686 (Debian 3.1) In-Reply-To: References: <46863ADC.6010409@ar.media.kyoto-u.ac.jp> Message-ID: <46879518.8020809@ar.media.kyoto-u.ac.jp> Michael Hoffman wrote: > David Cournapeau wrote: >> Michael Hoffman wrote: >>> Hi. I have been trying to build NumPy on a 32-bit Linux box using python >>> setup.py build. I received the following errors: > >> [...] > > >> Which distribution are you building on ? > > Which Linux distribution? Debian 3.1. Mmh, 3.1 is sarge, which has python2.4 as default, right ? There may be a mismatch somewhere... Particularly since you have two python include path (/nfs/acari/mh5/include/python2.5 and /software/python-2.5/include/python2.5) which are non standart. Are you using those on purpose ? Why not the debian python devel package (do you use your own compiled python ?) David From john.c.cartwright at comcast.net Sun Jul 1 18:10:07 2007 From: john.c.cartwright at comcast.net (John Cartwright) Date: Sun, 1 Jul 2007 16:10:07 -0600 Subject: [Numpy-discussion] problem compiling v.1.0.3 on a Mac In-Reply-To: <4686C04A.4020001@gmail.com> References: <4D239651-778E-4CB6-A0CD-A0E48A53B6FE@comcast.net> <4686C04A.4020001@gmail.com> Message-ID: <6D0CD51C-BC14-4D63-B139-ED24D049CF5B@comcast.net> Hello Robert, just a followup - I found that I could compile with a stand-alone python 2.5 installation, but could not with a 2.5 framework install. Same problem as listed below. Thanks for any help that you can provide. --john On Jun 30, 2007, at 2:42 PM, Robert Kern wrote: > John Cartwright wrote: >> Hello All, >> >> I'm having trouble compile on a Mac 10.4.10. It seems as if it's >> not finding /usr/include: >> >> ... >> from /Library/Frameworks/Python.framework/Versions/ >> 2.4/include/python2.4/Python.h:81, >> from _configtest.c:2: >> /usr/include/stdarg.h:4:25: error: stdarg.h: No such file or >> directory >> ... >> >> I tried setting the "CFLAG=-I/usr/include", but w/o success. Can >> anyone help me? > > It should build out-of-box. Is this the standard Python > distribution from > www.python.org? Check your environment variables. You should not > have CFLAGS or > LDFLAGS; these will overwrite the flags that are necessary for > building Python > extension modules. > > If that doesn't work, please give us the complete output of > > $ python setup.py -v build > > Thanks. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a > harmless enigma > that is made terrible by our own mad attempt to interpret it as > though it had > an underlying truth." 
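For anyone chasing a mismatch like the ones above, a quick way to confirm which interpreter and header directory distutils will actually build against is a small check run with the same python used for setup.py (only a sketch, nothing more):

import sys
from distutils import sysconfig

print(sys.executable)                      # the interpreter actually running setup.py
print(sysconfig.get_python_inc())          # the include directory distutils hands to the compiler
print(sysconfig.get_config_var('CFLAGS'))  # the flags recorded when Python itself was built

If the include directory printed here is not the one showing up in the failing compile lines, the build is picking up a different Python than intended.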
> -- Umberto Eco > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From robert.kern at gmail.com Sun Jul 1 18:19:54 2007 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 01 Jul 2007 17:19:54 -0500 Subject: [Numpy-discussion] problem compiling v.1.0.3 on a Mac In-Reply-To: <6D0CD51C-BC14-4D63-B139-ED24D049CF5B@comcast.net> References: <4D239651-778E-4CB6-A0CD-A0E48A53B6FE@comcast.net> <4686C04A.4020001@gmail.com> <6D0CD51C-BC14-4D63-B139-ED24D049CF5B@comcast.net> Message-ID: <4688288A.6010403@gmail.com> John Cartwright wrote: > Hello Robert, > > just a followup - I found that I could compile with a stand-alone > python 2.5 installation, but could not with a 2.5 framework install. > Same problem as listed below. By framework install, you mean the distribution from www.python.org? or did you compile it yourself? > Thanks for any help that you can provide. I'd like to, but you will have to provide the information I asked for. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From john.c.cartwright at comcast.net Sun Jul 1 18:28:02 2007 From: john.c.cartwright at comcast.net (John Cartwright) Date: Sun, 1 Jul 2007 16:28:02 -0600 Subject: [Numpy-discussion] problem compiling v.1.0.3 on a Mac In-Reply-To: <4688288A.6010403@gmail.com> References: <4D239651-778E-4CB6-A0CD-A0E48A53B6FE@comcast.net> <4686C04A.4020001@gmail.com> <6D0CD51C-BC14-4D63-B139-ED24D049CF5B@comcast.net> <4688288A.6010403@gmail.com> Message-ID: Thanks for the prompt reply! By "framework install" I mean the Mac OS X universal installer at www.python.org. By "stand-alone" I mean compiled from the source tarball at the same site. --john On Jul 1, 2007, at 4:19 PM, Robert Kern wrote: > John Cartwright wrote: >> Hello Robert, >> >> just a followup - I found that I could compile with a stand-alone >> python 2.5 installation, but could not with a 2.5 framework install. >> Same problem as listed below. > > By framework install, you mean the distribution from > www.python.org? or did you > compile it yourself? > >> Thanks for any help that you can provide. > > I'd like to, but you will have to provide the information I asked for. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a > harmless enigma > that is made terrible by our own mad attempt to interpret it as > though it had > an underlying truth." > -- Umberto Eco > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From robert.kern at gmail.com Sun Jul 1 18:38:27 2007 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 01 Jul 2007 17:38:27 -0500 Subject: [Numpy-discussion] problem compiling v.1.0.3 on a Mac In-Reply-To: References: <4D239651-778E-4CB6-A0CD-A0E48A53B6FE@comcast.net> <4686C04A.4020001@gmail.com> <6D0CD51C-BC14-4D63-B139-ED24D049CF5B@comcast.net> <4688288A.6010403@gmail.com> Message-ID: <46882CE3.2060702@gmail.com> John Cartwright wrote: > Thanks for the prompt reply! By "framework install" I mean the Mac > OS X universal installer at www.python.org. By "stand-alone" I mean > compiled from the source tarball at the same site. Okay, now can you give us the output of $ python setup.py -v build ? 
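For anyone hitting the same failure, a quick first check while gathering that output is whether anything in the environment is overriding the flags distutils needs; a minimal sketch, looking only at CFLAGS and LDFLAGS, the two variables mentioned earlier in the thread:

import os

for var in ('CFLAGS', 'LDFLAGS'):
    if var in os.environ:
        # any value here replaces the flags needed to build extension modules
        print('%s is set to %r' % (var, os.environ[var]))
    else:
        print('%s is not set' % var)

Unsetting those variables in the shell before re-running the build is usually enough to rule them out.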
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From john.c.cartwright at comcast.net Sun Jul 1 19:31:25 2007 From: john.c.cartwright at comcast.net (John Cartwright) Date: Sun, 1 Jul 2007 17:31:25 -0600 Subject: [Numpy-discussion] problem compiling v.1.0.3 on a Mac In-Reply-To: <46882CE3.2060702@gmail.com> References: <4D239651-778E-4CB6-A0CD-A0E48A53B6FE@comcast.net> <4686C04A.4020001@gmail.com> <6D0CD51C-BC14-4D63-B139-ED24D049CF5B@comcast.net> <4688288A.6010403@gmail.com> <46882CE3.2060702@gmail.com> Message-ID: I tried to send that last night, but the message was so large that it's waiting for approval. Here's the first part of the output: Running from numpy source directory. non-existing path in 'numpy/distutils': 'site.cfg' F2PY Version 2_3844 blas_opt_info: ( library_dirs = /Library/Frameworks/Python.framework/Versions/2.5/ lib:/usr/local/lib:/usr/lib ) FOUND: extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] define_macros = [('NO_ATLAS_INFO', 3)] extra_compile_args = ['-faltivec', '-I/System/Library/Frameworks/ vecLib.framework/Headers'] lapack_opt_info: ( library_dirs = /Library/Frameworks/Python.framework/Versions/2.5/ lib:/usr/local/lib:/usr/lib ) FOUND: extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] define_macros = [('NO_ATLAS_INFO', 3)] extra_compile_args = ['-faltivec'] running build running config_cc unifing config_cc, config, build_clib, build_ext, build commands -- compiler options running config_fc unifing config_fc, config, build_clib, build_ext, build commands -- fcompiler options running build_src building py_modules sources building extension "numpy.core.multiarray" sources Generating build/src.macosx-10.3-fat-2.5/numpy/core/config.h new_compiler returns distutils.unixccompiler.UnixCCompiler new_fcompiler returns numpy.distutils.fcompiler.nag.NAGFCompiler customize NAGFCompiler NAGFCompiler instance properties: archiver = ['ar', '-cr'] compile_switch = '-c' compiler_f77 = ['f95', '-fixed', '-O4', '-target=native'] compiler_f90 = ['f95', '-O4', '-target=native'] compiler_fix = ['f95', '-fixed', '-O4', '-target=native'] libraries = [] library_dirs = [] linker_so = ['f95', '-unsharedf95', '-Wl,-bundle,- flat_namespace,- undefined,suppress'] object_switch = '-o ' ranlib = ['ranlib'] version = None version_cmd = ['f95', '-V'] customize AbsoftFCompiler AbsoftFCompiler instance properties: archiver = ['ar', '-cr'] compile_switch = '-c' compiler_f77 = ['f77', '-N22', '-N90', '-N110', '-f', '-s', '-O'] compiler_f90 = ['f90', '-YCFRL=1', '-YCOM_NAMES=LCS', '- YCOM_PFX', '- YEXT_PFX', '-YCOM_SFX=_', '-YEXT_SFX=_', '- YEXT_NAMES=LCS', '-s', '-O'] compiler_fix = ['f90', '-YCFRL=1', '-YCOM_NAMES=LCS', '- YCOM_PFX', '- YEXT_PFX', '-YCOM_SFX=_', '-YEXT_SFX=_', '- YEXT_NAMES=LCS', '-f', 'fixed', '-YCFRL=1', '- YCOM_NAMES=LCS', '-YCOM_PFX', '-YEXT_PFX', '- YCOM_SFX=_', '-YEXT_SFX=_', '-YEXT_NAMES=LCS', '-s', '-O'] libraries = ['fio', 'f90math', 'fmath', 'U77'] library_dirs = [] linker_so = ['f90', '-K', 'shared'] object_switch = '-o ' ranlib = ['ranlib'] version = None version_cmd = ['f90', '-V -c /tmp/tmpYPRTHr__dummy.f -o /tmp/tmpYPRTHr__dummy.o'] customize IbmFCompiler IbmFCompiler instance properties: archiver = ['ar', '-cr'] compile_switch = '-c' compiler_f77 = ['xlf', '-qextname', '-O5'] compiler_f90 = ['xlf90', '-qextname', '-O5'] compiler_fix = ['xlf90', '-qfixed', '-qextname', '-O5'] libraries = [] 
library_dirs = [] linker_so = ['xlf95', '-Wl,-bundle,-flat_namespace,- undefined,suppress'] object_switch = '-o ' ranlib = ['ranlib'] version = None version_cmd = ['xlf', '-qversion'] customize GnuFCompiler Could not locate executable g77 Could not locate executable f77 GnuFCompiler instance properties: archiver = ['ar', '-cr'] compile_switch = '-c' compiler_f77 = ['f77', '-g', '-Wall', '-fno-second-underscore', '- fPIC', '-O2', '-funroll-loops', '-mcpu=7450', '- mtune=7450'] compiler_f90 = None compiler_fix = None libraries = ['g2c', 'cc_dynamic'] library_dirs = [] linker_exe = ['f77', '-g', '-Wall'] linker_so = ['f77', '-g', '-Wall', '-undefined', 'dynamic_lookup', ' -bundle'] object_switch = '-o ' ranlib = ['ranlib'] version = None version_cmd = ['f77', '--version'] customize Gnu95FCompiler Could not locate executable f95 Gnu95FCompiler instance properties: archiver = ['ar', '-cr'] compile_switch = '-c' compiler_f77 = ['/usr/local/bin/gfortran', '-Wall', '-ffixed- form', '- fno-second-underscore', '-fPIC', '-O3', '- funroll-loops', '-mcpu=7450', '-mtune=7450'] compiler_f90 = ['/usr/local/bin/gfortran', '-Wall', '-fno-second- underscore', '-fPIC', '-O3', '-funroll-loops', '- mcpu=7450', '-mtune=7450'] compiler_fix = ['/usr/local/bin/gfortran', '-Wall', '-ffixed- form', '- fno-second-underscore', '-Wall', '-fno-second- underscore', '-fPIC', '-O3', '-funroll-loops', '-mcpu=7450', '- mtune=7450'] libraries = ['gfortran'] library_dirs = ['/usr/local/lib/gcc/powerpc-apple- darwin8.9.0/4.3.0'] linker_exe = ['/usr/local/bin/gfortran', '-Wall'] linker_so = ['/usr/local/bin/gfortran', '-Wall', '-undefined', 'dynamic_lookup', '-bundle'] object_switch = '-o ' ranlib = ['ranlib'] version = LooseVersion ('4.3.0') version_cmd = ['/usr/local/bin/gfortran', '--version'] customize Gnu95FCompiler Could not locate executable f95 customize Gnu95FCompiler using config C compiler: gcc -arch ppc -arch i386 -isysroot /Developer/SDKs/ MacOSX10.4u.sdk -fno-strict-aliasing -Wno-long-double -no-cpp-precomp -mno-fused-madd -fno-common -dynamic -DNDEBUG -g -O3 compile options: '-I/Library/Frameworks/Python.framework/Versions/2.5/ include/python2.5 -Inumpy/core/src -Inumpy/core/include -I/Library/ Frameworks/Python.framework/Versions/2.5/include/python2.5 -c' gcc: _configtest.c In file included from _configtest.c:2: /Library/Frameworks/Python.framework/Versions/2.5/include/python2.5/ Python.h:18:20: error: limits.h: No such file or directory ... --john On Jul 1, 2007, at 4:38 PM, Robert Kern wrote: > John Cartwright wrote: >> Thanks for the prompt reply! By "framework install" I mean the Mac >> OS X universal installer at www.python.org. By "stand-alone" I mean >> compiled from the source tarball at the same site. > > Okay, now can you give us the output of > > $ python setup.py -v build > > ? > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a > harmless enigma > that is made terrible by our own mad attempt to interpret it as > though it had > an underlying truth." 
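The compile line above passes -isysroot /Developer/SDKs/MacOSX10.4u.sdk, so when the compiler reports 'limits.h: No such file or directory' it is worth checking whether the headers named in these errors are present under that SDK at all. A rough sketch (the SDK path is copied from the build output above; the rest is just illustration):

import os.path

sdk = '/Developer/SDKs/MacOSX10.4u.sdk'
for header in ('usr/include/limits.h', 'usr/include/stdarg.h'):
    path = os.path.join(sdk, header)
    print('%s exists: %s' % (path, os.path.exists(path)))

If either file is missing, the Xcode/SDK installation is the first thing to suspect rather than numpy itself.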
> -- Umberto Eco > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From mathewww at charter.net Sun Jul 1 19:49:42 2007 From: mathewww at charter.net (Mathew) Date: Sun, 01 Jul 2007 16:49:42 -0700 Subject: [Numpy-discussion] problem compiling v.1.0.3 on a Mac In-Reply-To: References: <4D239651-778E-4CB6-A0CD-A0E48A53B6FE@comcast.net> <4686C04A.4020001@gmail.com> <6D0CD51C-BC14-4D63-B139-ED24D049CF5B@comcast.net> <4688288A.6010403@gmail.com> <46882CE3.2060702@gmail.com> Message-ID: <46883D96.9080508@charter.net> John Cartwright wrote: > I tried to send that last night, but the message was so large that > it's waiting for approval. Here's the first part of the output: > Same for me. Here is the beginning of mine Running from numpy source directory. F2PY Version 2_3875 blas_opt_info: blas_mkl_info: ( library_dirs = /u/vento0/myeates/lib ) ( include_dirs = /u/vento0/myeates/include ) (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) libraries mkl,vml,guide not found in /u/vento0/myeates/lib NOT AVAILABLE atlas_blas_threads_info: Setting PTATLAS=ATLAS ( library_dirs = /u/vento0/myeates/lib ) (paths: ) (paths: ) (paths: /u/vento0/myeates/lib/libptf77blas.a) (paths: ) (paths: /u/vento0/myeates/lib/libptcblas.a) (paths: ) (paths: /u/vento0/myeates/lib/libatlas.a) Setting PTATLAS=ATLAS ( include_dirs = /u/vento0/myeates/include ) (paths: /u/vento0/myeates/include/atlas) (paths: /u/vento0/myeates/include/cblas.h) Setting PTATLAS=ATLAS ( library_dirs = /u/vento0/myeates/lib ) (paths: ) FOUND: libraries = ['ptf77blas', 'ptcblas', 'atlas'] library_dirs = ['/u/vento0/myeates/lib'] language = c include_dirs = ['/u/vento0/myeates/include'] new_compiler returns distutils.unixccompiler.UnixCCompiler customize GnuFCompiler find_executable('gfortran') Found executable /u/vento0/myeates/bin/gfortran gnu: no Fortran 90 compiler found find_executable('g77') Found executable /usr/bin/g77 gnu: no Fortran 90 compiler found exec_command('/u/vento0/myeates/bin/gfortran --version',) Retaining cwd: /u/vento0/myeates/numpy _preserve_environment([]) _update_environment(...) _exec_command_posix(...) Running os.system('( /u/vento0/myeates/bin/gfortran --version ; echo $? > /u/vento0/myeates/tmp/tmpLMMu5C/H-vRSk ) > /u/vento0/myeates/tmp/tmpLMMu5C/Yd7QL7 2>&1') _update_environment(...) exec_command(['gfortran', '-g', '-Wall', '-fno-second-underscore', '-fPIC', '-O2', '-funroll-loops', '-print-libgcc-file-name'],) Retaining cwd: /u/vento0/myeates/numpy _preserve_environment([]) _update_environment(...) _exec_command_posix(...) From a.h.jaffe at gmail.com Mon Jul 2 11:32:39 2007 From: a.h.jaffe at gmail.com (Andrew Jaffe) Date: Mon, 02 Jul 2007 16:32:39 +0100 Subject: [Numpy-discussion] how do I configure with gfortran In-Reply-To: <4686C05E.4090801@gmail.com> References: <46869505.3080209@charter.net> <4686A644.4040402@gmail.com> <4686B15C.4070507@charter.net> <4686B673.1030908@gmail.com> <4686BDC1.9070704@stsci.edu> <4686C05E.4090801@gmail.com> Message-ID: This is slightly off-topic, but probably of interest to anyone reading this thread: Is there any reason why we don't use --fcompiler=gfortran as an alias for --fcompiler=gfortran (and --fcompiler=g77 as an alias for --fcompiler=gnu, for that matter)? Those seem to me to be much more mnemonic names... 
Andrew Robert Kern wrote: > Christopher Hanley wrote: >> I have found that setting my F77 environment variable to gfortran is >> also sufficient. >> >> > setenv F77 gfortran >> > python setup.py install > > That might work okay for building scipy and other packages that only actually > have FORTRAN-77 code; however, I suspect that Matthew is trying to build > something with Fortran 90+. From a.h.jaffe at gmail.com Mon Jul 2 12:02:09 2007 From: a.h.jaffe at gmail.com (Andrew Jaffe) Date: Mon, 02 Jul 2007 17:02:09 +0100 Subject: [Numpy-discussion] how do I configure with gfortran In-Reply-To: References: <46869505.3080209@charter.net> <4686A644.4040402@gmail.com> <4686B15C.4070507@charter.net> <4686B673.1030908@gmail.com> <4686BDC1.9070704@stsci.edu> <4686C05E.4090801@gmail.com> Message-ID: I wrote: > This is slightly off-topic, but probably of interest to anyone reading > this thread: > > Is there any reason why we don't use --fcompiler=gfortran as an alias > for --fcompiler=gfortran (and --fcompiler=g77 as an alias for > --fcompiler=gnu, for that matter)? But, sorry, of course this should be: Is there any reason why we don't use --fcompiler=gfortran as an alias for --fcompiler=gnu95? > Those seem to me to be much more mnemonic names... Andrew From mathewww at charter.net Mon Jul 2 19:52:44 2007 From: mathewww at charter.net (Mathew) Date: Mon, 02 Jul 2007 16:52:44 -0700 Subject: [Numpy-discussion] gfortran config probs Message-ID: <46898FCC.9050206@charter.net> Just realized this might be important .... I'm using Python2.5.1 From barrywark at gmail.com Mon Jul 2 20:26:15 2007 From: barrywark at gmail.com (Barry Wark) Date: Mon, 2 Jul 2007 17:26:15 -0700 Subject: [Numpy-discussion] Buildbot for numpy In-Reply-To: <20070616081155.GC20362@mentat.za.net> References: <20070616081155.GC20362@mentat.za.net> Message-ID: I have the potential to add OS X Server Intel (64-bit) and OS X Intel (32-bit) to the list, if I can convince my boss that the security risk (including DOS from compile times) is minimal. I've compiled both numpy and scipy many times, so I'm not worried about resources for a single compile/test, but can any of the regular developers tell me about how many commits there are per day that will trigger a compile/test? About the more general security risk of running a buildbot slave, from my reading of the buildbot manual (not the source, yet), it looks like the slave is a Twisted server that runs as a normal user process. Is there any sort of sandboxing built into the buildbot slave or is that the responsibility of the OS (an issue I'll have to discuss with our IT)? On a side note, buildbot.scipy.org goes to the DSP lab, Univ. of Stellenbosch's home page, not the buildbot status page. Thanks, Barry On 6/16/07, Stefan van der Walt wrote: > Hi all, > > Short version > ============= > > We now have a numpy buildbot running at > > http://buildbot.scipy.org > > Long version > ============ > > Albert Strasheim and I set up a buildbot for numpy this week. For > those of you unfamiliar with The Buildbot, it is > > """ > ...a system to automate the compile/test cycle required by most > software projects to validate code changes. By automatically > rebuilding and testing the tree each time something has changed, build > problems are pinpointed quickly, before other developers are > inconvenienced by the failure. The guilty developer can be identified > and harassed without human intervention. 
By running the builds on a > variety of platforms, developers who do not have the facilities to > test their changes everywhere before checkin will at least know > shortly afterwards whether they have broken the build or not. Warning > counts, lint checks, image size, compile time, and other build > parameters can be tracked over time, are more visible, and are > therefore easier to improve. > > The overall goal is to reduce tree breakage and provide a platform to > run tests or code-quality checks that are too annoying or pedantic for > any human to waste their time with. Developers get immediate (and > potentially public) feedback about their changes, encouraging them to > be more careful about testing before checkin. > """ > > While we are still working on automatic e-mail notifications, the > system already provides valuable feedback -- take a look at the > waterfall display: > > http://buildbot.scipy.org > > If your platform is not currently on the list, please consider > volunteering a machine as a build slave. This machine will be > required to run the buildbot client, and to build a new version of > numpy whenever changes are made to the repository. (The machine does > not have to be dedicated to this task, and can be your own > workstation.) > > We'd like to thank Robert Kern, Jeff Strunk and Gert-Jan van Rooyen > who helped us to get the ball rolling, as well as Neilen Marais for > offering his workstation as a build slave. > > Regards > St?fan > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From haase at msg.ucsf.edu Tue Jul 3 02:35:31 2007 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Tue, 3 Jul 2007 08:35:31 +0200 Subject: [Numpy-discussion] arr.dtype.byteorder == '=' --- is this "good code" In-Reply-To: References: Message-ID: any comments !? On 6/25/07, Sebastian Haase wrote: > Hi, > Suppose I'm on a little-edian system. > Could I have a little-endian numpy array arr, where > arr.dtype.byteorder > would actually be "<" > instead of "=" !? > > There are two kinds of systems: little edian and big endian. > But there are three possible byteorder values: "<", ">" and "=" > > I assume that if arr.dtype.byteorder is "=" > then, even on a little endian system > the comparison arr.dtype.byteorder == "<" still fails !? > Or are the == and != operators overloaded !? > > Thanks, > Sebastian Haase > From faltet at carabos.com Tue Jul 3 03:46:18 2007 From: faltet at carabos.com (Francesc Altet) Date: Tue, 03 Jul 2007 09:46:18 +0200 Subject: [Numpy-discussion] arr.dtype.byteorder == '=' --- is this "good code" In-Reply-To: References: Message-ID: <1183448778.2867.5.camel@carabos.com> El dt 03 de 07 del 2007 a les 08:35 +0200, en/na Sebastian Haase va escriure: > any comments !? > > On 6/25/07, Sebastian Haase wrote: > > Hi, > > Suppose I'm on a little-edian system. > > Could I have a little-endian numpy array arr, where > > arr.dtype.byteorder > > would actually be "<" > > instead of "=" !? You can always use arr.dtype.str[0], which I think it always returns a '<', '>' or '|': In [2]:a=numpy.array([1]) In [3]:a.dtype.byteorder Out[3]:'=' In [4]:a.dtype.str Out[4]:' > There are two kinds of systems: little edian and big endian. > > But there are three possible byteorder values: "<", ">" and "=" > > > > I assume that if arr.dtype.byteorder is "=" > > then, even on a little endian system > > the comparison arr.dtype.byteorder == "<" still fails !? 
> > Or are the == and != operators overloaded !? No, this will fail. The == and != are not overloaded because dtype.byteorder is a pure python string: In [11]:type(a.dtype.byteorder) Out[11]: Cheers, -- Francesc Altet | Be careful about using the following code -- Carabos Coop. V. | I've only proven that it works, www.carabos.com | I haven't tested it. -- Donald Knuth From haase at msg.ucsf.edu Tue Jul 3 04:34:33 2007 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Tue, 3 Jul 2007 10:34:33 +0200 Subject: [Numpy-discussion] arr.dtype.byteorder == '=' --- is this "good code" In-Reply-To: <1183448778.2867.5.camel@carabos.com> References: <1183448778.2867.5.camel@carabos.com> Message-ID: Thanks for the reply. Rethinking the question ... wasn't there an attribute named something like: is_native() ?? The a.dtype.str[0] is certainly better than nothing ... just doesn't look very good ;-) -Sebastian On 7/3/07, Francesc Altet wrote: > El dt 03 de 07 del 2007 a les 08:35 +0200, en/na Sebastian Haase va > escriure: > > any comments !? > > > > On 6/25/07, Sebastian Haase wrote: > > > Hi, > > > Suppose I'm on a little-edian system. > > > Could I have a little-endian numpy array arr, where > > > arr.dtype.byteorder > > > would actually be "<" > > > instead of "=" !? > > You can always use arr.dtype.str[0], which I think it always returns a > '<', '>' or '|': > > In [2]:a=numpy.array([1]) > In [3]:a.dtype.byteorder > Out[3]:'=' > In [4]:a.dtype.str > Out[4]:' In [5]:a.dtype.str[0] > Out[5]:'<' > > > > There are two kinds of systems: little edian and big endian. > > > But there are three possible byteorder values: "<", ">" and "=" > > > > > > I assume that if arr.dtype.byteorder is "=" > > > then, even on a little endian system > > > the comparison arr.dtype.byteorder == "<" still fails !? > > > Or are the == and != operators overloaded !? > > No, this will fail. The == and != are not overloaded because > dtype.byteorder is a pure python string: > > In [11]:type(a.dtype.byteorder) > Out[11]: From faltet at carabos.com Tue Jul 3 04:36:52 2007 From: faltet at carabos.com (Francesc Altet) Date: Tue, 03 Jul 2007 10:36:52 +0200 Subject: [Numpy-discussion] arr.dtype.byteorder == '=' --- is this "good code" In-Reply-To: References: <1183448778.2867.5.camel@carabos.com> Message-ID: <1183451812.2867.29.camel@carabos.com> El dt 03 de 07 del 2007 a les 10:34 +0200, en/na Sebastian Haase va escriure: > Thanks for the reply. > Rethinking the question ... wasn't there an attribute named something like: > is_native() > ?? In [3]:a.dtype.isnative Out[3]:True :) -- Francesc Altet | Be careful about using the following code -- Carabos Coop. V. | I've only proven that it works, www.carabos.com | I haven't tested it. -- Donald Knuth From l.mastrodomenico at gmail.com Tue Jul 3 19:22:46 2007 From: l.mastrodomenico at gmail.com (Lino Mastrodomenico) Date: Wed, 4 Jul 2007 01:22:46 +0200 Subject: [Numpy-discussion] PEP 368: Standard image protocol and class Message-ID: [Sorry for the cross-posting, but I think this may be relevant for both NumPy and ndimage.] Hello everyone, I have submitted to the Python core developers a new PEP (Python Enhancement Proposal): http://www.python.org/dev/peps/pep-0368/ It proposes two things: * the creation of a standard image protocol/interface that can be hopefully implemented interoperably by most Python libraries that manipulate images; * the addition to the Python standard library of a basic implementation of the new protocol. 
The new image protocol is heavily inspired by a subset of the NumPy array interface, with a few image-specific additions and changes (e.g. the "size" attribute of an image is a tuple (width, height)). Of course it would be wonderful if these new image objects could interoperate out-of-the-box with numpy arrays and ndimage functions. There is another proposal that would be very useful for that, PEP 3118 by Travis Oliphant and Carl Banks: http://www.python.org/dev/peps/pep-3118/ The image PEP (368) currently lists only modes based on uint8/16/32 numbers, but the final version will probably also include modes based on float32 and float16 (converted in software to/from float32/64 when necessary). A discussion about it is currently going on in the python-3000 mailing list: Any suggestion, comment or criticism from the NumPy/SciPy people would be very useful, but IMHO keeping the discussion only on the python-3000 ML may be a good idea, to avoid duplicating answers on different mailing lists. Thanks in advance. -- Lino Mastrodomenico E-mail: l.mastrodomenico at gmail.com From wbaxter at gmail.com Tue Jul 3 21:24:19 2007 From: wbaxter at gmail.com (Bill Baxter) Date: Wed, 4 Jul 2007 10:24:19 +0900 Subject: [Numpy-discussion] [SciPy-dev] PEP 368: Standard image protocol and class In-Reply-To: References: Message-ID: I'm not subscribed to the main Python list, so I'll just ask here. It looks like the protocol doesn't support any floating point image formats, judging from the big table of formats in the PEP. These are becoming more important these days in computer graphics as a way to pass around high dynamic range images. OpenEXR is the main example of such a format: http://www.openexr.com/. I think a PEP that aims to be a generic image protocol should support at least 32 bit floats if not 64-bit doubles and 16 bit "Half"s used by some GPUs (and supported by the OpenEXR format). ---bb On 7/4/07, Lino Mastrodomenico wrote: > > [Sorry for the cross-posting, but I think this may be relevant for > both NumPy and ndimage.] > > Hello everyone, > > I have submitted to the Python core developers a new PEP (Python > Enhancement Proposal): > > http://www.python.org/dev/peps/pep-0368/ > > It proposes two things: > > * the creation of a standard image protocol/interface that can be > hopefully implemented interoperably by most Python libraries that > manipulate images; > > * the addition to the Python standard library of a basic > implementation of the new protocol. > > The new image protocol is heavily inspired by a subset of the NumPy > array interface, with a few image-specific additions and changes (e.g. > the "size" attribute of an image is a tuple (width, height)). > > Of course it would be wonderful if these new image objects could > interoperate out-of-the-box with numpy arrays and ndimage functions. > There is another proposal that would be very useful for that, PEP 3118 > by Travis Oliphant and Carl Banks: > > http://www.python.org/dev/peps/pep-3118/ > > The image PEP (368) currently lists only modes based on uint8/16/32 > numbers, but the final version will probably also include modes based > on float32 and float16 (converted in software to/from float32/64 when > necessary). > > A discussion about it is currently going on in the python-3000 mailing > list: > > > > Any suggestion, comment or criticism from the NumPy/SciPy people would > be very useful, but IMHO keeping the discussion only on the > python-3000 ML may be a good idea, to avoid duplicating answers on > different mailing lists. 
> > Thanks in advance. > > -- > Lino Mastrodomenico > E-mail: l.mastrodomenico at gmail.com > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rudolphv at gmail.com Wed Jul 4 04:03:21 2007 From: rudolphv at gmail.com (Rudolph van der Merwe) Date: Wed, 4 Jul 2007 10:03:21 +0200 Subject: [Numpy-discussion] Does Numpy's covariance function numpy.cov() work for complex data? Message-ID: <97670e910707040103l40e89b30w118d9e5c668ff6b2@mail.gmail.com> Does anyone know if Numpy's covariance calculation function, cov(), which is located in /numpy/lib/function_base.py calculate the covariance matrix of complex data correctly? I.e., does it implement something like , P = cov(X) = 1/(N-1) * \Sum_i ( X[:,i] * transpose(X[:,i].conj()) ) -- Rudolph van der Merwe From j.reid at mail.cryst.bbk.ac.uk Wed Jul 4 06:15:58 2007 From: j.reid at mail.cryst.bbk.ac.uk (John Reid) Date: Wed, 04 Jul 2007 11:15:58 +0100 Subject: [Numpy-discussion] Scipy release Message-ID: Hi, Is there going to be a scipy release anytime soon? I'm using numpy 1.0.3 with scipy 0.5.2 and I get these ugly warnings all the time: c:\apps\python25\lib\site-packages\scipy\misc\__init__.py:25: DeprecationWarning: ScipyTest is now called NumpyTest; please update your code test = ScipyTest().test These are the latest downloadable versions right? I'm beginning to wonder whether I shouldn't start building scipy myself. Thanks, John. From alexandre.fayolle at logilab.fr Wed Jul 4 06:48:40 2007 From: alexandre.fayolle at logilab.fr (Alexandre Fayolle) Date: Wed, 4 Jul 2007 12:48:40 +0200 Subject: [Numpy-discussion] Scipy release In-Reply-To: References: Message-ID: <20070704104840.GG5831@crater.logilab.fr> On Wed, Jul 04, 2007 at 11:15:58AM +0100, John Reid wrote: > Hi, > > Is there going to be a scipy release anytime soon? > > I'm using numpy 1.0.3 with scipy 0.5.2 and I get these ugly warnings all > the time: > > c:\apps\python25\lib\site-packages\scipy\misc\__init__.py:25: > DeprecationWarning: ScipyTest is now called NumpyTest; please update > your code > test = ScipyTest().test > > These are the latest downloadable versions right? I'm beginning to > wonder whether I shouldn't start building scipy myself. If these do really drive you mad, why not use the -W option to python ? Running you script with python -W ignore myscript.py -- Alexandre Fayolle LOGILAB, Paris (France) Formations Python, Zope, Plone, Debian: http://www.logilab.fr/formations D?veloppement logiciel sur mesure: http://www.logilab.fr/services Informatique scientifique: http://www.logilab.fr/science Reprise et maintenance de sites CPS: http://www.migration-cms.com/ -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 481 bytes Desc: Digital signature URL: From j.reid at mail.cryst.bbk.ac.uk Wed Jul 4 06:59:28 2007 From: j.reid at mail.cryst.bbk.ac.uk (John Reid) Date: Wed, 04 Jul 2007 11:59:28 +0100 Subject: [Numpy-discussion] Scipy release In-Reply-To: <20070704104840.GG5831@crater.logilab.fr> References: <20070704104840.GG5831@crater.logilab.fr> Message-ID: Ok I'll try that although I guess that it turns off all warnings. that I'm concerned as well that scipy's release cycle isn't as quick as it could be. John. 
Alexandre Fayolle wrote: > On Wed, Jul 04, 2007 at 11:15:58AM +0100, John Reid wrote: >> Hi, >> >> Is there going to be a scipy release anytime soon? >> >> I'm using numpy 1.0.3 with scipy 0.5.2 and I get these ugly warnings all >> the time: >> >> c:\apps\python25\lib\site-packages\scipy\misc\__init__.py:25: >> DeprecationWarning: ScipyTest is now called NumpyTest; please update >> your code >> test = ScipyTest().test >> >> These are the latest downloadable versions right? I'm beginning to >> wonder whether I shouldn't start building scipy myself. > > If these do really drive you mad, why not use the -W option to python ? > > Running you script with > > python -W ignore myscript.py > > > > > ------------------------------------------------------------------------ > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From david at ar.media.kyoto-u.ac.jp Wed Jul 4 07:02:03 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 04 Jul 2007 20:02:03 +0900 Subject: [Numpy-discussion] Scipy release In-Reply-To: References: <20070704104840.GG5831@crater.logilab.fr> Message-ID: <468B7E2B.1000604@ar.media.kyoto-u.ac.jp> John Reid wrote: > Ok I'll try that although I guess that it turns off all warnings. that > I'm concerned as well that scipy's release cycle isn't as quick as it > could be. > Well, quite the contrary, it is as quick as it can be. If you think it is too slow, don't hesitate to help :) David From j.reid at mail.cryst.bbk.ac.uk Wed Jul 4 07:19:16 2007 From: j.reid at mail.cryst.bbk.ac.uk (John Reid) Date: Wed, 04 Jul 2007 12:19:16 +0100 Subject: [Numpy-discussion] Scipy release In-Reply-To: <468B7E2B.1000604@ar.media.kyoto-u.ac.jp> References: <20070704104840.GG5831@crater.logilab.fr> <468B7E2B.1000604@ar.media.kyoto-u.ac.jp> Message-ID: David Cournapeau wrote: > John Reid wrote: >> Ok I'll try that although I guess that it turns off all warnings. that >> I'm concerned as well that scipy's release cycle isn't as quick as it >> could be. >> > Well, quite the contrary, it is as quick as it can be. If you think it > is too slow, don't hesitate to help :) Well I don't really want to get into an argument about the definition of 'quick as it can be'. I'm just trying to say that when people try the latest stable versions of 2 libraries that are closely coupled, they can be put off by warnings that appear out of the box. I'm pretty sure these warnings aren't important but seeing as I don't work on the internals of the libraries I don't know for sure. It looks like something that a quicker release schedule would fix and improve confidence amongst novice users of scipy. That said I'm not the one building the releases, I was just curious how the releases are managed and whether people who work on scipy are aware of this. Probably most scipy developers aren't using the 0.5.2 version with these warnings. John. From david at ar.media.kyoto-u.ac.jp Wed Jul 4 07:22:54 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 04 Jul 2007 20:22:54 +0900 Subject: [Numpy-discussion] Scipy release In-Reply-To: References: <20070704104840.GG5831@crater.logilab.fr> <468B7E2B.1000604@ar.media.kyoto-u.ac.jp> Message-ID: <468B830E.4050807@ar.media.kyoto-u.ac.jp> John Reid wrote: > > > Well I don't really want to get into an argument about the definition of > 'quick as it can be'. 
> > I'm just trying to say that when people try the latest stable versions > of 2 libraries that are closely coupled, they can be put off by warnings > that appear out of the box. I'm pretty sure these warnings aren't > important but seeing as I don't work on the internals of the libraries I > don't know for sure. It looks like something that a quicker release > schedule would fix and improve confidence amongst novice users of scipy. I think most scipy developers are aware that releases are always too late for users. But the problem really is a lack of manpower, not a lack of love for users. > > That said I'm not the one building the releases, I was just curious how > the releases are managed and whether people who work on scipy are aware > of this. Probably most scipy developers aren't using the 0.5.2 version > with these warnings. I think most developers use subversion, indeed. It is not enforced, but my impression is that people try pretty hard to avoid breaking the main trunk (that is using the last subversion is not more buggy than a release). At least, I myself try to do so, and the fact that most scipy dev use subversion help. I always do a full run of the tests when I deploy a new version, though. David From david at ar.media.kyoto-u.ac.jp Wed Jul 4 07:26:01 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 04 Jul 2007 20:26:01 +0900 Subject: [Numpy-discussion] What is an empty matrix ? Message-ID: <468B83C9.4050102@ar.media.kyoto-u.ac.jp> Hi, I was wondering what an empty matrix is, and what it is useful for (by empty matrix, I mean something created by numpy.matrix([])) ? Using those crash some functions (see for example scipy ticket #381), and I am not sure how to fix this bug. David From nwagner at iam.uni-stuttgart.de Wed Jul 4 07:39:43 2007 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Wed, 04 Jul 2007 13:39:43 +0200 Subject: [Numpy-discussion] What is an empty matrix ? In-Reply-To: <468B83C9.4050102@ar.media.kyoto-u.ac.jp> References: <468B83C9.4050102@ar.media.kyoto-u.ac.jp> Message-ID: <468B86FF.7090702@iam.uni-stuttgart.de> David Cournapeau wrote: > Hi, > > I was wondering what an empty matrix is, and what it is useful for > (by empty matrix, I mean something created by numpy.matrix([])) ? Using > those crash some functions (see for example scipy ticket #381), and I am > not sure how to fix this bug. > > David > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > empty(...) empty((d1,...,dn),dtype=float,order='C') Return a new array of shape (d1,...,dn) and given type with all its entries uninitialized. This can be faster than zeros. Nils From alexandre.fayolle at logilab.fr Wed Jul 4 07:42:10 2007 From: alexandre.fayolle at logilab.fr (Alexandre Fayolle) Date: Wed, 4 Jul 2007 13:42:10 +0200 Subject: [Numpy-discussion] Scipy release In-Reply-To: References: <20070704104840.GG5831@crater.logilab.fr> Message-ID: <20070704114210.GI5831@crater.logilab.fr> On Wed, Jul 04, 2007 at 11:59:28AM +0100, John Reid wrote: > Ok I'll try that although I guess that it turns off all warnings. that It does. See the documentation of the warnings module for the full syntax and fine grained control. 
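For example, to silence just this one message rather than every warning, a filter along these lines should do it (a sketch; the message text is copied from the warning John quoted):

import warnings

# suppress only this DeprecationWarning; other warnings still come through
warnings.filterwarnings('ignore',
                        message='ScipyTest is now called NumpyTest',
                        category=DeprecationWarning)

import scipy.misc  # should no longer print the deprecation message

The filter has to be installed before the import that actually triggers the warning, which is why it comes first here.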
-- Alexandre Fayolle LOGILAB, Paris (France) Formations Python, Zope, Plone, Debian: http://www.logilab.fr/formations D?veloppement logiciel sur mesure: http://www.logilab.fr/services Informatique scientifique: http://www.logilab.fr/science Reprise et maintenance de sites CPS: http://www.migration-cms.com/ -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 481 bytes Desc: Digital signature URL: From david at ar.media.kyoto-u.ac.jp Wed Jul 4 07:39:13 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 04 Jul 2007 20:39:13 +0900 Subject: [Numpy-discussion] What is an empty matrix ? In-Reply-To: <468B86FF.7090702@iam.uni-stuttgart.de> References: <468B83C9.4050102@ar.media.kyoto-u.ac.jp> <468B86FF.7090702@iam.uni-stuttgart.de> Message-ID: <468B86E1.7050002@ar.media.kyoto-u.ac.jp> Nils Wagner wrote: > David Cournapeau wrote: >> Hi, >> >> I was wondering what an empty matrix is, and what it is useful for >> (by empty matrix, I mean something created by numpy.matrix([])) ? Using >> those crash some functions (see for example scipy ticket #381), and I am >> not sure how to fix this bug. >> >> David >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://projects.scipy.org/mailman/listinfo/numpy-discussion >> > empty(...) > empty((d1,...,dn),dtype=float,order='C') > > Return a new array of shape (d1,...,dn) and given type with all its > entries uninitialized. This can be faster than zeros. > I understand numpy.empty, but this is different: an empty matrix is not equivalent at all to empty. What I am talking about is what is the nature of something lie numpy.array((1, 0)) ? I am actually wondering whether it makes sense at all. a = numpy.matrix([]) b = a + 1 does not raise any error, but I don't see how this should be considered meaningful ? David From j.reid at mail.cryst.bbk.ac.uk Wed Jul 4 08:15:19 2007 From: j.reid at mail.cryst.bbk.ac.uk (John Reid) Date: Wed, 04 Jul 2007 13:15:19 +0100 Subject: [Numpy-discussion] Scipy release In-Reply-To: <468B830E.4050807@ar.media.kyoto-u.ac.jp> References: <20070704104840.GG5831@crater.logilab.fr> <468B7E2B.1000604@ar.media.kyoto-u.ac.jp> <468B830E.4050807@ar.media.kyoto-u.ac.jp> Message-ID: David Cournapeau wrote: > John Reid wrote: > I think most developers use subversion, indeed. It is not enforced, but > my impression is that people try pretty hard to avoid breaking the main > trunk (that is using the last subversion is not more buggy than a > release). At least, I myself try to do so, and the fact that most scipy > dev use subversion help. > > I always do a full run of the tests when I deploy a new version, though. > > David Thanks for the heads up, John. From openopt at ukr.net Wed Jul 4 08:21:00 2007 From: openopt at ukr.net (dmitrey) Date: Wed, 04 Jul 2007 15:21:00 +0300 Subject: [Numpy-discussion] What is an empty matrix ? In-Reply-To: <468B83C9.4050102@ar.media.kyoto-u.ac.jp> References: <468B83C9.4050102@ar.media.kyoto-u.ac.jp> Message-ID: <468B90AC.1000205@ukr.net> As for me, I used empty matrices in MATLAB as well as python rather often. Simple example: from numpy import array, hstack #... 
m = A.shape[1] a = array(()).reshape(0,m) for i in some_ind: a = hstack((a, A[i])) for i in some_ind2: a = hstack((a, Aeq[i])) return a other example: from numpy import append def myfunc(arr, arr2 = numpy.array(())) assert(arr.ndim==1 and arr2.ndim==1) return append(arr, arr2, 1) HTH, Dmitrey David Cournapeau wrote: > Hi, > > I was wondering what an empty matrix is, and what it is useful for > (by empty matrix, I mean something created by numpy.matrix([])) ? Using > those crash some functions (see for example scipy ticket #381), and I am > not sure how to fix this bug. > > David > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > From j.reid at mail.cryst.bbk.ac.uk Wed Jul 4 09:21:35 2007 From: j.reid at mail.cryst.bbk.ac.uk (John Reid) Date: Wed, 04 Jul 2007 14:21:35 +0100 Subject: [Numpy-discussion] What is an empty matrix ? In-Reply-To: <468B83C9.4050102@ar.media.kyoto-u.ac.jp> References: <468B83C9.4050102@ar.media.kyoto-u.ac.jp> Message-ID: David Cournapeau wrote: > Hi, > > I was wondering what an empty matrix is, and what it is useful for > (by empty matrix, I mean something created by numpy.matrix([])) ? Using > those crash some functions (see for example scipy ticket #381), and I am > not sure how to fix this bug. > > David Empty input to an algorithm can often be handled naturally by empty matrices in the implementation. Quite often I find that if I've coded things right, the boundary empty input case is handled naturally in this way. I would prefer that no functions crashed when passed empty matrices. Just my 2 cents, John. From nwagner at iam.uni-stuttgart.de Wed Jul 4 09:46:07 2007 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Wed, 04 Jul 2007 15:46:07 +0200 Subject: [Numpy-discussion] Scipy release In-Reply-To: <468B7E2B.1000604@ar.media.kyoto-u.ac.jp> References: <20070704104840.GG5831@crater.logilab.fr> <468B7E2B.1000604@ar.media.kyoto-u.ac.jp> Message-ID: <468BA49F.8030700@iam.uni-stuttgart.de> David Cournapeau wrote: > John Reid wrote: > >> Ok I'll try that although I guess that it turns off all warnings. that >> I'm concerned as well that scipy's release cycle isn't as quick as it >> could be. >> >> > Well, quite the contrary, it is as quick as it can be. If you think it > is too slow, don't hesitate to help :) > > David > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > David, Can you close ticket #420 ? AFAIK, it is fixed in svn. http://projects.scipy.org/scipy/scipy/ticket/420 Nils From aisaac at american.edu Wed Jul 4 12:39:06 2007 From: aisaac at american.edu (Alan G Isaac) Date: Wed, 4 Jul 2007 12:39:06 -0400 Subject: [Numpy-discussion] [SciPy-dev] numpy.cumproduct() documentation: bug? In-Reply-To: <468B56F8.6050708@ukr.net> References: <468B56F8.6050708@ukr.net> Message-ID: On Wed, 04 Jul 2007, dmitrey apparently wrote: > cumproduct(x, axis=None, dtype=None, out=None) > Sum the array over the given axis. Docstring bug. But it behaves right. Cheers, Alan Isaac From david at ar.media.kyoto-u.ac.jp Wed Jul 4 23:22:50 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 05 Jul 2007 12:22:50 +0900 Subject: [Numpy-discussion] What is an empty matrix ? 
In-Reply-To: References: <468B83C9.4050102@ar.media.kyoto-u.ac.jp> Message-ID: <468C640A.1000001@ar.media.kyoto-u.ac.jp> John Reid wrote: > David Cournapeau wrote: >> Hi, >> >> I was wondering what an empty matrix is, and what it is useful for >> (by empty matrix, I mean something created by numpy.matrix([])) ? Using >> those crash some functions (see for example scipy ticket #381), and I am >> not sure how to fix this bug. >> >> David > > Empty input to an algorithm can often be handled naturally by empty > matrices in the implementation. Quite often I find that if I've coded > things right, the boundary empty input case is handled naturally in this > way. I would prefer that no functions crashed when passed empty matrices. Well, I think nobody argues that scipy function should crash whatever input you give :). The problem is more how to treat them. For example, using numpy.linalg.pinv crashes numpy right now, det and fft do not work, and norm returns 0. This is seems inconcistent to me. If norm is 0, why det should not be ? Personally, I would say both should be errors, but I don't use empty arrays, so I don't have a good grasp of their usefulness. David From j.reid at mail.cryst.bbk.ac.uk Thu Jul 5 07:45:17 2007 From: j.reid at mail.cryst.bbk.ac.uk (John Reid) Date: Thu, 05 Jul 2007 12:45:17 +0100 Subject: [Numpy-discussion] What is an empty matrix ? In-Reply-To: <468C640A.1000001@ar.media.kyoto-u.ac.jp> References: <468B83C9.4050102@ar.media.kyoto-u.ac.jp> <468C640A.1000001@ar.media.kyoto-u.ac.jp> Message-ID: Ok so crashing is always bad. What I should have said is that I think errors are bad in almost all cases as well. The norm returning zero seems sensible to me so perhaps inversions and such should raise exceptions. I would much prefer no errors were raised except where necessary. Like I said I tend to find that if I write code that handles the non-empty case naturally then I don't have to change it to do something sensible for the empty boundary case. John. From lbolla at gmail.com Thu Jul 5 08:01:40 2007 From: lbolla at gmail.com (lorenzo bolla) Date: Thu, 5 Jul 2007 14:01:40 +0200 Subject: [Numpy-discussion] f2py and openmp Message-ID: <80c99e790707050501g3459ef4ei52066fae0bf84@mail.gmail.com> hi all, I'm using f2py to compile a f90 function, parallelized with openmp, into a shared object, that I can import in python. the question is: when I call the function from python, how can I specify the number of threads to use? the usual way of doing it, with a common fortran executable, is setting the enviroment variable OMP_NUM_THREADS to the desired value. using os.environ to set it from python does not seem to work: the function is always executed using 4 processors (that is quite strange in itself: where does "4" comes from?). any hints? thank you all in advance, lorenzo. -------------- next part -------------- An HTML attachment was scrubbed... URL: From svetosch at gmx.net Thu Jul 5 08:42:34 2007 From: svetosch at gmx.net (Sven Schreiber) Date: Thu, 05 Jul 2007 13:42:34 +0100 Subject: [Numpy-discussion] What is an empty matrix ? In-Reply-To: <468C640A.1000001@ar.media.kyoto-u.ac.jp> References: <468B83C9.4050102@ar.media.kyoto-u.ac.jp> <468C640A.1000001@ar.media.kyoto-u.ac.jp> Message-ID: <468CE73A.1090303@gmx.net> David Cournapeau schrieb: > Well, I think nobody argues that scipy function should crash whatever > input you give :). The problem is more how to treat them. For example, > using numpy.linalg.pinv crashes numpy right now, det and fft do not > work, and norm returns 0. 
This is seems inconcistent to me. If norm is > 0, why det should not be ? Personally, I would say both should be > errors, but I don't use empty arrays, so I don't have a good grasp of > their usefulness. > First of all I suggest to change terminology from "empty" (because of the confusion with numpy.empty()) to something like zero-length array. Such zero-length arrays are useful for generic code, so I agree with other posters that errors should normally not be raised just because of this. Then I would suggest that all functions that return arrays should return some conformable zero-length array. For example, IMHO e = np.linalg.inv(np.ones((0,0))) should return another (0,0)-array (it crashes right now). For things like sum() or det(), I guess the problem is that such reduce-like methods return scalars for other good reasons, and therefore they cannot be zero-length. I don't see a better solution for this inconsistency except to document this and tell people to watch out; but maybe the real experts know better!? just my 2?-ct, sven From travis at enthought.com Fri Jul 6 08:09:41 2007 From: travis at enthought.com (Travis Vaught) Date: Fri, 6 Jul 2007 07:09:41 -0500 Subject: [Numpy-discussion] ANN: SciPy Conference Early Registration Reminder Message-ID: <5A9C5264-C85B-4DD8-836D-E63582058720@enthought.com> Greetings, The *SciPy 2007 Conference on Scientific Computing with Python* early registration deadline is July 15, 2007. After this date, the price for registration will increase from $150 to $200. More information on the Conference is here: http://www.scipy.org/ SciPy2007 The registration page is here: https://www.enthought.com/scipy07/ The Conference is to be held on August 16-17. Tutorial Sessions are being offered on August 14-15 (http://www.scipy.org/SciPy2007/ Tutorials). The price to attend Tutorials is $75. The Saturday following the Conference will hold a Sprint session for those interested in pitching in on particular development efforts. (suggestions welcome: http://www.scipy.org/SciPy2007/Sprints) Today is the deadline for abstract submissions for those wanting to present at the conference. Please email to abstracts at scipy.org by midnight US Central Time. From the conference web page: "If you are using Python in Scientific Computing, we'd love to hear from you. If you are interested in presenting at the conference, you may submit an abstract in Plain Text, PDF or MS Word formats to abstracts at scipy.org -- the deadline for abstract submission is July 6, 2007. Papers and/or presentation slides are acceptable and are due by August 3, 2007. Presentations will be allowed 30-35 minutes, depending on the final schedule." We're looking forward to another great gathering. Best, Travis From cookedm at physics.mcmaster.ca Fri Jul 6 11:07:10 2007 From: cookedm at physics.mcmaster.ca (David M. 
Cooke) Date: Fri, 6 Jul 2007 11:07:10 -0400 Subject: [Numpy-discussion] how do I configure with gfortran In-Reply-To: References: <46869505.3080209@charter.net> <4686A644.4040402@gmail.com> <4686B15C.4070507@charter.net> <4686B673.1030908@gmail.com> <4686BDC1.9070704@stsci.edu> <4686C05E.4090801@gmail.com> Message-ID: <66EE0EDD-6218-4D73-837D-D658BDA9830A@physics.mcmaster.ca> On Jul 2, 2007, at 12:02 , Andrew Jaffe wrote: > I wrote: > >> This is slightly off-topic, but probably of interest to anyone >> reading >> this thread: >> >> Is there any reason why we don't use --fcompiler=gfortran as an alias >> for --fcompiler=gfortran (and --fcompiler=g77 as an alias for >> --fcompiler=gnu, for that matter)? > > But, sorry, of course this should be: > > Is there any reason why we don't use --fcompiler=gfortran as an > alias > for --fcompiler=gnu95? > >> Those seem to me to be much more mnemonic names... > I've added support in r3882 for aliases, so --fcompiler=gfortran works as --fcompiler=gnu95 (and g77 for gnu, and ifort for intel). -- |>|\/|< /------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From mathewww at charter.net Fri Jul 6 12:19:24 2007 From: mathewww at charter.net (Mathew) Date: Fri, 06 Jul 2007 09:19:24 -0700 Subject: [Numpy-discussion] how do I configure with gfortran In-Reply-To: <66EE0EDD-6218-4D73-837D-D658BDA9830A@physics.mcmaster.ca> References: <46869505.3080209@charter.net> <4686A644.4040402@gmail.com> <4686B15C.4070507@charter.net> <4686B673.1030908@gmail.com> <4686BDC1.9070704@stsci.edu> <4686C05E.4090801@gmail.com> <66EE0EDD-6218-4D73-837D-D658BDA9830A@physics.mcmaster.ca> Message-ID: <468E6B8C.1080509@charter.net> nope. try again % python setup.py -v config_fc --fcompiler=gfortran install Running from numpy source directory. non-existing path in 'numpy/distutils': 'site.cfg' F2PY Version 2_3882 blas_opt_info: blas_mkl_info: ( library_dirs = /u/vento0/myeates/lib:/usr/lib ) ( include_dirs = /usr/include:/u/vento0/myeates/include ) (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) libraries mkl,vml,guide not found in /u/vento0/myeates/lib (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) libraries mkl,vml,guide not found in /usr/lib NOT AVAILABLE atlas_blas_threads_info: Setting PTATLAS=ATLAS ( library_dirs = /u/vento0/myeates/lib:/usr/lib ) (paths: ) (paths: ) (paths: ) (paths: /u/vento0/myeates/lib/libptf77blas.a) (paths: ) (paths: /u/vento0/myeates/lib/libptcblas.a) (paths: ) (paths: /u/vento0/myeates/lib/libatlas.a) Setting PTATLAS=ATLAS ( include_dirs = /usr/include:/u/vento0/myeates/include ) (paths: ) (paths: /u/vento0/myeates/include/atlas) (paths: /u/vento0/myeates/include/cblas.h) Setting PTATLAS=ATLAS ( library_dirs = /u/vento0/myeates/lib:/usr/lib ) (paths: ) (paths: ) FOUND: libraries = ['ptf77blas', 'ptcblas', 'atlas'] library_dirs = ['/u/vento0/myeates/lib'] language = c include_dirs = ['/u/vento0/myeates/include'] new_compiler returns distutils.unixccompiler.UnixCCompiler customize GnuFCompiler find_executable('g77') Found executable /usr/bin/g77 gnu: no Fortran 90 compiler found gnu: no Fortran 90 compiler found David M. 
Cooke wrote: > On Jul 2, 2007, at 12:02 , Andrew Jaffe wrote: > > >> I wrote: >> >> >>> This is slightly off-topic, but probably of interest to anyone >>> reading >>> this thread: >>> >>> Is there any reason why we don't use --fcompiler=gfortran as an alias >>> for --fcompiler=gfortran (and --fcompiler=g77 as an alias for >>> --fcompiler=gnu, for that matter)? >>> >> But, sorry, of course this should be: >> >> Is there any reason why we don't use --fcompiler=gfortran as an >> alias >> for --fcompiler=gnu95? >> >> >>> Those seem to me to be much more mnemonic names... >>> > > I've added support in r3882 for aliases, so --fcompiler=gfortran > works as --fcompiler=gnu95 (and g77 for gnu, and ifort for intel). > > From oliphant at ee.byu.edu Fri Jul 6 15:43:22 2007 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Fri, 06 Jul 2007 13:43:22 -0600 Subject: [Numpy-discussion] segfault caused by incorrect Py_DECREF in ufunc In-Reply-To: References: Message-ID: <468E9B5A.3010706@ee.byu.edu> Tom Denniston wrote: >Below is the code around line 900 for ufuncobject.c >(http://svn.scipy.org/svn/numpy/trunk/numpy/core/src/ufuncobject.c) > >There is a decref labeled with ">>>" below that is incorrect. As per >the python documentation >(http://docs.python.org/api/dictObjects.html): > >#PyObject* PyDict_GetItem( PyObject *p, PyObject *key) ># >#Return value: Borrowed reference. >#Return the object from dictionary p which has a key key. Return NULL >if the key #key is not present, but without setting an exception. > >PyDict_GetItem returns a borrowed reference. Therefore this code does >not own the contents to which the obj pointer points and should not >decref on it. Simply removing the Py_DECREF(obj) line gets rid of the >segfault. > >I was wondering if someone could confirm that my interpretation is >correct and remove the line. I don't have access to the svn or know >how to change it. > >Most people do not see this problem because it only affects user defined types. > > You are right on with your analysis. Thank you for the test, check, and fix. I've changed it in SVN. Best regards, -Travis From tim.hochberg at ieee.org Fri Jul 6 16:13:13 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Fri, 6 Jul 2007 13:13:13 -0700 Subject: [Numpy-discussion] Should bool_ subclass int? Message-ID: I'm working on getting some old code working with numpy and I noticed that bool_ is not a subclass of int. Given that python's bool subclasses into and that the other scalar types are subclasses of their respective counterparts it seems at first glance that numpy.bool_ should subclass python's bool, which in turn subclasses int. Or am I missing something here? -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From l.mastrodomenico at gmail.com Fri Jul 6 16:47:31 2007 From: l.mastrodomenico at gmail.com (Lino Mastrodomenico) Date: Fri, 6 Jul 2007 22:47:31 +0200 Subject: [Numpy-discussion] [SciPy-dev] PEP 368: Standard image protocol and class In-Reply-To: References: Message-ID: 2007/7/4, Bill Baxter : > I think a PEP that aims to be a generic image protocol should > support at least 32 bit floats if not 64-bit doubles and 16 bit > "Half"s used by some GPUs (and supported by the OpenEXR format). Yes, the next version of the PEP will include float16 and float32 versions of both the L and the RGBA modes. The float16 type is the IEEE 754r one, implemented in software and compatible with OpenGL and OpenEXR. 
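On the numpy side such a mode is easy to model already; the sketch below uses a (height, width, 4) layout purely as an assumption for illustration, not something taken from the PEP, and shows the array-interface type string an RGBA float32 buffer would expose:

import numpy

height, width = 4, 6
img = numpy.zeros((height, width, 4), dtype=numpy.float32)   # RGBA, one float32 per channel

print(img.shape)                           # (4, 6, 4)
print(img.dtype.str)                       # '<f4' on a little-endian machine
print(img.__array_interface__['typestr'])  # the same type string, via the array interface

A float16 mode would need the software-converted representation mentioned above, since numpy itself has no native 16-bit float type.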
-- Lino Mastrodomenico E-mail: l.mastrodomenico at gmail.com From robert.kern at gmail.com Fri Jul 6 17:25:27 2007 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 06 Jul 2007 16:25:27 -0500 Subject: [Numpy-discussion] Should bool_ subclass int? In-Reply-To: References: Message-ID: <468EB347.2000608@gmail.com> Timothy Hochberg wrote: > > I'm working on getting some old code working with numpy and I noticed > that bool_ is not a subclass of int. Given that python's bool subclasses > into and that the other scalar types are subclasses of their respective > counterparts it seems at first glance that numpy.bool_ should subclass > python's bool, which in turn subclasses int. Or am I missing something here? That would certainly be desirable. There might be a technical reason why it's not, but if you can do it, and it seems to work for you, let's check it in. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From tom.denniston at alum.dartmouth.org Fri Jul 6 19:40:28 2007 From: tom.denniston at alum.dartmouth.org (Tom Denniston) Date: Fri, 6 Jul 2007 18:40:28 -0500 Subject: [Numpy-discussion] segfault caused by incorrect Py_DECREF in ufunc In-Reply-To: <468E9B5A.3010706@ee.byu.edu> References: <468E9B5A.3010706@ee.byu.edu> Message-ID: Thanks! On 7/6/07, Travis Oliphant wrote: > Tom Denniston wrote: > > >Below is the code around line 900 for ufuncobject.c > >(http://svn.scipy.org/svn/numpy/trunk/numpy/core/src/ufuncobject.c) > > > >There is a decref labeled with ">>>" below that is incorrect. As per > >the python documentation > >(http://docs.python.org/api/dictObjects.html): > > > >#PyObject* PyDict_GetItem( PyObject *p, PyObject *key) > ># > >#Return value: Borrowed reference. > >#Return the object from dictionary p which has a key key. Return NULL > >if the key #key is not present, but without setting an exception. > > > >PyDict_GetItem returns a borrowed reference. Therefore this code does > >not own the contents to which the obj pointer points and should not > >decref on it. Simply removing the Py_DECREF(obj) line gets rid of the > >segfault. > > > >I was wondering if someone could confirm that my interpretation is > >correct and remove the line. I don't have access to the svn or know > >how to change it. > > > >Most people do not see this problem because it only affects user defined types. > > > > > > You are right on with your analysis. Thank you for the test, check, and > fix. > > I've changed it in SVN. > > Best regards, > > -Travis > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From v-nijs at kellogg.northwestern.edu Fri Jul 6 20:20:08 2007 From: v-nijs at kellogg.northwestern.edu (Vincent Nijs) Date: Fri, 06 Jul 2007 19:20:08 -0500 Subject: [Numpy-discussion] convert csv file into recarray without pre-specifying dtypes and variable names In-Reply-To: Message-ID: I wrote the attached (small) program to read in a text/csv file with different data types and convert it into a recarray without having to pre-specify the dtypes or variables names. I am just too lazy to type-in stuff like that :) The supported types are int, float, dates, and strings. 
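Roughly, the idea is to sniff a type for each column from the first data row, cast every column, and hand the columns to numpy's record-array constructor. A stripped-down sketch of that approach (not the attached load.py itself; dates are left out, the helper names are made up, and later rows are assumed to match the sniffed types):

import csv
import numpy as np

def guess_cast(value):
    # Try int, then float; anything else stays a string.
    for cast in (int, float):
        try:
            cast(value)
            return cast
        except ValueError:
            pass
    return str

def csv_to_recarray(path):
    rows = list(csv.reader(open(path)))
    names, data = rows[0], rows[1:]
    casts = [guess_cast(v) for v in data[0]]    # sniff column types from the first data row
    columns = [[cast(row[i]) for row in data] for i, cast in enumerate(casts)]
    return np.rec.fromarrays(columns, names=names)

(The attached load.py also handles dates and is more careful about large files.)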
It works pretty well but it is not (yet) as fast as I would like so I was wondering if any of the numpy experts on this list might have some suggestions on how to speed it up. I need to read 500MB-1GB files so speed is important for me. Thanks, Vincent -------------- next part -------------- A non-text attachment was scrubbed... Name: load.py Type: application/octet-stream Size: 2601 bytes Desc: not available URL: From jdh2358 at gmail.com Fri Jul 6 21:53:51 2007 From: jdh2358 at gmail.com (John Hunter) Date: Fri, 6 Jul 2007 20:53:51 -0500 Subject: [Numpy-discussion] convert csv file into recarray without pre-specifying dtypes and variable names In-Reply-To: References: Message-ID: <88e473830707061853k24fe56d9mae19660c8813991a@mail.gmail.com> On 7/6/07, Vincent Nijs wrote: > I wrote the attached (small) program to read in a text/csv file with > different data types and convert it into a recarray without having to > pre-specify the dtypes or variables names. I am just too lazy to type-in > stuff like that :) The supported types are int, float, dates, and strings. > > It works pretty well but it is not (yet) as fast as I would like so I was > wondering if any of the numpy experts on this list might have some suggestions > on how to speed it up. I need to read 500MB-1GB files so speed is important > for me. In matplotlib.mlab svn, there is a function csv2rec that does the same. You may want to compare implementations in case we can fruitfully cross-pollinate them. In the examples directory, there is an example script examples/loadrec.py From oliphant.travis at ieee.org Sat Jul 7 01:45:01 2007 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Fri, 06 Jul 2007 23:45:01 -0600 Subject: [Numpy-discussion] Should bool_ subclass int? In-Reply-To: References: Message-ID: <468F285D.30308@ieee.org> Timothy Hochberg wrote: > > I'm working on getting some old code working with numpy and I noticed > that bool_ is not a subclass of int. Given that python's bool > subclasses into and that the other scalar types are subclasses of > their respective counterparts it seems at first glance that > numpy.bool_ should subclass python's bool, which in turn subclasses > int. Or am I missing something here? The reason it is not, is because it is not binary compatible with Python's integer. The numpy bool_ is always only 8-bits while the Python integer is 32-bits or 64-bits. This could be changed I suspect, but then it would break the relationship between scalars and their array counterparts and I'm sure we would not want to bump up all bool arrays to 32 or 64-bits. -Travis From stefan at sun.ac.za Sat Jul 7 05:10:58 2007 From: stefan at sun.ac.za (stefan) Date: Sat, 7 Jul 2007 10:10:58 +0100 Subject: [Numpy-discussion] Buildbot for numpy In-Reply-To: References: Message-ID: On Mon, 2 Jul 2007 17:26:15 -0700, "Barry Wark" wrote: > On a side note, buildbot.scipy.org goes to the DSP lab, Univ. of > Stellenbosch's home page, not the buildbot status page. Sorry about that -- I misconfigured Apache. Everything should be fine now. Cheers Stéfan From tim.hochberg at ieee.org Sat Jul 7 11:34:19 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Sat, 7 Jul 2007 08:34:19 -0700 Subject: [Numpy-discussion] Should bool_ subclass int? In-Reply-To: <468F285D.30308@ieee.org> References: <468F285D.30308@ieee.org> Message-ID: On 7/6/07, Travis Oliphant wrote: > > Timothy Hochberg wrote: > > > > I'm working on getting some old code working with numpy and I noticed > > that bool_ is not a subclass of int. Given that python's bool > > subclasses into and that the other scalar types are subclasses of > > their respective counterparts it seems at first glance that > > numpy.bool_ should subclass > > python's bool, which in turn subclasses > > int. Or am I missing something here? > The reason it is not, is because it is not binary compatible with > Python's integer. The numpy bool_ is always only 8-bits while the > Python integer is 32-bits or 64-bits. > > This could be changed I suspect, but then it would break the > relationship between scalars and their array counterparts Do you have any idea off the top of your head how painful this would be from an implementation standpoint? And is there a theoretical reason that it is important that the scalar and array implementations match? I would think that, conceptually, they are all 1-bit integers, and it seems that the 8-bit versus 32- or 64-bit representation is just an implementation detail. My case is not particularly pressing or important, but I have a feeling that this is going to bite other people eventually. In particular, if you pull a value out of a boolean array and pass it to some third-party module that doesn't know about numpy, and that function does some sort of check on argument type (which, while not common, does happen), then it will fail. The workaround is straightforward of course, simply apply bool to scalars when you get them back if you're going to be passing them to a finicky function. That's kind of clunky and surprising though. and I'm sure > we would not want to bump up all bool arrays to 32 or 64-bits. No. I wouldn't think so.
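To spell out that bool() workaround (the check below is a made-up stand-in for a finicky third-party function, not any particular library):

import numpy as np

def finicky(flag):
    # stand-in for third-party code that type-checks its argument
    if not isinstance(flag, (bool, int)):
        raise TypeError("expected a bool or int")
    return flag

a = np.array([True, False])
finicky(bool(a[0]))   # fine: a plain Python bool
finicky(a[0])         # raises TypeError, since a numpy bool_ scalar is not an int subclass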
-- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From carlosjosepita at gmail.com Wed Jul 4 02:32:32 2007 From: carlosjosepita at gmail.com (Carlos Pita) Date: Wed, 4 Jul 2007 03:32:32 -0300 Subject: [Numpy-discussion] arange and linspace could take an output array Message-ID: <7798eaa0707032332y344b48f3i510abf8370d3dd90@mail.gmail.com> Hi all, I have a few cases where an output array with samples for a linear envelope should be composed of several linear segments, each one the output of arange or linspace. It would be preferable in terms of performance to write these segments directly to subviews of the output array instead of generating temporary arrays and then copying them to the final one. So why not provide output arguments for arange and linspace? Thank you in advance. Cheers, Carlos -------------- next part -------------- An HTML attachment was scrubbed...
URL: From b3i4old02 at sneakemail.com Sun Jul 1 10:21:28 2007 From: b3i4old02 at sneakemail.com (Michael Hoffman) Date: Sun, 01 Jul 2007 15:21:28 +0100 Subject: [Numpy-discussion] Building numpy 1.0.3-2 on Linux 2.6.8 i686 (Debian 3.1) In-Reply-To: <46879518.8020809@ar.media.kyoto-u.ac.jp> References: <46863ADC.6010409@ar.media.kyoto-u.ac.jp> <46879518.8020809@ar.media.kyoto-u.ac.jp> Message-ID: David Cournapeau wrote: > Mmh, 3.1 is sarge, which has python2.4 as default, right ? There may be > a mismatch somewhere... Particularly since you have two python include > path (/nfs/acari/mh5/include/python2.5 and > /software/python-2.5/include/python2.5) which are non standart. Are you > using those on purpose ? Why not the debian python devel package (do you > use your own compiled python ?) I'm not the sysadmin, and I don't control /usr or /software. My home directory is /nfs/acari/mh5. I thought I would start over to control for any weirdness in the installation. This is on a different machine--a x86_64 system (Linux 2.6.5). I built a fresh Python 2.5.1 from the tarball using: ./configure --prefix=/nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1 No dice, still doesn't work. Since the problem seems to be missing -shared arguments to g77, I decided to rebuild my Python using --enable-shared. That didn't help either. As a final resort, I cut and pasted the g77 line that produced the error, added "-shared" to it, and then ran python setup.py build again. I repeated this process for the next Fortran library as well. This resulted in a working build, which passed all tests when installed. Elsewhere, Robert Kern recommends using python setup.py -v build and posting the full results to debug installations. So here goes: """ $ CFLAGS= LDFLAGS= LD_LIBRARY_PATH=/nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/lib /nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/bin/python setup.py -v build non-existing path in 'numpy/distutils': 'site.cfg' F2PY Version 2_3844 blas_opt_info: blas_mkl_info: ( library_dirs = /nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/lib:/usr/local/lib:/usr/lib ) ( include_dirs = /usr/local/include:/usr/include:/nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/include ) (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) libraries mkl,vml,guide not found in /nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/lib (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) libraries mkl,vml,guide not found in /usr/local/lib (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) libraries mkl,vml,guide not found in /usr/lib NOT AVAILABLE atlas_blas_threads_info: Setting PTATLAS=ATLAS ( library_dirs = /nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/lib:/usr/local/lib:/usr/lib ) (paths: ) (paths: ) (paths: /usr/lib/atlas) (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) libraries ptf77blas,ptcblas,atlas not found in /nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/lib (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) libraries ptf77blas,ptcblas,atlas not found in /usr/local/lib (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) libraries ptf77blas,ptcblas,atlas not found in /usr/lib/atlas (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) libraries ptf77blas,ptcblas,atlas not found in /usr/lib NOT AVAILABLE atlas_blas_info: ( library_dirs = /nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/lib:/usr/local/lib:/usr/lib ) (paths: ) (paths: ) (paths: /usr/lib/atlas) (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) libraries 
f77blas,cblas,atlas not found in /nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/lib (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) libraries f77blas,cblas,atlas not found in /usr/local/lib (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) libraries f77blas,cblas,atlas not found in /usr/lib/atlas (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) libraries f77blas,cblas,atlas not found in /usr/lib NOT AVAILABLE /lustre/work1/ensembl/mh5/src/packages/numpy-1.0.3/numpy/distutils/system_info.py:1314: UserWarning: Atlas (http://math-atlas.sourceforge.net/) libraries not found. Directories to search for the libraries can be specified in the numpy/distutils/site.cfg file (section [atlas]) or by setting the ATLAS environment variable. warnings.warn(AtlasNotFoundError.__doc__) blas_info: ( library_dirs = /nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/lib:/usr/local/lib:/usr/lib ) (paths: ) (paths: ) libraries blas not found in /nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/lib (paths: ) (paths: ) libraries blas not found in /usr/local/lib (paths: /usr/lib/libblas.so) ( library_dirs = /nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/lib:/usr/local/lib:/usr/lib ) FOUND: libraries = ['blas'] library_dirs = ['/usr/lib'] language = f77 ( library_dirs = /nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/lib:/usr/local/lib:/usr/lib ) FOUND: libraries = ['blas'] library_dirs = ['/usr/lib'] define_macros = [('NO_ATLAS_INFO', 1)] language = f77 lapack_opt_info: lapack_mkl_info: mkl_info: ( library_dirs = /nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/lib:/usr/local/lib:/usr/lib ) ( include_dirs = /usr/local/include:/usr/include:/nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/include ) (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) libraries mkl,vml,guide not found in /nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/lib (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) libraries mkl,vml,guide not found in /usr/local/lib (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) libraries mkl,vml,guide not found in /usr/lib NOT AVAILABLE NOT AVAILABLE atlas_threads_info: Setting PTATLAS=ATLAS ( library_dirs = /nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/lib:/usr/local/lib:/usr/lib ) (paths: ) (paths: ) (paths: /usr/lib/atlas) (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) libraries ptf77blas,ptcblas,atlas not found in /nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/lib (paths: ) (paths: ) libraries lapack_atlas not found in /nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/lib (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) libraries ptf77blas,ptcblas,atlas not found in /usr/local/lib (paths: ) (paths: ) libraries lapack_atlas not found in /usr/local/lib (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) libraries ptf77blas,ptcblas,atlas not found in /usr/lib/atlas (paths: ) (paths: ) libraries lapack_atlas not found in /usr/lib/atlas (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) libraries ptf77blas,ptcblas,atlas not found in /usr/lib (paths: ) (paths: ) libraries lapack_atlas not found in /usr/lib numpy.distutils.system_info.atlas_threads_info NOT AVAILABLE atlas_info: ( library_dirs = /nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/lib:/usr/local/lib:/usr/lib ) (paths: ) (paths: ) (paths: /usr/lib/atlas) (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) libraries f77blas,cblas,atlas not found in /nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/lib (paths: ) (paths: ) 
libraries lapack_atlas not found in /nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/lib (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) libraries f77blas,cblas,atlas not found in /usr/local/lib (paths: ) (paths: ) libraries lapack_atlas not found in /usr/local/lib (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) libraries f77blas,cblas,atlas not found in /usr/lib/atlas (paths: ) (paths: ) libraries lapack_atlas not found in /usr/lib/atlas (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) (paths: ) libraries f77blas,cblas,atlas not found in /usr/lib (paths: ) (paths: ) libraries lapack_atlas not found in /usr/lib numpy.distutils.system_info.atlas_info NOT AVAILABLE /lustre/work1/ensembl/mh5/src/packages/numpy-1.0.3/numpy/distutils/system_info.py:1221: UserWarning: Atlas (http://math-atlas.sourceforge.net/) libraries not found. Directories to search for the libraries can be specified in the numpy/distutils/site.cfg file (section [atlas]) or by setting the ATLAS environment variable. warnings.warn(AtlasNotFoundError.__doc__) lapack_info: ( library_dirs = /nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/lib:/usr/local/lib:/usr/lib ) (paths: ) (paths: ) libraries lapack not found in /nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/lib (paths: ) (paths: ) libraries lapack not found in /usr/local/lib (paths: /usr/lib/liblapack.so) ( library_dirs = /nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/lib:/usr/local/lib:/usr/lib ) FOUND: libraries = ['lapack'] library_dirs = ['/usr/lib'] language = f77 ( library_dirs = /nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/lib:/usr/local/lib:/usr/lib ) FOUND: libraries = ['lapack', 'blas'] library_dirs = ['/usr/lib'] define_macros = [('NO_ATLAS_INFO', 1)] language = f77 running build running config_cc unifing config_cc, config, build_clib, build_ext, build commands --compiler options running config_fc unifing config_fc, config, build_clib, build_ext, build commands --fcompiler options running build_src building py_modules sources creating build creating build/src.linux-x86_64-2.5 creating build/src.linux-x86_64-2.5/numpy creating build/src.linux-x86_64-2.5/numpy/distutils building extension "numpy.core.multiarray" sources creating build/src.linux-x86_64-2.5/numpy/core Generating build/src.linux-x86_64-2.5/numpy/core/config.h new_compiler returns distutils.unixccompiler.UnixCCompiler new_fcompiler returns numpy.distutils.fcompiler.gnu.GnuFCompiler customize GnuFCompiler GnuFCompiler instance properties: archiver = ['ar', '-cr'] compile_switch = '-c' compiler_f77 = ['g77', '-g', '-Wall', '-fno-second-underscore', '- fPIC', '-O3', '-funroll-loops', '-mmmx', '-m3dnow', '- msse2', '-msse'] compiler_f90 = None compiler_fix = None libraries = ['g2c-pic'] library_dirs = [] linker_exe = ['g77'] linker_so = ['g77'] object_switch = '-o ' ranlib = ['ranlib'] version = LooseVersion ('3.3.5') version_cmd = ['g77', '--version'] customize GnuFCompiler customize GnuFCompiler using config C compiler: gcc -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC compile options: '-I/nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/include/python2.5 -Inumpy/core/src -Inumpy/core/include -c' gcc: _configtest.c _configtest.c: In function `main': _configtest.c:50: warning: int format, different type arg (arg 3) _configtest.c:57: warning: int format, different type arg (arg 3) _configtest.c:72: warning: int format, different type arg (arg 3) gcc _configtest.o -L/nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/lib -L/usr/local/lib -L/usr/lib -o _configtest 
_configtest success! removing: _configtest.c _configtest.o _configtest C compiler: gcc -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -c' gcc: _configtest.c gcc _configtest.o -o _configtest _configtest.o(.text+0xd): In function `main': /nfs/acari/mh5/src/packages/numpy-1.0.3/_configtest.c:5: undefined reference to `exp' collect2: ld returned 1 exit status _configtest.o(.text+0xd): In function `main': /nfs/acari/mh5/src/packages/numpy-1.0.3/_configtest.c:5: undefined reference to `exp' collect2: ld returned 1 exit status failure. removing: _configtest.c _configtest.o C compiler: gcc -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -c' gcc: _configtest.c gcc _configtest.o -lm -o _configtest _configtest success! removing: _configtest.c _configtest.o _configtest C compiler: gcc -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -c' gcc: _configtest.c _configtest.c: In function `main': _configtest.c:4: warning: statement with no effect gcc _configtest.o -lm -o _configtest success! removing: _configtest.c _configtest.o _configtest C compiler: gcc -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -c' gcc: _configtest.c _configtest.c: In function `main': _configtest.c:4: warning: statement with no effect gcc _configtest.o -lm -o _configtest success! removing: _configtest.c _configtest.o _configtest C compiler: gcc -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -c' gcc: _configtest.c _configtest.c: In function `main': _configtest.c:4: warning: statement with no effect gcc _configtest.o -lm -o _configtest success! removing: _configtest.c _configtest.o _configtest C compiler: gcc -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -c' gcc: _configtest.c _configtest.c: In function `main': _configtest.c:4: warning: statement with no effect gcc _configtest.o -lm -o _configtest success! removing: _configtest.c _configtest.o _configtest C compiler: gcc -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -c' gcc: _configtest.c _configtest.c: In function `main': _configtest.c:4: warning: statement with no effect gcc _configtest.o -lm -o _configtest success! removing: _configtest.c _configtest.o _configtest C compiler: gcc -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -c' gcc: _configtest.c _configtest.c: In function `main': _configtest.c:4: warning: statement with no effect gcc _configtest.o -lm -o _configtest success! removing: _configtest.c _configtest.o _configtest C compiler: gcc -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -c' gcc: _configtest.c _configtest.c: In function `main': _configtest.c:4: warning: statement with no effect gcc _configtest.o -lm -o _configtest success! removing: _configtest.c _configtest.o _configtest C compiler: gcc -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -c' gcc: _configtest.c _configtest.c: In function `main': _configtest.c:4: warning: statement with no effect gcc _configtest.o -lm -o _configtest success! 
removing: _configtest.c _configtest.o _configtest C compiler: gcc -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -c' gcc: _configtest.c _configtest.c: In function `main': _configtest.c:4: warning: statement with no effect gcc _configtest.o -lm -o _configtest success! removing: _configtest.c _configtest.o _configtest C compiler: gcc -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -c' gcc: _configtest.c _configtest.c: In function `main': _configtest.c:4: warning: statement with no effect gcc _configtest.o -lm -o _configtest success! removing: _configtest.c _configtest.o _configtest adding 'build/src.linux-x86_64-2.5/numpy/core/config.h' to sources. executing numpy/core/code_generators/generate_array_api.py adding 'build/src.linux-x86_64-2.5/numpy/core/__multiarray_api.h' to sources. creating build/src.linux-x86_64-2.5/numpy/core/src conv_template:> build/src.linux-x86_64-2.5/numpy/core/src/scalartypes.inc adding 'build/src.linux-x86_64-2.5/numpy/core/src' to include_dirs. conv_template:> build/src.linux-x86_64-2.5/numpy/core/src/arraytypes.inc numpy.core - nothing done with h_files= ['build/src.linux-x86_64-2.5/numpy/core/src/scalartypes.inc', 'build/src.linux-x86_64-2.5/numpy/core/src/arraytypes.inc', 'build/src.linux-x86_64-2.5/numpy/core/config.h', 'build/src.linux-x86_64-2.5/numpy/core/__multiarray_api.h'] building extension "numpy.core.umath" sources adding 'build/src.linux-x86_64-2.5/numpy/core/config.h' to sources. executing numpy/core/code_generators/generate_ufunc_api.py adding 'build/src.linux-x86_64-2.5/numpy/core/__ufunc_api.h' to sources. conv_template:> build/src.linux-x86_64-2.5/numpy/core/src/umathmodule.c adding 'build/src.linux-x86_64-2.5/numpy/core/src' to include_dirs. numpy.core - nothing done with h_files= ['build/src.linux-x86_64-2.5/numpy/core/src/scalartypes.inc', 'build/src.linux-x86_64-2.5/numpy/core/src/arraytypes.inc', 'build/src.linux-x86_64-2.5/numpy/core/config.h', 'build/src.linux-x86_64-2.5/numpy/core/__ufunc_api.h'] building extension "numpy.core._sort" sources adding 'build/src.linux-x86_64-2.5/numpy/core/config.h' to sources. executing numpy/core/code_generators/generate_array_api.py adding 'build/src.linux-x86_64-2.5/numpy/core/__multiarray_api.h' to sources. conv_template:> build/src.linux-x86_64-2.5/numpy/core/src/_sortmodule.c numpy.core - nothing done with h_files= ['build/src.linux-x86_64-2.5/numpy/core/config.h', 'build/src.linux-x86_64-2.5/numpy/core/__multiarray_api.h'] building extension "numpy.core.scalarmath" sources adding 'build/src.linux-x86_64-2.5/numpy/core/config.h' to sources. executing numpy/core/code_generators/generate_array_api.py adding 'build/src.linux-x86_64-2.5/numpy/core/__multiarray_api.h' to sources. executing numpy/core/code_generators/generate_ufunc_api.py adding 'build/src.linux-x86_64-2.5/numpy/core/__ufunc_api.h' to sources. conv_template:> build/src.linux-x86_64-2.5/numpy/core/src/scalarmathmodule.c numpy.core - nothing done with h_files= ['build/src.linux-x86_64-2.5/numpy/core/config.h', 'build/src.linux-x86_64-2.5/numpy/core/__multiarray_api.h', 'build/src.linux-x86_64-2.5/numpy/core/__ufunc_api.h'] building extension "numpy.core._dotblas" sources adding 'numpy/core/blasdot/_dotblas.c' to sources. 
building extension "numpy.lib._compiled_base" sources building extension "numpy.numarray._capi" sources building extension "numpy.fft.fftpack_lite" sources building extension "numpy.linalg.lapack_lite" sources creating build/src.linux-x86_64-2.5/numpy/linalg adding 'numpy/linalg/lapack_litemodule.c' to sources. building extension "numpy.random.mtrand" sources creating build/src.linux-x86_64-2.5/numpy/random C compiler: gcc -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -c' gcc: _configtest.c _configtest.c:7:2: #error No _WIN32 _configtest.c:7:2: #error No _WIN32 failure. removing: _configtest.c _configtest.o building data_files sources running build_py creating build/lib.linux-x86_64-2.5 creating build/lib.linux-x86_64-2.5/numpy copying numpy/__init__.py -> build/lib.linux-x86_64-2.5/numpy copying numpy/_import_tools.py -> build/lib.linux-x86_64-2.5/numpy copying numpy/add_newdocs.py -> build/lib.linux-x86_64-2.5/numpy copying numpy/ctypeslib.py -> build/lib.linux-x86_64-2.5/numpy copying numpy/dual.py -> build/lib.linux-x86_64-2.5/numpy copying numpy/matlib.py -> build/lib.linux-x86_64-2.5/numpy copying numpy/setup.py -> build/lib.linux-x86_64-2.5/numpy copying numpy/version.py -> build/lib.linux-x86_64-2.5/numpy copying build/src.linux-x86_64-2.5/numpy/__config__.py -> build/lib.linux-x86_64-2.5/numpy creating build/lib.linux-x86_64-2.5/numpy/distutils copying numpy/distutils/__init__.py -> build/lib.linux-x86_64-2.5/numpy/distutils copying numpy/distutils/__version__.py -> build/lib.linux-x86_64-2.5/numpy/distutils copying numpy/distutils/ccompiler.py -> build/lib.linux-x86_64-2.5/numpy/distutils copying numpy/distutils/conv_template.py -> build/lib.linux-x86_64-2.5/numpy/distutils copying numpy/distutils/core.py -> build/lib.linux-x86_64-2.5/numpy/distutils copying numpy/distutils/cpuinfo.py -> build/lib.linux-x86_64-2.5/numpy/distutils copying numpy/distutils/exec_command.py -> build/lib.linux-x86_64-2.5/numpy/distutils copying numpy/distutils/extension.py -> build/lib.linux-x86_64-2.5/numpy/distutils copying numpy/distutils/from_template.py -> build/lib.linux-x86_64-2.5/numpy/distutils copying numpy/distutils/info.py -> build/lib.linux-x86_64-2.5/numpy/distutils copying numpy/distutils/intelccompiler.py -> build/lib.linux-x86_64-2.5/numpy/distutils copying numpy/distutils/interactive.py -> build/lib.linux-x86_64-2.5/numpy/distutils copying numpy/distutils/lib2def.py -> build/lib.linux-x86_64-2.5/numpy/distutils copying numpy/distutils/line_endings.py -> build/lib.linux-x86_64-2.5/numpy/distutils copying numpy/distutils/log.py -> build/lib.linux-x86_64-2.5/numpy/distutils copying numpy/distutils/mingw32ccompiler.py -> build/lib.linux-x86_64-2.5/numpy/distutils copying numpy/distutils/misc_util.py -> build/lib.linux-x86_64-2.5/numpy/distutils copying numpy/distutils/setup.py -> build/lib.linux-x86_64-2.5/numpy/distutils copying numpy/distutils/system_info.py -> build/lib.linux-x86_64-2.5/numpy/distutils copying numpy/distutils/unixccompiler.py -> build/lib.linux-x86_64-2.5/numpy/distutils copying build/src.linux-x86_64-2.5/numpy/distutils/__config__.py -> build/lib.linux-x86_64-2.5/numpy/distutils creating build/lib.linux-x86_64-2.5/numpy/distutils/command copying numpy/distutils/command/__init__.py -> build/lib.linux-x86_64-2.5/numpy/distutils/command copying numpy/distutils/command/bdist_rpm.py -> build/lib.linux-x86_64-2.5/numpy/distutils/command copying numpy/distutils/command/build.py -> 
build/lib.linux-x86_64-2.5/numpy/distutils/command copying numpy/distutils/command/build_clib.py -> build/lib.linux-x86_64-2.5/numpy/distutils/command copying numpy/distutils/command/build_ext.py -> build/lib.linux-x86_64-2.5/numpy/distutils/command copying numpy/distutils/command/build_py.py -> build/lib.linux-x86_64-2.5/numpy/distutils/command copying numpy/distutils/command/build_scripts.py -> build/lib.linux-x86_64-2.5/numpy/distutils/command copying numpy/distutils/command/build_src.py -> build/lib.linux-x86_64-2.5/numpy/distutils/command copying numpy/distutils/command/config.py -> build/lib.linux-x86_64-2.5/numpy/distutils/command copying numpy/distutils/command/config_compiler.py -> build/lib.linux-x86_64-2.5/numpy/distutils/command copying numpy/distutils/command/egg_info.py -> build/lib.linux-x86_64-2.5/numpy/distutils/command copying numpy/distutils/command/install.py -> build/lib.linux-x86_64-2.5/numpy/distutils/command copying numpy/distutils/command/install_data.py -> build/lib.linux-x86_64-2.5/numpy/distutils/command copying numpy/distutils/command/install_headers.py -> build/lib.linux-x86_64-2.5/numpy/distutils/command copying numpy/distutils/command/sdist.py -> build/lib.linux-x86_64-2.5/numpy/distutils/command creating build/lib.linux-x86_64-2.5/numpy/distutils/fcompiler copying numpy/distutils/fcompiler/__init__.py -> build/lib.linux-x86_64-2.5/numpy/distutils/fcompiler copying numpy/distutils/fcompiler/absoft.py -> build/lib.linux-x86_64-2.5/numpy/distutils/fcompiler copying numpy/distutils/fcompiler/compaq.py -> build/lib.linux-x86_64-2.5/numpy/distutils/fcompiler copying numpy/distutils/fcompiler/g95.py -> build/lib.linux-x86_64-2.5/numpy/distutils/fcompiler copying numpy/distutils/fcompiler/gnu.py -> build/lib.linux-x86_64-2.5/numpy/distutils/fcompiler copying numpy/distutils/fcompiler/hpux.py -> build/lib.linux-x86_64-2.5/numpy/distutils/fcompiler copying numpy/distutils/fcompiler/ibm.py -> build/lib.linux-x86_64-2.5/numpy/distutils/fcompiler copying numpy/distutils/fcompiler/intel.py -> build/lib.linux-x86_64-2.5/numpy/distutils/fcompiler copying numpy/distutils/fcompiler/lahey.py -> build/lib.linux-x86_64-2.5/numpy/distutils/fcompiler copying numpy/distutils/fcompiler/mips.py -> build/lib.linux-x86_64-2.5/numpy/distutils/fcompiler copying numpy/distutils/fcompiler/nag.py -> build/lib.linux-x86_64-2.5/numpy/distutils/fcompiler copying numpy/distutils/fcompiler/none.py -> build/lib.linux-x86_64-2.5/numpy/distutils/fcompiler copying numpy/distutils/fcompiler/pg.py -> build/lib.linux-x86_64-2.5/numpy/distutils/fcompiler copying numpy/distutils/fcompiler/sun.py -> build/lib.linux-x86_64-2.5/numpy/distutils/fcompiler copying numpy/distutils/fcompiler/vast.py -> build/lib.linux-x86_64-2.5/numpy/distutils/fcompiler creating build/lib.linux-x86_64-2.5/numpy/testing copying numpy/testing/__init__.py -> build/lib.linux-x86_64-2.5/numpy/testing copying numpy/testing/info.py -> build/lib.linux-x86_64-2.5/numpy/testing copying numpy/testing/numpytest.py -> build/lib.linux-x86_64-2.5/numpy/testing copying numpy/testing/setup.py -> build/lib.linux-x86_64-2.5/numpy/testing copying numpy/testing/utils.py -> build/lib.linux-x86_64-2.5/numpy/testing creating build/lib.linux-x86_64-2.5/numpy/f2py copying numpy/f2py/__init__.py -> build/lib.linux-x86_64-2.5/numpy/f2py copying numpy/f2py/__svn_version__.py -> build/lib.linux-x86_64-2.5/numpy/f2py copying numpy/f2py/__version__.py -> build/lib.linux-x86_64-2.5/numpy/f2py copying numpy/f2py/auxfuncs.py -> 
build/lib.linux-x86_64-2.5/numpy/f2py copying numpy/f2py/capi_maps.py -> build/lib.linux-x86_64-2.5/numpy/f2py copying numpy/f2py/cb_rules.py -> build/lib.linux-x86_64-2.5/numpy/f2py copying numpy/f2py/cfuncs.py -> build/lib.linux-x86_64-2.5/numpy/f2py copying numpy/f2py/common_rules.py -> build/lib.linux-x86_64-2.5/numpy/f2py copying numpy/f2py/crackfortran.py -> build/lib.linux-x86_64-2.5/numpy/f2py copying numpy/f2py/diagnose.py -> build/lib.linux-x86_64-2.5/numpy/f2py copying numpy/f2py/f2py2e.py -> build/lib.linux-x86_64-2.5/numpy/f2py copying numpy/f2py/f2py_testing.py -> build/lib.linux-x86_64-2.5/numpy/f2py copying numpy/f2py/f90mod_rules.py -> build/lib.linux-x86_64-2.5/numpy/f2py copying numpy/f2py/func2subr.py -> build/lib.linux-x86_64-2.5/numpy/f2py copying numpy/f2py/info.py -> build/lib.linux-x86_64-2.5/numpy/f2py copying numpy/f2py/rules.py -> build/lib.linux-x86_64-2.5/numpy/f2py copying numpy/f2py/setup.py -> build/lib.linux-x86_64-2.5/numpy/f2py copying numpy/f2py/use_rules.py -> build/lib.linux-x86_64-2.5/numpy/f2py creating build/lib.linux-x86_64-2.5/numpy/f2py/lib copying numpy/f2py/lib/__init__.py -> build/lib.linux-x86_64-2.5/numpy/f2py/lib copying numpy/f2py/lib/api.py -> build/lib.linux-x86_64-2.5/numpy/f2py/lib copying numpy/f2py/lib/main.py -> build/lib.linux-x86_64-2.5/numpy/f2py/lib copying numpy/f2py/lib/nary.py -> build/lib.linux-x86_64-2.5/numpy/f2py/lib copying numpy/f2py/lib/py_wrap.py -> build/lib.linux-x86_64-2.5/numpy/f2py/lib copying numpy/f2py/lib/py_wrap_subprogram.py -> build/lib.linux-x86_64-2.5/numpy/f2py/lib copying numpy/f2py/lib/py_wrap_type.py -> build/lib.linux-x86_64-2.5/numpy/f2py/lib copying numpy/f2py/lib/setup.py -> build/lib.linux-x86_64-2.5/numpy/f2py/lib copying numpy/f2py/lib/wrapper_base.py -> build/lib.linux-x86_64-2.5/numpy/f2py/lib creating build/lib.linux-x86_64-2.5/numpy/f2py/lib/parser copying numpy/f2py/lib/parser/Fortran2003.py -> build/lib.linux-x86_64-2.5/numpy/f2py/lib/parser copying numpy/f2py/lib/parser/__init__.py -> build/lib.linux-x86_64-2.5/numpy/f2py/lib/parser copying numpy/f2py/lib/parser/api.py -> build/lib.linux-x86_64-2.5/numpy/f2py/lib/parser copying numpy/f2py/lib/parser/base_classes.py -> build/lib.linux-x86_64-2.5/numpy/f2py/lib/parser copying numpy/f2py/lib/parser/block_statements.py -> build/lib.linux-x86_64-2.5/numpy/f2py/lib/parser copying numpy/f2py/lib/parser/parsefortran.py -> build/lib.linux-x86_64-2.5/numpy/f2py/lib/parser copying numpy/f2py/lib/parser/pattern_tools.py -> build/lib.linux-x86_64-2.5/numpy/f2py/lib/parser copying numpy/f2py/lib/parser/readfortran.py -> build/lib.linux-x86_64-2.5/numpy/f2py/lib/parser copying numpy/f2py/lib/parser/sourceinfo.py -> build/lib.linux-x86_64-2.5/numpy/f2py/lib/parser copying numpy/f2py/lib/parser/splitline.py -> build/lib.linux-x86_64-2.5/numpy/f2py/lib/parser copying numpy/f2py/lib/parser/statements.py -> build/lib.linux-x86_64-2.5/numpy/f2py/lib/parser copying numpy/f2py/lib/parser/test_Fortran2003.py -> build/lib.linux-x86_64-2.5/numpy/f2py/lib/parser copying numpy/f2py/lib/parser/test_parser.py -> build/lib.linux-x86_64-2.5/numpy/f2py/lib/parser copying numpy/f2py/lib/parser/typedecl_statements.py -> build/lib.linux-x86_64-2.5/numpy/f2py/lib/parser copying numpy/f2py/lib/parser/utils.py -> build/lib.linux-x86_64-2.5/numpy/f2py/lib/parser creating build/lib.linux-x86_64-2.5/numpy/core copying numpy/core/__init__.py -> build/lib.linux-x86_64-2.5/numpy/core copying numpy/core/__svn_version__.py -> build/lib.linux-x86_64-2.5/numpy/core copying 
numpy/core/_internal.py -> build/lib.linux-x86_64-2.5/numpy/core copying numpy/core/arrayprint.py -> build/lib.linux-x86_64-2.5/numpy/core copying numpy/core/defchararray.py -> build/lib.linux-x86_64-2.5/numpy/core copying numpy/core/defmatrix.py -> build/lib.linux-x86_64-2.5/numpy/core copying numpy/core/fromnumeric.py -> build/lib.linux-x86_64-2.5/numpy/core copying numpy/core/info.py -> build/lib.linux-x86_64-2.5/numpy/core copying numpy/core/ma.py -> build/lib.linux-x86_64-2.5/numpy/core copying numpy/core/memmap.py -> build/lib.linux-x86_64-2.5/numpy/core copying numpy/core/numeric.py -> build/lib.linux-x86_64-2.5/numpy/core copying numpy/core/numerictypes.py -> build/lib.linux-x86_64-2.5/numpy/core copying numpy/core/records.py -> build/lib.linux-x86_64-2.5/numpy/core copying numpy/core/setup.py -> build/lib.linux-x86_64-2.5/numpy/core copying numpy/core/code_generators/generate_array_api.py -> build/lib.linux-x86_64-2.5/numpy/core creating build/lib.linux-x86_64-2.5/numpy/lib copying numpy/lib/__init__.py -> build/lib.linux-x86_64-2.5/numpy/lib copying numpy/lib/arraysetops.py -> build/lib.linux-x86_64-2.5/numpy/lib copying numpy/lib/convdtype.py -> build/lib.linux-x86_64-2.5/numpy/lib copying numpy/lib/function_base.py -> build/lib.linux-x86_64-2.5/numpy/lib copying numpy/lib/getlimits.py -> build/lib.linux-x86_64-2.5/numpy/lib copying numpy/lib/index_tricks.py -> build/lib.linux-x86_64-2.5/numpy/lib copying numpy/lib/info.py -> build/lib.linux-x86_64-2.5/numpy/lib copying numpy/lib/machar.py -> build/lib.linux-x86_64-2.5/numpy/lib copying numpy/lib/polynomial.py -> build/lib.linux-x86_64-2.5/numpy/lib copying numpy/lib/scimath.py -> build/lib.linux-x86_64-2.5/numpy/lib copying numpy/lib/setup.py -> build/lib.linux-x86_64-2.5/numpy/lib copying numpy/lib/shape_base.py -> build/lib.linux-x86_64-2.5/numpy/lib copying numpy/lib/twodim_base.py -> build/lib.linux-x86_64-2.5/numpy/lib copying numpy/lib/type_check.py -> build/lib.linux-x86_64-2.5/numpy/lib copying numpy/lib/ufunclike.py -> build/lib.linux-x86_64-2.5/numpy/lib copying numpy/lib/user_array.py -> build/lib.linux-x86_64-2.5/numpy/lib copying numpy/lib/utils.py -> build/lib.linux-x86_64-2.5/numpy/lib creating build/lib.linux-x86_64-2.5/numpy/oldnumeric copying numpy/oldnumeric/__init__.py -> build/lib.linux-x86_64-2.5/numpy/oldnumeric copying numpy/oldnumeric/alter_code1.py -> build/lib.linux-x86_64-2.5/numpy/oldnumeric copying numpy/oldnumeric/alter_code2.py -> build/lib.linux-x86_64-2.5/numpy/oldnumeric copying numpy/oldnumeric/array_printer.py -> build/lib.linux-x86_64-2.5/numpy/oldnumeric copying numpy/oldnumeric/arrayfns.py -> build/lib.linux-x86_64-2.5/numpy/oldnumeric copying numpy/oldnumeric/compat.py -> build/lib.linux-x86_64-2.5/numpy/oldnumeric copying numpy/oldnumeric/fft.py -> build/lib.linux-x86_64-2.5/numpy/oldnumeric copying numpy/oldnumeric/fix_default_axis.py -> build/lib.linux-x86_64-2.5/numpy/oldnumeric copying numpy/oldnumeric/functions.py -> build/lib.linux-x86_64-2.5/numpy/oldnumeric copying numpy/oldnumeric/linear_algebra.py -> build/lib.linux-x86_64-2.5/numpy/oldnumeric copying numpy/oldnumeric/ma.py -> build/lib.linux-x86_64-2.5/numpy/oldnumeric copying numpy/oldnumeric/matrix.py -> build/lib.linux-x86_64-2.5/numpy/oldnumeric copying numpy/oldnumeric/misc.py -> build/lib.linux-x86_64-2.5/numpy/oldnumeric copying numpy/oldnumeric/mlab.py -> build/lib.linux-x86_64-2.5/numpy/oldnumeric copying numpy/oldnumeric/precision.py -> build/lib.linux-x86_64-2.5/numpy/oldnumeric copying 
numpy/oldnumeric/random_array.py -> build/lib.linux-x86_64-2.5/numpy/oldnumeric copying numpy/oldnumeric/rng.py -> build/lib.linux-x86_64-2.5/numpy/oldnumeric copying numpy/oldnumeric/rng_stats.py -> build/lib.linux-x86_64-2.5/numpy/oldnumeric copying numpy/oldnumeric/setup.py -> build/lib.linux-x86_64-2.5/numpy/oldnumeric copying numpy/oldnumeric/typeconv.py -> build/lib.linux-x86_64-2.5/numpy/oldnumeric copying numpy/oldnumeric/ufuncs.py -> build/lib.linux-x86_64-2.5/numpy/oldnumeric copying numpy/oldnumeric/user_array.py -> build/lib.linux-x86_64-2.5/numpy/oldnumeric creating build/lib.linux-x86_64-2.5/numpy/numarray copying numpy/numarray/__init__.py -> build/lib.linux-x86_64-2.5/numpy/numarray copying numpy/numarray/alter_code1.py -> build/lib.linux-x86_64-2.5/numpy/numarray copying numpy/numarray/alter_code2.py -> build/lib.linux-x86_64-2.5/numpy/numarray copying numpy/numarray/compat.py -> build/lib.linux-x86_64-2.5/numpy/numarray copying numpy/numarray/convolve.py -> build/lib.linux-x86_64-2.5/numpy/numarray copying numpy/numarray/fft.py -> build/lib.linux-x86_64-2.5/numpy/numarray copying numpy/numarray/functions.py -> build/lib.linux-x86_64-2.5/numpy/numarray copying numpy/numarray/image.py -> build/lib.linux-x86_64-2.5/numpy/numarray copying numpy/numarray/linear_algebra.py -> build/lib.linux-x86_64-2.5/numpy/numarray copying numpy/numarray/ma.py -> build/lib.linux-x86_64-2.5/numpy/numarray copying numpy/numarray/matrix.py -> build/lib.linux-x86_64-2.5/numpy/numarray copying numpy/numarray/mlab.py -> build/lib.linux-x86_64-2.5/numpy/numarray copying numpy/numarray/nd_image.py -> build/lib.linux-x86_64-2.5/numpy/numarray copying numpy/numarray/numerictypes.py -> build/lib.linux-x86_64-2.5/numpy/numarray copying numpy/numarray/random_array.py -> build/lib.linux-x86_64-2.5/numpy/numarray copying numpy/numarray/session.py -> build/lib.linux-x86_64-2.5/numpy/numarray copying numpy/numarray/setup.py -> build/lib.linux-x86_64-2.5/numpy/numarray copying numpy/numarray/ufuncs.py -> build/lib.linux-x86_64-2.5/numpy/numarray copying numpy/numarray/util.py -> build/lib.linux-x86_64-2.5/numpy/numarray creating build/lib.linux-x86_64-2.5/numpy/fft copying numpy/fft/__init__.py -> build/lib.linux-x86_64-2.5/numpy/fft copying numpy/fft/fftpack.py -> build/lib.linux-x86_64-2.5/numpy/fft copying numpy/fft/helper.py -> build/lib.linux-x86_64-2.5/numpy/fft copying numpy/fft/info.py -> build/lib.linux-x86_64-2.5/numpy/fft copying numpy/fft/setup.py -> build/lib.linux-x86_64-2.5/numpy/fft creating build/lib.linux-x86_64-2.5/numpy/linalg copying numpy/linalg/__init__.py -> build/lib.linux-x86_64-2.5/numpy/linalg copying numpy/linalg/info.py -> build/lib.linux-x86_64-2.5/numpy/linalg copying numpy/linalg/linalg.py -> build/lib.linux-x86_64-2.5/numpy/linalg copying numpy/linalg/setup.py -> build/lib.linux-x86_64-2.5/numpy/linalg creating build/lib.linux-x86_64-2.5/numpy/random copying numpy/random/__init__.py -> build/lib.linux-x86_64-2.5/numpy/random copying numpy/random/info.py -> build/lib.linux-x86_64-2.5/numpy/random copying numpy/random/setup.py -> build/lib.linux-x86_64-2.5/numpy/random running build_ext customize UnixCCompiler customize UnixCCompiler using build_ext customize GnuFCompiler GnuFCompiler instance properties: archiver = ['ar', '-cr'] compile_switch = '-c' compiler_f77 = ['g77', '-g', '-Wall', '-fno-second-underscore', '- fPIC', '-O3', '-funroll-loops', '-mmmx', '-m3dnow', '- msse2', '-msse'] compiler_f90 = None compiler_fix = None libraries = ['g2c-pic'] library_dirs = [] linker_exe 
= ['g77'] linker_so = ['g77'] object_switch = '-o ' ranlib = ['ranlib'] version = LooseVersion ('3.3.5') version_cmd = ['g77', '--version'] customize GnuFCompiler customize GnuFCompiler using build_ext building 'numpy.core.multiarray' extension compiling C sources C compiler: gcc -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC creating build/temp.linux-x86_64-2.5 creating build/temp.linux-x86_64-2.5/numpy creating build/temp.linux-x86_64-2.5/numpy/core creating build/temp.linux-x86_64-2.5/numpy/core/src compile options: '-Ibuild/src.linux-x86_64-2.5/numpy/core/src -Inumpy/core/include -Ibuild/src.linux-x86_64-2.5/numpy/core -I/nfs/acari/mh5/include/python2.5 -I/nfs/acari/mh5/include -I/nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/include/python2.5 -Inumpy/core/src -Inumpy/core/include -c' gcc: numpy/core/src/multiarraymodule.c In file included from numpy/core/src/arrayobject.c:510, from numpy/core/src/multiarraymodule.c:97: numpy/core/src/scalartypes.inc.src: In function `scalar_value': numpy/core/src/scalartypes.inc.src:79: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalartypes.inc.src:80: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalartypes.inc.src:81: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalartypes.inc.src:82: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalartypes.inc.src:83: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalartypes.inc.src:84: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalartypes.inc.src:85: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalartypes.inc.src:86: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalartypes.inc.src:89: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalartypes.inc.src:90: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalartypes.inc.src:91: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalartypes.inc.src:92: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalartypes.inc.src:93: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalartypes.inc.src:97: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalartypes.inc.src:98: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalartypes.inc.src:99: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalartypes.inc.src:100: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalartypes.inc.src:103: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalartypes.inc.src:104: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalartypes.inc.src:105: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalartypes.inc.src:109: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalartypes.inc.src:110: warning: dereferencing type-punned pointer will break strict-aliasing rules 
numpy/core/src/scalartypes.inc.src:111: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalartypes.inc.src:112: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalartypes.inc.src:113: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalartypes.inc.src:115: warning: dereferencing type-punned pointer will break strict-aliasing rules In file included from numpy/core/src/arrayobject.c:510, from numpy/core/src/multiarraymodule.c:97: numpy/core/src/scalartypes.inc.src: In function `PyArray_ScalarFromObject': numpy/core/src/scalartypes.inc.src:341: warning: dereferencing type-punned pointer will break strict-aliasing rules In file included from numpy/core/src/arrayobject.c:510, from numpy/core/src/multiarraymodule.c:97: numpy/core/src/scalartypes.inc.src: In function `gentype_byteswap': numpy/core/src/scalartypes.inc.src:1138: warning: dereferencing type-punned pointer will break strict-aliasing rules In file included from numpy/core/src/arrayobject.c:510, from numpy/core/src/multiarraymodule.c:97: numpy/core/src/scalartypes.inc.src: In function `gentype_reduce': numpy/core/src/scalartypes.inc.src:1259: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalartypes.inc.src: In function `bool_arrtype_new': numpy/core/src/scalartypes.inc.src:1834: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalartypes.inc.src:1836: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalartypes.inc.src: In function `PyArray_DescrFromTypeObject': numpy/core/src/scalartypes.inc.src:2608: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalartypes.inc.src:2608: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalartypes.inc.src:2608: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalartypes.inc.src:2612: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalartypes.inc.src:2614: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalartypes.inc.src:2614: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalartypes.inc.src:2617: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalartypes.inc.src:2619: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalartypes.inc.src:2621: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalartypes.inc.src:2621: warning: dereferencing type-punned pointer will break strict-aliasing rules In file included from numpy/core/src/arrayobject.c:511, from numpy/core/src/multiarraymodule.c:97: numpy/core/src/arraytypes.inc.src: In function `OBJECT_to_BOOL': numpy/core/src/arraytypes.inc.src:789: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `OBJECT_to_BYTE': numpy/core/src/arraytypes.inc.src:789: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `OBJECT_to_UBYTE': numpy/core/src/arraytypes.inc.src:789: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In 
function `OBJECT_to_SHORT': numpy/core/src/arraytypes.inc.src:789: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `OBJECT_to_USHORT': numpy/core/src/arraytypes.inc.src:789: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `OBJECT_to_INT': numpy/core/src/arraytypes.inc.src:789: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `OBJECT_to_UINT': numpy/core/src/arraytypes.inc.src:789: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `OBJECT_to_LONG': numpy/core/src/arraytypes.inc.src:789: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `OBJECT_to_ULONG': numpy/core/src/arraytypes.inc.src:789: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `OBJECT_to_LONGLONG': numpy/core/src/arraytypes.inc.src:789: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `OBJECT_to_ULONGLONG': numpy/core/src/arraytypes.inc.src:789: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `OBJECT_to_FLOAT': numpy/core/src/arraytypes.inc.src:789: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `OBJECT_to_DOUBLE': numpy/core/src/arraytypes.inc.src:789: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `OBJECT_to_LONGDOUBLE': numpy/core/src/arraytypes.inc.src:789: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `OBJECT_to_CFLOAT': numpy/core/src/arraytypes.inc.src:789: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `OBJECT_to_CDOUBLE': numpy/core/src/arraytypes.inc.src:789: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `OBJECT_to_CLONGDOUBLE': numpy/core/src/arraytypes.inc.src:789: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `OBJECT_to_STRING': numpy/core/src/arraytypes.inc.src:789: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `OBJECT_to_UNICODE': numpy/core/src/arraytypes.inc.src:789: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `OBJECT_to_VOID': numpy/core/src/arraytypes.inc.src:789: warning: dereferencing type-punned pointer will break strict-aliasing rules In file included from numpy/core/src/arrayobject.c:511, from numpy/core/src/multiarraymodule.c:97: numpy/core/src/arraytypes.inc.src: In function `BOOL_to_STRING': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `BYTE_to_STRING': 
numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `UBYTE_to_STRING': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `SHORT_to_STRING': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `USHORT_to_STRING': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `INT_to_STRING': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `UINT_to_STRING': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `LONG_to_STRING': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `ULONG_to_STRING': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `LONGLONG_to_STRING': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `ULONGLONG_to_STRING': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `FLOAT_to_STRING': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `DOUBLE_to_STRING': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function 
`LONGDOUBLE_to_STRING': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `CFLOAT_to_STRING': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `CDOUBLE_to_STRING': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `CLONGDOUBLE_to_STRING': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `BOOL_to_UNICODE': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `BYTE_to_UNICODE': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `UBYTE_to_UNICODE': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `SHORT_to_UNICODE': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `USHORT_to_UNICODE': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `INT_to_UNICODE': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `UINT_to_UNICODE': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `LONG_to_UNICODE': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules 
numpy/core/src/arraytypes.inc.src: In function `ULONG_to_UNICODE': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `LONGLONG_to_UNICODE': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `ULONGLONG_to_UNICODE': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `FLOAT_to_UNICODE': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `DOUBLE_to_UNICODE': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `LONGDOUBLE_to_UNICODE': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `CFLOAT_to_UNICODE': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `CDOUBLE_to_UNICODE': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `CLONGDOUBLE_to_UNICODE': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `BOOL_to_VOID': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `BYTE_to_VOID': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `UBYTE_to_VOID': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break 
strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `SHORT_to_VOID': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `USHORT_to_VOID': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `INT_to_VOID': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `UINT_to_VOID': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `LONG_to_VOID': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `ULONG_to_VOID': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `LONGLONG_to_VOID': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `ULONGLONG_to_VOID': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `FLOAT_to_VOID': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `DOUBLE_to_VOID': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `LONGDOUBLE_to_VOID': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `CFLOAT_to_VOID': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing 
rules numpy/core/src/arraytypes.inc.src: In function `CDOUBLE_to_VOID': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `CLONGDOUBLE_to_VOID': numpy/core/src/arraytypes.inc.src:857: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:858: warning: dereferencing type-punned pointer will break strict-aliasing rules In file included from numpy/core/src/arrayobject.c:511, from numpy/core/src/multiarraymodule.c:97: numpy/core/src/arraytypes.inc.src: In function `OBJECT_dot': numpy/core/src/arraytypes.inc.src:1913: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:1914: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src: In function `set_typeinfo': numpy/core/src/arraytypes.inc.src:2385: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:2385: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:2385: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:2385: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:2385: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:2385: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:2385: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:2385: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:2385: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:2385: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:2385: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:2385: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:2385: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:2403: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:2403: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:2403: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:2403: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:2403: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:2403: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:2414: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:2421: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:2428: warning: 
dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:2435: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:2444: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:2445: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:2446: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:2447: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:2448: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:2449: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:2450: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:2451: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:2452: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arraytypes.inc.src:2453: warning: dereferencing type-punned pointer will break strict-aliasing rules In file included from numpy/core/src/multiarraymodule.c:97: numpy/core/src/arrayobject.c: In function `PyArray_SetMap': numpy/core/src/arrayobject.c:2441: warning: dereferencing type-punned pointer will break strict-aliasing rules In file included from numpy/core/src/multiarraymodule.c:97: numpy/core/src/arrayobject.c: In function `array_richcompare': numpy/core/src/arrayobject.c:4685: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:4686: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:4706: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:4707: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:4747: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:4748: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:4753: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:4754: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:4772: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:4773: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:4810: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:4811: warning: dereferencing type-punned pointer will break strict-aliasing rules In file included from numpy/core/src/arrayobject.c:4883, from numpy/core/src/multiarraymodule.c:97: numpy/core/src/arraymethods.c: In function `array_tofile': numpy/core/src/arraymethods.c:422: warning: dereferencing type-punned pointer will break strict-aliasing rules In file included from numpy/core/src/arrayobject.c:4883, from numpy/core/src/multiarraymodule.c:97: numpy/core/src/arraymethods.c: In function `array_reduce': numpy/core/src/arraymethods.c:1124: warning: dereferencing type-punned pointer will break 
strict-aliasing rules numpy/core/src/arraymethods.c:1124: warning: dereferencing type-punned pointer will break strict-aliasing rules In file included from numpy/core/src/multiarraymodule.c:97: numpy/core/src/arrayobject.c: In function `array_strides_set': numpy/core/src/arrayobject.c:6032: warning: dereferencing type-punned pointer will break strict-aliasing rules In file included from numpy/core/src/multiarraymodule.c:97: numpy/core/src/arrayobject.c: In function `array_dataptr_get': numpy/core/src/arrayobject.c:6127: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:6128: warning: dereferencing type-punned pointer will break strict-aliasing rules In file included from numpy/core/src/multiarraymodule.c:97: numpy/core/src/arrayobject.c: In function `PyArray_FromInterface': numpy/core/src/arrayobject.c:8243: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:8247: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:8277: warning: dereferencing type-punned pointer will break strict-aliasing rules In file included from numpy/core/src/multiarraymodule.c:97: numpy/core/src/arrayobject.c: In function `arraydescr_isnative_get': numpy/core/src/arrayobject.c:10923: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:10923: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c: In function `arraydescr_hasobject_get': numpy/core/src/arrayobject.c:10943: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:10945: warning: dereferencing type-punned pointer will break strict-aliasing rules In file included from numpy/core/src/multiarraymodule.c:97: numpy/core/src/arrayobject.c: In function `arraydescr_richcompare': numpy/core/src/arrayobject.c:11452: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:11454: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:11458: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:11460: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:11464: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:11466: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:11470: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:11472: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:11476: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:11478: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:11482: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:11484: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c: In function `arrayflags_contiguous_get': numpy/core/src/arrayobject.c:11692: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:11692: warning: dereferencing type-punned pointer 
will break strict-aliasing rules numpy/core/src/arrayobject.c: In function `arrayflags_fortran_get': numpy/core/src/arrayobject.c:11693: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:11693: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c: In function `arrayflags_updateifcopy_get': numpy/core/src/arrayobject.c:11694: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:11694: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c: In function `arrayflags_owndata_get': numpy/core/src/arrayobject.c:11695: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:11695: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c: In function `arrayflags_aligned_get': numpy/core/src/arrayobject.c:11696: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:11696: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c: In function `arrayflags_writeable_get': numpy/core/src/arrayobject.c:11697: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:11697: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c: In function `arrayflags_behaved_get': numpy/core/src/arrayobject.c:11699: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:11699: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c: In function `arrayflags_carray_get': numpy/core/src/arrayobject.c:11700: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:11700: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c: In function `arrayflags_forc_get': numpy/core/src/arrayobject.c:11709: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:11711: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c: In function `arrayflags_fnc_get': numpy/core/src/arrayobject.c:11724: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:11726: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c: In function `arrayflags_farray_get': numpy/core/src/arrayobject.c:11740: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:11742: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c: In function `arrayflags_updateifcopy_set': numpy/core/src/arrayobject.c:11764: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:11764: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c: In function `arrayflags_aligned_set': numpy/core/src/arrayobject.c:11779: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:11779: warning: dereferencing type-punned pointer will break strict-aliasing rules 
numpy/core/src/arrayobject.c: In function `arrayflags_writeable_set': numpy/core/src/arrayobject.c:11795: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/arrayobject.c:11795: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/multiarraymodule.c: In function `PyArray_ConvertToCommonType': numpy/core/src/multiarraymodule.c:2137: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/multiarraymodule.c: In function `PyArray_DescrConverter': numpy/core/src/multiarraymodule.c:5244: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/multiarraymodule.c:5246: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/multiarraymodule.c:5248: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/multiarraymodule.c:5250: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/multiarraymodule.c:5252: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/multiarraymodule.c:5254: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/multiarraymodule.c:5256: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/multiarraymodule.c:5258: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/multiarraymodule.c: In function `array_fromfile': numpy/core/src/multiarraymodule.c:6285: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/multiarraymodule.c: In function `array_can_cast_safely': numpy/core/src/multiarraymodule.c:7002: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/multiarraymodule.c:7002: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/multiarraymodule.c: In function `initmultiarray': numpy/core/src/multiarraymodule.c:7581: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/multiarraymodule.c:7583: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/multiarraymodule.c:7586: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/multiarraymodule.c:7588: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/multiarraymodule.c:7591: warning: dereferencing type-punned pointer will break strict-aliasing rules gcc -pthread -shared build/temp.linux-x86_64-2.5/numpy/core/src/multiarraymodule.o -L/nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/lib -lm -lm -lpython2.5 -o build/lib.linux-x86_64-2.5/numpy/core/multiarray.so building 'numpy.core.umath' extension compiling C sources C compiler: gcc -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC creating build/temp.linux-x86_64-2.5/build creating build/temp.linux-x86_64-2.5/build/src.linux-x86_64-2.5 creating build/temp.linux-x86_64-2.5/build/src.linux-x86_64-2.5/numpy creating build/temp.linux-x86_64-2.5/build/src.linux-x86_64-2.5/numpy/core creating build/temp.linux-x86_64-2.5/build/src.linux-x86_64-2.5/numpy/core/src compile options: '-Ibuild/src.linux-x86_64-2.5/numpy/core/src -Inumpy/core/include -Ibuild/src.linux-x86_64-2.5/numpy/core -I/nfs/acari/mh5/include/python2.5 -I/nfs/acari/mh5/include -I/nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/include/python2.5 
-Inumpy/core/src -Inumpy/core/include -c' gcc: build/src.linux-x86_64-2.5/numpy/core/src/umathmodule.c gcc -pthread -shared build/temp.linux-x86_64-2.5/build/src.linux-x86_64-2.5/numpy/core/src/umathmodule.o -L/nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/lib -lm -lpython2.5 -o build/lib.linux-x86_64-2.5/numpy/core/umath.so building 'numpy.core._sort' extension compiling C sources C compiler: gcc -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC compile options: '-Inumpy/core/include -Ibuild/src.linux-x86_64-2.5/numpy/core -I/nfs/acari/mh5/include/python2.5 -I/nfs/acari/mh5/include -I/nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/include/python2.5 -Inumpy/core/src -Inumpy/core/include -c' gcc: build/src.linux-x86_64-2.5/numpy/core/src/_sortmodule.c gcc -pthread -shared build/temp.linux-x86_64-2.5/build/src.linux-x86_64-2.5/numpy/core/src/_sortmodule.o -L/nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/lib -lm -lpython2.5 -o build/lib.linux-x86_64-2.5/numpy/core/_sort.so building 'numpy.core.scalarmath' extension compiling C sources C compiler: gcc -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC compile options: '-Inumpy/core/include -Ibuild/src.linux-x86_64-2.5/numpy/core -I/nfs/acari/mh5/include/python2.5 -I/nfs/acari/mh5/include -I/nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/include/python2.5 -Inumpy/core/src -Inumpy/core/include -c' gcc: build/src.linux-x86_64-2.5/numpy/core/src/scalarmathmodule.c numpy/core/src/scalarmathmodule.c.src: In function `alter_pyscalars': numpy/core/src/scalarmathmodule.c.src:1064: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalarmathmodule.c.src:1069: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalarmathmodule.c.src:1074: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalarmathmodule.c.src: In function `restore_pyscalars': numpy/core/src/scalarmathmodule.c.src:1099: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalarmathmodule.c.src:1104: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalarmathmodule.c.src:1109: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalarmathmodule.c.src: In function `use_pythonmath': numpy/core/src/scalarmathmodule.c.src:1133: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalarmathmodule.c.src:1138: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalarmathmodule.c.src:1143: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalarmathmodule.c.src: In function `use_scalarmath': numpy/core/src/scalarmathmodule.c.src:1167: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalarmathmodule.c.src:1172: warning: dereferencing type-punned pointer will break strict-aliasing rules numpy/core/src/scalarmathmodule.c.src:1177: warning: dereferencing type-punned pointer will break strict-aliasing rules gcc -pthread -shared build/temp.linux-x86_64-2.5/build/src.linux-x86_64-2.5/numpy/core/src/scalarmathmodule.o -L/nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/lib -lm -lpython2.5 -o build/lib.linux-x86_64-2.5/numpy/core/scalarmath.so building 'numpy.core._dotblas' extension compiling C sources C compiler: gcc -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC creating 
build/temp.linux-x86_64-2.5/numpy/core/blasdot compile options: '-DNO_ATLAS_INFO=1 -Inumpy/core/blasdot -Inumpy/core/include -Ibuild/src.linux-x86_64-2.5/numpy/core -I/nfs/acari/mh5/include/python2.5 -I/nfs/acari/mh5/include -I/nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/include/python2.5 -Inumpy/core/src -Inumpy/core/include -c' gcc: numpy/core/blasdot/_dotblas.c g77 build/temp.linux-x86_64-2.5/numpy/core/blasdot/_dotblas.o -L/usr/lib -L/nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/lib -lblas -lpython2.5 -lg2c-pic -o build/lib.linux-x86_64-2.5/numpy/core/_dotblas.so /usr/lib/libfrtbegin.a(frtbegin.o)(.text+0x1e): In function `main': : undefined reference to `MAIN__' collect2: ld returned 1 exit status /usr/lib/libfrtbegin.a(frtbegin.o)(.text+0x1e): In function `main': : undefined reference to `MAIN__' collect2: ld returned 1 exit status error: Command "g77 build/temp.linux-x86_64-2.5/numpy/core/blasdot/_dotblas.o -L/usr/lib -L/nfs/acari/mh5/arch/Linux-x86_64/opt/python-2.5.1/lib -lblas -lpython2.5 -lg2c-pic -o build/lib.linux-x86_64-2.5/numpy/core/_dotblas.so" failed with exit status 1 """ -- Michael Hoffman From mathewww at charter.net Sun Jul 1 17:03:12 2007 From: mathewww at charter.net (Mathew) Date: Sun, 01 Jul 2007 14:03:12 -0700 Subject: [Numpy-discussion] output from python setup.py -v config_fc --fcompiler=gnu95 build Message-ID: <46881690.1020702@charter.net> -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: log URL: From oliphant.travis at ieee.org Sat Jul 7 12:16:00 2007 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Sat, 07 Jul 2007 10:16:00 -0600 Subject: [Numpy-discussion] Should bool_ subclass int? In-Reply-To: References: <468F285D.30308@ieee.org> Message-ID: <468FBC40.8030006@ieee.org> > > > On 7/6/07, *Travis Oliphant* > wrote: > > Timothy Hochberg wrote: > > > > I'm working on getting some old code working with numpy and I > noticed > > that bool_ is not a subclass of int. Given that python's bool > > subclasses into and that the other scalar types are subclasses of > > their respective counterparts it seems at first glance that > > numpy.bool_ should subclass python's bool, which in turn subclasses > > int. Or am I missing something here? > The reason it is not, is because it is not binary compatible with > Python's integer. The numpy bool_ is always only 8-bits while the > Python integer is 32-bits or 64-bits. > > This could be changed I suspect, but then it would break the > relationship between scalars and their array counterparts > > > Do you have and idea off the top of your head head how painful this > would be from an implementation standpoint. And is there a theoretical > reason that it is important that the scalar and array implementations > match? I would think that, conceptually, they are all 1-bit integers, > and it seems that the 8-bit, versus 32- or 64-bits is just an > implementation detail. It would probably take about 2-3 hours to make the change and about 3 more hours to fix the problems that were not anticipated. Basically, we would have to special-case the bool like we do the unicode scalar (which also doesn't necessarily match the array-based representation but instead follows the Python implementation). I guess I don't really see a problem in switching just the numpy.bool_ scalar to be a sub-class of the Python bool type and adjusting the code to make the switch when creating a scalar. 
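For reference, a minimal check of the relationships being discussed (a sketch only; it assumes a NumPy 1.0.x-era install, and the integer itemsize is platform-dependent):

import numpy

# Most scalar types subclass their Python counterparts ...
print(issubclass(numpy.float64, float))    # True
# ... but bool_ currently does not subclass Python's bool (and hence not int).
print(issubclass(numpy.bool_, bool))       # False
# The storage mismatch mentioned above: bool_ is stored in a single byte,
# while a Python int is a C long (4 or 8 bytes depending on the platform).
print(numpy.dtype(numpy.bool_).itemsize)   # 1
print(numpy.dtype(numpy.int_).itemsize)    # 4 on 32-bit systems, 8 on 64-bit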
-Travis From tim.hochberg at ieee.org Sat Jul 7 12:10:54 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Sat, 7 Jul 2007 09:10:54 -0700 Subject: [Numpy-discussion] Should bool_ subclass int? In-Reply-To: <468FBC40.8030006@ieee.org> References: <468F285D.30308@ieee.org> <468FBC40.8030006@ieee.org> Message-ID: On 7/7/07, Travis Oliphant wrote: > > > > > > > > On 7/6/07, *Travis Oliphant* > > wrote: > > > > Timothy Hochberg wrote: > > > > > > I'm working on getting some old code working with numpy and I > > noticed > > > that bool_ is not a subclass of int. Given that python's bool > > > subclasses into and that the other scalar types are subclasses of > > > their respective counterparts it seems at first glance that > > > numpy.bool_ should subclass python's bool, which in turn > subclasses > > > int. Or am I missing something here? > > The reason it is not, is because it is not binary compatible with > > Python's integer. The numpy bool_ is always only 8-bits while the > > Python integer is 32-bits or 64-bits. > > > > This could be changed I suspect, but then it would break the > > relationship between scalars and their array counterparts > > > > > > Do you have and idea off the top of your head head how painful this > > would be from an implementation standpoint. And is there a theoretical > > reason that it is important that the scalar and array implementations > > match? I would think that, conceptually, they are all 1-bit integers, > > and it seems that the 8-bit, versus 32- or 64-bits is just an > > implementation detail. > It would probably take about 2-3 hours to make the change and about 3 > more hours to fix the problems that were not anticipated. Basically, > we would have to special-case the bool like we do the unicode scalar > (which also doesn't necessarily match the array-based representation but > instead follows the Python implementation). > > I guess I don't really see a problem in switching just the numpy.bool_ > scalar to be a sub-class of the Python bool type and adjusting the code > to make the switch when creating a scalar. Thanks for info. I'll put this on my list of things to look into, although it may take me a few weeks to get around to it, depending on how busy next week is. I don't see this as urgent, but it seems like a good change to make going forward. -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.hochberg at ieee.org Sat Jul 7 12:41:28 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Sat, 7 Jul 2007 09:41:28 -0700 Subject: [Numpy-discussion] Aligning an array on Windows In-Reply-To: <1180342043.2699.7.camel@localhost.localdomain> References: <1180031632.2585.42.camel@localhost.localdomain> <200705252039.14426.faltet@carabos.com> <1180342043.2699.7.camel@localhost.localdomain> Message-ID: [Much time passes] I went ahead and added this warning. I kept thinking I should write a more detailed explanation about why this is a problem, but I never got around to it. At least this let's people know that there are some dragons to be wary of. -tim On 5/28/07, Francesc Altet wrote: > > El dv 25 de 05 del 2007 a les 14:19 -0700, en/na Timothy Hochberg va > escriure: > > Don't feel bad, when I had a very similar problem early on when we > > were first adding multiple types and it mystified me for considerably > > longer than this seems to have stumped you. 
> > Well, I wouldn't say the truth if I say that this doesn't help ;) > > Anyway, I think that this piece of code is dangerous enough and in order > to avoid someone (including me!) tripping over it again, it would be > nice to apply the next 'patch': > > Index: interp_body.c > =================================================================== > --- interp_body.c (revision 3053) > +++ interp_body.c (working copy) > @@ -89,6 +89,9 @@ > unsigned int arg2 = params.program[pc+3]; > #define arg3 params.program[pc+5] > #define store_index params.index_data[store_in] > + /* WARNING: From now on, only do references to params.mem > [arg[123]] > + & params.memsteps[arg[123]] inside the VEC_ARG[123] macros, > + or you will risk accessing invalid addresses. */ > #define reduce_ptr (dest + flat_index(&store_index, j)) > #define i_reduce *(long *)reduce_ptr > #define f_reduce *(double *)reduce_ptr > > Cheers, > > -- > Francesc Altet | Be careful about using the following code -- > Carabos Coop. V. | I've only proven that it works, > www.carabos.com | I haven't tested it. -- Donald Knuth > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Jul 7 13:53:22 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 7 Jul 2007 11:53:22 -0600 Subject: [Numpy-discussion] Problems with RandomArray In-Reply-To: <9cf809a00706280231i1bc27212mcb12cc817df3c809@mail.gmail.com> References: <9cf809a00706280231i1bc27212mcb12cc817df3c809@mail.gmail.com> Message-ID: On 6/28/07, Alexander Dietz wrote: > > Hi, > > I am not sure how to subscribe to this mailing list (there was no link for > that, just this email adress), but hopefully someone will get this email and > can me subscribe to this list, answer my question or ask someone else. > > Anyway, here is my question: > > I am using python with matplotlib version 0.90.1 and with numpy (as > recommended), on a Linux box. So far matplotlib and numpy is working, but I > need to use RandomArray! So, RandomArray can be found in the "Numerical > Python" documentation and RandomArray.py can also be found within the > "numeric" directory. If including this to my PYTHONPATH variable I can > import RandomArray and also use some of the functions! But the function I am > interested in is: multivariate_normal. When I try to use this function > python stops responding, I have to kill python from outside! > Is there a way to make this function work? Or maybe there is a quick > workaround using functions in random and else? That would be really great! RandomArray is not supported in numpy, use random instead. In [55]: numpy.random.standard_normal((2,2)) Out[55]: array([[ 0.29469565, 0.10410348], [-0.93587919, 1.19499785]]) In [56]: numpy.random.normal(0,1,(2,2)) Out[56]: array([[ 1.0061501 , 1.13688667], [-0.01728606, -1.72602317]]) Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From robert.kern at gmail.com Sat Jul 7 14:25:39 2007 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 07 Jul 2007 13:25:39 -0500 Subject: [Numpy-discussion] problem compiling v.1.0.3 on a Mac In-Reply-To: References: <4D239651-778E-4CB6-A0CD-A0E48A53B6FE@comcast.net> <4686C04A.4020001@gmail.com> <6D0CD51C-BC14-4D63-B139-ED24D049CF5B@comcast.net> <4688288A.6010403@gmail.com> <46882CE3.2060702@gmail.com> Message-ID: <468FDAA3.2010305@gmail.com> John Cartwright wrote: > I tried to send that last night, but the message was so large that > it's waiting for approval. Here's the first part of the output: Do you have this directory: /Developer/SDKs/MacOSX10.4u.sdk If not, you might have to make sure you have the correct SDK installed. I think it's on the 10.4 installation CD with the name MacOSX1.4.Universal.pkg . -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Sat Jul 7 14:32:17 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 7 Jul 2007 12:32:17 -0600 Subject: [Numpy-discussion] Should bool_ subclass int? In-Reply-To: References: <468F285D.30308@ieee.org> <468FBC40.8030006@ieee.org> Message-ID: On 7/7/07, Timothy Hochberg wrote: > > > > On 7/7/07, Travis Oliphant wrote: > > > > > > > > > > > > > On 7/6/07, *Travis Oliphant* > > > wrote: > > > > > Here is a link to PEP 285 where Guido discusses his reasoning about the bool type. I note that boolean arrays behave as integers under addition of a scalar, but not under addition of boolean arrays, where '+' seems to mean 'or'. The latter looks inconsistent with the Python convention. In [60]: a Out[60]: array([ True, True, True, True], dtype=bool) In [61]: a + a Out[61]: array([ True, True, True, True], dtype=bool) In [62]: a + 1 Out[62]: array([2, 2, 2, 2]) In [66]: True + True Out[66]: 2 Now might be a good time to discuss and document these choices. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From mpmusu at cc.usu.edu Sat Jul 7 14:37:59 2007 From: mpmusu at cc.usu.edu (Mark.Miller) Date: Sat, 07 Jul 2007 12:37:59 -0600 Subject: [Numpy-discussion] fancy indexing/broadcasting question Message-ID: <468FDD87.10400@cc.usu.edu> A quick question for the group. I'm working with some code to generate some arrays of random numbers. The random numbers, however, need to meet certain criteria. So for the moment, I have things that look like this (code is just an abstraction): import numpy normal=numpy.random.normal RNDarray = normal(25,15,(50,50)) tmp1 = (RNDarray < 0) | (RNDarray > 25) while tmp1.any(): print tmp1.size, tmp1.shape, tmp1[tmp1].size RNDarray[tmp1] = normal(5,3, size = RNDarray[tmp1].size) tmp1 = (RNDarray < 0) | (RNDarray > 25) This code works well. However, it might be considered inefficient because, for each iteration of the while loop, all values get reevaluated even if they have previously met the criteria encapsulated in tmp1. It would be better if, for each cycle of the while loop, only those elements that have previously not met criteria get reevaluated. I tried a few things that haven't worked. 
An example is provided below (changed code marked with *): RNDarray = normal(25,15,(50,50)) tmp1 = (RNDarray < 0) | (RNDarray > 25) while tmp1.any(): print tmp1.size, tmp1.shape, tmp1[tmp1].size RNDarray[tmp1] = normal(5,3, size = RNDarray[tmp1].size) * tmp1 = (RNDarray[tmp1] < 0) | (RNDarray[tmp1] > 25) I see why this doesn't work properly. When tmp1 is reevaluated in the while loop, it returns an array with a different shape. Does anyone have any suggestions for improvement here? Please let me know if any of this requires clarification. Thanks, -Mark From mpmusu at cc.usu.edu Sat Jul 7 14:41:22 2007 From: mpmusu at cc.usu.edu (Mark.Miller) Date: Sat, 07 Jul 2007 12:41:22 -0600 Subject: [Numpy-discussion] fancy indexing/broadcasting question In-Reply-To: <468FDD87.10400@cc.usu.edu> References: <468FDD87.10400@cc.usu.edu> Message-ID: <468FDE52.7060302@cc.usu.edu> Sorry...here's a minor correction to the code. #1st part import numpy normal=numpy.random.normal RNDarray = normal(25,15,(50,50)) tmp1 = (RNDarray < 0) | (RNDarray > 25) while tmp1.any(): print tmp1.size, tmp1.shape, tmp1[tmp1].size RNDarray[tmp1] = normal(25,15, size = RNDarray[tmp1].size) tmp1 = (RNDarray < 0) | (RNDarray > 25) #2nd part import numpy normal=numpy.random.normal RNDarray = normal(25,15,(50,50)) tmp1 = (RNDarray < 0) | (RNDarray > 25) while tmp1.any(): print tmp1.size, tmp1.shape, tmp1[tmp1].size RNDarray[tmp1] = normal(25,15, size = RNDarray[tmp1].size) * tmp1 = (RNDarray[tmp1] < 0) | (RNDarray[tmp1] > 25) Mark.Miller wrote: > A quick question for the group. I'm working with some code to generate > some arrays of random numbers. The random numbers, however, need to > meet certain criteria. So for the moment, I have things that look like > this (code is just an abstraction): > > import numpy > normal=numpy.random.normal > > RNDarray = normal(25,15,(50,50)) > tmp1 = (RNDarray < 0) | (RNDarray > 25) > while tmp1.any(): > print tmp1.size, tmp1.shape, tmp1[tmp1].size > RNDarray[tmp1] = normal(5,3, size = RNDarray[tmp1].size) > tmp1 = (RNDarray < 0) | (RNDarray > 25) > > This code works well. However, it might be considered inefficient > because, for each iteration of the while loop, all values get > reevaluated even if they have previously met the criteria encapsulated > in tmp1. It would be better if, for each cycle of the while loop, only > those elements that have previously not met criteria get reevaluated. > > I tried a few things that haven't worked. An example is provided below > (changed code marked with *): > > RNDarray = normal(25,15,(50,50)) > tmp1 = (RNDarray < 0) | (RNDarray > 25) > while tmp1.any(): > print tmp1.size, tmp1.shape, tmp1[tmp1].size > RNDarray[tmp1] = normal(5,3, size = RNDarray[tmp1].size) > * tmp1 = (RNDarray[tmp1] < 0) | (RNDarray[tmp1] > 25) > > I see why this doesn't work properly. When tmp1 is reevaluated in the > while loop, it returns an array with a different shape. > > Does anyone have any suggestions for improvement here? Please let me > know if any of this requires clarification. > > Thanks, > > -Mark > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion -- Dr. Mark P. 
Miller Department of Biology 5305 Old Main Hill Utah State University Logan, UT 84322-5305 USA ><><><><><><><><><><><><><><><>< http://www.biology.usu.edu/people/facultyinfo.asp?username=mpmbio http://www.marksgeneticsoftware.net From aisaac at american.edu Sat Jul 7 15:02:20 2007 From: aisaac at american.edu (Alan G Isaac) Date: Sat, 7 Jul 2007 15:02:20 -0400 Subject: [Numpy-discussion] Should bool_ subclass int? In-Reply-To: References: <468F285D.30308@ieee.org><468FBC40.8030006@ieee.org> Message-ID: On Sat, 7 Jul 2007, Charles R Harris apparently wrote: > In [60]: a > Out[60]: array([ True, True, True, True], dtype=bool) > In [61]: a + a > Out[61]: array([ True, True, True, True], dtype=bool) Yea! Behaves like a boolean array. And for multiplication to. And in boolean matrices, powers work right. (I use this.) > In [62]: a + 1 > Out[62]: array([2, 2, 2, 2]) Yea! Coercion to int, as expected. > In [66]: True + True > Out[66]: 2 Boo! Hopefully Python will "fix" this one day. Cheers, Alan Isaac From robert.kern at gmail.com Sat Jul 7 14:59:17 2007 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 07 Jul 2007 13:59:17 -0500 Subject: [Numpy-discussion] fancy indexing/broadcasting question In-Reply-To: <468FDE52.7060302@cc.usu.edu> References: <468FDD87.10400@cc.usu.edu> <468FDE52.7060302@cc.usu.edu> Message-ID: <468FE285.6080204@gmail.com> Mark.Miller wrote: > Sorry...here's a minor correction to the code. > > #1st part > import numpy > normal=numpy.random.normal > > RNDarray = normal(25,15,(50,50)) > tmp1 = (RNDarray < 0) | (RNDarray > 25) > while tmp1.any(): > print tmp1.size, tmp1.shape, tmp1[tmp1].size > RNDarray[tmp1] = normal(25,15, size = RNDarray[tmp1].size) > tmp1 = (RNDarray < 0) | (RNDarray > 25) > > #2nd part > import numpy > normal=numpy.random.normal > > RNDarray = normal(25,15,(50,50)) > tmp1 = (RNDarray < 0) | (RNDarray > 25) > while tmp1.any(): > print tmp1.size, tmp1.shape, tmp1[tmp1].size > RNDarray[tmp1] = normal(25,15, size = RNDarray[tmp1].size) > * tmp1 = (RNDarray[tmp1] < 0) | (RNDarray[tmp1] > 25) The reason is that tmp1 is no longer a mask into RNDarray, but into RNDarray[tmp1] (the old tmp1). For something as small as (50, 50) and simple criteria (no looping), the first version will probably be faster than any attempt to optimize it. However, if you do have larger arrays or slower criteria, you can reduce the size of the re-evaluated array pretty simply. I'm still not sure it will be faster, but here it is: import numpy normal = numpy.random.normal RNDarray = normal(25, 15, (50, 50)) badmask = (RNDarray < 0) | (RNDarray > 25) nbad = badmask.sum() while nbad > 0: new = normal(25, 15, size=nbad) RNDarray[badmask] = new newbad = (new < 0) | (new > 25) badmask[badmask] = newbad nbad = newbad.sum() -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From peridot.faceted at gmail.com Sat Jul 7 15:13:59 2007 From: peridot.faceted at gmail.com (Anne Archibald) Date: Sat, 7 Jul 2007 15:13:59 -0400 Subject: [Numpy-discussion] fancy indexing/broadcasting question In-Reply-To: <468FDD87.10400@cc.usu.edu> References: <468FDD87.10400@cc.usu.edu> Message-ID: On 07/07/07, Mark.Miller wrote: > A quick question for the group. I'm working with some code to generate > some arrays of random numbers. The random numbers, however, need to > meet certain criteria. 
So for the moment, I have things that look like > this (code is just an abstraction): > > import numpy > normal=numpy.random.normal > > RNDarray = normal(25,15,(50,50)) > tmp1 = (RNDarray < 0) | (RNDarray > 25) > while tmp1.any(): > print tmp1.size, tmp1.shape, tmp1[tmp1].size > RNDarray[tmp1] = normal(5,3, size = RNDarray[tmp1].size) > tmp1 = (RNDarray < 0) | (RNDarray > 25) > > This code works well. However, it might be considered inefficient > because, for each iteration of the while loop, all values get > reevaluated even if they have previously met the criteria encapsulated > in tmp1. It would be better if, for each cycle of the while loop, only > those elements that have previously not met criteria get reevaluated. You can write a quick recursive function to do it: In [33]: def gen_condition(n): ....: A = normal(size=n) ....: c = abs(A)>1 ....: subn = sum(c) ....: if subn>0: ....: A[c] = gen_condition(subn) ....: return A ....: Probably not ideal if it's going to take many tries, but it's pretty clear. Anne From charlesr.harris at gmail.com Sat Jul 7 15:21:43 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 7 Jul 2007 13:21:43 -0600 Subject: [Numpy-discussion] Another change in default type of ones and zeros. Message-ID: Hi All, Originally, ones and zeros defaulted to integer, later the default changed to float, now it looks like it is integer again. In [80]: ones(2).dtype Out[80]: dtype('int32') In [81]: zeros(2).dtype Out[81]: dtype('int32') In [82]: __version__ Out[82]: '1.0.4.dev3880' This could break some code. Did I miss a decision somewhere along the line? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From mpmusu at cc.usu.edu Sat Jul 7 15:41:52 2007 From: mpmusu at cc.usu.edu (Mark.Miller) Date: Sat, 07 Jul 2007 13:41:52 -0600 Subject: [Numpy-discussion] fancy indexing/broadcasting question In-Reply-To: <468FE285.6080204@gmail.com> References: <468FDD87.10400@cc.usu.edu> <468FDE52.7060302@cc.usu.edu> <468FE285.6080204@gmail.com> Message-ID: <468FEC80.1010809@cc.usu.edu> That seems to do the trick. Runtimes are reduced by 15-20% in my code. Robert Kern wrote: > The reason is that tmp1 is no longer a mask into RNDarray, but into > RNDarray[tmp1] (the old tmp1). For something as small as (50, 50) and simple > criteria (no looping), the first version will probably be faster than any > attempt to optimize it. > > However, if you do have larger arrays or slower criteria, you can reduce the > size of the re-evaluated array pretty simply. I'm still not sure it will be > faster, but here it is: > > > import numpy > normal = numpy.random.normal > > RNDarray = normal(25, 15, (50, 50)) > badmask = (RNDarray < 0) | (RNDarray > 25) > nbad = badmask.sum() > while nbad > 0: > new = normal(25, 15, size=nbad) > RNDarray[badmask] = new > newbad = (new < 0) | (new > 25) > badmask[badmask] = newbad > nbad = newbad.sum() > From charlesr.harris at gmail.com Sat Jul 7 16:37:35 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 7 Jul 2007 14:37:35 -0600 Subject: [Numpy-discussion] Another change in default type of ones and zeros. In-Reply-To: References: Message-ID: On 7/7/07, Charles R Harris wrote: > > Hi All, > > Originally, ones and zeros defaulted to integer, later the default changed > to float, now it looks like it is integer again. 
> > In [80]: ones(2).dtype > Out[80]: dtype('int32') > > In [81]: zeros(2).dtype > Out[81]: dtype('int32') > > In [82]: __version__ > Out[82]: '1.0.4.dev3880' > > > This could break some code. Did I miss a decision somewhere along the > line? This seems to be a problem with ipython -pylab choosing the old compatibility mode. I thought that was going away. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Jul 7 17:34:14 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 7 Jul 2007 15:34:14 -0600 Subject: [Numpy-discussion] Another change in default type of ones and zeros. In-Reply-To: References: Message-ID: On 7/7/07, Charles R Harris wrote: > > > > On 7/7/07, Charles R Harris wrote: > > > > Hi All, > > > > Originally, ones and zeros defaulted to integer, later the default > > changed to float, now it looks like it is integer again. > > > > In [80]: ones(2).dtype > > Out[80]: dtype('int32') > > > > In [81]: zeros(2).dtype > > Out[81]: dtype('int32') > > > > In [82]: __version__ > > Out[82]: '1.0.4.dev3880' > > > > > > This could break some code. Did I miss a decision somewhere along the > > line? > > > This seems to be a problem with ipython -pylab choosing the old > compatibility mode. I thought that was going away. > It depends on whether ipython is invoked as ipython -pylab -p numeric or as ipython -p numeric -pylab The first uses the numeric compatibility layer of MPL, the second gives you numpy. Hmm.... Chuck Chuck > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From v-nijs at kellogg.northwestern.edu Sat Jul 7 18:18:05 2007 From: v-nijs at kellogg.northwestern.edu (Vincent Nijs) Date: Sat, 07 Jul 2007 17:18:05 -0500 Subject: [Numpy-discussion] convert csv file into recarray without pre-specifying dtypes and variable names In-Reply-To: <88e473830707061853k24fe56d9mae19660c8813991a@mail.gmail.com> Message-ID: Thanks for the reference John! csv2rec is about 30% faster than my code on the same data. If I read the code in csv2rec correctly it converts the data as it is being read using the csv modules. My setup reads in the whole dataset into an array of strings and then converts the columns as appropriate. Best, Vincent On 7/6/07 8:53 PM, "John Hunter" wrote: > On 7/6/07, Vincent Nijs wrote: >> I wrote the attached (small) program to read in a text/csv file with >> different data types and convert it into a recarray without having to >> pre-specify the dtypes or variables names. I am just too lazy to type-in >> stuff like that :) The supported types are int, float, dates, and strings. >> >> I works pretty well but it is not (yet) as fast as I would like so I was >> wonder if any of the numpy experts on this list might have some suggestion >> on how to speed it up. I need to read 500MB-1GB files so speed is important >> for me. > > In matplotlib.mlab svn, there is a function csv2rec that does the > same. You may want to compare implementations in case we can > fruitfully cross pollinate them. 
In the examples directy, there is an > example script examples/loadrec.py > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From torgil.svensson at gmail.com Sun Jul 8 06:52:35 2007 From: torgil.svensson at gmail.com (Torgil Svensson) Date: Sun, 8 Jul 2007 12:52:35 +0200 Subject: [Numpy-discussion] convert csv file into recarray without pre-specifying dtypes and variable names In-Reply-To: References: <88e473830707061853k24fe56d9mae19660c8813991a@mail.gmail.com> Message-ID: Given that both your script and the mlab version preloads the whole file before calling numpy constructor I'm curious how that compares in speed to using numpy's fromiter function on your data. Using fromiter should improve on memory usage (~50% ?). The drawback is for string columns where we don't longer know the width of the largest item. I made it fall-back to "object" in this case. Attached is a fromiter version of your script. Possible speedups could be done by trying different approaches to the "convert_row" function, for example using "zip" or "enumerate" instead of "izip". Best Regards, //Torgil On 7/8/07, Vincent Nijs wrote: > Thanks for the reference John! csv2rec is about 30% faster than my code on > the same data. > > If I read the code in csv2rec correctly it converts the data as it is being > read using the csv modules. My setup reads in the whole dataset into an > array of strings and then converts the columns as appropriate. > > Best, > > Vincent > > > On 7/6/07 8:53 PM, "John Hunter" wrote: > > > On 7/6/07, Vincent Nijs wrote: > >> I wrote the attached (small) program to read in a text/csv file with > >> different data types and convert it into a recarray without having to > >> pre-specify the dtypes or variables names. I am just too lazy to type-in > >> stuff like that :) The supported types are int, float, dates, and strings. > >> > >> I works pretty well but it is not (yet) as fast as I would like so I was > >> wonder if any of the numpy experts on this list might have some suggestion > >> on how to speed it up. I need to read 500MB-1GB files so speed is important > >> for me. > > > > In matplotlib.mlab svn, there is a function csv2rec that does the > > same. You may want to compare implementations in case we can > > fruitfully cross pollinate them. In the examples directy, there is an > > example script examples/loadrec.py > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed... 
Name: load_iter.py Type: text/x-python Size: 2539 bytes Desc: not available URL: From tim.hochberg at ieee.org Sun Jul 8 13:08:16 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Sun, 8 Jul 2007 10:08:16 -0700 Subject: [Numpy-discussion] convert csv file into recarray without pre-specifying dtypes and variable names In-Reply-To: References: <88e473830707061853k24fe56d9mae19660c8813991a@mail.gmail.com> Message-ID: On 7/8/07, Torgil Svensson wrote: > > Given that both your script and the mlab version preloads the whole > file before calling numpy constructor I'm curious how that compares in > speed to using numpy's fromiter function on your data. Using fromiter > should improve on memory usage (~50% ?). > > The drawback is for string columns where we don't longer know the > width of the largest item. I made it fall-back to "object" in this > case. > > Attached is a fromiter version of your script. Possible speedups could > be done by trying different approaches to the "convert_row" function, > for example using "zip" or "enumerate" instead of "izip". I suspect that you'd do better here if you removed a bunch of layers from the conversion functions. Right now it looks like: imap->chain->convert_row->tuple->generator->izip. That's five levels deep and Python functions are reasonably expensive. I would try to be a lot less clever and do something like: def data_iterator(row_iter, delim): row0 = row_iter.next().split(delim) converters = find_formats(row0) # left as an exercise yield tuple(f(x) for f, x in zip(conversion_functions, row0)) for row in row_iter: yield tuple(f(x) for f, x in zip(conversion_functions, row0)) That's just a sketch and I haven't timed it, but it cuts a few levels out of the call chain, so has a reasonable chance of being faster. If you wanted to be really clever, you could use some exec magic after you figure out the conversion functions to compile a special function that generates the tuples directly without any use of tuple or zip. I don't have time to work through the details right now, but the code you would compile would end up looking this: for (x0, x1, x2) in row_iter: yield (int(x0), float(x1), float(x2)) Here we've assumed that find_formats determined that there are three fields, an int and two floats. Once you have this info you can build an appropriate function and exec it. This would cut another couple levels out of the call chain. Again, I haven't timed it, or tried it, but it looks like it would be fun to try. -tim > > > On 7/8/07, Vincent Nijs wrote: > > Thanks for the reference John! csv2rec is about 30% faster than my code > on > > the same data. > > > > If I read the code in csv2rec correctly it converts the data as it is > being > > read using the csv modules. My setup reads in the whole dataset into an > > array of strings and then converts the columns as appropriate. > > > > Best, > > > > Vincent > > > > > > On 7/6/07 8:53 PM, "John Hunter" wrote: > > > > > On 7/6/07, Vincent Nijs wrote: > > >> I wrote the attached (small) program to read in a text/csv file with > > >> different data types and convert it into a recarray without having to > > >> pre-specify the dtypes or variables names. I am just too lazy to > type-in > > >> stuff like that :) The supported types are int, float, dates, and > strings. > > >> > > >> I works pretty well but it is not (yet) as fast as I would like so I > was > > >> wonder if any of the numpy experts on this list might have some > suggestion > > >> on how to speed it up. 
I need to read 500MB-1GB files so speed is > important > > >> for me. > > > > > > In matplotlib.mlab svn, there is a function csv2rec that does the > > > same. You may want to compare implementations in case we can > > > fruitfully cross pollinate them. In the examples directy, there is an > > > example script examples/loadrec.py > > > _______________________________________________ > > > Numpy-discussion mailing list > > > Numpy-discussion at scipy.org > > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From v-nijs at kellogg.northwestern.edu Sun Jul 8 14:15:11 2007 From: v-nijs at kellogg.northwestern.edu (Vincent Nijs) Date: Sun, 08 Jul 2007 13:15:11 -0500 Subject: [Numpy-discussion] convert csv file into recarray without pre-specifying dtypes and variable names In-Reply-To: Message-ID: I am not (yet) very familiar with much of the functionality introduced in your script Torgil (izip, imap, etc.), but I really appreciate you taking the time to look at this! The program stopped with the following error: File "load_iter.py", line 48, in convert_row=lambda r: tuple(fn(x) for fn,x in izip(conversion_functions,r)) ValueError: invalid literal for int() with base 10: '2174.875' A lot of the data I use can have a column with a set of int?s (e.g., 0?s), but then the rest of that same column could be floats. I guess finding the right conversion function is the tricky part. I was thinking about sampling each, say, 10th obs to test which function to use. Not sure how that would work however. If I ignore the option of an int (i.e., everything is a float, date, or string) then your script is about twice as fast as mine!! Question: If you do ignore the int's initially, once the rec array is in memory, would there be a quick way to check if the floats could pass as int's? This may seem like a backwards approach but it might be 'safer' if you really want to preserve the int's. Thanks again! Vincent On 7/8/07 5:52 AM, "Torgil Svensson" wrote: > Given that both your script and the mlab version preloads the whole > file before calling numpy constructor I'm curious how that compares in > speed to using numpy's fromiter function on your data. Using fromiter > should improve on memory usage (~50% ?). > > The drawback is for string columns where we don't longer know the > width of the largest item. I made it fall-back to "object" in this > case. > > Attached is a fromiter version of your script. Possible speedups could > be done by trying different approaches to the "convert_row" function, > for example using "zip" or "enumerate" instead of "izip". > > Best Regards, > > //Torgil > > > On 7/8/07, Vincent Nijs wrote: >> Thanks for the reference John! csv2rec is about 30% faster than my code on >> the same data. >> >> If I read the code in csv2rec correctly it converts the data as it is being >> read using the csv modules. My setup reads in the whole dataset into an >> array of strings and then converts the columns as appropriate. 
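The attached load_iter.py is scrubbed from the archive, so here is a rough sketch of what the fromiter approach under discussion can look like. The names guess_converter and load_csv are made up for illustration, string columns are left out (their width is not known up front, which is why the attachment falls back to "object"), and the sketch has exactly the problem reported above: a column whose first value looks like an int will fail as soon as a later row has decimals.

import numpy as np
from matplotlib.dates import datestr2num

def guess_converter(value):
    # try int, then float, then a date parser, as discussed in the thread
    for convert in (int, float, datestr2num):
        try:
            convert(value)
            return convert
        except (ValueError, TypeError):
            pass
    raise ValueError("no converter found for %r" % value)

def load_csv(filename, delim=','):
    f = open(filename)
    names = f.next().strip().split(delim)   # header row (Python 2 file iterator)
    first = f.next().strip().split(delim)   # first data row decides the types
    converters = [guess_converter(v) for v in first]
    dtype = np.dtype([(name, np.int_ if conv is int else np.float64)
                      for name, conv in zip(names, converters)])
    def rows():
        yield tuple(conv(v) for conv, v in zip(converters, first))
        for line in f:
            yield tuple(conv(v) for conv, v in
                        zip(converters, line.strip().split(delim)))
    return np.fromiter(rows(), dtype=dtype).view(np.recarray)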
>> >> Best, >> >> Vincent >> >> >> On 7/6/07 8:53 PM, "John Hunter" wrote: >> >>> On 7/6/07, Vincent Nijs wrote: >>>> I wrote the attached (small) program to read in a text/csv file with >>>> different data types and convert it into a recarray without having to >>>> pre-specify the dtypes or variables names. I am just too lazy to type-in >>>> stuff like that :) The supported types are int, float, dates, and strings. >>>> >>>> I works pretty well but it is not (yet) as fast as I would like so I was >>>> wonder if any of the numpy experts on this list might have some suggestion >>>> on how to speed it up. I need to read 500MB-1GB files so speed is important >>>> for me. >>> >>> In matplotlib.mlab svn, there is a function csv2rec that does the >>> same. You may want to compare implementations in case we can >>> fruitfully cross pollinate them. In the examples directy, there is an >>> example script examples/loadrec.py >>> _______________________________________________ >>> Numpy-discussion mailing list >>> Numpy-discussion at scipy.org >>> http://projects.scipy.org/mailman/listinfo/numpy-discussion >>> >> >> >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://projects.scipy.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion -- Vincent R. Nijs Assistant Professor of Marketing Kellogg School of Management, Northwestern University 2001 Sheridan Road, Evanston, IL 60208-2001 Phone: +1-847-491-4574 Fax: +1-847-491-2498 E-mail: v-nijs at kellogg.northwestern.edu Skype: vincentnijs From torgil.svensson at gmail.com Sun Jul 8 16:15:15 2007 From: torgil.svensson at gmail.com (Torgil Svensson) Date: Sun, 8 Jul 2007 22:15:15 +0200 Subject: [Numpy-discussion] convert csv file into recarray without pre-specifying dtypes and variable names In-Reply-To: References: <88e473830707061853k24fe56d9mae19660c8813991a@mail.gmail.com> Message-ID: On 7/8/07, Timothy Hochberg wrote: > > > On 7/8/07, Torgil Svensson wrote: > > Given that both your script and the mlab version preloads the whole > > file before calling numpy constructor I'm curious how that compares in > > speed to using numpy's fromiter function on your data. Using fromiter > > should improve on memory usage (~50% ?). > > > > The drawback is for string columns where we don't longer know the > > width of the largest item. I made it fall-back to "object" in this > > case. > > > > Attached is a fromiter version of your script. Possible speedups could > > be done by trying different approaches to the "convert_row" function, > > for example using "zip" or "enumerate" instead of "izip". > > I suspect that you'd do better here if you removed a bunch of layers from > the conversion functions. Right now it looks like: > imap->chain->convert_row->tuple->generator->izip. That's > five levels deep and Python functions are reasonably expensive. I would try > to be a lot less clever and do something like: > > def data_iterator(row_iter, delim): > row0 = row_iter.next().split(delim) > converters = find_formats(row0) # left as an exercise > yield tuple(f(x) for f, x in zip(conversion_functions, row0)) > for row in row_iter: > yield tuple(f(x) for f, x in zip(conversion_functions, row0)) > That sounds sane. I've maybe been attracted to bad habits here and got away with it since i'm very i/o-bound in these cases. 
My main objective has been reducing memory footprint to reduce swapping. > That's just a sketch and I haven't timed it, but it cuts a few levels out of > the call chain, so has a reasonable chance of being faster. If you wanted to > be really clever, you could use some exec magic after you figure out the > conversion functions to compile a special function that generates the tuples > directly without any use of tuple or zip. I don't have time to work through > the details right now, but the code you would compile would end up looking > this: > > for (x0, x1, x2) in row_iter: > yield (int(x0), float(x1), float(x2)) > > Here we've assumed that find_formats determined that there are three fields, > an int and two floats. Once you have this info you can build an appropriate > function and exec it. This would cut another couple levels out of the call > chain. Again, I haven't timed it, or tried it, but it looks like it would be > fun to try. > > -tim > Thank you for the lesson! Great tip. This opens up for a variety of new coding options. I've made an attempt on the fun part. Attached are a version that generates the following generator code for Vincent's __main__=='__name__' - code: def get_data_iterator(row_iter,delim): yield (int('1'),int('3'),datestr2num('1/97'),float('1.12'),float('2.11'),float('1.2')) for row in row_iter: x0,x1,x2,x3,x4,x5=row.split(delim) yield (int(x0),int(x1),datestr2num(x2),float(x3),float(x4),float(x5)) Best Regards, //Torgil -------------- next part -------------- A non-text attachment was scrubbed... Name: load_gen_iter.py Type: text/x-python Size: 2823 bytes Desc: not available URL: From barrywark at gmail.com Sun Jul 8 16:30:43 2007 From: barrywark at gmail.com (Barry Wark) Date: Sun, 8 Jul 2007 13:30:43 -0700 Subject: [Numpy-discussion] Buildbot for numpy In-Reply-To: References: Message-ID: Stefan, No worries. I thought it was something like that. Any thoughts on my other questions? I'd love to have some ammunition to take to my boss. Thanks, Barry On 7/7/07, stefan wrote: > > On Mon, 2 Jul 2007 17:26:15 -0700, "Barry Wark" > wrote: > > On a side note, buildbot.scipy.org goes to the DSP lab, Univ. of > > Stellenbosch's home page, not the buildbot status page. > > Sorry about that -- I misconfigured Apache. Everything should be fine now. > > Cheers > St?fan > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From torgil.svensson at gmail.com Sun Jul 8 16:31:41 2007 From: torgil.svensson at gmail.com (Torgil Svensson) Date: Sun, 8 Jul 2007 22:31:41 +0200 Subject: [Numpy-discussion] convert csv file into recarray without pre-specifying dtypes and variable names In-Reply-To: References: Message-ID: Hi I stumble on these types of problems from time to time so I'm interested in efficient solutions myself. Do you have a column which starts with something suitable for int on the first row (without decimal separator) but has decimals further down? This will be little tricky to support. One solution could be to yield StopIteration, calculate new type-conversion-functions and start over iterating over both the old data and the rest of the iterator. It'd be great if you could try the load_gen_iter.py I've attached to my response to Tim. 
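The load_gen_iter.py attachment is also scrubbed, so the code-generation step behind output like the get_data_iterator above is not shown in the archive. A guess at how it could be done (build_row_iterator is a made-up name, and this is an illustration rather than the attached code):

def build_row_iterator(converters, delim):
    # converters is e.g. [int, int, datestr2num, float, float, float]
    names = ['x%d' % i for i in range(len(converters))]
    calls = ','.join('%s(%s)' % (f.__name__, n)
                     for f, n in zip(converters, names))
    src = ('def row_iterator(row_iter):\n'
           '    for row in row_iter:\n'
           '        %s, = row.split(%r)\n'   # trailing comma so one column still unpacks
           '        yield (%s)\n') % (','.join(names), delim, calls)
    namespace = dict((f.__name__, f) for f in converters)
    exec(src, namespace)   # this form is accepted by both Python 2 and 3
    return namespace['row_iterator']

Calling build_row_iterator([int, int, datestr2num, float, float, float], ',') returns a generator function roughly equivalent to the generated code quoted above.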
Best Regards, //Torgil On 7/8/07, Vincent Nijs wrote: > I am not (yet) very familiar with much of the functionality introduced in > your script Torgil (izip, imap, etc.), but I really appreciate you taking > the time to look at this! > > The program stopped with the following error: > > File "load_iter.py", line 48, in > convert_row=lambda r: tuple(fn(x) for fn,x in > izip(conversion_functions,r)) > ValueError: invalid literal for int() with base 10: '2174.875' > > A lot of the data I use can have a column with a set of int?s (e.g., 0?s), > but then the rest of that same column could be floats. I guess finding the > right conversion function is the tricky part. I was thinking about sampling > each, say, 10th obs to test which function to use. Not sure how that would > work however. > > If I ignore the option of an int (i.e., everything is a float, date, or > string) then your script is about twice as fast as mine!! > > Question: If you do ignore the int's initially, once the rec array is in > memory, would there be a quick way to check if the floats could pass as > int's? This may seem like a backwards approach but it might be 'safer' if > you really want to preserve the int's. > > Thanks again! > > Vincent > > > On 7/8/07 5:52 AM, "Torgil Svensson" wrote: > > > Given that both your script and the mlab version preloads the whole > > file before calling numpy constructor I'm curious how that compares in > > speed to using numpy's fromiter function on your data. Using fromiter > > should improve on memory usage (~50% ?). > > > > The drawback is for string columns where we don't longer know the > > width of the largest item. I made it fall-back to "object" in this > > case. > > > > Attached is a fromiter version of your script. Possible speedups could > > be done by trying different approaches to the "convert_row" function, > > for example using "zip" or "enumerate" instead of "izip". > > > > Best Regards, > > > > //Torgil > > > > > > On 7/8/07, Vincent Nijs wrote: > >> Thanks for the reference John! csv2rec is about 30% faster than my code on > >> the same data. > >> > >> If I read the code in csv2rec correctly it converts the data as it is being > >> read using the csv modules. My setup reads in the whole dataset into an > >> array of strings and then converts the columns as appropriate. > >> > >> Best, > >> > >> Vincent > >> > >> > >> On 7/6/07 8:53 PM, "John Hunter" wrote: > >> > >>> On 7/6/07, Vincent Nijs wrote: > >>>> I wrote the attached (small) program to read in a text/csv file with > >>>> different data types and convert it into a recarray without having to > >>>> pre-specify the dtypes or variables names. I am just too lazy to type-in > >>>> stuff like that :) The supported types are int, float, dates, and strings. > >>>> > >>>> I works pretty well but it is not (yet) as fast as I would like so I was > >>>> wonder if any of the numpy experts on this list might have some suggestion > >>>> on how to speed it up. I need to read 500MB-1GB files so speed is important > >>>> for me. > >>> > >>> In matplotlib.mlab svn, there is a function csv2rec that does the > >>> same. You may want to compare implementations in case we can > >>> fruitfully cross pollinate them. 
In the examples directy, there is an > >>> example script examples/loadrec.py > >>> _______________________________________________ > >>> Numpy-discussion mailing list > >>> Numpy-discussion at scipy.org > >>> http://projects.scipy.org/mailman/listinfo/numpy-discussion > >>> > >> > >> > >> _______________________________________________ > >> Numpy-discussion mailing list > >> Numpy-discussion at scipy.org > >> http://projects.scipy.org/mailman/listinfo/numpy-discussion > >> > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > -- > Vincent R. Nijs > Assistant Professor of Marketing > Kellogg School of Management, Northwestern University > 2001 Sheridan Road, Evanston, IL 60208-2001 > Phone: +1-847-491-4574 Fax: +1-847-491-2498 > E-mail: v-nijs at kellogg.northwestern.edu > Skype: vincentnijs > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From v-nijs at kellogg.northwestern.edu Sun Jul 8 17:36:12 2007 From: v-nijs at kellogg.northwestern.edu (Vincent Nijs) Date: Sun, 08 Jul 2007 16:36:12 -0500 Subject: [Numpy-discussion] convert csv file into recarray without pre-specifying dtypes and variable names In-Reply-To: Message-ID: Torgil, The function seems to work well and is slightly faster than your previous version (about 1/6th faster). Yes, I do have columns that start with, what looks like, int's and then turn out to be floats. Something like below (col6). data = [['col1', 'col2', 'col3', 'col4', 'col5', 'col6'], ['1','3','1/97','1.12','2.11','0'], ['1','2','3/97','1.21','3.12','0'], ['2','1','2/97','1.12','2.11','0'], ['2','2','4/97','1.33','2.26','1.23'], ['2','2','5/97','1.73','2.42','1.26']] I think what your function assumes is that the 1st element will be the appropriate type. That may not hold if you have missing values or 'mixed types'. Best, Vincent On 7/8/07 3:31 PM, "Torgil Svensson" wrote: > Hi > > I stumble on these types of problems from time to time so I'm > interested in efficient solutions myself. > > Do you have a column which starts with something suitable for int on > the first row (without decimal separator) but has decimals further > down? > > This will be little tricky to support. One solution could be to yield > StopIteration, calculate new type-conversion-functions and start over > iterating over both the old data and the rest of the iterator. > > It'd be great if you could try the load_gen_iter.py I've attached to > my response to Tim. > > Best Regards, > > //Torgil > > On 7/8/07, Vincent Nijs wrote: >> I am not (yet) very familiar with much of the functionality introduced in >> your script Torgil (izip, imap, etc.), but I really appreciate you taking >> the time to look at this! >> >> The program stopped with the following error: >> >> File "load_iter.py", line 48, in >> convert_row=lambda r: tuple(fn(x) for fn,x in >> izip(conversion_functions,r)) >> ValueError: invalid literal for int() with base 10: '2174.875' >> >> A lot of the data I use can have a column with a set of int?s (e.g., 0?s), >> but then the rest of that same column could be floats. I guess finding the >> right conversion function is the tricky part. I was thinking about sampling >> each, say, 10th obs to test which function to use. Not sure how that would >> work however. 
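One way to make the guess robust against columns like col6 above is to look at more than the first value, for example by scanning the whole column (or every tenth value, as suggested) before committing to a type. A sketch with a made-up name, covering only int/float/str (a date check could be added the same way); note it costs an extra pass over the data, which is not free on 500MB-1GB files:

def column_kind(values):
    # int unless some value needs float, float unless some value is neither
    kind = int
    for v in values:
        try:
            int(v)
        except ValueError:
            try:
                float(v)
            except ValueError:
                return str
            kind = float
    return kind

For the col6 example above, column_kind(['0', '0', '0', '1.23', '1.26']) returns float, so the "invalid literal for int()" error does not come up.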
>> >> If I ignore the option of an int (i.e., everything is a float, date, or >> string) then your script is about twice as fast as mine!! >> >> Question: If you do ignore the int's initially, once the rec array is in >> memory, would there be a quick way to check if the floats could pass as >> int's? This may seem like a backwards approach but it might be 'safer' if >> you really want to preserve the int's. >> >> Thanks again! >> >> Vincent >> >> >> On 7/8/07 5:52 AM, "Torgil Svensson" wrote: >> >>> Given that both your script and the mlab version preloads the whole >>> file before calling numpy constructor I'm curious how that compares in >>> speed to using numpy's fromiter function on your data. Using fromiter >>> should improve on memory usage (~50% ?). >>> >>> The drawback is for string columns where we don't longer know the >>> width of the largest item. I made it fall-back to "object" in this >>> case. >>> >>> Attached is a fromiter version of your script. Possible speedups could >>> be done by trying different approaches to the "convert_row" function, >>> for example using "zip" or "enumerate" instead of "izip". >>> >>> Best Regards, >>> >>> //Torgil >>> >>> >>> On 7/8/07, Vincent Nijs wrote: >>>> Thanks for the reference John! csv2rec is about 30% faster than my code on >>>> the same data. >>>> >>>> If I read the code in csv2rec correctly it converts the data as it is being >>>> read using the csv modules. My setup reads in the whole dataset into an >>>> array of strings and then converts the columns as appropriate. >>>> >>>> Best, >>>> >>>> Vincent >>>> >>>> >>>> On 7/6/07 8:53 PM, "John Hunter" wrote: >>>> >>>>> On 7/6/07, Vincent Nijs wrote: >>>>>> I wrote the attached (small) program to read in a text/csv file with >>>>>> different data types and convert it into a recarray without having to >>>>>> pre-specify the dtypes or variables names. I am just too lazy to type-in >>>>>> stuff like that :) The supported types are int, float, dates, and >>>>>> strings. >>>>>> >>>>>> I works pretty well but it is not (yet) as fast as I would like so I was >>>>>> wonder if any of the numpy experts on this list might have some >>>>>> suggestion >>>>>> on how to speed it up. I need to read 500MB-1GB files so speed is >>>>>> important >>>>>> for me. >>>>> >>>>> In matplotlib.mlab svn, there is a function csv2rec that does the >>>>> same. You may want to compare implementations in case we can >>>>> fruitfully cross pollinate them. In the examples directy, there is an >>>>> example script examples/loadrec.py >>>>> _______________________________________________ >>>>> Numpy-discussion mailing list >>>>> Numpy-discussion at scipy.org >>>>> http://projects.scipy.org/mailman/listinfo/numpy-discussion >>>>> >>>> >>>> >>>> _______________________________________________ >>>> Numpy-discussion mailing list >>>> Numpy-discussion at scipy.org >>>> http://projects.scipy.org/mailman/listinfo/numpy-discussion >>>> >>> _______________________________________________ >>> Numpy-discussion mailing list >>> Numpy-discussion at scipy.org >>> http://projects.scipy.org/mailman/listinfo/numpy-discussion >> >> -- >> Vincent R. 
Nijs >> Assistant Professor of Marketing >> Kellogg School of Management, Northwestern University >> 2001 Sheridan Road, Evanston, IL 60208-2001 >> Phone: +1-847-491-4574 Fax: +1-847-491-2498 >> E-mail: v-nijs at kellogg.northwestern.edu >> Skype: vincentnijs >> >> >> >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://projects.scipy.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- Vincent R. Nijs Assistant Professor of Marketing Kellogg School of Management, Northwestern University 2001 Sheridan Road, Evanston, IL 60208-2001 Phone: +1-847-491-4574 Fax: +1-847-491-2498 E-mail: v-nijs at kellogg.northwestern.edu Skype: vincentnijs From tim.hochberg at ieee.org Sun Jul 8 17:51:23 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Sun, 8 Jul 2007 14:51:23 -0700 Subject: [Numpy-discussion] convert csv file into recarray without pre-specifying dtypes and variable names In-Reply-To: References: Message-ID: On 7/8/07, Vincent Nijs wrote: > > Torgil, > > The function seems to work well and is slightly faster than your previous > version (about 1/6th faster). > > Yes, I do have columns that start with, what looks like, int's and then > turn > out to be floats. Something like below (col6). > > data = [['col1', 'col2', 'col3', 'col4', 'col5', 'col6'], > ['1','3','1/97','1.12','2.11','0'], > ['1','2','3/97','1.21','3.12','0'], > ['2','1','2/97','1.12','2.11','0'], > ['2','2','4/97','1.33','2.26','1.23'], > ['2','2','5/97','1.73','2.42','1.26']] > > I think what your function assumes is that the 1st element will be the > appropriate type. That may not hold if you have missing values or 'mixed > types'. Vincent, Do you need to auto detect the column types? Things get a lot simpler if you have some known schema for each file; then you can simply pass that to some reader function. It's also more robust since there's no way in general to differentiate a column of integers from a column of floats with no decimal part. If you do need to auto detect, one approach would be to always read both int-like stuff and float-like stuff in as floats. Then after you get the array check over the various columns and if any have no fractional parts, make a new array where those columns are integers. -tim Best, > > Vincent > > > On 7/8/07 3:31 PM, "Torgil Svensson" wrote: > > > Hi > > > > I stumble on these types of problems from time to time so I'm > > interested in efficient solutions myself. > > > > Do you have a column which starts with something suitable for int on > > the first row (without decimal separator) but has decimals further > > down? > > > > This will be little tricky to support. One solution could be to yield > > StopIteration, calculate new type-conversion-functions and start over > > iterating over both the old data and the rest of the iterator. > > > > It'd be great if you could try the load_gen_iter.py I've attached to > > my response to Tim. > > > > Best Regards, > > > > //Torgil > > > > On 7/8/07, Vincent Nijs wrote: > >> I am not (yet) very familiar with much of the functionality introduced > in > >> your script Torgil (izip, imap, etc.), but I really appreciate you > taking > >> the time to look at this! 
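Tim's suggestion (read int-like columns as floats, then demote the columns that turn out to contain only whole numbers) might look roughly like the sketch below; the function name and the int32 target are assumptions, not code from the thread:

import numpy as np

def demote_whole_number_columns(ra):
    # rebuild the record array, turning float columns whose values are all
    # whole numbers into int columns
    descr = []
    for name in ra.dtype.names:
        col = ra[name]
        if col.dtype.kind == 'f' and np.all(col == np.floor(col)):
            descr.append((name, np.int32))
        else:
            descr.append((name, col.dtype))
    out = np.empty(ra.shape, dtype=descr)
    for name in ra.dtype.names:
        out[name] = ra[name]   # float to int assignment truncates, safe here
    return out.view(np.recarray)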
> >> > >> The program stopped with the following error: > >> > >> File "load_iter.py", line 48, in > >> convert_row=lambda r: tuple(fn(x) for fn,x in > >> izip(conversion_functions,r)) > >> ValueError: invalid literal for int() with base 10: '2174.875' > >> > >> A lot of the data I use can have a column with a set of int?s (e.g., > 0?s), > >> but then the rest of that same column could be floats. I guess finding > the > >> right conversion function is the tricky part. I was thinking about > sampling > >> each, say, 10th obs to test which function to use. Not sure how that > would > >> work however. > >> > >> If I ignore the option of an int (i.e., everything is a float, date, or > >> string) then your script is about twice as fast as mine!! > >> > >> Question: If you do ignore the int's initially, once the rec array is > in > >> memory, would there be a quick way to check if the floats could pass as > >> int's? This may seem like a backwards approach but it might be 'safer' > if > >> you really want to preserve the int's. > >> > >> Thanks again! > >> > >> Vincent > >> > >> > >> On 7/8/07 5:52 AM, "Torgil Svensson" wrote: > >> > >>> Given that both your script and the mlab version preloads the whole > >>> file before calling numpy constructor I'm curious how that compares in > >>> speed to using numpy's fromiter function on your data. Using fromiter > >>> should improve on memory usage (~50% ?). > >>> > >>> The drawback is for string columns where we don't longer know the > >>> width of the largest item. I made it fall-back to "object" in this > >>> case. > >>> > >>> Attached is a fromiter version of your script. Possible speedups could > >>> be done by trying different approaches to the "convert_row" function, > >>> for example using "zip" or "enumerate" instead of "izip". > >>> > >>> Best Regards, > >>> > >>> //Torgil > >>> > >>> > >>> On 7/8/07, Vincent Nijs wrote: > >>>> Thanks for the reference John! csv2rec is about 30% faster than my > code on > >>>> the same data. > >>>> > >>>> If I read the code in csv2rec correctly it converts the data as it is > being > >>>> read using the csv modules. My setup reads in the whole dataset into > an > >>>> array of strings and then converts the columns as appropriate. > >>>> > >>>> Best, > >>>> > >>>> Vincent > >>>> > >>>> > >>>> On 7/6/07 8:53 PM, "John Hunter" wrote: > >>>> > >>>>> On 7/6/07, Vincent Nijs wrote: > >>>>>> I wrote the attached (small) program to read in a text/csv file > with > >>>>>> different data types and convert it into a recarray without having > to > >>>>>> pre-specify the dtypes or variables names. I am just too lazy to > type-in > >>>>>> stuff like that :) The supported types are int, float, dates, and > >>>>>> strings. > >>>>>> > >>>>>> I works pretty well but it is not (yet) as fast as I would like so > I was > >>>>>> wonder if any of the numpy experts on this list might have some > >>>>>> suggestion > >>>>>> on how to speed it up. I need to read 500MB-1GB files so speed is > >>>>>> important > >>>>>> for me. > >>>>> > >>>>> In matplotlib.mlab svn, there is a function csv2rec that does the > >>>>> same. You may want to compare implementations in case we can > >>>>> fruitfully cross pollinate them. 
In the examples directy, there is > an > >>>>> example script examples/loadrec.py > >>>>> _______________________________________________ > >>>>> Numpy-discussion mailing list > >>>>> Numpy-discussion at scipy.org > >>>>> http://projects.scipy.org/mailman/listinfo/numpy-discussion > >>>>> > >>>> > >>>> > >>>> _______________________________________________ > >>>> Numpy-discussion mailing list > >>>> Numpy-discussion at scipy.org > >>>> http://projects.scipy.org/mailman/listinfo/numpy-discussion > >>>> > >>> _______________________________________________ > >>> Numpy-discussion mailing list > >>> Numpy-discussion at scipy.org > >>> http://projects.scipy.org/mailman/listinfo/numpy-discussion > >> > >> -- > >> Vincent R. Nijs > >> Assistant Professor of Marketing > >> Kellogg School of Management, Northwestern University > >> 2001 Sheridan Road, Evanston, IL 60208-2001 > >> Phone: +1-847-491-4574 Fax: +1-847-491-2498 > >> E-mail: v-nijs at kellogg.northwestern.edu > >> Skype: vincentnijs > >> > >> > >> > >> _______________________________________________ > >> Numpy-discussion mailing list > >> Numpy-discussion at scipy.org > >> http://projects.scipy.org/mailman/listinfo/numpy-discussion > >> > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > -- > Vincent R. Nijs > Assistant Professor of Marketing > Kellogg School of Management, Northwestern University > 2001 Sheridan Road, Evanston, IL 60208-2001 > Phone: +1-847-491-4574 Fax: +1-847-491-2498 > E-mail: v-nijs at kellogg.northwestern.edu > Skype: vincentnijs > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From torgil.svensson at gmail.com Sun Jul 8 18:40:04 2007 From: torgil.svensson at gmail.com (Torgil Svensson) Date: Mon, 9 Jul 2007 00:40:04 +0200 Subject: [Numpy-discussion] convert csv file into recarray without pre-specifying dtypes and variable names In-Reply-To: References: Message-ID: > Question: If you do ignore the int's initially, once the rec array is in > memory, would there be a quick way to check if the floats could pass as > int's? This may seem like a backwards approach but it might be 'safer' if > you really want to preserve the int's. In your case the floats don't pass as ints since you have decimals. The attached file takes another approach (sorry for lack of comments). If the conversion fail, the current row is stored and the iterator exits (without setting a 'finished' parameter to true). The program then re-calculates the conversion-functions and checks for changes. If the changes are supported (=we have a conversion function for old data in the format_changes dictionary) it calls fromiter again with an iterator like this: def get_data_iterator(row_iter,delim,res): for x0,x1,x2,x3,x4,x5 in res['data']: x0=float(x0) print (x0,x1,x2,x3,x4,x5) yield (x0,x1,x2,x3,x4,x5) yield (float('2.0'),int('2'),datestr2num('4/97'),float('1.33'),float('2.26'),float('1.23')) for row in row_iter: x0,x1,x2,x3,x4,x5=row.split(delim) try: yield (float(x0),int(x1),datestr2num(x2),float(x3),float(x4),float(x5)) except: res['row']=row return res['finished']=True res['data'] is the previously converted data. 
This has the obvious disadvantage that if only the last row has fractions in a column, it'll cost double memory. Also if many columns change format at different places it has to re-convert every time. I don't recommend this because of the drawbacks and extra complexity. I think it is best to convert your files (or file generation) so that float columns are represented with 0.0 instead of 0. Best Regards, //Torgil On 7/8/07, Vincent Nijs wrote: > I am not (yet) very familiar with much of the functionality introduced in > your script Torgil (izip, imap, etc.), but I really appreciate you taking > the time to look at this! > > The program stopped with the following error: > > File "load_iter.py", line 48, in > convert_row=lambda r: tuple(fn(x) for fn,x in > izip(conversion_functions,r)) > ValueError: invalid literal for int() with base 10: '2174.875' > > A lot of the data I use can have a column with a set of int?s (e.g., 0?s), > but then the rest of that same column could be floats. I guess finding the > right conversion function is the tricky part. I was thinking about sampling > each, say, 10th obs to test which function to use. Not sure how that would > work however. > > If I ignore the option of an int (i.e., everything is a float, date, or > string) then your script is about twice as fast as mine!! > > Question: If you do ignore the int's initially, once the rec array is in > memory, would there be a quick way to check if the floats could pass as > int's? This may seem like a backwards approach but it might be 'safer' if > you really want to preserve the int's. > > Thanks again! > > Vincent > > > On 7/8/07 5:52 AM, "Torgil Svensson" wrote: > > > Given that both your script and the mlab version preloads the whole > > file before calling numpy constructor I'm curious how that compares in > > speed to using numpy's fromiter function on your data. Using fromiter > > should improve on memory usage (~50% ?). > > > > The drawback is for string columns where we don't longer know the > > width of the largest item. I made it fall-back to "object" in this > > case. > > > > Attached is a fromiter version of your script. Possible speedups could > > be done by trying different approaches to the "convert_row" function, > > for example using "zip" or "enumerate" instead of "izip". > > > > Best Regards, > > > > //Torgil > > > > > > On 7/8/07, Vincent Nijs wrote: > >> Thanks for the reference John! csv2rec is about 30% faster than my code on > >> the same data. > >> > >> If I read the code in csv2rec correctly it converts the data as it is being > >> read using the csv modules. My setup reads in the whole dataset into an > >> array of strings and then converts the columns as appropriate. > >> > >> Best, > >> > >> Vincent > >> > >> > >> On 7/6/07 8:53 PM, "John Hunter" wrote: > >> > >>> On 7/6/07, Vincent Nijs wrote: > >>>> I wrote the attached (small) program to read in a text/csv file with > >>>> different data types and convert it into a recarray without having to > >>>> pre-specify the dtypes or variables names. I am just too lazy to type-in > >>>> stuff like that :) The supported types are int, float, dates, and strings. > >>>> > >>>> I works pretty well but it is not (yet) as fast as I would like so I was > >>>> wonder if any of the numpy experts on this list might have some suggestion > >>>> on how to speed it up. I need to read 500MB-1GB files so speed is important > >>>> for me. > >>> > >>> In matplotlib.mlab svn, there is a function csv2rec that does the > >>> same. 
You may want to compare implementations in case we can > >>> fruitfully cross pollinate them. In the examples directy, there is an > >>> example script examples/loadrec.py > >>> _______________________________________________ > >>> Numpy-discussion mailing list > >>> Numpy-discussion at scipy.org > >>> http://projects.scipy.org/mailman/listinfo/numpy-discussion > >>> > >> > >> > >> _______________________________________________ > >> Numpy-discussion mailing list > >> Numpy-discussion at scipy.org > >> http://projects.scipy.org/mailman/listinfo/numpy-discussion > >> > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > -- > Vincent R. Nijs > Assistant Professor of Marketing > Kellogg School of Management, Northwestern University > 2001 Sheridan Road, Evanston, IL 60208-2001 > Phone: +1-847-491-4574 Fax: +1-847-491-2498 > E-mail: v-nijs at kellogg.northwestern.edu > Skype: vincentnijs > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed... Name: load_gen_iter2.py Type: text/x-python Size: 4304 bytes Desc: not available URL: From v-nijs at kellogg.northwestern.edu Sun Jul 8 19:06:45 2007 From: v-nijs at kellogg.northwestern.edu (Vincent Nijs) Date: Sun, 08 Jul 2007 18:06:45 -0500 Subject: [Numpy-discussion] convert csv file into recarray without pre-specifying dtypes and variable names In-Reply-To: Message-ID: Tim, I do want to auto-detect. Reading numbers in as floats is probably not a huge penalty. Is there an easy way to change the type of one column in a recarray that you know? I tried this: ra.col1 = ra.col1.astype(?i?) but that didn?t seem to work. I assume that means you would have to create a new array from the old one with an updated dtype list. Thanks, Vincent On 7/8/07 4:51 PM, "Timothy Hochberg" wrote: > > > On 7/8/07, Vincent Nijs wrote: >> Torgil, >> >> The function seems to work well and is slightly faster than your previous >> version (about 1/6th faster). >> >> Yes, I do have columns that start with, what looks like, int's and then >> turnTim, >> out to be floats. Something like below (col6). >> >> data = [['col1', 'col2', 'col3', 'col4', 'col5', 'col6'], >> ['1','3','1/97','1.12','2.11','0'], >> ['1','2','3/97',' 1.21','3.12','0'], >> ['2','1','2/97','1.12','2.11','0'], >> ['2','2','4/97','1.33','2.26',' 1.23'], >> ['2','2','5/97','1.73','2.42','1.26']] >> >> I think what your function assumes is that the 1st element will be the >> appropriate type. That may not hold if you have missing values or 'mixed >> types'. > > > Vincent, > > Do you need to auto detect the column types? Things get a lot simpler if you > have some known schema for each file; then you can simply pass that to some > reader function. It's also more robust since there's no way in general to > differentiate a column of integers from a column of floats with no decimal > part. > > If you do need to auto detect, one approach would be to always read both > int-like stuff and float-like stuff in as floats. Then after you get the array > check over the various columns and if any have no fractional parts, make a new > array where those columns are integers. 
> > -tim > >> Best, >> >> Vincent >> >> >> On 7/8/07 3:31 PM, "Torgil Svensson" < torgil.svensson at gmail.com> wrote: >> >>> > Hi >>> > >>> > I stumble on these types of problems from time to time so I'm >>> > interested in efficient solutions myself. >>> > >>> > Do you have a column which starts with something suitable for int on >>> > the first row (without decimal separator) but has decimals further >>> > down? >>> > >>> > This will be little tricky to support. One solution could be to yield >>> > StopIteration, calculate new type-conversion-functions and start over >>> > iterating over both the old data and the rest of the iterator. >>> > >>> > It'd be great if you could try the load_gen_iter.py I've attached to >>> > my response to Tim. >>> > >>> > Best Regards, >>> > >>> > //Torgil >>> > >>> > On 7/8/07, Vincent Nijs wrote: >>>> >> I am not (yet) very familiar with much of the functionality introduced in >>>> >> your script Torgil (izip, imap, etc.), but I really appreciate you >>>> taking >>>> >> the time to look at this! >>>> >> >>>> >> The program stopped with the following error: >>>> >> >>>> >> File "load_iter.py", line 48, in >>>> >> convert_row=lambda r: tuple(fn(x) for fn,x in >>>> >> izip(conversion_functions,r)) >>>> >> ValueError: invalid literal for int() with base 10: '2174.875' >>>> >> >>>> >> A lot of the data I use can have a column with a set of int?s (e.g., >>>> 0?s), >>>> >> but then the rest of that same column could be floats. I guess finding >>>> the >>>> >> right conversion function is the tricky part. I was thinking about >>>> sampling >>>> >> each, say, 10th obs to test which function to use. Not sure how that >>>> would >>>> >> work however. >>>> >> >>>> >> If I ignore the option of an int ( i.e., everything is a float, date, or >>>> >> string) then your script is about twice as fast as mine!! >>>> >> >>>> >> Question: If you do ignore the int's initially, once the rec array is in >>>> >> memory, would there be a quick way to check if the floats could pass as >>>> >> int's? This may seem like a backwards approach but it might be 'safer' if >>>> >> you really want to preserve the int's. >>>> >> >>>> >> Thanks again! >>>> >> >>>> >> Vincent >>>> >> >>>> >> >>>> >> On 7/8/07 5:52 AM, "Torgil Svensson" wrote: >>>> >> >>>>> >>> Given that both your script and the mlab version preloads the whole >>>>> >>> file before calling numpy constructor I'm curious how that compares in >>>>> >>> speed to using numpy's fromiter function on your data. Using fromiter >>>>> >>> should improve on memory usage (~50% ?). >>>>> >>> >>>>> >>> The drawback is for string columns where we don't longer know the >>>>> >>> width of the largest item. I made it fall-back to "object" in this >>>>> >>> case. >>>>> >>> >>>>> >>> Attached is a fromiter version of your script. Possible speedups could >>>>> >>> be done by trying different approaches to the "convert_row" function, >>>>> >>> for example using "zip" or "enumerate" instead of "izip". >>>>> >>> >>>>> >>> Best Regards, >>>>> >>> >>>>> >>> //Torgil >>>>> >>> >>>>> >>> >>>>> >>> On 7/8/07, Vincent Nijs >>>> > wrote: >>>>>> >>>> Thanks for the reference John! csv2rec is about 30% faster than my >>>>>> code on >>>>>> >>>> the same data. >>>>>> >>>> >>>>>> >>>> If I read the code in csv2rec correctly it converts the data as it >>>>>> is being >>>>>> >>>> read using the csv modules. My setup reads in the whole dataset into an >>>>>> >>>> array of strings and then converts the columns as appropriate. 
>>>>>> >>>> >>>>>> >>>> Best, >>>>>> >>>> >>>>>> >>>> Vincent >>>>>> >>>> >>>>>> >>>> >>>>>> >>>> On 7/6/07 8:53 PM, "John Hunter" wrote: >>>>>> >>>> >>>>>>> >>>>> On 7/6/07, Vincent Nijs wrote: >>>>>>>> >>>>>> I wrote the attached (small) program to read in a text/csv file with >>>>>>>> >>>>>> different data types and convert it into a recarray without >>>>>>>> having to >>>>>>>> >>>>>> pre-specify the dtypes or variables names. I am just too lazy to >>>>>>>> type-in >>>>>>>> >>>>>> stuff like that :) The supported types are int, float, dates, and >>>>>>>> >>>>>> strings. >>>>>>>> >>>>>> >>>>>>>> >>>>>> I works pretty well but it is not (yet) as fast as I would like >>>>>>>> so I was >>>>>>>> >>>>>> wonder if any of the numpy experts on this list might have some >>>>>>>> >>>>>> suggestion >>>>>>>> >>>>>> on how to speed it up. I need to read 500MB-1GB files so speed is >>>>>>>> >>>>>> important >>>>>>>> >>>>>> for me. >>>>>>> >>>>> >>>>>>> >>>>> In matplotlib.mlab svn, there is a function csv2rec that does the >>>>>>> >>>>> same. You may want to compare implementations in case we can >>>>>>> >>>>> fruitfully cross pollinate them. In the examples directy, there >>>>>>> is an >>>>>>> >>>>> example script examples/loadrec.py >>>>>>> >>>>> _______________________________________________ >>>>>>> >>>>> Numpy-discussion mailing list >>>>>>> >>>>> Numpy-discussion at scipy.org >>>>>>> >>>>> http://projects.scipy.org/mailman/listinfo/numpy-discussion >>>>>>> >>>>> >>>>>> >>>> >>>>>> >>>> >>>>>> >>>> _______________________________________________ >>>>>> >>>> Numpy-discussion mailing list >>>>>> >>>> Numpy-discussion at scipy.org >>>>>> >>>> http://projects.scipy.org/mailman/listinfo/numpy-discussion >>>>>> >>>> >>>>> >>> _______________________________________________ >>>>> >>> Numpy-discussion mailing list >>>>> >>> Numpy-discussion at scipy.org >>>>> >>> http://projects.scipy.org/mailman/listinfo/numpy-discussion >>>> >> >>>> >> -- >>>> >> Vincent R. Nijs >>>> >> Assistant Professor of Marketing >>>> >> Kellogg School of Management, Northwestern University >>>> >> 2001 Sheridan Road, Evanston, IL 60208-2001 >>>> >> Phone: +1-847-491-4574 Fax: +1-847-491-2498 >>>> >> E-mail: v-nijs at kellogg.northwestern.edu >>>> >> Skype: vincentnijs >>>> >> >>>> >> >>>> >> >>>> >> _______________________________________________ >>>> >> Numpy-discussion mailing list >>>> >> Numpy-discussion at scipy.org >>>> >> http://projects.scipy.org/mailman/listinfo/numpy-discussion >>>> >> >>> > _______________________________________________ >>> > Numpy-discussion mailing list >>> > Numpy-discussion at scipy.org >>> > http://projects.scipy.org/mailman/listinfo/numpy-discussion >>> >>> > >> >> -- >> Vincent R. Nijs >> Assistant Professor of Marketing >> Kellogg School of Management, Northwestern University >> 2001 Sheridan Road, Evanston, IL 60208-2001 >> Phone: +1-847-491-4574 Fax: +1-847-491-2498 >> E-mail: v-nijs at kellogg.northwestern.edu >> Skype: vincentnijs >> >> >> >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://projects.scipy.org/mailman/listinfo/numpy-discussion > > -- Vincent R. Nijs Assistant Professor of Marketing Kellogg School of Management, Northwestern University 2001 Sheridan Road, Evanston, IL 60208-2001 Phone: +1-847-491-4574 Fax: +1-847-491-2498 E-mail: v-nijs at kellogg.northwestern.edu Skype: vincentnijs -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From v-nijs at kellogg.northwestern.edu Sun Jul 8 19:11:58 2007 From: v-nijs at kellogg.northwestern.edu (Vincent Nijs) Date: Sun, 08 Jul 2007 18:11:58 -0500 Subject: [Numpy-discussion] convert csv file into recarray without pre-specifying dtypes and variable names In-Reply-To: Message-ID: Thanks for looking into this Torgil! I agree that this is a much more complicated setup. I'll check if there is anything I can do on the data end. Otherwise I'll go with Timothy's suggestion and read in numbers as floats and convert to int later as needed. Vincent On 7/8/07 5:40 PM, "Torgil Svensson" wrote: >> Question: If you do ignore the int's initially, once the rec array is in >> memory, would there be a quick way to check if the floats could pass as >> int's? This may seem like a backwards approach but it might be 'safer' if >> you really want to preserve the int's. > > In your case the floats don't pass as ints since you have decimals. > The attached file takes another approach (sorry for lack of comments). > If the conversion fail, the current row is stored and the iterator > exits (without setting a 'finished' parameter to true). The program > then re-calculates the conversion-functions and checks for changes. If > the changes are supported (=we have a conversion function for old data > in the format_changes dictionary) it calls fromiter again with an > iterator like this: > > def get_data_iterator(row_iter,delim,res): > for x0,x1,x2,x3,x4,x5 in res['data']: > x0=float(x0) > print (x0,x1,x2,x3,x4,x5) > yield (x0,x1,x2,x3,x4,x5) > yield > (float('2.0'),int('2'),datestr2num('4/97'),float('1.33'),float('2.26'),float(' > 1.23')) > for row in row_iter: > x0,x1,x2,x3,x4,x5=row.split(delim) > try: > yield > (float(x0),int(x1),datestr2num(x2),float(x3),float(x4),float(x5)) > except: > res['row']=row > return > res['finished']=True > > res['data'] is the previously converted data. This has the obvious > disadvantage that if only the last row has fractions in a column, > it'll cost double memory. Also if many columns change format at > different places it has to re-convert every time. > > I don't recommend this because of the drawbacks and extra complexity. > I think it is best to convert your files (or file generation) so that > float columns are represented with 0.0 instead of 0. > > Best Regards, > > //Torgil > > On 7/8/07, Vincent Nijs wrote: >> I am not (yet) very familiar with much of the functionality introduced in >> your script Torgil (izip, imap, etc.), but I really appreciate you taking >> the time to look at this! >> >> The program stopped with the following error: >> >> File "load_iter.py", line 48, in >> convert_row=lambda r: tuple(fn(x) for fn,x in >> izip(conversion_functions,r)) >> ValueError: invalid literal for int() with base 10: '2174.875' >> >> A lot of the data I use can have a column with a set of int?s (e.g., 0?s), >> but then the rest of that same column could be floats. I guess finding the >> right conversion function is the tricky part. I was thinking about sampling >> each, say, 10th obs to test which function to use. Not sure how that would >> work however. >> >> If I ignore the option of an int (i.e., everything is a float, date, or >> string) then your script is about twice as fast as mine!! >> >> Question: If you do ignore the int's initially, once the rec array is in >> memory, would there be a quick way to check if the floats could pass as >> int's? This may seem like a backwards approach but it might be 'safer' if >> you really want to preserve the int's. 
>> >> Thanks again! >> >> Vincent >> >> >> On 7/8/07 5:52 AM, "Torgil Svensson" wrote: >> >>> Given that both your script and the mlab version preloads the whole >>> file before calling numpy constructor I'm curious how that compares in >>> speed to using numpy's fromiter function on your data. Using fromiter >>> should improve on memory usage (~50% ?). >>> >>> The drawback is for string columns where we don't longer know the >>> width of the largest item. I made it fall-back to "object" in this >>> case. >>> >>> Attached is a fromiter version of your script. Possible speedups could >>> be done by trying different approaches to the "convert_row" function, >>> for example using "zip" or "enumerate" instead of "izip". >>> >>> Best Regards, >>> >>> //Torgil >>> >>> >>> On 7/8/07, Vincent Nijs wrote: >>>> Thanks for the reference John! csv2rec is about 30% faster than my code on >>>> the same data. >>>> >>>> If I read the code in csv2rec correctly it converts the data as it is being >>>> read using the csv modules. My setup reads in the whole dataset into an >>>> array of strings and then converts the columns as appropriate. >>>> >>>> Best, >>>> >>>> Vincent >>>> >>>> >>>> On 7/6/07 8:53 PM, "John Hunter" wrote: >>>> >>>>> On 7/6/07, Vincent Nijs wrote: >>>>>> I wrote the attached (small) program to read in a text/csv file with >>>>>> different data types and convert it into a recarray without having to >>>>>> pre-specify the dtypes or variables names. I am just too lazy to type-in >>>>>> stuff like that :) The supported types are int, float, dates, and >>>>>> strings. >>>>>> >>>>>> I works pretty well but it is not (yet) as fast as I would like so I was >>>>>> wonder if any of the numpy experts on this list might have some >>>>>> suggestion >>>>>> on how to speed it up. I need to read 500MB-1GB files so speed is >>>>>> important >>>>>> for me. >>>>> >>>>> In matplotlib.mlab svn, there is a function csv2rec that does the >>>>> same. You may want to compare implementations in case we can >>>>> fruitfully cross pollinate them. In the examples directy, there is an >>>>> example script examples/loadrec.py >>>>> _______________________________________________ >>>>> Numpy-discussion mailing list >>>>> Numpy-discussion at scipy.org >>>>> http://projects.scipy.org/mailman/listinfo/numpy-discussion >>>>> >>>> >>>> >>>> _______________________________________________ >>>> Numpy-discussion mailing list >>>> Numpy-discussion at scipy.org >>>> http://projects.scipy.org/mailman/listinfo/numpy-discussion >>>> >>> _______________________________________________ >>> Numpy-discussion mailing list >>> Numpy-discussion at scipy.org >>> http://projects.scipy.org/mailman/listinfo/numpy-discussion >> >> -- >> Vincent R. Nijs >> Assistant Professor of Marketing >> Kellogg School of Management, Northwestern University >> 2001 Sheridan Road, Evanston, IL 60208-2001 >> Phone: +1-847-491-4574 Fax: +1-847-491-2498 >> E-mail: v-nijs at kellogg.northwestern.edu >> Skype: vincentnijs >> >> >> >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://projects.scipy.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion -- Vincent R. 
Nijs Assistant Professor of Marketing Kellogg School of Management, Northwestern University 2001 Sheridan Road, Evanston, IL 60208-2001 Phone: +1-847-491-4574 Fax: +1-847-491-2498 E-mail: v-nijs at kellogg.northwestern.edu Skype: vincentnijs From torgil.svensson at gmail.com Sun Jul 8 20:00:49 2007 From: torgil.svensson at gmail.com (Torgil Svensson) Date: Mon, 9 Jul 2007 02:00:49 +0200 Subject: [Numpy-discussion] convert csv file into recarray without pre-specifying dtypes and variable names In-Reply-To: References: Message-ID: FWIW >>> n,dt=descr[0] >>> new_dt=dt.replace('f','i') >>> descr[0]=(n,new_dt) >>> data=ra.col1.astype(new_dt) >>> ra.dtype=N.dtype(descr) >>> ra.col1=data //Torgil On 7/9/07, Vincent Nijs wrote: > > Tim, > > I do want to auto-detect. Reading numbers in as floats is probably not a > huge penalty. > > Is there an easy way to change the type of one column in a recarray that > you know? > > I tried this: > > ra.col1 = ra.col1.astype('i') > > but that didn't seem to work. I assume that means you would have to create > a new array from the old one with an updated dtype list. > > Thanks, > > Vincent > > > On 7/8/07 4:51 PM, "Timothy Hochberg" wrote: > > > > > On 7/8/07, Vincent Nijs wrote: > > Torgil, > > The function seems to work well and is slightly faster than your previous > version (about 1/6th faster). > > Yes, I do have columns that start with, what looks like, int's and then > turnTim, > > out to be floats. Something like below (col6). > > data = [['col1', 'col2', 'col3', 'col4', 'col5', 'col6'], > ['1','3','1/97','1.12','2.11','0'], > ['1','2','3/97',' 1.21','3.12','0'], > ['2','1','2/97','1.12','2.11','0'], > ['2','2','4/97','1.33','2.26',' 1.23'], > ['2','2','5/97','1.73','2.42','1.26']] > > I think what your function assumes is that the 1st element will be the > appropriate type. That may not hold if you have missing values or 'mixed > types'. > > > > Vincent, > > Do you need to auto detect the column types? Things get a lot simpler if > you have some known schema for each file; then you can simply pass that to > some reader function. It's also more robust since there's no way in general > to differentiate a column of integers from a column of floats with no > decimal part. > > If you do need to auto detect, one approach would be to always read both > int-like stuff and float-like stuff in as floats. Then after you get the > array check over the various columns and if any have no fractional parts, > make a new array where those columns are integers. > > -tim > > > Best, > > Vincent > > > On 7/8/07 3:31 PM, "Torgil Svensson" < torgil.svensson at gmail.com> wrote: > > > Hi > > > > I stumble on these types of problems from time to time so I'm > > interested in efficient solutions myself. > > > > Do you have a column which starts with something suitable for int on > > the first row (without decimal separator) but has decimals further > > down? > > > > This will be little tricky to support. One solution could be to yield > > StopIteration, calculate new type-conversion-functions and start over > > iterating over both the old data and the rest of the iterator. > > > > It'd be great if you could try the load_gen_iter.py I've attached to > > my response to Tim. > > > > Best Regards, > > > > //Torgil > > > > On 7/8/07, Vincent Nijs wrote: > >> I am not (yet) very familiar with much of the functionality introduced > in > >> your script Torgil (izip, imap, etc.), but I really appreciate you > taking > >> the time to look at this! 
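(For anyone else who has not met them: izip and imap are just the lazy itertools counterparts of zip and map, which is why they show up in scripts that convert one row at a time. A toy illustration with made-up converters and a single row -- not code from Torgil's script:)

from itertools import izip, imap

converters = (int, float, str)      # hypothetical per-column conversion functions
fields = '1,2.5,abc'.split(',')

# izip pairs each converter with its field without building an intermediate list
row_a = tuple(fn(x) for fn, x in izip(converters, fields))

# imap applies a function across several iterables lazily; same result
row_b = tuple(imap(lambda fn, x: fn(x), converters, fields))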
> >> > >> The program stopped with the following error: > >> > >> File "load_iter.py", line 48, in > >> convert_row=lambda r: tuple(fn(x) for fn,x in > >> izip(conversion_functions,r)) > >> ValueError: invalid literal for int() with base 10: '2174.875' > >> > >> A lot of the data I use can have a column with a set of int's (e.g., > 0's), > >> but then the rest of that same column could be floats. I guess finding > the > >> right conversion function is the tricky part. I was thinking about > sampling > >> each, say, 10th obs to test which function to use. Not sure how that > would > >> work however. > >> > >> If I ignore the option of an int ( i.e., everything is a float, date, or > >> string) then your script is about twice as fast as mine!! > >> > >> Question: If you do ignore the int's initially, once the rec array is in > >> memory, would there be a quick way to check if the floats could pass as > >> int's? This may seem like a backwards approach but it might be 'safer' > if > >> you really want to preserve the int's. > >> > >> Thanks again! > >> > >> Vincent > >> > >> > >> On 7/8/07 5:52 AM, "Torgil Svensson" wrote: > >> > >>> Given that both your script and the mlab version preloads the whole > >>> file before calling numpy constructor I'm curious how that compares in > >>> speed to using numpy's fromiter function on your data. Using fromiter > >>> should improve on memory usage (~50% ?). > >>> > >>> The drawback is for string columns where we don't longer know the > >>> width of the largest item. I made it fall-back to "object" in this > >>> case. > >>> > >>> Attached is a fromiter version of your script. Possible speedups could > >>> be done by trying different approaches to the "convert_row" function, > >>> for example using "zip" or "enumerate" instead of "izip". > >>> > >>> Best Regards, > >>> > >>> //Torgil > >>> > >>> > >>> On 7/8/07, Vincent Nijs > wrote: > >>>> Thanks for the reference John! csv2rec is about 30% faster than my > code on > >>>> the same data. > >>>> > >>>> If I read the code in csv2rec correctly it converts the data as it is > being > >>>> read using the csv modules. My setup reads in the whole dataset into > an > >>>> array of strings and then converts the columns as appropriate. > >>>> > >>>> Best, > >>>> > >>>> Vincent > >>>> > >>>> > >>>> On 7/6/07 8:53 PM, "John Hunter" wrote: > >>>> > >>>>> On 7/6/07, Vincent Nijs wrote: > >>>>>> I wrote the attached (small) program to read in a text/csv file with > >>>>>> different data types and convert it into a recarray without having > to > >>>>>> pre-specify the dtypes or variables names. I am just too lazy to > type-in > >>>>>> stuff like that :) The supported types are int, float, dates, and > >>>>>> strings. > >>>>>> > >>>>>> I works pretty well but it is not (yet) as fast as I would like so I > was > >>>>>> wonder if any of the numpy experts on this list might have some > >>>>>> suggestion > >>>>>> on how to speed it up. I need to read 500MB-1GB files so speed is > >>>>>> important > >>>>>> for me. > >>>>> > >>>>> In matplotlib.mlab svn, there is a function csv2rec that does the > >>>>> same. You may want to compare implementations in case we can > >>>>> fruitfully cross pollinate them. 
In the examples directy, there is > an > >>>>> example script examples/loadrec.py > >>>>> _______________________________________________ > >>>>> Numpy-discussion mailing list > >>>>> Numpy-discussion at scipy.org > >>>>> > http://projects.scipy.org/mailman/listinfo/numpy-discussion > >>>>> > >>>> > >>>> > >>>> _______________________________________________ > >>>> Numpy-discussion mailing list > >>>> Numpy-discussion at scipy.org > > >>>> > http://projects.scipy.org/mailman/listinfo/numpy-discussion > >>>> > >>> _______________________________________________ > >>> Numpy-discussion mailing list > >>> Numpy-discussion at scipy.org > >>> > http://projects.scipy.org/mailman/listinfo/numpy-discussion > >> > >> -- > >> Vincent R. Nijs > >> Assistant Professor of Marketing > >> Kellogg School of Management, Northwestern University > >> 2001 Sheridan Road, Evanston, IL 60208-2001 > >> Phone: +1-847-491-4574 Fax: +1-847-491-2498 > >> E-mail: v-nijs at kellogg.northwestern.edu > >> Skype: vincentnijs > >> > >> > >> > >> _______________________________________________ > >> Numpy-discussion mailing list > >> Numpy-discussion at scipy.org > >> > http://projects.scipy.org/mailman/listinfo/numpy-discussion > >> > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > > -- > Vincent R. Nijs > Assistant Professor of Marketing > Kellogg School of Management, Northwestern University > 2001 Sheridan Road, Evanston, IL 60208-2001 > Phone: +1-847-491-4574 Fax: +1-847-491-2498 > E-mail: v-nijs at kellogg.northwestern.edu > Skype: vincentnijs > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > > -- > Vincent R. Nijs > Assistant Professor of Marketing > Kellogg School of Management, Northwestern University > 2001 Sheridan Road, Evanston, IL 60208-2001 > Phone: +1-847-491-4574 Fax: +1-847-491-2498 > E-mail: v-nijs at kellogg.northwestern.edu > Skype: vincentnijs > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > From tim.hochberg at ieee.org Sun Jul 8 23:25:11 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Sun, 8 Jul 2007 20:25:11 -0700 Subject: [Numpy-discussion] convert csv file into recarray without pre-specifying dtypes and variable names In-Reply-To: References: Message-ID: On 7/8/07, Vincent Nijs wrote: > > Thanks for looking into this Torgil! I agree that this is a much more > complicated setup. I'll check if there is anything I can do on the data > end. > Otherwise I'll go with Timothy's suggestion and read in numbers as floats > and convert to int later as needed. Here is a strategy that should allow auto detection without too much in the way of inefficiency. The basic idea is to convert till you run into a problem, store that data away, and continue the conversion with a new dtype. At the end you assemble all the chunks of data you've accumulated into one large array. It should be reasonably efficient in terms of both memory and speed. The implementation is a little rough, but it should get the idea across. -- . __ . |-\ . . 
tim.hochberg at ieee.org ======================================================================== def find_formats(items, last): formats = [] for i, x in enumerate(items): dt, cvt = string_to_dt_cvt(x) if last is not None: last_cvt, last_dt = last[i] if last_cvt is float and cvt is int: cvt = float formats.append((dt, cvt)) return formats class LoadInfo(object): def __init__(self, row0): self.done = False self.lastcols = None self.row0 = row0 def data_iterator(lines, converters, delim, info): yield tuple(f(x) for f, x in zip(converters, info.row0.split(delim))) try: for row in lines: yield tuple(f(x) for f, x in zip(converters, row.split(delim))) except: info.row0 = row else: info.done = True def load2(fname,delim = ',', has_varnm = True, prn_report = True): """ Loading data from a file using the csv module. Returns a recarray. """ f=open(fname,'rb') if has_varnm: varnames = [i.strip() for i in f.next().split(delim)] else: varnames = None info = LoadInfo(f.next()) chunks = [] while not info.done: row0 = info.row0.split(delim) formats = find_formats(row0, info.lastcols) if varnames is None: varnames = varnm = ['col%s' % str(i+1) for i, _ in enumerate(formate)] descr=[] conversion_functions=[] for name, (dtype, cvt_fn) in zip(varnames, formats): descr.append((name,dtype)) conversion_functions.append(cvt_fn) chunks.append(N.fromiter(data_iterator(f, conversion_functions, delim, info), descr)) if len(chunks) > 1: n = sum(len(x) for x in chunks) data = N.zeros([n], chunks[-1].dtype) offset = 0 for x in chunks: delta = len(x) data[offset:offset+delta] = x offset += delta else: [data] = chunks # load report if prn_report: print "##########################################\n" print "Loaded file: %s\n" % fname print "Nr obs: %s\n" % data.shape[0] print "Variables and datatypes:\n" for i in data.dtype.descr: print "Varname: %s, Type: %s, Sample: %s" % (i[0], i[1], str(data[i[0]][0:3])) print "\n##########################################\n" return data -------------- next part -------------- An HTML attachment was scrubbed... URL: From fullung at gmail.com Sun Jul 8 23:27:21 2007 From: fullung at gmail.com (Albert Strasheim) Date: Mon, 9 Jul 2007 05:27:21 +0200 Subject: [Numpy-discussion] Buildbot for numpy In-Reply-To: References: <20070616081155.GC20362@mentat.za.net> Message-ID: <20070709032721.GA14850@dogbert.sdsl.sun.ac.za> Hello On Mon, 02 Jul 2007, Barry Wark wrote: > I have the potential to add OS X Server Intel (64-bit) and OS X Intel > (32-bit) to the list, if I can convince my boss that the security risk Sounds good. We could definitely use these platforms. > (including DOS from compile times) is minimal. I've compiled both Currently we don't allow builds to be forced from the web page, but this might change in future. > numpy and scipy many times, so I'm not worried about resources for a > single compile/test, but can any of the regular developers tell me > about how many commits there are per day that will trigger a > compile/test? We currently only build NumPy. SciPy should probably be added at some point, once we figure out how we want to configure the Buildbot to do this. NumPy averages close to 0 commits per day at this point. SciPy is more active. Between the two, on a busy day, you could expect more than 10 and less than 100 builds. > About the more general security risk of running a buildbot slave, from > my reading of the buildbot manual (not the source, yet), it looks like > the slave is a Twisted server that runs as a normal user process. 
Is > there any sort of sandboxing built into the buildbot slave or is that > the responsibility of the OS (an issue I'll have to discuss with our > IT)? Through the buildbot master configuration, we tell your buildslave what to check out and which commands to execute. We have set it up to do the build in terms of a Makefile, so the master will tell the slave to run "make build" followed by "make test". Here you can make your own machine do anything that hopefully involves running python setup.py, etc. However, the configuration on the master can be changed to make your slave execute any command. In short, any NumPy/SciPy committer or anyone who controls the build master configuration (i.e., me, Stefan, our admin person, a few other people who have root access on that machine and anybody who successfully breaks into it) can make your build machine execute arbitrary code as the build slave user. The chance of this happening is small, but it's not impossible, so if this risk is unacceptable to you/your IT people, running a build slave might not be for you. ;-) Cheers, Albert From v-nijs at kellogg.northwestern.edu Mon Jul 9 09:55:25 2007 From: v-nijs at kellogg.northwestern.edu (Vincent Nijs) Date: Mon, 09 Jul 2007 08:55:25 -0500 Subject: [Numpy-discussion] convert csv file into recarray without pre-specifying dtypes and variable names In-Reply-To: Message-ID: Cool! Thanks Tim. Vincent On 7/8/07 10:25 PM, "Timothy Hochberg" wrote: > > > On 7/8/07, Vincent Nijs wrote: >> Thanks for looking into this Torgil! I agree that this is a much more >> complicated setup. I'll check if there is anything I can do on the data end. >> Otherwise I'll go with Timothy's suggestion and read in numbers as floats >> and convert to int later as needed. > > Here is a strategy that should allow auto detection without too much in the > way of inefficiency. The basic idea is to convert till you run into a problem, > store that data away, and continue the conversion with a new dtype. At the end > you assemble all the chunks of data you've accumulated into one large array. > It should be reasonably efficient in terms of both memory and speed. > > The implementation is a little rough, but it should get the idea across. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.hochberg at ieee.org Mon Jul 9 12:42:13 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Mon, 9 Jul 2007 09:42:13 -0700 Subject: [Numpy-discussion] Array protocol and wxPython Message-ID: This is mainly addressed to Chris Barker since he seems to follow wxPython pretty closely. Does anyone know if anyone ever tried to get wxPython to use the array protocol to speed up the internal conversion of arrays? I was toying with the idea of giving it a try if no one's got around to it. -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.hochberg at ieee.org Mon Jul 9 15:32:02 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Mon, 9 Jul 2007 12:32:02 -0700 Subject: [Numpy-discussion] Should bool_ subclass int? In-Reply-To: <468FBC40.8030006@ieee.org> References: <468F285D.30308@ieee.org> <468FBC40.8030006@ieee.org> Message-ID: On 7/7/07, Travis Oliphant wrote: > > > > > > > > On 7/6/07, *Travis Oliphant* > > wrote: > > > > Timothy Hochberg wrote: > > > > > > I'm working on getting some old code working with numpy and I > > noticed > > > that bool_ is not a subclass of int. 
Given that python's bool > > > subclasses into and that the other scalar types are subclasses of > > > their respective counterparts it seems at first glance that > > > numpy.bool_ should subclass python's bool, which in turn > subclasses > > > int. Or am I missing something here? > > The reason it is not, is because it is not binary compatible with > > Python's integer. The numpy bool_ is always only 8-bits while the > > Python integer is 32-bits or 64-bits. > > > > This could be changed I suspect, but then it would break the > > relationship between scalars and their array counterparts > > > > > > Do you have and idea off the top of your head head how painful this > > would be from an implementation standpoint. And is there a theoretical > > reason that it is important that the scalar and array implementations > > match? I would think that, conceptually, they are all 1-bit integers, > > and it seems that the 8-bit, versus 32- or 64-bits is just an > > implementation detail. > It would probably take about 2-3 hours to make the change and about 3 > more hours to fix the problems that were not anticipated. Basically, > we would have to special-case the bool like we do the unicode scalar > (which also doesn't necessarily match the array-based representation but > instead follows the Python implementation). > > I guess I don't really see a problem in switching just the numpy.bool_ > scalar to be a sub-class of the Python bool type and adjusting the code > to make the switch when creating a scalar. I gave this a try. Since so much code is auto-generated, it can be difficult to figure out what's going on in the core matrix stuff. Still, it seems like the solution is almost absurdly easy, consisting of changing only three lines. First off, does this seem right? Code compiled against this patch passes all tests and seems to run my application right, but that's not conclusive. Please let me know if I missed something obvious. -- . __ . |-\ . . tim.hochberg at ieee.org =================================================================== Index: numpy/core/code_generators/generate_array_api.py =================================================================== --- numpy/core/code_generators/generate_array_api.py (revision 3883) +++ numpy/core/code_generators/generate_array_api.py (working copy) @@ -17,7 +17,7 @@ typedef struct { PyObject_HEAD - npy_bool obval; + npy_long obval; } PyBoolScalarObject; Index: numpy/core/include/numpy/arrayscalars.h =================================================================== --- numpy/core/include/numpy/arrayscalars.h (revision 3883) +++ numpy/core/include/numpy/arrayscalars.h (working copy) @@ -1,7 +1,7 @@ #ifndef _MULTIARRAYMODULE typedef struct { PyObject_HEAD - npy_bool obval; + npy_long obval; } PyBoolScalarObject; #endif Index: numpy/core/src/multiarraymodule.c =================================================================== --- numpy/core/src/multiarraymodule.c (revision 3883) +++ numpy/core/src/multiarraymodule.c (working copy) @@ -7417,7 +7417,7 @@ return -1; \ } - SINGLE_INHERIT(Bool, Generic); + DUAL_INHERIT(Bool, Bool, Generic); SINGLE_INHERIT(Byte, SignedInteger); SINGLE_INHERIT(Short, SignedInteger); #if SIZEOF_INT == SIZEOF_LONG -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.hochberg at ieee.org Mon Jul 9 15:37:11 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Mon, 9 Jul 2007 12:37:11 -0700 Subject: [Numpy-discussion] Should bool_ subclass int? 
In-Reply-To: References: <468F285D.30308@ieee.org> <468FBC40.8030006@ieee.org> Message-ID: On 7/7/07, Alan G Isaac wrote: > > On Sat, 7 Jul 2007, Charles R Harris apparently wrote: > > In [60]: a > > Out[60]: array([ True, True, True, True], dtype=bool) > > In [61]: a + a > > Out[61]: array([ True, True, True, True], dtype=bool) > > Yea! > Behaves like a boolean array. > And for multiplication to. > And in boolean matrices, powers work right. > (I use this.) > > > > In [62]: a + 1 > > Out[62]: array([2, 2, 2, 2]) > > Yea! > Coercion to int, as expected. > > > > In [66]: True + True > > Out[66]: 2 > > Boo! > Hopefully Python will "fix" this one day. It will almost certainly not. And the fact that numpy and Python are inconsistent this way gives my the creeps. Why not simply use & and | instead of + and *? -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From barrywark at gmail.com Mon Jul 9 15:39:05 2007 From: barrywark at gmail.com (Barry Wark) Date: Mon, 9 Jul 2007 12:39:05 -0700 Subject: [Numpy-discussion] Buildbot for numpy In-Reply-To: <20070709032721.GA14850@dogbert.sdsl.sun.ac.za> References: <20070616081155.GC20362@mentat.za.net> <20070709032721.GA14850@dogbert.sdsl.sun.ac.za> Message-ID: Albert, Thanks for the info! I will run it by folks here and see what we can fiure out. We're using numpy and scipy very heavily in our internal software, so we have an interest in making sure it works on our platform. Hopefully we (here) can agree on a strategy that will satisfy me and the IT people. I'll get back to you ASAP. thanks, Barry On 7/8/07, Albert Strasheim wrote: > Hello > > On Mon, 02 Jul 2007, Barry Wark wrote: > > > I have the potential to add OS X Server Intel (64-bit) and OS X Intel > > (32-bit) to the list, if I can convince my boss that the security risk > > Sounds good. We could definitely use these platforms. > > > (including DOS from compile times) is minimal. I've compiled both > > Currently we don't allow builds to be forced from the web page, but this > might change in future. > > > numpy and scipy many times, so I'm not worried about resources for a > > single compile/test, but can any of the regular developers tell me > > about how many commits there are per day that will trigger a > > compile/test? > > We currently only build NumPy. SciPy should probably be added at some > point, once we figure out how we want to configure the Buildbot to do > this. NumPy averages close to 0 commits per day at this point. SciPy is > more active. Between the two, on a busy day, you could expect more than > 10 and less than 100 builds. > > > About the more general security risk of running a buildbot slave, from > > my reading of the buildbot manual (not the source, yet), it looks like > > the slave is a Twisted server that runs as a normal user process. Is > > there any sort of sandboxing built into the buildbot slave or is that > > the responsibility of the OS (an issue I'll have to discuss with our > > IT)? > > Through the buildbot master configuration, we tell your buildslave what > to check out and which commands to execute. We have set it up to do the > build in terms of a Makefile, so the master will tell the slave to run > "make build" followed by "make test". Here you can make your own > machine do anything that hopefully involves running python setup.py, > etc. However, the configuration on the master can be changed to make > your slave execute any command. 
> > In short, any NumPy/SciPy committer or anyone who controls the build > master configuration (i.e., me, Stefan, our admin person, a few other > people who have root access on that machine and anybody who > successfully breaks into it) can make your build machine execute > arbitrary code as the build slave user. > > The chance of this happening is small, but it's not impossible, so if > this risk is unacceptable to you/your IT people, running a build slave > might not be for you. ;-) > > Cheers, > > Albert > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From Chris.Barker at noaa.gov Mon Jul 9 15:54:02 2007 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 09 Jul 2007 12:54:02 -0700 Subject: [Numpy-discussion] Array protocol and wxPython In-Reply-To: References: Message-ID: <4692925A.8050206@noaa.gov> Timothy Hochberg wrote: > This is mainly addressed to Chris Barker since he seems to follow > wxPython pretty closely. I do try to. > Does anyone know if anyone ever tried to get wxPython to use the array > protocol to speed up the internal conversion of arrays? Not that I've heard of. A note to Robin would probably be in order, though. I've threatened to a few times, but don't have a compelling need at the moment. > I was toying > with the idea of giving it a try if no one's got around to it. That would be great! Let me know if I can help with testing or something. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From millman at berkeley.edu Mon Jul 9 16:00:24 2007 From: millman at berkeley.edu (Jarrod Millman) Date: Mon, 9 Jul 2007 13:00:24 -0700 Subject: [Numpy-discussion] scipy.test() warnings, errors and failures In-Reply-To: <1182805685.941953.31120@q69g2000hsb.googlegroups.com> References: <1182805685.941953.31120@q69g2000hsb.googlegroups.com> Message-ID: You can safely ignore the "ScipyTest is now called NumpyTest" warnings. To get rid of the errors you could try installing scipy from svn or user an older version of numpy. Good luck, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From saintmlx at apstat.com Mon Jul 9 18:13:34 2007 From: saintmlx at apstat.com (Xavier Saint-Mleux) Date: Mon, 09 Jul 2007 18:13:34 -0400 Subject: [Numpy-discussion] Conversion float64->int bugged? Message-ID: <4692B30E.4090404@apstat.com> Hi all, The conversion from a numpy scalar to a python int is not consistent with python's native conversion (or numarray's): if the scalar is out of bounds for an int, python and numarray automatically create a long while numpy still creates an int... with the wrong value. N.B. I am new to numpy, so please forgive me if this issue has already been discussed. I've quickly searched the archives and Travis's "Guide to NumPy", with no success. e.g. (using numpy 1.0.3): Python 2.4.3 (#2, Apr 27 2006, 14:43:58) [GCC 4.0.3 (Ubuntu 4.0.3-1ubuntu5)] on linux2 Type "help", "copyright", "credits" or "license" for more information. 
>>> from numpy import * >>> l= [1e3, 1e9, 1e15, -1e3, -1e9, -1e15] >>> a= array(l) >>> map(int, l) [1000, 1000000000, 1000000000000000L, -1000, -1000000000, -1000000000000000L] >>> map(int, a) [1000, 1000000000, -2147483648, -1000, -1000000000, -2147483648] >>> map(long, a) [1000L, 1000000000L, 1000000000000000L, -1000L, -1000000000L, -1000000000000000L] >>> IMHO, numpy's conversions to int should behave like Python's 'float_int' or 'long_int' functions (see $PYTHON_SRC_DIR/Objects/floatobject.c, $PYTHON_SRC_DIR/Objects/longobject.c): if it doesn't fit in an int, return a long. For now (svn), it seems that numpy is always using PyInt_FromLong after an implicit C cast to long (which silently fails; see $NUMPY_SRC_DIR/numpy/core/src/scalarmathmodule.c.src) Is there any reason not to change this? Thanks, Xavier Saint-Mleux From aisaac at american.edu Mon Jul 9 18:48:52 2007 From: aisaac at american.edu (Alan G Isaac) Date: Mon, 9 Jul 2007 18:48:52 -0400 Subject: [Numpy-discussion] Should bool_ subclass int? In-Reply-To: References: <468F285D.30308@ieee.org><468FBC40.8030006@ieee.org> Message-ID: On Mon, 9 Jul 2007, Timothy Hochberg apparently wrote: > Why not simply use & and | instead of + and *? A couple reasons, none determinative. 1. numpy is right a Python is wrong it this case (but granted, I would usually go with Python is such cases) 2. consistency with Boolean matrices Elaboration on 2: Boolean matrices currently behave as expected, with standard notation. Related to this, they handle exponents correctly. Suppose arrays are changed as you suggest. Then either - array behavior and matrix behavior are decoupled, or - matrix behavior is completely broken for boolen matrices Alan Isaac PS Examples of good behavior: >>> x matrix([[True, True], [True, False]], dtype=bool) >>> y matrix([[False, True], [True, False]], dtype=bool) >>> x*y matrix([[True, True], [False, True]], dtype=bool) >>> x**2 matrix([[True, True], [True, True]], dtype=bool) >>> From torgil.svensson at gmail.com Mon Jul 9 18:46:49 2007 From: torgil.svensson at gmail.com (Torgil Svensson) Date: Tue, 10 Jul 2007 00:46:49 +0200 Subject: [Numpy-discussion] convert csv file into recarray without pre-specifying dtypes and variable names In-Reply-To: References: Message-ID: Elegant solution. Very readable and takes care of row0 nicely. I want to point out that this is much more efficient than my version for random/late string representation changes throughout the conversion but it suffers from 2*n memory footprint and large block copying if the string rep changes arrives very early on huge datasets. I think we can't have best of both and Tims solution is better in the general case. Maybe "use one_alt if rownumber < xxx else use other_alt" can fine-tune performance for some cases. but even ten, with many cols, it's nearly impossible to know. //Torgil On 7/9/07, Timothy Hochberg wrote: > > > On 7/8/07, Vincent Nijs wrote: > > Thanks for looking into this Torgil! I agree that this is a much more > > complicated setup. I'll check if there is anything I can do on the data > end. > > Otherwise I'll go with Timothy's suggestion and read in numbers as floats > > and convert to int later as needed. > > Here is a strategy that should allow auto detection without too much in the > way of inefficiency. The basic idea is to convert till you run into a > problem, store that data away, and continue the conversion with a new dtype. > At the end you assemble all the chunks of data you've accumulated into one > large array. 
It should be reasonably efficient in terms of both memory and > speed. > > The implementation is a little rough, but it should get the idea across. > > -- > . __ > . |-\ > . > . tim.hochberg at ieee.org > > ======================================================================== > > def find_formats(items, last): > formats = [] > for i, x in enumerate(items): > dt, cvt = string_to_dt_cvt(x) > if last is not None: > last_cvt, last_dt = last[i] > if last_cvt is float and cvt is int: > cvt = float > formats.append((dt, cvt)) > return formats > > class LoadInfo(object): > def __init__(self, row0): > self.done = False > self.lastcols = None > self.row0 = row0 > > def data_iterator(lines, converters, delim, info): > yield tuple(f(x) for f, x in zip(converters, info.row0.split(delim))) > try: > for row in lines: > yield tuple(f(x) for f, x in zip(converters, row.split(delim))) > except: > info.row0 = row > else: > info.done = True > > def load2(fname,delim = ',', has_varnm = True, prn_report = True): > """ > Loading data from a file using the csv module. Returns a recarray. > """ > f=open(fname,'rb') > > if has_varnm: > varnames = [i.strip() for i in f.next().split(delim)] > else: > varnames = None > > > info = LoadInfo(f.next()) > chunks = [] > > while not info.done: > row0 = info.row0.split(delim) > formats = find_formats(row0, info.lastcols ) > if varnames is None: > varnames = varnm = ['col%s' % str(i+1) for i, _ in > enumerate(formate)] > descr=[] > conversion_functions=[] > for name, (dtype, cvt_fn) in zip(varnames, formats): > descr.append((name,dtype)) > conversion_functions.append(cvt_fn) > > chunks.append(N.fromiter(data_iterator(f, conversion_functions, > delim, info), descr)) > > if len(chunks) > 1: > n = sum(len(x) for x in chunks) > data = N.zeros([n], chunks[-1].dtype) > offset = 0 > for x in chunks: > delta = len(x) > data[offset:offset+delta] = x > offset += delta > else: > [data] = chunks > > # load report > if prn_report: > print > "##########################################\n" > print "Loaded file: %s\n" % fname > print "Nr obs: %s\n" % data.shape[0] > print "Variables and datatypes:\n" > for i in data.dtype.descr: > print "Varname: %s, Type: %s, Sample: %s" % (i[0], i[1], > str(data[i[0]][0:3])) > print > "\n##########################################\n" > > return data > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > From tim.hochberg at ieee.org Mon Jul 9 19:24:44 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Mon, 9 Jul 2007 16:24:44 -0700 Subject: [Numpy-discussion] Should bool_ subclass int? In-Reply-To: References: <468F285D.30308@ieee.org> <468FBC40.8030006@ieee.org> Message-ID: On 7/9/07, Alan G Isaac wrote: > > On Mon, 9 Jul 2007, Timothy Hochberg apparently wrote: > > Why not simply use & and | instead of + and *? > > A couple reasons, none determinative. > 1. numpy is right a Python is wrong it this case I don't think I agree with this. Once you've decided to make Boolean a subclass of Int, then Python's behavior seems to be the most sensible. One could argue (and people did) about whether that was a good choice, but it's useful for a lot of practical applications. In any event, given that Boolean subclasses Int, I think the current behavior is probably for the best. (but granted, I would usually go with Python is such cases) > 2. consistency with Boolean matrices OK. 
I sort of read past the fact that you were referring to matrices not arrays. This doesn't matter to me personally because I don't use the matrix class. I do do matrix algebra on occasion, but the matrix class has never been helpful for me. YMMV. Elaboration on 2: > Boolean matrices currently behave as expected, with standard > notation. Related to this, they handle exponents correctly. > > Suppose arrays are changed as you suggest. > Then either > - array behavior and matrix behavior are decoupled, or > - matrix behavior is completely broken for boolen matrices > > Alan Isaac > > PS Examples of good behavior: > > >>> x > matrix([[True, True], > [True, False]], dtype=bool) > >>> y > matrix([[False, True], > [True, False]], dtype=bool) > >>> x*y > matrix([[True, True], > [False, True]], dtype=bool) > >>> x**2 > matrix([[True, True], > [True, True]], dtype=bool) x*y and x**2 are already decoupled for arrays and matrices. What if x*y was simply defined to do a boolean matrix multiply when the arguments are boolean matrices? I don't care about this that much though, so I'll let it drop. -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.hochberg at ieee.org Mon Jul 9 19:32:25 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Mon, 9 Jul 2007 16:32:25 -0700 Subject: [Numpy-discussion] convert csv file into recarray without pre-specifying dtypes and variable names In-Reply-To: References: Message-ID: On 7/9/07, Torgil Svensson wrote: > > Elegant solution. Very readable and takes care of row0 nicely. > > I want to point out that this is much more efficient than my version > for random/late string representation changes throughout the > conversion but it suffers from 2*n memory footprint and large block > copying if the string rep changes arrives very early on huge datasets. Yep. I think we can't have best of both and Tims solution is better in the > general case. It probably would not be hard to do a hybrid version. One issue is that one doesn't, in general, know the size of the dataset in advance, so you'd have to use an absolute criteria (less than 100 lines) instead of a relative criteria (less than 20% done). I suppose you could stat the file or something, but that seems like overkill. Maybe "use one_alt if rownumber < xxx else use other_alt" can > fine-tune performance for some cases. but even ten, with many cols, > it's nearly impossible to know. That sounds sensible. I have an interesting thought on how to this that's a bit hard to describe. I'll try to throw it together and post another version today or tomorrow. //Torgil > > > On 7/9/07, Timothy Hochberg wrote: > > > > > > On 7/8/07, Vincent Nijs wrote: > > > Thanks for looking into this Torgil! I agree that this is a much more > > > complicated setup. I'll check if there is anything I can do on the > data > > end. > > > Otherwise I'll go with Timothy's suggestion and read in numbers as > floats > > > and convert to int later as needed. > > > > Here is a strategy that should allow auto detection without too much in > the > > way of inefficiency. The basic idea is to convert till you run into a > > problem, store that data away, and continue the conversion with a new > dtype. > > At the end you assemble all the chunks of data you've accumulated into > one > > large array. It should be reasonably efficient in terms of both memory > and > > speed. > > > > The implementation is a little rough, but it should get the idea across. > > > > -- > > . __ > > . 
|-\ > > . > > . tim.hochberg at ieee.org > > > > ======================================================================== > > > > def find_formats(items, last): > > formats = [] > > for i, x in enumerate(items): > > dt, cvt = string_to_dt_cvt(x) > > if last is not None: > > last_cvt, last_dt = last[i] > > if last_cvt is float and cvt is int: > > cvt = float > > formats.append((dt, cvt)) > > return formats > > > > class LoadInfo(object): > > def __init__(self, row0): > > self.done = False > > self.lastcols = None > > self.row0 = row0 > > > > def data_iterator(lines, converters, delim, info): > > yield tuple(f(x) for f, x in zip(converters, info.row0.split > (delim))) > > try: > > for row in lines: > > yield tuple(f(x) for f, x in zip(converters, row.split > (delim))) > > except: > > info.row0 = row > > else: > > info.done = True > > > > def load2(fname,delim = ',', has_varnm = True, prn_report = True): > > """ > > Loading data from a file using the csv module. Returns a recarray. > > """ > > f=open(fname,'rb') > > > > if has_varnm: > > varnames = [i.strip() for i in f.next().split(delim)] > > else: > > varnames = None > > > > > > info = LoadInfo(f.next()) > > chunks = [] > > > > while not info.done: > > row0 = info.row0.split(delim) > > formats = find_formats(row0, info.lastcols ) > > if varnames is None: > > varnames = varnm = ['col%s' % str(i+1) for i, _ in > > enumerate(formate)] > > descr=[] > > conversion_functions=[] > > for name, (dtype, cvt_fn) in zip(varnames, formats): > > descr.append((name,dtype)) > > conversion_functions.append(cvt_fn) > > > > chunks.append(N.fromiter(data_iterator(f, conversion_functions, > > delim, info), descr)) > > > > if len(chunks) > 1: > > n = sum(len(x) for x in chunks) > > data = N.zeros([n], chunks[-1].dtype) > > offset = 0 > > for x in chunks: > > delta = len(x) > > data[offset:offset+delta] = x > > offset += delta > > else: > > [data] = chunks > > > > # load report > > if prn_report: > > print > > "##########################################\n" > > print "Loaded file: %s\n" % fname > > print "Nr obs: %s\n" % data.shape[0] > > print "Variables and datatypes:\n" > > for i in data.dtype.descr: > > print "Varname: %s, Type: %s, Sample: %s" % (i[0], i[1], > > str(data[i[0]][0:3])) > > print > > "\n##########################################\n" > > > > return data > > > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.hochberg at ieee.org Mon Jul 9 23:18:05 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Mon, 9 Jul 2007 20:18:05 -0700 Subject: [Numpy-discussion] convert csv file into recarray without pre-specifying dtypes and variable names In-Reply-To: References: Message-ID: On 7/9/07, Timothy Hochberg wrote: > > > > On 7/9/07, Torgil Svensson wrote: > > > > Elegant solution. Very readable and takes care of row0 nicely. 
> > > > I want to point out that this is much more efficient than my version > > for random/late string representation changes throughout the > > conversion but it suffers from 2*n memory footprint and large block > > copying if the string rep changes arrives very early on huge datasets. > > > Yep. > > I think we can't have best of both and Tims solution is better in the > > general case. > > > It probably would not be hard to do a hybrid version. One issue is that > one doesn't, in general, know the size of the dataset in advance, so you'd > have to use an absolute criteria (less than 100 lines) instead of a relative > criteria (less than 20% done). I suppose you could stat the file or > something, but that seems like overkill. > > > Maybe "use one_alt if rownumber < xxx else use other_alt" can > > fine-tune performance for some cases. but even ten, with many cols, > > it's nearly impossible to know. > > > That sounds sensible. I have an interesting thought on how to this that's > a bit hard to describe. I'll try to throw it together and post another > version today or tomorrow. > OK, as promised, here's an approach that rebuilds the array if the format changes as long as the less than 'restart_length' lines have been processed. Otherwise, it uses the old strategy. Perhaps not the most efficient way, but it reuses what I'd already written with minimal changes. It's still pretty rough -- once again I didn't bother to polish it. def find_formats(items, last): formats = [] for i, x in enumerate(items): dt, cvt = string_to_dt_cvt(x) if last is not None: last_cvt, last_dt = last[i] if last_cvt is float and cvt is int: cvt = float formats.append((dt, cvt)) return formats class LoadInfo(object): def __init__(self, row0): self.done = False self.lastcols = None self.row0 = row0 self.predata = () def data_iterator(lines, converters, delim, info): for x in info.predata: yield x info.predata = () yield tuple(f(x) for f, x in zip(converters, info.row0.split(delim))) try: for row in lines: yield tuple(f(x) for f, x in zip(converters, row.split(delim))) except: info.row0 = row else: info.done = True def load2(fname,delim = ',', has_varnm = True, prn_report = True, restart_length=20): """ Loading data from a file using the csv module. Returns a recarray. 
""" f=open(fname,'rb') if has_varnm: varnames = [i.strip() for i in f.next().split(delim)] else: varnames = None info = LoadInfo(f.next()) chunks = [] while not info.done: row0 = info.row0.split(delim) formats = find_formats(row0, info.lastcols) if varnames is None: varnames = varnm = ['col%s' % str(i+1) for i, _ in enumerate(formate)] descr=[] conversion_functions=[] for name, (dtype, cvt_fn) in zip(varnames, formats): descr.append((name,dtype)) conversion_functions.append(cvt_fn) if len(chunks) == 1 and len(chunks[0]) < restart_length: info.predata = chunks[0].astype(descr) chunks = [] chunks.append(N.fromiter(data_iterator(f, conversion_functions, delim, info), descr)) if len(chunks) > 1: n = sum(len(x) for x in chunks) data = N.zeros([n], chunks[-1].dtype) offset = 0 for x in chunks: delta = len(x) data[offset:offset+delta] = x offset += delta else: [data] = chunks # load report if prn_report: print "##########################################\n" print "Loaded file: %s\n" % fname print "Nr obs: %s\n" % data.shape[0] print "Variables and datatypes:\n" for i in data.dtype.descr: print "Varname: %s, Type: %s, Sample: %s" % (i[0], i[1], str(data[i[0]][0:3])) print "\n##########################################\n" return data -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthieu.brucher at gmail.com Tue Jul 10 02:29:41 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Tue, 10 Jul 2007 08:29:41 +0200 Subject: [Numpy-discussion] Should bool_ subclass int? In-Reply-To: References: <468F285D.30308@ieee.org> <468FBC40.8030006@ieee.org> Message-ID: Hi, On Mon, 9 Jul 2007, Timothy Hochberg apparently wrote: > > > Why not simply use & and | instead of + and *? > > > > A couple reasons, none determinative. > > 1. numpy is right a Python is wrong it this case > > > I don't think I agree with this. Once you've decided to make Boolean a > subclass of Int, then Python's behavior seems to be the most sensible. One > could argue (and people did) about whether that was a good choice, but it's > useful for a lot of practical applications. In any event, given that Boolean > subclasses Int, I think the current behavior is probably for the best. > If bool subclasses int, this does not enforce True+True=2. Never. Boolean operation live in the Boole algebra and that's it. It's not the case with integers that cannot be represented with int. Now, if you take the algebra point of view, which is the point here, for a scientific application, you have to have True+True = True. Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL: From haase at msg.ucsf.edu Tue Jul 10 08:39:28 2007 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Tue, 10 Jul 2007 14:39:28 +0200 Subject: [Numpy-discussion] Should bool_ subclass int? In-Reply-To: References: <468FBC40.8030006@ieee.org> Message-ID: On 7/10/07, Matthieu Brucher wrote: > > Hi, > > > > > > > > On Mon, 9 Jul 2007, Timothy Hochberg apparently wrote: > > > > Why not simply use & and | instead of + and *? > > > > > > A couple reasons, none determinative. > > > 1. numpy is right a Python is wrong it this case > > > > > > I don't think I agree with this. Once you've decided to make Boolean a > subclass of Int, then Python's behavior seems to be the most sensible. One > could argue (and people did) about whether that was a good choice, but it's > useful for a lot of practical applications. 
In any event, given that Boolean > subclasses Int, I think the current behavior is probably for the best. > > > If bool subclasses int, this does not enforce True+True=2. Never. Boolean > operation live in the Boole algebra and that's it. It's not the case with > integers that cannot be represented with int. > Now, if you take the algebra point of view, which is the point here, for a > scientific application, you have to have True+True = True. > Matthieu When you talk about algebra - one might have to restrict one self to '|' and '&' -- not use '+' and '-' E.g.: True - True = False # right !? # but if: True+True = True. # then True+True -False = True -False # ???? # here I'm already lost ... I don't think this can be done in a consistent way. In other words: a "+" operator would also need a corresponding "-" operator, and that will just look funny. I think if you want algebra, you should restrict yourself to "|" (or) and "&" (and) My two cents, Sebastian From matthieu.brucher at gmail.com Tue Jul 10 08:45:15 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Tue, 10 Jul 2007 14:45:15 +0200 Subject: [Numpy-discussion] Should bool_ subclass int? In-Reply-To: References: <468FBC40.8030006@ieee.org> Message-ID: > > When you talk about algebra - one might have to restrict one self to '|' > and '&' > -- not use '+' and '-' > E.g.: > True - True = False # right !? Not exactly because - True = + True So True - True = True + True = True You have to stay in the algebra the whole time. # but if: > True+True = True. > # then > True+True -False = True -False # ???? > # here I'm already lost ... I don't think this can be done in a consistent > way. > > In other words: a "+" operator would also need a corresponding "-" > operator, and that will just look funny. I think if you want algebra, > you should restrict yourself to "|" (or) and "&" (and) When you make computation in the Bool algebra, you use + and * in every math book. In IT books, you see | and &. As Numpy is scientists oriented, I suppose that the definition of + and * is correct. Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Tue Jul 10 09:08:02 2007 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 10 Jul 2007 15:08:02 +0200 Subject: [Numpy-discussion] Should bool_ subclass int? In-Reply-To: References: <468FBC40.8030006@ieee.org> Message-ID: <20070710130802.GC12525@clipper.ens.fr> On Tue, Jul 10, 2007 at 02:39:28PM +0200, Sebastian Haase wrote: > When you talk about algebra - one might have to restrict one self to '|' and '&' > -- not use '+' and '-' > E.g.: > True - True = False # right !? > # but if: > True+True = True. > # then > True+True -False = True -False # ???? > # here I'm already lost ... I don't think this can be done in a consistent way. It can, its called the Bool algebra, and it is a consistent algebra, in a mathematical sense of algebra (http://en.wikipedia.org/wiki/Boolean_algebra), actually what we are talking about is the two element bool algebra (http://en.wikipedia.org/wiki/Two-element_Boolean_algebra), and the mathematical structure we are taling about is a ring, the wikipedia article is quite comprehensible (http://en.wikipedia.org/wiki/Ring_(mathematics)) > In other words: a "+" operator would also need a corresponding "-" Yes. In other words (the ensemble theory words) each element needs to have an opposite concerning the '+' law. To understand this you need a bit of algebra theory. 
* An algebra has 2 laws, lets call them "+" and "*". * Each law has a neutral element for this law, ie an element a for which "a + b = b" for all b in the algebra, lets write these "n+", and "n*". * Each element a is required to have an inverse for the "+", ie an element b for wich b + a = n+, lets write the opposite of b "-b". For integer, n+ = 0, n* = 1. For Booleans, n+ = False, and n+ = True, therefore, as Matthieu points out, -True = True, as True + True = n+ = True, and -False = True, as True + False = n+ = True. So you have a consistent algebra. Now there is a law for which every element does not have an inverse, it the "*" law. You can check the out for integers. It is also true for booleans. In fact, you can proove that in an ring, n+ cannot have an inverse for the * law (it the famous divide by zero error !). In conclusion, I would like to stress that, yes, +, - and * are well defined on booleans, the definition is universal, and please don't try to change it. Ga?l From charlesr.harris at gmail.com Tue Jul 10 10:04:08 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 10 Jul 2007 08:04:08 -0600 Subject: [Numpy-discussion] Should bool_ subclass int? In-Reply-To: <20070710130802.GC12525@clipper.ens.fr> References: <20070710130802.GC12525@clipper.ens.fr> Message-ID: On 7/10/07, Gael Varoquaux wrote: > > On Tue, Jul 10, 2007 at 02:39:28PM +0200, Sebastian Haase wrote: > > When you talk about algebra - one might have to restrict one self to '|' > and '&' > > -- not use '+' and '-' > > E.g.: > > True - True = False # right !? > > # but if: > > True+True = True. > > # then > > True+True -False = True -False # ???? > > # here I'm already lost ... I don't think this can be done in a > consistent way. > > It can, its called the Bool algebra, and it is a consistent algebra, in a > mathematical sense of algebra > (http://en.wikipedia.org/wiki/Boolean_algebra), actually what we are > talking about is the two element bool algebra > (http://en.wikipedia.org/wiki/Two-element_Boolean_algebra), and the > mathematical structure we are taling about is a ring, the wikipedia > article is quite comprehensible > (http://en.wikipedia.org/wiki/Ring_(mathematics)) > > > In other words: a "+" operator would also need a corresponding "-" > > Yes. In other words (the ensemble theory words) each element needs to > have an opposite concerning the '+' law. To understand this you need a > bit of algebra theory. > > * An algebra has 2 laws, lets call them "+" and "*". > > * Each law has a neutral element for this law, ie an element a for which > "a + b = b" for all b in the algebra, lets write these "n+", and "n*". > > * Each element a is required to have an inverse for the "+", ie an element > b for wich b + a = n+, lets write the opposite of b "-b". > > For integer, n+ = 0, n* = 1. > > For Booleans, n+ = False, and n+ = True, therefore, as Matthieu points > out, > -True = True, as True + True = n+ = True, > and -False = True, as True + False = n+ = True. > > So you have a consistent algebra. > > Now there is a law for which every element does not have an inverse, it > the "*" law. You can check the out for integers. It is also true for > booleans. In fact, you can proove that in an ring, n+ cannot have an > inverse for the * law (it the famous divide by zero error !). > > In conclusion, I would like to stress that, yes, +, - and * are well > defined on booleans, the definition is universal, and please don't try to > change it. 
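(A concrete numpy illustration of the inverse question discussed above, using scalar bool_ values; nothing here comes from the earlier posts. With 'or' as the addition nothing can undo a True, while with exclusive-or every element is its own inverse and '&' distributes over it:)

import numpy as N

t, f = N.bool_(True), N.bool_(False)

print t | t, f | t                      # True True  -> no way back to False once True
print t ^ t, f ^ f                      # False False -> each element is its own inverse
print (t ^ f) & t, (t & t) ^ (f & t)    # True True  -> '&' distributes over '^'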
The proper additive operation to make boolean algebra a ring is 'xor', so that 1 becomes its own inverse. Same thing in sigma rings, where folks used to use exclusive union just to make the algebra to work. But plain 'or' and 'union' work fine and are more intuitive even if they don't give the ring structure. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From aisaac at american.edu Tue Jul 10 10:36:55 2007 From: aisaac at american.edu (Alan G Isaac) Date: Tue, 10 Jul 2007 10:36:55 -0400 Subject: [Numpy-discussion] Should bool_ subclass int? In-Reply-To: <20070710130802.GC12525@clipper.ens.fr> References: <468FBC40.8030006@ieee.org><20070710130802.GC12525@clipper.ens.fr> Message-ID: I found Gael's presentation rather puzzling for two reasons. 1. It appears to contain a `+` vs. `*` confusion. See http://en.wikipedia.org/wiki/Two-element_Boolean_algebra 2. MUCH more importantly: In implementations of TWO, we interpret `-` as unary complementation (not e.g. as additive inverse; note True does not have one). So -True is False -False is True This matches numpy: >>> -N.array([False]) array([True], dtype=bool) >>> -N.array([True]) array([False], dtype=bool) This is a GOOD THING. However, a-b should then just be shorthand for a+(-b). Here numpy does not in my opinion behave correctly: >>> N.array([False])-N.array([True]) array([True], dtype=bool) >>> N.array([False])+(-N.array([True])) array([False], dtype=bool) The second answer is the right one, in this context. I would call this second answer a bug. Cheers, Alan Isaac From aisaac at american.edu Tue Jul 10 10:36:58 2007 From: aisaac at american.edu (Alan G Isaac) Date: Tue, 10 Jul 2007 10:36:58 -0400 Subject: [Numpy-discussion] Should bool_ subclass int? In-Reply-To: References: <468F285D.30308@ieee.org><468FBC40.8030006@ieee.org> Message-ID: On Mon, 9 Jul 2007, Timothy Hochberg apparently wrote: > x*y and x**2 are already decoupled for arrays and matrices. What if x*y was > simply defined to do a boolean matrix multiply when the arguments are > boolean matrices? > I don't care about this that much though, so I'll let it drop. So if x and y are arrays and you use `dot` you would get a different result than turning them into matrices and using `*`? I'd find that pretty odd. I'd also find it odd that equivalent element-by-element operations (`+`, `-`) would then return different outcomes for boolean arrays and boolean matrices. (This is what I meant by "decoupled".) This is just a user's perspective. I do not pretend to see into the design issues. However, daring to tread where I should not, I offer two observations: - matrices and 2-d arrays with dtype 'bool' should give the same result for "comparable" operations (where `dot` for arrays compares with `*` for matrices). - it would be possible to have a new class, say `boolmat`, that implements the expected behavior for boolen matrices and then make matrices and arrays of dtype 'bool' behave in the Python way (e.g., True+True is 2, yuck!). I am definitely NOT advocating this (I like the current arrangement), but it is a possibility. Cheers, Alan Isaac PS Here is Guido's justification for bool inheriting from int (http://www.python.org/dev/peps/pep-0285/). It seems that numpy's current behavior is closer to his "ideal world". 6) Should bool inherit from int? => Yes. In an ideal world, bool might be better implemented as a separate integer type that knows how to perform mixed-mode arithmetic. 
However, inheriting bool from int eases the implementation enormously (in part since all C code that calls PyInt_Check() will continue to work -- this returns true for subclasses of int). Also, I believe this is right in terms of substitutability: code that requires an int can be fed a bool and it will behave the same as 0 or 1. Code that requires a bool may not work when it is given an int; for example, 3 & 4 is 0, but both 3 and 4 are true when considered as truth values. From gael.varoquaux at normalesup.org Tue Jul 10 10:48:39 2007 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 10 Jul 2007 16:48:39 +0200 Subject: [Numpy-discussion] Should bool_ subclass int? In-Reply-To: References: <20070710130802.GC12525@clipper.ens.fr> Message-ID: <20070710144839.GF12525@clipper.ens.fr> On Tue, Jul 10, 2007 at 10:36:55AM -0400, Alan G Isaac wrote: > I found Gael's presentation rather puzzling for two reasons. > 1. It appears to contain a `+` vs. `*` confusion. > See http://en.wikipedia.org/wiki/Two-element_Boolean_algebra Damn it. I used math conventions, for "+" and "*" (in math the "+" law of a ring is the law for which every element has an inverse). I hadn't realized it was the opposite for intuitive understanding of booleans. > 2. MUCH more importantly: > In implementations of TWO, we interpret `-` as unary > complementation (not e.g. as additive inverse; note True > does not have one). Yes, indeed, as the law for which every element has an inverse is "*", the inverse for the "+" is not defined, and therefore the "-" sign cannot design it. You are quite right that it is impossible to define "-" on the boolean set in a way that makes it follow tradition integer operations. I don't know what the conclusion of this should be in terms of the original discussion. Sorry for the noise. Ga?l From aisaac at american.edu Tue Jul 10 11:31:35 2007 From: aisaac at american.edu (Alan G Isaac) Date: Tue, 10 Jul 2007 11:31:35 -0400 Subject: [Numpy-discussion] Should bool_ subclass int? In-Reply-To: References: <468FBC40.8030006@ieee.org><20070710130802.GC12525@clipper.ens.fr> Message-ID: Hi Gael, More important is the following. On Tue, 10 Jul 2007, Alan G Isaac apparently wrote: >>>> N.array([False])-N.array([True]) > array([True], dtype=bool) >>>> N.array([False])+(-N.array([True])) > array([False], dtype=bool) > The second answer is the right one, in this context. > I would call this [first!!!] answer a bug. Do you agree that the first (!!!) answer is a bug? (The basis is apparently performed as follows: integer array subtraction is first performed, and then nonzero ints are converted to True. But this gives the wrong answer and most critically breaks the equivalence of a-b and a+(-b).) Cheers, Alan From gael.varoquaux at normalesup.org Tue Jul 10 12:00:42 2007 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 10 Jul 2007 18:00:42 +0200 Subject: [Numpy-discussion] Should bool_ subclass int? In-Reply-To: References: Message-ID: <20070710160042.GC13468@clipper.ens.fr> On Tue, Jul 10, 2007 at 11:31:35AM -0400, Alan G Isaac wrote: > Do you agree that the first (!!!) answer is a bug? > (The basis is apparently performed as follows: > integer array subtraction is first performed, and > then nonzero ints are converted to True. But this > gives the wrong answer and most critically breaks > the equivalence of a-b and a+(-b).) 
OK, putting aside the useless maths, I agree that having a-b != a+(-b) is a problem in itself. If the numpy developers agree, I think the proper solution is: """ def __sub__(self, b): return self.__add__(-b) """ I think this should give more or less consistent operations. Gaël From tim.hochberg at ieee.org Tue Jul 10 12:25:53 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Tue, 10 Jul 2007 09:25:53 -0700 Subject: [Numpy-discussion] Should bool_ subclass int? In-Reply-To: References: <20070710130802.GC12525@clipper.ens.fr> Message-ID: [CHOP: lots of examples] It looks like bool_s could use some general rejiggering. Let me put forth a concrete proposal that's based on matching bool_ behaviour to that of Python's bools. There is another route that could be taken where bool_ and bool are completely decoupled, but I'll skip over that for now since I don't really think it's a good idea. 1. +,- are arithmetic operators and return ints, not booleans 2. *,** are arithmetic operators on scalars and arrays and return ints as above. 3. &,|,^ are the logical operators and return booleans. 4. *,** are defined on matrices to perform logical matrix multiplication and exponentiation. This seems like the simplest route towards something that is both internally self-consistent and consistent with Python. -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Jul 10 12:53:22 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 10 Jul 2007 10:53:22 -0600 Subject: [Numpy-discussion] Should bool_ subclass int? In-Reply-To: References: <20070710130802.GC12525@clipper.ens.fr> Message-ID: On 7/10/07, Timothy Hochberg wrote: > > > [CHOP: lots of examples] > > It looks like bool_s could use some general rejiggering. Let me put forth > a concrete proposal that's based on matching bool_ behaviour to that of > Python's bools. There is another route that could be taken where bool_ and > bool are completely decoupled, but I'll skip over that for now since I don't > really think it's a good idea. > > 1. +,- are arithmetic operators and return ints, not booleans > 2. *,** are arithmetic operators on scalars and arrays and return > ints as above. > 3. &,|,^ are the logical operators and return booleans. > 4. *,** are defined on matrices to perform logical matrix > multiplication and exponentiation. > > This seems like the simplest route towards something that is both > internally self-consistent and consistent with Python. Looks good to me. At least it would make things consistent with bool_ being a subclass of integers if we go that way. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.hochberg at ieee.org Tue Jul 10 13:08:49 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Tue, 10 Jul 2007 10:08:49 -0700 Subject: [Numpy-discussion] Conversion float64->int bugged? In-Reply-To: <4692B30E.4090404@apstat.com> References: <4692B30E.4090404@apstat.com> Message-ID: On 7/9/07, Xavier Saint-Mleux wrote: > > Hi all, > > The conversion from a numpy scalar to a python int is not consistent > with python's native conversion (or numarray's): if the scalar is out > of bounds for an int, python and numarray automatically create a long > while numpy still creates an int... with the wrong value. > > N.B. I am new to numpy, so please forgive me if this issue has already > been discussed.
I've quickly searched the archives and Travis's "Guide > to NumPy", with no success. > > e.g. (using numpy 1.0.3): > > Python 2.4.3 (#2, Apr 27 2006, 14:43:58) > [GCC 4.0.3 (Ubuntu 4.0.3-1ubuntu5)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> from numpy import * > >>> l= [1e3, 1e9, 1e15, -1e3, -1e9, -1e15] > >>> a= array(l) > >>> map(int, l) > [1000, 1000000000, 1000000000000000L, -1000, -1000000000, > -1000000000000000L] > >>> map(int, a) > [1000, 1000000000, -2147483648, -1000, -1000000000, -2147483648] > >>> map(long, a) > [1000L, 1000000000L, 1000000000000000L, -1000L, -1000000000L, > -1000000000000000L] > >>> > > IMHO, numpy's conversions to int should behave like Python's 'float_int' > or 'long_int' functions (see $PYTHON_SRC_DIR/Objects/floatobject.c, > $PYTHON_SRC_DIR/Objects/longobject.c): if it doesn't fit in an int, > return a long. For now (svn), it seems that numpy is always using > PyInt_FromLong after an implicit C cast to long (which silently fails; > see $NUMPY_SRC_DIR/numpy/core/src/scalarmathmodule.c.src) > > Is there any reason not to change this? FWIW, it seems like a good idea to me. -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From aisaac at american.edu Tue Jul 10 15:18:04 2007 From: aisaac at american.edu (Alan Isaac) Date: Tue, 10 Jul 2007 15:18:04 -0400 Subject: [Numpy-discussion] Should bool_ subclass int? In-Reply-To: References: <20070710130802.GC12525@clipper.ens.fr> Message-ID: On Tue, 10 Jul 2007, Timothy Hochberg wrote: > 1. > +,- are arithmetic operators and return ints not booleans > 2. > *,** are arithmetic operators on scalars and arrays and return ints as above. > 3. > &,|,^ are the logical operators and return booleans. > 4. > *,** are defined on matrices to perform logical matrix multiplication and exponation. I am not objecting to this, but I want to make sure the costs are not overlooked. Will multiplication of boolean matrices will be different than `dot`? (It will certainly be different than `dot` for "equivalent" 2-d arrays). If I understand, unary complementation (using `-`) will be lost: so there will be no operator for unary complementation. (You might say, what about `~`, which currently works, but if we are to match Python's behavior, that is lost too.) Cheers, Alan Isaac From mpmusu at cc.usu.edu Tue Jul 10 15:48:17 2007 From: mpmusu at cc.usu.edu (Mark.Miller) Date: Tue, 10 Jul 2007 13:48:17 -0600 Subject: [Numpy-discussion] another broadcast/fancy indexing question In-Reply-To: References: <20070710130802.GC12525@clipper.ens.fr> Message-ID: <4693E281.3030705@cc.usu.edu> Just ran across something that doesn't quite make sense to me at the moment. Here's some code: >>> numpy.__version__ '1.0.2' >>> >>> def f1(b,c): b=b.astype(int) c=c.astype(int) return b,c >>> b,c = numpy.fromfunction(f1,(5,5)) >>> a=numpy.zeros((2,12,5,5),int) >>> a1=a[0] >>> a1[:,b,c].shape (12, 5, 5) >>> a[0,:,b,c].shape (5, 5, 12) ###why does this not return (12,5,5)? >>> So in a nutshell, it's not completely clear to me why these are returning arrays of different shapes. Can someone shed some light? 
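The shapes can be read off from how numpy places the broadcast dimensions of array (fancy) indices; a short sketch, not from the thread, using index arrays equivalent to the ones built with fromfunction above:

    import numpy as N

    b, c = N.indices((5, 5))            # two (5, 5) integer index arrays
    a = N.zeros((2, 12, 5, 5), int)

    # Fancy indices adjacent to each other: their broadcast shape (5, 5)
    # stays in place, so the result is (12, 5, 5).
    print(a[0][:, b, c].shape)          # -> (12, 5, 5)

    # Fancy indices (the scalar 0, b and c) separated by a slice: the
    # broadcast shape (5, 5) moves to the front, followed by the sliced
    # axis of length 12.
    print(a[0, :, b, c].shape)          # -> (5, 5, 12)

    # Same thing with the scalar broadcast spelled out explicitly.
    print(a[N.zeros((5, 5), int), :, b, c].shape)   # -> (5, 5, 12)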
Thanks, -Mark From tim.hochberg at ieee.org Tue Jul 10 17:02:54 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Tue, 10 Jul 2007 14:02:54 -0700 Subject: [Numpy-discussion] another broadcast/fancy indexing question In-Reply-To: <4693E281.3030705@cc.usu.edu> References: <20070710130802.GC12525@clipper.ens.fr> <4693E281.3030705@cc.usu.edu> Message-ID: On 7/10/07, Mark.Miller wrote: > > Just ran across something that doesn't quite make sense to me at the > moment. > > Here's some code: > > >>> numpy.__version__ > '1.0.2' > >>> > >>> def f1(b,c): > b=b.astype(int) > c=c.astype(int) > return b,c > > >>> b,c = numpy.fromfunction(f1,(5,5)) > >>> a=numpy.zeros((2,12,5,5),int) > >>> a1=a[0] > >>> a1[:,b,c].shape > (12, 5, 5) > >>> a[0,:,b,c].shape > (5, 5, 12) ###why does this not return (12,5,5)? > >>> > > So in a nutshell, it's not completely clear to me why these are > returning arrays of different shapes. Can someone shed some light? It's because you are using arrays as indices (aka Fancy-Indexing). When you do this everything works differently. In this case, everything is being broadcast to the same shape. As I understand it (and I try to use only the simplest forms of fancy indexing), what you are doing is equivalent to: -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From numpy-discussion at robince.ftml.net Tue Jul 10 17:41:42 2007 From: numpy-discussion at robince.ftml.net (numpy-discussion at robince.ftml.net) Date: Tue, 10 Jul 2007 22:41:42 +0100 Subject: [Numpy-discussion] Problems building SVN on Windows In-Reply-To: References: Message-ID: <1184103702.16997.1199496005@webmail.messagingengine.com> Hi, I am trying to build numpy on Windows from the current SVN (rev. 3884 I think) and I'm having some problems. I've successfully built ATLAS and LAPACK. I've installed Python 2.5.1 from the standard windows installer. I also have a fresh install of Cygwin with all mingw development tools etc. I created a site.cfg file from the template with the following entries: ----------- [blas_opt] libraries = f77blas, cblas, atlas library_dirs = C:\cygwin\home\mqbxfri2\ATLAS\atlas_win32\lib include_dirs = C:\cygwin\home\mqbxfri2\ATLAS\atlas_win32\include # [lapack_opt] libraries = lapack, f77blas, cblas, atlas library_dirs = C:\cygwin\home\mqbxfri2\ATLAS\atlas_win32\lib include_dirs = C:\cygwin\home\mqbxfri2\ATLAS\atlas_win32\include [fftw] libraries = fftw3 library_dirs = "C:\fftw" ----------- So the first problem is this file doesn't seem to be recognised (and these library paths aren't picked up). I've tried from the cygwin shell and windows cmd and with the file next to setup.py and also in numpy/distutils. However the only paths searched are C:\,C:\Python25\Libs,C:\Python25\lib. I got around this by copying the .a files to C:\. This is a bit messy but at least it finds ATLAS and LAPACK. Then when I try to run config/build as described on the wiki I get an error about no valid fortran compiler being defined. The full output of python setup.py -v build --compiler=mingw32 is attached. I would be greatful if anyone could help me to get this working. Also if someone could provide more details on how to ensure fftw is used. The precompiled fftw library I downloaded provides libfftw3-3.dll as well as couple of others. Will this be picked up correctly by my fftw entry in site.cfg (if site.cfg was working) or should it be "libraries=fftw3-3"? 
Is it OK to use this DLL or do I have to build a static library myself from source. Finally, I gather the very nice scipy website/wiki is relatively new... I was wondering if you had considered a phpbb type forum for support issues such as this. I think this can be a lot more accessible for new users (such as myself)... Thanks in advance, Robin -------------- next part -------------- A non-text attachment was scrubbed... Name: build-v.out Type: application/octet-stream Size: 9918 bytes Desc: not available URL: From zzxxben at gmail.com Tue Jul 10 17:44:49 2007 From: zzxxben at gmail.com (Ben ZX) Date: Tue, 10 Jul 2007 14:44:49 -0700 Subject: [Numpy-discussion] How do I tell f2py to generate Numeric modules? Message-ID: I ran f2py. It seems to always generate NumPy modules. How do I tell f2py to generate Numeric modules? Ben -------------- next part -------------- An HTML attachment was scrubbed... URL: From mpmusu at cc.usu.edu Tue Jul 10 17:50:08 2007 From: mpmusu at cc.usu.edu (Mark.Miller) Date: Tue, 10 Jul 2007 15:50:08 -0600 Subject: [Numpy-discussion] another broadcast/fancy indexing question In-Reply-To: References: <20070710130802.GC12525@clipper.ens.fr> <4693E281.3030705@cc.usu.edu> Message-ID: <4693FF10.60007@cc.usu.edu> Sorry...can you clarify? I think that some of your message got cut off. -Mark Timothy Hochberg wrote: > It's because you are using arrays as indices (aka Fancy-Indexing). When > you do this everything works differently. In this case, everything is > being broadcast to the same shape. As I understand it (and I try to use > only the simplest forms of fancy indexing), what you are doing is > equivalent to: > > -- > . __ > . |-\ > . From tim.hochberg at ieee.org Tue Jul 10 18:00:17 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Tue, 10 Jul 2007 15:00:17 -0700 Subject: [Numpy-discussion] another broadcast/fancy indexing question In-Reply-To: <4693FF10.60007@cc.usu.edu> References: <20070710130802.GC12525@clipper.ens.fr> <4693E281.3030705@cc.usu.edu> <4693FF10.60007@cc.usu.edu> Message-ID: On 7/10/07, Mark.Miller wrote: > > Sorry...can you clarify? I think that some of your message got cut off. > > -Mark > > Timothy Hochberg wrote: > > It's because you are using arrays as indices (aka Fancy-Indexing). When > > you do this everything works differently. In this case, everything is > > being broadcast to the same shape. As I understand it (and I try to use > > only the simplest forms of fancy indexing), what you are doing is > > equivalent to: Sorry about that. The missing line is: a[zeros([5,5]),:,b,c].shape That is, your '0' is being broadcast into a 5x5 array to match the shapes of b and c. That is why the two forms you give are not equivalent. As to why you get that exact shape, I'd have to peruse the fancy indexing docs to figure it out -- things are a little weird when you use multidimensional indexing. > > > -- > > . __ > > . |-\ > > . > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From saintmlx at apstat.com Tue Jul 10 19:41:08 2007 From: saintmlx at apstat.com (saintmlx) Date: Tue, 10 Jul 2007 19:41:08 -0400 Subject: [Numpy-discussion] Conversion float64->int bugged? 
In-Reply-To: References: <4692B30E.4090404@apstat.com> Message-ID: <46941914.6040808@apstat.com> I opened a new ticket for this: http://projects.scipy.org/scipy/numpy/ticket/549 Xavier Timothy Hochberg wrote: > > > On 7/9/07, *Xavier Saint-Mleux* > wrote: > > Hi all, > > The conversion from a numpy scalar to a python int is not consistent > with python's native conversion (or numarray's): if the scalar is out > of bounds for an int, python and numarray automatically create a long > while numpy still creates an int... with the wrong value. > > N.B. I am new to numpy, so please forgive me if this issue has already > been discussed. I've quickly searched the archives and Travis's > "Guide > to NumPy", with no success. > > e.g. (using numpy 1.0.3): > > Python 2.4.3 (#2, Apr 27 2006, 14:43:58) > [GCC 4.0.3 (Ubuntu 4.0.3-1ubuntu5)] on linux2 > Type "help", "copyright", "credits" or "license" for more > information. > >>> from numpy import * > >>> l= [1e3, 1e9, 1e15, -1e3, -1e9, -1e15] > >>> a= array(l) > >>> map(int, l) > [1000, 1000000000, 1000000000000000L, -1000, -1000000000, > -1000000000000000L] > >>> map(int, a) > [1000, 1000000000, -2147483648, -1000, -1000000000, -2147483648] > >>> map(long, a) > [1000L, 1000000000L, 1000000000000000L, -1000L, -1000000000L, > -1000000000000000L] > >>> > > IMHO, numpy's conversions to int should behave like Python's > 'float_int' > or 'long_int' functions (see $PYTHON_SRC_DIR/Objects/floatobject.c, > $PYTHON_SRC_DIR/Objects/longobject.c): if it doesn't fit in an int, > return a long. For now (svn), it seems that numpy is always using > PyInt_FromLong after an implicit C cast to long (which silently fails; > see $NUMPY_SRC_DIR/numpy/core/src/scalarmathmodule.c.src) > > Is there any reason not to change this? > > > FWIW, it seems like a good idea to me. > > > > -- > . __ > . |-\ > . > . tim.hochberg at ieee.org > >------------------------------------------------------------------------ > >_______________________________________________ >Numpy-discussion mailing list >Numpy-discussion at scipy.org >http://projects.scipy.org/mailman/listinfo/numpy-discussion > > From mszpadzik at o2.pl Wed Jul 11 06:52:52 2007 From: mszpadzik at o2.pl (=?ISO-8859-2?Q?Micha=B3_Szpadzik?=) Date: Wed, 11 Jul 2007 12:52:52 +0200 Subject: [Numpy-discussion] Conversion from numarray to numpy Message-ID: <4694B684.1000108@o2.pl> Some time ago I wrote small lib for signal processing. Now I'm trying to convert it from numarray to numpy, and I've problem: Code in numarray: for i in range(len(myinput)-m+1): cin=tempmatr[:] ctmp=tempmatr[i] xtmp=(numarray.abs(cin-ctmp))<=r x2tmp=numarray.sum(numarray.transpose(xtmp)) mcount=numarray.sum(x2tmp==m) allcount=allcount+mcount Code in numpy: for i in range(len(myinput)-m+1): cin=tempmatr[:] ctmp=tempmatr[i] xtmp=(numpy.abs(cin-ctmp))<=r x2tmp=numpy.sum(numpy.transpose(xtmp)) mcount=numpy.sum(x2tmp==m) allcount=allcount+mcount In numarray line: xtmp=(numarray.abs(cin-ctmp))<=r returns array of ones and zeros, but in numpy xtmp=(numpy.abs(cin-ctmp))<=r returns True/False I would like to know how change this True/False to 1/0 THX -- Mike Szpadzik From kwgoodman at gmail.com Wed Jul 11 07:10:10 2007 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 11 Jul 2007 13:10:10 +0200 Subject: [Numpy-discussion] Conversion from numarray to numpy In-Reply-To: <4694B684.1000108@o2.pl> References: <4694B684.1000108@o2.pl> Message-ID: On 7/11/07, Micha? 
Szpadzik wrote: > I would like to know how change this True/False to 1/0 Multiplying by 1 changes the True/False to 1/0. But sum gives the same result for True/False as it does for 1/0. So maybe you don't need to convert from True/False to 1/0? From eike.welk at gmx.net Wed Jul 11 11:32:38 2007 From: eike.welk at gmx.net (Eike Welk) Date: Wed, 11 Jul 2007 17:32:38 +0200 Subject: [Numpy-discussion] Conversion from numarray to numpy In-Reply-To: <4694B684.1000108@o2.pl> References: <4694B684.1000108@o2.pl> Message-ID: <200707111732.38799.eike.welk@gmx.net> On Wednesday 11 July 2007 12:52, Micha? Szpadzik wrote: > I would like to know how change this True/False to 1/0 To convert an array of bool to an array double, you can use the function numpy.doube( ... ). The function numpy.int32( ... ) converts to integers. Regards, Eike. From saintmlx at apstat.com Wed Jul 11 13:43:23 2007 From: saintmlx at apstat.com (Xavier Saint-Mleux) Date: Wed, 11 Jul 2007 13:43:23 -0400 Subject: [Numpy-discussion] alter_code1.py changes .stddev() to .std() ... Message-ID: <469516BB.6050609@apstat.com> Hello, I'm having a lot of fun converting python code from numarray to numpy and I just found out that the alter_code1.py conversion script changes numarray's '.stddev()' to numpy's '.std()', even though the semantics are different ("sample" std.dev. or not; divide by N-1 vs. by N). Should I open yet another ticket about '.std()'? http://projects.scipy.org/scipy/numpy/ticket/502 http://projects.scipy.org/scipy/numpy/ticket/461 http://projects.scipy.org/scipy/numpy/ticket/388 This will be easy to solve once .std() has a parameter to choose the divisor, but what is everybody using in the mean time? Thanks, Xavier Saint-Mleux From eike.welk at gmx.net Wed Jul 11 15:46:54 2007 From: eike.welk at gmx.net (Eike Welk) Date: Wed, 11 Jul 2007 21:46:54 +0200 Subject: [Numpy-discussion] Conversion from numarray to numpy In-Reply-To: <200707111732.38799.eike.welk@gmx.net> References: <4694B684.1000108@o2.pl> <200707111732.38799.eike.welk@gmx.net> Message-ID: <200707112146.55626.eike.welk@gmx.net> Sorry, I made a typo! On Wednesday 11 July 2007 17:32, Eike Welk wrote: > On Wednesday 11 July 2007 12:52, Micha? Szpadzik wrote: > > I would like to know how change this True/False to 1/0 > > To convert an array of bool to an array double, you can use the > function numpy.doube( ... ). The function numpy.int32( ... ) It should read: numpy.double > converts to integers. > > Regards, > Eike. Eike. From dalcinl at gmail.com Wed Jul 11 17:38:51 2007 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Wed, 11 Jul 2007 18:38:51 -0300 Subject: [Numpy-discussion] How do I tell f2py to generate Numeric modules? In-Reply-To: References: Message-ID: On 7/10/07, Ben ZX wrote: > I ran f2py. It seems to always generate NumPy modules. > > How do I tell f2py to generate Numeric modules? If you do $ f2py -h you will see near the beginning the option '--2d-numeric'. 
I never tested it (moved to numpy from its very beginings) > > Ben > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From david at ar.media.kyoto-u.ac.jp Thu Jul 12 00:32:50 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 12 Jul 2007 13:32:50 +0900 Subject: [Numpy-discussion] Sum, multiply are slow ? Message-ID: <4695AEF2.7040800@ar.media.kyoto-u.ac.jp> Hi, While profiling some code, I noticed that sum in numpy is kind of slow once you use axis argument: import numpy as N a = N.random.randn(1e5, 30) %timeit N.sum(a) #-> 26.8ms %timeit N.sum(a, 1) #-> 65.5ms %timeit N.sum(a, 0) #-> 141ms Now, if I use some tricks, I get: %timeit N.sum(a) #-> 26.8 ms %timeit N.dot(a, N.ones(a.shape[1], a.dtype)) #-> 11.3ms %timeit N.dot(N.ones((1, a.shape[0]), a.dtype), a) #-> 15.5ms I realize that dot uses optimized libraries (atlas in my case) and all, but is there any way to improve this situation ? Cheers, David From openopt at ukr.net Thu Jul 12 01:48:55 2007 From: openopt at ukr.net (dmitrey) Date: Thu, 12 Jul 2007 08:48:55 +0300 Subject: [Numpy-discussion] Sum, multiply are slow ? In-Reply-To: <4695AEF2.7040800@ar.media.kyoto-u.ac.jp> References: <4695AEF2.7040800@ar.media.kyoto-u.ac.jp> Message-ID: <4695C0C7.1000206@ukr.net> very interesting, however, it would be better if you provide exact code. I didn't use timeit and I have some troubles with the module. Regards, D. David Cournapeau wrote: > Hi, > > While profiling some code, I noticed that sum in numpy is kind of > slow once you use axis argument: > > import numpy as N > a = N.random.randn(1e5, 30) > %timeit N.sum(a) #-> 26.8ms > %timeit N.sum(a, 1) #-> 65.5ms > %timeit N.sum(a, 0) #-> 141ms > > Now, if I use some tricks, I get: > > %timeit N.sum(a) #-> 26.8 ms > %timeit N.dot(a, N.ones(a.shape[1], a.dtype)) #-> 11.3ms > %timeit N.dot(N.ones((1, a.shape[0]), a.dtype), a) #-> 15.5ms > > I realize that dot uses optimized libraries (atlas in my case) and all, > but is there any way to improve this situation ? > > Cheers, > > David > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > From oliphant.travis at ieee.org Thu Jul 12 02:47:05 2007 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu, 12 Jul 2007 00:47:05 -0600 Subject: [Numpy-discussion] Sum, multiply are slow ? In-Reply-To: <4695AEF2.7040800@ar.media.kyoto-u.ac.jp> References: <4695AEF2.7040800@ar.media.kyoto-u.ac.jp> Message-ID: <4695CE69.8010806@ieee.org> David Cournapeau wrote: > Hi, > > While profiling some code, I noticed that sum in numpy is kind of > slow once you use axis argument: > Yes, this is expected because when using an access argument, the following two things can happen 1) You may be skipping over large chunks of memory to get to the next available number and out-of-cache memory access is slow. 2) You have to allocate a result array. 
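For anyone who wants to reproduce the comparison outside IPython (the %timeit lines above are an IPython convenience, which was asked about earlier in the thread), a sketch using only the standard timeit module; the array size matches the original post and ones_col is just a local name introduced here:

    import timeit

    setup = ("import numpy as N; "
             "a = N.random.randn(100000, 30); "
             "ones_col = N.ones(a.shape[1], a.dtype)")

    for stmt in ("N.sum(a)", "N.sum(a, 1)", "N.sum(a, 0)", "N.dot(a, ones_col)"):
        best = min(timeit.Timer(stmt, setup).repeat(repeat=3, number=10)) / 10.0
        print("%-20s %6.2f ms" % (stmt, best * 1e3))

This makes it easy to check the dot() work-around against the plain reductions on any machine.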
> import numpy as N > a = N.random.randn(1e5, 30) > %timeit N.sum(a) #-> 26.8ms > %timeit N.sum(a, 1) #-> 65.5ms > %timeit N.sum(a, 0) #-> 141ms > > Now, if I use some tricks, I get: > > %timeit N.sum(a) #-> 26.8 ms > %timeit N.dot(a, N.ones(a.shape[1], a.dtype)) #-> 11.3ms > %timeit N.dot(N.ones((1, a.shape[0]), a.dtype), a) #-> 15.5ms > > I realize that dot uses optimized libraries (atlas in my case) and all, > but is there any way to improve this situation ? > Sum does *not* use an optimized library so it is not too surprising that you can get speed-ups using ATLAS. It would be nice to do something to optimize the reduction functions in NumPy, but nobody has come forward with suggestions yet. Thanks for the reports, though. -Travis From david at ar.media.kyoto-u.ac.jp Thu Jul 12 02:33:03 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 12 Jul 2007 15:33:03 +0900 Subject: [Numpy-discussion] Sum, multiply are slow ? In-Reply-To: <4695CE69.8010806@ieee.org> References: <4695AEF2.7040800@ar.media.kyoto-u.ac.jp> <4695CE69.8010806@ieee.org> Message-ID: <4695CB1F.6060609@ar.media.kyoto-u.ac.jp> Travis Oliphant wrote: > David Cournapeau wrote: > >> Hi, >> >> While profiling some code, I noticed that sum in numpy is kind of >> slow once you use axis argument: >> >> > Yes, this is expected because when using an access argument, the > following two things can happen > > 1) You may be skipping over large chunks of memory to get to the next > available number and out-of-cache memory access is slow. > > 2) You have to allocate a result array. > > >> import numpy as N >> a = N.random.randn(1e5, 30) >> %timeit N.sum(a) #-> 26.8ms >> %timeit N.sum(a, 1) #-> 65.5ms >> %timeit N.sum(a, 0) #-> 141ms >> >> Now, if I use some tricks, I get: >> >> %timeit N.sum(a) #-> 26.8 ms >> %timeit N.dot(a, N.ones(a.shape[1], a.dtype)) #-> 11.3ms >> %timeit N.dot(N.ones((1, a.shape[0]), a.dtype), a) #-> 15.5ms >> >> I realize that dot uses optimized libraries (atlas in my case) and all, >> but is there any way to improve this situation ? >> >> > Sum does *not* use an optimized library so it is not too surprising that > you can get speed-ups using ATLAS. I understand that there is no optimization going on with sum or multiply. This was just to have a comparison (this kind of things varies *a lot* accross CPU of the same architecture). > It would be nice to do something to > optimize the reduction functions in NumPy, but nobody has come forward > with suggestions yet. > So this is possible to improve things ? I noticed that sum/multiply and co are using reduction functions. Should I follow the same scheme than what I did for clip (following dot related optimization, basically) ? David From kwgoodman at gmail.com Thu Jul 12 03:45:11 2007 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 12 Jul 2007 09:45:11 +0200 Subject: [Numpy-discussion] Sum, multiply are slow ? In-Reply-To: <4695AEF2.7040800@ar.media.kyoto-u.ac.jp> References: <4695AEF2.7040800@ar.media.kyoto-u.ac.jp> Message-ID: On 7/12/07, David Cournapeau wrote: > While profiling some code, I noticed that sum in numpy is kind of > slow once you use axis argument: Here is a related thread: http://projects.scipy.org/pipermail/numpy-discussion/2007-February/025903.html From david at ar.media.kyoto-u.ac.jp Thu Jul 12 04:51:28 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 12 Jul 2007 17:51:28 +0900 Subject: [Numpy-discussion] Sum, multiply are slow ? 
In-Reply-To: References: <4695AEF2.7040800@ar.media.kyoto-u.ac.jp> Message-ID: <4695EB90.1080606@ar.media.kyoto-u.ac.jp> Keith Goodman wrote: > On 7/12/07, David Cournapeau wrote: > >> While profiling some code, I noticed that sum in numpy is kind of >> slow once you use axis argument: >> > > Here is a related thread: > http://projects.scipy.org/pipermail/numpy-discussion/2007-February/025903.html > Thanks, I remembered there was something about that a few months ago, could not find it. By quickly looking at the code for PyArray_Sum, this looks like it has nothing to do with caching or looping, but the way that summing is implemented (generic reduce). David From mszpadzik at o2.pl Thu Jul 12 06:04:00 2007 From: mszpadzik at o2.pl (=?UTF-8?B?TWljaGHFgiBTenBhZHppaw==?=) Date: Thu, 12 Jul 2007 12:04:00 +0200 Subject: [Numpy-discussion] Conversion from numarray to numpy In-Reply-To: <4694B684.1000108@o2.pl> References: <4694B684.1000108@o2.pl> Message-ID: <4695FC90.8070904@o2.pl> THX for help and all answers. Code: for i in range(len(myinput)-m+1): cin=tempmatr[:] ctmp=tempmatr[i] xtmp=((numpy.abs(cin-ctmp))<=r)*1 x2tmp=numpy.sum(numpy.transpose(xtmp), axis=0) mcount=numpy.sum((x2tmp==m)*1) allcount=allcount+mcount works just fine. Now it's time to optimize my lib a little :D Regards From numpy-discussion at robince.ftml.net Fri Jul 13 06:36:19 2007 From: numpy-discussion at robince.ftml.net (numpy-discussion at robince.ftml.net) Date: Fri, 13 Jul 2007 11:36:19 +0100 Subject: [Numpy-discussion] Problems building numpy In-Reply-To: <1184103702.16997.1199496005@webmail.messagingengine.com> References: <1184103702.16997.1199496005@webmail.messagingengine.com> Message-ID: <1184322979.26088.1199971541@webmail.messagingengine.com> Hi, I am keen to evaluate numpy as an alternative to MATLAB for my PhD work and possible wider use within the department. To make a fairer comparison I wanted to build it with optimised ATLAS/LAPACK etc. hence building from source. I am running into some problems though. I am using Windows XP SP2, with latest Cygwin and I'm trying to follow the instructions on the wiki. Firstly, is what I'm trying possible? On the Installing Scipy/Windows page it says MinGW gcc and g77 are best supported, but also says to build against binary Python download you need to use MSVC. From http://boodebr.org/main/python/build-windows-extensions it seems building extensions with gcc is fine (and I built PyCrypto successfully as a test). So can I do what I am trying to do (build numpy/scipy on windows using cygwin without MSVC installed) against the downloaded Python distribution? If not, I can't find any resources about building Python from source on Windows using Cygwin, so it seems like I would be completely stuck. The next problem is that although I filled in the site.cfg file as documented (below), the setup.py script doesn't seem to pick it up and doesn't look in any of the specified directories. I can get around this by putting ATLAS/LAPACK libs in C:\, but obviously this isn't very satisfactory. Also is the entry for fftw correct? I couldn't find any information about this on the wiki. 
------ [blas_opt] libraries = f77blas, cblas, atlas library_dirs = C:\cygwin\home\mqbxfri2\ATLAS\atlas_win32\lib include_dirs = C:\cygwin\home\mqbxfri2\ATLAS\atlas_win32\include # [lapack_opt] libraries = lapack, f77blas, cblas, atlas library_dirs = C:\cygwin\home\mqbxfri2\ATLAS\atlas_win32\lib include_dirs = C:\cygwin\home\mqbxfri2\ATLAS\atlas_win32\include [fftw] libraries = fftw3-3 library_dirs = "C:\fftw" ----- Following this there seem to be some more problems with the setup. "python setup.py build --compiler=mingw32" fails with: File "C:\cygwin\home\mqbxfri2\numpy_trunk\numpy\distutils\fcompiler\__init__.py", line 731, in _find_existing_fcompiler c.customize(dist) AttributeError: 'NoneType' object has no attribute 'customize' c is the result of the new_fcompiler function. "python setup.py config_fc --help-fcompiler" failes with: File "c:\Python25\lib\distutils\msvccompiler.py", line 270, in initialize "version of the compiler, but it isn't installed." % self.__product) distutils.errors.DistutilsPlatformError: Python was built with Visual Studio version 7.1, and extensions need to be built with the same version of the compiler, but it isn't installed. Although as I mentioned I can successfully build extensions with gcc. Reading through the help I saw that there is a "none" fcompiler type, so using that I get a bit further: "python setup.py build --compiler=mingw32 --fcompiler=none" run initially with build directory removed gives the same NoneType error, but then running it again the build appears to start. I then run into bug #220 http://projects.scipy.org/scipy/numpy/ticket/220: numpy\core\src\multiarraymodule.c: In function `initmultiarray': numpy\core\src\multiarraymodule.c:7563: error: `NPY_ALLOW_THREADS' undeclared (first use in this function) numpy\core\src\multiarraymodule.c:7563: error: (Each undeclared identifier is reported only once numpy\core\src\multiarraymodule.c:7563: error: for each function it appears in.) I would really appreciate any help to get this working. I understand building numpy doesn't require a fortran compiler, but scipy does. I am hoping to build scipy as well, so presumably the config system needs to recognise the cygwin g77 compiler for that to work? During the config I also see the following message: don't know how to compile Fortran code on platform 'nt' with 'gnu' compiler. Supported compilers are: absoft Does this mean it isn't possible to build scipy on Windows with Cygwin compilers? If I am eventually successful I would be happy to update the wiki with some more detailed instructions based on my experiences. Finally, I can't find any discussion of the relevant merits of ATLAS vs MKL, other than the different licensing. Is it expected that MKL performs better? Which is recommended? Thanks very much, Robin From ivilata at carabos.com Fri Jul 13 07:35:27 2007 From: ivilata at carabos.com (Ivan Vilata i Balaguer) Date: Fri, 13 Jul 2007 13:35:27 +0200 Subject: [Numpy-discussion] [ANN] PyTables & PyTables Pro 2.0 released Message-ID: <20070713113527.GC6132@rampella.terramar.selidor.net> ======================================== Announcing PyTables & PyTables Pro 2.0 ======================================== PyTables is a library for managing hierarchical datasets and designed to efficiently cope with extremely large amounts of data with support for full 64-bit file addressing. PyTables runs on top of the HDF5 library and NumPy package for achieving maximum throughput and convenient use. 
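For readers who have not seen the package, a minimal usage sketch in the spirit of the 2.x API described below (the file name, table name and column layout here are made up for illustration):

    import tables

    class Readout(tables.IsDescription):
        name  = tables.StringCol(16)      # 16-character string column
        value = tables.Float64Col()       # double-precision column

    h5file = tables.openFile('demo.h5', mode='w')
    table = h5file.createTable('/', 'readout', Readout)

    row = table.row
    for i in range(10):
        row['name'] = 'sample%d' % i
        row['value'] = float(i)
        row.append()
    table.flush()

    # An in-kernel selection of the kind advertised for the 2.0 release.
    values = [r['value'] for r in table.where('value > 5')]
    h5file.close()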
After more than one year of continuous development and about five months of alpha, beta and release candidates, we are very happy to announce that the PyTables and PyTables Pro 2.0 are here. We are pretty confident that the 2.0 versions are ready to be used in production scenarios, bringing higher performance, better portability (specially in 64-bit environments) and more stability than the 1.x series. You can download a source package of the PyTables 2.0 with generated PDF and HTML docs and binaries for Windows from http://www.pytables.org/download/stable/ For an on-line version of the manual, visit: http://www.pytables.org/docs/manual-2.0 In case you want to know more in detail what has changed in this version, have a look at ``RELEASE_NOTES.txt``. Find the HTML version for this document at: http://www.pytables.org/moin/ReleaseNotes/Release_2.0 If you are a user of PyTables 1.x, probably it is worth for you to look at ``MIGRATING_TO_2.x.txt`` file where you will find directions on how to migrate your existing PyTables 1.x apps to the 2.0 version. You can find an HTML version of this document at http://www.pytables.org/moin/ReleaseNotes/Migrating_To_2.x Introducing PyTables Pro 2.0 ============================ The main difference between PyTables Pro and regular PyTables is that the Pro version includes OPSI, a new indexing technology, allowing to perform data lookups in tables exceeding 10 gigarows (10**10 rows) in less than 1 tenth of a second. Wearing more than 15000 tests and having passed the complete test suite in the most common platforms (Windows, Mac OS X, Linux 32-bit and Linux 64-bit), we are pretty confident that PyTables Pro 2.0 is ready to be used in production scenarios, bringing maximum stability and top performance to those users who need it. For more info about PyTables Pro, see: http://www.carabos.com/products/pytables-pro For the operational details and benchmarks see the OPSI white paper: http://www.carabos.com/docs/OPSI-indexes.pdf Coinciding with the publication of PyTables Pro we are introducing an innovative liberation process that will allow to ultimate release the PyTables Pro 2.x series as open source. You may want to know that, by buying a PyTables Pro license, you are contributing to this process. For details, see: http://www.carabos.com/liberation New features of PyTables 2.0 series =================================== - A complete refactoring of many, many modules in PyTables. With this, the different parts of the code are much better integrated and code redundancy is kept under a minimum. A lot of new optimizations have been included as well, making working with it a smoother experience than ever before. - NumPy is finally at the core! That means that PyTables no longer needs numarray in order to operate, although it continues to be supported (as well as Numeric). This also means that you should be able to run PyTables in scenarios combining Python 2.5 and 64-bit platforms (these are a source of problems with numarray/Numeric because they don't support this combination as of this writing). - Most of the operations in PyTables have experimented noticeable speed-ups (sometimes up to 2x, like in regular Python table selections). This is a consequence of both using NumPy internally and a considerable effort in terms of refactorization and optimization of the new code. - Combined conditions are finally supported for in-kernel selections. 
So, now it is possible to perform complex selections like:: result = [ row['var3'] for row in table.where('(var2 < 20) | (var1 == "sas")') ] or:: complex_cond = '((%s <= col5) & (col2 <= %s)) ' \ '| (sqrt(col1 + 3.1*col2 + col3*col4) > 3)' result = [ row['var3'] for row in table.where(complex_cond % (inf, sup)) ] and run them at full C-speed (or perhaps more, due to the cache-tuned computing kernel of Numexpr, which has been integrated into PyTables). - Now, it is possible to get fields of the ``Row`` iterator by specifying their position, or even ranges of positions (extended slicing is supported). For example, you can do:: result = [ row[4] for row in table # fetch field #4 if row[1] < 20 ] result = [ row[:] for row in table # fetch all fields if row['var2'] < 20 ] result = [ row[1::2] for row in # fetch odd fields table.iterrows(2, 3000, 3) ] in addition to the classical:: result = [row['var3'] for row in table.where('var2 < 20')] - ``Row`` has received a new method called ``fetch_all_fields()`` in order to easily retrieve all the fields of a row in situations like:: [row.fetch_all_fields() for row in table.where('column1 < 0.3')] The difference between ``row[:]`` and ``row.fetch_all_fields()`` is that the former will return all the fields as a tuple, while the latter will return the fields in a NumPy void type and should be faster. Choose whatever fits better to your needs. - Now, all data that is read from disk is converted, if necessary, to the native byteorder of the hosting machine (before, this only happened with ``Table`` objects). This should help to accelerate applications that have to do computations with data generated in platforms with a byteorder different than the user machine. - The modification of values in ``*Array`` objects (through __setitem__) now doesn't make a copy of the value in the case that the shape of the value passed is the same as the slice to be overwritten. This results in considerable memory savings when you are modifying disk objects with big array values. - All leaf constructors (except for ``Array``) have received a new ``chunkshape`` argument that lets the user explicitly select the chunksizes for the underlying HDF5 datasets (only for advanced users). - All leaf constructors have received a new parameter called ``byteorder`` that lets the user specify the byteorder of their data *on disk*. This effectively allows to create datasets in other byteorders than the native platform. - Native HDF5 datasets with ``H5T_ARRAY`` datatypes are fully supported for reading now. - The test suites for the different packages are installed now, so you don't need a copy of the PyTables sources to run the tests. Besides, you can run the test suite from the Python console by using:: >>> tables.tests() Resources ========= Go to the PyTables web site for more details: http://www.pytables.org/ Go to the PyTables Pro web page for more details: http://www.carabos.com/products/pytables-pro About the HDF5 library: http://hdfgroup.org/HDF5/ About NumPy: http://numpy.scipy.org/ To know more about the company behind the development of PyTables, see: http://www.carabos.com/ Acknowledgments =============== Thanks to many users who provided feature improvements, patches, bug reports, support and suggestions. See the ``THANKS`` file in the distribution package for a (incomplete) list of contributors. Many thanks also to SourceForge who have helped to make and distribute this package! And last, but not least thanks a lot to the HDF5 and NumPy (and numarray!) makers. 
Without them PyTables simply would not exist. Share your experience ===================== Let us know of any bugs, suggestions, gripes, kudos, etc. you may have. ---- **Enjoy data!** -- The PyTables Team -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 307 bytes Desc: Digital signature URL: From numpy-discussion at robince.ftml.net Fri Jul 13 10:12:42 2007 From: numpy-discussion at robince.ftml.net (numpy-discussion at robince.ftml.net) Date: Fri, 13 Jul 2007 15:12:42 +0100 Subject: [Numpy-discussion] Problems building numpy In-Reply-To: <1184322979.26088.1199971541@webmail.messagingengine.com> References: <1184103702.16997.1199496005@webmail.messagingengine.com> <1184322979.26088.1199971541@webmail.messagingengine.com> Message-ID: <1184335962.29888.1200000423@webmail.messagingengine.com> Hi, By making replacing line 807 in fcompiler/__init__.py (return None) with: from numpy.distutils.fcompiler.none import NoneFCompiler compiler = NoneFCompiler() return compiler I have been able to get a little bit further with my problems. I'm not sure if this is the correct way to do the import since I'm quite new to Python. It seems to build OK now, but there is a linking error: I'm not sure why -lmsvcr71 is included in the linker flags, since I should be using cygwin (setup.py build --compiler=mingw32). I think this is because it is not finding the fortran libraries needed. These should come with cygwin, but I'm not sure how to point to them. Also it still reports not finding a fortran compiler, although g77 is installed in cygwin. Looking through the code I'm wondering if there is some confusion between use of os.name (which is 'nt') and sys.platform (which is 'win32') - I thought perhaps that's why its reporting not Fortran compilers supported for 'nt'. Not too sure though. Again any help greatfully received. ... lots of similar undefinied references ... C:\/libf77blas.a(xerbla.o):xerbla.f:(.text+0xe): undefined reference to `_s_wsfe' C:\/libf77blas.a(xerbla.o):xerbla.f:(.text+0x29): undefined reference to `_do_fio' C:\/libf77blas.a(xerbla.o):xerbla.f:(.text+0x44): undefined reference to `_do_fio' C:\/libf77blas.a(xerbla.o):xerbla.f:(.text+0x49): undefined reference to `_e_wsfe' C:\/libf77blas.a(xerbla.o):xerbla.f:(.text+0x5d): undefined reference to `_s_stop' collect2: ld returned 1 exit status error: Command "g++ -mno-cygwin -shared build\temp.win32-2.5\Release\numpy\linal g\lapack_litemodule.o -LC:\ -Lc:\Python25\libs -Lc:\Python25\PCBuild -llapack -l f77blas -lcblas -latlas -lpython25 -lmsvcr71 -o build\lib.win32-2.5\numpy\linalg \lapack_lite.pyd" failed with exit status 1 Thanks, Robin From matthieu.brucher at gmail.com Fri Jul 13 10:52:35 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Fri, 13 Jul 2007 16:52:35 +0200 Subject: [Numpy-discussion] Problems building numpy In-Reply-To: <1184322979.26088.1199971541@webmail.messagingengine.com> References: <1184103702.16997.1199496005@webmail.messagingengine.com> <1184322979.26088.1199971541@webmail.messagingengine.com> Message-ID: Hi, What version of MSVC are you using ? If you really want to have an optimized version, don't use mingw (gcc is not up to date). Matthieu 2007/7/13, numpy-discussion at robince.ftml.net < numpy-discussion at robince.ftml.net>: > > Hi, > > I am keen to evaluate numpy as an alternative to MATLAB for my PhD work > and possible wider use within the department. 
To make a fairer > comparison I wanted to build it with optimised ATLAS/LAPACK etc. hence > building from source. > > I am running into some problems though. > > I am using Windows XP SP2, with latest Cygwin and I'm trying to follow > the instructions on the wiki. > > Firstly, is what I'm trying possible? On the Installing Scipy/Windows > page it says MinGW gcc and g77 are best supported, but also says to > build against binary Python download you need to use MSVC. From > http://boodebr.org/main/python/build-windows-extensions it seems > building extensions with gcc is fine (and I built PyCrypto successfully > as a test). So can I do what I am trying to do (build numpy/scipy on > windows using cygwin without MSVC installed) against the downloaded > Python distribution? If not, I can't find any resources about building > Python from source on Windows using Cygwin, so it seems like I would be > completely stuck. > > The next problem is that although I filled in the site.cfg file as > documented (below), the setup.py script doesn't seem to pick it up and > doesn't look in any of the specified directories. I can get around this > by putting ATLAS/LAPACK libs in C:\, but obviously this isn't very > satisfactory. Also is the entry for fftw correct? I couldn't find any > information about this on the wiki. > > ------ > [blas_opt] > libraries = f77blas, cblas, atlas > library_dirs = C:\cygwin\home\mqbxfri2\ATLAS\atlas_win32\lib > include_dirs = C:\cygwin\home\mqbxfri2\ATLAS\atlas_win32\include > # > [lapack_opt] > libraries = lapack, f77blas, cblas, atlas > library_dirs = C:\cygwin\home\mqbxfri2\ATLAS\atlas_win32\lib > include_dirs = C:\cygwin\home\mqbxfri2\ATLAS\atlas_win32\include > [fftw] > libraries = fftw3-3 > library_dirs = "C:\fftw" > ----- > > Following this there seem to be some more problems with the setup. > > "python setup.py build --compiler=mingw32" fails with: > File > > "C:\cygwin\home\mqbxfri2\numpy_trunk\numpy\distutils\fcompiler\__init__.py", > line 731, in _find_existing_fcompiler > c.customize(dist) > AttributeError: 'NoneType' object has no attribute 'customize' > > c is the result of the new_fcompiler function. > > "python setup.py config_fc --help-fcompiler" failes with: > File "c:\Python25\lib\distutils\msvccompiler.py", line 270, in > initialize > "version of the compiler, but it isn't installed." % self.__product) > distutils.errors.DistutilsPlatformError: Python was built with Visual > Studio version 7.1, and extensions need to be built with the same > version of the compiler, but it isn't installed. > > Although as I mentioned I can successfully build extensions with gcc. > > Reading through the help I saw that there is a "none" fcompiler type, so > using that I get a bit further: > > "python setup.py build --compiler=mingw32 --fcompiler=none" run > initially with build directory removed gives the same NoneType error, > but then running it again the build appears to start. I then run into > bug #220 http://projects.scipy.org/scipy/numpy/ticket/220: > > numpy\core\src\multiarraymodule.c: In function `initmultiarray': > numpy\core\src\multiarraymodule.c:7563: error: `NPY_ALLOW_THREADS' > undeclared (first use in this function) > numpy\core\src\multiarraymodule.c:7563: error: (Each undeclared > identifier is reported only once > numpy\core\src\multiarraymodule.c:7563: error: for each function it > appears in.) > > I would really appreciate any help to get this working. I understand > building numpy doesn't require a fortran compiler, but scipy does. 
I am > hoping to build scipy as well, so presumably the config system needs to > recognise the cygwin g77 compiler for that to work? During the config I > also see the following message: > don't know how to compile Fortran code on platform 'nt' with 'gnu' > compiler. Supported compilers are: absoft > Does this mean it isn't possible to build scipy on Windows with Cygwin > compilers? > > If I am eventually successful I would be happy to update the wiki with > some more detailed instructions based on my experiences. > > Finally, I can't find any discussion of the relevant merits of ATLAS vs > MKL, other than the different licensing. Is it expected that MKL > performs better? Which is recommended? > > Thanks very much, > > Robin > > > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bmschwar at fas.harvard.edu Fri Jul 13 11:06:35 2007 From: bmschwar at fas.harvard.edu (Benjamin M. Schwartz) Date: Fri, 13 Jul 2007 11:06:35 -0400 Subject: [Numpy-discussion] Problems building numpy In-Reply-To: <1184322979.26088.1199971541@webmail.messagingengine.com> References: <1184103702.16997.1199496005@webmail.messagingengine.com> <1184322979.26088.1199971541@webmail.messagingengine.com> Message-ID: <469794FB.70703@fas.harvard.edu> numpy-discussion at robince.ftml.net wrote: > I am keen to evaluate numpy as an alternative to MATLAB for my PhD work > and possible wider use within the department. To make a fairer > comparison I wanted to build it with optimised ATLAS/LAPACK etc. hence > building from source. Far and away the easiest solution is to install Gentoo Linux. All you need to do under gentoo is to add the "lapack" useflag and then "emerge numpy". Gentoo will automatically install ATLAS/LAPACK by default, compiled according to your settings, and then compile NumPy to use that LAPACK. As others have noted, running linux in a virtual machine under Windows may still be faster and easier than trying to configure a working installation in Windows. This is especially true with coLinux, which has near-zero overhead. --Ben From tim.hochberg at ieee.org Fri Jul 13 11:43:44 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Fri, 13 Jul 2007 08:43:44 -0700 Subject: [Numpy-discussion] Problems building numpy In-Reply-To: <1184335962.29888.1200000423@webmail.messagingengine.com> References: <1184103702.16997.1199496005@webmail.messagingengine.com> <1184322979.26088.1199971541@webmail.messagingengine.com> <1184335962.29888.1200000423@webmail.messagingengine.com> Message-ID: FWIW, I'm fairly certain that the binaries for win32 are compiled using mingw, so I'm pretty certain that it's possible. I use MSVCC myself, so I can't be of much help though. On 7/13/07, numpy-discussion at robince.ftml.net < numpy-discussion at robince.ftml.net> wrote: > > Hi, > > By making replacing line 807 in fcompiler/__init__.py (return None) > with: > from numpy.distutils.fcompiler.none import NoneFCompiler > compiler = NoneFCompiler() > return compiler > > I have been able to get a little bit further with my problems. I'm not > sure if this is the correct way to do the import since I'm quite new to > Python. > > It seems to build OK now, but there is a linking error: I'm not sure why > -lmsvcr71 is included in the linker flags, since I should be using > cygwin (setup.py build --compiler=mingw32). 
I think this is because it > is not finding the fortran libraries needed. These should come with > cygwin, but I'm not sure how to point to them. Also it still reports not > finding a fortran compiler, although g77 is installed in cygwin. Looking > through the code I'm wondering if there is some confusion between use of > os.name (which is 'nt') and sys.platform (which is 'win32') - I thought > perhaps that's why its reporting not Fortran compilers supported for > 'nt'. Not too sure though. Again any help greatfully received. > > ... lots of similar undefinied references ... > C:\/libf77blas.a(xerbla.o):xerbla.f:(.text+0xe): undefined reference to > `_s_wsfe' > C:\/libf77blas.a(xerbla.o):xerbla.f:(.text+0x29): undefined reference to > `_do_fio' > C:\/libf77blas.a(xerbla.o):xerbla.f:(.text+0x44): undefined reference to > `_do_fio' > C:\/libf77blas.a(xerbla.o):xerbla.f:(.text+0x49): undefined reference to > `_e_wsfe' > C:\/libf77blas.a(xerbla.o):xerbla.f:(.text+0x5d): undefined reference to > `_s_stop' > collect2: ld returned 1 exit status > error: Command "g++ -mno-cygwin -shared > build\temp.win32-2.5\Release\numpy\linal > g\lapack_litemodule.o -LC:\ -Lc:\Python25\libs -Lc:\Python25\PCBuild > -llapack -l > f77blas -lcblas -latlas -lpython25 -lmsvcr71 -o > build\lib.win32-2.5\numpy\linalg > \lapack_lite.pyd" failed with exit status 1 > > Thanks, > > Robin > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Fri Jul 13 18:37:52 2007 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Fri, 13 Jul 2007 15:37:52 -0700 Subject: [Numpy-discussion] Problems building numpy In-Reply-To: References: <1184103702.16997.1199496005@webmail.messagingengine.com> <1184322979.26088.1199971541@webmail.messagingengine.com> Message-ID: <4697FEC0.4090800@noaa.gov> > I am using Windows XP SP2, with latest Cygwin and I'm trying to follow > the instructions on the wiki. > > Firstly, is what I'm trying possible? On the Installing Scipy/Windows > page it says MinGW gcc and g77 are best supported, but also says to > build against binary Python download you need to use MSVC. With python 2.5, you can build extensions with MinGW. I haven't tried to build numpy though. > So can I do what I am trying to do (build numpy/scipy on > windows using cygwin without MSVC installed) against the downloaded > Python distribution? CygWin is NOT the same as MinGW. They both are gcc, but IIUC, cygwin builds against the cygwin unix-like standard libs, and MinGW builds against the system libs -- so only MinGW can link with a Python build with MSVC. I can't help you with MinGW and either g77 or BLAS/LAPACK though. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From stefan at sun.ac.za Sat Jul 14 10:38:44 2007 From: stefan at sun.ac.za (Stefan van der Walt) Date: Sat, 14 Jul 2007 16:38:44 +0200 Subject: [Numpy-discussion] Should bool_ subclass int? In-Reply-To: References: <468F285D.30308@ieee.org> <468FBC40.8030006@ieee.org> Message-ID: <20070714143843.GB7182@mentat.za.net> On Mon, Jul 09, 2007 at 12:32:02PM -0700, Timothy Hochberg wrote: > I gave this a try. 
Since so much code is auto-generated, it can be difficult to > figure out what's going on in the core matrix stuff. Still, it seems like the > solution is almost absurdly easy, consisting of changing only three lines. > First off, does this seem right? Code compiled against this patch passes all > tests and seems to run my application right, but that's not conclusive. > > Please let me know if I missed something obvious. Can we make this change, or should we discuss the patch further? Any comments, Travis? St?fan From haase at msg.ucsf.edu Sat Jul 14 16:43:44 2007 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Sat, 14 Jul 2007 22:43:44 +0200 Subject: [Numpy-discussion] numpy.where Message-ID: Hi. Two things. 1) The doc-string of numpy.where() states that transpose(where(cond, x,y)) whould always return a 2d-array. How can this be true?? It also says (before) that if x,y are given where(cond,x,y) always returns an array of the same shape as cond .... 2) Could we have another optional argument "dtype" in numpy.where()? Otherwise I would have to always write code like this: a = N.where( arr>x, 1.0, 0.0) a = a.astype(N.float32) I use N.__version__ == '1.0.1' Thanks, Sebastian Haase From robert.kern at gmail.com Sat Jul 14 17:15:00 2007 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 14 Jul 2007 16:15:00 -0500 Subject: [Numpy-discussion] numpy.where In-Reply-To: References: Message-ID: <46993CD4.70500@gmail.com> Sebastian Haase wrote: > Hi. > Two things. > 1) The doc-string of numpy.where() states that transpose(where(cond, > x,y)) whould always return a 2d-array. How can this be true?? It also > says (before) that if x,y are given where(cond,x,y) always returns an > array of the same shape as cond .... It is wrong. It actually meant transpose(where(condition)) > 2) Could we have another optional argument "dtype" in numpy.where()? > Otherwise I would have to always write code like this: > a = N.where( arr>x, 1.0, 0.0) > a = a.astype(N.float32) a = N.where(arr > x, N.float32(1), N.float32(0)) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From haase at msg.ucsf.edu Sun Jul 15 04:04:58 2007 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Sun, 15 Jul 2007 10:04:58 +0200 Subject: [Numpy-discussion] numpy.where In-Reply-To: <46993CD4.70500@gmail.com> References: <46993CD4.70500@gmail.com> Message-ID: On 7/14/07, Robert Kern wrote: > Sebastian Haase wrote: > > Hi. > > Two things. > > 1) The doc-string of numpy.where() states that transpose(where(cond, > > x,y)) whould always return a 2d-array. How can this be true?? It also > > says (before) that if x,y are given where(cond,x,y) always returns an > > array of the same shape as cond .... > > It is wrong. It actually meant > > transpose(where(condition)) > > > 2) Could we have another optional argument "dtype" in numpy.where()? > > Otherwise I would have to always write code like this: > > a = N.where( arr>x, 1.0, 0.0) > > a = a.astype(N.float32) > > a = N.where(arr > x, N.float32(1), N.float32(0)) If the x,y arguments are not scalars but (large) arrays this would need lots of unneccessary temporary memory (peak of 3 times the needed output memory size). I would wish that this function, and others which generate output arrays, all get an addition optional dtype argument. (Just like the functions in nd-image) Comments? Thanks for your quick reply, Robert, as always. 
-Sebastian From haase at msg.ucsf.edu Sun Jul 15 04:48:50 2007 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Sun, 15 Jul 2007 10:48:50 +0200 Subject: [Numpy-discussion] optimization of arr**-2 Message-ID: Hi, I compared for a 256x256 float32 normal-noise (x0=100,sigma=1) array the times to do 1./ (a*a) vs. a**-2 >>> U.timeIt('1./(a*a)', 1000) (0.00090877471871, 0.00939644563778, 0.00120674694689, 0.000687777554628) >>> U.timeIt('a**-2', 1000) (0.00876591857354, 0.0263829620803, 0.00952076311375, 0.00173311803255) The numbers are min,max, mean, stddev over thousand runs. N.__version == 1.0.1 The slowdown is almost 10 fold. Similar tests for **-1, and **2 show that the corresponding times are identical - i.e. those cases are optimized to not call the pow routine. Can this be fixed for the **-2 case ? Thanks, Sebastian Haase From robert.kern at gmail.com Sun Jul 15 06:02:56 2007 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 15 Jul 2007 05:02:56 -0500 Subject: [Numpy-discussion] numpy.where In-Reply-To: References: <46993CD4.70500@gmail.com> Message-ID: <4699F0D0.8010707@gmail.com> Sebastian Haase wrote: > On 7/14/07, Robert Kern wrote: >> Sebastian Haase wrote: >>> 2) Could we have another optional argument "dtype" in numpy.where()? >>> Otherwise I would have to always write code like this: >>> a = N.where( arr>x, 1.0, 0.0) >>> a = a.astype(N.float32) >> a = N.where(arr > x, N.float32(1), N.float32(0)) > > If the x,y arguments are not scalars but (large) arrays this would > need lots of unneccessary temporary memory (peak of 3 times the needed > output memory size). > I would wish that this function, and others which generate output > arrays, all get an addition optional dtype argument. (Just like the > functions in nd-image) > > Comments? I look forward to your patch. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From seb.haase at gmx.net Sun Jul 15 06:05:53 2007 From: seb.haase at gmx.net (Sebastian Haase) Date: Sun, 15 Jul 2007 12:05:53 +0200 Subject: [Numpy-discussion] numpy.where In-Reply-To: <4699F0D0.8010707@gmail.com> References: <46993CD4.70500@gmail.com> <4699F0D0.8010707@gmail.com> Message-ID: On 7/15/07, Robert Kern wrote: > Sebastian Haase wrote: > > On 7/14/07, Robert Kern wrote: > >> Sebastian Haase wrote: > > >>> 2) Could we have another optional argument "dtype" in numpy.where()? > >>> Otherwise I would have to always write code like this: > >>> a = N.where( arr>x, 1.0, 0.0) > >>> a = a.astype(N.float32) > >> a = N.where(arr > x, N.float32(1), N.float32(0)) > > > > If the x,y arguments are not scalars but (large) arrays this would > > need lots of unneccessary temporary memory (peak of 3 times the needed > > output memory size). > > I would wish that this function, and others which generate output > > arrays, all get an addition optional dtype argument. (Just like the > > functions in nd-image) > > > > Comments? > > I look forward to your patch. > Which file(s) should I be looking for? 
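A minimal sketch of the two workarounds discussed in this exchange, assuming `arr` is a large float array and a float32 result is wanted (the variable names are illustrative):

import numpy as N

arr = N.random.normal(100.0, 1.0, (256, 256)).astype(N.float32)
x = 100.0

# Robert's suggestion: float32 scalars keep the result in float32,
# avoiding the float64 intermediate that N.where(arr > x, 1.0, 0.0) creates.
a = N.where(arr > x, N.float32(1), N.float32(0))

# For the special case of a 0/1 result, casting the boolean mask
# directly needs only the boolean temporary.
b = (arr > x).astype(N.float32)

Either form avoids the extra full-size float64 copy that the astype() idiom in the original post goes through.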
From haase at msg.ucsf.edu Sun Jul 15 06:06:57 2007 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Sun, 15 Jul 2007 12:06:57 +0200 Subject: [Numpy-discussion] numpy.where In-Reply-To: <4699F0D0.8010707@gmail.com> References: <46993CD4.70500@gmail.com> <4699F0D0.8010707@gmail.com> Message-ID: On 7/15/07, Robert Kern wrote: > Sebastian Haase wrote: > > On 7/14/07, Robert Kern wrote: > >> Sebastian Haase wrote: > > >>> 2) Could we have another optional argument "dtype" in numpy.where()? > >>> Otherwise I would have to always write code like this: > >>> a = N.where( arr>x, 1.0, 0.0) > >>> a = a.astype(N.float32) > >> a = N.where(arr > x, N.float32(1), N.float32(0)) > > > > If the x,y arguments are not scalars but (large) arrays this would > > need lots of unneccessary temporary memory (peak of 3 times the needed > > output memory size). > > I would wish that this function, and others which generate output > > arrays, all get an addition optional dtype argument. (Just like the > > functions in nd-image) > > > > Comments? > > I look forward to your patch. Which file(s) should I be looking for? From robert.kern at gmail.com Sun Jul 15 06:20:07 2007 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 15 Jul 2007 05:20:07 -0500 Subject: [Numpy-discussion] numpy.where In-Reply-To: References: <46993CD4.70500@gmail.com> <4699F0D0.8010707@gmail.com> Message-ID: <4699F4D7.207@gmail.com> Sebastian Haase wrote: > Which file(s) should I be looking for? numpy/core/src/multiarraymodule.c . You will need to modify the functions array_where() and PyArray_Where(). Be sure to update ma.py to match, and update numarray/functions.py to make use of the out= argument (numarray had it and we just fake it). -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From tim.hochberg at ieee.org Sun Jul 15 09:36:33 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Sun, 15 Jul 2007 06:36:33 -0700 Subject: [Numpy-discussion] optimization of arr**-2 In-Reply-To: References: Message-ID: On 7/15/07, Sebastian Haase wrote: > > Hi, > I compared for a 256x256 float32 normal-noise (x0=100,sigma=1) array > the times to do > 1./ (a*a) > vs. > a**-2 > > >>> U.timeIt('1./(a*a)', 1000) > (0.00090877471871, 0.00939644563778, 0.00120674694689, 0.000687777554628) > >>> U.timeIt('a**-2', 1000) > (0.00876591857354, 0.0263829620803, 0.00952076311375, 0.00173311803255) > > The numbers are min,max, mean, stddev over thousand runs. > N.__version == 1.0.1 > > The slowdown is almost 10 fold. Similar tests for **-1, and **2 show > that the corresponding times are identical - i.e. those cases are > optimized to not call the pow routine. > > Can this be fixed for the **-2 case ? Not without some testing and discussion. If I recall correctly, we fixed all of the cases where the optimized case had the same accuracy as the unoptimized case. Optimizing 'x**-2', because it involves two operations (* and /), starts to become lose accuracy relative to pow(x,-2). The accuracy loss is relatively minor however; an additional ULP (unit in last place) or so, I believe. It's been a while however, so I may have the details scrambled. So, while I'm not dead set against it, I think we would definitely come to a consensus on how much accuracy we are willing to forgo for these notational conveniences. And document accordingly. The uncontroversial ones already got optimized. 
Then again, we could just leave things as they are and when you're hungry for speed, you could use '1/x**2'. -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at enthought.com Sun Jul 15 13:40:42 2007 From: travis at enthought.com (Travis Vaught) Date: Sun, 15 Jul 2007 12:40:42 -0500 Subject: [Numpy-discussion] ANN: SciPy 2007 Conference Updates Message-ID: <95D2E93A-0102-4BBF-BEDA-2BAE0CC20654@enthought.com> Greetings, We're excited to have *Ivan Krsti?*, the director of security architecture for the One Laptop Per Child project as our Keynote Speaker this year. The planning for the SciPy 2007 Conference is moving along. Please see below for some important updates. Schedule Available ------------------ The full schedule of talks has been posted here: http://www.scipy.org/SciPy2007/ConferenceSchedule Early Registration Extended --------------------------- If you haven't yet registered for the conference, the early registration deadline has been extended to Wednesday, July 18th, 2007. For more information on the conference see: http://www.scipy.org/SciPy2007 Student Sponsorship ------------------- Enthought, Inc. (http://www.enthought.com) is sponsoring the registration fees for up to 5 college or graduate students to attend the conference. To apply, please send a short description of what you are studying and why you'd like to attend to info at enthought.com. Please include telephone contact information. BOFs & Sprints -------------- If you're planning to attend and are interested in selecting BOF or Sprint Session topics, please weigh in at: BOFs: http://www.scipy.org/SciPy2007/BoFs Sprints: http://www.scipy.org/SciPy2007/Sprints We're looking forward to a great conference this year! Best, Travis From numpy-discussion at robince.ftml.net Mon Jul 16 06:27:01 2007 From: numpy-discussion at robince.ftml.net (Robin Ince) Date: Mon, 16 Jul 2007 11:27:01 +0100 Subject: [Numpy-discussion] Problems building numpy In-Reply-To: <4697FEC0.4090800@noaa.gov> References: <1184103702.16997.1199496005@webmail.messagingengine.com> <1184322979.26088.1199971541@webmail.messagingengine.com> <4697FEC0.4090800@noaa.gov> Message-ID: <1184581621.11298.1200344939@webmail.messagingengine.com> Hi, Thanks for the responses. On Fri, 13 Jul 2007 15:37:52 -0700, "Christopher Barker" said: > CygWin is NOT the same as MinGW. They both are gcc, but IIUC, cygwin > builds against the cygwin unix-like standard libs, and MinGW builds > against the system libs -- so only MinGW can link with a Python build > with MSVC. Cygwin has mingw packages which provide the MinGW libs, so that it effectively provides a MinGW install that can be used outside of Cygwin. Since Cygwin is required to build lapack and atlas (for make etc.) it doesn't make much sense to install MinGW seperately on its own. Anyway, the compiler is definitely working with Python because I successfully built the PyCrypto extension as a test. On Fri, 13 Jul 2007 16:52:35 +0200, "Matthieu Brucher" said: > What version of MSVC are you using ? > > If you really want to have an optimized version, don't use mingw (gcc > is not up to date). I'm not using MSVC - I was trying to build it with Cygwin/MinGW since it said on the Wiki that is what is best supported. On Fri, 13 Jul 2007 11:06:35 -0400, "Benjamin M. Schwartz" said: > Far and away the easiest solution is to install Gentoo Linux. All you > need to do under gentoo is to add the "lapack" useflag and then > "emerge numpy". 
Gentoo will automatically install ATLAS/LAPACK by > default, compiled according to your settings, and then compile NumPy > to use that LAPACK. I managed to successfully build everything on Linux without too much trouble. I guess what I'm trying to do on Windows isn't really possible at the moment - there seem to be some serious issues with distutils. As I mentioned, I can't get it to pick up any path settings from site.cfg, there is the bug with new_fcompiler returning None when a Compiler type is expected, it doesn't recognise the Cygwin/MinGW Fortran compiler, it appears to give the wrong linking flags for gcc from MinGW and also sometimes gives errors that MSVC is required, which is not true with Python 2.5. It seems most people building on Windows use MSVC, so I guess that is what I would try next, but for the time being I think I will try the coLinux suggestion. I was worried about matplotlib graphics over an X connection, but I guess there are other options like VNC etc. Perhaps the windows install section wiki could/should be updated to reflect the fact it's not currently possible to do this. Might save someone else some time! As for my unrelated question, I was still wondering if anyone has any information about the relative merits of MKL vs ATLAS etc. Thanks, Robin From matthieu.brucher at gmail.com Mon Jul 16 06:38:25 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Mon, 16 Jul 2007 12:38:25 +0200 Subject: [Numpy-discussion] Problems building numpy In-Reply-To: <1184581621.11298.1200344939@webmail.messagingengine.com> References: <1184103702.16997.1199496005@webmail.messagingengine.com> <1184322979.26088.1199971541@webmail.messagingengine.com> <4697FEC0.4090800@noaa.gov> <1184581621.11298.1200344939@webmail.messagingengine.com> Message-ID: > > As for my unrelated question, I was still wondering if anyone has any > information about the relative merits of MKL vs ATLAS etc. > MKL is parallized, so until ATLAS is as well, MKL has the upper hand. Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL: From y.copin at ipnl.in2p3.fr Mon Jul 16 09:13:09 2007 From: y.copin at ipnl.in2p3.fr (Yannick Copin) Date: Mon, 16 Jul 2007 15:13:09 +0200 Subject: [Numpy-discussion] Porting "IDL Astronomy User's Library" to numpy Message-ID: <469B6EE5.2050908@ipnl.in2p3.fr> Hi, I'd be interested in some astronomical utilities from the IDL Astronomy User's Library (http://idlastro.gsfc.nasa.gov/contents.html) converted to python/numpy. I had a look to idl2python (http://software.pseudogreen.org/i2py/), but the automatic translation fails, mostly because (I think) the conversion is Numeric-oriented, and because of the intrinsic differences in the function argument management between IDL and python. So, before pursuing in this direction, I'd like to know if this exercice has already been done, at least partially. Cheers. -- .~. Yannick COPIN (o:>* Doctus cum libro /V\ Institut de physique nucleaire de Lyon (IN2P3 - France) // \\ Tel: (33/0) 472 431 968 AIM: YnCopin ICQ: 236931013 /( )\ http://snovae.in2p3.fr/ycopin/ ^`~'^ From bioinformed at gmail.com Mon Jul 16 09:52:25 2007 From: bioinformed at gmail.com (Kevin Jacobs ) Date: Mon, 16 Jul 2007 09:52:25 -0400 Subject: [Numpy-discussion] NumPy/SciPy LAPACK version Message-ID: <2e1434c10707160652j2d0b3106o28f836096653e064@mail.gmail.com> Hi all, This is a bit of a SciPy question, but I thought I'd ask here since I'm already subscribed. 
I'd like to add some new LAPACK bindings to SciPy and was wondering if there was a minimum version requirement for LAPACK, since it would be ideal if I could use some of the newer 3.0 features. In addition to using some block methods only added in 3.0, it is very convenient to use the WORK=-1 for space queries instead of reimplementing the underlying logic in the calc_work module. The routines of most interest to me are: DGELSD DGGGLM DGGLSE I've also found that SciPy's sqrtm is broken: >>> a=matrix([[59, 64, 69], [64, 72, 80], [69, 80, 91]]) >>> sqrtm(a) array([[ 5.0084801 +1.03420519e-08j, 4.40747825 -2.06841037e-08j, 3.8064764 +1.03420519e-08j], [ 4.40747825 -2.06841037e-08j, 4.88353492 +4.13682075e-08j, 5.3595916 -2.06841037e-08j], [ 3.8064764 +1.03420519e-08j, 5.3595916 -2.06841037e-08j, 6.9127068 +1.03420519e-08j]]) >>> sqrtm(a)*sqrtm(a) array([[ 25.08487289 +1.03595922e-07j, 19.42586452 -1.82329475e-07j, 14.48926259 +7.87335527e-08j], [ 19.42586452 -1.82329475e-07j, 23.84891336 +4.04046172e-07j, 28.72522212 -2.21716697e-07j], [ 14.48926259 +7.87335527e-08j, 28.72522212 -2.21716697e-07j, 47.78551529 +1.42983144e-07j]]) So not even close... (and yes, it does deserve a bug report if one doesn't already exist) -Kevin -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthieu.brucher at gmail.com Mon Jul 16 10:09:02 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Mon, 16 Jul 2007 16:09:02 +0200 Subject: [Numpy-discussion] NumPy/SciPy LAPACK version In-Reply-To: <2e1434c10707160652j2d0b3106o28f836096653e064@mail.gmail.com> References: <2e1434c10707160652j2d0b3106o28f836096653e064@mail.gmail.com> Message-ID: Hi, Did you try numpy.dot(sqrtm(a), sqrtm(a)) ? Matthieu 2007/7/16, Kevin Jacobs : > > Hi all, > > This is a bit of a SciPy question, but I thought I'd ask here since I'm > already subscribed. I'd like to add some new LAPACK bindings to SciPy and > was wondering if there was a minimum version requirement for LAPACK, since > it would be ideal if I could use some of the newer 3.0 features. In > addition to using some block methods only added in 3.0, it is very > convenient to use the WORK=-1 for space queries instead of reimplementing > the underlying logic in the calc_work module. > > The routines of most interest to me are: > DGELSD > DGGGLM > DGGLSE > > I've also found that SciPy's sqrtm is broken: > > >>> a=matrix([[59, 64, 69], > [64, 72, 80], > [69, 80, 91]]) > >>> sqrtm(a) > array([[ 5.0084801 +1.03420519e-08j, 4.40747825 -2.06841037e-08j, > 3.8064764 +1.03420519e-08j], > [ 4.40747825 -2.06841037e-08j, 4.88353492 +4.13682075e-08j, > 5.3595916 -2.06841037e-08j], > [ 3.8064764 +1.03420519e-08j, 5.3595916 -2.06841037e-08j, > 6.9127068 +1.03420519e-08j]]) > >>> sqrtm(a)*sqrtm(a) > array([[ 25.08487289 +1.03595922e-07j, 19.42586452 -1.82329475e-07j, > 14.48926259 +7.87335527e-08j], > [ 19.42586452 -1.82329475e-07j, 23.84891336 +4.04046172e-07j, > 28.72522212 -2.21716697e-07j], > [ 14.48926259 +7.87335527e-08j, 28.72522212 -2.21716697e-07j, > 47.78551529 +1.42983144e-07j]]) > > So not even close... (and yes, it does deserve a bug report if one > doesn't already exist) > > -Kevin > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From matthieu.brucher at gmail.com Mon Jul 16 10:09:32 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Mon, 16 Jul 2007 16:09:32 +0200 Subject: [Numpy-discussion] NumPy/SciPy LAPACK version In-Reply-To: References: <2e1434c10707160652j2d0b3106o28f836096653e064@mail.gmail.com> Message-ID: Oups, sorry, I missed the 'a=matrix'... 2007/7/16, Matthieu Brucher : > > Hi, > > Did you try numpy.dot(sqrtm(a), sqrtm(a)) ? > > Matthieu > > 2007/7/16, Kevin Jacobs < bioinformed at gmail.com>: > > > > Hi all, > > > > This is a bit of a SciPy question, but I thought I'd ask here since I'm > > already subscribed. I'd like to add some new LAPACK bindings to SciPy and > > was wondering if there was a minimum version requirement for LAPACK, since > > it would be ideal if I could use some of the newer 3.0 features. In > > addition to using some block methods only added in 3.0, it is very > > convenient to use the WORK=-1 for space queries instead of reimplementing > > the underlying logic in the calc_work module. > > > > The routines of most interest to me are: > > DGELSD > > DGGGLM > > DGGLSE > > > > I've also found that SciPy's sqrtm is broken: > > > > >>> a=matrix([[59, 64, 69], > > [64, 72, 80], > > [69, 80, 91]]) > > >>> sqrtm(a) > > array([[ 5.0084801 +1.03420519e-08j, 4.40747825 -2.06841037e-08j, > > 3.8064764 +1.03420519e-08j], > > [ 4.40747825 -2.06841037e-08j, 4.88353492 +4.13682075e-08j, > > 5.3595916 -2.06841037e-08j], > > [ 3.8064764 +1.03420519e-08j, 5.3595916 -2.06841037e-08j, > > 6.9127068 +1.03420519e-08j]]) > > >>> sqrtm(a)*sqrtm(a) > > array([[ 25.08487289 +1.03595922e-07j, 19.42586452 -1.82329475e-07j, > > 14.48926259 +7.87335527e-08j], > > [ 19.42586452 -1.82329475e-07j, 23.84891336 +4.04046172e-07j, > > 28.72522212 -2.21716697e-07j], > > [ 14.48926259 +7.87335527e-08j, 28.72522212 -2.21716697e-07j, > > 47.78551529 +1.42983144e-07j]]) > > > > So not even close... (and yes, it does deserve a bug report if one > > doesn't already exist) > > > > -Kevin > > > > > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.leslie at gmail.com Mon Jul 16 10:46:35 2007 From: tim.leslie at gmail.com (Tim Leslie) Date: Tue, 17 Jul 2007 00:46:35 +1000 Subject: [Numpy-discussion] Porting "IDL Astronomy User's Library" to numpy In-Reply-To: <469B6EE5.2050908@ipnl.in2p3.fr> References: <469B6EE5.2050908@ipnl.in2p3.fr> Message-ID: On 7/16/07, Yannick Copin wrote: > > Hi, > > I'd be interested in some astronomical utilities from the IDL Astronomy > User's > Library (http://idlastro.gsfc.nasa.gov/contents.html) converted to > python/numpy. I had a look to idl2python > (http://software.pseudogreen.org/i2py/), but the automatic translation > fails, > mostly because (I think) the conversion is Numeric-oriented, and because > of > the intrinsic differences in the function argument management between IDL > and > python. Perhaps you could give some examples of where the conversion programs is not working for numpy. So, before pursuing in this direction, I'd like to know if this exercice has > already been done, at least partially. Have you contacted the author of the package? The website states that the package has been tested on the IDL astronomy library, so he should be able to give you idea of how successful it was. 
If you do hear back from him I'd be interested to know how successful this project is, as I'm always looking for reasons to convert users in my astronomy department from things like IDL on to python :-) Cheers, Tim Leslie Cheers. > -- > .~. Yannick COPIN (o:>* Doctus cum libro > /V\ Institut de physique nucleaire de Lyon (IN2P3 - France) > // \\ Tel: (33/0) 472 431 968 AIM: YnCopin ICQ: 236931013 > /( )\ http://snovae.in2p3.fr/ycopin/ > ^`~'^ > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From klemm at phys.ethz.ch Mon Jul 16 10:48:06 2007 From: klemm at phys.ethz.ch (Hanno Klemm) Date: Mon, 16 Jul 2007 16:48:06 +0200 Subject: [Numpy-discussion] NumPy/SciPy LAPACK version In-Reply-To: <2e1434c10707160652j2d0b3106o28f836096653e064@mail.gmail.com> Message-ID: Kevin, the problem appears to be that sqrtm() gives back an array, rather than a matrix: >>> import scipy.linalg as sl >>> a = s.matrix([[59, 64, 69],[64, 72, 80],[69, 80, 91]]) >>> type(a) >>> a matrix([[59, 64, 69], [64, 72, 80], [69, 80, 91]]) >>> a*a - N.dot(a,a) matrix([[0, 0, 0], [0, 0, 0], [0, 0, 0]]) >>> b = sl.sqrtm(a) >>> type(b) >>> N.dot(b,b) array([[ 59. +1.85288457e-22j, 64. -6.61744490e-23j, 69. +1.85288457e-22j], [ 64. -2.64697796e-23j, 72. -3.70576914e-22j, 80. -5.29395592e-23j], [ 69. +1.85288457e-22j, 80. -1.32348898e-22j, 91. +2.38228016e-22j]]) >>> b*b - N.dot(b,b) array([[-33.91512711 +1.03595922e-07j, -44.57413548 -1.82329475e-07j, -54.51073741 +7.87335527e-08j], [-44.57413548 -1.82329475e-07j, -48.15108664 +4.04046172e-07j, -51.27477788 -2.21716697e-07j], [-54.51073741 +7.87335527e-08j, -51.27477788 -2.21716697e-07j, -43.21448471 +1.42983144e-07j]]) >>> This certainly is a slightly unexpected behaviour. Hanno "Kevin Jacobs " said: > ------=_Part_59405_32758974.1184593945795 > Content-Type: text/plain; charset=ISO-8859-1; format=flowed > Content-Transfer-Encoding: 7bit > Content-Disposition: inline > > Hi all, > > This is a bit of a SciPy question, but I thought I'd ask here since I'm > already subscribed. I'd like to add some new LAPACK bindings to SciPy and > was wondering if there was a minimum version requirement for LAPACK, since > it would be ideal if I could use some of the newer 3.0 features. In > addition to using some block methods only added in 3.0, it is very > convenient to use the WORK=-1 for space queries instead of reimplementing > the underlying logic in the calc_work module. > > The routines of most interest to me are: > DGELSD > DGGGLM > DGGLSE > > I've also found that SciPy's sqrtm is broken: > > >>> a=matrix([[59, 64, 69], > [64, 72, 80], > [69, 80, 91]]) > >>> sqrtm(a) > array([[ 5.0084801 +1.03420519e-08j, 4.40747825 -2.06841037e-08j, > 3.8064764 +1.03420519e-08j], > [ 4.40747825 -2.06841037e-08j, 4.88353492 +4.13682075e-08j, > 5.3595916 -2.06841037e-08j], > [ 3.8064764 +1.03420519e-08j, 5.3595916 -2.06841037e-08j, > 6.9127068 +1.03420519e-08j]]) > >>> sqrtm(a)*sqrtm(a) > array([[ 25.08487289 +1.03595922e-07j, 19.42586452 -1.82329475e-07j, > 14.48926259 +7.87335527e-08j], > [ 19.42586452 -1.82329475e-07j, 23.84891336 +4.04046172e-07j, > 28.72522212 -2.21716697e-07j], > [ 14.48926259 +7.87335527e-08j, 28.72522212 -2.21716697e-07j, > 47.78551529 +1.42983144e-07j]]) > > So not even close... 
(and yes, it does deserve a bug report if one doesn't > already exist) > > -Kevin
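What the quoted examples come down to is that sqrtm() hands back a plain ndarray, so b*b is an elementwise product rather than a matrix product; a quick check along these lines (illustrative only, using the matrix from the quoted message) makes the difference visible:

import numpy as N
from scipy.linalg import sqrtm

a = N.array([[59., 64., 69.], [64., 72., 80.], [69., 80., 91.]])
b = sqrtm(a)                        # ndarray, with tiny imaginary parts
print N.allclose(N.dot(b, b), a)    # True: the matrix product recovers a
print N.allclose(b * b, a)          # False: the elementwise product does not

If a real-valued result is wanted here, taking b.real discards the tiny imaginary components.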
> > ------=_Part_59405_32758974.1184593945795-- > -- Hanno Klemm klemm at phys.ethz.ch From perry at stsci.edu Mon Jul 16 12:03:46 2007 From: perry at stsci.edu (Perry Greenfield) Date: Mon, 16 Jul 2007 12:03:46 -0400 Subject: [Numpy-discussion] [AstroPy] Porting "IDL Astronomy User's Library" to numpy In-Reply-To: <469B6EE5.2050908@ipnl.in2p3.fr> References: <469B6EE5.2050908@ipnl.in2p3.fr> Message-ID: <1F2751F6-60DF-4306-B254-D3814F1EF885@stsci.edu> On Jul 16, 2007, at 9:13 AM, Yannick Copin wrote: > Hi, > > I'd be interested in some astronomical utilities from the IDL > Astronomy User's > Library (http://idlastro.gsfc.nasa.gov/contents.html) converted to > python/numpy. I had a look to idl2python > (http://software.pseudogreen.org/i2py/), but the automatic > translation fails, > mostly because (I think) the conversion is Numeric-oriented, and > because of > the intrinsic differences in the function argument management > between IDL and > python. > > So, before pursuing in this direction, I'd like to know if this > exercice has > already been done, at least partially. > We have the idea of doing it, but not in a very literal sense (translating idl routines to Python counterparts). There has been some work in this area on our part, but because of budget pressures, much less over the last 2 years than hoped (things are looking better now, but it may be some months before activity picks up on this front again). Work so far on our part has centered on: Coordinate transformation utilities Synthetic photometry So if you are interested in doing more literal translations, please feel free to go right ahead. There is even a place to put such stuff: http://www.scipy.org/AstroLib Perry From bioinformed at gmail.com Mon Jul 16 12:24:52 2007 From: bioinformed at gmail.com (Kevin Jacobs ) Date: Mon, 16 Jul 2007 12:24:52 -0400 Subject: [Numpy-discussion] NumPy/SciPy LAPACK version In-Reply-To: References: <2e1434c10707160652j2d0b3106o28f836096653e064@mail.gmail.com> Message-ID: <2e1434c10707160924n5654b320wc7e29e82ba2b20d9@mail.gmail.com> Mea culpa on the msqrt example, however I still think it is wrong to get a complex square-root back when a real valued result is expected and exists. -Kevin On 7/16/07, Hanno Klemm wrote: > > > Kevin, > > the problem appears to be that sqrtm() gives back an array, rather > than a matrix: > > > >>> import scipy.linalg as sl > >>> a = s.matrix([[59, 64, 69],[64, 72, 80],[69, 80, 91]]) > >>> type(a) > > >>> a > matrix([[59, 64, 69], > [64, 72, 80], > [69, 80, 91]]) > >>> a*a - N.dot(a,a) > matrix([[0, 0, 0], > [0, 0, 0], > [0, 0, 0]]) > >>> b = sl.sqrtm(a) > >>> type(b) > > >>> N.dot(b,b) > array([[ 59. +1.85288457e-22j, 64. -6.61744490e-23j, 69. > +1.85288457e-22j], > [ 64. -2.64697796e-23j, 72. -3.70576914e-22j, 80. > -5.29395592e-23j], > [ 69. +1.85288457e-22j, 80. -1.32348898e-22j, 91. > +2.38228016e-22j]]) > >>> b*b - N.dot(b,b) > array([[-33.91512711 +1.03595922e-07j, -44.57413548 -1.82329475e-07j, > -54.51073741 +7.87335527e-08j], > [-44.57413548 -1.82329475e-07j, -48.15108664 +4.04046172e-07j, > -51.27477788 -2.21716697e-07j], > [-54.51073741 +7.87335527e-08j, -51.27477788 -2.21716697e-07j, > -43.21448471 +1.42983144e-07j]]) > >>> > > This certainly is a slightly unexpected behaviour. 
> > Hanno > > > "Kevin Jacobs " said: > > > ------=_Part_59405_32758974.1184593945795 > > Content-Type: text/plain; charset=ISO-8859-1; format=flowed > > Content-Transfer-Encoding: 7bit > > Content-Disposition: inline > > > > Hi all, > > > > This is a bit of a SciPy question, but I thought I'd ask here since I'm > > already subscribed. I'd like to add some new LAPACK bindings to > SciPy and > > was wondering if there was a minimum version requirement for LAPACK, > since > > it would be ideal if I could use some of the newer 3.0 features. In > > addition to using some block methods only added in 3.0, it is very > > convenient to use the WORK=-1 for space queries instead of > reimplementing > > the underlying logic in the calc_work module. > > > > The routines of most interest to me are: > > DGELSD > > DGGGLM > > DGGLSE > > > > I've also found that SciPy's sqrtm is broken: > > > > >>> a=matrix([[59, 64, 69], > > [64, 72, 80], > > [69, 80, 91]]) > > >>> sqrtm(a) > > array([[ 5.0084801 +1.03420519e-08j, 4.40747825 -2.06841037e-08j, > > 3.8064764 +1.03420519e-08j], > > [ 4.40747825 -2.06841037e-08j, 4.88353492 +4.13682075e-08j, > > 5.3595916 -2.06841037e-08j], > > [ 3.8064764 +1.03420519e-08j, 5.3595916 -2.06841037e-08j, > > 6.9127068 +1.03420519e-08j]]) > > >>> sqrtm(a)*sqrtm(a) > > array([[ 25.08487289 +1.03595922e-07j, 19.42586452 -1.82329475e-07j, > > 14.48926259 +7.87335527e-08j], > > [ 19.42586452 -1.82329475e-07j, 23.84891336 +4.04046172e-07j, > > 28.72522212 -2.21716697e-07j], > > [ 14.48926259 +7.87335527e-08j, 28.72522212 -2.21716697e-07j, > > 47.78551529 +1.42983144e-07j]]) > > > > So not even close... (and yes, it does deserve a bug report if one > doesn't > > already exist) > > > > -Kevin > > > > ------=_Part_59405_32758974.1184593945795 > > Content-Type: text/html; charset=ISO-8859-1 > > Content-Transfer-Encoding: 7bit > > Content-Disposition: inline > > > > Hi all,

> > > > ------=_Part_59405_32758974.1184593945795-- > > > > > > -- > Hanno Klemm > klemm at phys.ethz.ch > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Mon Jul 16 12:41:00 2007 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 16 Jul 2007 11:41:00 -0500 Subject: [Numpy-discussion] NumPy/SciPy LAPACK version In-Reply-To: <2e1434c10707160924n5654b320wc7e29e82ba2b20d9@mail.gmail.com> References: <2e1434c10707160652j2d0b3106o28f836096653e064@mail.gmail.com> <2e1434c10707160924n5654b320wc7e29e82ba2b20d9@mail.gmail.com> Message-ID: <469B9F9C.9010102@gmail.com> Kevin Jacobs wrote: > Mea culpa on the msqrt example, however I still think it is wrong to get > a complex square-root back when a real valued result is expected and exists. No, in floating point you accumulate error. Those 1e-22j's are part of the actual result. Some systems like MATLAB implicitly silent such small imaginary components; we don't. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Mon Jul 16 13:00:45 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 16 Jul 2007 11:00:45 -0600 Subject: [Numpy-discussion] NumPy/SciPy LAPACK version In-Reply-To: <469B9F9C.9010102@gmail.com> References: <2e1434c10707160652j2d0b3106o28f836096653e064@mail.gmail.com> <2e1434c10707160924n5654b320wc7e29e82ba2b20d9@mail.gmail.com> <469B9F9C.9010102@gmail.com> Message-ID: On 7/16/07, Robert Kern wrote: > > Kevin Jacobs wrote: > > Mea culpa on the msqrt example, however I still think it is wrong to get > > a complex square-root back when a real valued result is expected and > exists. > > No, in floating point you accumulate error. Those 1e-22j's are part of the > actual result. Some systems like MATLAB implicitly silent such small > imaginary > components; we don't. The problem is that the given matrix has a conditon number of about 10**17 and is almost singular. A singular value decomposition works fine, but apparently the sqrtm call suffers from roundoff and takes the sqrt of a negative number. Sqrtm returns real results in better conditioned cases. In [2]: sqrtm(eye(2)) Out[2]: array([[ 1., 0.], [ 0., 1.]]) Perhaps we aren't using the best method here. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From bioinformed at gmail.com Mon Jul 16 13:10:12 2007 From: bioinformed at gmail.com (Kevin Jacobs ) Date: Mon, 16 Jul 2007 13:10:12 -0400 Subject: [Numpy-discussion] NumPy/SciPy LAPACK version In-Reply-To: References: <2e1434c10707160652j2d0b3106o28f836096653e064@mail.gmail.com> <2e1434c10707160924n5654b320wc7e29e82ba2b20d9@mail.gmail.com> <469B9F9C.9010102@gmail.com> Message-ID: <2e1434c10707161010q26e660b1nfe49744bcf331cd5@mail.gmail.com> On 7/16/07, Charles R Harris wrote: > > > > On 7/16/07, Robert Kern wrote: > > > > Kevin Jacobs wrote: > > > Mea culpa on the msqrt example, however I still think it is wrong to > > get > > > a complex square-root back when a real valued result is expected and > > exists. > > > > No, in floating point you accumulate error. Those 1e-22j's are part of > > the > > actual result. 
Some systems like MATLAB implicitly silent such small > > imaginary > > components; we don't. > > > The problem is that the given matrix has a conditon number of about 10**17 > and is almost singular. A singular value decomposition works fine, but > apparently the sqrtm call suffers from roundoff and takes the sqrt of a > negative number. Sqrtm returns real results in better conditioned cases. > > In [2]: sqrtm(eye(2)) > Out[2]: > array([[ 1., 0.], > [ 0., 1.]]) > > > Perhaps we aren't using the best method here. > Here is a better conditioned example: >>> a array([[ 1. , 0.5 , 0.3333, 0.25 ], [ 0.5 , 0.3333, 0.25 , 0.2 ], [ 0.3333, 0.25 , 0.2 , 0.1667], [ 0.25 , 0.2 , 0.1667, 0.1429]]) >>> b=sqrtm(a) >>> dot(b,b) array([[ 1. +0.j, 0.5 +0.j, 0.3333+0.j, 0.25 +0.j], [ 0.5 +0.j, 0.3333+0.j, 0.25 +0.j, 0.2 +0.j], [ 0.3333+0.j, 0.25 +0.j, 0.2 +0.j, 0.1667+0.j], [ 0.25 +0.j, 0.2 +0.j, 0.1667+0.j, 0.1429+0.j]]) >>> dot(b,b)-a array([[ -1.99840144e-15+0.j, -9.43689571e-16+0.j, -5.55111512e-16+0.j, -5.55111512e-16+0.j], [ -1.05471187e-15+0.j, -5.55111512e-17+0.j, 5.55111512e-17+0.j, 0.00000000e+00+0.j], [ -6.66133815e-16+0.j, 1.11022302e-16+0.j, 1.66533454e-16+0.j, 1.11022302e-16+0.j], [ -5.55111512e-16+0.j, 1.11022302e-16+0.j, 1.38777878e-16+0.j, 8.32667268e-17+0.j]]) Also verified the results against svd and eigenvalue methods for computing msqrt. I suppose I'm just used to seeing msqrt() implemented completely using real valued arithmetic. -Kevin -------------- next part -------------- An HTML attachment was scrubbed... URL: From perry at stsci.edu Mon Jul 16 13:25:22 2007 From: perry at stsci.edu (Perry Greenfield) Date: Mon, 16 Jul 2007 13:25:22 -0400 Subject: [Numpy-discussion] [AstroPy] Porting "IDL Astronomy User's Library" to numpy In-Reply-To: <9DF78AC3-C057-489D-84FB-393371DAAC6A@gsfc.nasa.gov> References: <469B6EE5.2050908@ipnl.in2p3.fr> <1F2751F6-60DF-4306-B254-D3814F1EF885@stsci.edu> <9DF78AC3-C057-489D-84FB-393371DAAC6A@gsfc.nasa.gov> Message-ID: On Jul 16, 2007, at 1:19 PM, W.T. Bridgman wrote: > Perry, > > I believe some of those documents are getting a bit dated. They > still refer to only supporting numarray vs Numeric. Don't those need > to be updated to specify numpy? > Yes, that's certainly true. Having said that, it's probably going to be a month or two before I can get to updating them (hopefully I can find a couple hours before then). I'll see if I can find someone else to do that sooner. > Newcomers to the list might be confused if not familiar with the > history, especially considering the numpy begat numeric begat > numarray begat numpy timeline. > > Tom > From charlesr.harris at gmail.com Mon Jul 16 13:37:54 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 16 Jul 2007 11:37:54 -0600 Subject: [Numpy-discussion] NumPy/SciPy LAPACK version In-Reply-To: <2e1434c10707161010q26e660b1nfe49744bcf331cd5@mail.gmail.com> References: <2e1434c10707160652j2d0b3106o28f836096653e064@mail.gmail.com> <2e1434c10707160924n5654b320wc7e29e82ba2b20d9@mail.gmail.com> <469B9F9C.9010102@gmail.com> <2e1434c10707161010q26e660b1nfe49744bcf331cd5@mail.gmail.com> Message-ID: On 7/16/07, Kevin Jacobs wrote: > > On 7/16/07, Charles R Harris wrote: > > > > > > > > On 7/16/07, Robert Kern < robert.kern at gmail.com> wrote: > > > > > > Kevin Jacobs wrote: > > > > Mea culpa on the msqrt example, however I still think it is wrong to > > > get > > > > a complex square-root back when a real valued result is expected and > > > exists. 
> > > > > > No, in floating point you accumulate error. Those 1e-22j's are part of > > > the > > > actual result. Some systems like MATLAB implicitly silent such small > > > imaginary > > > components; we don't. > > > > > > The problem is that the given matrix has a conditon number of about > > 10**17 and is almost singular. A singular value decomposition works fine, > > but apparently the sqrtm call suffers from roundoff and takes the sqrt of a > > negative number. Sqrtm returns real results in better conditioned cases. > > > > In [2]: sqrtm(eye(2)) > > Out[2]: > > array([[ 1., 0.], > > [ 0., 1.]]) > > > > > > Perhaps we aren't using the best method here. > > > > > Here is a better conditioned example: > > >>> a > array([[ 1. , 0.5 , 0.3333, 0.25 ], > [ 0.5 , 0.3333, 0.25 , 0.2 ], > [ 0.3333, 0.25 , 0.2 , 0.1667], > [ 0.25 , 0.2 , 0.1667, 0.1429]]) > >>> b=sqrtm(a) > >>> dot(b,b) > array([[ 1. +0.j, 0.5 +0.j, 0.3333+0.j, 0.25 +0.j], > [ 0.5 +0.j, 0.3333+0.j, 0.25 +0.j, 0.2 +0.j], > [ 0.3333+0.j, 0.25 +0.j, 0.2 +0.j, 0.1667+0.j], > [ 0.25 +0.j, 0.2 +0.j, 0.1667+0.j, 0.1429+0.j]]) > >>> dot(b,b)-a > array([[ -1.99840144e-15+0.j, -9.43689571e-16+0.j, -5.55111512e-16+0.j, > -5.55111512e-16+0.j], > [ -1.05471187e-15+0.j, -5.55111512e-17+0.j , 5.55111512e-17+0.j, > 0.00000000e+00+0.j], > [ -6.66133815e-16+0.j, 1.11022302e-16+0.j, 1.66533454e-16+0.j, > 1.11022302e-16+0.j], > [ -5.55111512e-16+0.j, 1.11022302e-16+0.j , 1.38777878e-16+0.j, > 8.32667268e-17+0.j]]) > > Also verified the results against svd and eigenvalue methods for computing > msqrt. I suppose I'm just used to seeing msqrt() implemented completely > using real valued arithmetic. Hmm, I get a real result for this, although the result is wildly incorrect. Sqrtm isn't part of numpy, where are you getting it from? Mine is coming from pylab and looks remarkably buggy. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From bioinformed at gmail.com Mon Jul 16 13:40:25 2007 From: bioinformed at gmail.com (Kevin Jacobs ) Date: Mon, 16 Jul 2007 13:40:25 -0400 Subject: [Numpy-discussion] NumPy/SciPy LAPACK version In-Reply-To: <2e1434c10707160652j2d0b3106o28f836096653e064@mail.gmail.com> References: <2e1434c10707160652j2d0b3106o28f836096653e064@mail.gmail.com> Message-ID: <2e1434c10707161040p25a0d176kf12b23b7d1825853@mail.gmail.com> On 7/16/07, Kevin Jacobs wrote: > > This is a bit of a SciPy question, but I thought I'd ask here since I'm > already subscribed. I'd like to add some new LAPACK bindings to SciPy and > was wondering if there was a minimum version requirement for LAPACK, since > it would be ideal if I could use some of the newer 3.0 features. In > addition to using some block methods only added in 3.0, it is very > convenient to use the WORK=-1 for space queries instead of reimplementing > the underlying logic in the calc_work module. > > The routines of most interest to me are: > DGELSD > DGGGLM > DGGLSE > STEGR Thanks for all of the feedback on sqrtm. Can anyone comment on the suitability of adding LAPACK 3.0 functions to SciPy.LinAlg? I need to do the work regardless, but being able to contribute it back would be very nice. -Kevin -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bioinformed at gmail.com Mon Jul 16 13:42:45 2007 From: bioinformed at gmail.com (Kevin Jacobs ) Date: Mon, 16 Jul 2007 13:42:45 -0400 Subject: [Numpy-discussion] NumPy/SciPy LAPACK version In-Reply-To: References: <2e1434c10707160652j2d0b3106o28f836096653e064@mail.gmail.com> <2e1434c10707160924n5654b320wc7e29e82ba2b20d9@mail.gmail.com> <469B9F9C.9010102@gmail.com> <2e1434c10707161010q26e660b1nfe49744bcf331cd5@mail.gmail.com> Message-ID: <2e1434c10707161042u549a138bodedfa64247c681f4@mail.gmail.com> On 7/16/07, Charles R Harris wrote: > > Hmm, > > I get a real result for this, although the result is wildly incorrect. > Sqrtm isn't part of numpy, where are you getting it from? Mine is coming > from pylab and looks remarkably buggy. > from scipy.linalg import sqrtm I'm posting on the NumPy list since I already subscribe here, although my questions are more topical for the SciPy folks. I'm hoping the overlap is considerable and I'll be forgiven (since adding yet another mailing list is sure to capsize my poor mailbox). Thanks, -Kevin -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Mon Jul 16 15:21:38 2007 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 16 Jul 2007 14:21:38 -0500 Subject: [Numpy-discussion] NumPy/SciPy LAPACK version In-Reply-To: <2e1434c10707161040p25a0d176kf12b23b7d1825853@mail.gmail.com> References: <2e1434c10707160652j2d0b3106o28f836096653e064@mail.gmail.com> <2e1434c10707161040p25a0d176kf12b23b7d1825853@mail.gmail.com> Message-ID: <469BC542.2010005@gmail.com> Kevin Jacobs wrote: > On 7/16/07, *Kevin Jacobs >* > wrote: > > This is a bit of a SciPy question, but I thought I'd ask here since > I'm already subscribed. I'd like to add some new LAPACK bindings to > SciPy and was wondering if there was a minimum version requirement > for LAPACK, since it would be ideal if I could use some of the newer > 3.0 features. In addition to using some block methods only added in > 3.0, it is very convenient to use the WORK=-1 for space queries > instead of reimplementing the underlying logic in the calc_work module. > > The routines of most interest to me are: > DGELSD > DGGGLM > DGGLSE > > STEGR > > Thanks for all of the feedback on sqrtm. Can anyone comment on the > suitability of adding LAPACK 3.0 functions to SciPy.LinAlg? I need to > do the work regardless, but being able to contribute it back would be > very nice. And we'd certainly appreciate the contribution. I'm tentatively going to say yes, we should start requiring LAPACK 3.0 unless if there is some very important platform that only comes with an older LAPACK. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From bioinformed at gmail.com Mon Jul 16 15:34:29 2007 From: bioinformed at gmail.com (Kevin Jacobs ) Date: Mon, 16 Jul 2007 15:34:29 -0400 Subject: [Numpy-discussion] NumPy/SciPy LAPACK version In-Reply-To: <469BC542.2010005@gmail.com> References: <2e1434c10707160652j2d0b3106o28f836096653e064@mail.gmail.com> <2e1434c10707161040p25a0d176kf12b23b7d1825853@mail.gmail.com> <469BC542.2010005@gmail.com> Message-ID: <2e1434c10707161234s3e199024gdd65a8f85675d8d7@mail.gmail.com> On 7/16/07, Robert Kern wrote: > > And we'd certainly appreciate the contribution. 
I'm tentatively going to > say > yes, we should start requiring LAPACK 3.0 unless if there is some very > important > platform that only comes with an older LAPACK. > Great! The added benefit is that the calc_work module effectively goes away, at least in its current form. I've checked that the current versions of the major platform-optimized math libraries all use 3.0 or greater, including Sun's math performance library, Intel MKL, and AMD's ACML. If one is willing to forgo vendor-supplied platform optimizations, then ATLAS+LAPACK or just Netlib LAPACK should suffice for many users. -Kevin -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Tue Jul 17 07:21:34 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 17 Jul 2007 20:21:34 +0900 Subject: [Numpy-discussion] What is the different between nanmin and min ? Message-ID: <469CA63E.7000000@ar.media.kyoto-u.ac.jp> Hi, I noticed that min and max already ignore Nan, which raises the question: why are there nanmin and nanmax functions ? cheers, David From matthieu.brucher at gmail.com Tue Jul 17 07:35:09 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Tue, 17 Jul 2007 13:35:09 +0200 Subject: [Numpy-discussion] What is the different between nanmin and min ? In-Reply-To: <469CA63E.7000000@ar.media.kyoto-u.ac.jp> References: <469CA63E.7000000@ar.media.kyoto-u.ac.jp> Message-ID: Hi, I encountered cases where numpy.min() returned nan when the first and the last values were nan. Didn't know of nanmin(), but I'll use them now ! Matthieu 2007/7/17, David Cournapeau : > > Hi, > > I noticed that min and max already ignore Nan, which raises the > question: why are there nanmin and nanmax functions ? > > cheers, > > David > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Tue Jul 17 07:38:08 2007 From: stefan at sun.ac.za (Stefan van der Walt) Date: Tue, 17 Jul 2007 13:38:08 +0200 Subject: [Numpy-discussion] Finalising documentation guidelines for NumPy Message-ID: <20070717113808.GE7290@mentat.za.net> Hi all, In May this year, Charles Harris posted to this mailing list http://thread.gmane.org/gmane.comp.python.numeric.general/15381/focus=15407 discussing some shortcomings of the current NumPy (and hence SciPy) documentation standard http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/doc/HOWTO_DOCUMENT.txt Since we are in the process of slowly migrating all documentation to the new standard, it is worth revisiting the issue one last time, before we put in any more effort. We need a format which - parses without errors in epydoc - generates easily readable output and - places sections in a pre-determined order We also need to design a default style sheet, to aid the production of uniform documentation. At least the following changes are needed to the current standard: 1) In the parameters section, var1 : type Description. is parsed correctly but var1 : Description. breaks. This can be fixed either by omitting the colon after 'var1' in the second case, or by slightly modifying epydoc's output. 2) In the SeeAlso section, variables should be surrounded by `` to link to their relevant docstrings, i.e. :SeeAlso: - otherfunc : relationship to thisfunc. changes to :SeeAlso: - `otherfunc` : relationship to thisfunc. 
According to a post in the thread mentioned above, epydoc also permutes the sections in such a way that Notes and Examples appear in the wrong places. As far as I can tell, this is an epydoc issue, which we should take up with Ed Loper. If you have any information that could help us finalise the specification, or would like to share your opinion, I'd be glad to hear it. Regards St?fan From david at ar.media.kyoto-u.ac.jp Tue Jul 17 07:31:04 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 17 Jul 2007 20:31:04 +0900 Subject: [Numpy-discussion] What is the different between nanmin and min ? In-Reply-To: References: <469CA63E.7000000@ar.media.kyoto-u.ac.jp> Message-ID: <469CA878.20300@ar.media.kyoto-u.ac.jp> Matthieu Brucher wrote: > Hi, > > I encountered cases where numpy.min() returned nan when the first and > the last values were nan. Didn't know of nanmin(), but I'll use them now ! Mmh, interesting. Indeed, a quick test shows that as long as the last value of a rank 1 array is not Nan, min ignore nan. import numpy a = 0.1 * numpy.arange(10) numpy.min(a) a[:9] = numpy.nan numpy.min(a) # ignore Nan a = 0.1 * numpy.arange(10) a[-1] = numpy.nan numpy.min(a) # Does not ignore Nan cheers, David From kwgoodman at gmail.com Tue Jul 17 07:46:20 2007 From: kwgoodman at gmail.com (Keith Goodman) Date: Tue, 17 Jul 2007 13:46:20 +0200 Subject: [Numpy-discussion] What is the different between nanmin and min ? In-Reply-To: <469CA63E.7000000@ar.media.kyoto-u.ac.jp> References: <469CA63E.7000000@ar.media.kyoto-u.ac.jp> Message-ID: On 7/17/07, David Cournapeau wrote: > I noticed that min and max already ignore Nan, which raises the > question: why are there nanmin and nanmax functions ? Using min and max when you have NaNs is dangerous. Here's an example: >> x = M.matrix([[ 1.0, 2.0, M.nan]]) >> x.min() 1.0 >> x = M.matrix([[ M.nan, 2.0, 1.0]]) >> x.min() nan I wish that min and max ignored NaNs. For me taking the time to check for NaNs (slowing down min and max) is worth it. But it seems like most people disagree. So I use nanmin and nanmax instead. From tim.hochberg at ieee.org Tue Jul 17 12:01:27 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Tue, 17 Jul 2007 09:01:27 -0700 Subject: [Numpy-discussion] What is the different between nanmin and min ? In-Reply-To: References: <469CA63E.7000000@ar.media.kyoto-u.ac.jp> Message-ID: On 7/17/07, Keith Goodman wrote: > > On 7/17/07, David Cournapeau wrote: > > I noticed that min and max already ignore Nan, which raises the > > question: why are there nanmin and nanmax functions ? > > Using min and max when you have NaNs is dangerous. Here's an example: > > >> x = M.matrix([[ 1.0, 2.0, M.nan]]) > >> x.min() > 1.0 > > >> x = M.matrix([[ M.nan, 2.0, 1.0]]) > >> x.min() > nan > > I wish that min and max ignored NaNs. For me taking the time to check > for NaNs (slowing down min and max) is worth it. But it seems like > most people disagree. So I use nanmin and nanmax instead. The time is one issue. Another is that ignoring NaNs is only correct if you are treating NaNs as missing values. If instead you are treating them as non numbers, the results of some bogus computation, then raising an error is a more appropriate response. If one was going to take the time to check for NaNs, one strategy that I would probably support would be to ignore the NaNs, but set the invalid flag. If the error state for invalid was set to ignore, then this would work as the missing value camp likes, otherwise it would raise an error or signal a warning. 
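A rough Python-level sketch of that strategy, with a warning standing in for the floating-point invalid flag (the helper name is made up; this is not a proposed change to min() itself):

import warnings
import numpy as N

def min_ignoring_nans(a):
    # Reduce over the non-NaN values, but signal that invalid values were
    # seen so the caller can warn or raise as desired.
    # Assumes at least one non-NaN element is present.
    a = N.asarray(a)
    mask = N.isnan(a)
    if mask.any():
        warnings.warn("NaNs ignored in minimum", RuntimeWarning)
        return a[~mask].min()
    return a.min()

With numpy's own error-state machinery the same idea would instead be controlled through seterr(invalid=...), as described above.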
-- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From globophobe at gmail.com Tue Jul 17 12:14:15 2007 From: globophobe at gmail.com (Luis N) Date: Wed, 18 Jul 2007 01:14:15 +0900 Subject: [Numpy-discussion] Undefined symbol "fpsetsticky" Message-ID: I just built numpy from svn checkout: Python 2.4.3 (#2, Nov 8 2006, 23:56:15) [GCC 3.4.6 [FreeBSD] 20060305] on freebsd6 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy Traceback (most recent call last): File "", line 1, in ? File "/usr/local/lib/python2.4/site-packages/numpy/__init__.py", line 39, in ? import core File "/usr/local/lib/python2.4/site-packages/numpy/core/__init__.py", line 6, in ? import umath ImportError: /usr/local/lib/python2.4/site-packages/numpy/core/umath.so: Undefined symbol "fpsetsticky" What might this suggest? From aisaac at american.edu Tue Jul 17 13:58:24 2007 From: aisaac at american.edu (Alan G Isaac) Date: Tue, 17 Jul 2007 13:58:24 -0400 Subject: [Numpy-discussion] Finalising documentation guidelines for NumPy In-Reply-To: <20070717113808.GE7290@mentat.za.net> References: <20070717113808.GE7290@mentat.za.net> Message-ID: On Tue, 17 Jul 2007, Stefan van der Walt apparently wrote: > var1 : > Description. > breaks. This can be fixed either by omitting the colon after > 'var1' in the second case, or by slightly modifying epydoc's output. It breaks semantically too, no? (The colon is a separator, separating the variable name from its type.) By the way, Are you referring to a particular document? I see the following document http://projects.scipy.org/scipy/numpy/wiki/DocstringStandards which points to the detailed document. http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/doc/HOWTO_DOCUMENT.txt The detailed document seems the place to add your changes. (Note that the renderred version is unavailable: http://projects.scipy.org/scipy/numpy/wiki/HowToDocument) Cheers, Alan Isaac From stefan at sun.ac.za Tue Jul 17 18:07:38 2007 From: stefan at sun.ac.za (Stefan van der Walt) Date: Wed, 18 Jul 2007 00:07:38 +0200 Subject: [Numpy-discussion] Finalising documentation guidelines for NumPy In-Reply-To: References: <20070717113808.GE7290@mentat.za.net> Message-ID: <20070717220737.GI7290@mentat.za.net> On Tue, Jul 17, 2007 at 01:58:24PM -0400, Alan G Isaac wrote: > On Tue, 17 Jul 2007, Stefan van der Walt apparently wrote: > > var1 : > > Description. > > breaks. This can be fixed either by omitting the colon after > > 'var1' in the second case, or by slightly modifying epydoc's output. > > It breaks semantically too, no? > (The colon is a separator, separating the variable name from its type.) > > By the way, > Are you referring to a particular document? I see the I'm referring to http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/doc/HOWTO_DOCUMENT.txt Cheers St?fan From v-nijs at kellogg.northwestern.edu Wed Jul 18 02:49:33 2007 From: v-nijs at kellogg.northwestern.edu (Vincent Nijs) Date: Wed, 18 Jul 2007 01:49:33 -0500 Subject: [Numpy-discussion] convert csv file into recarray without pre-specifying dtypes and variable names In-Reply-To: Message-ID: I combined some of the very useful comments/code from Tim and Torgil and came-up with the attached program to read csv files and convert the data into a recarray. I couldn?t use all of their suggestions because, frankly, I didn?t understand all of them :) The program use variable names if provided in the csv-file and can auto-detect data types. 
However, I also wanted to make it easy to specify data types and/or variables names if so desired. Examples are at the bottom of the file. Comments are very welcome. Thanks, Vincent -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: load_combi.py Type: application/octet-stream Size: 5559 bytes Desc: not available URL: From kwgoodman at gmail.com Wed Jul 18 07:59:21 2007 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 18 Jul 2007 13:59:21 +0200 Subject: [Numpy-discussion] What is the different between nanmin and min ? In-Reply-To: References: <469CA63E.7000000@ar.media.kyoto-u.ac.jp> Message-ID: On 7/17/07, Timothy Hochberg wrote: > The time is one issue. Another is that ignoring NaNs is only correct if you > are treating NaNs as missing values. If instead you are treating them as non > numbers, the results of some bogus computation, then raising an error is a > more appropriate response. If one was going to take the time to check for > NaNs, one strategy that I would probably support would be to ignore the > NaNs, but set the invalid flag. If the error state for invalid was set to > ignore, then this would work as the missing value camp likes, otherwise it > would raise an error or signal a warning. That sounds great. Would a change like that have to wait until 1.1? From park.notread at gmail.com Wed Jul 18 13:40:53 2007 From: park.notread at gmail.com (Park Hays) Date: Wed, 18 Jul 2007 11:40:53 -0600 Subject: [Numpy-discussion] ld.so.1 linker errors building numpy Message-ID: I have been fighting for a couple weeks to get numpy installed, on the way to a full scipy+matplotlib system. At this point, the transcript looks something like: > python Python 2.5 (r25:51908, Sep 20 2006, 06:18:53) [GCC 3.4.6] on sunos5 >> from numpy import * -- stack trace, cut out except for last portion -- File "[snip]/linalg.py", line 25, in from numpy.linalg import lapack_lite Import Error: ld.so.1: python: fatal: relocation error: file [snip]/numpy/linalg/lapack_lite.so: symbol s_wsfe: referenced symbol not found. The python came from sunfreeware.com, if my sysadmin remembers right. I am attempting to build against an ATLAS 3.6.0 which I built with gcc 3.4.6, where I've added the -fPIC flag (since various previous error messages made me think numpy couldn't link to .a libraries, and I was going through tons of steps to get .so libraries from the ATLAS and LAPACK builds...) LAPACK 3.1.1, also built with 3.4.6 NumPy is from SVN (3881 from July 5, 2007), I think 1.0.4 is the version number, though it is not released--I think. This fixed some major problems with the distutils in 1.0.3. My uname -a output is SunOS 5.8 Generic_108528_29 sun4u sparc SUNW, Sun-Blade-1000 In some cases I had to take the failed build (from the numpy setup) and replace, at the linker stage, gcc -shared with CC -G (which invokes Sun's development tools linker). At that point the .so would be generated quietly and a restart of numpy's setup.py would continue fine. The machine on which I am doing all this is off-network, and it is extremely difficult to add new software to it. I've seen one post somewhere suggesting that using gnu ld might help (from binutils). I am in the process of adding it, but I'd prefer to find a way to build this without. Any suggestions would be appreciated! Any explanations would be gravy. -Park Hays -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From robert.kern at gmail.com Wed Jul 18 14:30:01 2007 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 18 Jul 2007 13:30:01 -0500 Subject: [Numpy-discussion] What is the different between nanmin and min ? In-Reply-To: References: <469CA63E.7000000@ar.media.kyoto-u.ac.jp> Message-ID: <469E5C29.3040606@gmail.com> Timothy Hochberg wrote: > The time is one issue. Another is that ignoring NaNs is only correct if > you are treating NaNs as missing values. If instead you are treating > them as non numbers, the results of some bogus computation, then raising > an error is a more appropriate response. If one was going to take the > time to check for NaNs, one strategy that I would probably support would > be to ignore the NaNs, but set the invalid flag. If the error state for > invalid was set to ignore, then this would work as the missing value > camp likes, otherwise it would raise an error or signal a warning. I'd almost be willing to make max() and min() always ignore quiet NaNs. The C99 standard requires this, for example, (c.f. section F.9.9.2 of the C99 standard). -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From gnata at obs.univ-lyon1.fr Wed Jul 18 14:33:31 2007 From: gnata at obs.univ-lyon1.fr (Xavier Gnata) Date: Wed, 18 Jul 2007 20:33:31 +0200 Subject: [Numpy-discussion] -lmkl_lapack64 on i368 ?? Message-ID: <469E5CFB.6010003@obs.univ-lyon1.fr> Hi, I'm trying to update numpy by compiling the up to date svn: I get this error : gcc: numpy/linalg/lapack_litemodule.c gcc -pthread -shared build/temp.linux-i686-2.4/numpy/linalg/lapack_litemodule.o -lmkl_lapack32 -lmkl_lapack64 -lmkl -lvml -lguide -lpthread -o build/lib.linux-i686-2.4/numpy/linalg/lapack_lite.so /usr/bin/ld: cannot find -lmkl_lapack64 collect2: ld returned 1 exit status /usr/bin/ld: cannot find -lmkl_lapack64 collect2: ld returned 1 exit status error: Command "gcc -pthread -shared build/temp.linux-i686-2.4/numpy/linalg/lapack_litemodule.o -lmkl_lapack32 -lmkl_lapack64 -lmkl -lvml -lguide -lpthread -o build/lib.linux-i686-2.4/numpy/linalg/lapack_lite.so" failed with exit status 1 There must be something wrong in the distutils/makefile because I'm on a debian sid *i386* so why should I link against mkl_lapack64 ?? Of course, I do not have lapack64 installed on this i386 machine. -- ############################################ Xavier Gnata CRAL - Observatoire de Lyon 9, avenue Charles Andr? 69561 Saint Genis Laval cedex Phone: +33 4 78 86 85 28 Fax: +33 4 78 86 83 86 E-mail: gnata at obs.univ-lyon1.fr ############################################ From gzhu at peak6.com Wed Jul 18 15:06:40 2007 From: gzhu at peak6.com (Geoffrey Zhu) Date: Wed, 18 Jul 2007 14:06:40 -0500 Subject: [Numpy-discussion] Logical Selector References: <469E5CFB.6010003@obs.univ-lyon1.fr> Message-ID: <99F81FFD0EA54E4DA8D4F1BFE272F34105400FBE@ppi-mail1.chicago.peak6.net> Hi Everyone, I am finding that numpy cannot operate on boolean arrays. For example, the following does not work: x=3Darray([(1,2),(2,1),(3,1),(4,1)]) x[x[:,0]>x[:,1] and x[1:]>1,:] It gives me an syntax error: ------------------- Traceback (most recent call last): File "", line 1, in x[x[:,0]>x[:,1] and x[1:]>1,:] ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all() ------------------- However, this is not what I want. 
I want a "piece-wise" "AND" operation on the two boolean vectors. In other words, the row is selected if both are true. How do I accomplish this? Many thanks, Geoffrey P.S The disclaimer is automatically generated by the mail server. _______________________________________________________=0A= =0A= The information in this email or in any file attached=0A= hereto is intended only for the personal and confiden-=0A= tial use of the individual or entity to which it is=0A= addressed and may contain information that is propri-=0A= etary and confidential. If you are not the intended=0A= recipient of this message you are hereby notified that=0A= any review, dissemination, distribution or copying of=0A= this message is strictly prohibited. This communica-=0A= tion is for information purposes only and should not=0A= be regarded as an offer to sell or as a solicitation=0A= of an offer to buy any financial product. Email trans-=0A= mission cannot be guaranteed to be secure or error-=0A= free. P6070214 From robert.kern at gmail.com Wed Jul 18 15:15:32 2007 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 18 Jul 2007 14:15:32 -0500 Subject: [Numpy-discussion] Logical Selector In-Reply-To: <99F81FFD0EA54E4DA8D4F1BFE272F34105400FBE@ppi-mail1.chicago.peak6.net> References: <469E5CFB.6010003@obs.univ-lyon1.fr> <99F81FFD0EA54E4DA8D4F1BFE272F34105400FBE@ppi-mail1.chicago.peak6.net> Message-ID: <469E66D4.2030501@gmail.com> Geoffrey Zhu wrote: > Hi Everyone, > > I am finding that numpy cannot operate on boolean arrays. For example, > the following does not work: > > x=3Darray([(1,2),(2,1),(3,1),(4,1)]) > > x[x[:,0]>x[:,1] and x[1:]>1,:] > > It gives me an syntax error: > > ------------------- > Traceback (most recent call last): > File "", line 1, in > x[x[:,0]>x[:,1] and x[1:]>1,:] > ValueError: The truth value of an array with more than one element is > ambiguous. Use a.any() or a.all() > ------------------- > > However, this is not what I want. I want a "piece-wise" "AND" operation > on the two boolean vectors. In other words, the row is selected if both > are true. How do I accomplish this? The "and" keyword tries to coerce each of its operands into a Boolean True or False value. This behavior cannot be overridden in the current Python language to yield arrays of Boolean values. However, for Boolean arrays, the bitwise logical operations &, |, and ~ work just fine for this purpose instead of "and", "or", and "not". -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From efiring at hawaii.edu Wed Jul 18 16:29:34 2007 From: efiring at hawaii.edu (Eric Firing) Date: Wed, 18 Jul 2007 10:29:34 -1000 Subject: [Numpy-discussion] Logical Selector In-Reply-To: <469E66D4.2030501@gmail.com> References: <469E5CFB.6010003@obs.univ-lyon1.fr> <99F81FFD0EA54E4DA8D4F1BFE272F34105400FBE@ppi-mail1.chicago.peak6.net> <469E66D4.2030501@gmail.com> Message-ID: <469E782E.1020801@hawaii.edu> Robert Kern wrote: > Geoffrey Zhu wrote: >> Hi Everyone, >> >> I am finding that numpy cannot operate on boolean arrays. For example, >> the following does not work: >> >> x=3Darray([(1,2),(2,1),(3,1),(4,1)]) >> >> x[x[:,0]>x[:,1] and x[1:]>1,:] >> >> It gives me an syntax error: >> >> ------------------- >> Traceback (most recent call last): >> File "", line 1, in >> x[x[:,0]>x[:,1] and x[1:]>1,:] >> ValueError: The truth value of an array with more than one element is >> ambiguous. 
Use a.any() or a.all() >> ------------------- >> >> However, this is not what I want. I want a "piece-wise" "AND" operation >> on the two boolean vectors. In other words, the row is selected if both >> are true. How do I accomplish this? > > The "and" keyword tries to coerce each of its operands into a Boolean True or > False value. This behavior cannot be overridden in the current Python language > to yield arrays of Boolean values. However, for Boolean arrays, the bitwise > logical operations &, |, and ~ work just fine for this purpose instead of "and", > "or", and "not". > except that their precedence is higher than that of the logical operators, so one must remember to use parentheses: (a<b) & (c>d) which is good for clarity anyway. Eric From gnata at obs.univ-lyon1.fr Wed Jul 18 17:43:51 2007 From: gnata at obs.univ-lyon1.fr (Xavier Gnata) Date: Wed, 18 Jul 2007 23:43:51 +0200 Subject: [Numpy-discussion] Logical Selector In-Reply-To: <469E782E.1020801@hawaii.edu> References: <469E5CFB.6010003@obs.univ-lyon1.fr> <99F81FFD0EA54E4DA8D4F1BFE272F34105400FBE@ppi-mail1.chicago.peak6.net> <469E66D4.2030501@gmail.com> <469E782E.1020801@hawaii.edu> Message-ID: <469E8997.4010609@obs.univ-lyon1.fr> Eric Firing wrote: > Robert Kern wrote: > >> Geoffrey Zhu wrote: >> >>> Hi Everyone, >>> >>> I am finding that numpy cannot operate on boolean arrays. For example, >>> the following does not work: >>> >>> x=3Darray([(1,2),(2,1),(3,1),(4,1)]) >>> >>> x[x[:,0]>x[:,1] and x[1:]>1,:] >>> >>> It gives me an syntax error: >>> >>> ------------------- >>> Traceback (most recent call last): >>> File "", line 1, in >>> x[x[:,0]>x[:,1] and x[1:]>1,:] >>> ValueError: The truth value of an array with more than one element is >>> ambiguous. Use a.any() or a.all() >>> ------------------- >>> >>> However, this is not what I want. I want a "piece-wise" "AND" operation >>> on the two boolean vectors. In other words, the row is selected if both >>> are true. How do I accomplish this? >>> >> The "and" keyword tries to coerce each of its operands into a Boolean True or >> False value. This behavior cannot be overridden in the current Python language >> to yield arrays of Boolean values. However, for Boolean arrays, the bitwise >> logical operations &, |, and ~ work just fine for this purpose instead of "and", >> "or", and "not". >> >> > > except that their precedence is higher than that of the logical > operators, so one must remember to use parentheses: > (a<b) & (c>d) > which is good for clarity anyway. > > Eric > Hi, Well maybe it is a bug on my box (thunderbird) but the topic of the thread is "-lmkl_lapack64 on i368 ??". Nothing to do with "Logical Selector" ;) Should I post another mail about this topic? Xavier ps : I'm just sorry for the noise if it is a bug on my side. -- ############################################ Xavier Gnata CRAL - Observatoire de Lyon 9, avenue Charles Andr?
69561 Saint Genis Laval cedex Phone: +33 4 78 86 85 28 Fax: +33 4 78 86 83 86 E-mail: gnata at obs.univ-lyon1.fr ############################################ From robert.kern at gmail.com Wed Jul 18 17:48:05 2007 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 18 Jul 2007 16:48:05 -0500 Subject: [Numpy-discussion] Logical Selector In-Reply-To: <469E8997.4010609@obs.univ-lyon1.fr> References: <469E5CFB.6010003@obs.univ-lyon1.fr> <99F81FFD0EA54E4DA8D4F1BFE272F34105400FBE@ppi-mail1.chicago.peak6.net> <469E66D4.2030501@gmail.com> <469E782E.1020801@hawaii.edu> <469E8997.4010609@obs.univ-lyon1.fr> Message-ID: <469E8A95.3050002@gmail.com> Xavier Gnata wrote: > Well maybe it is a bug on my box (thunderbird) but the topic of the > thread is "-lmkl_lapack64 on i368 ??". > Nothing to do with "Logical Selector" ;) > Should I post another mail about this topic? > > Xavier > ps : I'm just sorry for the noise if it is a bug on my side. No, I think that Geoffrey Zhu accidentally hit "Reply" to your message instead of creating a new thread as he should have. We don't mean to hijack your thread, but mistakes happen. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From torgil.svensson at gmail.com Wed Jul 18 20:57:44 2007 From: torgil.svensson at gmail.com (Torgil Svensson) Date: Thu, 19 Jul 2007 02:57:44 +0200 Subject: [Numpy-discussion] convert csv file into recarray without pre-specifying dtypes and variable names In-Reply-To: References: Message-ID: Nice, I haven't gone through all details. That's a nice new "missing" feature, maybe all instances where we can't find a conversion should be "nan". A few comments: 1. The "load_search" functions contains all memory/performance overhead that we wanted to avoid with the fromiter function. Does this mean that you no longer have large text-files that change sting representation in the columns (aka "0" floats) ? 2. ident=" "*4 This has the same spelling error as in my first compile try .. it was meant to be "indent" 3. types = list((i,j) for i, j in zip(varnm, types2)) Isn't this the same as "types = zip(varnm, types2)" ? 4. return N.fromiter(iter(reader),dtype = types) Isn't "reader" an iterator already? What does the "iter()" operator do in this case? Best regards, //Torgil On 7/18/07, Vincent Nijs wrote: > > I combined some of the very useful comments/code from Tim and Torgil and > came-up with the attached program to read csv files and convert the data > into a recarray. I couldn't use all of their suggestions because, frankly, I > didn't understand all of them :) > > The program use variable names if provided in the csv-file and can > auto-detect data types. However, I also wanted to make it easy to specify > data types and/or variables names if so desired. Examples are at the bottom > of the file. Comments are very welcome. > > Thanks, > > Vincent > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > From v-nijs at kellogg.northwestern.edu Wed Jul 18 21:47:39 2007 From: v-nijs at kellogg.northwestern.edu (Vincent Nijs) Date: Wed, 18 Jul 2007 20:47:39 -0500 Subject: [Numpy-discussion] convert csv file into recarray without pre-specifying dtypes and variable names In-Reply-To: Message-ID: Hi Torgil, 1. 
I got an email from Tim about this issue: "I finally got around to doing some more quantitative comparisons between your code and the more complicated version that I proposed. The idea behind my code was to minimize memory usage -- I figured that keeping the memory usage low would make up for any inefficiencies in the conversion process since it's been my experience that memory bandwidth dominates a lot of numeric problems as problem sized get reasonably large. I was mostly wrong. While it's true that for very large file sizes I can get my code to outperform yours, in most instances it lags behind. And the range where it does better is a fairly small range right before the machine dies with a memory error. So my conclusion is that the extra hoops my code goes through to avoid allocating extra memory isn't worth it for you to bother with.? The approach in my code is simple and robust to most data issues I could come-up with. It actually will do an appropriate conversion if there are missing values or int?s and float in the same column. It will select an appropriate string length as well. It may not be the most memory efficient setup but given Tim?s comments it is a pretty decent solution for the types of data I have access to. 2. Fixed the spelling error :) 3. I guess that is the same thing. I am not very familiar with zip, izip, map etc. just yet :) Thanks for the tip! 4. I called the function generated using exec, iter(). I need that function to transform the data using the types provided by the user. Best, Vincent On 7/18/07 7:57 PM, "Torgil Svensson" wrote: > Nice, > > I haven't gone through all details. That's a nice new "missing" > feature, maybe all instances where we can't find a conversion should > be "nan". A few comments: > > 1. The "load_search" functions contains all memory/performance > overhead that we wanted to avoid with the fromiter function. Does this > mean that you no longer have large text-files that change sting > representation in the columns (aka "0" floats) ? > > 2. ident=" "*4 > This has the same spelling error as in my first compile try .. it was > meant to be "indent" > > 3. types = list((i,j) for i, j in zip(varnm, types2)) > Isn't this the same as "types = zip(varnm, types2)" ? > > 4. return N.fromiter(iter(reader),dtype = types) > Isn't "reader" an iterator already? What does the "iter()" operator do > in this case? > > Best regards, > > //Torgil > > > On 7/18/07, Vincent Nijs wrote: >> >> I combined some of the very useful comments/code from Tim and Torgil and >> came-up with the attached program to read csv files and convert the data >> into a recarray. I couldn't use all of their suggestions because, frankly, I >> didn't understand all of them :) >> >> The program use variable names if provided in the csv-file and can >> auto-detect data types. However, I also wanted to make it easy to specify >> data types and/or variables names if so desired. Examples are at the bottom >> of the file. Comments are very welcome. >> >> Thanks, >> >> Vincent >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://projects.scipy.org/mailman/listinfo/numpy-discussion >> >> >> > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- Vincent R. 
Nijs Assistant Professor of Marketing Kellogg School of Management, Northwestern University 2001 Sheridan Road, Evanston, IL 60208-2001 Phone: +1-847-491-4574 Fax: +1-847-491-2498 E-mail: v-nijs at kellogg.northwestern.edu Skype: vincentnijs -------------- next part -------------- An HTML attachment was scrubbed... URL: From v-nijs at kellogg.northwestern.edu Wed Jul 18 21:31:07 2007 From: v-nijs at kellogg.northwestern.edu (Vincent Nijs) Date: Wed, 18 Jul 2007 20:31:07 -0500 Subject: [Numpy-discussion] Recarray to and from sqlite In-Reply-To: Message-ID: Hi, I am trying to write a couple of simple functions to (1) save recarray's to an sqlite database and (2) load a recarray from an sqllite database. I am stuck on 2 points and hope there are some people on this list that use sqlite for numpy stuff. 1. How to detect the variable names and types from the sqlite database? I am using: conn = sqlite3.connect(fname,detect_types=sqlite3.PARSE_DECLTYPES|sqlite3.PARSE_COL NAMES) but then how do you access the variable names and types and convert them to numpy types? 2. In saving the recarray to sqlite I need to get data types from data.dtype.descr and transform the names to types that sqlite knows: string --> text int --> integer float --> real I tried some things like: for i in data[0]: if type(i) == str This didn't work because the elements are numpy.strings and I couldn't get the comparison to work. I'd rather use the dtype descriptions directly but couldn't figure out how to do that either. Any suggestions are very welcome. Thanks!! Vincent -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: load_sqlite.py Type: application/octet-stream Size: 1739 bytes Desc: not available URL: From tim.hochberg at ieee.org Wed Jul 18 22:17:08 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Wed, 18 Jul 2007 19:17:08 -0700 Subject: [Numpy-discussion] What is the different between nanmin and min ? In-Reply-To: <469E5C29.3040606@gmail.com> References: <469CA63E.7000000@ar.media.kyoto-u.ac.jp> <469E5C29.3040606@gmail.com> Message-ID: On 7/18/07, Robert Kern wrote: > > Timothy Hochberg wrote: > > > The time is one issue. Another is that ignoring NaNs is only correct if > > you are treating NaNs as missing values. If instead you are treating > > them as non numbers, the results of some bogus computation, then raising > > an error is a more appropriate response. If one was going to take the > > time to check for NaNs, one strategy that I would probably support would > > be to ignore the NaNs, but set the invalid flag. If the error state for > > invalid was set to ignore, then this would work as the missing value > > camp likes, otherwise it would raise an error or signal a warning. > > I'd almost be willing to make max() and min() always ignore quiet NaNs. > The C99 > standard requires this, for example, (c.f. section F.9.9.2 of the C99 > standard). Wow! It sure does. That surprises me as it seems antithetical to my understanding of the concept of NaNs being non comparable. Perhaps my understanding is just flawed, it wouldn't be the first time. Personally, I'd still rather see the warning get set when NaNs are around. That's colored by my usage patterns where if a NaN is present, it's a problem and I'd like to know about sooner rather than later. -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From robert.kern at gmail.com Thu Jul 19 01:39:17 2007 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 19 Jul 2007 00:39:17 -0500 Subject: [Numpy-discussion] What is the different between nanmin and min ? In-Reply-To: References: <469CA63E.7000000@ar.media.kyoto-u.ac.jp> <469E5C29.3040606@gmail.com> Message-ID: <469EF905.6030109@gmail.com> Timothy Hochberg wrote: > > On 7/18/07, *Robert Kern* > wrote: > > Timothy Hochberg wrote: > > > The time is one issue. Another is that ignoring NaNs is only > correct if > > you are treating NaNs as missing values. If instead you are treating > > them as non numbers, the results of some bogus computation, then > raising > > an error is a more appropriate response. If one was going to take the > > time to check for NaNs, one strategy that I would probably support > would > > be to ignore the NaNs, but set the invalid flag. If the error > state for > > invalid was set to ignore, then this would work as the missing value > > camp likes, otherwise it would raise an error or signal a warning. > > I'd almost be willing to make max() and min() always ignore quiet > NaNs. The C99 > standard requires this, for example, (c.f. section F.9.9.2 of the > C99 standard). > > Wow! It sure does. That surprises me as it seems antithetical to my > understanding of the concept of NaNs being non comparable. Perhaps my > understanding is just flawed, it wouldn't be the first time. Personally, > I'd still rather see the warning get set when NaNs are around. That's > colored by my usage patterns where if a NaN is present, it's a problem > and I'd like to know about sooner rather than later. Well, the IEEE-754 standard is silent, to my reading, on "min" and "max". It's also silent on what wily programmers can do with isnan(), and that's essentially what the C99 standard is specifying: using isnan() to special-case the result of fmin() and fmax(). The operations that IEEE-754 specifies (> and <) are just not being used in the presence of NaNs. Of course, if C99 is free to do whatever the hell they want, so are we. :-) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From david at ar.media.kyoto-u.ac.jp Thu Jul 19 01:44:53 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 19 Jul 2007 14:44:53 +0900 Subject: [Numpy-discussion] ld.so.1 linker errors building numpy In-Reply-To: References: Message-ID: <469EFA55.8020900@ar.media.kyoto-u.ac.jp> Park Hays wrote: > I have been fighting for a couple weeks to get numpy installed, on the > way to a full scipy+matplotlib system. I tried installing numpy on a solaris machine on SPARC too, with the added difficulty to have only a local account on the machine (without a compiler: had to build my own :) )... > At this point, the transcript looks something like: > > > python > Python 2.5 (r25:51908, Sep 20 2006, 06:18:53) > [GCC 3.4.6] on sunos5 > >> from numpy import * > -- stack trace, cut out except for last portion -- > File "[snip]/linalg.py", line 25, in > from numpy.linalg import lapack_lite > Import Error: ld.so.1: python: fatal: relocation error: file > [snip]/numpy/linalg/lapack_lite.so: symbol s_wsfe: referenced symbol > not found. I would recommend to do it step by step: - first, do not compile any blas/lapack. 
They are difficult to build correctly because of various issues I won't go into now; ATLAS only makes the matter worse :) - once you succeed building numpy without any blas/lapack, you can build your own blas/lapack. To build correctly with LAPACK 3.1.1, you only need to add -fPIC to OPT and NOOPT in the make.inc. For previous versions of LAPACK, this does NOT work, and the BLAS is broken in previous versions (you have to build BLAS separately). You can use the static archives (.a), but they HAVE to be built using -fPIC. I would recomment against using .so at first, as it adds a level of complexity. - once the above works, you can try ATLAS. I strongly recommend using the dev version of ATLAS (3.7.34 as today) because its configuration is able to handle shared library building. To build ATLAS usable by numpy/scipy, you should use the following: ./configure --with-netlib-lapack=LAPACKPATH -Fa alg -fPIC where LAPACKPATH is the full path of your static lapack library built before; you should also use the same compilers than everywhere else (this is not a must I guess, but less risk to avoid subtle issues when using different compilers). If you are willing to follow the above steps, it will be easier to debug things one after the other, I think. cheers, David From torgil.svensson at gmail.com Thu Jul 19 06:30:25 2007 From: torgil.svensson at gmail.com (Torgil Svensson) Date: Thu, 19 Jul 2007 12:30:25 +0200 Subject: [Numpy-discussion] convert csv file into recarray without pre-specifying dtypes and variable names In-Reply-To: References: Message-ID: Hi, 1. Your code is fast due to that you convert whole at once columns in numpy. The first step with the lists is also very fast (python implements lists as arrays). I like your version, I think it's as fast as it gets in pure python and has to keep only two versions of the data at once in memory (since the string versions can be garbage collected). If memory really is an issue, you have the nice "load_spec" version and can always convert the files once by iterating over the file twice like the attached script does. 4. Okay, that makes sense. I was confused by the fact that your generated function had the same name as the builtin iter() operator. //Torgil On 7/19/07, Vincent Nijs wrote: > > Hi Torgil, > > 1. I got an email from Tim about this issue: > > "I finally got around to doing some more quantitative comparisons between > your code and the more complicated version that I proposed. The idea behind > my code was to minimize memory usage -- I figured that keeping the memory > usage low would make up for any inefficiencies in the conversion process > since it's been my experience that memory bandwidth dominates a lot of > numeric problems as problem sized get reasonably large. I was mostly wrong. > While it's true that for very large file sizes I can get my code to > outperform yours, in most instances it lags behind. And the range where it > does better is a fairly small range right before the machine dies with a > memory error. So my conclusion is that the extra hoops my code goes through > to avoid allocating extra memory isn't worth it for you to bother with." > > The approach in my code is simple and robust to most data issues I could > come-up with. It actually will do an appropriate conversion if there are > missing values or int's and float in the same column. It will select an > appropriate string length as well. 
It may not be the most memory efficient > setup but given Tim's comments it is a pretty decent solution for the types > of data I have access to. > > 2. Fixed the spelling error :) > > 3. I guess that is the same thing. I am not very familiar with zip, izip, > map etc. just yet :) Thanks for the tip! > > 4. I called the function generated using exec, iter(). I need that function > to transform the data using the types provided by the user. > > Best, > > Vincent > > > On 7/18/07 7:57 PM, "Torgil Svensson" wrote: > > > Nice, > > > > I haven't gone through all details. That's a nice new "missing" > > feature, maybe all instances where we can't find a conversion should > > be "nan". A few comments: > > > > 1. The "load_search" functions contains all memory/performance > > overhead that we wanted to avoid with the fromiter function. Does this > > mean that you no longer have large text-files that change sting > > representation in the columns (aka "0" floats) ? > > > > 2. ident=" "*4 > > This has the same spelling error as in my first compile try .. it was > > meant to be "indent" > > > > 3. types = list((i,j) for i, j in zip(varnm, types2)) > > Isn't this the same as "types = zip(varnm, types2)" ? > > > > 4. return N.fromiter(iter(reader),dtype = types) > > Isn't "reader" an iterator already? What does the "iter()" operator do > > in this case? > > > > Best regards, > > > > //Torgil > > > > > > On 7/18/07, Vincent Nijs wrote: > >> > >> I combined some of the very useful comments/code from Tim and Torgil > and > >> came-up with the attached program to read csv files and convert the data > >> into a recarray. I couldn't use all of their suggestions because, > frankly, I > >> didn't understand all of them :) > >> > >> The program use variable names if provided in the csv-file and can > >> auto-detect data types. However, I also wanted to make it easy to > specify > >> data types and/or variables names if so desired. Examples are at the > bottom > >> of the file. Comments are very welcome. > >> > >> Thanks, > >> > >> Vincent > >> _______________________________________________ > >> Numpy-discussion mailing list > >> Numpy-discussion at scipy.org > >> > http://projects.scipy.org/mailman/listinfo/numpy-discussion > >> > >> > >> > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > -- > Vincent R. Nijs > Assistant Professor of Marketing > Kellogg School of Management, Northwestern University > 2001 Sheridan Road, Evanston, IL 60208-2001 > Phone: +1-847-491-4574 Fax: +1-847-491-2498 > E-mail: v-nijs at kellogg.northwestern.edu > Skype: vincentnijs > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- A non-text attachment was scrubbed... 
Name: fix_tricky_columns.py Type: text/x-python Size: 2586 bytes Desc: not available URL: From torgil.svensson at gmail.com Thu Jul 19 07:34:51 2007 From: torgil.svensson at gmail.com (Torgil Svensson) Date: Thu, 19 Jul 2007 13:34:51 +0200 Subject: [Numpy-discussion] convert csv file into recarray without pre-specifying dtypes and variable names In-Reply-To: References: Message-ID: Hi again, On 7/19/07, Torgil Svensson wrote: > If memory really is an issue, you have the nice "load_spec" version > and can always convert the files once by iterating over the file twice > like the attached script does. I discovered that my script was broken and too complex. The attached script is much cleaner and has better error messages. Best regards, //Torgil On 7/19/07, Torgil Svensson wrote: > Hi, > > 1. Your code is fast due to that you convert whole at once columns in > numpy. The first step with the lists is also very fast (python > implements lists as arrays). I like your version, I think it's as fast > as it gets in pure python and has to keep only two versions of the > data at once in memory (since the string versions can be garbage > collected). > > If memory really is an issue, you have the nice "load_spec" version > and can always convert the files once by iterating over the file twice > like the attached script does. > > > 4. Okay, that makes sense. I was confused by the fact that your > generated function had the same name as the builtin iter() operator. > > > //Torgil > > > On 7/19/07, Vincent Nijs wrote: > > > > Hi Torgil, > > > > 1. I got an email from Tim about this issue: > > > > "I finally got around to doing some more quantitative comparisons between > > your code and the more complicated version that I proposed. The idea behind > > my code was to minimize memory usage -- I figured that keeping the memory > > usage low would make up for any inefficiencies in the conversion process > > since it's been my experience that memory bandwidth dominates a lot of > > numeric problems as problem sized get reasonably large. I was mostly wrong. > > While it's true that for very large file sizes I can get my code to > > outperform yours, in most instances it lags behind. And the range where it > > does better is a fairly small range right before the machine dies with a > > memory error. So my conclusion is that the extra hoops my code goes through > > to avoid allocating extra memory isn't worth it for you to bother with." > > > > The approach in my code is simple and robust to most data issues I could > > come-up with. It actually will do an appropriate conversion if there are > > missing values or int's and float in the same column. It will select an > > appropriate string length as well. It may not be the most memory efficient > > setup but given Tim's comments it is a pretty decent solution for the types > > of data I have access to. > > > > 2. Fixed the spelling error :) > > > > 3. I guess that is the same thing. I am not very familiar with zip, izip, > > map etc. just yet :) Thanks for the tip! > > > > 4. I called the function generated using exec, iter(). I need that function > > to transform the data using the types provided by the user. > > > > Best, > > > > Vincent > > > > > > On 7/18/07 7:57 PM, "Torgil Svensson" wrote: > > > > > Nice, > > > > > > I haven't gone through all details. That's a nice new "missing" > > > feature, maybe all instances where we can't find a conversion should > > > be "nan". A few comments: > > > > > > 1. 
The "load_search" functions contains all memory/performance > > > overhead that we wanted to avoid with the fromiter function. Does this > > > mean that you no longer have large text-files that change sting > > > representation in the columns (aka "0" floats) ? > > > > > > 2. ident=" "*4 > > > This has the same spelling error as in my first compile try .. it was > > > meant to be "indent" > > > > > > 3. types = list((i,j) for i, j in zip(varnm, types2)) > > > Isn't this the same as "types = zip(varnm, types2)" ? > > > > > > 4. return N.fromiter(iter(reader),dtype = types) > > > Isn't "reader" an iterator already? What does the "iter()" operator do > > > in this case? > > > > > > Best regards, > > > > > > //Torgil > > > > > > > > > On 7/18/07, Vincent Nijs wrote: > > >> > > >> I combined some of the very useful comments/code from Tim and Torgil > > and > > >> came-up with the attached program to read csv files and convert the data > > >> into a recarray. I couldn't use all of their suggestions because, > > frankly, I > > >> didn't understand all of them :) > > >> > > >> The program use variable names if provided in the csv-file and can > > >> auto-detect data types. However, I also wanted to make it easy to > > specify > > >> data types and/or variables names if so desired. Examples are at the > > bottom > > >> of the file. Comments are very welcome. > > >> > > >> Thanks, > > >> > > >> Vincent > > >> _______________________________________________ > > >> Numpy-discussion mailing list > > >> Numpy-discussion at scipy.org > > >> > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > >> > > >> > > >> > > > _______________________________________________ > > > Numpy-discussion mailing list > > > Numpy-discussion at scipy.org > > > > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > -- > > Vincent R. Nijs > > Assistant Professor of Marketing > > Kellogg School of Management, Northwestern University > > 2001 Sheridan Road, Evanston, IL 60208-2001 > > Phone: +1-847-491-4574 Fax: +1-847-491-2498 > > E-mail: v-nijs at kellogg.northwestern.edu > > Skype: vincentnijs > > > > > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > > > -------------- next part -------------- A non-text attachment was scrubbed... 
Name: fix_tricky_columns.py Type: text/x-python Size: 2668 bytes Desc: not available URL: From gnata at obs.univ-lyon1.fr Thu Jul 19 08:58:50 2007 From: gnata at obs.univ-lyon1.fr (Xavier Gnata) Date: Thu, 19 Jul 2007 14:58:50 +0200 Subject: [Numpy-discussion] Wrong lapack version detection (32/64bits) Message-ID: <469F600A.2060801@obs.univ-lyon1.fr> Hi, I'm trying to update numpy by compiling the up to date svn: I get this error : gcc: numpy/linalg/lapack_litemodule.c gcc -pthread -shared build/temp.linux-i686-2.4/numpy/linalg/lapack_litemodule.o -lmkl_lapack32 -lmkl_lapack64 -lmkl -lvml -lguide -lpthread -o build/lib.linux-i686-2.4/numpy/linalg/lapack_lite.so /usr/bin/ld: cannot find -lmkl_lapack64 collect2: ld returned 1 exit status /usr/bin/ld: cannot find -lmkl_lapack64 collect2: ld returned 1 exit status error: Command "gcc -pthread -shared build/temp.linux-i686-2.4/numpy/linalg/lapack_litemodule.o -lmkl_lapack32 -lmkl_lapack64 -lmkl -lvml -lguide -lpthread -o build/lib.linux-i686-2.4/numpy/linalg/lapack_lite.so" failed with exit status 1 There must be something wrong in the distutils/makefile because I'm on a debian sid *i386* so why should I link against mkl_lapack64 ?? Of course, I do not have lapack64 installed on this i386 machine. I have try to simply fix that in the config file of numpy replacing lapack64 by lapack32 everywhere but it fails (and it is not an acceptable fix). Can anyone reproduce that?? Xavier -- ############################################ Xavier Gnata CRAL - Observatoire de Lyon 9, avenue Charles Andr? 69561 Saint Genis Laval cedex Phone: +33 4 78 86 85 28 Fax: +33 4 78 86 83 86 E-mail: gnata at obs.univ-lyon1.fr ############################################ From gnata at obs.univ-lyon1.fr Thu Jul 19 09:20:20 2007 From: gnata at obs.univ-lyon1.fr (Xavier Gnata) Date: Thu, 19 Jul 2007 15:20:20 +0200 Subject: [Numpy-discussion] Wrong lapack version detection (32/64bits) In-Reply-To: <469F600A.2060801@obs.univ-lyon1.fr> References: <469F600A.2060801@obs.univ-lyon1.fr> Message-ID: <469F6514.6010002@obs.univ-lyon1.fr> Xavier Gnata wrote: > Hi, > > I'm trying to update numpy by compiling the up to date svn: > > I get this error : > gcc: numpy/linalg/lapack_litemodule.c > gcc -pthread -shared > build/temp.linux-i686-2.4/numpy/linalg/lapack_litemodule.o > -lmkl_lapack32 -lmkl_lapack64 -lmkl -lvml -lguide -lpthread -o > build/lib.linux-i686-2.4/numpy/linalg/lapack_lite.so > /usr/bin/ld: cannot find -lmkl_lapack64 > collect2: ld returned 1 exit status > /usr/bin/ld: cannot find -lmkl_lapack64 > collect2: ld returned 1 exit status > error: Command "gcc -pthread -shared > build/temp.linux-i686-2.4/numpy/linalg/lapack_litemodule.o > -lmkl_lapack32 -lmkl_lapack64 -lmkl -lvml -lguide -lpthread -o > build/lib.linux-i686-2.4/numpy/linalg/lapack_lite.so" failed with exit > status 1 > > There must be something wrong in the distutils/makefile because I'm on a > debian sid *i386* so why should I link against mkl_lapack64 ?? > Of course, I do not have lapack64 installed on this i386 machine. > I have try to simply fix that in the config file of numpy replacing > lapack64 by lapack32 everywhere but it fails (and it is not an acceptable fix). > > Can anyone reproduce that?? 
> > Xavier > > > Trying to modify /numpy/distutils/system_info.py this way (only for test purpose...): # lapack_libs = self.get_libs('lapack_libs',['mkl_lapack32','mkl_lapack64']) lapack_libs = self.get_libs('lapack_libs',['mkl_lapack32']) I'm able to compile numpy but import numpy fails: ImportError: /usr/lib/python2.4/site-packages/numpy/linalg/lapack_lite.so: undefined symbol: zgesdd_ Looks like the procedure to detect the lapack version is fully buggy (or maybe the lapack debian pacakges??) It used to work ;) Xavier -- ############################################ Xavier Gnata CRAL - Observatoire de Lyon 9, avenue Charles Andr? 69561 Saint Genis Laval cedex Phone: +33 4 78 86 85 28 Fax: +33 4 78 86 83 86 E-mail: gnata at obs.univ-lyon1.fr ############################################ From v-nijs at kellogg.northwestern.edu Thu Jul 19 22:42:42 2007 From: v-nijs at kellogg.northwestern.edu (Vincent Nijs) Date: Thu, 19 Jul 2007 21:42:42 -0500 Subject: [Numpy-discussion] Pickle, pytables, and sqlite - loading and saving recarray's In-Reply-To: Message-ID: I am interesting in using sqlite (or pytables) to store data for scientific research. I wrote the attached test program to save and load a simulated 11x500,000 recarray. Average save and load times are given below (timeit with 20 repetitions). The save time for sqlite is not really fair because I have to delete the data table each time before I create the new one. It is still pretty slow in comparison. Loading the recarray from sqlite is significantly slower than pytables or cPickle. I am hoping there may be more efficient ways to save and load recarray?s from/to sqlite than what I am now doing. Note that I infer the variable names and types from the data rather than specifying them manually. I?d luv to hear from people using sqlite, pytables, and cPickle about their experiences. saving recarray with cPickle: 1.448568 sec/pass saving recarray with pytable: 3.437228 sec/pass saving recarray with sqlite: 193.286204 sec/pass loading recarray using cPickle: 0.471365 sec/pass loading recarray with pytable: 0.692838 sec/pass loading recarray with sqlite: 15.977018 sec/pass Best, Vincent -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: load_tables_test.py Type: application/octet-stream Size: 4154 bytes Desc: not available URL: From david at ar.media.kyoto-u.ac.jp Thu Jul 19 22:42:32 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Fri, 20 Jul 2007 11:42:32 +0900 Subject: [Numpy-discussion] Wrong lapack version detection (32/64bits) In-Reply-To: <469F6514.6010002@obs.univ-lyon1.fr> References: <469F600A.2060801@obs.univ-lyon1.fr> <469F6514.6010002@obs.univ-lyon1.fr> Message-ID: <46A02118.50706@ar.media.kyoto-u.ac.jp> Xavier Gnata wrote: > Xavier Gnata wrote: > >> Hi, >> >> I'm trying to update numpy by compiling the up to date svn: >> >> I get this error : >> gcc: numpy/linalg/lapack_litemodule.c >> gcc -pthread -shared >> build/temp.linux-i686-2.4/numpy/linalg/lapack_litemodule.o >> -lmkl_lapack32 -lmkl_lapack64 -lmkl -lvml -lguide -lpthread -o >> build/lib.linux-i686-2.4/numpy/linalg/lapack_lite.so >> /usr/bin/ld: cannot find -lmkl_lapack64 >> collect2: ld returned 1 exit status >> /usr/bin/ld: cannot find -lmkl_lapack64 >> collect2: ld returned 1 exit status >> error: Command "gcc -pthread -shared >> build/temp.linux-i686-2.4/numpy/linalg/lapack_litemodule.o >> -lmkl_lapack32 -lmkl_lapack64 -lmkl -lvml -lguide -lpthread -o >> build/lib.linux-i686-2.4/numpy/linalg/lapack_lite.so" failed with exit >> status 1 >> >> There must be something wrong in the distutils/makefile because I'm on a >> debian sid *i386* so why should I link against mkl_lapack64 ?? >> Of course, I do not have lapack64 installed on this i386 machine. >> I have try to simply fix that in the config file of numpy replacing >> lapack64 by lapack32 everywhere but it fails (and it is not an acceptable fix). >> >> Can anyone reproduce that?? >> >> Xavier >> >> >> >> > > Trying to modify /numpy/distutils/system_info.py this way (only for > test purpose...): > > # lapack_libs = > self.get_libs('lapack_libs',['mkl_lapack32','mkl_lapack64']) > lapack_libs = self.get_libs('lapack_libs',['mkl_lapack32']) > > I'm able to compile numpy but import numpy fails: > > svn numpy and scipy work perfectly on Ubuntu, and there are no noticeable differences between sid and Ubuntu to make a difference a priori. Do you have the mkl installed at all on your computer ? If not, and without any modification to the site.cfg of numpy, I don't see why numpy would try to detect the mkl. Maybe some really recent (eg a few days) changes in the trunk ? I build quite regularly the lastest numpy (several times / week) on Ubuntu without problems. > ImportError: > /usr/lib/python2.4/site-packages/numpy/linalg/lapack_lite.so: undefined > symbol: zgesdd_ > > Looks like the procedure to detect the lapack version is fully buggy (or > maybe the lapack debian pacakges??) > Quite the contrary: debian LAPACK packages were the only ones working properly for a long time. David From gael.varoquaux at normalesup.org Fri Jul 20 02:16:33 2007 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Fri, 20 Jul 2007 08:16:33 +0200 Subject: [Numpy-discussion] Pickle, pytables, and sqlite - loading and saving recarray's In-Reply-To: References: Message-ID: <20070720061633.GC26437@clipper.ens.fr> On Thu, Jul 19, 2007 at 09:42:42PM -0500, Vincent Nijs wrote: > I'd luv to hear from people using sqlite, pytables, and cPickle about > their experiences. I was about to point you to this discussion: http://projects.scipy.org/pipermail/scipy-user/2007-April/011724.html but I see that you participated in it. I store data from each of my experimental run with pytables. 
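(A minimal sketch of that kind of per-run layout, with made-up names and PyTables ~2.0 calls, in case it helps picture it:)

import numpy as np
import tables

h5 = tables.openFile('experiment.h5', mode='w', title='run archive')
run = h5.createGroup('/', 'run001', 'first experimental run')
h5.createArray(run, 'signal', np.random.rand(1000), 'raw detector trace')
h5.createArray(run, 'time', np.arange(1000) * 1e-3, 'time axis [s]')
run._v_attrs.operator = 'GV'     # free-form metadata stored as node attributes
run._v_attrs.notes = 'strings and small structures describing the run'
h5.close()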
What I like about it is the hierarchical organization of the data which allows me to save a complete description of the experiment, with strings, and extensible data structures. Another thing I like is that I can load this in Matlab (I can provide enhanced script for hdf5, if somebody wants them), and I think it is possible to read hdf5 in Origin. I don't use these software, but some colleagues do. So I think the choices between pytables and cPickle boils down to whether you want to share the data with other software than Python or not. Ga?l From strawman at astraw.com Fri Jul 20 04:59:13 2007 From: strawman at astraw.com (Andrew Straw) Date: Fri, 20 Jul 2007 01:59:13 -0700 Subject: [Numpy-discussion] Pickle, pytables, and sqlite - loading and saving recarray's In-Reply-To: <20070720061633.GC26437@clipper.ens.fr> References: <20070720061633.GC26437@clipper.ens.fr> Message-ID: <46A07961.9000806@astraw.com> Gael Varoquaux wrote: > On Thu, Jul 19, 2007 at 09:42:42PM -0500, Vincent Nijs wrote: >> I'd luv to hear from people using sqlite, pytables, and cPickle about >> their experiences. > > I was about to point you to this discussion: > http://projects.scipy.org/pipermail/scipy-user/2007-April/011724.html > > but I see that you participated in it. > > I store data from each of my experimental run with pytables. What I like > about it is the hierarchical organization of the data which allows me to > save a complete description of the experiment, with strings, and > extensible data structures. Another thing I like is that I can load this > in Matlab (I can provide enhanced script for hdf5, if somebody wants > them), and I think it is possible to read hdf5 in Origin. I don't use > these software, but some colleagues do. I want that Matlab script! I have colleagues with whom the least common denominator is currently .mat files. I'd be much happier if it was hdf5 files. Can you post it on the scipy wiki cookbook? (Or the pytables wiki?) Cheers! Andrew From gael.varoquaux at normalesup.org Fri Jul 20 05:24:34 2007 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Fri, 20 Jul 2007 11:24:34 +0200 Subject: [Numpy-discussion] Pickle, pytables, and sqlite - loading and saving recarray's In-Reply-To: <46A07961.9000806@astraw.com> References: <20070720061633.GC26437@clipper.ens.fr> <46A07961.9000806@astraw.com> Message-ID: <20070720092434.GL26437@clipper.ens.fr> On Fri, Jul 20, 2007 at 01:59:13AM -0700, Andrew Straw wrote: > I want that Matlab script! I new I really should put these things on line, I have just been wanting to iron them a bit, but it has been almost two year since I have touched these, so ... http://scipy.org/Cookbook/hdf5_in_Matlab Feel free to improve them, and to write similar scripts in Python. Ga?l From ivilata at carabos.com Fri Jul 20 05:55:29 2007 From: ivilata at carabos.com (Ivan Vilata i Balaguer) Date: Fri, 20 Jul 2007 11:55:29 +0200 Subject: [Numpy-discussion] Pickle, pytables, and sqlite - loading and saving recarray's In-Reply-To: <20070720092434.GL26437@clipper.ens.fr> References: <20070720061633.GC26437@clipper.ens.fr> <46A07961.9000806@astraw.com> <20070720092434.GL26437@clipper.ens.fr> Message-ID: <20070720095529.GB6241@rampella.terramar.selidor.net> Gael Varoquaux (el 2007-07-20 a les 11:24:34 +0200) va dir:: > I new I really should put these things on line, I have just been wanting > to iron them a bit, but it has been almost two year since I have touched > these, so ... 
> > http://scipy.org/Cookbook/hdf5_in_Matlab Wow, that looks really sweet and simple, useful code. Great! :: Ivan Vilata i Balaguer >qo< http://www.carabos.com/ C?rabos Coop. V. V V Enjoy Data "" -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 307 bytes Desc: Digital signature URL: From gnata at obs.univ-lyon1.fr Fri Jul 20 06:20:05 2007 From: gnata at obs.univ-lyon1.fr (Xavier Gnata) Date: Fri, 20 Jul 2007 12:20:05 +0200 Subject: [Numpy-discussion] Wrong lapack version detection (32/64bits) In-Reply-To: <46A02118.50706@ar.media.kyoto-u.ac.jp> References: <469F600A.2060801@obs.univ-lyon1.fr> <469F6514.6010002@obs.univ-lyon1.fr> <46A02118.50706@ar.media.kyoto-u.ac.jp> Message-ID: <46A08C55.2020409@obs.univ-lyon1.fr> >>> Hi, >>> >>> I'm trying to update numpy by compiling the up to date svn: >>> >>> I get this error : >>> gcc: numpy/linalg/lapack_litemodule.c >>> gcc -pthread -shared >>> build/temp.linux-i686-2.4/numpy/linalg/lapack_litemodule.o >>> -lmkl_lapack32 -lmkl_lapack64 -lmkl -lvml -lguide -lpthread -o >>> build/lib.linux-i686-2.4/numpy/linalg/lapack_lite.so >>> /usr/bin/ld: cannot find -lmkl_lapack64 >>> collect2: ld returned 1 exit status >>> /usr/bin/ld: cannot find -lmkl_lapack64 >>> collect2: ld returned 1 exit status >>> error: Command "gcc -pthread -shared >>> build/temp.linux-i686-2.4/numpy/linalg/lapack_litemodule.o >>> -lmkl_lapack32 -lmkl_lapack64 -lmkl -lvml -lguide -lpthread -o >>> build/lib.linux-i686-2.4/numpy/linalg/lapack_lite.so" failed with exit >>> status 1 >>> >>> There must be something wrong in the distutils/makefile because I'm on a >>> debian sid *i386* so why should I link against mkl_lapack64 ?? >>> Of course, I do not have lapack64 installed on this i386 machine. >>> I have try to simply fix that in the config file of numpy replacing >>> lapack64 by lapack32 everywhere but it fails (and it is not an acceptable fix). >>> >>> Can anyone reproduce that?? >>> >>> Xavier >>> >>> >>> >>> >>> >> Trying to modify /numpy/distutils/system_info.py this way (only for >> test purpose...): >> >> # lapack_libs = >> self.get_libs('lapack_libs',['mkl_lapack32','mkl_lapack64']) >> lapack_libs = self.get_libs('lapack_libs',['mkl_lapack32']) >> >> I'm able to compile numpy but import numpy fails: >> >> >> > svn numpy and scipy work perfectly on Ubuntu, and there are no > noticeable differences between sid and Ubuntu to make a difference a priori. > > Do you have the mkl installed at all on your computer ? Ok I have mkl installed on my machine (it has by installed by a well know closed sources software....) /usr/lib/libmkl.so: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), not stripped /usr/lib/libmkl_lapack32.so: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), not stripped I just cannont remove them (because I will break another part of my system) but anyway, numpy detects them in a wrong way : I have no 64 version of mkl on this i386 machine :)) I have try to remove them and numpy compiles/works again but it is not a solution. I also have debian lapack package installed /usr/lib/liblapack.so.3.0. Numpy used to detect/use only this debian version of lapack. I never had a cusomized version of site.cfg. I have seen some recent changes (svn commits) on the way numpy detects the libs so it looks like we have a bug here when both libmkl_lapack32.so and liblapack.so (debian) are installed. 
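(A quick way to see which LAPACK/MKL numpy's build machinery actually picks up -- assuming this numpy revision exposes the usual section names in numpy.distutils.system_info; run it from outside the source tree:)

from numpy.distutils.system_info import get_info
print get_info('lapack_opt')   # the libraries lapack_lite will be linked against
print get_info('mkl')          # empty dict {} if no MKL is detected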
I just would like to be able to tell numpy to use liblapack.so instead of this non free libmkl_lapack32.so Xavier > If not, and > without any modification to the site.cfg of numpy, I don't see why numpy > would try to detect the mkl. Maybe some really recent (eg a few days) > changes in the trunk ? I build quite regularly the lastest numpy > (several times / week) on Ubuntu without problems. > >> ImportError: >> /usr/lib/python2.4/site-packages/numpy/linalg/lapack_lite.so: undefined >> symbol: zgesdd_ >> >> Looks like the procedure to detect the lapack version is fully buggy (or >> maybe the lapack debian pacakges??) >> >> > Quite the contrary: debian LAPACK packages were the only ones working > properly for a long time. > > David > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > -- ############################################ Xavier Gnata CRAL - Observatoire de Lyon 9, avenue Charles Andr? 69561 Saint Genis Laval cedex Phone: +33 4 78 86 85 28 Fax: +33 4 78 86 83 86 E-mail: gnata at obs.univ-lyon1.fr ############################################ From faltet at carabos.com Fri Jul 20 07:17:59 2007 From: faltet at carabos.com (Francesc Altet) Date: Fri, 20 Jul 2007 13:17:59 +0200 Subject: [Numpy-discussion] Pickle, pytables, and sqlite - loading and saving recarray's In-Reply-To: References: Message-ID: <200707201318.01462.faltet@carabos.com> A Divendres 20 Juliol 2007 04:42, Vincent Nijs escrigu?: > I am interesting in using sqlite (or pytables) to store data for scientific > research. I wrote the attached test program to save and load a simulated > 11x500,000 recarray. Average save and load times are given below (timeit > with 20 repetitions). The save time for sqlite is not really fair because I > have to delete the data table each time before I create the new one. It is > still pretty slow in comparison. Loading the recarray from sqlite is > significantly slower than pytables or cPickle. I am hoping there may be > more efficient ways to save and load recarray?s from/to sqlite than what I > am now doing. Note that I infer the variable names and types from the data > rather than specifying them manually. > > I?d luv to hear from people using sqlite, pytables, and cPickle about their > experiences. > > saving recarray with cPickle: 1.448568 sec/pass > saving recarray with pytable: 3.437228 sec/pass > saving recarray with sqlite: 193.286204 sec/pass > > loading recarray using cPickle: 0.471365 sec/pass > loading recarray with pytable: 0.692838 sec/pass > loading recarray with sqlite: 15.977018 sec/pass For a more fair comparison, and for large amounts of data, you should inform PyTables about the expected number of rows (see [1]) that you will end feeding into the tables so that it can choose the best chunksize for I/O purposes. 
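(In code, the change amounts to something like the following sketch -- not the exact attached script; the array here is just a stand-in for the recarray being saved, and passing it as the description both defines and fills the table:)

import numpy as np
import tables

ra = np.zeros(500000, dtype=[('x', 'f8'), ('y', 'i8')])   # stand-in for the real recarray
h5 = tables.openFile('data.h5', mode='w')
h5.createTable('/', 'data', ra, 'simulated data', expectedrows=len(ra))
h5.close()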
I've redone the benchmarks (the new script is attached) with this 'optimization' on and here are my numbers: -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= PyTables version: 2.0 HDF5 version: 1.6.5 NumPy version: 1.0.3 Zlib version: 1.2.3 LZO version: 2.01 (Jun 27 2005) Python version: 2.5 (r25:51908, Nov 3 2006, 12:01:01) [GCC 4.0.2 20050901 (prerelease) (SUSE Linux)] Platform: linux2-x86_64 Byte-ordering: little -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Test saving recarray using cPickle: 0.197113 sec/pass Test saving recarray with pytables: 0.234442 sec/pass Test saving recarray with pytables (with zlib): 1.973649 sec/pass Test saving recarray with pytables (with lzo): 0.925558 sec/pass Test loading recarray using cPickle: 0.151379 sec/pass Test loading recarray with pytables: 0.165399 sec/pass Test loading recarray with pytables (with zlib): 0.553251 sec/pass Test loading recarray with pytables (with lzo): 0.264417 sec/pass As you can see, the differences between raw cPickle and PyTables are much less than not informing about the total number of rows. In fact, an automatic optimization can easily be done in PyTables so that when the user is passing a recarray, the total length of the recarray would be compared with the default number of expected rows (currently 10000), and if the former is larger, then the length of the recarray should be chosen instead. I also have added the times when using compression just in case you are interested using it. Here are the final file sizes: $ ls -sh data total 132M 24M data-lzo.h5 43M data-None.h5 43M data.pickle 25M data-zlib.h5 Of course, this is using completely random data, but with real data the compression levels are expected to be higher than this. [1] http://www.pytables.org/docs/manual/ch05.html#expectedRowsOptim Cheers, -- >0,0< Francesc Altet ? ? http://www.carabos.com/ V V C?rabos Coop. V. ??Enjoy Data "-" -------------- next part -------------- A non-text attachment was scrubbed... Name: load_tables_test.py Type: application/x-python Size: 4964 bytes Desc: not available URL: From pearu at cens.ioc.ee Fri Jul 20 07:35:28 2007 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Fri, 20 Jul 2007 13:35:28 +0200 Subject: [Numpy-discussion] Wrong lapack version detection (32/64bits) In-Reply-To: <46A08C55.2020409@obs.univ-lyon1.fr> References: <469F600A.2060801@obs.univ-lyon1.fr> <469F6514.6010002@obs.univ-lyon1.fr> <46A02118.50706@ar.media.kyoto-u.ac.jp> <46A08C55.2020409@obs.univ-lyon1.fr> Message-ID: <46A09E00.9080009@cens.ioc.ee> Xavier Gnata wrote: > > I just would like to be able to tell numpy to use liblapack.so instead > of this non free libmkl_lapack32.so For that set the following environment variable when building numpy/scipy: export MKL=None Hth, Pearu From gzhu at peak6.com Fri Jul 20 09:30:40 2007 From: gzhu at peak6.com (Geoffrey Zhu) Date: Fri, 20 Jul 2007 08:30:40 -0500 Subject: [Numpy-discussion] Logical Selector References: <469E5CFB.6010003@obs.univ-lyon1.fr> <99F81FFD0EA54E4DA8D4F1BFE272F34105400FBE@ppi-mail1.chicago.peak6.net> <469E66D4.2030501@gmail.com><469E782E.1020801@hawaii.edu> <469E8997.4010609@obs.univ-lyon1.fr> Message-ID: <99F81FFD0EA54E4DA8D4F1BFE272F341054013B2@ppi-mail1.chicago.peak6.net> >Hi, >Well maybe it is a bug on my box (thunderbird) but the topic of the thread is "-lmkl_lapack64 on i368 ??". >Nothing to do with "Logical Selector" ;) Should I post another mail about this topic? 
>Xavier >ps : I'm just sorry for the noise if it is a bug on my side. >-- Hi Xavier, I didn't know mailing lists track threads. The messages look all independent in Microsoft Outlook. So I just hit reply on your message, changed the title, and put in my message... Sorry about that. Geoffrey _______________________________________________________=0A= =0A= The information in this email or in any file attached=0A= hereto is intended only for the personal and confiden-=0A= tial use of the individual or entity to which it is=0A= addressed and may contain information that is propri-=0A= etary and confidential. If you are not the intended=0A= recipient of this message you are hereby notified that=0A= any review, dissemination, distribution or copying of=0A= this message is strictly prohibited. This communica-=0A= tion is for information purposes only and should not=0A= be regarded as an offer to sell or as a solicitation=0A= of an offer to buy any financial product. Email trans-=0A= mission cannot be guaranteed to be secure or error-=0A= free. P6070214 From v-nijs at kellogg.northwestern.edu Fri Jul 20 09:35:51 2007 From: v-nijs at kellogg.northwestern.edu (Vincent Nijs) Date: Fri, 20 Jul 2007 08:35:51 -0500 Subject: [Numpy-discussion] Pickle, pytables, and sqlite - loading and saving recarray's In-Reply-To: <20070720061633.GC26437@clipper.ens.fr> Message-ID: Gael, Sounds very interesting! Would you mind sharing an example (with code if possible) of how you organize your experimental data in pytables. I have been thinking about how I might organize my data in pytables and would luv to hear how an experienced user does that. Given the speed differences it looks like pytables is going to be a better solution for my needs. Still curious however ... does no one on this list use (and like) sqlite? Could anyone suggest any other list where I might find users of python and sqlite (and numpy)? Thanks, Vincent On 7/20/07 1:16 AM, "Gael Varoquaux" wrote: > On Thu, Jul 19, 2007 at 09:42:42PM -0500, Vincent Nijs wrote: >> I'd luv to hear from people using sqlite, pytables, and cPickle about >> their experiences. > > I was about to point you to this discussion: > http://projects.scipy.org/pipermail/scipy-user/2007-April/011724.html > > but I see that you participated in it. > > I store data from each of my experimental run with pytables. What I like > about it is the hierarchical organization of the data which allows me to > save a complete description of the experiment, with strings, and > extensible data structures. Another thing I like is that I can load this > in Matlab (I can provide enhanced script for hdf5, if somebody wants > them), and I think it is possible to read hdf5 in Origin. I don't use > these software, but some colleagues do. > > So I think the choices between pytables and cPickle boils down to whether > you want to share the data with other software than Python or not. > > Ga?l > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From tim.hochberg at ieee.org Fri Jul 20 09:48:21 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Fri, 20 Jul 2007 06:48:21 -0700 Subject: [Numpy-discussion] Pickle, pytables, and sqlite - loading and saving recarray's In-Reply-To: References: <20070720061633.GC26437@clipper.ens.fr> Message-ID: On 7/20/07, Vincent Nijs wrote: > > Gael, > > Sounds very interesting! 
Would you mind sharing an example (with code if > possible) of how you organize your experimental data in pytables. I have > been thinking about how I might organize my data in pytables and would luv > to hear how an experienced user does that. > > Given the speed differences it looks like pytables is going to be a better > solution for my needs. > > Still curious however ... does no one on this list use (and like) sqlite? > > Could anyone suggest any other list where I might find users of python and > sqlite (and numpy)? You could try the db-sig. You can get to the archives, and I imagine subscribe to it, from: http://www.python.org/community/sigs/current/ I don't know if that'll be helpful for you, but I imagine that they know something about python + sqlllite. -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From v-nijs at kellogg.northwestern.edu Fri Jul 20 10:10:34 2007 From: v-nijs at kellogg.northwestern.edu (Vincent Nijs) Date: Fri, 20 Jul 2007 09:10:34 -0500 Subject: [Numpy-discussion] Pickle, pytables, and sqlite - loading and saving recarray's In-Reply-To: <200707201318.01462.faltet@carabos.com> Message-ID: Thanks Francesc! That does work much better: -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= PyTables version: 2.0 HDF5 version: 1.6.5 NumPy version: 1.0.4.dev3852 Zlib version: 1.2.3 BZIP2 version: 1.0.2 (30-Dec-2001) Python version: 2.5.1 (r251:54869, Apr 18 2007, 22:08:04) [GCC 4.0.1 (Apple Computer, Inc. build 5367)] Platform: darwin-Power Macintosh Byte-ordering: big -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Test saving recarray using cPickle: 1.620880 sec/pass Test saving recarray with pytables: 2.074591 sec/pass Test saving recarray with pytables (with zlib): 14.320498 sec/pass Test loading recarray using cPickle: 1.023015 sec/pass Test loading recarray with pytables: 0.882411 sec/pass Test loading recarray with pytables (with zlib): 3.692698 sec/pass On 7/20/07 6:17 AM, "Francesc Altet" wrote: > A Divendres 20 Juliol 2007 04:42, Vincent Nijs escrigu?: >> I am interesting in using sqlite (or pytables) to store data for scientific >> research. I wrote the attached test program to save and load a simulated >> 11x500,000 recarray. Average save and load times are given below (timeit >> with 20 repetitions). The save time for sqlite is not really fair because I >> have to delete the data table each time before I create the new one. It is >> still pretty slow in comparison. Loading the recarray from sqlite is >> significantly slower than pytables or cPickle. I am hoping there may be >> more efficient ways to save and load recarray?s from/to sqlite than what I >> am now doing. Note that I infer the variable names and types from the data >> rather than specifying them manually. >> >> I?d luv to hear from people using sqlite, pytables, and cPickle about their >> experiences. >> >> saving recarray with cPickle: 1.448568 sec/pass >> saving recarray with pytable: 3.437228 sec/pass >> saving recarray with sqlite: 193.286204 sec/pass >> >> loading recarray using cPickle: 0.471365 sec/pass >> loading recarray with pytable: 0.692838 sec/pass >> loading recarray with sqlite: 15.977018 sec/pass > > For a more fair comparison, and for large amounts of data, you should inform > PyTables about the expected number of rows (see [1]) that you will end > feeding into the tables so that it can choose the best chunksize for I/O > purposes. 
> > I've redone the benchmarks (the new script is attached) with > this 'optimization' on and here are my numbers: > > -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= > PyTables version: 2.0 > HDF5 version: 1.6.5 > NumPy version: 1.0.3 > Zlib version: 1.2.3 > LZO version: 2.01 (Jun 27 2005) > Python version: 2.5 (r25:51908, Nov 3 2006, 12:01:01) > [GCC 4.0.2 20050901 (prerelease) (SUSE Linux)] > Platform: linux2-x86_64 > Byte-ordering: little > -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= > Test saving recarray using cPickle: 0.197113 sec/pass > Test saving recarray with pytables: 0.234442 sec/pass > Test saving recarray with pytables (with zlib): 1.973649 sec/pass > Test saving recarray with pytables (with lzo): 0.925558 sec/pass > > Test loading recarray using cPickle: 0.151379 sec/pass > Test loading recarray with pytables: 0.165399 sec/pass > Test loading recarray with pytables (with zlib): 0.553251 sec/pass > Test loading recarray with pytables (with lzo): 0.264417 sec/pass > > As you can see, the differences between raw cPickle and PyTables are much less > than not informing about the total number of rows. In fact, an automatic > optimization can easily be done in PyTables so that when the user is passing > a recarray, the total length of the recarray would be compared with the > default number of expected rows (currently 10000), and if the former is > larger, then the length of the recarray should be chosen instead. > > I also have added the times when using compression just in case you are > interested using it. Here are the final file sizes: > > $ ls -sh data > total 132M > 24M data-lzo.h5 43M data-None.h5 43M data.pickle 25M data-zlib.h5 > > Of course, this is using completely random data, but with real data the > compression levels are expected to be higher than this. > > [1] http://www.pytables.org/docs/manual/ch05.html#expectedRowsOptim > > Cheers, -- Vincent R. Nijs Assistant Professor of Marketing Kellogg School of Management, Northwestern University 2001 Sheridan Road, Evanston, IL 60208-2001 Phone: +1-847-491-4574 Fax: +1-847-491-2498 E-mail: v-nijs at kellogg.northwestern.edu Skype: vincentnijs From lbolla at gmail.com Fri Jul 20 10:46:00 2007 From: lbolla at gmail.com (lorenzo bolla) Date: Fri, 20 Jul 2007 16:46:00 +0200 Subject: [Numpy-discussion] expm Message-ID: <80c99e790707200746i11bec580ub93760d8d91e1f6d@mail.gmail.com> hi all. is there a function in numpy to compute the exp of a matrix, similar to expm in matlab? for example: expm([[0,0],[0,0]]) = eye(2) thanks, lorenzo. -------------- next part -------------- An HTML attachment was scrubbed... URL: From nwagner at iam.uni-stuttgart.de Fri Jul 20 10:50:17 2007 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Fri, 20 Jul 2007 16:50:17 +0200 Subject: [Numpy-discussion] expm In-Reply-To: <80c99e790707200746i11bec580ub93760d8d91e1f6d@mail.gmail.com> References: <80c99e790707200746i11bec580ub93760d8d91e1f6d@mail.gmail.com> Message-ID: <46A0CBA9.3090606@iam.uni-stuttgart.de> lorenzo bolla wrote: > hi all. > is there a function in numpy to compute the exp of a matrix, similar > to expm in matlab? > for example: > expm([[0,0],[0,0]]) = eye(2) > > thanks, > lorenzo. 
> ------------------------------------------------------------------------ > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > Numpy doesn't provide expm but scipy does. >>> from scipy.linalg import expm, expm2, expm3 Nils From gael.varoquaux at normalesup.org Fri Jul 20 10:56:28 2007 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Fri, 20 Jul 2007 16:56:28 +0200 Subject: [Numpy-discussion] Pickle, pytables, and sqlite - loading and saving recarray's In-Reply-To: References: <20070720061633.GC26437@clipper.ens.fr> Message-ID: <20070720145628.GM26437@clipper.ens.fr> On Fri, Jul 20, 2007 at 08:35:51AM -0500, Vincent Nijs wrote: > Sounds very interesting! Would you mind sharing an example (with code if > possible) of how you organize your experimental data in pytables. I have > been thinking about how I might organize my data in pytables and would luv > to hear how an experienced user does that. I can show you the processing code. The experiment I have close to me is run by Matlab, the one that is fully controlled by Python is a continent away. Actually, I am really lazy, so I am just going to copy brutally the IO module. Something that can be interesting is that the data is saved by the expirement control framework on a computer (called Krubcontrol), this data can then be retrieve using the "fetch_files" Python command, that puts it on the server and logs it into a data base like hash table. When we want to retrieve the data we have a special object krubdata, which uses some fancy indexing to retrieve by data, or specifying the keywords. I am sorry I am not providing the code that is writing the hdf5 files, it is an incredible useless mess, trust me. I would be able to factor out the output code out of the 5K matlab lines. Hopefuly you'll be able to get an idea of the structure of the hdf5 files by looking at the code that does the loading. I haven't worked with this data for a while, so I can't tell you Some of the Python code might be useful to others, especially the hashing and retrieving part. The reason why I didn't use a relational DB is that I simply don't trust them enough for my precious data. Ga?l -------------- next part -------------- """ Krub.load Routines to load the data saved by the experiment and build useful structures out of it. Author: Gael Varoquaux Copyright: Laboratoire Charles Fabry de l'Institut d'Optique License: BSD-like """ # Avoid division problems from __future__ import division # To load hdf5 import tables # Do not display any warnings (FIXME: this is too strict) tables.warnings.filterwarnings('ignore') # regular expressions import re import os, sys, shutil import datetime # Module for object persistence import shelve # provide globbing from glob import glob from numpy import array # FIXME: This will pose problem when pytables transit to numpy. from numarray.strings import CharArray # FIXME: This is to much hardcoded data_root = "/home/manip/data" db_file_name = "/home/manip/analysis/krubDB.db" def load_h5(file_name): """ Loads an hdf5 file and returns a dict with the hdf5 data in it. 
""" file = tables.openFile(file_name) out_dict = {} for key, value in file.leaves.iteritems(): if isinstance(value, tables.UnImplemented): continue try: value = value.read() try: if isinstance(value, CharArray): value = value.tolist() except Exception, inst: print "Couldn't convert %s to a list" % key print inst if len(value) == 1: value = value[0] out_dict[key[1:]] = value except Exception, inst: print "couldn't load %s" % key print inst file.close() return(out_dict) def load_Krub(file_name): """ Loads a file created by cameraview and returns a dict with the data restructured in a more pleasant way. """ data = load_h5(file_name) # Store the params in a dict try: params = {} for name, value in zip(data['SCparamsnames'], data['SCparams']): params[name] = value data.update(params) data['params'] = params data.pop('SCparams') data.pop('SCparamsnames') except Exception, inst: print "couldn't convert params to a dict: " print inst return data def load_seq(file_list): """ Loads a sequence of hdf5 files created by cameraview and returns a list of dicts with the data. """ return [ load_Krub(file_name) for file_name in file_list ] def build_param_table(file_list): """ Scans the given list of files and returns a dictionary of dictionaries discribing the files, and the experimental parameters. """ out_dict = {} for file_name in file_list: data = load_Krub(file_name) if 'params' in data: params = data['params'] else: params = {} params['filename'] = file_name if 'sequencename' in data: params['sequencename'] = data['sequencename'] if 'fitfunction' in data: params['fitfunction'] = data['fitfunction'] if 'loopposition' in data: params['loopposition'] = data['loopposition'] if 'roi' in data: params['roi'] = data['roi'] # Check that the filename has the timestamp if re.match(r".*\d\d_\d\d_\d\d", file_name[:-3]): params['time'] = int( file_name[-11:-9] + file_name[-8:-6] + file_name[-5:-3] ) # Check whether the directory of the file has the datestamp. full_path = os.path.abspath(file_name) params['fullpath'] = full_path dir_path = full_path.replace(data_root+os.sep,'') dir_name = dir_path.split(os.sep)[0] if re.match(r"\d\d\d\d\d\d", dir_name): params['date'] = int(dir_name) out_dict[full_path] = params # Delete manually the data, let us not trust the garbage collector # here: we cannot afford wasting memory del data print >>sys.stderr, ".", return out_dict def add_files(file_list): """ Adds the given files to the Krub database. """ # An ugly hack to change the file permissions even if we do not own # the file: start a new file, and replace the old one with the new # one. hash_table = build_param_table(file_list) dbase_new = shelve.open(db_file_name + "new") dbase_old = shelve.open(db_file_name) dbase_new.update(dbase_old) dbase_new.update(hash_table) dbase_old.close() dbase_new.close() os.chmod(db_file_name + "new", 0777) shutil.move(db_file_name, db_file_name + "old") shutil.move(db_file_name + "new", db_file_name) def rebuild_db(): """ Rescans the complete data directories to rebuild the database. """ database = {} for dirpath, dirnames, filenames in os.walk(data_root): print "\nscanning ", dirpath h5files = [dirpath + os.sep + filename for filename in filenames if filename[-3:]==".h5"] database.update(build_param_table(h5files)) os.rename(db_file_name, db_file_name+"back") dbase = shelve.open(db_file_name) dbase.update(database) dbase.close() os.chmod(db_file_name, 0777) def query_db(**kwargs): """ Queries the database to find files matching certain parameters. 
Returns the database entries (dictionnaries) of these files. >>> query_db(molasse_time=8., seq_name='FORT_2b', mot_load_time_s= 6.) """ dbase = shelve.open(db_file_name) out_dict = {} for file_name, params in dbase.iteritems(): store = True for param, value in kwargs.iteritems(): if param in params: if not params[param] == value: store = False break if store: out_dict[file_name] = params dbase.close() return out_dict def select_seq(seq, **kwargs): """ Selects filenames in the given list according to the specified parameters. The files must be in the database. >>> select_seq(krubdata[:], seq_name='FORT_2b') """ # FIXME: This is way to much copied and pasted from query_db dbase = shelve.open(db_file_name) out_list = [] for file_name in seq: params = dbase[file_name] store = True for param, value in kwargs.iteritems(): if param in params: if not params[param] == value: store = False break else: store = False break if store: out_list += [file_name, ] dbase.close() return out_list def extract_param(seq, param_name): """ Return an array with all the values the given parameter takes in the sequence of file names given. """ dbase = shelve.open(db_file_name) out_list = [] for file_name in seq: params = dbase[file_name] if param_name in params: out_list += [ params[param_name], ] # Use a set to have unique entries: out_list = array(list(set(out_list))) out_list.sort() return out_list ########################################################################### # Hack to use the gnome-vfs to update the files from Krubcontrol ########################################################################### import gnomevfs FLAGS = gnomevfs.PERM_USER_ALL + gnomevfs.PERM_GROUP_ALL + \ gnomevfs.PERM_OTHER_ALL def fetch_files(): """ updates the data from krubcontrol """ if not gnomevfs.exists('smb://krubcontrol/data'): raise IOError, "Cannot connect to Krubcontrol" file_list = _walk_gnomevfs('smb://krubcontrol/data/Manip/data') if len(file_list) == 0: print "Nothing new" else: print "Adding files to database" add_files([file_name for file_name in file_list if file_name[-3:]=='.h5' ]) def _walk_gnomevfs(uri, base='smb://krubcontrol/data/Manip/data'): """ Private function used to scan remote windows drives """ file_list = [] dir_iterator = gnomevfs.open_directory(uri) for entry in dir_iterator: if entry.name[0] == '.': continue entry_uri = uri + "/" + entry.name local_uri = entry_uri.replace(base,"file://" + data_root) disk_uri = local_uri.replace("file://", "") if entry.type == gnomevfs.FILE_TYPE_DIRECTORY: if not gnomevfs.exists(local_uri): gnomevfs.make_directory(local_uri, FLAGS) os.chmod(disk_uri, 0777) file_list += _walk_gnomevfs(entry_uri) else: if not gnomevfs.exists(local_uri): file_list += [disk_uri, ] print "uploading :", entry_uri inuri = gnomevfs.URI(entry_uri) outuri = gnomevfs.URI(local_uri) gnomevfs.xfer_uri(inuri, outuri, gnomevfs.XFER_DEFAULT, gnomevfs.XFER_ERROR_MODE_ABORT, gnomevfs.XFER_OVERWRITE_MODE_SKIP) os.chmod(disk_uri, 0777) return file_list class KrubData(object): """ An indexed object to access the data stored in the database. This object returns a list of file names pointing to data matching given criteria. It can be called with one or to indexing parameters: the first parameter is the hour indexes of the data, in the form "hhmmss", as an integer, with no leading zeros. The second indexing parameter is the data. If it is omitted it defaults to the current day. 
>>> krubdata[150833] ['/home/manip/data/061016/FORT_2b_15_08_33.h5'] >>> krubdata[150833,61016] ['/home/manip/data/061016/FORT_2b_15_08_33.h5'] Time indexes support slices: >>> krubdata[150700:150800,61016] ['/home/manipdata/061016/FORT_2b_15_07_04.h5', '/home/manip/data/061016/FORT_2b_15_07_19.h5', '/home/manip/data/061016/FORT_2b_15_07_34.h5', '/home/manip/data/061016/FORT_2b_15_07_48.h5'] >>> krubdata[150700:150800:2,61016] # Skip 1 out of 2 ['/home/manip/data/061016/FORT_2b_15_07_04.h5', '/home/manip/data/061016/FORT_2b_15_07_34.h5'] Both times and date can be called with negative integers. The indexes then refer to the nth last day, or shot: >>> krubdata[194900:,-1] # Data taken yesterday, after 19:49 ['/home/manip/data/061018/FORT_2b_19_49_05.h5'] >>> krubdata[-2:,] # Last 2 shots ['/home/manip/data/061018/FORT_2b_19_48_53.h5', '/home/manip/data/061018/FORT_2b_19_49_05.h5'] *see also:* query_db, build_param_table, and the doc for Krub.io WARNING : do not write 0 in front of the date : for 06.12.13 write 61213 and not 061213 """ def __getitem__(self, *args): """ Use the indexing to retrive the data. First set of index is the time in hhmmss. Leading zeros should be suppressed. """ # Only one index given, date is today: today = datetime.date.today() # I don't now why the args are passed in a tuple, if there is a # date argument. Lets get rid of this if isinstance(args[0], tuple) : args = args[0] # Parse the date argument. if len(args)==1: print "No date index given, defaulting to today." date = int(today.strftime('%y%m%d')) elif args[1]<0: date = today - datetime.timedelta(days=-args[1]) date = int(date.strftime('%y%m%d')) else: date = args[1] # Parse the time argument time_segment = None time_start = None time_stop = None time = None if not isinstance(args[0], slice): # If this is not a slice, it must be an int if args[0]<0: # Counting from the back. Make it a one-spaced slice, to # reuse our back-counting code. relative_time_start = args[0] relative_time_stop = args[0]+1 time_step = None time_segment = True else: time=args[0] if isinstance(args[0], slice): relative_time_start = None relative_time_stop = None time_step = None time_segment = args[0] if time_segment.start and time_segment.start<0: relative_time_start = time_segment.start elif time_segment.start: time_start = time_segment.start-1 else: time_start = time_segment.start if time_segment.stop and time_segment.stop<0: relative_time_stop = time_segment.stop elif time_segment.stop: time_stop = time_segment.stop+1 else: time_stop = time_segment.stop if time_segment.step: time_step = time_segment.step # Open the database dbase = shelve.open(db_file_name) out_list = [] for file_name, params in dbase.iteritems(): if not ('date' in params and params['date'] == date) : continue if not 'time' in params : continue if time and not params['time'] == time : continue if time_start and not params['time'] > time_start : continue if time_stop and not params['time'] < time_stop : continue out_list += [file_name, ] # Now deal with the relative times, and the step if time_segment: # We need to sort the list by time. 
get_time = lambda x: dbase[x]['time'] out_list.sort(key=get_time) out_list = out_list[ relative_time_start:relative_time_stop:time_step] dbase.close() return out_list krubdata = KrubData() From gnata at obs.univ-lyon1.fr Fri Jul 20 11:29:35 2007 From: gnata at obs.univ-lyon1.fr (Xavier Gnata) Date: Fri, 20 Jul 2007 17:29:35 +0200 Subject: [Numpy-discussion] Wrong lapack version detection (32/64bits) In-Reply-To: <46A09E00.9080009@cens.ioc.ee> References: <469F600A.2060801@obs.univ-lyon1.fr> <469F6514.6010002@obs.univ-lyon1.fr> <46A02118.50706@ar.media.kyoto-u.ac.jp> <46A08C55.2020409@obs.univ-lyon1.fr> <46A09E00.9080009@cens.ioc.ee> Message-ID: <46A0D4DF.60707@obs.univ-lyon1.fr> Pearu Peterson wrote: > Xavier Gnata wrote: > > >> I just would like to be able to tell numpy to use liblapack.so instead >> of this non free libmkl_lapack32.so >> > > For that set the following environment variable when building numpy/scipy: > > export MKL=None > > Hth, > Pearu > ok it works with export MKL=None but IMHO there is something wrong in the mkl detection code on i386. Xavier -- ############################################ Xavier Gnata CRAL - Observatoire de Lyon 9, avenue Charles Andr? 69561 Saint Genis Laval cedex Phone: +33 4 78 86 85 28 Fax: +33 4 78 86 83 86 E-mail: gnata at obs.univ-lyon1.fr ############################################ From faltet at carabos.com Fri Jul 20 11:53:09 2007 From: faltet at carabos.com (Francesc Altet) Date: Fri, 20 Jul 2007 17:53:09 +0200 Subject: [Numpy-discussion] Pickle, pytables, and sqlite - loading and saving recarray's In-Reply-To: References: Message-ID: <200707201753.10852.faltet@carabos.com> Vincent, A Divendres 20 Juliol 2007 15:35, Vincent Nijs escrigu?: > Still curious however ... does no one on this list use (and like) sqlite? First of all, while I'm not a heavy user of relational databases, I've used them as references for benchmarking purposes. Hence, based on my own benchmarking experience, I'd say that, for writing, relational databases do take a lot of safety measures to ensure that all the data that is written to the disk is safe and that the data relationships don't get broken, and that takes times (a lot of time, in fact). I'm not sure about whether some of these safety measures can be relaxed, but even though some relational databases would allow this, my feel (beware, I can be wrong) is that you won't be able to reach cPickle/PyTables speed (cPickle/PyTables are not observing security measures in that regard because they are not thought for these tasks). In this sense, the best writing speed that I was able to achieve with Postgres (I don't know whether sqlite support this) is by simulating that your data comes from a file stream and using the "cursor.copy_from()" method. Using this approach I was able to accelerate a 10x (if I remember well) the injecting speed, but even with this, PyTables can be another 10x faster. You can see an exemple of usage in the Postgres backend [1] used for doing the benchmarks for comparing PyTables and Postgres speeds. Regarding reading speed, my diggins [2] seems to indicate that the bottleneck here is not related with safety, but with the need of the relational databases pythonic APIs of wrapping *every* element retrieved out of the database with a Python container (int, float, string...). On the contrary, PyTables does take advantage of creating an empty recarray as the container to keep all the retrieved data, and that's very fast compared with the former approach. 
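For concreteness, the usual DB-API read pattern looks roughly like this (a sketch with invented table and column names); every cell passes through a Python object before the structured array is built, which is exactly the wrapping cost described above:

import sqlite3
import numpy as np

conn = sqlite3.connect('data.db')                 # illustrative file name
rows = conn.execute('SELECT x, y FROM data').fetchall()

# fetchall() hands back a list of tuples of Python floats/ints -- one object
# per cell -- and only afterwards is the recarray filled from them
arr = np.array(rows, dtype=[('x', 'f8'), ('y', 'f8')])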
To somewhat quantify this effect in function of the size of the dataset retrieved, you can see the figure 14 of [3] (as you can see, the larger the dataset retrieved, the larger the difference in terms of speed). Incidentally, and as it is said there, I'm hoping that NumPy containers should eventually be discovered by relational database wrappers makers, so these wrapping times would be removed completely, but I'm currently not aware of any package taking this approach. [1] http://www.pytables.org/trac/browser/trunk/bench/postgres_backend.py [2] http://thread.gmane.org/gmane.comp.python.numeric.general/9704 [3] http://www.carabos.com/docs/OPSI-indexes.pdf Cheers, -- >0,0< Francesc Altet ? ? http://www.carabos.com/ V V C?rabos Coop. V. ??Enjoy Data "-" From peridot.faceted at gmail.com Fri Jul 20 12:53:12 2007 From: peridot.faceted at gmail.com (Anne Archibald) Date: Fri, 20 Jul 2007 12:53:12 -0400 Subject: [Numpy-discussion] expm In-Reply-To: <46A0CBA9.3090606@iam.uni-stuttgart.de> References: <80c99e790707200746i11bec580ub93760d8d91e1f6d@mail.gmail.com> <46A0CBA9.3090606@iam.uni-stuttgart.de> Message-ID: On 20/07/07, Nils Wagner wrote: > lorenzo bolla wrote: > > hi all. > > is there a function in numpy to compute the exp of a matrix, similar > > to expm in matlab? > > for example: > > expm([[0,0],[0,0]]) = eye(2) > Numpy doesn't provide expm but scipy does. > >>> from scipy.linalg import expm, expm2, expm3 Just as a warning, numpy does provide expm1, but it does something different (exp(x)-1, computed directly). Anne From bioinformed at gmail.com Fri Jul 20 13:03:09 2007 From: bioinformed at gmail.com (Kevin Jacobs ) Date: Fri, 20 Jul 2007 13:03:09 -0400 Subject: [Numpy-discussion] expm In-Reply-To: References: <80c99e790707200746i11bec580ub93760d8d91e1f6d@mail.gmail.com> <46A0CBA9.3090606@iam.uni-stuttgart.de> Message-ID: <2e1434c10707201003j7afac61fw8de3785a817106d0@mail.gmail.com> On 7/20/07, Anne Archibald wrote: > > On 20/07/07, Nils Wagner wrote: > > lorenzo bolla wrote: > > > hi all. > > > is there a function in numpy to compute the exp of a matrix, similar > > > to expm in matlab? > > > for example: > > > expm([[0,0],[0,0]]) = eye(2) > > Numpy doesn't provide expm but scipy does. > > >>> from scipy.linalg import expm, expm2, expm3 > > Just as a warning, numpy does provide expm1, but it does something > different (exp(x)-1, computed directly). > On a separate note, I'm working to provide faster and more accurate versions of sqrtm and expm. The current versions do not take full advantage of LAPACK. 
Here are some preliminary benchmarks: Ill-conditioned ---------------- linalg.sqrtm : error=9.37e-27, 573.38 usec/pass sqrtm_svd : error=2.16e-28, 142.38 usec/pass sqrtm_eig : error=4.79e-27, 270.38 usec/pass sqrtm_symmetric: error=1.04e-27, 239.30 usec/pass sqrtm_symmetric2: error=2.73e-27, 190.03 usec/pass Well-conditioned ---------------- linalg.sqrtm : error=1.83e-29, 478.67 usec/pass sqrtm_svd : error=8.11e-30, 130.57 usec/pass sqrtm_eig : error=4.50e-30, 255.56 usec/pass sqrtm_symmetric: error=2.78e-30, 237.61 usec/pass sqrtm_symmetric2: error=3.35e-30, 167.27 usec/pass Large ---------------- linalg.sqrtm : error=5.95e-25, 8450081.68 usec/pass sqrtm_svd : error=1.64e-24, 151206.61 usec/pass sqrtm_eig : error=6.31e-24, 549837.40 usec/pass sqrtm_symmetric: error=8.55e-25, 177422.29 usec/pass where: def sqrtm_svd(x): u,s,vt = linalg.svd(x) return dot(u,transpose((s**0.5)*transpose(vt))) def sqrtm_eig(x): d,e = linalg.eig(x) d = (d**0.5).astype(float) return dot(e,transpose(d*e)) def sqrtm_symmetric(x,cond=1e-7): d,e = linalg.eigh(x) d[d
From nwagner at iam.uni-stuttgart.de Fri Jul 20 13:50:05 2007 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Fri, 20 Jul 2007 19:50:05 +0200 Subject: [Numpy-discussion] expm In-Reply-To: <2e1434c10707201003j7afac61fw8de3785a817106d0@mail.gmail.com> References: <80c99e790707200746i11bec580ub93760d8d91e1f6d@mail.gmail.com> <46A0CBA9.3090606@iam.uni-stuttgart.de> <2e1434c10707201003j7afac61fw8de3785a817106d0@mail.gmail.com> Message-ID: On Fri, 20 Jul 2007 13:03:09 -0400 "Kevin Jacobs " wrote: > On 7/20/07, Anne Archibald >wrote: >> >> On 20/07/07, Nils Wagner >>wrote: >> > lorenzo bolla wrote: >> > > hi all. >> > > is there a function in numpy to compute the exp of a >>matrix, similar >> > > to expm in matlab? >> > > for example: >> > > expm([[0,0],[0,0]]) = eye(2) >> > Numpy doesn't provide expm but scipy does. >> > >>> from scipy.linalg import expm, expm2, expm3 >> >> Just as a warning, numpy does provide expm1, but it does >>something >> different (exp(x)-1, computed directly). >> > > On a separate note, I'm working to provide faster and >more accurate versions > of sqrtm and expm. The current versions do not take >full advantage of > LAPACK.
Here are some preliminary benchmarks: > > Ill-conditioned > ---------------- > linalg.sqrtm : error=9.37e-27, 573.38 usec/pass > sqrtm_svd : error=2.16e-28, 142.38 usec/pass > sqrtm_eig : error=4.79e-27, 270.38 usec/pass > sqrtm_symmetric: error=1.04e-27, 239.30 usec/pass > sqrtm_symmetric2: error=2.73e-27, 190.03 usec/pass > > Well-conditioned > ---------------- > linalg.sqrtm : error=1.83e-29, 478.67 usec/pass > sqrtm_svd : error=8.11e-30, 130.57 usec/pass > sqrtm_eig : error=4.50e-30, 255.56 usec/pass > sqrtm_symmetric: error=2.78e-30, 237.61 usec/pass > sqrtm_symmetric2: error=3.35e-30, 167.27 usec/pass > > Large > ---------------- > linalg.sqrtm : error=5.95e-25, 8450081.68 usec/pass > sqrtm_svd : error=1.64e-24, 151206.61 usec/pass > sqrtm_eig : error=6.31e-24, 549837.40 usec/pass > sqrtm_symmetric: error=8.55e-25, 177422.29 usec/pass > > where: > > def sqrtm_svd(x): > u,s,vt = linalg.svd(x) > return dot(u,transpose((s**0.5)*transpose(vt))) > > def sqrtm_eig(x): > d,e = linalg.eig(x) > d = (d**0.5).astype(float) > return dot(e,transpose(d*e)) > > def sqrtm_symmetric(x,cond=1e-7): > d,e = linalg.eigh(x) > d[d return dot(e,transpose((d**0.5)*e)).astype(float) > > def sqrtm_symmetric2(x): > # Not as robust due to initial Cholesky step > l=linalg.cholesky(x,lower=1) > u,s,vt = linalg.svd(l) > return dot(u,transpose(s*u)) > > with SciPy linked against ACML. > > -Kevin Kevin, Your sqrtm_eig(x) function won't work if x is defective. See test_defective.py for details. Have you considered the algorithms proposed by Nick Higham for various matrix functions ? http://www.maths.manchester.ac.uk/~higham/pap-mf.html Nils -------------- next part -------------- A non-text attachment was scrubbed... Name: test_defective.py Type: text/x-python Size: 467 bytes Desc: not available URL: From Chris.Barker at noaa.gov Fri Jul 20 14:16:56 2007 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Fri, 20 Jul 2007 11:16:56 -0700 Subject: [Numpy-discussion] Pickle, pytables, and sqlite - loading and saving recarray's In-Reply-To: <200707201753.10852.faltet@carabos.com> References: <200707201753.10852.faltet@carabos.com> Message-ID: <46A0FC18.1000402@noaa.gov> Another small note: I'm pretty sure sqlite stores everything as strings. This just plain has to be slower than storing the raw binary representation (and may mean for slight differences in fp values on the round-trip). HDF is designed for this sort of thing, sqlite is not. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From faltet at carabos.com Fri Jul 20 14:30:42 2007 From: faltet at carabos.com (Francesc Altet) Date: Fri, 20 Jul 2007 20:30:42 +0200 Subject: [Numpy-discussion] Pickle, pytables, and sqlite - loading and saving recarray's In-Reply-To: <46A0FC18.1000402@noaa.gov> References: <200707201753.10852.faltet@carabos.com> <46A0FC18.1000402@noaa.gov> Message-ID: <200707202030.43675.faltet@carabos.com> A Divendres 20 Juliol 2007 20:16, Christopher Barker escrigu?: > Another small note: > > I'm pretty sure sqlite stores everything as strings. This just plain has > to be slower than storing the raw binary representation (and may mean > for slight differences in fp values on the round-trip). HDF is designed > for this sort of thing, sqlite is not. Yeah, that was the case with sqlite 2. 
However, starting with sqlite 3, developers provided the ability to store integer and real numbers in a more compact format [1]. Sqlite 3 is the version included in Python 2.5 (the python version that Vincent was benchmarking), so this shouldn't make a big difference compared with other relational databases. [1] http://www.sqlite.org/datatype3.html Cheers, -- >0,0< Francesc Altet ? ? http://www.carabos.com/ V V C?rabos Coop. V. ??Enjoy Data "-" From bioinformed at gmail.com Fri Jul 20 14:45:43 2007 From: bioinformed at gmail.com (Kevin Jacobs ) Date: Fri, 20 Jul 2007 14:45:43 -0400 Subject: [Numpy-discussion] expm In-Reply-To: References: <80c99e790707200746i11bec580ub93760d8d91e1f6d@mail.gmail.com> <46A0CBA9.3090606@iam.uni-stuttgart.de> <2e1434c10707201003j7afac61fw8de3785a817106d0@mail.gmail.com> Message-ID: <2e1434c10707201145h7a2fa78ci2b3bfa44983f34a8@mail.gmail.com> On 7/20/07, Nils Wagner wrote: > > Your sqrtm_eig(x) function won't work if x is defective. > See test_defective.py for details. I am aware, though at least on my system, the SVD-based method is by far the fastest and robust (and can be made more robust by the addition of a relative condition number threshold). The eig version was included mainly for comparison. Have you considered the algorithms proposed by > Nick Higham for various matrix functions ? > > http://www.maths.manchester.ac.uk/~higham/pap-mf.html Yep. He is one of my heros. The downside is that direct Python implementations of many of his algorithms will almost always be significantly slower than using algorithms that leave the heavy-lifting to LAPACK. This is certainly the case for the current sqrtm and expm code. Even in the Python domain, there is significant room to optimize the current sqrtm implementation, since one of the key inner loops can be trivially vectorized. I'll certainly give that a shot and add it to my next round of benchmarks. However, I'm not sure that I want to commit to going to the next level by developing and maintaining a C implementation of sqrtm. While it has the potential to achieve near-optimal performance for that method, we may be reaching the point of diminishing returns for my needs. I certainly won't complain if someone else is willing to do so. Thanks, -Kevin -------------- next part -------------- An HTML attachment was scrubbed... URL: From nwagner at iam.uni-stuttgart.de Fri Jul 20 15:12:07 2007 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Fri, 20 Jul 2007 21:12:07 +0200 Subject: [Numpy-discussion] expm In-Reply-To: <2e1434c10707201145h7a2fa78ci2b3bfa44983f34a8@mail.gmail.com> References: <80c99e790707200746i11bec580ub93760d8d91e1f6d@mail.gmail.com> <46A0CBA9.3090606@iam.uni-stuttgart.de> <2e1434c10707201003j7afac61fw8de3785a817106d0@mail.gmail.com> <2e1434c10707201145h7a2fa78ci2b3bfa44983f34a8@mail.gmail.com> Message-ID: On Fri, 20 Jul 2007 14:45:43 -0400 "Kevin Jacobs " wrote: > On 7/20/07, Nils Wagner >wrote: >> >> Your sqrtm_eig(x) function won't work if x is defective. >> See test_defective.py for details. > > > I am aware, though at least on my system, the SVD-based >method is by far the > fastest and robust (and can be made more robust by the >addition of a > relative condition number threshold). Your sqrtm_svd(x) method also returns a wrong result, if x is defective. Cheers, Nils The eig version >was included mainly > for comparison. 
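(The test_defective.py attachment was scrubbed by the archiver, so its exact contents are not recoverable; a minimal example of the kind of defective input being discussed is a 2x2 Jordan block, which has a repeated eigenvalue and no complete eigenvector basis:)

import numpy as np
from scipy.linalg import sqrtm

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])      # defective: eigenvalue 1 repeated, only one eigenvector

S = sqrtm(A)                    # the Schur-based sqrtm handles this case
print np.allclose(np.dot(S, S), A)
# The true square root is [[1, 0.5], [0, 1]]; eig- or SVD-based shortcuts
# cannot recover it because A is neither diagonalizable nor symmetric.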
> From bioinformed at gmail.com Fri Jul 20 15:12:28 2007 From: bioinformed at gmail.com (Kevin Jacobs ) Date: Fri, 20 Jul 2007 15:12:28 -0400 Subject: [Numpy-discussion] expm In-Reply-To: References: <80c99e790707200746i11bec580ub93760d8d91e1f6d@mail.gmail.com> <46A0CBA9.3090606@iam.uni-stuttgart.de> <2e1434c10707201003j7afac61fw8de3785a817106d0@mail.gmail.com> Message-ID: <2e1434c10707201212o74e2470s85c78cfb884db143@mail.gmail.com> On 7/20/07, Nils Wagner wrote: > > Your sqrtm_eig(x) function won't work if x is defective. > See test_defective.py for details. > I've added several defective matrices to my test cases and the SVD method doesn't work well as I'd thought (which is obvious in retrospect). I'm going to circle back and see what I can do to speed up Nick Higham's algorithm to the point where it is useful for me. Otherwise, my need is for a fast sqrtm for symmetric positive definite inputs, so I may still end up using the SVD approach and contributing a complementary sqrtm_nondefective back to scipy. -Kevin -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Jul 20 15:45:48 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 20 Jul 2007 13:45:48 -0600 Subject: [Numpy-discussion] expm In-Reply-To: <2e1434c10707201003j7afac61fw8de3785a817106d0@mail.gmail.com> References: <80c99e790707200746i11bec580ub93760d8d91e1f6d@mail.gmail.com> <46A0CBA9.3090606@iam.uni-stuttgart.de> <2e1434c10707201003j7afac61fw8de3785a817106d0@mail.gmail.com> Message-ID: On 7/20/07, Kevin Jacobs wrote: > > On 7/20/07, Anne Archibald wrote: > > > > On 20/07/07, Nils Wagner wrote: > > > lorenzo bolla wrote: > > > > hi all. > > > > is there a function in numpy to compute the exp of a matrix, similar > > > > > > to expm in matlab? > > > > for example: > > > > expm([[0,0],[0,0]]) = eye(2) > > > Numpy doesn't provide expm but scipy does. > > > >>> from scipy.linalg import expm, expm2, expm3 > > > > Just as a warning, numpy does provide expm1, but it does something > > different (exp(x)-1, computed directly). > > > > On a separate note, I'm working to provide faster and more accurate > versions of sqrtm and expm. The current versions do not take full advantage > of LAPACK. 
Here are some preliminary benchmarks: > > Ill-conditioned > ---------------- > linalg.sqrtm : error=9.37e-27, 573.38 usec/pass > sqrtm_svd : error=2.16e-28, 142.38 usec/pass > sqrtm_eig : error=4.79e-27, 270.38 usec/pass > sqrtm_symmetric: error= 1.04e-27, 239.30 usec/pass > sqrtm_symmetric2: error=2.73e-27, 190.03 usec/pass > > Well-conditioned > ---------------- > linalg.sqrtm : error=1.83e-29, 478.67 usec/pass > sqrtm_svd : error=8.11e-30, 130.57 usec/pass > sqrtm_eig : error=4.50e-30, 255.56 usec/pass > sqrtm_symmetric: error=2.78e-30, 237.61 usec/pass > sqrtm_symmetric2: error=3.35e-30, 167.27 usec/pass > > Large > ---------------- > linalg.sqrtm : error=5.95e-25 , 8450081.68 usec/pass > sqrtm_svd : error=1.64e-24, 151206.61 usec/pass > sqrtm_eig : error=6.31e-24, 549837.40 usec/pass > sqrtm_symmetric: error=8.55e-25, 177422.29 usec/pass > > where: > > def sqrtm_svd(x): > u,s,vt = linalg.svd(x) > return dot(u,transpose((s**0.5)*transpose(vt))) > > def sqrtm_eig(x): > d,e = linalg.eig(x) > d = (d**0.5).astype(float) > return dot(e,transpose(d*e)) > > def sqrtm_symmetric(x,cond=1e-7): > d,e = linalg.eigh(x) > d[d return dot(e,transpose((d**0.5)*e)).astype(float) > > def sqrtm_symmetric2(x): > # Not as robust due to initial Cholesky step > l=linalg.cholesky(x,lower=1) > u,s,vt = linalg.svd(l) > return dot(u,transpose(s*u)) > > with SciPy linked against ACML. I expect using sqrt(x) will be faster than x**.5. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.hochberg at ieee.org Fri Jul 20 16:07:42 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Fri, 20 Jul 2007 13:07:42 -0700 Subject: [Numpy-discussion] expm In-Reply-To: References: <80c99e790707200746i11bec580ub93760d8d91e1f6d@mail.gmail.com> <46A0CBA9.3090606@iam.uni-stuttgart.de> <2e1434c10707201003j7afac61fw8de3785a817106d0@mail.gmail.com> Message-ID: On 7/20/07, Charles R Harris wrote: [SNIP] > > > I expect using sqrt(x) will be faster than x**.5. > You might want to check that. I believe that x**0.5 is one of the magic special cases that is optimized to run fast (by calling sqrt in this case). IIRC the full set is [-1, 0, 0.5, 1, 2]. These were all of the ones that we believed could be implemented without loosing accuracy relative to using pow. -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From bioinformed at gmail.com Fri Jul 20 16:16:16 2007 From: bioinformed at gmail.com (Kevin Jacobs ) Date: Fri, 20 Jul 2007 16:16:16 -0400 Subject: [Numpy-discussion] expm In-Reply-To: References: <80c99e790707200746i11bec580ub93760d8d91e1f6d@mail.gmail.com> <46A0CBA9.3090606@iam.uni-stuttgart.de> <2e1434c10707201003j7afac61fw8de3785a817106d0@mail.gmail.com> Message-ID: <2e1434c10707201316y137d7814pc283f2c3467109ce@mail.gmail.com> On 7/20/07, Charles R Harris wrote: > > I expect using sqrt(x) will be faster than x**.5. > I did test this at one point and was also surprised that sqrt(x) seemed slower than **.5. However I found out otherwise while preparing a timeit script to demonstrate this observation. Unfortunately, I didn't save the precise script I used to explore this issue the first time. On my system for arrays with more than 2 elements, sqrt is indeed faster. For smaller arrays, the different is negligible, but inches out in favor of **0.5. Thanks, -Kevin -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bioinformed at gmail.com Fri Jul 20 16:20:29 2007 From: bioinformed at gmail.com (Kevin Jacobs ) Date: Fri, 20 Jul 2007 16:20:29 -0400 Subject: [Numpy-discussion] expm In-Reply-To: <2e1434c10707201316y137d7814pc283f2c3467109ce@mail.gmail.com> References: <80c99e790707200746i11bec580ub93760d8d91e1f6d@mail.gmail.com> <46A0CBA9.3090606@iam.uni-stuttgart.de> <2e1434c10707201003j7afac61fw8de3785a817106d0@mail.gmail.com> <2e1434c10707201316y137d7814pc283f2c3467109ce@mail.gmail.com> Message-ID: <2e1434c10707201320y3f5f1976x45c809a5a98c8f75@mail.gmail.com> On 7/20/07, Kevin Jacobs wrote: > > On 7/20/07, Charles R Harris wrote: > > > > I expect using sqrt(x) will be faster than x**.5. > > > > I did test this at one point and was also surprised that sqrt(x) seemed > slower than **.5. However I found out otherwise while preparing a timeit > script to demonstrate this observation. Unfortunately, I didn't save the > precise script I used to explore this issue the first time. On my system > for arrays with more than 2 elements, sqrt is indeed faster. For smaller > arrays, the different is negligible, but inches out in favor of ** 0.5. > This is just not my day. My observations above are valid for integer arrays, but not float arrays: sqrt(int array) : 6.98 usec/pass (int array)**0.5 : 22.75 usec/pass sqrt(float array) : 6.70 usec/pass (float array)**0.5: 4.66 usec/pass Generated by: import timeit n=100000 t=timeit.Timer(stmt='sqrt(arange(3))',setup='from numpy import arange,array,sqrt\nx=arange(100)') print 'sqrt(int array) : %5.2f usec/pass' % (1000000*t.timeit(number=n)/n) t=timeit.Timer(stmt='x**0.5',setup='from numpy import arange,array\nx=arange(100)') print '(int array)**0.5 : %5.2f usec/pass' % (1000000*t.timeit(number=n)/n) t=timeit.Timer(stmt='sqrt(arange(3))',setup='from numpy import arange,array,sqrt\nx=arange(100,dtype=float)') print 'sqrt(float array) : %5.2f usec/pass' % (1000000*t.timeit(number=n)/n) t=timeit.Timer(stmt='x**0.5',setup='from numpy import arange,array\nx=arange(100,dtype=float)') print '(float array)**0.5: %5.2f usec/pass' % (1000000*t.timeit(number=n)/n) -Kevin -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.hochberg at ieee.org Fri Jul 20 17:40:56 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Fri, 20 Jul 2007 14:40:56 -0700 Subject: [Numpy-discussion] expm In-Reply-To: <2e1434c10707201320y3f5f1976x45c809a5a98c8f75@mail.gmail.com> References: <80c99e790707200746i11bec580ub93760d8d91e1f6d@mail.gmail.com> <46A0CBA9.3090606@iam.uni-stuttgart.de> <2e1434c10707201003j7afac61fw8de3785a817106d0@mail.gmail.com> <2e1434c10707201316y137d7814pc283f2c3467109ce@mail.gmail.com> <2e1434c10707201320y3f5f1976x45c809a5a98c8f75@mail.gmail.com> Message-ID: On 7/20/07, Kevin Jacobs wrote: > > On 7/20/07, Kevin Jacobs > wrote: > > > > On 7/20/07, Charles R Harris < charlesr.harris at gmail.com> wrote: > > > > > > I expect using sqrt(x) will be faster than x**.5. > > > > > > > I did test this at one point and was also surprised that sqrt(x) seemed > > slower than **.5. However I found out otherwise while preparing a timeit > > script to demonstrate this observation. Unfortunately, I didn't save the > > precise script I used to explore this issue the first time. On my system > > for arrays with more than 2 elements, sqrt is indeed faster. For smaller > > arrays, the different is negligible, but inches out in favor of ** 0.5. > > > > > This is just not my day. 
My observations above are valid for integer > arrays, but not float arrays: > > sqrt(int array) : 6.98 usec/pass > (int array)**0.5 : 22.75 usec/pass > sqrt(float array) : 6.70 usec/pass > (float array)**0.5: 4.66 usec/pass > >From the source, it appears that powers [-1, 0, 0.5, 1, 2] are optimized for float and complex types, while one power, 2, is optimized for other types. I can't recall why that is however. Generated by: > > import timeit > > n=100000 > > t=timeit.Timer(stmt='sqrt(arange(3))',setup='from numpy import > arange,array,sqrt\nx=arange(100)') > print 'sqrt(int array) : %5.2f usec/pass' % (1000000*t.timeit > (number=n)/n) > > t=timeit.Timer(stmt='x**0.5',setup='from numpy import > arange,array\nx=arange(100)') > print '(int array)** 0.5 : %5.2f usec/pass' % (1000000*t.timeit > (number=n)/n) > > t=timeit.Timer(stmt='sqrt(arange(3))',setup='from numpy import > arange,array,sqrt\nx=arange(100,dtype=float)') > print 'sqrt(float array) : %5.2f usec/pass' % (1000000* t.timeit > (number=n)/n) > > t=timeit.Timer(stmt='x**0.5',setup='from numpy import > arange,array\nx=arange(100,dtype=float)') > print '(float array)**0.5: %5.2f usec/pass' % (1000000*t.timeit(number=n)/n) > > > -Kevin > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From dalcinl at gmail.com Fri Jul 20 18:54:30 2007 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 20 Jul 2007 19:54:30 -0300 Subject: [Numpy-discussion] array(1) not fortran contiguous? Message-ID: Is the following inteded? Why array(1) is not fortran contiguous? In [1]: from numpy import * In [2]: __version__ Out[2]: '1.0.3' In [3]: array(1).flags Out[3]: C_CONTIGUOUS : True F_CONTIGUOUS : False OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From v-nijs at kellogg.northwestern.edu Sun Jul 22 11:21:18 2007 From: v-nijs at kellogg.northwestern.edu (Vincent Nijs) Date: Sun, 22 Jul 2007 10:21:18 -0500 Subject: [Numpy-discussion] Pickle, pytables, and sqlite - loading and saving recarray's In-Reply-To: <200707201753.10852.faltet@carabos.com> Message-ID: FYI I asked a question about the load and save speed of recarray's using pickle vs pysqlite on the pysqlite list and got the response linked below. Doesn't look like sqlite can do much better than what I found. http://lists.initd.org/pipermail/pysqlite/2007-July/001085.html I also passed on Francesc's idea to use numpy containers in relational database wrappers such as pysqlite. This is apparently not possible since "in a "relational database you don't know the type of the values in advance. Some values might be NULL" and "and you might even have different types for the same column" http://lists.initd.org/pipermail/pysqlite/2007-July/001087.html I would assume the NULL's could be treated as missing values (?) Don't know about the different types in one column however. 
Vincent On 7/20/07 10:53 AM, "Francesc Altet" wrote: > Vincent, > > A Divendres 20 Juliol 2007 15:35, Vincent Nijs escrigu?: >> Still curious however ... does no one on this list use (and like) sqlite? > > First of all, while I'm not a heavy user of relational databases, I've used > them as references for benchmarking purposes. Hence, based on my own > benchmarking experience, I'd say that, for writing, relational databases do > take a lot of safety measures to ensure that all the data that is written to > the disk is safe and that the data relationships don't get broken, and that > takes times (a lot of time, in fact). I'm not sure about whether some of > these safety measures can be relaxed, but even though some relational > databases would allow this, my feel (beware, I can be wrong) is that you > won't be able to reach cPickle/PyTables speed (cPickle/PyTables are not > observing security measures in that regard because they are not thought for > these tasks). > > In this sense, the best writing speed that I was able to achieve with > Postgres (I don't know whether sqlite support this) is by simulating that > your data comes from a file stream and using the "cursor.copy_from()" method. > Using this approach I was able to accelerate a 10x (if I remember well) the > injecting speed, but even with this, PyTables can be another 10x faster. You > can see an exemple of usage in the Postgres backend [1] used for doing the > benchmarks for comparing PyTables and Postgres speeds. > > Regarding reading speed, my diggins [2] seems to indicate that the bottleneck > here is not related with safety, but with the need of the relational > databases pythonic APIs of wrapping *every* element retrieved out of the > database with a Python container (int, float, string...). On the contrary, > PyTables does take advantage of creating an empty recarray as the container > to keep all the retrieved data, and that's very fast compared with the former > approach. To somewhat quantify this effect in function of the size of the > dataset retrieved, you can see the figure 14 of [3] (as you can see, the > larger the dataset retrieved, the larger the difference in terms of speed). > Incidentally, and as it is said there, I'm hoping that NumPy containers > should eventually be discovered by relational database wrappers makers, so > these wrapping times would be removed completely, but I'm currently not aware > of any package taking this approach. > > [1] http://www.pytables.org/trac/browser/trunk/bench/postgres_backend.py > [2] http://thread.gmane.org/gmane.comp.python.numeric.general/9704 > [3] http://www.carabos.com/docs/OPSI-indexes.pdf > > Cheers, From lfriedri at imtek.de Mon Jul 23 03:57:45 2007 From: lfriedri at imtek.de (Lars Friedrich) Date: Mon, 23 Jul 2007 09:57:45 +0200 Subject: [Numpy-discussion] tofile speed Message-ID: <46A45F79.4020401@imtek.de> Hello everyone, I am using array.tofile successfully for a data-acqusition-streaming application. I mean that I do the following: for a long time: temp = dataAcquisisionDevice.getData() temp.tofile(myDataFile) temp is a numpy array that is used for storing the data temporarily. The data acquisition device is acquiring continuously and writing the data to a buffer from which I can read with .getData(). This works fine, but of course, when I turn the sample rate higher, there is a point when temp.toFile is too slow. The dataAcquisitionDevice's buffer will run full before I can fetch the data again. 
(temp has a size of ~Mbyte, and the for loop has a period of ~0.5 seconds so that increasing the chunk size won't help) I have no idea how efficient array.tofile() is. Maybe it is terribly efficient and what I see is just the limitation of my hardware (harddisk). Currently I can stream with roughly 4 Mbyte/s, which is quite fast, I guess. However, if anyone can point me to a way to write my data to harddisk faster, I would be very happy! Thanks Lars -- Dipl.-Ing. Lars Friedrich Photonic Measurement Technology Department of Microsystems Engineering -- IMTEK University of Freiburg Georges-K?hler-Allee 102 D-79110 Freiburg Germany phone: +49-761-203-7531 fax: +49-761-203-7537 room: 01 088 email: lfriedri at imtek.de From haase at msg.ucsf.edu Mon Jul 23 04:09:03 2007 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Mon, 23 Jul 2007 10:09:03 +0200 Subject: [Numpy-discussion] tofile speed In-Reply-To: <46A45F79.4020401@imtek.de> References: <46A45F79.4020401@imtek.de> Message-ID: Just a guess out of my hat: there might be a buffer class in the standard python library... I'm thinking of a class that implements file-I/O and collects input up to a maximum buffer size before it copies the same byte stream to it's output. Since I/O is more efficient if larger chunks are written this should improve the overall performance. How large are your data-chunks per write ? (IOW: what is len(temp.data)) HTH, Sebastian Haase On 7/23/07, Lars Friedrich wrote: > Hello everyone, > > I am using array.tofile successfully for a data-acqusition-streaming > application. I mean that I do the following: > > for a long time: > temp = dataAcquisisionDevice.getData() > temp.tofile(myDataFile) > > temp is a numpy array that is used for storing the data temporarily. The > data acquisition device is acquiring continuously and writing the data > to a buffer from which I can read with .getData(). This works fine, but > of course, when I turn the sample rate higher, there is a point when > temp.toFile is too slow. The dataAcquisitionDevice's buffer will run > full before I can fetch the data again. > > (temp has a size of ~Mbyte, and the for loop has a period of ~0.5 > seconds so that increasing the chunk size won't help) > > I have no idea how efficient array.tofile() is. Maybe it is terribly > efficient and what I see is just the limitation of my hardware > (harddisk). Currently I can stream with roughly 4 Mbyte/s, which is > quite fast, I guess. However, if anyone can point me to a way to write > my data to harddisk faster, I would be very happy! > > Thanks > > Lars > > > -- > Dipl.-Ing. Lars Friedrich > > Photonic Measurement Technology > Department of Microsystems Engineering -- IMTEK > University of Freiburg > Georges-K?hler-Allee 102 > D-79110 Freiburg > Germany > > phone: +49-761-203-7531 > fax: +49-761-203-7537 > room: 01 088 > email: lfriedri at imtek.de > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From ivilata at carabos.com Mon Jul 23 02:47:37 2007 From: ivilata at carabos.com (Ivan Vilata i Balaguer) Date: Mon, 23 Jul 2007 08:47:37 +0200 Subject: [Numpy-discussion] Pickle, pytables, and sqlite - loading and saving recarray's In-Reply-To: References: <200707201753.10852.faltet@carabos.com> Message-ID: <20070723064737.GA19445@tardis.terramar.selidor.net> Vincent Nijs (el 2007-07-22 a les 10:21:18 -0500) va dir:: > [...] 
> I would assume the NULL's could be treated as missing values (?) Don't know > about the different types in one column however. Maybe a masked array would do the trick, with NULL values masked out. :: Ivan Vilata i Balaguer >qo< http://www.carabos.com/ C?rabos Coop. V. V V Enjoy Data "" -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 307 bytes Desc: Digital signature URL: From timmortimer at d2.net.au Mon Jul 23 06:50:24 2007 From: timmortimer at d2.net.au (Tim Mortimer) Date: Mon, 23 Jul 2007 20:20:24 +0930 Subject: [Numpy-discussion] getting numPy happening for sciPy Message-ID: <46A487F0.7070404@d2.net.au> Hello There, I'm pretty new to Python, picking it up recently in order to begin to experiment with Csound / Python interconnectivity (generating scores, GUI elements, using Vpython to model mechanical "systems"... that sort of thing...) Have so far written about 500 lines of Python code (over the last few weeks) to do various housekeeping of of my Csound .sco's & beginning to play with some Python / Csound score generative stuff. Sowhen it comes to Python i'm a novice, but with some grasp of the basics. Anyway, I wanted to "beef up" my python arsenal with some of the SciPY stuff - initially a wider & more solid range of random number generators, histograms & statistical packages etc. So it is with some regret that i see at present that it is not possible to build SciPy on top of the standard "out the box" NumPy installation? I am not an experienced programmer, so the idea of building NumPy from the "bleeding edge" repository is beyond my capability, as there appears to be no specific instructions for how to do this (that don't assume you have some degree of experience at what your doing anyway.. ) So I guess my question is 1) can i get an idiots guide to what's required to get the current NumPy installation happening in order to host SciPy on top of it? 2) if 1) involves a whole lot of faffing about & acquiring a whole heap of new skill set (that only really has limited long term relevance to me, & isn't likely to stick in my head very long anyway) - how long do I have to wait before i can just run a new NumPy installer, followed by a SciPy installer, & get the full kit & kaboodle happening? Regrettably, I'm on Win XP. Thanks for any help or advice. Can't imaging i'll be having too much of a contribution to this community, but will be interested to see what sort of topics float through the NumPY world.... with thanks Tim - Adelaide, Australia. From stefan at sun.ac.za Mon Jul 23 08:50:00 2007 From: stefan at sun.ac.za (Stefan van der Walt) Date: Mon, 23 Jul 2007 14:50:00 +0200 Subject: [Numpy-discussion] getting numPy happening for sciPy In-Reply-To: <46A487F0.7070404@d2.net.au> References: <46A487F0.7070404@d2.net.au> Message-ID: <20070723125000.GQ7290@mentat.za.net> Hi Tim On Mon, Jul 23, 2007 at 08:20:24PM +0930, Tim Mortimer wrote: > I am not an experienced programmer, so the idea of building NumPy from > the "bleeding edge" repository is beyond my capability, as there appears > to be no specific instructions for how to do this (that don't assume you > have some degree of experience at what your doing anyway.. ) > > So I guess my question is > > 1) can i get an idiots guide to what's required to get the current NumPy > installation happening in order to host SciPy on top of it? 
One way is to use Enthoughts egg installer: http://code.enthought.com/enstaller/ That way, you won't have linear algebra routines optimised specifically for your platform, but you'll have a fully functional numpy, scipy (and optionally matplotlib etc.) installation. Regards St?fan From steve at shrogers.com Mon Jul 23 08:54:37 2007 From: steve at shrogers.com (Steven H. Rogers) Date: Mon, 23 Jul 2007 06:54:37 -0600 Subject: [Numpy-discussion] getting numPy happening for sciPy In-Reply-To: <46A487F0.7070404@d2.net.au> References: <46A487F0.7070404@d2.net.au> Message-ID: <46A4A50D.80503@shrogers.com> Tim Mortimer wrote: > Anyway, I wanted to "beef up" my python arsenal with some of the SciPY > stuff - initially a wider & more solid range of random number > generators, histograms & statistical packages etc. > > So it is with some regret that i see at present that it is not possible > to build SciPy on top of the standard "out the box" NumPy installation? > > I am not an experienced programmer, so the idea of building NumPy from > the "bleeding edge" repository is beyond my capability, as there appears > to be no specific instructions for how to do this (that don't assume you > have some degree of experience at what your doing anyway.. ) > > So I guess my question is > > 1) can i get an idiots guide to what's required to get the current NumPy > installation happening in order to host SciPy on top of it? > > 2) if 1) involves a whole lot of faffing about & acquiring a whole heap > of new skill set (that only really has limited long term relevance to > me, & isn't likely to stick in my head very long anyway) - how long do I > have to wait before i can just run a new NumPy installer, followed by a > SciPy installer, & get the full kit & kaboodle happening? > > Regrettably, I'm on Win XP. > G'day Tim: I don't know of any simple build instructions for Windows, but if you're patient, there will probably be updated packaged releases of SciPy + NumPy that play well together "real soon now". Regards, Steve From seandavi at gmail.com Mon Jul 23 09:00:55 2007 From: seandavi at gmail.com (Sean Davis) Date: Mon, 23 Jul 2007 09:00:55 -0400 Subject: [Numpy-discussion] Combining record arrays Message-ID: <264855a00707230600k17677468h96acd62a0b7530@mail.gmail.com> I am a relatively new numpy user, so sorry for the relatively simple questions. I hope someone can help clarify a couple of things for me. 1) What is the difference between recarrays and arrays created using N.rec.array? If they are different, why use one over the other? 2) How can I add columns to an established recarray (or N.rec.array)? How about rows? Thanks, Sean -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Mon Jul 23 12:02:41 2007 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 23 Jul 2007 11:02:41 -0500 Subject: [Numpy-discussion] getting numPy happening for sciPy In-Reply-To: <46A4A50D.80503@shrogers.com> References: <46A487F0.7070404@d2.net.au> <46A4A50D.80503@shrogers.com> Message-ID: <46A4D121.9040603@gmail.com> Steven H. Rogers wrote: > I don't know of any simple build instructions for Windows, but if you're > patient, there will probably be updated packaged releases of SciPy + > NumPy that play well together "real soon now". We'll need a volunteer release manager for that, or it won't happen. Most of the principals are very busy right now. 
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From schaffer at optonline.net Mon Jul 23 12:05:55 2007 From: schaffer at optonline.net (Les Schaffer) Date: Mon, 23 Jul 2007 12:05:55 -0400 Subject: [Numpy-discussion] getting numPy happening for sciPy In-Reply-To: <46A4D121.9040603@gmail.com> References: <46A487F0.7070404@d2.net.au> <46A4A50D.80503@shrogers.com> <46A4D121.9040603@gmail.com> Message-ID: <46A4D1E3.4040501@optonline.net> Robert Kern wrote: > We'll need a volunteer release manager for that, or it won't happen. Most of the > principals are very busy right now. will it compile with Visual C++ 2005 Express? if so, i'd give it a try. Les Schaffer From charlesr.harris at gmail.com Mon Jul 23 12:13:36 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 23 Jul 2007 10:13:36 -0600 Subject: [Numpy-discussion] tofile speed In-Reply-To: <46A45F79.4020401@imtek.de> References: <46A45F79.4020401@imtek.de> Message-ID: On 7/23/07, Lars Friedrich wrote: > > Hello everyone, > > I am using array.tofile successfully for a data-acqusition-streaming > application. I mean that I do the following: > > for a long time: > temp = dataAcquisisionDevice.getData() > temp.tofile(myDataFile) > > temp is a numpy array that is used for storing the data temporarily. The > data acquisition device is acquiring continuously and writing the data > to a buffer from which I can read with .getData(). This works fine, but > of course, when I turn the sample rate higher, there is a point when > temp.toFile is too slow. The dataAcquisitionDevice's buffer will run > full before I can fetch the data again. > > (temp has a size of ~Mbyte, and the for loop has a period of ~0.5 > seconds so that increasing the chunk size won't help) > > I have no idea how efficient array.tofile() is. Maybe it is terribly > efficient and what I see is just the limitation of my hardware > (harddisk). Currently I can stream with roughly 4 Mbyte/s, which is > quite fast, I guess. However, if anyone can point me to a way to write > my data to harddisk faster, I would be very happy! 4 MB/s is extremely slow, these days most drives will do better than 50 MB/s during sustained writes. Raid-0 will about double that rate if you aren't terribly worried about drive failure. What operating system and hardware are you using? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthieu.brucher at gmail.com Mon Jul 23 12:54:58 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Mon, 23 Jul 2007 18:54:58 +0200 Subject: [Numpy-discussion] getting numPy happening for sciPy In-Reply-To: <46A4D1E3.4040501@optonline.net> References: <46A487F0.7070404@d2.net.au> <46A4A50D.80503@shrogers.com> <46A4D121.9040603@gmail.com> <46A4D1E3.4040501@optonline.net> Message-ID: 2007/7/23, Les Schaffer : > > Robert Kern wrote: > > We'll need a volunteer release manager for that, or it won't happen. > Most of the > > principals are very busy right now. > > will it compile with Visual C++ 2005 Express? if so, i'd give it a try. > Windows binaries must be compiled with the same compiler as Python, so it is (sadly IMHO) Visual 2003. Well in fact, I could install one if needed (I have a licence) Matthieu -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tmbdev at gmail.com Mon Jul 23 22:03:33 2007 From: tmbdev at gmail.com (Thomas Breuel) Date: Tue, 24 Jul 2007 04:03:33 +0200 Subject: [Numpy-discussion] output arguments In-Reply-To: <7e51d15d0707231741k6412e198xd4192470d2175975@mail.gmail.com> References: <7e51d15d0707231741k6412e198xd4192470d2175975@mail.gmail.com> Message-ID: <7e51d15d0707231903u1a991a48wf55e1192f10cafce@mail.gmail.com> Hi, core NumPy doesn't seem to support a lot of output arguments, or common composite operations. For example, a common operation is something like a = outer(b,c) or a += outer(b,c) There are some workarounds, but they aren't pretty. Consistently providing output arguments throughout NumPy would help; is there any reason this isn't being done? For example, it would be nice if "outer" supported: outer(b,c,output=a) outer(b,c,increment=a) outer(b,c,increment=a,scale=eps) or maybe one could specify an accumulation ufunc, with addition, multiplication, min, and max being fast, and with an optional scale parameter. Another approach might be to provide, in addition to the convenient high-level NumPy operations, direct bindings for BLAS and/or similar libraries, with Fortran-like procedural interfaces, but I can't find any such libraries in NumPy or SciPy. Am I missing something? It seems like writing native code to speed up these kinds of operations isn't really so great because (1) it unnecessarily complicates development and packaging, and (2) is likely to perform worse than BLAS and similar libraries. How are other people handling these cases? Thanks, Tom -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Mon Jul 23 23:50:47 2007 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 23 Jul 2007 22:50:47 -0500 Subject: [Numpy-discussion] output arguments In-Reply-To: <7e51d15d0707231903u1a991a48wf55e1192f10cafce@mail.gmail.com> References: <7e51d15d0707231741k6412e198xd4192470d2175975@mail.gmail.com> <7e51d15d0707231903u1a991a48wf55e1192f10cafce@mail.gmail.com> Message-ID: <46A57717.7060300@gmail.com> Thomas Breuel wrote: > Hi, > > core NumPy doesn't seem to support a lot of output arguments, or common > composite operations. For example, a common operation is something like > > a = outer(b,c) > > or > > a += outer(b,c) > > There are some workarounds, but they aren't pretty. Consistently > providing output arguments throughout NumPy would help; is there any > reason this isn't being done? Time. > For example, it would be nice if "outer" > supported: > > outer(b,c,output=a) > outer(b,c,increment=a) > outer(b,c,increment=a,scale=eps) > > or maybe one could specify an accumulation ufunc, with addition, > multiplication, min, and max being fast, and with an optional scale > parameter. What would the increment and scale parameters do? Why should they be part of the outer() interface? Have you looked at the .outer() method on the add, multiply, minimum, and maximum ufuncs? > Another approach might be to provide, in addition to the convenient > high-level NumPy operations, direct bindings for BLAS and/or similar > libraries, with Fortran-like procedural interfaces, but I can't find any > such libraries in NumPy or SciPy. Am I missing something? scipy.linalg.flapack, etc. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From lfriedri at imtek.de Tue Jul 24 02:16:04 2007 From: lfriedri at imtek.de (Lars Friedrich) Date: Tue, 24 Jul 2007 08:16:04 +0200 Subject: [Numpy-discussion] tofile speed Message-ID: <46A59924.9050906@imtek.de> Hello everyone, thank you for the replies. Sebastian, the chunk size is roughly 4*10^6 samples, with two byte per sample, this is about 8MB. I can vary this size, but increasing it only helps for much smaller values. For example, when I use a size of 100 Samples, I am much too slow. It gets better for 1000 Samples, 10000 Samples and so on. But since I already reached a chunksize in the region of megabytes, I have difficulties to increase my buffer size further. I also have the feeling that increasing does not help in this size region. (correct me if I am wrong...) Chuck, I am using a Windows XP system with a new (few months old) Maxtor SATA-drive. Lars From haase at msg.ucsf.edu Tue Jul 24 04:16:42 2007 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Tue, 24 Jul 2007 10:16:42 +0200 Subject: [Numpy-discussion] tofile speed In-Reply-To: <46A59924.9050906@imtek.de> References: <46A59924.9050906@imtek.de> Message-ID: So you are saying that a given tofile() call returns only after 2 seconds !? Can you measure the getData() call time (just comment the tofile out for a while- I that doesn't use 100% CPU ..) ? (timeit module is needed - I think) Maybe multithreading might help - so that tofile() and GetData() can overlap. But 2 sec is really slow .... -S. On 7/24/07, Lars Friedrich wrote: > Hello everyone, > > thank you for the replies. > > Sebastian, the chunk size is roughly 4*10^6 samples, with two byte per > sample, this is about 8MB. I can vary this size, but increasing it only > helps for much smaller values. For example, when I use a size of 100 > Samples, I am much too slow. It gets better for 1000 Samples, 10000 > Samples and so on. But since I already reached a chunksize in the region > of megabytes, I have difficulties to increase my buffer size further. I > also have the feeling that increasing does not help in this size region. > (correct me if I am wrong...) > > Chuck, I am using a Windows XP system with a new (few months old) Maxtor > SATA-drive. > > Lars > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From haase at msg.ucsf.edu Tue Jul 24 04:17:51 2007 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Tue, 24 Jul 2007 10:17:51 +0200 Subject: [Numpy-discussion] tofile speed In-Reply-To: References: <46A59924.9050906@imtek.de> Message-ID: Your are not generating text files - right? On 7/24/07, Sebastian Haase wrote: > So you are saying that a given tofile() call returns only after 2 seconds !? > Can you measure the getData() call time (just comment the tofile out > for a while- I that doesn't use 100% CPU ..) ? (timeit module is > needed - I think) > Maybe multithreading might help - so that tofile() and GetData() can overlap. > But 2 sec is really slow .... > > -S. > > > On 7/24/07, Lars Friedrich wrote: > > Hello everyone, > > > > thank you for the replies. > > > > Sebastian, the chunk size is roughly 4*10^6 samples, with two byte per > > sample, this is about 8MB. I can vary this size, but increasing it only > > helps for much smaller values. For example, when I use a size of 100 > > Samples, I am much too slow. It gets better for 1000 Samples, 10000 > > Samples and so on. 
But since I already reached a chunksize in the region > > of megabytes, I have difficulties to increase my buffer size further. I > > also have the feeling that increasing does not help in this size region. > > (correct me if I am wrong...) > > > > Chuck, I am using a Windows XP system with a new (few months old) Maxtor > > SATA-drive. > > > > Lars > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > From timmortimer at d2.net.au Tue Jul 24 06:03:17 2007 From: timmortimer at d2.net.au (Tim Mortimer) Date: Tue, 24 Jul 2007 19:33:17 +0930 Subject: [Numpy-discussion] getting numPy happening for sciPy Message-ID: <46A5CE65.7070908@d2.net.au> Thanks for the responses, http://code.enthought.com/enstaller/ I'll probably skip this. Although it looks like a useful tool, the computer / Python 2.5 installation i'm "managing" isn't on the internet, so any synchronisation or what not probably won't be possible there. ( i think that's what that was all about...) Happy to wait for a new NumPy package, plenty to keep me busy till then. Will watch this list for announcements. thanks to you all. From steve at shrogers.com Tue Jul 24 09:29:01 2007 From: steve at shrogers.com (Steven H. Rogers) Date: Tue, 24 Jul 2007 07:29:01 -0600 Subject: [Numpy-discussion] getting numPy happening for sciPy In-Reply-To: <46A4D121.9040603@gmail.com> References: <46A487F0.7070404@d2.net.au> <46A4A50D.80503@shrogers.com> <46A4D121.9040603@gmail.com> Message-ID: <46A5FE9D.7050205@shrogers.com> Robert Kern wrote: > Steven H. Rogers wrote: > > >> I don't know of any simple build instructions for Windows, but if you're >> patient, there will probably be updated packaged releases of SciPy + >> NumPy that play well together "real soon now". >> > > We'll need a volunteer release manager for that, or it won't happen. Most of the > principals are very busy right now. > I thought I saw signs that something would be happening soon. What would be the scope of this release manager's responsibilities? # Steve From zyzhu2000 at gmail.com Tue Jul 24 14:39:25 2007 From: zyzhu2000 at gmail.com (Geoffrey Zhu) Date: Tue, 24 Jul 2007 13:39:25 -0500 Subject: [Numpy-discussion] Compile extension modules with Visual Studio 2005 Message-ID: Hi, I am about to write a C extension module. C functions in the module will take and return numpy arrays. I found a tutorial online, but I am not sure about the following: 1. Can I compile my extension with Visual Studio 2005? My impression is that I will have to link with numpy libraries, and, if numpy was compiled with a different compiler, I might have problems. However, if numpy is a DLL, maybe there is a way that I can build a LIB file based on the DLL and link with the LIB file. Does anyone have experience in doing this? 2. I am new to writing python extensions. The tutorial is doing things by hand. Does anyone know what is the best way to do this? How about SWIG? Thanks, Geoffrey -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From robert.kern at gmail.com Tue Jul 24 14:48:58 2007 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 24 Jul 2007 13:48:58 -0500 Subject: [Numpy-discussion] getting numPy happening for sciPy In-Reply-To: <46A5FE9D.7050205@shrogers.com> References: <46A487F0.7070404@d2.net.au> <46A4A50D.80503@shrogers.com> <46A4D121.9040603@gmail.com> <46A5FE9D.7050205@shrogers.com> Message-ID: <46A6499A.2020608@gmail.com> Steven H. Rogers wrote: > Robert Kern wrote: >> Steven H. Rogers wrote: >> >>> I don't know of any simple build instructions for Windows, but if you're >>> patient, there will probably be updated packaged releases of SciPy + >>> NumPy that play well together "real soon now". >>> >> We'll need a volunteer release manager for that, or it won't happen. Most of the >> principals are very busy right now. >> > I thought I saw signs that something would be happening soon. Real life got in the way. > What > would be the scope of this release manager's responsibilities? 1) Make a numpy 1.0.3.1 point release. Some changes in 1.0.3-2 which should have only made changes to the files included in the numpy source tarball also seem to impair building scipy. We would need to branch from the 1.0.3 tag to fix this. I don't think the changes to numpy.distutils in the trunk have been vetted enough for making a numpy 1.0.4 release. Here is some information on the mechanics of this: http://projects.scipy.org/scipy/numpy/wiki/MakingReleases 2) Make a scipy 0.5.3 release. Deciding what goes in is mostly a matter of announcing a cutoff date to make sure people don't have half of a refactoring in. 3) The release manager will need to make sure that the releases build and run their test suite on at least the Big Three: Windows, some kind of Linux, and Mac OS X. Usually, they will need to delegate for some of these platforms. The binary builds for Windows should be provided for download along with the source. 4) Binaries: Windows binaries are pretty much necessary. If possible, try to use an ATLAS library that does *not* use SSE2 instructions. We have had problems with people getting segfaults on older hardware. scipy binaries should *not* include FFTW or UMFPACK since they are GPLed. I can help with OS X binaries now that I've finally figured out how to get scipy to link statically against the gfortran runtime. 5) The tarballs and binaries should be uploaded to the Sourceforge site. The Cheeseshop records should be updated to record the new versions. An announcement should be made to python-announce and the relevant mailing lists. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Tue Jul 24 14:55:16 2007 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 24 Jul 2007 13:55:16 -0500 Subject: [Numpy-discussion] Compile extension modules with Visual Studio 2005 In-Reply-To: References: Message-ID: <46A64B14.4060007@gmail.com> Geoffrey Zhu wrote: > Hi, > > I am about to write a C extension module. C functions in the module will > take and return numpy arrays. I found a tutorial online, but I am not > sure about the following: > > 1. Can I compile my extension with Visual Studio 2005? My impression is > that I will have to link with numpy libraries, and, if numpy was > compiled with a different compiler, I might have problems. 
However, if > numpy is a DLL, maybe there is a way that I can build a LIB file based > on the DLL and link with the LIB file. Does anyone have experience in > doing this? numpy isn't the issue. The main Windows Python distribution requires Visual Studio 2003 for building extensions. One can do it with mingw, though, with care. > 2. I am new to writing python extensions. The tutorial is doing things > by hand. Does anyone know what is the best way to do this? How about SWIG? There isn't a single best way. They all have tradeoffs. It might be easiest for you to actually just write your C functions into a DLL without referencing numpy or Python at all and call those functions using ctypes. That avoids needing a specific compiler and is a pretty handy tool to learn. If you do want to write an extension, I think I might suggest starting with writing one by hand. It helps with the other techniques to know what's going on underneath. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From zyzhu2000 at gmail.com Tue Jul 24 15:01:14 2007 From: zyzhu2000 at gmail.com (Geoffrey Zhu) Date: Tue, 24 Jul 2007 14:01:14 -0500 Subject: [Numpy-discussion] Compile extension modules with Visual Studio 2005 In-Reply-To: <46A64B14.4060007@gmail.com> References: <46A64B14.4060007@gmail.com> Message-ID: Hi Robert, On 7/24/07, Robert Kern wrote: > Geoffrey Zhu wrote: > > Hi, > > > > I am about to write a C extension module. C functions in the module will > > take and return numpy arrays. I found a tutorial online, but I am not > > sure about the following: > > > > 1. Can I compile my extension with Visual Studio 2005? My impression is > > that I will have to link with numpy libraries, and, if numpy was > > compiled with a different compiler, I might have problems. However, if > > numpy is a DLL, maybe there is a way that I can build a LIB file based > > on the DLL and link with the LIB file. Does anyone have experience in > > doing this? > > numpy isn't the issue. The main Windows Python distribution requires Visual > Studio 2003 for building extensions. One can do it with mingw, though, with care. > > > 2. I am new to writing python extensions. The tutorial is doing things > > by hand. Does anyone know what is the best way to do this? How about SWIG? > > There isn't a single best way. They all have tradeoffs. It might be easiest for > you to actually just write your C functions into a DLL without referencing numpy > or Python at all and call those functions using ctypes. That avoids needing a > specific compiler and is a pretty handy tool to learn. > > If you do want to write an extension, I think I might suggest starting with > writing one by hand. It helps with the other techniques to know what's going on > underneath. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless enigma > that is made terrible by our own mad attempt to interpret it as though it had > an underlying truth." > -- Umberto Eco > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > Thanks for your help. Do you know what exactly is the issue of having to use VS2003 to build extensions? If the interactions are done at DLL level, shouldn't call compilers that can generate DLLs work? 
It doesn't look like using ctypes would be an option, as my goal is to 'vectorize' some operations. Thanks a lot, Geoffrey From robert.kern at gmail.com Tue Jul 24 15:08:10 2007 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 24 Jul 2007 14:08:10 -0500 Subject: [Numpy-discussion] Compile extension modules with Visual Studio 2005 In-Reply-To: References: <46A64B14.4060007@gmail.com> Message-ID: <46A64E1A.3050706@gmail.com> Geoffrey Zhu wrote: > Thanks for your help. Do you know what exactly is the issue of having > to use VS2003 to build extensions? If the interactions are done at DLL > level, shouldn't call compilers that can generate DLLs work? Mostly it's an issue of the C runtime that is used for either compiler. C extensions need to use the same runtime as Python itself. Mostly. > It doesn't look like using ctypes would be an option, as my goal is to > 'vectorize' some operations. You mean you need to use PyMultiIter objects for broadcasting? Yeah, that would require an extension. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From zyzhu2000 at gmail.com Tue Jul 24 15:18:03 2007 From: zyzhu2000 at gmail.com (Geoffrey Zhu) Date: Tue, 24 Jul 2007 14:18:03 -0500 Subject: [Numpy-discussion] Compile extension modules with Visual Studio 2005 In-Reply-To: <46A64E1A.3050706@gmail.com> References: <46A64B14.4060007@gmail.com> <46A64E1A.3050706@gmail.com> Message-ID: Hi Robert, On 7/24/07, Robert Kern wrote: > Geoffrey Zhu wrote: > > Thanks for your help. Do you know what exactly is the issue of having > > to use VS2003 to build extensions? If the interactions are done at DLL > > level, shouldn't call compilers that can generate DLLs work? > > Mostly it's an issue of the C runtime that is used for either compiler. C > extensions need to use the same runtime as Python itself. Mostly. If it is a problem of the runtime library conflict, I can probably statically link the VS2005 runtime into my extension DLL and there would be no conflict. Do you see any problems with this plan? :-) > > It doesn't look like using ctypes would be an option, as my goal is to > > 'vectorize' some operations. > > You mean you need to use PyMultiIter objects for broadcasting? Yeah, that would > require an extension. I haven't looked at PyMultilter objects. I am just trying to build a 'vector version' of my C function so that it can do batch calculations. For example, for a vector X, I can do for x in X: y=my_func(x) Or I can do Y=my_vector_func(X). The latter is probably much more efficient. That's why I need the extension. Thanks, Geoffrey > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless enigma > that is made terrible by our own mad attempt to interpret it as though it had > an underlying truth." > -- Umberto Eco From robert.kern at gmail.com Tue Jul 24 15:32:10 2007 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 24 Jul 2007 14:32:10 -0500 Subject: [Numpy-discussion] Compile extension modules with Visual Studio 2005 In-Reply-To: References: <46A64B14.4060007@gmail.com> <46A64E1A.3050706@gmail.com> Message-ID: <46A653BA.1060003@gmail.com> Geoffrey Zhu wrote: > Hi Robert, > > On 7/24/07, Robert Kern wrote: >> Geoffrey Zhu wrote: >>> Thanks for your help. Do you know what exactly is the issue of having >>> to use VS2003 to build extensions? 
If the interactions are done at DLL >>> level, shouldn't call compilers that can generate DLLs work? >> Mostly it's an issue of the C runtime that is used for either compiler. C >> extensions need to use the same runtime as Python itself. Mostly. > > If it is a problem of the runtime library conflict, I can probably > statically link the VS2005 runtime into my extension DLL and there > would be no conflict. Do you see any problems with this plan? :-) If it works, then no, no problem. distutils may balk at building your extension, though. >>> It doesn't look like using ctypes would be an option, as my goal is to >>> 'vectorize' some operations. >> You mean you need to use PyMultiIter objects for broadcasting? Yeah, that would >> require an extension. > > I haven't looked at PyMultilter objects. I am just trying to build a > 'vector version' of my C function so that it can do batch > calculations. For example, for a vector X, I can do > > for x in X: y=my_func(x) > > Or I can do Y=my_vector_func(X). > > The latter is probably much more efficient. That's why I need the extension. ctypes would be an option, then. You would just do the loop in C. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From gael.varoquaux at normalesup.org Tue Jul 24 15:46:50 2007 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 24 Jul 2007 21:46:50 +0200 Subject: [Numpy-discussion] Compile extension modules with Visual Studio 2005 In-Reply-To: <46A653BA.1060003@gmail.com> References: <46A64B14.4060007@gmail.com> <46A64E1A.3050706@gmail.com> <46A653BA.1060003@gmail.com> Message-ID: <20070724194650.GA14207@clipper.ens.fr> On Tue, Jul 24, 2007 at 02:32:10PM -0500, Robert Kern wrote: > > I haven't looked at PyMultilter objects. I am just trying to build a > > 'vector version' of my C function so that it can do batch > > calculations. For example, for a vector X, I can do > > for x in X: y=my_func(x) > > Or I can do Y=my_vector_func(X). > > The latter is probably much more efficient. That's why I need the extension. > ctypes would be an option, then. You would just do the loop in C. I agree with Robert that ctypes is probably the simplest option. Have a look at http://scipy.org/Cookbook/Ctypes HTH, Ga?l From lfriedri at imtek.de Wed Jul 25 05:58:09 2007 From: lfriedri at imtek.de (Lars Friedrich) Date: Wed, 25 Jul 2007 11:58:09 +0200 Subject: [Numpy-discussion] tofile speed Message-ID: <46A71EB1.8030704@imtek.de> Hello, I tried the following: ####### start code a = N.random.rand(1000000) myFile = file('test.bin', 'wb') for i in range(100): a.tofile(myFile) myFile.close() ####### end code And this gives roughly 50 MB/s on my office-machine but only 6.5 MB/s on the machine that I was reporting about. Both computers use Python 2.4.3 with enthought 1.0.0 and numpy 1.0.1 So I think I will go and check the harddisk-drivers. array.tofile does not seem to be the problem and actually seems to be very fast. Any other recommendations? Thanks Lars From subscriber100 at rjs.org Wed Jul 25 09:38:55 2007 From: subscriber100 at rjs.org (Ray Schumacher) Date: Wed, 25 Jul 2007 06:38:55 -0700 Subject: [Numpy-discussion] Compile extension modules with Visual Studio 2005 Message-ID: <6.2.3.4.2.20070725062658.02eb3830@pop-server.san.rr.com> Geoffrey Zhu wrote: > Hi, > > I am about to write a C extension module. 
C functions in the module will > take and return numpy arrays. I found a tutorial online, but I am not > sure about the following: I agree with others that ctypes might be your best path. The codeGenerator is magic, if you ask me: http://starship.python.net/crew/theller/ctypes/old/codegen.html But, if the function is simple, why not weave.inline? What I have done is run the function once, hunt down the long-named library, copy it to the local directory, then include it explicitly and call its function. This eliminates some overhead time for the call. I use it to convert packed IEEE data from an ADC data read function, and it's faster than the manufacturer's own function version that returns scaled integers! Ray From gael.varoquaux at normalesup.org Wed Jul 25 09:41:37 2007 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Wed, 25 Jul 2007 15:41:37 +0200 Subject: [Numpy-discussion] Compile extension modules with Visual Studio 2005 In-Reply-To: <6.2.3.4.2.20070725062658.02eb3830@pop-server.san.rr.com> References: <6.2.3.4.2.20070725062658.02eb3830@pop-server.san.rr.com> Message-ID: <20070725134137.GF25804@clipper.ens.fr> On Wed, Jul 25, 2007 at 06:38:55AM -0700, Ray Schumacher wrote: > The codeGenerator is magic, if you ask me: > http://starship.python.net/crew/theller/ctypes/old/codegen.html Can it wrap code passing around arrays ? If so it really does magic that I don't understand. Ga?l From stefan at sun.ac.za Wed Jul 25 10:44:08 2007 From: stefan at sun.ac.za (Stefan van der Walt) Date: Wed, 25 Jul 2007 16:44:08 +0200 Subject: [Numpy-discussion] Compile extension modules with Visual Studio 2005 In-Reply-To: <20070725134137.GF25804@clipper.ens.fr> References: <6.2.3.4.2.20070725062658.02eb3830@pop-server.san.rr.com> <20070725134137.GF25804@clipper.ens.fr> Message-ID: <20070725144408.GO8728@mentat.za.net> On Wed, Jul 25, 2007 at 03:41:37PM +0200, Gael Varoquaux wrote: > On Wed, Jul 25, 2007 at 06:38:55AM -0700, Ray Schumacher wrote: > > The codeGenerator is magic, if you ask me: > > http://starship.python.net/crew/theller/ctypes/old/codegen.html > > Can it wrap code passing around arrays ? If so it really does magic that > I don't understand. If your array is contiguous, it really is only a matter of passing along a pointer and dimensions. By writing your C-functions in the form void func(double* data, int rows, int cols, double* out) { } wrapping becomes trivial. Cheers St?fan From gael.varoquaux at normalesup.org Wed Jul 25 10:50:03 2007 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Wed, 25 Jul 2007 16:50:03 +0200 Subject: [Numpy-discussion] Compile extension modules with Visual Studio 2005 In-Reply-To: <20070725144408.GO8728@mentat.za.net> References: <6.2.3.4.2.20070725062658.02eb3830@pop-server.san.rr.com> <20070725134137.GF25804@clipper.ens.fr> <20070725144408.GO8728@mentat.za.net> Message-ID: <20070725145003.GA28080@clipper.ens.fr> On Wed, Jul 25, 2007 at 04:44:08PM +0200, Stefan van der Walt wrote: > On Wed, Jul 25, 2007 at 03:41:37PM +0200, Gael Varoquaux wrote: > > On Wed, Jul 25, 2007 at 06:38:55AM -0700, Ray Schumacher wrote: > > > The codeGenerator is magic, if you ask me: > > > http://starship.python.net/crew/theller/ctypes/old/codegen.html > > Can it wrap code passing around arrays ? If so it really does magic that > > I don't understand. > If your array is contiguous, it really is only a matter of passing > along a pointer and dimensions. 
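To spell out the Python side of that pattern for the archive, here is a rough sketch (untested; the shared library name is a placeholder, and the C prototype is the one quoted above):

import ctypes
import numpy
from numpy.ctypeslib import ndpointer

lib = ctypes.CDLL('./libfunc.so')    # 'func.dll' on Windows; the name is made up

# void func(double* data, int rows, int cols, double* out)
lib.func.restype = None
lib.func.argtypes = [ndpointer(dtype=numpy.double, flags='C_CONTIGUOUS'),
                     ctypes.c_int, ctypes.c_int,
                     ndpointer(dtype=numpy.double, flags='C_CONTIGUOUS')]

def func(data):
    # assumes a 2-D array, to match the rows/cols prototype
    data = numpy.ascontiguousarray(data, dtype=numpy.double)
    out = numpy.empty_like(data)
    lib.func(data, data.shape[0], data.shape[1], out)
    return out

ndpointer checks the dtype and contiguity of whatever gets passed in, so the only thing to keep in sync by hand is the argtypes declaration against the C prototype.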
> By writing your C-functions in the form > void func(double* data, int rows, int cols, double* out) { } > wrapping becomes trivial. Yes, I have done this many times. It is trivial and very convenient. I was just wondering if the code generator could detect this pattern. Ga?l From charlesr.harris at gmail.com Wed Jul 25 10:58:22 2007 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 25 Jul 2007 08:58:22 -0600 Subject: [Numpy-discussion] tofile speed In-Reply-To: <46A71EB1.8030704@imtek.de> References: <46A71EB1.8030704@imtek.de> Message-ID: On 7/25/07, Lars Friedrich wrote: > > Hello, > > I tried the following: > > > ####### start code > > a = N.random.rand(1000000) > > myFile = file('test.bin', 'wb') > > for i in range(100): > a.tofile(myFile) > > myFile.close() > > ####### end code > > > And this gives roughly 50 MB/s on my office-machine but only 6.5 MB/s on > the machine that I was reporting about. > > Both computers use Python 2.4.3 with enthought 1.0.0 and numpy 1.0.1 > > So I think I will go and check the harddisk-drivers. array.tofile does > not seem to be the problem and actually seems to be very fast. Any other > recommendations? You might check what disk controllers the disks are using. I got an almost x10 speedup moving some disks from a DELL PCI CERC board to the onboard SATA and using software raid. Sometimes DMA isn't enabled, but that is pretty rare these days. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From zyzhu2000 at gmail.com Wed Jul 25 15:34:46 2007 From: zyzhu2000 at gmail.com (Geoffrey Zhu) Date: Wed, 25 Jul 2007 14:34:46 -0500 Subject: [Numpy-discussion] Should I use numpy array? Message-ID: Hi, I am writing a function that would take a list of datetime objects and a list of single letter characters (such as ["A","B","C"]). The number of items tend to be big and both the list and the numpy array have all the functionalities I need. Do you think I should use numpy arrays or the regular lists? Thanks, cg From robert.kern at gmail.com Wed Jul 25 15:40:17 2007 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 25 Jul 2007 14:40:17 -0500 Subject: [Numpy-discussion] Should I use numpy array? In-Reply-To: References: Message-ID: <46A7A721.3060307@gmail.com> Geoffrey Zhu wrote: > Hi, > > I am writing a function that would take a list of datetime objects and > a list of single letter characters (such as ["A","B","C"]). The number > of items tend to be big and both the list and the numpy array have all > the functionalities I need. > > Do you think I should use numpy arrays or the regular lists? If lists have all the functionality that you need, then I would probably recommend sticking with them if the contents are going to be objects rather than numbers. numpy object arrays can be tricky. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From zpincus at stanford.edu Wed Jul 25 15:54:48 2007 From: zpincus at stanford.edu (Zachary Pincus) Date: Wed, 25 Jul 2007 15:54:48 -0400 Subject: [Numpy-discussion] Change to get_printoptions? Message-ID: <00F1A9D1-1944-40B1-8505-D818B7BB6B86@stanford.edu> Hello all, I just recently updated to the SVN version of numpy to test my code against it, and found that a small change made to numpy.get_printoptions (it now returns a dictionary instead of a list) breaks my code. 
Here's the changeset: http://projects.scipy.org/scipy/numpy/changeset/3877 I'm not really looking forward to needing to detect numpy versions just so I can do the right thing with get_printoptions, but I do agree that the new version of the function is more sensible. My question is if there's any particular policy about backwards- incompatible python api changes, or if I need to be aware of their possibility at every point release. (Either is fine -- I'm happy for numpy to be better at the cost of incompatibility, but I'd like to know if changes like these are the rule or exception.) Also, in terms of compatibility checking, has anyone written a little function to check if numpy is within a particular version range? Specifically, one that handles numpy's built from SVN as well as from release tarballs. Thanks, Zach Pincus Program in Biomedical Informatics and Department of Biochemistry Stanford University School of Medicine From tmbdev at gmail.com Wed Jul 25 18:47:21 2007 From: tmbdev at gmail.com (Thomas Breuel) Date: Thu, 26 Jul 2007 00:47:21 +0200 Subject: [Numpy-discussion] output arguments In-Reply-To: <46A57717.7060300@gmail.com> References: <7e51d15d0707231741k6412e198xd4192470d2175975@mail.gmail.com> <7e51d15d0707231903u1a991a48wf55e1192f10cafce@mail.gmail.com> <46A57717.7060300@gmail.com> Message-ID: <7e51d15d0707251547v46fa5d0o2f6d59239cf17008@mail.gmail.com> > > For example, it would be nice if "outer" > > supported: > > > > outer(b,c,output=a) > > outer(b,c,increment=a) > > outer(b,c,increment=a,scale=eps) > > > > or maybe one could specify an accumulation ufunc, with addition, > > multiplication, min, and max being fast, and with an optional scale > > parameter. > > What would the increment and scale parameters do? Why should they be part > of the > outer() interface? Have you looked at the .outer() method on the add, > multiply, > minimum, and maximum ufuncs? output=a would put the output of the operation into a increment=a would increment the values in a by the result of the operation increment=a,scale=eps would increment the values in a by the result of multiplying the result by eps. I think a good goal for a numerical scripting language should be to allow people to express common numerical algorithms without extraneous array allocations for intermediate results; with a few dozen such primitives, a lot of native code can be avoided in my experience. > Another approach might be to provide, in addition to the convenient > > high-level NumPy operations, direct bindings for BLAS and/or similar > > libraries, with Fortran-like procedural interfaces, but I can't find any > > such libraries in NumPy or SciPy. Am I missing something? > > scipy.linalg.flapack, etc. Thanks; I'll have to take a look. I don't remember whether LAPACK contains these low-level ops itself. Cheers, Thomas. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Wed Jul 25 19:30:22 2007 From: Chris.Barker at noaa.gov (Chris Barker) Date: Wed, 25 Jul 2007 16:30:22 -0700 Subject: [Numpy-discussion] Compile extension modules with Visual Studio 2005 In-Reply-To: <6.2.3.4.2.20070725062658.02eb3830@pop-server.san.rr.com> References: <6.2.3.4.2.20070725062658.02eb3830@pop-server.san.rr.com> Message-ID: <46A7DD0E.9090809@noaa.gov> Ray Schumacher wrote: > I agree with others that ctypes might be your best path. Pyrex is a good bet too: http://www.scipy.org/Cookbook/Pyrex_and_NumPy The advantage with pyrex is that you don't have to write any C at all. 
You will have to use a compiler that is compatible with your Python build. I found MingGW very easy to use with the python.org python2.5 on Windows-- distutils has built-in support for it. http://boodebr.org/main/python/build-windows-extensions -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From david at ar.media.kyoto-u.ac.jp Wed Jul 25 22:32:18 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 26 Jul 2007 11:32:18 +0900 Subject: [Numpy-discussion] Compile extension modules with Visual Studio 2005 In-Reply-To: <20070725145003.GA28080@clipper.ens.fr> References: <6.2.3.4.2.20070725062658.02eb3830@pop-server.san.rr.com> <20070725134137.GF25804@clipper.ens.fr> <20070725144408.GO8728@mentat.za.net> <20070725145003.GA28080@clipper.ens.fr> Message-ID: <46A807B2.7060102@ar.media.kyoto-u.ac.jp> Gael Varoquaux wrote: > On Wed, Jul 25, 2007 at 04:44:08PM +0200, Stefan van der Walt wrote: >> On Wed, Jul 25, 2007 at 03:41:37PM +0200, Gael Varoquaux wrote: >>> On Wed, Jul 25, 2007 at 06:38:55AM -0700, Ray Schumacher wrote: >>>> The codeGenerator is magic, if you ask me: >>>> http://starship.python.net/crew/theller/ctypes/old/codegen.html > >>> Can it wrap code passing around arrays ? If so it really does magic that >>> I don't understand. > >> If your array is contiguous, it really is only a matter of passing >> along a pointer and dimensions. > >> By writing your C-functions in the form > >> void func(double* data, int rows, int cols, double* out) { } > >> wrapping becomes trivial. > > Yes, I have done this many times. It is trivial and very convenient. I > was just wondering if the code generator could detect this pattern. I don't see either how to magically generate those functions, since C has no concept of arrays (I mean outside a serie of contiguous bytes): if you see the declaration int f(double* in, int rows, int cols), the compiler does not know that it means a double array of size rows * cols, and that the in(i, j) is given by in[i*rows+j]. Actually, you don't know either without reading the source code or code conventions :). Now, if you have always the same convention, I think it is conceptually possible to automatically generate the wrappers, a bit like swig does with typemaps for example (maybe f2py can do it for Fortran code ? I have never used f2py, but I think Fortran has a concept of arrays and matrices ?). David From gerard.vermeulen at grenoble.cnrs.fr Thu Jul 26 02:09:10 2007 From: gerard.vermeulen at grenoble.cnrs.fr (Gerard Vermeulen) Date: Thu, 26 Jul 2007 08:09:10 +0200 Subject: [Numpy-discussion] ANN: PyQwt-5.0.1 released Message-ID: <20070726080910.75bb9aca@zombie.grenoble.cnrs.fr> What is PyQwt ( http://pyqwt.sourceforge.net ) ? - it is a set of Python bindings for the Qwt C++ class library which extends the Qt framework with widgets for scientific and engineering applications. It provides a widget to plot 2-dimensional data and various widgets to display and control bounded or unbounded floating point values. - it requires and extends PyQt, a set of Python bindings for Qt. - it supports the use of PyQt, Qt, Qwt, and optionally NumPy or SciPy in a GUI Python application or in an interactive Python session. - it runs on POSIX, Mac OS X and Windows platforms (practically any platform supported by Qt and Python). 
- it plots fast: displaying data with 100,000 points takes about 0.1 s (PyQwt with Qt-3 is faster than with Qt-4). - it is licensed under the GPL with an exception to allow dynamic linking with non-free releases of Qt and PyQt. The most important new features of PyQwt-5.0.1 are: - support for Qt-4.3, SIP-4.7, and PyQt-4.3 - support for the N-D array interface specification ( http://numpy.scipy.org/array_interface.shtml ). - the iqt module also supports an interactive Python session without the help of the GNU readline module. - PyQwt is now part of the PyQt-Py2.5-gpl-4.3 binary installer for Windows ( http://www.riverbankcomputing.com/Downloads/PyQt4/GPL/ ) The most important bug fix in PyQwt-5.0.1 is: - removal of a huge memory leak in the conversion from an array to a QImage. PyQwt-5.0.1 supports: 1. Python-2.5, or -2.4. 2. PyQt-3.17. 3. PyQt-4.3, or PyQt-4.2. 3 SIP-4.7, or SIP-4.6. 4. Qt-3.3, or Qt-3.2. 5. Qt-4.3, or Qt-4.2. 6. Recent versions of NumPy, numarray, and/or Numeric. Enjoy -- Gerard Vermeulen From ludwigbrinckmann at gmail.com Thu Jul 26 04:53:47 2007 From: ludwigbrinckmann at gmail.com (Ludwig M Brinckmann) Date: Thu, 26 Jul 2007 09:53:47 +0100 Subject: [Numpy-discussion] Downsampling a 2D array with min/max and nullvalues Message-ID: <3f7a6e1c0707260153q73dd190epc1a4eafdd716df98@mail.gmail.com> Hi there, I have a 2D array of size, lets say 40000 * 512, which I need to downsample by a step of 4 in the y direction, and step 3 in the x direction, so I would get an array of 10000 * 170 (or 171, the edges do not matter much). But what I need to retain are the maxima and minima in every 4*3 window. I have a solution that works by first clipping off the edges of the array, so that its shape is divisible by 4*3, and then applying a two step process that will compute the min and max in strips --first in the x direction and then in the y-direction through reshaping and reduce (not super elegant, but it seems to work): def downsampleArray1D(data, step, nullValue): print 'downsampling %d' %step if data.shape[1] % step != 0: xlen = data.shape[1] - data.shape[1] % step data = data[...,:xlen] data1d = data.ravel() assert(data1d.shape[0] % step == 0) data1d.shape = (len(data1d)/step, step) maxdata = numpy.maximum.reduce(data1d,1) mindata = numpy.minimum.reduce(data1d,1) mindata.shape = ((data.shape[0],int(data.shape[1] / step))) maxdata.shape = ((data.shape[0],int(data.shape[1] / step))) return mindata, maxdata def downsampleArray2D(data, yStep, xStep, nullValue): # first adjust the data to the step size by ignoring edges if data.shape[0] % yStep != 0: ylen = data.shape[0] - data.shape[0] % yStep data = data[:ylen,...] if data.shape[1] % xStep != 0: xlen = data.shape[1] - data.shape[1] % xStep data = data[...,:xlen] minx, maxx = downsampleArray1D(data, xStep, nullValue) minxt = numpy.transpose(minx) minxt1, dummy = downsampleArray1D(minxt, yStep, nullValue) maxxt = numpy.transpose(maxx) dummy, maxxt1 = downsampleArray1D(maxxt, yStep, nullValue) minimum = numpy.transpose(minxt1) maximum = numpy.transpose(maxxt1) return minimum, maximum Now I need a solution that will ignore null values, i.e. that any value that is equivalent to e.g. -999 ignored in the min and max computation, but if all values in a 4*3 window are nullvalues, then min and max should be set to the null value. Any suggestions? Ludwig -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at shrogers.com Thu Jul 26 07:11:56 2007 From: steve at shrogers.com (Steven H. 
Rogers) Date: Thu, 26 Jul 2007 05:11:56 -0600 Subject: [Numpy-discussion] getting numPy happening for sciPy In-Reply-To: <46A6499A.2020608@gmail.com> References: <46A487F0.7070404@d2.net.au> <46A4A50D.80503@shrogers.com> <46A4D121.9040603@gmail.com> <46A5FE9D.7050205@shrogers.com> <46A6499A.2020608@gmail.com> Message-ID: <46A8817C.6000502@shrogers.com> Robert Kern wrote: > Steven H. Rogers wrote: > >> Robert Kern wrote: >> >>> Steven H. Rogers wrote: >>> >>> >>>> I don't know of any simple build instructions for Windows, but if you're >>>> patient, there will probably be updated packaged releases of SciPy + >>>> NumPy that play well together "real soon now". >>>> >>>> >>> We'll need a volunteer release manager for that, or it won't happen. Most of the >>> principals are very busy right now. >>> >>> >> I thought I saw signs that something would be happening soon. >> > > Real life got in the way. > Yes, life happens. > >> What >> would be the scope of this release manager's responsibilities? >> > > 1) Make a numpy 1.0.3.1 point release. Some changes in 1.0.3-2 which should have > only made changes to the files included in the numpy source tarball also seem to > impair building scipy. We would need to branch from the 1.0.3 tag to fix this. I > don't think the changes to numpy.distutils in the trunk have been vetted enough > for making a numpy 1.0.4 release. Here is some information on the mechanics of this: > > http://projects.scipy.org/scipy/numpy/wiki/MakingReleases > > 2) Make a scipy 0.5.3 release. Deciding what goes in is mostly a matter of > announcing a cutoff date to make sure people don't have half of a refactoring in. > > 3) The release manager will need to make sure that the releases build and run > their test suite on at least the Big Three: Windows, some kind of Linux, and Mac > OS X. Usually, they will need to delegate for some of these platforms. The > binary builds for Windows should be provided for download along with the source. > > 4) Binaries: Windows binaries are pretty much necessary. If possible, try to use > an ATLAS library that does *not* use SSE2 instructions. We have had problems > with people getting segfaults on older hardware. scipy binaries should *not* > include FFTW or UMFPACK since they are GPLed. I can help with OS X binaries now > that I've finally figured out how to get scipy to link statically against the > gfortran runtime. > > 5) The tarballs and binaries should be uploaded to the Sourceforge site. The > Cheeseshop records should be updated to record the new versions. An announcement > should be made to python-announce and the relevant mailing lists. > > Unfortunately, I can't see committing to something like this now. I might be able to start in November if no one else volunteers. From david at ar.media.kyoto-u.ac.jp Thu Jul 26 07:34:19 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 26 Jul 2007 20:34:19 +0900 Subject: [Numpy-discussion] getting numPy happening for sciPy In-Reply-To: <46A8817C.6000502@shrogers.com> References: <46A487F0.7070404@d2.net.au> <46A4A50D.80503@shrogers.com> <46A4D121.9040603@gmail.com> <46A5FE9D.7050205@shrogers.com> <46A6499A.2020608@gmail.com> <46A8817C.6000502@shrogers.com> Message-ID: <46A886BB.7070403@ar.media.kyoto-u.ac.jp> Steven H. Rogers wrote: > Robert Kern wrote: > >> Steven H. Rogers wrote: >> >> >>> Robert Kern wrote: >>> >>> >>>> Steven H. 
Rogers wrote: >>>> >>>> >>>> >>>>> I don't know of any simple build instructions for Windows, but if you're >>>>> patient, there will probably be updated packaged releases of SciPy + >>>>> NumPy that play well together "real soon now". >>>>> >>>>> >>>>> >>>> We'll need a volunteer release manager for that, or it won't happen. Most of the >>>> principals are very busy right now. >>>> >>>> >>>> >>> I thought I saw signs that something would be happening soon. >>> >>> >> Real life got in the way. >> >> > Yes, life happens. > >> >> >>> What >>> would be the scope of this release manager's responsibilities? >>> >>> >> 1) Make a numpy 1.0.3.1 point release. Some changes in 1.0.3-2 which should have >> only made changes to the files included in the numpy source tarball also seem to >> impair building scipy. We would need to branch from the 1.0.3 tag to fix this. I >> don't think the changes to numpy.distutils in the trunk have been vetted enough >> for making a numpy 1.0.4 release. Here is some information on the mechanics of this: >> >> http://projects.scipy.org/scipy/numpy/wiki/MakingReleases >> >> 2) Make a scipy 0.5.3 release. Deciding what goes in is mostly a matter of >> announcing a cutoff date to make sure people don't have half of a refactoring in. >> >> 3) The release manager will need to make sure that the releases build and run >> their test suite on at least the Big Three: Windows, some kind of Linux, and Mac >> OS X. Usually, they will need to delegate for some of these platforms. The >> binary builds for Windows should be provided for download along with the source. >> >> 4) Binaries: Windows binaries are pretty much necessary. If possible, try to use >> an ATLAS library that does *not* use SSE2 instructions. We have had problems >> with people getting segfaults on older hardware. scipy binaries should *not* >> include FFTW or UMFPACK since they are GPLed. I can help with OS X binaries now >> that I've finally figured out how to get scipy to link statically against the >> gfortran runtime. >> >> 5) The tarballs and binaries should be uploaded to the Sourceforge site. The >> Cheeseshop records should be updated to record the new versions. An announcement >> should be made to python-announce and the relevant mailing lists. >> >> >> I am willing to volunteer for the scipy part: I have quite extensive experience with building on linux now, and I can now build on windows without too much difficulties (I mean hardware-wise). Concerning the release date: it basically means giving enough time to solve the current bugs, right ? I solved a few bugs from the 0.5.3 milestone, but some of them are outside my knowledge (weave, linalg bugs which depend on fortran code). David From robert.kern at gmail.com Thu Jul 26 14:58:31 2007 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 26 Jul 2007 13:58:31 -0500 Subject: [Numpy-discussion] getting numPy happening for sciPy In-Reply-To: <46A886BB.7070403@ar.media.kyoto-u.ac.jp> References: <46A487F0.7070404@d2.net.au> <46A4A50D.80503@shrogers.com> <46A4D121.9040603@gmail.com> <46A5FE9D.7050205@shrogers.com> <46A6499A.2020608@gmail.com> <46A8817C.6000502@shrogers.com> <46A886BB.7070403@ar.media.kyoto-u.ac.jp> Message-ID: <46A8EED7.6080006@gmail.com> David Cournapeau wrote: > I am willing to volunteer for the scipy part: I have quite extensive > experience with building on linux now, and I can now build on windows > without too much difficulties (I mean hardware-wise). 
> > Concerning the release date: it basically means giving enough time to > solve the current bugs, right ? There are too many. Build bugs should be fixed and anything that impairs the functioning of whole packages. Incorporating patches already submitted would be the next priority. Fixing isolated little bugs can be pushed back. > I solved a few bugs from the 0.5.3 > milestone, but some of them are outside my knowledge (weave, linalg bugs > which depend on fortran code). You're much too modest. That was hardly "a few bugs." You've been a great help. Thank you. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From efiring at hawaii.edu Thu Jul 26 15:13:21 2007 From: efiring at hawaii.edu (Eric Firing) Date: Thu, 26 Jul 2007 09:13:21 -1000 Subject: [Numpy-discussion] Downsampling a 2D array with min/max and nullvalues In-Reply-To: <3f7a6e1c0707260153q73dd190epc1a4eafdd716df98@mail.gmail.com> References: <3f7a6e1c0707260153q73dd190epc1a4eafdd716df98@mail.gmail.com> Message-ID: <46A8F251.2050105@hawaii.edu> Ludwig, Masked arrays will do exactly what you want. You have your choice of the numpy.ma version or the external maskedarray class. Eric Ludwig M Brinckmann wrote: > Hi there, > > I have a 2D array of size, lets say 40000 * 512, which I need to > downsample by a step of 4 in the y direction, and step 3 in the x > direction, so I would get an array of 10000 * 170 (or 171, the edges do > not matter much). But what I need to retain are the maxima and minima in > every 4*3 window. > > I have a solution that works by first clipping off the edges of the > array, so that its shape is divisible by 4*3, and then applying a two > step process that will compute the min and max in strips --first in the > x direction and then in the y-direction through reshaping and reduce > (not super elegant, but it seems to work): > > > def downsampleArray1D(data, step, nullValue): > print 'downsampling %d' %step > if data.shape[1] % step != 0: > xlen = data.shape[1] - data.shape[1] % step > data = data[...,:xlen] > data1d = data.ravel() > assert(data1d.shape[0] % step == 0) > data1d.shape = (len(data1d)/step, step) > maxdata = numpy.maximum.reduce(data1d,1) > mindata = numpy.minimum.reduce(data1d,1) > mindata.shape = ((data.shape[0],int(data.shape[1] / step))) > maxdata.shape = ((data.shape[0],int(data.shape[1] / step))) > return mindata, maxdata > > def downsampleArray2D(data, yStep, xStep, nullValue): > # first adjust the data to the step size by ignoring edges > if data.shape[0] % yStep != 0: > ylen = data.shape[0] - data.shape[0] % yStep > data = data[:ylen,...] > if data.shape[1] % xStep != 0: > xlen = data.shape[1] - data.shape[1] % xStep > data = data[...,:xlen] > minx, maxx = downsampleArray1D(data, xStep, nullValue) > minxt = numpy.transpose(minx) > minxt1, dummy = downsampleArray1D(minxt, yStep, nullValue) > > maxxt = numpy.transpose(maxx) > dummy, maxxt1 = downsampleArray1D(maxxt, yStep, nullValue) > > minimum = numpy.transpose(minxt1) > maximum = numpy.transpose(maxxt1) > > return minimum, maximum > > > Now I need a solution that will ignore null values, i.e. that any value > that is equivalent to e.g. -999 ignored in the min and max computation, > but if all values in a 4*3 window are nullvalues, then min and max > should be set to the null value. > > Any suggestions? 
> > Ludwig > > > > ------------------------------------------------------------------------ > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From kwmsmith at gmail.com Thu Jul 26 15:15:50 2007 From: kwmsmith at gmail.com (Kurt Smith) Date: Thu, 26 Jul 2007 14:15:50 -0500 Subject: [Numpy-discussion] Possible bug -- importing numpy before f2py compiled module gives seg fault In-Reply-To: References: <1185319236.245847.35000@57g2000hsv.googlegroups.com> Message-ID: Hello - I've come up with the following test case to illustrate my problem: file empty.f: subroutine empty(arr, nx,ny ) implicit none integer, intent(in) :: nx,ny real, dimension(nx,ny), intent(out) :: arr print *, "in empty." arr = 1.0e0 end subroutine empty using the following to compile: f2py -c -m empty -lSystemStubs --opt=-O3 --fcompiler=ibm --f90exec=/ opt/ibmcmp/xlf/8.1/bin/xlf90 --f90flags=-O3 empty.f Builds empty.so just fine. Python 2.5.1 (r251:54869, Apr 18 2007, 22:08:04) [GCC 4.0.1 (Apple Computer, Inc. build 5367)] on darwin >>> from empty import empty as myempty >>> from numpy import * # for illustration purposes. >>> myempty(1,1) in empty. array([[ 1.]], dtype=float32) >>> But when I import numpy first, I get the following: Python 2.5.1 (r251:54869, Apr 18 2007, 22:08:04) [GCC 4.0.1 (Apple Computer, Inc. build 5367)] on darwin >>> from numpy import * >>> from empty import empty as myempty >>> myempty(1,1) Segmentation fault $ Platform: Mac OS X, 10.4.10, using ibm-xlfortran v. 8.1. $ f2py -v 2_3844 From schaffer at optonline.net Thu Jul 26 15:47:34 2007 From: schaffer at optonline.net (Les Schaffer) Date: Thu, 26 Jul 2007 15:47:34 -0400 Subject: [Numpy-discussion] getting numPy happening for sciPy In-Reply-To: References: <46A487F0.7070404@d2.net.au> <46A4A50D.80503@shrogers.com> <46A4D121.9040603@gmail.com> <46A4D1E3.4040501@optonline.net> Message-ID: <46A8FA56.7030209@optonline.net> Matthieu Brucher wrote: > Windows binaries must be compiled with the same compiler as Python, so > it is (sadly IMHO) Visual 2003. Well in fact, I could install one if > needed (I have a licence) > > i am going to see if my department can grab a license. if so, i would be willing to collaborate to keep Numpy/Scipy current on Windows. even though i prefer linux for my own work, i find my physics students haven't all made the switch to the light side. Les From zyzhu2000 at gmail.com Thu Jul 26 18:28:04 2007 From: zyzhu2000 at gmail.com (Geoffrey Zhu) Date: Thu, 26 Jul 2007 17:28:04 -0500 Subject: [Numpy-discussion] C-extension -- how do I accept a vector of both type double and type int? Message-ID: Hi Everyone, I finally build a C extension. The one problem I found is that it is too picky about the input. For example, it accepts array([1.0,2.0,3.0]) with no problem, but when I pass in array([1,2,3]), since the dtype of the array is now int, my extension does not like it. How do I handle this situation? Is there any way to access any data type that can be converted into a double? Thanks, cg From robert.kern at gmail.com Thu Jul 26 18:34:29 2007 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 26 Jul 2007 17:34:29 -0500 Subject: [Numpy-discussion] C-extension -- how do I accept a vector of both type double and type int? In-Reply-To: References: Message-ID: <46A92175.3090300@gmail.com> Geoffrey Zhu wrote: > Hi Everyone, > > I finally build a C extension. 
The one problem I found is that it is > too picky about the input. For example, it accepts > array([1.0,2.0,3.0]) with no problem, but when I pass in > array([1,2,3]), since the dtype of the array is now int, my extension > does not like it. Okay. Show us the code that you are using, and we can help you find a better way. > How do I handle this situation? Is there any way to access any data > type that can be converted into a double? I usually use PyArray_FROM_OTF(). That handles the usual cases. It's pretty much like starting off a pure Python function with asarray(x, dtype=whatever). -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From zyzhu2000 at gmail.com Thu Jul 26 18:38:06 2007 From: zyzhu2000 at gmail.com (Geoffrey Zhu) Date: Thu, 26 Jul 2007 17:38:06 -0500 Subject: [Numpy-discussion] C-extension -- how do I accept a vector of both type double and type int? In-Reply-To: <46A92175.3090300@gmail.com> References: <46A92175.3090300@gmail.com> Message-ID: On 7/26/07, Robert Kern wrote: > Geoffrey Zhu wrote: > > Hi Everyone, > > > > I finally build a C extension. The one problem I found is that it is > > too picky about the input. For example, it accepts > > array([1.0,2.0,3.0]) with no problem, but when I pass in > > array([1,2,3]), since the dtype of the array is now int, my extension > > does not like it. > > Okay. Show us the code that you are using, and we can help you find a better way. > > > How do I handle this situation? Is there any way to access any data > > type that can be converted into a double? > > I usually use PyArray_FROM_OTF(). That handles the usual cases. It's pretty much > like starting off a pure Python function with asarray(x, dtype=whatever). > That is going to make a copy of the memory every time and might slow down things a lot? From robert.kern at gmail.com Thu Jul 26 18:43:57 2007 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 26 Jul 2007 17:43:57 -0500 Subject: [Numpy-discussion] C-extension -- how do I accept a vector of both type double and type int? In-Reply-To: References: <46A92175.3090300@gmail.com> Message-ID: <46A923AD.2040905@gmail.com> Geoffrey Zhu wrote: > On 7/26/07, Robert Kern wrote: >> Geoffrey Zhu wrote: >>> Hi Everyone, >>> >>> I finally build a C extension. The one problem I found is that it is >>> too picky about the input. For example, it accepts >>> array([1.0,2.0,3.0]) with no problem, but when I pass in >>> array([1,2,3]), since the dtype of the array is now int, my extension >>> does not like it. >> Okay. Show us the code that you are using, and we can help you find a better way. >> >>> How do I handle this situation? Is there any way to access any data >>> type that can be converted into a double? >> I usually use PyArray_FROM_OTF(). That handles the usual cases. It's pretty much >> like starting off a pure Python function with asarray(x, dtype=whatever). >> > That is going to make a copy of the memory every time and might slow > down things a lot? Not if you pass it an array with the requested properties. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From zyzhu2000 at gmail.com Thu Jul 26 19:39:31 2007 From: zyzhu2000 at gmail.com (Geoffrey Zhu) Date: Thu, 26 Jul 2007 18:39:31 -0500 Subject: [Numpy-discussion] C-extension -- how do I accept a vector of both type double and type int? In-Reply-To: <46A923AD.2040905@gmail.com> References: <46A92175.3090300@gmail.com> <46A923AD.2040905@gmail.com> Message-ID: > >>> How do I handle this situation? Is there any way to access any data > >>> type that can be converted into a double? > >> I usually use PyArray_FROM_OTF(). That handles the usual cases. It's pretty much > >> like starting off a pure Python function with asarray(x, dtype=whatever). > >> > > That is going to make a copy of the memory every time and might slow > > down things a lot? > > Not if you pass it an array with the requested properties. > Neat. Do you know if PyArray_FROM_OTF() increments the reference count of the returned object? The documentation does not say. My guess is yes. From robert.kern at gmail.com Thu Jul 26 20:03:11 2007 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 26 Jul 2007 19:03:11 -0500 Subject: [Numpy-discussion] C-extension -- how do I accept a vector of both type double and type int? In-Reply-To: References: <46A92175.3090300@gmail.com> <46A923AD.2040905@gmail.com> Message-ID: <46A9363F.4040805@gmail.com> Geoffrey Zhu wrote: >>>>> How do I handle this situation? Is there any way to access any data >>>>> type that can be converted into a double? >>>> I usually use PyArray_FROM_OTF(). That handles the usual cases. It's pretty much >>>> like starting off a pure Python function with asarray(x, dtype=whatever). >>>> >>> That is going to make a copy of the memory every time and might slow >>> down things a lot? >> Not if you pass it an array with the requested properties. > > Neat. Do you know if PyArray_FROM_OTF() increments the reference count > of the returned object? The documentation does not say. My guess is > yes. Yes. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From oliphant.travis at ieee.org Thu Jul 26 21:07:39 2007 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu, 26 Jul 2007 19:07:39 -0600 Subject: [Numpy-discussion] Change to get_printoptions? In-Reply-To: <00F1A9D1-1944-40B1-8505-D818B7BB6B86@stanford.edu> References: <00F1A9D1-1944-40B1-8505-D818B7BB6B86@stanford.edu> Message-ID: <46A9455B.6020302@ieee.org> Zachary Pincus wrote: > Hello all, > > I just recently updated to the SVN version of numpy to test my code > against it, and found that a small change made to > numpy.get_printoptions (it now returns a dictionary instead of a > list) breaks my code. > > Here's the changeset: > http://projects.scipy.org/scipy/numpy/changeset/3877 > > I'm not really looking forward to needing to detect numpy versions > just so I can do the right thing with get_printoptions, but I do > agree that the new version of the function is more sensible. My > question is if there's any particular policy about backwards- > incompatible python api changes, or if I need to be aware of their > possibility at every point release. (Either is fine -- I'm happy for > numpy to be better at the cost of incompatibility, but I'd like to > know if changes like these are the rule or exception.) At this point, changes like you experienced should be the exception. But, occasionally they will happen. 
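For the version-detection worry quoted above, one workable pattern is to branch on the type that get_printoptions() returns rather than on numpy.__version__. A minimal sketch, assuming the older list followed the same order as set_printoptions' arguments (precision, threshold, edgeitems, linewidth, suppress):

import numpy

def printoptions_as_dict():
    # Normalize get_printoptions() across the changeset-3877 boundary.
    opts = numpy.get_printoptions()
    if isinstance(opts, dict):      # newer numpy already returns a dict
        return opts
    # assumed ordering for the older list return value
    names = ('precision', 'threshold', 'edgeitems', 'linewidth', 'suppress')
    return dict(zip(names, opts))

print printoptions_as_dict()['precision']

Branching on the returned type keeps such code working without pinning down exactly which release flipped the behaviour.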
We really try to document them when they occur but this relies on those of us who make the changes making adequate note about them. When the version jumps to 1.1, then there may be more incompatibilities but they should be documented. Thanks for the note and the reminder. -Travis From zyzhu2000 at gmail.com Thu Jul 26 21:14:46 2007 From: zyzhu2000 at gmail.com (Geoffrey Zhu) Date: Thu, 26 Jul 2007 20:14:46 -0500 Subject: [Numpy-discussion] C-extension -- how do I accept a vector of both type double and type int? In-Reply-To: <46A9363F.4040805@gmail.com> References: <46A92175.3090300@gmail.com> <46A923AD.2040905@gmail.com> <46A9363F.4040805@gmail.com> Message-ID: On 7/26/07, Robert Kern wrote: > Geoffrey Zhu wrote: > >>>>> How do I handle this situation? Is there any way to access any data > >>>>> type that can be converted into a double? > >>>> I usually use PyArray_FROM_OTF(). That handles the usual cases. It's pretty much > >>>> like starting off a pure Python function with asarray(x, dtype=whatever). > >>>> > >>> That is going to make a copy of the memory every time and might slow > >>> down things a lot? > >> Not if you pass it an array with the requested properties. > > > > Neat. Do you know if PyArray_FROM_OTF() increments the reference count > > of the returned object? The documentation does not say. My guess is > > yes. > > Yes. Hi Robert, This is probably off the topic. Do you know such a function for regular python objects? For example, I know a PyObject is a number, but I don't know the exact type. Is there any quick way to convert it to a C double type? Thanks, cg From robert.kern at gmail.com Thu Jul 26 21:17:40 2007 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 26 Jul 2007 20:17:40 -0500 Subject: [Numpy-discussion] C-extension -- how do I accept a vector of both type double and type int? In-Reply-To: References: <46A92175.3090300@gmail.com> <46A923AD.2040905@gmail.com> <46A9363F.4040805@gmail.com> Message-ID: <46A947B4.2090506@gmail.com> Geoffrey Zhu wrote: > This is probably off the topic. Do you know such a function for > regular python objects? For example, I know a PyObject is a number, > but I don't know the exact type. Is there any quick way to convert it > to a C double type? I just answered your question on the python-list. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From david at ar.media.kyoto-u.ac.jp Thu Jul 26 21:34:47 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Fri, 27 Jul 2007 10:34:47 +0900 Subject: [Numpy-discussion] getting numPy happening for sciPy In-Reply-To: <46A8EED7.6080006@gmail.com> References: <46A487F0.7070404@d2.net.au> <46A4A50D.80503@shrogers.com> <46A4D121.9040603@gmail.com> <46A5FE9D.7050205@shrogers.com> <46A6499A.2020608@gmail.com> <46A8817C.6000502@shrogers.com> <46A886BB.7070403@ar.media.kyoto-u.ac.jp> <46A8EED7.6080006@gmail.com> Message-ID: <46A94BB7.3070401@ar.media.kyoto-u.ac.jp> Robert Kern wrote: > David Cournapeau wrote: > > >> I am willing to volunteer for the scipy part: I have quite extensive >> experience with building on linux now, and I can now build on windows >> without too much difficulties (I mean hardware-wise). >> >> Concerning the release date: it basically means giving enough time to >> solve the current bugs, right ? >> > > There are too many. 
Build bugs should be fixed and anything that impairs the > functioning of whole packages. Incorporating patches already submitted would be > the next priority. Fixing isolated little bugs can be pushed back. > > I thought that releasing something before the end of summer would be a good release date: a new release is available before the beginning of the new "university" year. Would you agree on a date like end of august ? (if I become the release manager, this is also more compatible with my schedule). For the bugs, I was not talking about all the bugs in trac, but the ones in 0.5.3 milestone (10-11 bugs, I think). David From nwagner at iam.uni-stuttgart.de Fri Jul 27 02:42:35 2007 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Fri, 27 Jul 2007 08:42:35 +0200 Subject: [Numpy-discussion] getting numPy happening for sciPy In-Reply-To: <46A94BB7.3070401@ar.media.kyoto-u.ac.jp> References: <46A487F0.7070404@d2.net.au> <46A4A50D.80503@shrogers.com> <46A4D121.9040603@gmail.com> <46A5FE9D.7050205@shrogers.com> <46A6499A.2020608@gmail.com> <46A8817C.6000502@shrogers.com> <46A886BB.7070403@ar.media.kyoto-u.ac.jp> <46A8EED7.6080006@gmail.com> <46A94BB7.3070401@ar.media.kyoto-u.ac.jp> Message-ID: <46A993DB.7080706@iam.uni-stuttgart.de> David Cournapeau wrote: > Robert Kern wrote: > >> David Cournapeau wrote: >> >> >> >>> I am willing to volunteer for the scipy part: I have quite extensive >>> experience with building on linux now, and I can now build on windows >>> without too much difficulties (I mean hardware-wise). >>> >>> Concerning the release date: it basically means giving enough time to >>> solve the current bugs, right ? >>> >>> >> There are too many. Build bugs should be fixed and anything that impairs the >> functioning of whole packages. Incorporating patches already submitted would be >> the next priority. Fixing isolated little bugs can be pushed back. >> >> >> > I thought that releasing something before the end of summer would be a > good release date: a new release is available before the beginning of > the new "university" year. Would you agree on a date like end of august > ? (if I become the release manager, this is also more compatible with my > schedule). > > For the bugs, I was not talking about all the bugs in trac, but the ones > in 0.5.3 milestone (10-11 bugs, I think). > > David > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > Actually 8 tickets http://projects.scipy.org/scipy/scipy/query?status=new&status=assigned&status=reopened&milestone=0.5.3+Release Ticket #389 can be closed. It's already fixed. AFAIK Dmitrey is working on ticket #464. I didn't check the patch by bart for ticket #360. IMHO ticket #406 is not so important for 0.5.3 release. I cannot reproduce the problem concerning #401. It is Mac specific problem. Am I missing something ? Nils From ludwigbrinckmann at gmail.com Fri Jul 27 06:15:21 2007 From: ludwigbrinckmann at gmail.com (Ludwig M Brinckmann) Date: Fri, 27 Jul 2007 11:15:21 +0100 Subject: [Numpy-discussion] Bug with MA and reduce? Message-ID: <3f7a6e1c0707270315y15865b8ay292391d752023303@mail.gmail.com> I have ma.minimum.reduce return a minimum value that does not exist in the array. The following code prints -1 as the minimum of the MA, I believe it should be 1. 
import numpy shape = (100) data = numpy.ones(shape, numpy.int16) data[2:40] = 3 # dummy data data[45:70] = -999 # null values mask = numpy.ma.make_mask_none(data.shape) mask[data == -999] = True ma = numpy.ma.MaskedArray(data, mask = mask) min = numpy.ma.minimum.reduce(ma,0) print min Am I doing something really stupid here? Ludwig -------------- next part -------------- An HTML attachment was scrubbed... URL: From ludwigbrinckmann at gmail.com Fri Jul 27 10:35:34 2007 From: ludwigbrinckmann at gmail.com (Ludwig M Brinckmann) Date: Fri, 27 Jul 2007 15:35:34 +0100 Subject: [Numpy-discussion] Bug with MA and reduce? In-Reply-To: <3f7a6e1c0707270315y15865b8ay292391d752023303@mail.gmail.com> References: <3f7a6e1c0707270315y15865b8ay292391d752023303@mail.gmail.com> Message-ID: <3f7a6e1c0707270735q1bab53fbj1442891f92749ebb@mail.gmail.com> This is a follow-up to an earlier mail that reported a suspected bug in the reduce/minimum operation of numpy.ma. I have tried the same code with the scipy sandbox maskedarray implementation and that gives me the correct output. For comparison: # import numpy.core.ma as MA import maskedarray as MA shape = (100) data = numpy.ones(shape, numpy.int16) data[2:40] = 3 data[45:70] = -999 mask = MA.make_mask_none(data.shape) mask[data == -999] = True ma = MA.MaskedArray(data, mask = mask) min = MA.minimum.reduce(ma,0) print min With maskedarray I get, as expected 1, with numpy.core.ma I get -1, a value that is not in the array. I am using Python 2.44 on XP, the maskedarray is the svn latest, the numpy.core.ma was 1.0.2, but I have tested it with only the current svn version of ma.py and it produces the wrong output. Ludwig On 27/07/07, Ludwig M Brinckmann wrote: > > I have ma.minimum.reduce return a minimum value that does not exist in the > array. > > The following code prints -1 as the minimum of the MA, I believe it should > be 1. > > import numpy > shape = (100) > data = numpy.ones (shape, numpy.int16) > data[2:40] = 3 # dummy data > data[45:70] = -999 # null values > mask = numpy.ma.make_mask_none(data.shape) > mask[data == -999] = True > ma = numpy.ma.MaskedArray(data, mask = mask) > min = numpy.ma.minimum.reduce(ma,0) > print min > > Am I doing something really stupid here? > > Ludwig > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Fri Jul 27 10:38:47 2007 From: stefan at sun.ac.za (Stefan van der Walt) Date: Fri, 27 Jul 2007 16:38:47 +0200 Subject: [Numpy-discussion] build on windows 64-bit platform Message-ID: <20070727143847.GC7447@mentat.za.net> Hi all, The build is still failing on winXP 64-bit, as shown on the buildbot page http://buildbot.scipy.org/Windows%20XP%20x86_64%20MSVC/builds/25/step-shell/0 with the error AttributeError: MSVCCompiler instance has no attribute '_MSVCCompiler__root' Could someone familiar with the MSVC compilers please take a look? 
Thanks St?fan From pearu at cens.ioc.ee Fri Jul 27 10:54:45 2007 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Fri, 27 Jul 2007 16:54:45 +0200 Subject: [Numpy-discussion] build on windows 64-bit platform In-Reply-To: <20070727143847.GC7447@mentat.za.net> References: <20070727143847.GC7447@mentat.za.net> Message-ID: <46AA0735.7020003@cens.ioc.ee> Stefan van der Walt wrote: > Hi all, > > The build is still failing on winXP 64-bit, as shown on the buildbot > page > > http://buildbot.scipy.org/Windows%20XP%20x86_64%20MSVC/builds/25/step-shell/0 > > with the error > > AttributeError: MSVCCompiler instance has no attribute '_MSVCCompiler__root' > > Could someone familiar with the MSVC compilers please take a look? I think the problem is in the environment of the buildbot machine `Windows XP x86_64 MSVC`. Basically, I would try setting the following environment variables in this machine: DISTUTILS_USE_SDK and MSSdk Then the build might succeed. For more information, read the code in Python distutils/msvccompiler.py file. Pearu From stefan at sun.ac.za Fri Jul 27 11:16:52 2007 From: stefan at sun.ac.za (Stefan van der Walt) Date: Fri, 27 Jul 2007 17:16:52 +0200 Subject: [Numpy-discussion] build on windows 64-bit platform In-Reply-To: <46AA0735.7020003@cens.ioc.ee> References: <20070727143847.GC7447@mentat.za.net> <46AA0735.7020003@cens.ioc.ee> Message-ID: <20070727151652.GE7447@mentat.za.net> On Fri, Jul 27, 2007 at 04:54:45PM +0200, Pearu Peterson wrote: > > > Stefan van der Walt wrote: > > Hi all, > > > > The build is still failing on winXP 64-bit, as shown on the buildbot > > page > > > > http://buildbot.scipy.org/Windows%20XP%20x86_64%20MSVC/builds/25/step-shell/0 > > > > with the error > > > > AttributeError: MSVCCompiler instance has no attribute '_MSVCCompiler__root' > > > > Could someone familiar with the MSVC compilers please take a look? > > I think the problem is in the environment of the buildbot machine > `Windows XP x86_64 MSVC`. Basically, I would try setting the following > environment variables in this machine: > DISTUTILS_USE_SDK and MSSdk > Then the build might succeed. > > For more information, read the code in Python distutils/msvccompiler.py > file. Thanks, Pearu -- I'll take a look. Why the uninformative error message, though? Isn't distutils supposed to automagically detect the MSVC compiler? Regards St?fan From pearu at cens.ioc.ee Fri Jul 27 11:35:39 2007 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Fri, 27 Jul 2007 17:35:39 +0200 Subject: [Numpy-discussion] build on windows 64-bit platform In-Reply-To: <20070727151652.GE7447@mentat.za.net> References: <20070727143847.GC7447@mentat.za.net> <46AA0735.7020003@cens.ioc.ee> <20070727151652.GE7447@mentat.za.net> Message-ID: <46AA10CB.6060206@cens.ioc.ee> Stefan van der Walt wrote: > On Fri, Jul 27, 2007 at 04:54:45PM +0200, Pearu Peterson wrote: >> >> Stefan van der Walt wrote: >>> Hi all, >>> >>> The build is still failing on winXP 64-bit, as shown on the buildbot >>> page >>> >>> http://buildbot.scipy.org/Windows%20XP%20x86_64%20MSVC/builds/25/step-shell/0 >>> >>> with the error >>> >>> AttributeError: MSVCCompiler instance has no attribute '_MSVCCompiler__root' >>> >>> Could someone familiar with the MSVC compilers please take a look? >> I think the problem is in the environment of the buildbot machine >> `Windows XP x86_64 MSVC`. Basically, I would try setting the following >> environment variables in this machine: >> DISTUTILS_USE_SDK and MSSdk >> Then the build might succeed. 
>> >> For more information, read the code in Python distutils/msvccompiler.py >> file. > > Thanks, Pearu -- I'll take a look. Why the uninformative error > message, though? Isn't distutils supposed to automagically detect the > MSVC compiler? I think this is bug in Python distutils/msvccompiler.py. Let me know if defining these variables work, then we can implement a workaround or show more informative messages on failure. Note that one may get such an error only on AMD64 platform, the MSVC compiler uses other code on Intel machines and that works automagically indeed. Pearu From zpincus at stanford.edu Fri Jul 27 12:58:04 2007 From: zpincus at stanford.edu (Zachary Pincus) Date: Fri, 27 Jul 2007 12:58:04 -0400 Subject: [Numpy-discussion] getting numPy happening for sciPy In-Reply-To: <46A993DB.7080706@iam.uni-stuttgart.de> References: <46A487F0.7070404@d2.net.au> <46A4A50D.80503@shrogers.com> <46A4D121.9040603@gmail.com> <46A5FE9D.7050205@shrogers.com> <46A6499A.2020608@gmail.com> <46A8817C.6000502@shrogers.com> <46A886BB.7070403@ar.media.kyoto-u.ac.jp> <46A8EED7.6080006@gmail.com> <46A94BB7.3070401@ar.media.kyoto-u.ac.jp> <46A993DB.7080706@iam.uni-stuttgart.de> Message-ID: On Jul 27, 2007, at 2:42 AM, Nils Wagner wrote: > I cannot reproduce the problem concerning #401. It is Mac specific > problem. Am I missing something ? I can't reproduce this problem either. I just yesterday built scipy from SVN on two different OS X 10.4.10 boxes, one using the fortran compiler from hpc.sourceforge.net (not the latest 2007 release, but the december 2006 one), and the other using the compiler from r.research.att.com/tools. Everything else was similar, and everything worked fine with regard to ticket 401. On the other hand, when I tried to compile scipy using the latest (2007-05) gfortran from hpc.sourceforge.net, I got bizarre link errors about MACOSX_DEPLOYMENT_TARGET being set incorrectly. (See previous email here http://projects.scipy.org/pipermail/scipy-user/ 2007-June/012542.html ). Interestingly, with the earlier version of gfortran from hpc.sourceforge.net, and with the r.research.att.com/ tools version, this problem does not arise. Anyhow, my point is that there are still odd linker errors (as in ticket 401) lurking that may or may not have anything to do with scipy per se, but might have to do with odd and perhaps buggy builds of gfortran. Feh -- I wish Apple would just start including a fortran compiler with the rest of their dev tools. The situation otherwise is not good. Zach From efiring at hawaii.edu Fri Jul 27 14:05:21 2007 From: efiring at hawaii.edu (Eric Firing) Date: Fri, 27 Jul 2007 08:05:21 -1000 Subject: [Numpy-discussion] Bug with MA and reduce? In-Reply-To: <3f7a6e1c0707270735q1bab53fbj1442891f92749ebb@mail.gmail.com> References: <3f7a6e1c0707270315y15865b8ay292391d752023303@mail.gmail.com> <3f7a6e1c0707270735q1bab53fbj1442891f92749ebb@mail.gmail.com> Message-ID: <46AA33E1.6090509@hawaii.edu> Ludwig M Brinckmann wrote: > This is a follow-up to an earlier mail that reported a suspected bug in > the reduce/minimum operation of numpy.ma . > > I have tried the same code with the scipy sandbox maskedarray > implementation and that gives me the correct output. For comparison: Yes, I think I see where the bug is coming from in numpy.ma; it is assigning fill values for min and max operations as if all integer types were the system default integer. Maskedarray uses a dictionary to assign fill values for these operations, so it correctly takes into account the type. 
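A rough sketch of the wrap-around just described, using illustrative values rather than the actual ma.py internals (the 2**31 - 1 fill below stands in for the default integer's maximum):

import numpy

# Filling masked slots with the default integer's maximum before reducing
# wraps around in an int16 array and then wins the minimum.
bad_fill = numpy.array(2**31 - 1).astype(numpy.int16)
print bad_fill                          # -1 after wrapping to int16

data = numpy.array([1, 3, -999], numpy.int16)   # -999 marks the masked slot
filled = numpy.where(data == -999, bad_fill, data)
print filled.min()                      # -1, the bogus minimum reported above

# A per-dtype lookup, as maskedarray uses, would fill with int16's own maximum:
print numpy.iinfo(numpy.int16).max      # 32767, which can never beat a real value

With a 32767 fill the masked slot cannot win the minimum, which matches the 1 that maskedarray reports above.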
Maskedarray has had much more recent development and maintenance than ma, and may replace it in numpy 1.1. I have not seen any objections to this proposal. From my standpoint, the sooner it happens, the better--unless someone raises a fundamental objection to the approach maskedarray is taking. Eric > > # import numpy.core.ma as MA > import maskedarray as MA > shape = (100) > data = numpy.ones(shape, numpy.int16) > data[2:40] = 3 > data[45:70] = -999 > mask = MA.make_mask_none (data.shape) > mask[data == -999] = True > ma = MA.MaskedArray(data, mask = mask) > min = MA.minimum.reduce(ma,0) > print min > > With maskedarray I get, as expected 1, with numpy.core.ma > I get -1, a value that is not in the array. > > I am using Python 2.44 on XP, the maskedarray is the svn latest, the > numpy.core.ma was 1.0.2, but I have tested it > with only the current svn version of ma.py and it produces the wrong output. > > > Ludwig > > > > On 27/07/07, *Ludwig M Brinckmann* > wrote: > > I have ma.minimum.reduce return a minimum value that does not exist > in the array. > > The following code prints -1 as the minimum of the MA, I believe it > should be 1. > > import numpy > shape = (100) > data = numpy.ones (shape, numpy.int16) > data[2:40] = 3 # dummy data > data[45:70] = -999 # null values > mask = numpy.ma.make_mask_none(data.shape) > mask[data == -999] = True > ma = numpy.ma.MaskedArray(data, mask = mask) > min = numpy.ma.minimum.reduce(ma,0) > print min > > Am I doing something really stupid here? > > Ludwig > > > > ------------------------------------------------------------------------ > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From pearu at cens.ioc.ee Fri Jul 27 18:54:52 2007 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Sat, 28 Jul 2007 00:54:52 +0200 Subject: [Numpy-discussion] build on windows 64-bit platform In-Reply-To: <46AA10CB.6060206@cens.ioc.ee> References: <20070727143847.GC7447@mentat.za.net> <46AA0735.7020003@cens.ioc.ee> <20070727151652.GE7447@mentat.za.net> <46AA10CB.6060206@cens.ioc.ee> Message-ID: <46AA77BC.6080408@cens.ioc.ee> Ok, I have now enabled DISTUTILS_USE_SDK for AMD64 Windows platform and it seems working.. However, the build still fails but now the reason seems to be related to numpy ticket 164: http://projects.scipy.org/scipy/numpy/ticket/164 Pearu I think buildbot is great! From stefan at sun.ac.za Fri Jul 27 20:28:51 2007 From: stefan at sun.ac.za (Stefan van der Walt) Date: Sat, 28 Jul 2007 02:28:51 +0200 Subject: [Numpy-discussion] build on windows 64-bit platform In-Reply-To: <46AA77BC.6080408@cens.ioc.ee> References: <20070727143847.GC7447@mentat.za.net> <46AA0735.7020003@cens.ioc.ee> <20070727151652.GE7447@mentat.za.net> <46AA10CB.6060206@cens.ioc.ee> <46AA77BC.6080408@cens.ioc.ee> Message-ID: <20070728002850.GF7447@mentat.za.net> On Sat, Jul 28, 2007 at 12:54:52AM +0200, Pearu Peterson wrote: > Ok, I have now enabled DISTUTILS_USE_SDK for > AMD64 Windows platform and it seems working.. Fantastic, thanks! > However, the build still fails but now the > reason seems to be related to numpy ticket 164: > > http://projects.scipy.org/scipy/numpy/ticket/164 I'll ask Albert whether he would have a look at it again. 
Cheers St?fan From fullung at gmail.com Fri Jul 27 23:19:03 2007 From: fullung at gmail.com (Albert Strasheim) Date: Sat, 28 Jul 2007 05:19:03 +0200 Subject: [Numpy-discussion] build on windows 64-bit platform In-Reply-To: <20070728002850.GF7447@mentat.za.net> References: <20070727143847.GC7447@mentat.za.net> <46AA0735.7020003@cens.ioc.ee> <20070727151652.GE7447@mentat.za.net> <46AA10CB.6060206@cens.ioc.ee> <46AA77BC.6080408@cens.ioc.ee> <20070728002850.GF7447@mentat.za.net> Message-ID: <20070728031903.GA25416@dogbert.sdsl.sun.ac.za> Hello all On Sat, 28 Jul 2007, Stefan van der Walt wrote: > On Sat, Jul 28, 2007 at 12:54:52AM +0200, Pearu Peterson wrote: > > Ok, I have now enabled DISTUTILS_USE_SDK for > > AMD64 Windows platform and it seems working.. > > Fantastic, thanks! > > > However, the build still fails but now the > > reason seems to be related to numpy ticket 164: > > > > http://projects.scipy.org/scipy/numpy/ticket/164 > > I'll ask Albert whether he would have a look at it again. Let's see. Using this build log: http://buildbot.scipy.org/Windows%20XP%20x86_64%20MSVC/builds/31/step-shell/0 numpy\core\src\umathmodule.c.src(73) : warning C4273: 'logf' : inconsistent dll linkage numpy\core\src\umathmodule.c.src(74) : warning C4273: 'sqrtf' : inconsistent dll linkage Judging from the math.h on my 32-bit system, these declarations should look like this: float __cdecl logf(float); float __cdecl sqrtf(float); but they're missing the __cdecl in the NumPy code. Somewhere a macro needs to be defined to __cdecl on Windows (and left empty on other platforms) and including in the NumPy declarations. numpy\core\src\umathmodule.c.src(604) : warning C4013: 'fabsf' undefined; assuming extern returning int numpy\core\src\umathmodule.c.src(604) : warning C4013: 'hypotf' undefined; assuming extern returning int Judging from the patch attached to ticket #164, these functions aren't available for some reason. Maybe check the header to see if there's a way to turn them on using some preprocessor magic. If not, do what the patch does. numpy\core\src\umathmodule.c.src(604) : warning C4244: 'function' : conversion from 'int' to 'float', possible loss of data A cast should suppress this warning. numpy\core\src\umathmodule.c.src(625) : warning C4013: 'rintf' undefined; assuming extern returning int Add this function like the patch does. 
numpy\core\src\umathmodule.c.src(625) : warning C4244: '=' : conversion from 'int' to 'float', possible loss of data numpy\core\src\umathmodule.c.src(626) : warning C4244: '=' : conversion from 'int' to 'float', possible loss of data numpy\core\src\umathmodule.c.src(632) : warning C4244: 'initializing' : conversion from 'int' to 'float', possible loss of data numpy\core\src\umathmodule.c.src(641) : warning C4244: 'initializing' : conversion from 'int' to 'float', possible loss of data numpy\core\src\umathmodule.c.src(1107) : warning C4244: '=' : conversion from 'double' to 'float', possible loss of data numpy\core\src\umathmodule.c.src(1107) : warning C4244: '=' : conversion from 'double' to 'float', possible loss of data numpy\core\src\umathmodule.c.src(1107) : warning C4244: '=' : conversion from 'double' to 'float', possible loss of data numpy\core\src\umathmodule.c.src(1107) : warning C4244: '=' : conversion from 'double' to 'float', possible loss of data numpy\core\src\umathmodule.c.src(1349) : warning C4244: '=' : conversion from 'npy_longlong' to 'double', possible loss of data numpy\core\src\umathmodule.c.src(1350) : warning C4244: '=' : conversion from 'npy_longlong' to 'double', possible loss of data numpy\core\src\umathmodule.c.src(1349) : warning C4244: '=' : conversion from 'npy_ulonglong' to 'double', possible loss of data numpy\core\src\umathmodule.c.src(1350) : warning C4244: '=' : conversion from 'npy_ulonglong' to 'double', possible loss of data More casts probably. numpy\core\src\umathmodule.c.src(1583) : warning C4146: unary minus operator applied to unsigned type, result still unsigned numpy\core\src\umathmodule.c.src(1583) : warning C4146: unary minus operator applied to unsigned type, result still unsigned numpy\core\src\umathmodule.c.src(1583) : warning C4146: unary minus operator applied to unsigned type, result still unsigned Potential bugs. Look closely at these. numpy\core\src\umathmodule.c.src(1625) : warning C4244: '=' : conversion from 'int' to 'float', possible loss of data Cast. numpy\core\src\umathmodule.c.src(2013) : warning C4013: 'frexpf' undefined; assuming extern returning int Add this function. numpy\core\src\umathmodule.c.src(2013) : warning C4244: '=' : conversion from 'int' to 'float', possible loss of data Cast probably. numpy\core\src\umathmodule.c.src(2030) : warning C4013: 'ldexpf' undefined; assuming extern returning int Add this function. numpy\core\src\umathmodule.c.src(2030) : warning C4244: '=' : conversion from 'int' to 'float', possible loss of data Cast probably. 
build\src.win32-2.5\numpy\core\__umath_generated.c(15) : error C2099: initializer is not a constant build\src.win32-2.5\numpy\core\__umath_generated.c(21) : error C2099: initializer is not a constant build\src.win32-2.5\numpy\core\__umath_generated.c(27) : error C2099: initializer is not a constant build\src.win32-2.5\numpy\core\__umath_generated.c(30) : error C2099: initializer is not a constant build\src.win32-2.5\numpy\core\__umath_generated.c(45) : error C2099: initializer is not a constant build\src.win32-2.5\numpy\core\__umath_generated.c(45) : error C2099: initializer is not a constant build\src.win32-2.5\numpy\core\__umath_generated.c(51) : error C2099: initializer is not a constant build\src.win32-2.5\numpy\core\__umath_generated.c(54) : error C2099: initializer is not a constant build\src.win32-2.5\numpy\core\__umath_generated.c(63) : error C2099: initializer is not a constant build\src.win32-2.5\numpy\core\__umath_generated.c(72) : error C2099: initializer is not a constant build\src.win32-2.5\numpy\core\__umath_generated.c(72) : error C2099: initializer is not a constant build\src.win32-2.5\numpy\core\__umath_generated.c(78) : error C2099: initializer is not a constant build\src.win32-2.5\numpy\core\__umath_generated.c(114) : error C2099: initializer is not a constant build\src.win32-2.5\numpy\core\__umath_generated.c(153) : error C2099: initializer is not a constant build\src.win32-2.5\numpy\core\__umath_generated.c(174) : error C2099: initializer is not a constant build\src.win32-2.5\numpy\core\__umath_generated.c(177) : error C2099: initializer is not a constant build\src.win32-2.5\numpy\core\__umath_generated.c(189) : error C2099: initializer is not a constant build\src.win32-2.5\numpy\core\__umath_generated.c(192) : error C2099: initializer is not a constant For whatever reason, these function pointers aren't constant, so I think the initialization should be moved into the InitOperators function if that makes sense to do. numpy\core\src\ufuncobject.c(717) : warning C4244: '=' : conversion from 'Py_ssize_t' to 'int', possible loss of data numpy\core\src\ufuncobject.c(1130) : warning C4244: '=' : conversion from 'Py_ssize_t' to 'int', possible loss of data numpy\core\src\ufuncobject.c(1451) : warning C4244: '=' : conversion from 'npy_intp' to 'int', possible loss of data numpy\core\src\ufuncobject.c(1452) : warning C4244: '=' : conversion from 'npy_intp' to 'int', possible loss of data numpy\core\src\ufuncobject.c(2113) : warning C4244: '=' : conversion from 'npy_intp' to 'int', possible loss of data numpy\core\src\ufuncobject.c(2962) : warning C4244: '=' : conversion from 'Py_ssize_t' to 'int', possible loss of data Potential bugs. Look closely at these. Otherwise cast to suppress the warnings. Hope this helped. Cheers, Albert From fullung at gmail.com Fri Jul 27 23:24:49 2007 From: fullung at gmail.com (Albert Strasheim) Date: Sat, 28 Jul 2007 05:24:49 +0200 Subject: [Numpy-discussion] build on windows 64-bit platform In-Reply-To: <20070728031903.GA25416@dogbert.sdsl.sun.ac.za> References: <20070727143847.GC7447@mentat.za.net> <46AA0735.7020003@cens.ioc.ee> <20070727151652.GE7447@mentat.za.net> <46AA10CB.6060206@cens.ioc.ee> <46AA77BC.6080408@cens.ioc.ee> <20070728002850.GF7447@mentat.za.net> <20070728031903.GA25416@dogbert.sdsl.sun.ac.za> Message-ID: <20070728032449.GB25416@dogbert.sdsl.sun.ac.za> Hello On Sat, 28 Jul 2007, Albert Strasheim wrote: > float __cdecl logf(float); > float __cdecl sqrtf(float); > > but they're missing the __cdecl in the NumPy code. 
Somewhere a macro > needs to be defined to __cdecl on Windows (and left empty on other > platforms) and including in the NumPy declarations. included > numpy\core\src\umathmodule.c.src(632) : warning C4244: 'initializing' : conversion from 'int' to 'float', possible loss of data > numpy\core\src\umathmodule.c.src(641) : warning C4244: 'initializing' : conversion from 'int' to 'float', possible loss of data > > More casts probably. Looks like initializing these values with a float value (e.g., 0.0f and not 0) will fix these. If it's hard to modify the code generate to do this, a cast should be fine. Cheers, Albert From pearu at cens.ioc.ee Sat Jul 28 16:09:23 2007 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Sat, 28 Jul 2007 22:09:23 +0200 Subject: [Numpy-discussion] build on windows 64-bit platform In-Reply-To: <20070728032449.GB25416@dogbert.sdsl.sun.ac.za> References: <20070727143847.GC7447@mentat.za.net> <46AA0735.7020003@cens.ioc.ee> <20070727151652.GE7447@mentat.za.net> <46AA10CB.6060206@cens.ioc.ee> <46AA77BC.6080408@cens.ioc.ee> <20070728002850.GF7447@mentat.za.net> <20070728031903.GA25416@dogbert.sdsl.sun.ac.za> <20070728032449.GB25416@dogbert.sdsl.sun.ac.za> Message-ID: <46ABA273.70001@cens.ioc.ee> Hi, I finally got numpy to build on Windows XP x86_64 MSVC. The code needed to get it work is within DISTUTILS_USE_SDK defines. However, the tests fail on importing numpy: the package is not found. Could someone with access to this machine take a look at the configuration of installing numpy? Regards, Pearu From fullung at gmail.com Sat Jul 28 17:34:18 2007 From: fullung at gmail.com (Albert Strasheim) Date: Sat, 28 Jul 2007 23:34:18 +0200 Subject: [Numpy-discussion] Intel MKL 9.1 on Windows (was: Re: VMWare Virtual Appliance...) In-Reply-To: <20070612010615.GA24822@dogbert.sdsl.sun.ac.za> References: <20070612010615.GA24822@dogbert.sdsl.sun.ac.za> Message-ID: <20070728213417.GA8793@dogbert.sdsl.sun.ac.za> Hello all Turns out there's a third option: [mkl] include_dirs = C:\Program Files\Intel\MKL\9.1\include library_dirs = C:\Program Files\Intel\MKL\9.1\ia32\lib mkl_libs = mkl_c_dll, libguide40 lapack_libs = mkl_lapack Note mkl_c_dll, not mkl_c. From what I understand from the MKL release notes and user guide, one should be able to mix mkl_c (a static library) and libguide40 (a dynamic library), but this seems to crash in practice. Anyway, with the site.cfg as above, NumPy passes its tests with MKL 9.1 on 32-bit Windows. Whoot! Regards, Albert On Tue, 12 Jun 2007, Albert Strasheim wrote: > Cancel that. It seems the problems with these two tests are being > caused by Intel MKL 9.1 on Windows. However, 9.0 works fine. > > You basically you have 2 options when linking against MKL on Windows as > far as the mkl_libs go. > > [mkl] > include_dirs = C:\Program Files\Intel\MKL\9.1\include > library_dirs = C:\Program Files\Intel\MKL\9.1\ia32\lib > mkl_libs = mkl_c, libguide40 > lapack_libs = mkl_lapack > > or > > mkl_libs = mkl_c, libguide > > I think libguide is the library that contains various thread and OpenMP > related bits and pieces. If you link against libguide, you get the > following error when running the NumPy tests: > > OMP abort: Initializing libguide.lib, but found libguide.lib already initialized. > This may cause performance degradation and correctness issues. > Set environment variable KMP_DUPLICATE_LIB_OK=TRUE to ignore > this problem and force the program to continue anyway. 
> Please note that the use of KMP_DUPLICATE_LIB_OK is unsupported > and using it may cause undefined behavior. > For more information, please contact Intel(R) Premier Support. > > I think this happens because multiple submodules inside NumPy are > linked against this libguide library, but this caused some > initialization code to be executed multiple times inside the same > process, which shouldn't happen. > > If one sets KMP_DUPLICATE_LIB_OK=TRUE, the tests actually work with > Intel MKL 9.0, but no matter what you do with Intel MKL 9.1, i.e., > > - link against libguide40 or > - link against libguide and don't set KMP_... or > - link against libguide and set KMP_... > > the following tests always segfault: > > numpy.core.tests.test_defmatrix.test_casting.check_basic > numpy.core.tests.test_numeric.test_dot.check_matmat > > Cheers, > > Albert > > On Tue, 12 Jun 2007, Albert Strasheim wrote: > > > I've set up a 32-bit Windows XP guest inside VMWare Server 1.0.3 on a 64-bit > > Linux machine and two of the NumPy tests are segfaulting for some strange > > reason. They are: > > > > numpy.core.tests.test_defmatrix.test_casting.check_basic > > numpy.core.tests.test_numeric.test_dot.check_matmat > > > > Do these pass for you? I'm inclined to blame VMWare at this point... > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From William.T.Bridgman.1 at gsfc.nasa.gov Mon Jul 16 13:19:05 2007 From: William.T.Bridgman.1 at gsfc.nasa.gov (W.T. Bridgman) Date: Mon, 16 Jul 2007 13:19:05 -0400 Subject: [Numpy-discussion] [AstroPy] Porting "IDL Astronomy User's Library" to numpy In-Reply-To: <1F2751F6-60DF-4306-B254-D3814F1EF885@stsci.edu> References: <469B6EE5.2050908@ipnl.in2p3.fr> <1F2751F6-60DF-4306-B254-D3814F1EF885@stsci.edu> Message-ID: <9DF78AC3-C057-489D-84FB-393371DAAC6A@gsfc.nasa.gov> Perry, I believe some of those documents are getting a bit dated. They still refer to only supporting numarray vs Numeric. Don't those need to be updated to specify numpy? Newcomers to the list might be confused if not familiar with the history, especially considering the numpy begat numeric begat numarray begat numpy timeline. Tom On Jul 16, 2007, at 12:03 PM, Perry Greenfield wrote: > > On Jul 16, 2007, at 9:13 AM, Yannick Copin wrote: > >> Hi, >> >> I'd be interested in some astronomical utilities from the IDL >> Astronomy User's >> Library (http://idlastro.gsfc.nasa.gov/contents.html) converted to >> python/numpy. I had a look to idl2python >> (http://software.pseudogreen.org/i2py/), but the automatic >> translation fails, >> mostly because (I think) the conversion is Numeric-oriented, and >> because of >> the intrinsic differences in the function argument management >> between IDL and >> python. >> >> So, before pursuing in this direction, I'd like to know if this >> exercice has >> already been done, at least partially. >> > We have the idea of doing it, but not in a very literal sense > (translating idl routines to Python counterparts). There has been > some work in this area on our part, but because of budget pressures, > much less over the last 2 years than hoped (things are looking better > now, but it may be some months before activity picks up on this front > again). > > Work so far on our part has centered on: > > Coordinate transformation utilities > Synthetic photometry > > So if you are interested in doing more literal translations, please > feel free to go right ahead. 
There is even a place to put such stuff: > http://www.scipy.org/AstroLib > > Perry > _______________________________________________ > AstroPy mailing list > AstroPy at scipy.org > http://lists.astropy.scipy.org/mailman/listinfo/astropy > -- Dr. William T."Tom" Bridgman Scientific Visualization Studio Global Science & Technology, Inc. NASA/Goddard Space Flight Center Email: William.T.Bridgman.1 at gsfc.nasa.gov Code 610.3 Phone: 301-286-1346 Greenbelt, MD 20771 FAX: 301-286-1634 http://svs.gsfc.nasa.gov/ From goddard at cgl.ucsf.edu Wed Jul 18 13:01:10 2007 From: goddard at cgl.ucsf.edu (Tom Goddard) Date: Wed, 18 Jul 2007 10:01:10 -0700 Subject: [Numpy-discussion] a.flat[3:7] is a copy? Message-ID: <469E4756.3060801@cgl.ucsf.edu> Does taking a slice of a flatiter always make a copy? That appears to be the behaviour in numpy 1.0.3. For example a.flat[1:3][0] = 5 does not modify the original array a, even when a is contiguous. Is this a bug? >>> import numpy as n >>> n.version.version '1.0.3' >>> a = n.zeros((2,3), n.int32) >>> a array([[0, 0, 0], [0, 0, 0]]) >>> b = a.flat[1:3] >>> b[0] = 5 >>> b array([5, 0]) >>> a array([[0, 0, 0], [0, 0, 0]]) >>> b = None >>> a array([[0, 0, 0], [0, 0, 0]]) This behavior does not seem to match what is described in the Numpy book (Dec 7, 2006 version), section 3.1.3 "Other attributes", page 51: "flat Returns an iterator object (numpy.flatiter) that acts like a 1-d version of the array. 1-d indexing works on this array and it can be passed in to most routines as an array wherein a 1-d array will be constructed from it. The new 1-d array will reference this array's data if this array is C-style contiguous, otherwise, new memory will be allocated for the 1-d array, the UPDATEIFCOPY flag will be set for the new array, and this array will have its WRITEABLE flag set FALSE until the the last reference to the new array disappears. When the last reference to the new 1-d array disappears, the data will be copied over to this non-contiguous array. This is done so that a.flat effectively references the current array regardless of whether or not it is contiguous or non-contiguous." Tom Goddard UC San Francisco From zyzhu2000 at gmail.com Tue Jul 24 11:22:00 2007 From: zyzhu2000 at gmail.com (computer_guy) Date: Tue, 24 Jul 2007 15:22:00 -0000 Subject: [Numpy-discussion] Build external C functions that take numpy arrays as parameters and return numpy arrays Message-ID: <1185290520.026238.23680@w3g2000hsg.googlegroups.com> Hi Everyone, I am going to write some external C functions that takes in numpy arrays as parameters and return numpy arrays. I have the following questions: 1. What should I do in my C code? 2. Can I use any C compiler to build my library that takes numpy arrays? I am using Windows XP and Visual Studio 2005. 3. How can I generate the python binding? Thanks, cg From pwilliams at astro.berkeley.edu Fri Jul 20 15:24:24 2007 From: pwilliams at astro.berkeley.edu (Peter Williams) Date: Fri, 20 Jul 2007 12:24:24 -0700 Subject: [Numpy-discussion] f2py: returning strings? Message-ID: <1184959464.4726.615.camel@cosmic.berkeley.edu> Hi, I have some fortran code that I'm trying to wrap with f2py. The f2py mailing list seems really dead, so I thought I'd try asking here. My problem is that I can't get a routine to return a string (using the f2py from numpy 1.0.3). The Fortran looks like this: subroutine uvDatGta(object,aval) implicit none character object*(*),aval*(*) ... where object is an input and aval is an output. 
I've tried passing aval as a numpy.chararray and using all sorts of intent() flags in the f2py headers. No matter what I do, though, f2py seems to want to treat aval as a char *, and it seems that it doesn't have the facilities to map strings as outputs from routines. Is there a way to get f2py to make this work? Ideally, I'd like to be able to go aval = uvdatgta (object) and get a plain Python string back, but I can always write a wrapper in Python around some chararray stuff. Thanks, Peter -- Peter Williams / pwilliams at astro.berkeley.edu Department of Astronomy, UC Berkeley From ludwigbrinckmann at gmail.com Tue Jul 24 04:27:23 2007 From: ludwigbrinckmann at gmail.com (Ludwig M Brinckmann) Date: Tue, 24 Jul 2007 09:27:23 +0100 Subject: [Numpy-discussion] Downsampling array, retaining min and max values in window Message-ID: <3f7a6e1c0707240127u1b6b514cy549c6ea60497c873@mail.gmail.com> Hi there, I have a large array, lets say 40000 * 512, which I need to downsample by a factor of 4 in the y direction, by factor 3 in the x direction, so that my resulting arrays are 10000 * 170 (or 171 this does not matter very much) - but of all the values I will need to retain in the downsampled arrays the minimum and maximum of the original data, rather than computing an average or just picking every third/fourth value in the array. So essentially I have a 4*3 window, for which I need the min and max in this window, and store the result of applying this window to the original array as my results. What is the best way to do this? Regards Ludwig -------------- next part -------------- An HTML attachment was scrubbed... URL: From Csaba.Kiss at scienomics.com Tue Jul 24 10:50:03 2007 From: Csaba.Kiss at scienomics.com (Csaba) Date: Tue, 24 Jul 2007 16:50:03 +0200 Subject: [Numpy-discussion] numpy-1.0.3 compile-problems on WindowsXP/Visual Studio 2005 Message-ID: <000501c7ce01$f9ed34f0$7200a8c0@KCPC1> Hi, did somebody suceeded to compile/install numpy-1.0.3 from the sources under WindowsXP with Visual Studio 2005? Shortly after issuing the > python setup.py build command, the build breaks with a strange error and even stranger recommendation: > ............................. > building extension "numpy.core.multiarray" sources > Generating build\src.win32-2.5\numpy\core\config.h > No module named msvccompiler in numpy.distutils, trying from distutils.. > error: Python was built with Visual Studio 2003; > extensions must be built with a compiler than can generate compatible binaries. > Visual Studio 2003 was not found on this system. If you have Cygwin installed, > you can try compiling with MingW32, by passing "-c mingw32" to setup.py. > I successfully built and used Python-2.5, Qt-4.3.0, sip-4.5.2 and PyQwt-5.0.0 from the very same environment ( WindowsXP, nmake from VS-2005, Platform SDK, etc.). Everything is built with the very same binaries and tools behind, why would numpy-1.0.3's setup.py beleave that I had or ever used VS-2003? > extensions must be built with a compiler than can generate compatible binaries. I don't really understand this neither, compattible to whom or what? Please help me. Best Regards, Csaba F. Kiss - csaba.kiss at scienomics.com -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From amirh at MIT.EDU Wed Jul 25 16:45:42 2007 From: amirh at MIT.EDU (Amir Hirsch) Date: Wed, 25 Jul 2007 16:45:42 -0400 Subject: [Numpy-discussion] Installing Numpy on Python 2.3 Windows Message-ID: <20070725164542.1gynf50vpsqsogkk@webmail.mit.edu> Hi Everyone, I'm trying to install the Numpy package on Python 2.3 running under Windows. I downloaded numpy-1.0.3.win32-py2.3.exe and ran it, and it complains that "Python version 2.3 required, which was not found in the registry" The Python 2.3 installation I am using came with OpenOffice.org 2.2 and it must not have registered python with Windows. I require PyUNO and Numpy (and PyOpenGL and Ctypes) to work together for the application I am developing and PyUno seems only to work with the OOo distribution of Python 2.3. I have gotten this all to work in multiple Linux environments (after much pain) with a script that installs all the necessary libraries from source, which will not work in the Windows environment. I'm not familiar with Windows, and any help would be greatly appreciated! Amir From tom.duck at dal.ca Wed Jul 25 23:01:05 2007 From: tom.duck at dal.ca (Thomas J. Duck) Date: Thu, 26 Jul 2007 00:01:05 -0300 Subject: [Numpy-discussion] Memory leak for in-place Numeric+numpy addition Message-ID: <01895508-84D8-4214-B7F6-50B637B1D2F7@dal.ca> Hi, There seems to be a memory leak when arrays are added in-place for mixed Numeric/numpy applications. For example, memory usage quickly ramps up when the following program is executed: import Numeric,numpy x = Numeric.zeros((2000,2000),typecode=Numeric.Float64) for j in range(200): print j y = numpy.zeros((2000,2000),dtype=numpy.float64) x += y If I use exclusively Numeric arrays, or exclusively numpy arrays, or add a Numeric array in-place to a numpy array, there is not a problem. It is only in the case that a numpy array is added in place to a Numeric array that the leak exists. Deleting the variable y in each iteration has no effect. I am using numpy 1.0.1-8 and Numeric 24.2-7 under Debian linux. I'm not sure if this is a numpy or Numeric problem, but thought I would send it along in case there is interest and the problem can be resolved. Unfortunately, I can't move to an exclusively numpy or Numeric approach because of the other packages that I depend on. Thanks, Tom From sklein at cpcug.org Thu Jul 26 18:01:46 2007 From: sklein at cpcug.org (Stanley A. Klein) Date: Thu, 26 Jul 2007 18:01:46 -0400 (EDT) Subject: [Numpy-discussion] How do I get numpy.distutils.core setup to properly handle an optimize=1 in setup.cfg when doing bdist_rpm? Message-ID: <1683.207.188.248.157.1185487306.squirrel@www.cpcug.org> I'm trying to do an rpm package of enthought kiva for a Fedora 5 system. Enthought kiva imports numpy.distutils.core setup, probably because it has a lot of mathematical functions to compile as part of the package. Doing bdist_rpm for Fedora requires using a setup.cfg that contains [install] optimize=1 because SE-Linux needs to know all the files involved, including the pyc and pyo files, and this is best done by having distutils/setuptools both create the pyc/pyo files and the rpm spec file. However, when I run the enthought kiva setup.py with the proper setup.cfg, I get an error ("unpackaged files") indicating that the numpy distutils did not properly process the setup.cfg file. How do I fix it? Thanks. 
Stan Klein From sven.prevrhal at radiology.ucsf.edu Fri Jul 27 14:03:19 2007 From: sven.prevrhal at radiology.ucsf.edu (Sven Prevrhal) Date: Fri, 27 Jul 2007 11:03:19 -0700 Subject: [Numpy-discussion] numarray problem when compiling WrapITK external project PyBuffer Message-ID: <006301c7d078$69e867c0$791f3640@anatta> (NumPy community, read on this only seems unrelated) I use NumPy 1.0.3 , MSVC 2005 Express, ITK CVS from 23Jul2007, WrapITK for Python only, wrap for the standard types plus signed short (imho that should be switched on by default because many DICOM images are in that type). BTW I had to include the /bigobj compiler switch in the CMakeLists.txt for ITK for that to work. ItkVtkGlue compiled, but PyBuffer gave errors: 1>c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\i tkPyBuffer.txx(46) : error C2561: 'itk::PyBuffer::GetArrayFromImage' : function must return a value and 1>c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\i tkPyBuffer.txx(77) : error C2561: 'itk::PyBuffer::GetImageFromArray' : function must return a value at several occasions. At these line numbers, the numarray function import_array() is called. I checked the numarray code and import_array() is a #define that does not return a value: #define import_array() {if (_import_array() < 0) {PyErr_Print(); PyErr_SetString(PyExc_ImportError, "numpy.core.multiarray failed to import"); return; } } I realize that's more a problem with numarray than PyBuffer. Since that's inline-used the compiler is probably thwarted by the return inside the macro. Consequently, if I replace the #define with a void to make it a real function call PyBuffer compiles just fine. Thanks, Sven Here is the whole build log: 1>------ Build started: Project: _BufferConversionPython, Configuration: Release Win32 ------ 1>Generating wrap_itkPyBuffer.xml 1>Generating wrap_itkPyBuffer.idx 1>Generating wrap_itkPyBufferPython.cxx 1>Generating wrap_BufferConversionPythonPython.cxx 1>create swig package BufferConversionPython 1> init module: itkPyBuffer 1>Compiling... 
1>wrap_itkPyBufferPython.cxx 1>c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\i tkPyBuffer.txx(46) : error C2561: 'itk::PyBuffer::GetArrayFromImage' : function must return a value 1> with 1> [ 1> TImage=itk::Image 1> ] 1> c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\itk PyBuffer.h(73) : see declaration of 'itk::PyBuffer::GetArrayFromImage' 1> with 1> [ 1> TImage=itk::Image 1> ] 1> c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\itk PyBuffer.txx(38) : while compiling class template member function 'PyObject *itk::PyBuffer::GetArrayFromImage(itk::Image *)' 1> with 1> [ 1> TImage=itk::Image, 1> TPixel=unsigned char, 1> VImageDimension=3 1> ] 1> .\wrap_itkPyBufferPython.cxx(1542) : see reference to class template instantiation 'itk::PyBuffer' being compiled 1> with 1> [ 1> TImage=itk::Image 1> ] 1>c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\i tkPyBuffer.txx(77) : error C2561: 'itk::PyBuffer::GetImageFromArray' : function must return a value 1> with 1> [ 1> TImage=itk::Image 1> ] 1> c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\itk PyBuffer.h(78) : see declaration of 'itk::PyBuffer::GetImageFromArray' 1> with 1> [ 1> TImage=itk::Image 1> ] 1> c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\itk PyBuffer.txx(75) : while compiling class template member function 'const itk::SmartPointer itk::PyBuffer::GetImageFromArray(PyObject *)' 1> with 1> [ 1> TObjectType=itk::Image, 1> TImage=itk::Image 1> ] 1>c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\i tkPyBuffer.txx(46) : error C2561: 'itk::PyBuffer::GetArrayFromImage' : function must return a value 1> with 1> [ 1> TImage=itk::Image 1> ] 1> c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\itk PyBuffer.h(73) : see declaration of 'itk::PyBuffer::GetArrayFromImage' 1> with 1> [ 1> TImage=itk::Image 1> ] 1> c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\itk PyBuffer.txx(38) : while compiling class template member function 'PyObject *itk::PyBuffer::GetArrayFromImage(itk::Image *)' 1> with 1> [ 1> TImage=itk::Image, 1> TPixel=float, 1> VImageDimension=3 1> ] 1> .\wrap_itkPyBufferPython.cxx(1646) : see reference to class template instantiation 'itk::PyBuffer' being compiled 1> with 1> [ 1> TImage=itk::Image 1> ] 1>c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\i tkPyBuffer.txx(77) : error C2561: 'itk::PyBuffer::GetImageFromArray' : function must return a value 1> with 1> [ 1> TImage=itk::Image 1> ] 1> c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\itk PyBuffer.h(78) : see declaration of 'itk::PyBuffer::GetImageFromArray' 1> with 1> [ 1> TImage=itk::Image 1> ] 1> c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\itk PyBuffer.txx(75) : while compiling class template member function 'const itk::SmartPointer itk::PyBuffer::GetImageFromArray(PyObject *)' 1> with 1> [ 1> TObjectType=itk::Image, 1> TImage=itk::Image 1> ] 1>c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\i tkPyBuffer.txx(46) : error C2561: 'itk::PyBuffer::GetArrayFromImage' : function must return a value 1> with 1> [ 1> TImage=itk::Image 1> ] 1> c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\itk PyBuffer.h(73) : see declaration of 'itk::PyBuffer::GetArrayFromImage' 1> with 1> [ 1> TImage=itk::Image 1> ] 1> 
c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\itk PyBuffer.txx(38) : while compiling class template member function 'PyObject *itk::PyBuffer::GetArrayFromImage(itk::Image *)' 1> with 1> [ 1> TImage=itk::Image, 1> TPixel=float, 1> VImageDimension=2 1> ] 1> .\wrap_itkPyBufferPython.cxx(1750) : see reference to class template instantiation 'itk::PyBuffer' being compiled 1> with 1> [ 1> TImage=itk::Image 1> ] 1>c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\i tkPyBuffer.txx(77) : error C2561: 'itk::PyBuffer::GetImageFromArray' : function must return a value 1> with 1> [ 1> TImage=itk::Image 1> ] 1> c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\itk PyBuffer.h(78) : see declaration of 'itk::PyBuffer::GetImageFromArray' 1> with 1> [ 1> TImage=itk::Image 1> ] 1> c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\itk PyBuffer.txx(75) : while compiling class template member function 'const itk::SmartPointer itk::PyBuffer::GetImageFromArray(PyObject *)' 1> with 1> [ 1> TObjectType=itk::Image, 1> TImage=itk::Image 1> ] 1>c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\i tkPyBuffer.txx(46) : error C2561: 'itk::PyBuffer::GetArrayFromImage' : function must return a value 1> with 1> [ 1> TImage=itk::Image 1> ] 1> c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\itk PyBuffer.h(73) : see declaration of 'itk::PyBuffer::GetArrayFromImage' 1> with 1> [ 1> TImage=itk::Image 1> ] 1> c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\itk PyBuffer.txx(38) : while compiling class template member function 'PyObject *itk::PyBuffer::GetArrayFromImage(itk::Image *)' 1> with 1> [ 1> TImage=itk::Image, 1> TPixel=unsigned short, 1> VImageDimension=3 1> ] 1> .\wrap_itkPyBufferPython.cxx(1854) : see reference to class template instantiation 'itk::PyBuffer' being compiled 1> with 1> [ 1> TImage=itk::Image 1> ] 1>c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\i tkPyBuffer.txx(77) : error C2561: 'itk::PyBuffer::GetImageFromArray' : function must return a value 1> with 1> [ 1> TImage=itk::Image 1> ] 1> c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\itk PyBuffer.h(78) : see declaration of 'itk::PyBuffer::GetImageFromArray' 1> with 1> [ 1> TImage=itk::Image 1> ] 1> c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\itk PyBuffer.txx(75) : while compiling class template member function 'const itk::SmartPointer itk::PyBuffer::GetImageFromArray(PyObject *)' 1> with 1> [ 1> TObjectType=itk::Image, 1> TImage=itk::Image 1> ] 1>c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\i tkPyBuffer.txx(46) : error C2561: 'itk::PyBuffer::GetArrayFromImage' : function must return a value 1> with 1> [ 1> TImage=itk::Image 1> ] 1> c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\itk PyBuffer.h(73) : see declaration of 'itk::PyBuffer::GetArrayFromImage' 1> with 1> [ 1> TImage=itk::Image 1> ] 1> c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\itk PyBuffer.txx(38) : while compiling class template member function 'PyObject *itk::PyBuffer::GetArrayFromImage(itk::Image *)' 1> with 1> [ 1> TImage=itk::Image, 1> TPixel=unsigned char, 1> VImageDimension=2 1> ] 1> .\wrap_itkPyBufferPython.cxx(1958) : see reference to class template instantiation 'itk::PyBuffer' being compiled 1> with 1> [ 1> TImage=itk::Image 1> ] 
1>c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\i tkPyBuffer.txx(77) : error C2561: 'itk::PyBuffer::GetImageFromArray' : function must return a value 1> with 1> [ 1> TImage=itk::Image 1> ] 1> c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\itk PyBuffer.h(78) : see declaration of 'itk::PyBuffer::GetImageFromArray' 1> with 1> [ 1> TImage=itk::Image 1> ] 1> c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\itk PyBuffer.txx(75) : while compiling class template member function 'const itk::SmartPointer itk::PyBuffer::GetImageFromArray(PyObject *)' 1> with 1> [ 1> TObjectType=itk::Image, 1> TImage=itk::Image 1> ] 1>c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\i tkPyBuffer.txx(46) : error C2561: 'itk::PyBuffer::GetArrayFromImage' : function must return a value 1> with 1> [ 1> TImage=itk::Image 1> ] 1> c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\itk PyBuffer.h(73) : see declaration of 'itk::PyBuffer::GetArrayFromImage' 1> with 1> [ 1> TImage=itk::Image 1> ] 1> c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\itk PyBuffer.txx(38) : while compiling class template member function 'PyObject *itk::PyBuffer::GetArrayFromImage(itk::Image *)' 1> with 1> [ 1> TImage=itk::Image, 1> TPixel=unsigned short, 1> VImageDimension=2 1> ] 1> .\wrap_itkPyBufferPython.cxx(2062) : see reference to class template instantiation 'itk::PyBuffer' being compiled 1> with 1> [ 1> TImage=itk::Image 1> ] 1>c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\i tkPyBuffer.txx(77) : error C2561: 'itk::PyBuffer::GetImageFromArray' : function must return a value 1> with 1> [ 1> TImage=itk::Image 1> ] 1> c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\itk PyBuffer.h(78) : see declaration of 'itk::PyBuffer::GetImageFromArray' 1> with 1> [ 1> TImage=itk::Image 1> ] 1> c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\itk PyBuffer.txx(75) : while compiling class template member function 'const itk::SmartPointer itk::PyBuffer::GetImageFromArray(PyObject *)' 1> with 1> [ 1> TObjectType=itk::Image, 1> TImage=itk::Image 1> ] 1>c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\i tkPyBuffer.txx(46) : error C2561: 'itk::PyBuffer::GetArrayFromImage' : function must return a value 1> with 1> [ 1> TImage=itk::Image 1> ] 1> c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\itk PyBuffer.h(73) : see declaration of 'itk::PyBuffer::GetArrayFromImage' 1> with 1> [ 1> TImage=itk::Image 1> ] 1> c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\itk PyBuffer.txx(38) : while compiling class template member function 'PyObject *itk::PyBuffer::GetArrayFromImage(itk::Image *)' 1> with 1> [ 1> TImage=itk::Image, 1> TPixel=short, 1> VImageDimension=3 1> ] 1> .\wrap_itkPyBufferPython.cxx(2151) : see reference to class template instantiation 'itk::PyBuffer' being compiled 1> with 1> [ 1> TImage=itk::Image 1> ] 1>c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\i tkPyBuffer.txx(77) : error C2561: 'itk::PyBuffer::GetImageFromArray' : function must return a value 1> with 1> [ 1> TImage=itk::Image 1> ] 1> c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\itk PyBuffer.h(78) : see declaration of 'itk::PyBuffer::GetImageFromArray' 1> with 1> [ 1> TImage=itk::Image 1> ] 1> 
c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\itk PyBuffer.txx(75) : while compiling class template member function 'const itk::SmartPointer itk::PyBuffer::GetImageFromArray(PyObject *)' 1> with 1> [ 1> TObjectType=itk::Image, 1> TImage=itk::Image 1> ] 1>c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\i tkPyBuffer.txx(46) : error C2561: 'itk::PyBuffer::GetArrayFromImage' : function must return a value 1> with 1> [ 1> TImage=itk::Image 1> ] 1> c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\itk PyBuffer.h(73) : see declaration of 'itk::PyBuffer::GetArrayFromImage' 1> with 1> [ 1> TImage=itk::Image 1> ] 1> c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\itk PyBuffer.txx(38) : while compiling class template member function 'PyObject *itk::PyBuffer::GetArrayFromImage(itk::Image *)' 1> with 1> [ 1> TImage=itk::Image, 1> TPixel=short, 1> VImageDimension=2 1> ] 1> .\wrap_itkPyBufferPython.cxx(2240) : see reference to class template instantiation 'itk::PyBuffer' being compiled 1> with 1> [ 1> TImage=itk::Image 1> ] 1>c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\i tkPyBuffer.txx(77) : error C2561: 'itk::PyBuffer::GetImageFromArray' : function must return a value 1> with 1> [ 1> TImage=itk::Image 1> ] 1> c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\itk PyBuffer.h(78) : see declaration of 'itk::PyBuffer::GetImageFromArray' 1> with 1> [ 1> TImage=itk::Image 1> ] 1> c:\users\sprevrha\src\insight\wrapping\wrapitk\externalprojects\pybuffer\itk PyBuffer.txx(75) : while compiling class template member function 'const itk::SmartPointer itk::PyBuffer::GetImageFromArray(PyObject *)' 1> with 1> [ 1> TObjectType=itk::Image, 1> TImage=itk::Image 1> ] 1>wrap_BufferConversionPythonPython.cxx 1>Generating Code... 1>Build log was saved at "file://c:\packages\Insight-CVS-VC2005-WrapITK-ExternalProjects\PyBuffer\_Bu fferConversionPython.dir\Release\BuildLog.htm" 1>_BufferConversionPython - 16 error(s), 0 warning(s) 2>------ Build started: Project: ALL_BUILD, Configuration: Release Win32 ------ 2>"Build all projects" 2>Build log was saved at "file://c:\packages\Insight-CVS-VC2005-WrapITK-ExternalProjects\PyBuffer\ALL _BUILD.dir\Release\BuildLog.htm" 2>ALL_BUILD - 0 error(s), 0 warning(s) ========== Build: 1 succeeded, 1 failed, 2 up-to-date, 0 skipped ========== -------------- next part -------------- An HTML attachment was scrubbed... URL: From udo at physics.rutgers.edu Sat Jul 28 12:15:15 2007 From: udo at physics.rutgers.edu (Viktor Oudovenko) Date: Sat, 28 Jul 2007 12:15:15 -0400 Subject: [Numpy-discussion] PLEASE help with mpi4py Message-ID: <003501c7d132$7bbad400$37e50680@M90> Dear Lisandro, I'd be very grateful is you could help us to understand why mpi4py does not work with MPI-1 . We've succeeded to build and run mpi4py with MPI-2 (mpd) only. We did not get thought for MPI-1 case and for MPI-2 (smpd). We are using SGE queuing system and for it we need weather MPI-1 or MPI-2 (smpd) working. What we've got so far plz see attachment where we are sending you 2 configuration files and 3 log files plus info file about python. The only conclusion we've come to is that to get MPI-1 working we need to recompile python. Python we used is the standard installation of OpenSuSE 10.2 . If you need any other information which we've missed to provide plz let us know. Thank you very much in advance for your help. 
Regards, viktor -------------- next part -------------- A non-text attachment was scrubbed... Name: 2send.tgz Type: application/octet-stream Size: 3575 bytes Desc: not available URL: From matthieu.brucher at gmail.com Sun Jul 29 05:34:10 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Sun, 29 Jul 2007 11:34:10 +0200 Subject: [Numpy-discussion] Downsampling array, retaining min and max values in window In-Reply-To: <3f7a6e1c0707240127u1b6b514cy549c6ea60497c873@mail.gmail.com> References: <3f7a6e1c0707240127u1b6b514cy549c6ea60497c873@mail.gmail.com> Message-ID: Hi, I think you should look into scipy.ndimage which has minimum_filter and maximum_filter Matthieu 2007/7/24, Ludwig M Brinckmann : > > Hi there, > > I have a large array, lets say 40000 * 512, which I need to downsample by > a factor of 4 in the y direction, by factor 3 in the x direction, so that my > resulting arrays are 10000 * 170 (or 171 this does not matter very much) - > but of all the values I will need to retain in the downsampled arrays the > minimum and maximum of the original data, rather than computing an average > or just picking every third/fourth value in the array. > So essentially I have a 4*3 window, for which I need the min and max in > this window, and store the result of applying this window to the original > array as my results. > > What is the best way to do this? > > Regards > Ludwig > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthieu.brucher at gmail.com Sun Jul 29 05:38:04 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Sun, 29 Jul 2007 11:38:04 +0200 Subject: [Numpy-discussion] a.flat[3:7] is a copy? In-Reply-To: <469E4756.3060801@cgl.ucsf.edu> References: <469E4756.3060801@cgl.ucsf.edu> Message-ID: Hi, Did you try ravel() instead ? If a copy is not needed, it returns a 1D view of the array. Matthieu 2007/7/18, Tom Goddard : > > Does taking a slice of a flatiter always make a copy? That appears to > be the behaviour in numpy 1.0.3. > For example a.flat[1:3][0] = 5 does not modify the original array a, > even when a is contiguous. Is this a bug? > > >>> import numpy as n > >>> n.version.version > '1.0.3' > >>> a = n.zeros((2,3), n.int32) > >>> a > array([[0, 0, 0], > [0, 0, 0]]) > >>> b = a.flat[1:3] > >>> b[0] = 5 > >>> b > array([5, 0]) > >>> a > array([[0, 0, 0], > [0, 0, 0]]) > >>> b = None > >>> a > array([[0, 0, 0], > [0, 0, 0]]) > > This behavior does not seem to match what is described in the Numpy book > (Dec 7, 2006 version), section 3.1.3 "Other attributes", page 51: > > "flat > > Returns an iterator object (numpy.flatiter) that acts like a 1-d > version of the array. > 1-d indexing works on this array and it can be passed in to most > routines as > an array wherein a 1-d array will be constructed from it. The new 1-d > array > will reference this array's data if this array is C-style contiguous, > otherwise, > new memory will be allocated for the 1-d array, the UPDATEIFCOPY flag > will be set for the new array, and this array will have its WRITEABLE > flag > set FALSE until the the last reference to the new array disappears. > When the > last reference to the new 1-d array disappears, the data will be > copied over to > this non-contiguous array. 
This is done so that a.flat effectively > references the > current array regardless of whether or not it is contiguous or > non-contiguous." > > Tom Goddard > UC San Francisco > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthieu.brucher at gmail.com Sun Jul 29 05:42:37 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Sun, 29 Jul 2007 11:42:37 +0200 Subject: [Numpy-discussion] Build external C functions that take numpy arrays as parameters and return numpy arrays In-Reply-To: <1185290520.026238.23680@w3g2000hsg.googlegroups.com> References: <1185290520.026238.23680@w3g2000hsg.googlegroups.com> Message-ID: Hi, The simplest way of doing this is with ctypes : http://scipy.org/Cookbook/Ctypes Matthieu 2007/7/24, computer_guy : > > Hi Everyone, > > I am going to write some external C functions that takes in numpy > arrays as parameters and return numpy arrays. I have the following > questions: > > 1. What should I do in my C code? > 2. Can I use any C compiler to build my library that takes numpy > arrays? I am using Windows XP and Visual Studio 2005. > 3. How can I generate the python binding? > > Thanks, > cg > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From haase at msg.ucsf.edu Sun Jul 29 05:50:53 2007 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Sun, 29 Jul 2007 11:50:53 +0200 Subject: [Numpy-discussion] Build external C functions that take numpy arrays as parameters and return numpy arrays In-Reply-To: References: <1185290520.026238.23680@w3g2000hsg.googlegroups.com> Message-ID: Note also that there was essentially the very same question on this list just a few days ago. At the time, there were many answers and quite a discussion... Hope you can find the list archive at scipy.org. -Sebastian On 7/29/07, Matthieu Brucher wrote: > Hi, > > The simplest way of doing this is with ctypes : http://scipy.org/Cookbook/Ctypes > > Matthieu > > > 2007/7/24, computer_guy < zyzhu2000 at gmail.com>: > > > Hi Everyone, > > > > I am going to write some external C functions that takes in numpy > > arrays as parameters and return numpy arrays. I have the following > > questions: > > > > 1. What should I do in my C code? > > 2. Can I use any C compiler to build my library that takes numpy > > arrays? I am using Windows XP and Visual Studio 2005. > > 3. How can I generate the python binding? 
> > > > Thanks, > > cg > > > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > From robert.kern at gmail.com Sun Jul 29 06:01:05 2007 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 29 Jul 2007 05:01:05 -0500 Subject: [Numpy-discussion] Build external C functions that take numpy arrays as parameters and return numpy arrays In-Reply-To: References: <1185290520.026238.23680@w3g2000hsg.googlegroups.com> Message-ID: <46AC6561.4000707@gmail.com> Sebastian Haase wrote: > Note also that there was essentially the very same question on this > list just a few days ago. At the time, there were many answers and > quite a discussion... The OP is the same in both. We just got a burst of emails (including the one that starts this thread) that had been delayed for some reason. So it's no surprise, and no one's fault, that there's some duplication here. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From haase at msg.ucsf.edu Sun Jul 29 06:27:11 2007 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Sun, 29 Jul 2007 12:27:11 +0200 Subject: [Numpy-discussion] Build external C functions that take numpy arrays as parameters and return numpy arrays In-Reply-To: <46AC6561.4000707@gmail.com> References: <1185290520.026238.23680@w3g2000hsg.googlegroups.com> <46AC6561.4000707@gmail.com> Message-ID: On 7/29/07, Robert Kern wrote: > Sebastian Haase wrote: > > Note also that there was essentially the very same question on this > > list just a few days ago. At the time, there were many answers and > > quite a discussion... > > The OP is the same in both. We just got a burst of emails (including the one > that starts this thread) that had been delayed for some reason. So it's no > surprise, and no one's fault, that there's some duplication here. Oooh - I see - there is the date: July 24 ... [ another email just came in is from 7/18 ...] That's quite interesting. I have never seen such a delay before .... Was some computer sitting on them being turnted off for 10 days ? ;-) -Sebastian. From zyzhu2000 at gmail.com Sun Jul 29 12:11:17 2007 From: zyzhu2000 at gmail.com (Geoffrey Zhu) Date: Sun, 29 Jul 2007 11:11:17 -0500 Subject: [Numpy-discussion] Build external C functions that take numpy arrays as parameters and return numpy arrays In-Reply-To: References: <1185290520.026238.23680@w3g2000hsg.googlegroups.com> <46AC6561.4000707@gmail.com> Message-ID: Hi Sebastian, > Oooh - I see - there is the date: July 24 ... > [ another email just came in is from 7/18 ...] > That's quite interesting. > I have never seen such a delay before .... > Was some computer sitting on them being turnted off for 10 days ? ;-) > > -Sebastian. Not knowing that the mailing list and the google group "Numpy Discussion" are one and the same, originally I posted this question through Google Groups. Until now it never showed up in the group. I thought it was lost and made another post to the mailing list. In fact with the help of people on this mailing list and some trial and error, I have already built an extended moudle that does exactly that. Thanks for your help. 
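For the C-extension thread above, a minimal sketch of the ctypes route Matthieu pointed to (the Cookbook/Ctypes page). The library name 'mylib' and the function 'double_it' are made-up placeholders for whatever the real C code exports; the numpy-specific pieces are numpy.ctypeslib.load_library and ndpointer, which let an ndarray be passed straight to the C function:

import ctypes
import numpy as np
from numpy import ctypeslib

# Assumes a shared library 'mylib' (mylib.dll on Windows, libmylib.so on
# Linux) built from C code that exports:
#     void double_it(double *x, int n);   /* doubles x[0..n-1] in place */
lib = ctypeslib.load_library('mylib', '.')
lib.double_it.restype = None
lib.double_it.argtypes = [ctypeslib.ndpointer(dtype=np.float64, ndim=1,
                                              flags='C_CONTIGUOUS'),
                          ctypes.c_int]

a = np.arange(5, dtype=np.float64)
lib.double_it(a, len(a))    # a is modified in place: 0, 2, 4, 6, 8

The ndpointer argtype checks dtype, dimensionality and contiguity at call time, which is most of what hand-written extension code has to verify anyway.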
Geoffrey From lou_boog2000 at yahoo.com Sun Jul 29 16:10:41 2007 From: lou_boog2000 at yahoo.com (Lou Pecora) Date: Sun, 29 Jul 2007 13:10:41 -0700 (PDT) Subject: [Numpy-discussion] Build external C functions that take numpy arrays as parameters and return numpy arrays In-Reply-To: Message-ID: <714823.57168.qm@web34412.mail.mud.yahoo.com> I wrote a basic article on C extensions using NumPy arrays on the SciPy.org site. See: Cookbook/C_Extensions/NumPy at http://www.scipy.org/Cookbook/C_Extensions/NumPy_arrays?highlight=%28%28----%28-%2A%29%28%5Cr%29%3F%5Cn%29%28.%2A%29CategoryCookbook%5Cb%29 It's very basic, but should get you started. Note that once you get the patterns down for NumPy then other extensions are mostly the same pattern over and over. -- Lou Pecora ------------------------------------------------- > > 2007/7/24, computer_guy < zyzhu2000 at gmail.com>: > > > > > Hi Everyone, > > > > > > I am going to write some external C functions > that takes in numpy > > > arrays as parameters and return numpy arrays. I > have the following > > > questions: > > > > > > 1. What should I do in my C code? > > > 2. Can I use any C compiler to build my library > that takes numpy > > > arrays? I am using Windows XP and Visual Studio > 2005. > > > 3. How can I generate the python binding? > > > > > > Thanks, > > > cg -- Lou Pecora, my views are my own. --------------- Great spirits have always encountered violent opposition from mediocre minds. -Albert Einstein ____________________________________________________________________________________ Got a little couch potato? Check out fun summer activities for kids. http://search.yahoo.com/search?fr=oni_on_mail&p=summer+activities+for+kids&cs=bz From zyzhu2000 at gmail.com Sun Jul 29 21:54:58 2007 From: zyzhu2000 at gmail.com (Geoffrey Zhu) Date: Sun, 29 Jul 2007 20:54:58 -0500 Subject: [Numpy-discussion] Build external C functions that take numpy arrays as parameters and return numpy arrays In-Reply-To: <714823.57168.qm@web34412.mail.mud.yahoo.com> References: <714823.57168.qm@web34412.mail.mud.yahoo.com> Message-ID: On 7/29/07, Lou Pecora wrote: > I wrote a basic article on C extensions using NumPy > arrays on the SciPy.org site. See: > Cookbook/C_Extensions/NumPy at > > http://www.scipy.org/Cookbook/C_Extensions/NumPy_arrays?highlight=%28%28----%28-%2A%29%28%5Cr%29%3F%5Cn%29%28.%2A%29CategoryCookbook%5Cb%29 > > It's very basic, but should get you started. Note > that once you get the patterns down for NumPy then > other extensions are mostly the same pattern over and > over. > > -- Lou Pecora That was the 'template' I was using. From haase at msg.ucsf.edu Mon Jul 30 09:22:56 2007 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Mon, 30 Jul 2007 15:22:56 +0200 Subject: [Numpy-discussion] epydoc documentation of numpy package Message-ID: Hi, For general interest - I just found this web page: http://stsdas.stsci.edu/pyraf/stscidocs/imagestats_pkg/imagestats_api/numpy-module.html I seems to me to be a nice overview of all functions in numpy. Question: is there an "official" version of this on the scipy.org site ? Or a link? Comments? -Sebastian Haase From chan_dhf at yahoo.de Mon Jul 30 10:01:46 2007 From: chan_dhf at yahoo.de (Danny Chan) Date: Mon, 30 Jul 2007 16:01:46 +0200 (CEST) Subject: [Numpy-discussion] reading 10 bit raw data into an array Message-ID: <601104.83172.qm@web26212.mail.ukl.yahoo.com> Hi all! I'm trying to read a data file that contains a raw image file. 
Every pixel is assigned a value from 0 to 1023, and all pixels are stored from top left to bottom right pixel in binary format in this file. I know the width and the height of the image, so all that would be required is to read 10 bits at a time and store it these as an integer. I played around with the fromstring and fromfile function, and I read the documentation for dtype objects, but I'm still confused. It seems simple enough to read data in a format with a standard bitwidth, but how can I read data in a non-standard format. Can anyone help? Greets, Danny --------------------------------- Alles was der Gesundheit und Entspannung dient.BE A BETTER MEDIZINMANN! -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Mon Jul 30 10:32:44 2007 From: stefan at sun.ac.za (Stefan van der Walt) Date: Mon, 30 Jul 2007 16:32:44 +0200 Subject: [Numpy-discussion] reading 10 bit raw data into an array In-Reply-To: <601104.83172.qm@web26212.mail.ukl.yahoo.com> References: <601104.83172.qm@web26212.mail.ukl.yahoo.com> Message-ID: <20070730143244.GO7447@mentat.za.net> On Mon, Jul 30, 2007 at 04:01:46PM +0200, Danny Chan wrote: > I'm trying to read a data file that contains a raw image file. Every pixel is > assigned a value from 0 to 1023, and all pixels are stored from top left to > bottom right pixel in binary format in this file. I know the width and the > height of the image, so all that would be required is to read 10 bits at a time > and store it these as an integer. I played around with the fromstring and > fromfile function, and I read the documentation for dtype objects, but I'm > still confused. It seems simple enough to read data in a format with a standard > bitwidth, but how can I read data in a non-standard format. Can > anyone help? AFAIK, numpy's dtypes all have widths >= 1 byte. The easiest solution I can think of is to use fromfile to read 5 bytes at a time, and then to use divmod to obtain your 4 values. Cheers St?fan From zyzhu2000 at gmail.com Mon Jul 30 11:12:06 2007 From: zyzhu2000 at gmail.com (Geoffrey Zhu) Date: Mon, 30 Jul 2007 10:12:06 -0500 Subject: [Numpy-discussion] How to implement a 'pivot table?' Message-ID: Hi Everyone, I am wondering what is the best (and fast) way to build a pivot table aside from the 'brute force way?' I want to transform an numpy array into a pivot table. For example, if I have a numpy array like below: Region Date # of Units ---------- ---------- -------------- East 1/1 10 East 1/1 20 East 1/2 30 West 1/1 40 West 1/2 50 West 1/2 60 I want to transform this into the following table, where f() is a given aggregate function: Date Region 1/1 1/2 ---------- East f(10,20) f(30) West f(40) f(50,60) I can regroup them into 'sets' and do it the brute force way, but that is kind of slow to execute. Does anyone know a better way? Thanks, Geoffrey From oliphant.travis at ieee.org Mon Jul 30 12:54:26 2007 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Mon, 30 Jul 2007 10:54:26 -0600 Subject: [Numpy-discussion] reading 10 bit raw data into an array In-Reply-To: <601104.83172.qm@web26212.mail.ukl.yahoo.com> References: <601104.83172.qm@web26212.mail.ukl.yahoo.com> Message-ID: <46AE17C2.80301@ieee.org> Danny Chan wrote: > Hi all! > I'm trying to read a data file that contains a raw image file. Every > pixel is assigned a value from 0 to 1023, and all pixels are stored from > top left to bottom right pixel in binary format in this file. 
I know the > width and the height of the image, so all that would be required is to > read 10 bits at a time and store it these as an integer. I played around > with the fromstring and fromfile function, and I read the documentation > for dtype objects, but I'm still confused. It seems simple enough to > read data in a format with a standard bitwidth, but how can I read data > in a non-standard format. Can anyone help? > This kind of bit-manipulation must be done using bit operations on standard size data types even in C. The file reading and writing libraries use bytes as their common denominator. I would read in the entire image into a numpy array of unsigned bytes and then use slicing, masking, and bit-shifting to take 5 bytes at a time and convert them to 4 values of a 16-bit unsigned image. Basically, you would do something like # read in entire image into 1-d unsigned byte array # create 16-bit array of the correct 2-D size # use flat indexing to store into the new array # new.flat[::4] = old[::5] + bitwise_or(old[1::5], MASK1b) << SHIFT1b # new.flat[1::4] = bitwise_or(old[1::5], MASK2a) << SHIFT2a + bitwise_or(old[2::5], MASK2b) << SHIFT2b # etc. The exact MASKS and shifts to use is left as an exercise for the reader :-) -Travis From tim.hochberg at ieee.org Mon Jul 30 13:12:40 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Mon, 30 Jul 2007 10:12:40 -0700 Subject: [Numpy-discussion] How to implement a 'pivot table?' In-Reply-To: References: Message-ID: On 7/30/07, Geoffrey Zhu wrote: > > Hi Everyone, > > I am wondering what is the best (and fast) way to build a pivot table > aside from the 'brute force way?' What's the brute force way? It's easier to offer an improved suggestion if we know what we're trying to beat. I want to transform an numpy array into a pivot table. For example, if > I have a numpy array like below: > > Region Date # of Units > ---------- ---------- -------------- > East 1/1 10 > East 1/1 20 > East 1/2 30 > West 1/1 40 > West 1/2 50 > West 1/2 60 > > I want to transform this into the following table, where f() is a > given aggregate function: > > Date > Region 1/1 1/2 > ---------- > East f(10,20) f(30) > West f(40) f(50,60) > > > I can regroup them into 'sets' and do it the brute force way, but that > is kind of slow to execute. Does anyone know a better way? I would use a python to dictionary to assemble lists of values. I would key off (region/date) tuples. In outline: map = {} dates = set() regions = set() for (region, date, units) in data: dates.add(date) regions.add(regions) key = (region, date) if key not in map: map[key] = [] map[key].append(data) Once you have map, regions and dates, you can trivially make a table as above. The details will depend on what format you want the table to have, but it should be easy to do. Thanks, > Geoffrey > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... 
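Two follow-up sketches for the threads above.

First, a runnable version of the pivot-table outline: in the outline, regions.add(regions) and map[key].append(data) are presumably meant to be regions.add(region) and map[key].append(units). The sample rows and the choice of numpy.sum as the aggregate f() are only placeholders:

import numpy as np

rows = [('East', '1/1', 10), ('East', '1/1', 20), ('East', '1/2', 30),
        ('West', '1/1', 40), ('West', '1/2', 50), ('West', '1/2', 60)]

f = np.sum                                 # any aggregate function
groups = {}                                # (region, date) -> list of unit counts
regions, dates = set(), set()
for region, date, units in rows:
    regions.add(region)
    dates.add(date)
    groups.setdefault((region, date), []).append(units)

pivot = {}                                 # (region, date) -> f() of that cell
for region in regions:
    for date in dates:
        cell = groups.get((region, date), [])
        if cell:
            pivot[(region, date)] = f(cell)

print(pivot[('East', '1/1')])              # 30 with f = np.sum

Second, one way to fill in the masks and shifts that the 10-bit unpacking reply above leaves as an exercise. It assumes the file packs four pixels into every five bytes with the first pixel in the most significant bits (if the packing runs the other way, the shifts are simply 0, 10, 20, 30 instead); 'image.raw', width and height are placeholders:

import numpy as np

width, height = 640, 480                       # made-up image size
raw = np.fromfile('image.raw', dtype=np.uint8) # 5 bytes hold 4 ten-bit pixels
raw = raw.astype(np.int64).reshape(-1, 5)      # needs len(raw) % 5 == 0

# combine each group of 5 bytes into one 40-bit integer
packed = ((raw[:, 0] << 32) | (raw[:, 1] << 24) | (raw[:, 2] << 16)
          | (raw[:, 3] << 8) | raw[:, 4])

pixels = np.empty((len(packed), 4), dtype=np.uint16)
pixels[:, 0] = (packed >> 30) & 0x3FF          # first pixel in the high bits
pixels[:, 1] = (packed >> 20) & 0x3FF
pixels[:, 2] = (packed >> 10) & 0x3FF
pixels[:, 3] = packed & 0x3FF

image = pixels.reshape(height, width)          # assumes width*height pixels total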
URL: From oliphant.travis at ieee.org Mon Jul 30 14:33:28 2007 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Mon, 30 Jul 2007 12:33:28 -0600 Subject: [Numpy-discussion] Downsampling array, retaining min and max values in window In-Reply-To: References: <3f7a6e1c0707240127u1b6b514cy549c6ea60497c873@mail.gmail.com> Message-ID: <46AE2EF8.8090901@ieee.org> Matthieu Brucher wrote: > Hi, > > I think you should look into scipy.ndimage which has minimum_filter > and maximum_filter > > Matthieu > > 2007/7/24, Ludwig M Brinckmann < ludwigbrinckmann at gmail.com > >: > > Hi there, > > I have a large array, lets say 40000 * 512, which I need to > downsample by a factor of 4 in the y direction, by factor 3 in the > x direction, so that my resulting arrays are 10000 * 170 (or 171 > this does not matter very much) - but of all the values I will > need to retain in the downsampled arrays the minimum and maximum > of the original data, rather than computing an average or just > picking every third/fourth value in the array. > So essentially I have a 4*3 window, for which I need the min and > max in this window, and store the result of applying this window > to the original array as my results. > > What is the best way to do this? > scipy.signal.order_filter order_filter(a, domain, order) Perform an order filter on an N-dimensional array. Description: Perform an order filter on the array in. The domain argument acts as a mask centered over each pixel. The non-zero elements of domain are used to select elements surrounding each input pixel which are placed in a list. The list is sorted, and the output for that pixel is the element corresponding to rank in the sorted list. Inputs: in -- an N-dimensional input array. domain -- a mask array with the same number of dimensions as in. Each dimension should have an odd number of elements. rank -- an non-negative integer which selects the element from the sorted list (0 corresponds to the largest element, 1 is the next largest element, etc.) Output: (out,) out -- the results of the order filter in an array with the same shape as in. Run the order filter and then select out every 4th element in the first dimension and 3rd element Untested: mask = numpy.ones(4,3) out = scipy.signal.order_filter(in, mask, 0) new = out[::4,::3] From zyzhu2000 at gmail.com Mon Jul 30 15:32:17 2007 From: zyzhu2000 at gmail.com (Geoffrey Zhu) Date: Mon, 30 Jul 2007 14:32:17 -0500 Subject: [Numpy-discussion] How to implement a 'pivot table?' In-Reply-To: References: Message-ID: Hi Timothy, On 7/30/07, Timothy Hochberg wrote: > > > On 7/30/07, Geoffrey Zhu wrote: > > Hi Everyone, > > > > I am wondering what is the best (and fast) way to build a pivot table > > aside from the 'brute force way?' > > What's the brute force way? It's easier to offer an improved suggestion if > we know what we're trying to beat. > > > I want to transform an numpy array into a pivot table. For example, if > > I have a numpy array like below: > > > > Region Date # of Units > > ---------- ---------- -------------- > > East 1/1 10 > > East 1/1 20 > > East 1/2 30 > > West 1/1 40 > > West 1/2 50 > > West 1/2 60 > > > > I want to transform this into the following table, where f() is a > > given aggregate function: > > > > Date > > Region 1/1 1/2 > > ---------- > > East f(10,20) f(30) > > West f(40) f(50,60) > > > > > > I can regroup them into 'sets' and do it the brute force way, but that > > is kind of slow to execute. Does anyone know a better way? 
> > I would use a python to dictionary to assemble lists of values. I would key > off (region/date) tuples. In outline: > > map = {} > dates = set() > regions = set() > for (region, date, units) in data: > dates.add(date) > regions.add(regions) > key = (region, date) > if key not in map: > map[key] = [] > map[key].append(data) > > Once you have map, regions and dates, you can trivially make a table as > above. The details will depend on what format you want the table to have, > but it should be easy to do. > > > > Thanks, > > Geoffrey > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > > > -- > . __ > . |-\ > . > . tim.hochberg at ieee.org > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion The 'brute-force' way is basically what you suggested -- looping through all the records and building a two-way hash-table of the data. The problem of the brute-force' approach is that it is not taking advantage of facilities of numpy and can be slow in speed. If only there is some built-in mechanism in numpy to handle this. The other thing I am not sure is in your map object above, do I append the row number to the numpy array or do I append the row object (such as data[r])? Thanks, Geoffrey From oliphant.travis at ieee.org Mon Jul 30 16:05:28 2007 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Mon, 30 Jul 2007 14:05:28 -0600 Subject: [Numpy-discussion] Memory leak for in-place Numeric+numpy addition In-Reply-To: <01895508-84D8-4214-B7F6-50B637B1D2F7@dal.ca> References: <01895508-84D8-4214-B7F6-50B637B1D2F7@dal.ca> Message-ID: <46AE4488.5080803@ieee.org> Thomas J. Duck wrote: > Hi, > > There seems to be a memory leak when arrays are added in-place > for mixed Numeric/numpy applications. For example, memory usage > quickly ramps up when the following program is executed: > > > import Numeric,numpy > x = Numeric.zeros((2000,2000),typecode=Numeric.Float64) > for j in range(200): > print j > y = numpy.zeros((2000,2000),dtype=numpy.float64) > x += y > > > If I use exclusively Numeric arrays, or exclusively numpy arrays, or > add a Numeric array in-place to a numpy array, there is not a > problem. It is only in the case that a numpy array is added in place > to a Numeric array that the leak exists. Deleting the variable y in > each iteration has no effect. > > I am using numpy 1.0.1-8 and Numeric 24.2-7 under Debian linux. > > I'm not sure if this is a numpy or Numeric problem, but thought > I would send it along in case there is interest and the problem can > be resolved. Unfortunately, I can't move to an exclusively numpy or > Numeric approach because of the other packages that I depend on. > What other packages do you depend on that use Numeric. These should get ported to use NumPy. Perhaps somebody can help do that. -Travis From ryanlists at gmail.com Mon Jul 30 16:51:47 2007 From: ryanlists at gmail.com (Ryan Krauss) Date: Mon, 30 Jul 2007 15:51:47 -0500 Subject: [Numpy-discussion] column_stack with mixed data types Message-ID: I have data in a spreadsheet where the first column is an integer. the second is a float, columns 3-5 are strings, and columns 6 and 7 are floats. I have each column as a list, but when I use column_stack, I get back a 2D array of strings. What is the easiest way to get a recarray out of this list of lists? 
Is recarray my best/only choice? Thanks, Ryan From ryanlists at gmail.com Mon Jul 30 18:22:24 2007 From: ryanlists at gmail.com (Ryan Krauss) Date: Mon, 30 Jul 2007 17:22:24 -0500 Subject: [Numpy-discussion] column_stack with mixed data types In-Reply-To: References: Message-ID: In writing a function to parse my data, I ran into some unexpected behavior. Is it intentional that a recarray can only be created by a list of tuples and not by a list of lists? Here is what I ran into: ipdb>biglist[0:10] Out[68]: [[7, 20.0, 'HE 4.0 pcf', 'Flat', 'Small', 32.6875, 2.75, ''], [8, 20.0, 'HE 4.0 pcf', 'Flat', 'Small', 32.6875, 2.75, ''], [9, 20.0, 'HE 4.0 pcf', 'Flat', 'Small', 32.6875, 2.75, ''], [10, 20.0, 'HE 4.0 pcf', 'Flat', 'Large', 32.6875, 4.6875, ''], [11, 20.0, 'HE 4.0 pcf', 'Flat', 'Large', 32.6875, 4.6875, ''], [12, 20.0, 'HE 4.0 pcf', 'Flat', 'Large', 32.6875, 4.6875, ''], [13, 20.0, 'HE 4.0 pcf', 'Flat', 'Large', 32.6875, 4.6875, 'With headliner'], [14, 20.0, 'HE 4.0 pcf', 'Flat', 'Large', 32.6875, 4.6875, 'With headliner'], [15, 20.0, 'HE 4.0 pcf', 'Hemi', '0.2493', 32.6875, 5.25, ''], [16, 20.0, 'HE 4.0 pcf', 'Hemi', '0.2493', 32.6875, 5.25, '']] ipdb>tuplelist=[tuple(row) for row in biglist] ipdb>tuplelist[0:10] Out[68]: [(7, 20.0, 'HE 4.0 pcf', 'Flat', 'Small', 32.6875, 2.75, ''), (8, 20.0, 'HE 4.0 pcf', 'Flat', 'Small', 32.6875, 2.75, ''), (9, 20.0, 'HE 4.0 pcf', 'Flat', 'Small', 32.6875, 2.75, ''), (10, 20.0, 'HE 4.0 pcf', 'Flat', 'Large', 32.6875, 4.6875, ''), (11, 20.0, 'HE 4.0 pcf', 'Flat', 'Large', 32.6875, 4.6875, ''), (12, 20.0, 'HE 4.0 pcf', 'Flat', 'Large', 32.6875, 4.6875, ''), (13, 20.0, 'HE 4.0 pcf', 'Flat', 'Large', 32.6875, 4.6875, 'With headliner'), (14, 20.0, 'HE 4.0 pcf', 'Flat', 'Large', 32.6875, 4.6875, 'With headliner'), (15, 20.0, 'HE 4.0 pcf', 'Hemi', '0.2493', 32.6875, 5.25, ''), (16, 20.0, 'HE 4.0 pcf', 'Hemi', '0.2493', 32.6875, 5.25, '')] ipdb>dt Out[68]: dtype([('Test Number', 'array(biglist, dtype=dt) *** TypeError: expected a readable buffer object This does not work for a list of lists, but does work for a list of tuples: ipdb>array(tuplelist, dtype=dt) Out[68]: array([(7, 20.0, 'HE 4.0 pcf', 'Flat', 'Small', 32.6875, 2.75, ''), (8, 20.0, 'HE 4.0 pcf', 'Flat', 'Small', 32.6875, 2.75, ''), (9, 20.0, 'HE 4.0 pcf', 'Flat', 'Small', 32.6875, 2.75, ''), (10, 20.0, 'HE 4.0 pcf', 'Flat', 'Large', 32.6875, 4.6875, ''), (11, 20.0, 'HE 4.0 pcf', 'Flat', 'Large', 32.6875, 4.6875, ''), (12, 20.0, 'HE 4.0 pcf', 'Flat', 'Large', 32.6875, 4.6875, ''), (13, 20.0, 'HE 4.0 pcf', 'Flat', 'Large', 32.6875, 4.6875, 'With headliner'), (14, 20.0, 'HE 4.0 pcf', 'Flat', 'Large', 32.6875, 4.6875, 'With headliner'), (15, 20.0, 'HE 4.0 pcf', 'Hemi', '0.2493', 32.6875, 5.25, ''), (16, 20.0, 'HE 4.0 pcf', 'Hemi', '0.2493', 32.6875, 5.25, ''), (17, 20.0, 'HE 4.0 pcf', 'Hemi', '0.2493', 32.6875, 5.25, ''), (18, 15.0, 'HE 4.0 pcf', 'Hemi', '0.2493', 32.6875, 5.0, ''), (19, 15.0, 'HE 4.0 pcf', 'Hemi', '0.2493', 32.6875, 5.0, ''), (20, 15.0, 'HE 4.0 pcf', 'Hemi', '0.2493', 32.6875, 5.0, ''), (21, 15.0, 'HE 4.0 pcf', 'Flat', 'Small', 32.6875, 2.5625, ''), (22, 15.0, 'HE 4.0 pcf', 'Flat', 'Small', 32.6875, 2.5625, ''), (23, 15.0, 'HE 4.0 pcf', 'Flat', 'Small', 32.6875, 2.5625, ''), (24, 10.0, 'HE 4.0 pcf', 'Flat', 'Small', 32.6875, 2.2812999999999999, ''), (25, 10.0, 'HE 4.0 pcf', 'Flat', 'Small', 32.6875, 2.2812999999999999, ''), (26, 10.0, 'HE 4.0 pcf', 'Flat', 'Small', 32.6875, 2.2812999999999999, ''), (27, 5.0, 'HE 4.0 pcf', 'Flat', 'Small', 32.6875, 2.0937999999999999, ''), (28, 
20.0, 'P420', 'Flat', 'Small', 32.6875, 2.7812999999999999, ''), (29, 20.0, 'P420', 'Flat', 'Small', 32.6875, 2.7812999999999999, ''), (30, 20.0, 'E175', 'Flat', 'Small', 32.6875, 2.7812999999999999, ''), (31, 20.0, 'E175', 'Flat', 'Small', 32.6875, 2.7812999999999999, ''), (32, 20.0, 'E175', 'Flat', 'Small', 32.6875, 2.7812999999999999, ''), (33, 20.0, 'E175', 'Flat', 'Small', 32.6875, 2.7812999999999999, ''), (34, 15.0, 'E175', 'Flat', 'Small', 32.6875, 2.5, ''), (35, 15.0, 'E175', 'Flat', 'Small', 32.6875, 2.5, ''), (36, 15.0, 'E175', 'Flat', 'Small', 32.6875, 2.5, ''), (37, 20.0, 'E175', 'Hemi', '0.2493', 32.6875, 5.1875, ''), (38, 20.0, 'E175', 'Hemi', '0.2493', 32.6875, 5.1875, ''), (39, 20.0, 'E175', 'Hemi', '0.2493', 32.6875, 5.1875, ''), (40, 15.0, 'E175', 'Hemi', '0.2493', 32.6875, 5.0, ''), (41, 15.0, 'E175', 'Hemi', '0.2493', 32.6875, 5.0, ''), (42, 15.0, 'E175', 'Hemi', '0.2493', 32.6875, 5.0, ''), (43, 15.0, 'E175', 'Flat', 'Small', 10.5625, 2.5625, ''), (44, 15.0, 'E175', 'Flat', 'Small', 10.5625, 2.5625, ''), (45, 15.0, 'E175', 'Flat', 'Small', 10.5625, 2.5625, ''), (46, 15.0, 'HE 4.0 pcf', 'Flat', 'Small', 10.5625, 2.5625, ''), (47, 15.0, 'HE 4.0 pcf', 'Flat', 'Small', 10.5625, 2.5625, ''), (48, 15.0, 'HE 4.0 pcf', 'Flat', 'Small', 10.5625, 2.5625, ''), (49, 10.0, 'HE 4.0 pcf', 'Flat', 'Small', 10.25, 2.25, ''), (50, 10.0, 'HE 4.0 pcf', 'Flat', 'Small', 10.25, 2.25, ''), (51, 10.0, 'HE 4.0 pcf', 'Flat', 'Small', 10.25, 2.25, ''), (52, 10.0, 'E175', 'Flat', 'Small', 10.25, 2.25, ''), (53, 10.0, 'E175', 'Flat', 'Small', 10.25, 2.25, ''), (54, 10.0, 'E175', 'Flat', 'Small', 10.25, 2.25, ''), (55, 5.0, 'HE 4.0 pcf', 'Flat', 'Small', 10.125, 2.1875, ''), (56, 5.0, 'HE 4.0 pcf', 'Flat', 'Small', 10.125, 2.1875, ''), (57, 5.0, 'HE 4.0 pcf', 'Flat', 'Small', 10.125, 2.1875, ''), (58, 15.0, 'HE 4.0 pcf', 'Flat', 'Large', 12.5, 4.5, ''), (59, 15.0, 'HE 4.0 pcf', 'Flat', 'Large', 12.5, 4.5, ''), (60, 15.0, 'HE 4.0 pcf', 'Flat', 'Large', 12.5, 4.5, ''), (61, 15.0, 'E175', 'Flat', 'Large', 12.5, 4.5, ''), (62, 15.0, 'E175', 'Flat', 'Large', 12.5, 4.5, ''), (63, 15.0, 'E175', 'Flat', 'Large', 12.5, 4.5, ''), (64, 10.0, 'HE 4.0 pcf', 'Flat', 'Large', 12.125, 4.375, ''), (65, 10.0, 'HE 4.0 pcf', 'Flat', 'Large', 12.125, 4.375, ''), (66, 10.0, 'HE 4.0 pcf', 'Flat', 'Large', 12.125, 4.375, ''), (67, 10.0, 'E175', 'Flat', 'Large', 12.125, 4.375, ''), (68, 10.0, 'E175', 'Flat', 'Large', 12.125, 4.375, ''), (69, 10.0, 'E175', 'Flat', 'Large', 12.125, 4.375, ''), (70, 10.0, 'E175', 'Hemi', '0.2493', 12.875, 4.8125, ''), (71, 10.0, 'E175', 'Hemi', '0.2493', 12.875, 4.8125, ''), (72, 10.0, 'E175', 'Hemi', '0.2493', 12.875, 4.8125, ''), (73, 10.0, 'HE 4.0 pcf', 'Hemi', '0.2493', 12.875, 4.8125, ''), (74, 10.0, 'HE 4.0 pcf', 'Hemi', '0.2493', 12.875, 4.8125, ''), (75, 10.0, 'HE 4.0 pcf', 'Hemi', '0.2493', 12.875, 4.8125, ''), (76, 15.0, 'HE 4.0 pcf', 'Flat', 'Small', 18.3125, 2.5, ''), (77, 15.0, 'HE 4.0 pcf', 'Flat', 'Small', 18.3125, 2.5, ''), (78, 15.0, 'HE 4.0 pcf', 'Flat', 'Small', 18.3125, 2.5, ''), (79, 15.0, 'E175', 'Flat', 'Small', 18.3125, 2.5, ''), (80, 15.0, 'E175', 'Flat', 'Small', 18.3125, 2.5, ''), (81, 15.0, 'E175', 'Flat', 'Small', 18.3125, 2.5, ''), (82, 20.0, 'E175', 'Flat', 'Small', 45.6875, 2.7187999999999999, ''), (83, 20.0, 'E175', 'Flat', 'Small', 45.6875, 2.7187999999999999, ''), (84, 20.0, 'E175', 'Flat', 'Small', 45.6875, 2.7187999999999999, ''), (85, 20.0, 'HE 4.0 pcf', 'Flat', 'Small', 45.6875, 2.7187999999999999, ''), (86, 20.0, 'HE 4.0 pcf', 'Flat', 'Small', 45.6875, 
2.7187999999999999, ''), (87, 20.0, 'HE 4.0 pcf', 'Flat', 'Small', 45.6875, 2.7187999999999999, '')], dtype=[('Test Number', ' On 7/30/07, Ryan Krauss wrote: > I have data in a spreadsheet where the first column is an integer. the > second is a float, columns 3-5 are strings, and columns 6 and 7 are > floats. I have each column as a list, but when I use column_stack, I > get back a 2D array of strings. What is the easiest way to get a > recarray out of this list of lists? Is recarray my best/only choice? > > Thanks, > > Ryan > From robert.kern at gmail.com Mon Jul 30 18:26:19 2007 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 30 Jul 2007 17:26:19 -0500 Subject: [Numpy-discussion] column_stack with mixed data types In-Reply-To: References: Message-ID: <46AE658B.2080203@gmail.com> Ryan Krauss wrote: > In writing a function to parse my data, I ran into some unexpected > behavior. Is it intentional that a recarray can only be created by a > list of tuples and not by a list of lists? Yeah. The array() constructor needs some information for it to work its magic. Requiring tuples for the actual records is an arbitrary choice, but a reasonable one given that. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From ryanlists at gmail.com Mon Jul 30 19:02:38 2007 From: ryanlists at gmail.com (Ryan Krauss) Date: Mon, 30 Jul 2007 18:02:38 -0500 Subject: [Numpy-discussion] column_stack with mixed data types In-Reply-To: <46AE658B.2080203@gmail.com> References: <46AE658B.2080203@gmail.com> Message-ID: I just tend to think in terms of lists rather than tuples. Why is a tuple a more reasonable choice than a list? (I'm really asking and not being argumentative, since you can't hear my tone.) Ryan On 7/30/07, Robert Kern wrote: > Ryan Krauss wrote: > > In writing a function to parse my data, I ran into some unexpected > > behavior. Is it intentional that a recarray can only be created by a > > list of tuples and not by a list of lists? > > Yeah. The array() constructor needs some information for it to work its magic. > Requiring tuples for the actual records is an arbitrary choice, but a reasonable > one given that. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless enigma > that is made terrible by our own mad attempt to interpret it as though it had > an underlying truth." > -- Umberto Eco > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From robert.kern at gmail.com Mon Jul 30 19:10:42 2007 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 30 Jul 2007 18:10:42 -0500 Subject: [Numpy-discussion] column_stack with mixed data types In-Reply-To: References: <46AE658B.2080203@gmail.com> Message-ID: <46AE6FF2.1080809@gmail.com> Ryan Krauss wrote: > I just tend to think in terms of lists rather than tuples. Why is a > tuple a more reasonable choice than a list? (I'm really asking and > not being argumentative, since you can't hear my tone.) The key thing is that the type of the container of records is different from the type of the record, so you'd either have to have a list of tuples (or a list of lists of ... tuples) or a tuple of lists. 
There is a tendency to use lists for homogeneous collections and tuples for inhomogeneous collections; Guido says that this is the main difference in how he uses tuples and lists. When you get an answer to a DB-API2 SQL query, you get a list of (usually augmented) tuples. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From dalcinl at gmail.com Mon Jul 30 20:53:11 2007 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Mon, 30 Jul 2007 21:53:11 -0300 Subject: [Numpy-discussion] Installing Numpy on Python 2.3 Windows In-Reply-To: <20070725164542.1gynf50vpsqsogkk@webmail.mit.edu> References: <20070725164542.1gynf50vpsqsogkk@webmail.mit.edu> Message-ID: On 7/25/07, Amir Hirsch wrote: > The Python 2.3 installation I am using came with OpenOffice.org 2.2 and it must > not have registered python with Windows. I require PyUNO and Numpy (and > PyOpenGL and Ctypes) to work together for the application I am developing and > PyUno seems only to work with the OOo distribution of Python 2.3. I can only suggest a hack (not a Windows guy, I'm afraid). Try doing a normal installation of Python 2.3 on your machine (or another one), and then copy the output of the numpy installation into the Python that comes with OpenOffice. Of course, I'm not completely sure it will work, but it is an easy thing to try. Another way could be to modify your registry by hand (but I have no idea what is needed to make that work). -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From haase at msg.ucsf.edu Tue Jul 31 03:26:05 2007 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Tue, 31 Jul 2007 09:26:05 +0200 Subject: [Numpy-discussion] column_stack with mixed data types In-Reply-To: <46AE6FF2.1080809@gmail.com> References: <46AE658B.2080203@gmail.com> <46AE6FF2.1080809@gmail.com> Message-ID: On 7/31/07, Robert Kern wrote: > Ryan Krauss wrote: > > I just tend to think in terms of lists rather than tuples. Why is a > > tuple a more reasonable choice than a list? (I'm really asking and > > not being argumentative, since you can't hear my tone.) > > The key thing is that the type of the container of records is different from the > type of the record, so you'd either have to have a list of tuples (or a list of > lists of ... tuples) or a tuple of lists. There is a tendency to use lists for > homogeneous collections and tuples for inhomogeneous collections; Guido says > that this is the main difference in how he uses tuples and lists. When you get > an answer to a DB-API2 SQL query, you get a list of (usually augmented) tuples. > This is the best explanation I have heard about this yet. Where does he say this? Just for reference .... Thanks, Sebastian From robert.kern at gmail.com Tue Jul 31 03:32:31 2007 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 31 Jul 2007 02:32:31 -0500 Subject: [Numpy-discussion] column_stack with mixed data types In-Reply-To: References: <46AE658B.2080203@gmail.com> <46AE6FF2.1080809@gmail.com> Message-ID: <46AEE58F.2040805@gmail.com> Sebastian Haase wrote: > On 7/31/07, Robert Kern wrote: >> Ryan Krauss wrote: >>> I just tend to think in terms of lists rather than tuples.
Why is a >>> tuple a more reasonable choice than a list? (I'm really asking and >>> not being argumentative, since you can't hear my tone.) >> The key thing is that the type of the container of records is different from the >> type of the record, so you'd either have to have a list of tuples (or a list of >> lists of ... tuples) or a tuple of lists. There is a tendency to use lists for >> homogeneous collections and tuples for inhomogeneous collections; Guido says >> that this is the main difference in how he uses tuples and lists. When you get >> an answer to a DB-API2 SQL query, you get a list of (usually augmented) tuples. >> > This is the best explanation I have heard about this yet. > Where does he say this ? Just for reference .... http://mail.python.org/pipermail/python-dev/2003-March/033964.html -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From haase at msg.ucsf.edu Tue Jul 31 05:56:36 2007 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Tue, 31 Jul 2007 11:56:36 +0200 Subject: [Numpy-discussion] column_stack with mixed data types In-Reply-To: <46AEE58F.2040805@gmail.com> References: <46AE658B.2080203@gmail.com> <46AE6FF2.1080809@gmail.com> <46AEE58F.2040805@gmail.com> Message-ID: On 7/31/07, Robert Kern wrote: > Sebastian Haase wrote: > > On 7/31/07, Robert Kern wrote: > >> Ryan Krauss wrote: > >>> I just tend to think in terms of lists rather than tuples. Why is a > >>> tuple a more reasonable choice than a list? (I'm really asking and > >>> not being argumentative, since you can't hear my tone.) > >> The key thing is that the type of the container of records is different from the > >> type of the record, so you'd either have to have a list of tuples (or a list of > >> lists of ... tuples) or a tuple of lists. There is a tendency to use lists for > >> homogeneous collections and tuples for inhomogeneous collections; Guido says > >> that this is the main difference in how he uses tuples and lists. When you get > >> an answer to a DB-API2 SQL query, you get a list of (usually augmented) tuples. > >> > > This is the best explanation I have heard about this yet. > > Where does he say this ? Just for reference .... > > http://mail.python.org/pipermail/python-dev/2003-March/033964.html > Thanks, apparently this still hasn't made it into the Python Style Guid though: http://www.python.org/dev/peps/pep-0008/ Refer also: [Python-Dev] Tuples vs lists Aahz aahz at pythoncraft.com 12 Mar 2003 http://mail.python.org/pipermail/python-dev/2003-March/033981.html [Python-Dev] Christian Tismer tismer at tismer.com 12 Mar 2003 http://mail.python.org/pipermail/python-dev/2003-March/033966.html he says: """I never realized this, and I'm a bit stunned. (but by no means negative about it, just surprized)""" Cheers, Sebastian From emsellem at obs.univ-lyon1.fr Tue Jul 31 10:28:09 2007 From: emsellem at obs.univ-lyon1.fr (Eric Emsellem) Date: Tue, 31 Jul 2007 16:28:09 +0200 Subject: [Numpy-discussion] simple problem with arange / roundoff Message-ID: <46AF46F9.40800@obs.univ-lyon1.fr> Hi, I discovered a bug in one of my program probably due to a round-off problem in a "arange" statement. I use something like: step = (end - start) / (npix - 1.) gridX = num.arange(start-step/2., end+step/2., step) where I wish to get a simple 1D array with npix+1 numbers going from (start-step/2.) to (end+step/2.). 
But then, "arange" often gets me an array only going from "start-step/2." to "end - step/2." instead, due very probably to round-off problems (I guess it does not reach the last value because <<(start-step/2.) + npix * step >> is found to be larger than (end+step/2.). Here is an example: start = -30. end = 30. npix = 31 step = (end - start) / (npix - 1.) gridX = num.arange(start-step/2., end+step/2., step) array([-31., -29., -27., -25., -23., -21., -19., -17., -15., -13., -11., -9., -7., -5., -3., -1., 1., 3., 5., 7., 9., 11., 13., 15., 17., 19., 21., 23., 25., 27., 29.]) As you can see, it does not go up to 31., but only to 29, although step is = 2.0 Is there is a way out of this ? (except by doing the silly: gridX = num.arange(start-step/2., end+1.001*step/2., step) ) Thanks for any input there (and sorry for the silly question) Eric From tim.hochberg at ieee.org Tue Jul 31 10:40:53 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Tue, 31 Jul 2007 07:40:53 -0700 Subject: [Numpy-discussion] How to implement a 'pivot table?' In-Reply-To: References: Message-ID: On 7/30/07, Geoffrey Zhu wrote: > > Hi Timothy, > > On 7/30/07, Timothy Hochberg wrote: > > > > > > On 7/30/07, Geoffrey Zhu wrote: > > > Hi Everyone, > > > > > > I am wondering what is the best (and fast) way to build a pivot table > > > aside from the 'brute force way?' > > > > What's the brute force way? It's easier to offer an improved suggestion > if > > we know what we're trying to beat. > > > > > I want to transform an numpy array into a pivot table. For example, if > > > I have a numpy array like below: > > > > > > Region Date # of Units > > > ---------- ---------- -------------- > > > East 1/1 10 > > > East 1/1 20 > > > East 1/2 30 > > > West 1/1 40 > > > West 1/2 50 > > > West 1/2 60 > > > > > > I want to transform this into the following table, where f() is a > > > given aggregate function: > > > > > > Date > > > Region 1/1 1/2 > > > ---------- > > > East f(10,20) f(30) > > > West f(40) f(50,60) > > > > > > > > > I can regroup them into 'sets' and do it the brute force way, but that > > > is kind of slow to execute. Does anyone know a better way? > > > > I would use a python to dictionary to assemble lists of values. I would > key > > off (region/date) tuples. In outline: > > > > map = {} > > dates = set() > > regions = set() > > for (region, date, units) in data: > > dates.add(date) > > regions.add(regions) > > key = (region, date) > > if key not in map: > > map[key] = [] > > map[key].append(data) > > > > Once you have map, regions and dates, you can trivially make a table as > > above. The details will depend on what format you want the table to > have, > > but it should be easy to do. > > [SNIP] The 'brute-force' way is basically what you suggested -- looping > through all the records and building a two-way hash-table of the data. > > The problem of the brute-force' approach is that it is not taking > advantage of facilities of numpy and can be slow in speed. If only > there is some built-in mechanism in numpy to handle this. I'm curious; have you tried this and found it slow, or is this a hunch based on the reputed slowness of Python? Algorithmically, you can't do much better: the dictionary and set operations are O(1), so the whole operation is O(N), and you won't do any better than that, order wise. What your left with is trying to reduce constant factors. There are various ways one might go about reducing constant factors, but they depend on the details of the problem. 
For example, if the dates are dense and you are going to parse them anyway, you could replace the hash with table that you index into with the date as an integer. I'm not sure that you are going to do a lot better than the brute force algorithm in the generic force case though. -tim The other thing I am not sure is in your map object above, do I append > the row number to the numpy array or do I append the row object (such > as data[r])? > > Thanks, > Geoffrey > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Tue Jul 31 10:41:25 2007 From: kwgoodman at gmail.com (Keith Goodman) Date: Tue, 31 Jul 2007 16:41:25 +0200 Subject: [Numpy-discussion] simple problem with arange / roundoff In-Reply-To: <46AF46F9.40800@obs.univ-lyon1.fr> References: <46AF46F9.40800@obs.univ-lyon1.fr> Message-ID: On 7/31/07, Eric Emsellem wrote: > Here is an example: > > start = -30. > end = 30. > npix = 31 > step = (end - start) / (npix - 1.) > gridX = num.arange(start-step/2., end+step/2., step) > array([-31., -29., -27., -25., -23., -21., -19., -17., -15., -13., -11., > -9., -7., -5., -3., -1., 1., 3., 5., 7., 9., 11., > 13., 15., 17., 19., 21., 23., 25., 27., 29.]) > > As you can see, it does not go up to 31., but only to 29, although step > is = 2.0 >> import numpy.matlib as M >> M.linspace(-30,30,31) array([-30., -28., -26., -24., -22., -20., -18., -16., -14., -12., -10., -8., -6., -4., -2., 0., 2., 4., 6., 8., 10., 12., 14., 16., 18., 20., 22., 24., 26., 28., 30.]) From tim.hochberg at ieee.org Tue Jul 31 10:44:20 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Tue, 31 Jul 2007 07:44:20 -0700 Subject: [Numpy-discussion] simple problem with arange / roundoff In-Reply-To: <46AF46F9.40800@obs.univ-lyon1.fr> References: <46AF46F9.40800@obs.univ-lyon1.fr> Message-ID: On 7/31/07, Eric Emsellem wrote: > > Hi, > > I discovered a bug in one of my program probably due to a round-off > problem in a "arange" statement. > I use something like: > > step = (end - start) / (npix - 1.) > gridX = num.arange(start-step/2., end+step/2., step) > > where I wish to get a simple 1D array with npix+1 numbers going from > (start-step/2.) to (end+step/2.). > > But then, "arange" often gets me an array only going from > "start-step/2." to "end - step/2." instead, due very probably to > round-off problems (I guess it does not reach the last value because > <<(start-step/2.) + npix * step >> is found to be larger than > (end+step/2.). > > Here is an example: > > start = -30. > end = 30. > npix = 31 > step = (end - start) / (npix - 1.) > gridX = num.arange(start-step/2., end+step/2., step) > array([-31., -29., -27., -25., -23., -21., -19., -17., -15., -13., -11., > -9., -7., -5., -3., -1., 1., 3., 5., 7., 9., 11., > 13., 15., 17., 19., 21., 23., 25., 27., 29.]) > > As you can see, it does not go up to 31., but only to 29, although step > is = 2.0 > > Is there is a way out of this ? > (except by doing the silly: gridX = num.arange(start-step/2., > end+1.001*step/2., step) ) Yes. Don't use arange with floating point numbers steps. Either write this as something equivalent to: gridX = num.arange(npix) * step + start or use linspace. 
gridX = num.linspace(start, stop, npix) Thanks for any input there (and sorry for the silly question) > > Eric > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From zyzhu2000 at gmail.com Tue Jul 31 11:00:18 2007 From: zyzhu2000 at gmail.com (Geoffrey Zhu) Date: Tue, 31 Jul 2007 10:00:18 -0500 Subject: [Numpy-discussion] How to implement a 'pivot table?' In-Reply-To: References: Message-ID: Hi Timothy, On 7/31/07, Timothy Hochberg wrote: > [SNIP] > > > The 'brute-force' way is basically what you suggested -- looping > > through all the records and building a two-way hash-table of the data. > > > > The problem of the brute-force' approach is that it is not taking > > advantage of facilities of numpy and can be slow in speed. If only > > there is some built-in mechanism in numpy to handle this. > > I'm curious; have you tried this and found it slow, or is this a hunch based > on the reputed slowness of Python? Algorithmically, you can't do much > better: the dictionary and set operations are O(1), so the whole operation > is O(N), and you won't do any better than that, order wise. What your left > with is trying to reduce constant factors. I agree that algorithmically you can't do much better. It is basically a C vs Python thing. One advantage of numpy is that you can do vectorized operations at the speed of C. With this algorithm, we have to process the data element by element and the speed advantage of numpy is lost. Since data has to be stored in python sets and maps, I imagine the storage advantage is also lost. I was in fact looking for some implemention of this algorithm in numpy (and so C) that does exactly this, or some implementation of this algorithm that can leverage the fast numpy routines to do this. I haven't tried it with the real data load yet. I know the number of records will be huge and it is just a hunch that it will be slow. > There are various ways one might go about reducing constant factors, but > they depend on the details of the problem. For example, if the dates are > dense and you are going to parse them anyway, you could replace the hash > with table that you index into with the date as an integer. I'm not sure > that you are going to do a lot better than the brute force algorithm in the > generic force case though. Unfortunately it has to be something generic. Thanks a lot for your help. Geoffrey From Chris.Barker at noaa.gov Tue Jul 31 12:49:59 2007 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 31 Jul 2007 09:49:59 -0700 Subject: [Numpy-discussion] New buffer interface for py3k Message-ID: <46AF6837.2040104@noaa.gov> Hi all, I was just reading about py3k in Guido's Blog: http://www.artima.com/weblogs/viewpost.jsp?thread=208549 And I saw this: * At the C level, there will be a new, much improved buffer API, which will provide better integration with numpy. (PEP 3118) You can read PEP 3118 here: http://www.python.org/dev/peps/pep-3118/ Nice job, Travis and Carl (and everyone else who contributed this)! I'm looking forward to having this (particularly once it gets integrated into a wide variety of other packages) -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Tue Jul 31 13:03:43 2007 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 31 Jul 2007 10:03:43 -0700 Subject: [Numpy-discussion] simple problem with arange / roundoff In-Reply-To: References: <46AF46F9.40800@obs.univ-lyon1.fr> Message-ID: <46AF6B6F.8030608@noaa.gov> Keith Goodman wrote: >>> import numpy.matlib as M >>> M.linspace(-30,30,31) Tim mentioned it, but just to be clear: linspace() is so useful that it now lives in the numpy namespace: >>> numpy.linspace is numpy.matlib.linspace True >>> numpy.linspace(1,10,10) array([ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.]) >>> -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From fperez.net at gmail.com Tue Jul 31 15:54:49 2007 From: fperez.net at gmail.com (Fernando Perez) Date: Tue, 31 Jul 2007 13:54:49 -0600 Subject: [Numpy-discussion] Unpleasant behavior with poly1d and numpy scalar multiplication Message-ID: Hi all, consider this little script: from numpy import poly1d, float, float32 p=poly1d([1.,2.]) three=float(3) three32=float32(3) print 'three*p:',three*p print 'three32*p:',three32*p print 'p*three32:',p*three32 which produces when run: In [3]: run pol1d.py three*p: 3 x + 6 three32*p: [ 3. 6.] p*three32: 3 x + 6 The fact that multiplication between poly1d objects and numbers is: - non-commutative when the numbers are numpy scalars - different for the same number if it is a python float vs a numpy scalar is rather unpleasant, and I can see this causing hard to find bugs, depending on whether your code gets a parameter that came as a python float or a numpy one. This was found today by a colleague on numpy 1.0.4.dev3937. It feels like a bug to me, do others agree? Or is it consistent with a part of the zen of numpy I've missed thus far? Thanks, f From tim.hochberg at ieee.org Tue Jul 31 17:43:09 2007 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Tue, 31 Jul 2007 14:43:09 -0700 Subject: [Numpy-discussion] Unpleasant behavior with poly1d and numpy scalar multiplication In-Reply-To: References: Message-ID: On 7/31/07, Fernando Perez wrote: > > Hi all, > > consider this little script: > > from numpy import poly1d, float, float32 > p=poly1d([1.,2.]) > three=float(3) > three32=float32(3) > > print 'three*p:',three*p > print 'three32*p:',three32*p > print 'p*three32:',p*three32 > > > which produces when run: > > In [3]: run pol1d.py > three*p: > 3 x + 6 > three32*p: [ 3. 6.] > p*three32: > 3 x + 6 > > > The fact that multiplication between poly1d objects and numbers is: > > - non-commutative when the numbers are numpy scalars > - different for the same number if it is a python float vs a numpy scalar > > is rather unpleasant, and I can see this causing hard to find bugs, > depending on whether your code gets a parameter that came as a python > float or a numpy one. > > This was found today by a colleague on numpy 1.0.4.dev3937. It feels > like a bug to me, do others agree? Or is it consistent with a part of > the zen of numpy I've missed thus far? It looks like a bug to me, but it also looks like it's going to be tricky to fix. What looks like is going on is that float32.__mul__ is called first. 
For some reason it calls poly1d.__array__. If one comments out __array__ it ends up doing something odd with __iter__ and __len__ and spitting out a different wrong answer. If both of those are removed, this script works OK. My guess is that this is the scalar object being too clever, but it might just be a bad interaction between the scalar object and poly1d. Poly1d has a lot of, perhaps too much, trickiness. -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From vincent.nijs at gmail.com Tue Jul 31 16:53:55 2007 From: vincent.nijs at gmail.com (Vincent) Date: Tue, 31 Jul 2007 20:53:55 -0000 Subject: [Numpy-discussion] How to implement a 'pivot table?' In-Reply-To: References: Message-ID: <1185915235.065591.136850@e9g2000prf.googlegroups.com> Generating these types of summary statistics is very common in SAS. In SAS you would set up a sequence of procedures. First sort by the variables of interest and then calculate the metrics of interest by the combination of values. In numpy/scipy this might be something like: 1. Sort by date and region 2. Determine 1st and last index of the blocks 3. Calculate mean,sum, etc. for each of the blocks. Based on the earlier arguments I am wondering, however, if this would provide any speed up. I am very interested in this issue so if you implement some general procedures and perhaps speed tests please share them with the list. Best, Vincent On Jul 31, 10:00 am, "Geoffrey Zhu" wrote: > Hi Timothy, > > On 7/31/07, Timothy Hochberg wrote: > > > [SNIP] > > > > The 'brute-force' way is basically what you suggested -- looping > > > through all the records and building a two-way hash-table of the data. > > > > The problem of the brute-force' approach is that it is not taking > > > advantage of facilities of numpy and can be slow in speed. If only > > > there is some built-in mechanism in numpy to handle this. > > > I'm curious; have you tried this and found it slow, or is this a hunch based > > on the reputed slowness of Python? Algorithmically, you can't do much > > better: the dictionary and set operations are O(1), so the whole operation > > is O(N), and you won't do any better than that, order wise. What your left > > with is trying to reduce constant factors. > > I agree that algorithmically you can't do much better. It is basically > a C vs Python thing. One advantage of numpy is that you can do > vectorized operations at the speed of C. With this algorithm, we have > to process the data element by element and the speed advantage of > numpy is lost. Since data has to be stored in python sets and maps, I > imagine the storage advantage is also lost. > > I was in fact looking for some implemention of this algorithm in numpy > (and so C) that does exactly this, or some implementation of this > algorithm that can leverage the fast numpy routines to do this. > > I haven't tried it with the real data load yet. I know the number of > records will be huge and it is just a hunch that it will be slow. > > > There are various ways one might go about reducing constant factors, but > > they depend on the details of the problem. For example, if the dates are > > dense and you are going to parse them anyway, you could replace the hash > > with table that you index into with the date as an integer. I'm not sure > > that you are going to do a lot better than the brute force algorithm in the > > generic force case though. > > Unfortunately it has to be something generic. > > Thanks a lot for your help. 
> Geoffrey > _______________________________________________ > Numpy-discussion mailing list > Numpy-discuss... at scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion From sfkings at gmail.com Mon Jul 30 21:18:23 2007 From: sfkings at gmail.com (kingshuk ghosh) Date: Mon, 30 Jul 2007 18:18:23 -0700 Subject: [Numpy-discussion] numpy installation problem Message-ID: <7fd38bfa0707301818kd280ce2vdb1e1cb0b0a23111@mail.gmail.com> Hi, I downloaded numpy1.0.3-2.tar and unzipped and untarred it. However, somehow the new numpy does not work. It invokes the old numpy 0.9.6 when I import numpy from python and type in numpy.version.version. I tried to change the path, and once I do that and import numpy it says "running from source directory", and then if I try numpy.version.version it gives some error. Is there something obvious I am missing after unzipping and untarring the numpy source file? For example, do I need to do something to install the new numpy 1.0.3? Or do I also need to download the full Python package? I am trying to run this on Red Hat Linux 3.2.2-5, which has gcc 3.2.2, and the version of Python is 2.4. Any help will be greatly appreciated. Cheers Kings -------------- next part -------------- An HTML attachment was scrubbed... URL:
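For reference, a minimal sketch of the source-install steps that question implies (the tarball and directory names are illustrative); the "running from source directory" message usually means numpy is being imported from inside the unpacked source tree rather than from an installed copy:

$ tar xf numpy-1.0.3-2.tar        # unpack the source tarball
$ cd numpy-1.0.3
$ python setup.py build           # build with the same python you will run it under
$ python setup.py install         # may need root, or add --prefix=$HOME/local
$ cd ..                           # leave the source tree before importing
$ python -c "import numpy; print(numpy.version.version)"

If the old 0.9.6 still shows up after this, the install most likely went into a different Python than the one being run.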