From hurak at control.felk.cvut.cz  Wed Feb 1 06:22:03 2006
From: hurak at control.felk.cvut.cz (Zdeněk Hurák)
Date: Wed Feb 1 06:22:03 2006
Subject: [Numpy-discussion] Re: Numerical Mathematics Consortium vs. scipy and numpy
References: <43DE59CB.6070601@noaa.gov>
Message-ID:

Christopher Barker wrote:

> In general,
> unfortunately, it looks to be quite commercially-focused: how does one
> get the open source community to be represented in this kind of thing?

I do not know the details, but one of the members of the consortium - the
Scilab consortium - is kind of open-source. Well, their license is not
purely free (http://www.scilab.org/legal/license.html,
http://www.scilab.org/legal/index_legal.php?page=faq.html#q6), but it is
definitely not a commercial project. Perhaps some Scilab people could
answer some open-source related questions.

Note that I am not related to the NMC in any way; it is really just that I
found this link, and as a newcomer to the Python computing community, I am
simply interested in what the attitudes towards these issues are here.

Zdenek

From chanley at stsci.edu  Wed Feb 1 06:47:03 2006
From: chanley at stsci.edu (Christopher Hanley)
Date: Wed Feb 1 06:47:03 2006
Subject: [Numpy-discussion] numpy.dtype problem
Message-ID: <43E0C9A8.6080200@stsci.edu>

The following seems to have stopped working:

In [6]: import numpy

In [7]: a = numpy.ones((3,3),dtype=numpy.int32)

In [8]: a.dtype.name
---------------------------------------------------------------------------
exceptions.MemoryError               Traceback (most recent call last)

/data/sparty1/dev/devCode/

MemoryError:

Chris

From cjw at sympatico.ca  Wed Feb 1 08:17:09 2006
From: cjw at sympatico.ca (Colin J. Williams)
Date: Wed Feb 1 08:17:09 2006
Subject: [Numpy-discussion] RE: [Numpy-user] possible error with isarrtype
In-Reply-To: <43E03349.10208@ieee.org>
References: <43DFD598.5000503@colorado.edu> <43E0309D.5050700@sympatico.ca> <43E03349.10208@ieee.org>
Message-ID: <43E0DE8D.6020907@sympatico.ca>

Travis Oliphant wrote:

> Colin J. Williams wrote:
>
>> One of the deprecated names is ArrayType. This seems to be closer to
>> the Python style than ndarray.
>
> Not really.

I agree with what you say below, but doesn't ArrayType have a greater
similarity to the Python types than ndarray?

[Dbg]>>> import types
[Dbg]>>> dir(types)
['BooleanType', 'BufferType', 'BuiltinFunctionType', 'BuiltinMethodType',
'ClassType', 'CodeType', 'ComplexType', 'DictProxyType', 'DictType',
'DictionaryType', 'EllipsisType', 'FileType', 'FloatType', 'FrameType',
'FunctionType', 'GeneratorType', 'InstanceType', 'IntType', 'LambdaType',
'ListType', 'LongType', 'MethodType', 'ModuleType', 'NoneType',
'NotImplementedType', 'ObjectType', 'SliceType', 'StringType', 'StringTypes',
'TracebackType', 'TupleType', 'TypeType', 'UnboundMethodType', 'UnicodeType',
'XRangeType', '__builtins__', '__doc__', '__file__', '__name__']
[Dbg]>>>

I presume that the aim is still that numpy will become a part of the Python
offering.

Colin W.
> Rather than test:
> type(var) == types.IntType
> you should be testing
> isinstance(var, int)
>
> just like rather than testing
> type(somearray) == ArrayType
>
> you should be testing
> isinstance(somearray, ndarray)
>
> Python style has changed a bit since 2.2 allowed sub-typing builtins
>
> -Travis

From faltet at carabos.com  Wed Feb 1 08:38:01 2006
From: faltet at carabos.com (Francesc Altet)
Date: Wed Feb 1 08:38:01 2006
Subject: [Numpy-discussion] numpy.dtype problem
In-Reply-To: <43E0C9A8.6080200@stsci.edu>
References: <43E0C9A8.6080200@stsci.edu>
Message-ID: <200602011736.53745.faltet@carabos.com>

On Wednesday 01 February 2006 15:46, Christopher Hanley wrote:
> The following seems to have stopped working:
>
> In [6]: import numpy
>
> In [7]: a = numpy.ones((3,3),dtype=numpy.int32)
>
> In [8]: a.dtype.name
> ---------------------------------------------------------------------------
> exceptions.MemoryError               Traceback (most
> recent call last)
>
> /data/sparty1/dev/devCode/
>
> MemoryError:

Below is a patch for this. It seems to me that Travis is introducing new
*scalar* data types. I'm not sure if they should appear in this case, but
perhaps he can throw some light on this.

Cheers,

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"

Index: numpy/core/src/arrayobject.c
===================================================================
--- numpy/core/src/arrayobject.c        (revision 2043)
+++ numpy/core/src/arrayobject.c        (working copy)
@@ -8132,18 +8132,21 @@
 static PyObject *
 arraydescr_typename_get(PyArray_Descr *self)
 {
-    int len;
-    PyTypeObject *typeobj = self->typeobj;
+    int len;
+    char *w_unders;
+    PyTypeObject *typeobj = self->typeobj;
     PyObject *res;
-    /* Both are equivalents, but second is more resistent to changes */
-/*  len = strlen(typeobj->tp_name) - 8; */
-
     if (PyTypeNum_ISUSERDEF(self->type_num)) {
         res = PyString_FromString(typeobj->tp_name);
     }
     else {
-        len = strchr(typeobj->tp_name, (int)'_')-(typeobj->tp_name);
+        w_unders = strchr(typeobj->tp_name, (int)'_');
+        if (w_unders != NULL)
+            len = w_unders-(typeobj->tp_name);
+        else
+            /* '_' not found! returning the complete name! */
+            len = strlen(typeobj->tp_name);
         res = PyString_FromStringAndSize(typeobj->tp_name, len);
     }
     if (PyTypeNum_ISEXTENDED(self->type_num) && self->elsize != 0) {

From gerard.vermeulen at grenoble.cnrs.fr  Wed Feb 1 09:16:59 2006
From: gerard.vermeulen at grenoble.cnrs.fr (Gerard Vermeulen)
Date: Wed Feb 1 09:16:59 2006
Subject: [Numpy-discussion] RE: [Numpy-user] possible error with isarrtype
In-Reply-To: <43E0DE8D.6020907@sympatico.ca>
References: <43DFD598.5000503@colorado.edu> <43E0309D.5050700@sympatico.ca> <43E03349.10208@ieee.org> <43E0DE8D.6020907@sympatico.ca>
Message-ID: <20060201181434.0cbb368a.gerard.vermeulen@grenoble.cnrs.fr>
Williams" wrote: [ depration + style ] > [Dbg]>>> import types > [Dbg]>>> dir(types) > ['BooleanType', 'BufferType', 'BuiltinFunctionType', > 'BuiltinMethodType', 'ClassType', 'CodeType', 'ComplexType', > 'DictProxyType', 'DictType', 'DictionaryType', 'EllipsisType', > 'FileType', 'FloatType', 'FrameType', 'FunctionType', 'GeneratorType', > 'Instance > Type', 'IntType', 'LambdaType', 'ListType', 'LongType', 'MethodType', > 'ModuleType', 'NoneType', 'NotImplementedType', 'ObjectType', > 'SliceType', 'StringType', 'StringTypes', 'TracebackType', 'TupleType', > 'TypeType', 'UnboundMethodType', 'UnicodeType', 'XRan > geType', '__builtins__', '__doc__', '__file__', '__name__'] > [Dbg]>>> > Isn't the types module becoming superfluous? [packer at zombie ~]$ python -E Python 2.4 (#2, Feb 12 2005, 00:29:46) [GCC 3.4.3 (Mandrakelinux 10.2 3.4.3-3mdk)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> long >>> int >>> str >>> dict >>> Gerard From oliphant.travis at ieee.org Wed Feb 1 09:31:08 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Wed Feb 1 09:31:08 2006 Subject: [Numpy-discussion] numpy.dtype problem In-Reply-To: <200602011736.53745.faltet@carabos.com> References: <43E0C9A8.6080200@stsci.edu> <200602011736.53745.faltet@carabos.com> Message-ID: <43E0F037.9030503@ieee.org> Francesc Altet wrote: >A Dimecres 01 Febrer 2006 15:46, Christopher Hanley va escriure: > > >>The following seems to have stopped working: >> >> >>In [6]: import numpy >> >>In [7]: a = numpy.ones((3,3),dtype=numpy.int32) >> >>In [8]: a.dtype.name >>--------------------------------------------------------------------------- >>exceptions.MemoryError Traceback (most >>recent call last) >> >>/data/sparty1/dev/devCode/ >> >>MemoryError: >> >> > >Below is a patch for this. It seems to me that Travis is introducing >new *scalar data types. I'm not sure if they should appear in this >case, but perhaps he can throw some light on this. > > No, I'm not introducing anything new. I just changed the name of the scalar type objects. They used to be conveying type information for the array (but that is now handled by the dtype objects), and so I changed the name of the scalars from _arrtype to scalar to better convey what they are. The code in the name attribute getter was expecting an underscore which isn't there anymore. -Travis From oliphant at ee.byu.edu Wed Feb 1 11:37:09 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed Feb 1 11:37:09 2006 Subject: [Numpy-discussion] Re: NumPy Behavior In-Reply-To: <1138822085.2972.38.camel@d-128-95-235-238.dhcp4.washington.edu> References: <1138759409.5372.14.camel@d-128-95-235-238.dhcp4.washington.edu> <43E0328F.5000400@ieee.org> <1138768894.2965.10.camel@zen> <43E05EC6.9090302@ieee.org> <1138779155.4596.20.camel@zen> <43E0676A.8050401@ieee.org> <1138822085.2972.38.camel@d-128-95-235-238.dhcp4.washington.edu> Message-ID: <43E10DC8.4040002@ee.byu.edu> Jay Painter wrote: >Travis, > >I updated to the latest svn code this morning and ran some mmlib tests. >Your commit has fixed the segmentation fault/illegal instruction error, >but the mmlib test suite fails. I'll look into it tonight, it may just >be intentional differences between Numeric and NumPy. I'll let you know >what I find. > > Be sure to look over the list of differences in the sample chapter of my book (available at numeric.scipy.org) The numpy.lib.convertcode module can be used to make most of the changes (but there may be a few it misses). 
>As I've been working with Numeric, I have often desired some particular
>features which I'd be willing to work on with NumPy. Maybe these have
>come up before?
>
>1) Alternative implementations of matrixmultiply() and
>eigenvalues()/eigenvectors() for symmetric matrices. For example, there
>is an analytic expression for the eigenvalues of a 3x3 symmetric matrix.
>
>2) New C implemented vectorM and matrixMN classes which support the
>array interface. This could allow for lower memory usage via pool
>allocations and the customized implementations in item #1. The ones I
>wish were there are:
>
>class vector3:
>class vector4:
>class matrix33:
>class matrix44:
>class symmetric_matrix33:
>class symmetric_matrix44:
>
>Given this, here's a useful function for graphics applications:
>
>matrixmultiply431(type matrix44, type vector3)
>
>This function multiplies the 4x4 matrix by the three dimensional vector
>by implicitly adding a fourth element with a value of 1.0 to the vector.

This is actually a benefit of the array interface. It allows many different
objects to *be* arrays and allows fast conversions when possible.
Specialized small-arrays are a good idea, I think, just like specialized
(sparse) large arrays.

Perhaps it would make sense to define a base-class array object that has
only a very few things defined, like the number of dimensions, a pointer to
the actual memory, the flags, and perhaps a pointer to the typeobject. This
would leave things like how the dimensions are stored up for sub-classes to
define.

-Travis

From oliphant at ee.byu.edu  Wed Feb 1 14:17:13 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Wed Feb 1 14:17:13 2006
Subject: [Numpy-discussion] Examples of new (nested) record-types
Message-ID: <43E13345.9020701@ee.byu.edu>

I've just checked in some tests for the nested-record support in numpy.
These tests were written by Francesc Altet and are very useful (they helped
track down at least two reference-counting errors). But, a big utility they
have is to show a method for defining and constructing arrays of nested
records.

Anybody wanting to figure out how to use that facility in NumPy would
benefit from looking at the code in

/numpy/core/tests/test_numerictypes.py

in the SVN version of NumPy.

-Travis

P.S. Here is an example of the kind of structure he makes arrays of in this
file...

# This is the structure of the table used for nested objects (DON'T PANIC!):
#
# +-+---------------------------------+-----+----------+-+-+
# |x|Info                             |color|info      |y|z|
# | +-----+--+----------------+----+--+     +----+-----+ | |
# | |value|y2|Info2           |name|z2|     |Name|Value| | |
# | | |   |  +----+-----+--+--+    |  |     |    |     | | |
# | | |   |  |name|value|y3|z3|    |  |     |    |     | | |
# +-+-----+--+----+-----+--+--+----+--+-----+----+-----+-+-+
#

After defining an array of these guys you could get at an array of y3
fields using

a['Info']['Info2']['y3']

Or,

reca = a.view(recarray)
reca.Info.Info2.y3

-Travis
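A condensed illustration of the same nesting; the field types below are
invented for the sketch, and the real definitions live in
test_numerictypes.py:

>>> import numpy
>>> info2 = numpy.dtype([('name', 'S8'), ('value', 'f8'), ('y3', 'f8'), ('z3', 'i4')])
>>> info = numpy.dtype([('value', 'f8'), ('y2', 'f8'), ('Info2', info2), ('name', 'S8'), ('z2', 'i4')])
>>> a = numpy.zeros(2, dtype=[('x', 'i4'), ('Info', info), ('color', 'S2'), ('y', 'f8'), ('z', 'i4')])
>>> a['Info']['Info2']['y3']            # drill down through the nested fields
array([ 0.,  0.])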
Williams" wrote: > >[ depration + style ] > > > >>[Dbg]>>> import types >>[Dbg]>>> dir(types) >>['BooleanType', 'BufferType', 'BuiltinFunctionType', >>'BuiltinMethodType', 'ClassType', 'CodeType', 'ComplexType', >>'DictProxyType', 'DictType', 'DictionaryType', 'EllipsisType', >>'FileType', 'FloatType', 'FrameType', 'FunctionType', 'GeneratorType', >>'Instance >>Type', 'IntType', 'LambdaType', 'ListType', 'LongType', 'MethodType', >>'ModuleType', 'NoneType', 'NotImplementedType', 'ObjectType', >>'SliceType', 'StringType', 'StringTypes', 'TracebackType', 'TupleType', >>'TypeType', 'UnboundMethodType', 'UnicodeType', 'XRan >>geType', '__builtins__', '__doc__', '__file__', '__name__'] >>[Dbg]>>> >> >> >> > >Isn't the types module becoming superfluous? > > > That's the point I was trying to make. ArrayType is to ndarray as DictionaryType is to dict. My understanding is that the use of types.DictionaryType is discouraged. -Travis From chris at pseudogreen.org Wed Feb 1 14:40:13 2006 From: chris at pseudogreen.org (Christopher Stawarz) Date: Wed Feb 1 14:40:13 2006 Subject: [Numpy-discussion] Patch for scalartypes.inc.src Message-ID: <6e9a368d2cc315a6b218e210b361afdd@pseudogreen.org> I ran into a couple bugs in scalartypes.inc.src: - The switch statement in PyArray_ScalarAsCtype was missing some breaks. - When SIZEOF_LONGDOUBLE == SIZEOF_DOUBLE, PREC_REPR and PREC_STR should both be 17, not 15. (This matches what's done in floatobject.c for Python's float.) The attached patch (against SVN revision 2045) fixes both problems. Cheers, Chris -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: scalartypes_patch.txt URL: From agn at noc.soton.ac.uk Wed Feb 1 15:04:01 2006 From: agn at noc.soton.ac.uk (George Nurser) Date: Wed Feb 1 15:04:01 2006 Subject: [Numpy-discussion] numpy with ACML In-Reply-To: <200601282102.53281.luszczek@cs.utk.edu> References: <4A52806B-348E-4138-92B8-1D3F50E1D39B@noc.soton.ac.uk> <200601282102.53281.luszczek@cs.utk.edu> Message-ID: <7577860F-0907-4EA0-9A61-3EABE50A90F4@noc.soton.ac.uk> > There is code for that on netlib: > http://www.netlib.org/blas/blast-forum/cblas.tgz > > I used it myself for my C code before and it worked just fine. > > Piotr Piotr, Thanks. I got numpy to work using the cblas & acml. Details at the bottom of the email. I then ran the bench.py tests on numpy [1 processor Opteron ?1.8 GHZ] and got slightly unexpected answers: numpy times given both linked to cblas+acml and not linked. 
Neither of numarray, Numeric linked to any blas:

python bench.py
Tests    x.T*y    x*y.T    A*x       A*B      A.T*x    half     2in2

Dimension: 5
Array    0.5700   0.1600   0.1200    0.1600   0.6200   0.4300   0.4800  --acml+cblas
Matrix   3.1000   0.9300   0.4000    0.4600   0.6500   1.7000   2.6200  --acml+cblas
Array    0.6400   0.1700   0.1500    0.1800   0.6100   0.3600   0.4000
Matrix   3.2300   0.6900   0.4100    0.4600   0.6700   1.4900   2.3400
NumArr   1.2100   2.8500   0.2700    2.8600   5.0000   4.1100   6.8300
Numeri   0.7300   0.1800   0.1600    0.2000   0.4100   0.3300   0.4300

Dimension: 50
Array    5.9200   0.8400   0.2900    6.9300   8.0900   2.3600   2.4500  --acml+cblas
Matrix  30.5500   1.8500   0.6000    7.4500   0.9300   3.7100   4.6400  --acml+cblas
Array    6.5900   2.7100   0.7500   25.3100   8.5000   0.5600   0.6100
Matrix  32.5200   3.2600   1.0200   25.6100   1.2900   1.7400   2.5900
NumArr  12.6600   3.9700   0.7400   27.7900   6.4900   4.5500   7.1900
Numeri   7.9700   1.5000   0.6500   24.2700   7.4200   0.6000   2.3200

Dimension: 500
Array    0.9800   3.2900   0.6100   65.0000  10.8600   2.3100   2.5500  --acml+cblas
Matrix   3.5300   3.3500   0.6400   64.9300   0.6500   2.3300   2.6100  --acml+cblas
Array    1.0900   4.5600   0.8300  589.0000  11.0700   0.1300   0.2600
Matrix   3.7000   4.5800   0.8400  593.7300   1.1700   0.1300   0.3200
NumArr   1.6700   3.3100   0.7700  417.5600   4.3900   0.8500   1.1000
Numeri   1.1900   3.5200   0.7800  559.8100   9.7400   0.8000   2.4100

-- acml+cblas indeed speeds up matrix multiplication by a factor of 10, but
-- doesn't really help vector dot products.
-- slows down searching operations (half, 2in2) by a factor of 10.

Matrices generally much slower than arrays, except for A.T*x, which is ~10x
faster for matrices.

I also tried with the goto blas library linked in with cblas. Similar
results, except slightly faster x.T*y. But trickier to get linked.

--George Nurser

----------------------------------------------------------------------------

making the cblas.a library was straightforward. I just changed the flags in
Makefile.LINUX to:

CFLAGS = -O3 -DADD_ -pthread -fno-strict-aliasing -m64 -msse2 -mfpmath=sse -march=opteron -fPIC
FFLAGS = -Wall -fno-second-underscore -fPIC -O3 -funroll-loops -march=opteron -mmmx -msse2 -msse -m3dnow
RANLIB = ranlib
BLLIB = where libacml.so lives/libacml.so

then link Makefile.LINUX to Makefile.in and make.

The resulting cblas.a must then be moved or linked to libcblas.a in the
*same* directory as the libacml.so. This directory then needs to be added
to the $LD_LIBRARY_PATH if it is not a standard one.

I needed a site.cfg in numpy/numpy/distutils/site.cfg as follows:

[blas]
blas_libs = cblas, acml
library_dirs = where libacml.so lives
include_dirs = where cblas.h lives

[lapack]
language = f77
lapack_libs = acml
library_dirs = where libacml.so lives
include_dirs = where acml *.h live

Then numpy and scipy both seem to build fine. numpy passes t=numpy.test(),
scipy passes scipy.test(level=10).
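For anyone repeating the experiment, a minimal self-contained timing check
in the spirit of the A*B column above; the 500x500 size and the use of
time.clock() are arbitrary choices for this sketch:

import time
import numpy

a = numpy.ones((500, 500))
t0 = time.clock()
numpy.dot(a, a)                          # dispatches to the linked BLAS, if any
print "500x500 dot: %.3f s" % (time.clock() - t0)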
From oliphant at ee.byu.edu  Wed Feb 1 15:46:03 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Wed Feb 1 15:46:03 2006
Subject: [Numpy-discussion] NumPy SVN not building....
Message-ID: <43E1480D.8040503@ee.byu.edu>

After changeset 2046, I'm not able to build NumPy.

This is what I'm getting...

Revision 2045 works fine.

##### msg: Extension instance has no attribute '__getitem__'
Extension instance has no attribute '__getitem__'
  FOUND:
    libraries = ['ptf77blas', 'ptcblas', 'atlas']
    library_dirs = ['/usr/lib/atlas']
    language = c
    define_macros = [('NO_ATLAS_INFO', 2)]
    include_dirs = ['/usr/include/atlas']

Warning: distutils distribution has been initialized, it may be too late to
add an extension _dotblas
Traceback (most recent call last):
  File "setup.py", line 76, in ?
    setup_package()
  File "setup.py", line 63, in setup_package
    config.add_subpackage('numpy')
  File "/home/oliphant/numpy/numpy/distutils/misc_util.py", line 543, in add_subpackage
    config = self.get_subpackage(subpackage_name,subpackage_path)
  File "/home/oliphant/numpy/numpy/distutils/misc_util.py", line 533, in get_subpackage
    config = setup_module.configuration(*args)
  File "/home/oliphant/numpy/numpy/setup.py", line 10, in configuration
    config.add_subpackage('core')
  File "/home/oliphant/numpy/numpy/distutils/misc_util.py", line 543, in add_subpackage
    config = self.get_subpackage(subpackage_name,subpackage_path)
  File "/home/oliphant/numpy/numpy/distutils/misc_util.py", line 533, in get_subpackage
    config = setup_module.configuration(*args)
  File "numpy/core/setup.py", line 207, in configuration
    config.add_data_dir('tests')
  File "/home/oliphant/numpy/numpy/distutils/misc_util.py", line 594, in add_data_dir
    self.add_data_files((ds,filenames))
  File "/home/oliphant/numpy/numpy/distutils/misc_util.py", line 660, in add_data_files
    dist.data_files.extend(data_dict.items())
AttributeError: 'NoneType' object has no attribute 'extend'

From cookedm at physics.mcmaster.ca  Wed Feb 1 15:54:09 2006
From: cookedm at physics.mcmaster.ca (David M. Cooke)
Date: Wed Feb 1 15:54:09 2006
Subject: [Numpy-discussion] NumPy SVN not building....
In-Reply-To: <43E1480D.8040503@ee.byu.edu> (Travis Oliphant's message of "Wed, 01 Feb 2006 16:45:17 -0700")
References: <43E1480D.8040503@ee.byu.edu>
Message-ID:

Travis Oliphant writes:

> After changeset 2046, I'm not able to build NumPy.

Obviously my fault then. I'll poke at it.

[dave learns yet again to test before committing...]

>
> This is what I'm getting...
>
> Revision 2045 works fine.
>
> ##### msg: Extension instance has no attribute '__getitem__'
> Extension instance has no attribute '__getitem__'
>   FOUND:
>     libraries = ['ptf77blas', 'ptcblas', 'atlas']
>     library_dirs = ['/usr/lib/atlas']
>     language = c
>     define_macros = [('NO_ATLAS_INFO', 2)]
>     include_dirs = ['/usr/include/atlas']
>
> Warning: distutils distribution has been initialized, it may be too
> late to add an extension _dotblas
> Traceback (most recent call last):
>   File "setup.py", line 76, in ?
>     setup_package()
>   File "setup.py", line 63, in setup_package
>     config.add_subpackage('numpy')
>   File "/home/oliphant/numpy/numpy/distutils/misc_util.py", line 543, in add_subpackage
>     config = self.get_subpackage(subpackage_name,subpackage_path)
>   File "/home/oliphant/numpy/numpy/distutils/misc_util.py", line 533, in get_subpackage
>     config = setup_module.configuration(*args)
>   File "/home/oliphant/numpy/numpy/setup.py", line 10, in configuration
>     config.add_subpackage('core')
>   File "/home/oliphant/numpy/numpy/distutils/misc_util.py", line 543, in add_subpackage
>     config = self.get_subpackage(subpackage_name,subpackage_path)
>   File "/home/oliphant/numpy/numpy/distutils/misc_util.py", line 533, in get_subpackage
>     config = setup_module.configuration(*args)
>   File "numpy/core/setup.py", line 207, in configuration
>     config.add_data_dir('tests')
>   File "/home/oliphant/numpy/numpy/distutils/misc_util.py", line 594, in add_data_dir
>     self.add_data_files((ds,filenames))
>   File "/home/oliphant/numpy/numpy/distutils/misc_util.py", line 660, in add_data_files
>     dist.data_files.extend(data_dict.items())
> AttributeError: 'NoneType' object has no attribute 'extend'

--
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke                      http://arbutus.physics.mcmaster.ca/dmc/
|cookedm at physics.mcmaster.ca

From cookedm at physics.mcmaster.ca  Wed Feb 1 16:43:02 2006
From: cookedm at physics.mcmaster.ca (David M. Cooke)
Date: Wed Feb 1 16:43:02 2006
Subject: [Numpy-discussion] NumPy SVN not building....
In-Reply-To: (David M. Cooke's message of "Wed, 01 Feb 2006 18:53:25 -0500")
References: <43E1480D.8040503@ee.byu.edu>
Message-ID:

cookedm at physics.mcmaster.ca (David M. Cooke) writes:

> Travis Oliphant writes:
>
>> After changeset 2046, I'm not able to build NumPy.
>
> Obviously my fault then. I'll poke at it.
>
> [dave learns yet again to test before committing...]

Ok, 2048 fixes it. (Used a wrong variable name when refactoring)

--
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke                      http://arbutus.physics.mcmaster.ca/dmc/
|cookedm at physics.mcmaster.ca

From oliphant.travis at ieee.org  Thu Feb 2 06:30:12 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Thu Feb 2 06:30:12 2006
Subject: [Numpy-discussion] Re: [SciPy-user] scipy 0.4.6 release?
In-Reply-To: <43E1F355.1050001@ftw.at>
References: <43DA1A6F.3050103@hoc.net> <1138380174.43da4d8ed9ae6@webmail.colorado.edu> <43E1A561.6000103@dslextreme.com> <43E1F355.1050001@ftw.at>
Message-ID: <43E21738.4060105@ieee.org>

Ed Schofield wrote:

>Erick Tryzelaar wrote:
>
>>Any chance we could get a minor version bump to fix this and the
>>dtypechar/dtype.char bug in Lib/weave/standard_array_spec.py (both
>>already fixed in svn)? These two changes get my weave test code and then
>>I can release my darwinports package. Thanks,
>
>I think this is a good idea. The most recent release (0.4.4) also isn't
>compatible with the latest NumPy (0.9.4). I could work on making a new
>release this weekend if people agree.

I'll roll out NumPy 0.9.5 at the same time so we have two versions that
work together. There have been some bug-fixes and a few (minor) feature
changes. But, I am running out of numbers for 1.0 release :-)

-Travis

From schofield at ftw.at  Thu Feb 2 07:39:15 2006
From: schofield at ftw.at (Ed Schofield)
Date: Thu Feb 2 07:39:15 2006
Subject: [Numpy-discussion] Re: [SciPy-user] scipy 0.4.6 release?
In-Reply-To: <43E21738.4060105@ieee.org>
References: <43DA1A6F.3050103@hoc.net> <1138380174.43da4d8ed9ae6@webmail.colorado.edu> <43E1A561.6000103@dslextreme.com> <43E1F355.1050001@ftw.at> <43E21738.4060105@ieee.org>
Message-ID: <43E2275F.3050705@ftw.at>

Travis Oliphant wrote:

> Ed Schofield wrote:
>
>> I think this is a good idea. The most recent release (0.4.4) also isn't
>> compatible with the latest NumPy (0.9.4). I could work on making a new
>> release this weekend if people agree.
>
> I'll roll out NumPy 0.9.5 at the same time so we have two versions
> that work together. There have been some bug-fixes and a few (minor)
> feature changes. But, I am running out of numbers for 1.0 release :-)

That sounds good :) How about a stream of 1.0 release candidates for Numpy,
starting with 1.0-rc1?

For what it's worth, I think we should exercise some patience and caution
before releasing a 1.0 version of NumPy, because this is likely to signify
an API freeze. The recent dtype changes are a case in point -- the API is
cleaner now, but the change required many small changes in SciPy. SciPy is
lucky to have helpful developers close to NumPy too, but some other
projects won't be able to respond as quickly to compatibility-breaking
improvements.

Some things I have in mind: stronger type-checking for unsafe casts, and
ensuring operations on matrices return matrices ... ;)

-- Ed

From svetosch at gmx.net  Thu Feb 2 08:07:03 2006
From: svetosch at gmx.net (Sven Schreiber)
Date: Thu Feb 2 08:07:03 2006
Subject: [Numpy-discussion] Re: [SciPy-user] scipy 0.4.6 release?
In-Reply-To: <43E2275F.3050705@ftw.at>
References: <43DA1A6F.3050103@hoc.net> <1138380174.43da4d8ed9ae6@webmail.colorado.edu> <43E1A561.6000103@dslextreme.com> <43E1F355.1050001@ftw.at> <43E21738.4060105@ieee.org> <43E2275F.3050705@ftw.at>
Message-ID: <43E22E01.3000604@gmx.net>

Ed Schofield wrote:

> ensuring operations on matrices return matrices ... ;)

Yes please! I'm so glad to have you on my side...

-Sven
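For concreteness, the guarantee Ed and Sven are asking for can be checked
like this; identity tests are used instead of printed reprs, which vary
between releases:

>>> from numpy import matrix
>>> m = matrix([[1., 2.], [3., 4.]])
>>> type(m * m) is matrix               # products should stay matrices
True
>>> type(m.T) is matrix                 # ...and so should transposes
True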
From faltet at carabos.com  Thu Feb 2 10:20:03 2006
From: faltet at carabos.com (Francesc Altet)
Date: Thu Feb 2 10:20:03 2006
Subject: [Numpy-discussion] Beta support for Numpy in PyTables
Message-ID: <200602021919.31191.faltet@carabos.com>

Hi,

As some of you know (and are impatiently waiting for ;-) I'm in the process
of adding support for numpy in PyTables, so that users can transparently
make use of numpy objects (or Numeric or numarray). Well, I'm glad to say
that the process is almost done (bar some issues with character array
support and unicode arrays).

Fortunately, thanks to the provision of the array interface, saving and
reading numpy objects has nearly the same performance as using numarray
objects (which, as you know, are in the core of PyTables). So, if you want
to have a try at the new PyTables, you can download this preliminary
version of it from:

http://pytables.carabos.com/download/preliminary/pytables-1.3beta1.tar.gz

I'm attaching some examples so that you can see how to use numpy in
combination with PyTables, by simply specifying the correct flavor in the
data type definition for Table and EArray objects (the same goes for CArray
and VLArray, although no examples are provided here).

Also, as I already said on other occasions, once numpy stabilizes enough,
we plan to make numpy the core data container for PyTables. Meanwhile,
please test the PyTables/numpy combination and report any error or glitch
that you may notice, so that it can get as stable as possible in order to
ease the transition numarray-->numpy.

Cheers,

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"

-------------- next part --------------
A non-text attachment was scrubbed...
Name: array1-numpy.py
Type: application/x-python
Size: 1200 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: earray1-numpy.py
Type: application/x-python
Size: 490 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: table1-numpy.py
Type: application/x-python
Size: 1218 bytes
Desc: not available
URL:
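Since the array interface is what makes this zero-copy exchange possible, a
minimal sketch of the protocol may help; the RawBuffer class below is
invented for illustration:

import numpy

class RawBuffer(object):
    # Hypothetical foreign container that re-exports the memory of a real
    # array through the array interface protocol.
    def __init__(self, arr):
        self._arr = arr                          # keep the memory owner alive
        self.__array_interface__ = arr.__array_interface__

buf = RawBuffer(numpy.arange(5, dtype=numpy.float64))
print numpy.asarray(buf)                         # wrapped without copying elements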
From mfmorss at aep.com  Thu Feb 2 12:00:06 2006
From: mfmorss at aep.com (mfmorss at aep.com)
Date: Thu Feb 2 12:00:06 2006
Subject: [Numpy-discussion] Beta support for Numpy in PyTables
In-Reply-To: <200602021919.31191.faltet@carabos.com>
Message-ID:

This is good news. We're planning, tentatively, to implement a big project
in Python that would make heavy use both of Numpy and Pytables. We're also
waiting, however, for Numpy to stabilize. It's a little disconcerting to
see how turbulent it is right now.

Mark F. Morss
Principal Analyst, Market Risk
American Electric Power

From oliphant.travis at ieee.org  Thu Feb 2 12:38:02 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Thu Feb 2 12:38:02 2006
Subject: [Numpy-discussion] Beta support for Numpy in PyTables
In-Reply-To:
References:
Message-ID: <43E26D5F.7030404@ieee.org>

mfmorss at aep.com wrote:

>This is good news. We're planning, tentatively, to implement a big project
>in Python that would make heavy use both of Numpy and Pytables. We're also
>waiting, however, for Numpy to stabilize. It's a little disconcerting
>to see how turbulent it is right now.

The only way for it to stabilize is for people to start using it. So, dive
in. The upheaval of the first of the year is behind us. I don't see any
major changes in the works. The only possibility is a few new C-API calls
to make scalar math work more easily.

-Travis
From ndarray at mac.com  Thu Feb 2 15:54:16 2006
From: ndarray at mac.com (Sasha)
Date: Thu Feb 2 15:54:16 2006
Subject: [Numpy-discussion] Learning "strides"
Message-ID:

I don't know if this came from numarray or not, but for me, as someone who
transitions from Numeric, the "strides" attribute of an ndarray is a new
feature. I've spent some time playing with it and there are some properties
that I dislike. Some of these undesired properties are probably bugs and
easy to fix, but others require some discussion.

1. Negative strides:

>>> x = zeros(5)
>>> x.strides = (-4,)
>>> x
array([         0,         25,          0, -136009696, -136009536])

Looks like a bug. PyArray_CheckStrides only checks for one end of the
buffer. It is easy to fix that by disallowing negative strides, but I think
that would be wrong. In my view, the right solution is to pass offset to
PyArray_CheckStrides and check for both ends of the buffer. The latter will
change the C-API.

2. Zero strides:

>>> x = arange(5)
>>> x.strides = 0
>>> x
array([0, 0, 0, 0, 0])
>>> x += 1
>>> x
array([5, 5, 5, 5, 5])

These are somewhat puzzling properties unless you know the internals. I
believe ndarrays with 0s in strides are quite useful and will follow up
with the description of the properties I would expect from them.

3. "Fractional" strides:

I call "fractional" those strides that are not a multiple of "itemsize".

>>> x = arange(5)
>>> x.strides = 3
>>> x
array([       0,      256,   131072, 50331648,        3])

I think these should be disallowed. It is just too easy to forget that
strides are given in bytes, not in elements.
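To make the bytes-versus-elements trap concrete, a short session; the
output assumes 4-byte integer elements:

>>> from numpy import arange
>>> a = arange(12).reshape(3, 4)
>>> a.strides                  # bytes to step one row, then one element
(16, 4)
>>> a.T.strides                # transposing simply swaps the byte strides
(4, 16)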
Ideally, rather than checking for strides[i] % itemsize, I would just make
strides[i] be expressed in number of elements, not in bytes. This can be
done without changing the way strides are stored internally - just multiply
by itemsize in set_strides and divide in get_strides. If the strides
attribute was not introduced before numpy, this change should not cause any
compatibility problems. If it has some history of use, it may be possible
to deprecate "strides" (with a deprecation warning) and introduce a
different attribute, say "steps", that will be expressed in number of
elements.

From oliphant.travis at ieee.org  Thu Feb 2 16:45:01 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Thu Feb 2 16:45:01 2006
Subject: [Numpy-discussion] Learning "strides"
In-Reply-To:
References:
Message-ID: <43E2A751.1060807@ieee.org>

Sasha wrote:

>I don't know if this came from numarray or not, but for me, as someone who
>transitions from Numeric, the "strides" attribute of an ndarray is a new
>feature. I've spent some time playing with it and there are some properties
>that I dislike. Some of these undesired properties are probably bugs and
>easy to fix, but others require some discussion.

Of course strides have always been there, they've just never been visible
from Python. Allowing the user to set the strides may not be a good idea.
It was done largely so that the code that deals with misaligned data could
be tested. However, it also allows you a lot of flexibility for interacting
with arbitrary data-buffers that might be useful, so I'm inclined to allow
it if the possible problems can be fixed. Users that set strides will have
to know what they are doing, of course. The average user wouldn't bother
with it.

>1. Negative strides:
>
>>>>x = zeros(5)
>>>>x.strides = (-4,)
>>>>x
>array([         0,         25,          0, -136009696, -136009536])
>
>Looks like a bug. PyArray_CheckStrides only checks for one end of the
>buffer.

Right. PyArray_CheckStrides needs to be better or we can't allow negative
strides.

>3. "Fractional" strides:
>I call "fractional" those strides that are not a multiple of "itemsize".

In dealing with an arbitrary data-buffer, I could see this as being useful,
so I'm not sure if disallowing it is a good idea. Again, setting strides is
not something that should be done by the average user, so I'm not as
concerned about "forgetting" the units strides are in. If a user is going
to be setting strides you have to assume they are being careful.

A separate attribute called steps that uses element-sizes instead of
byte-sizes is a possible idea.

-Travis

From ndarray at mac.com  Thu Feb 2 17:03:07 2006
From: ndarray at mac.com (Sasha)
Date: Thu Feb 2 17:03:07 2006
Subject: [Numpy-discussion] Zeros in strides
Message-ID:

As I explained in my previous post, numpy allows zeros in the "strides"
tuple, but arrays with such strides have unexpected properties. In this
post I will try to explain why arrays with zeros in strides are desirable
and what properties they should have.

A rank-1 array with strides=0 behaves almost like a scalar; in fact, scalar
arithmetic is currently implemented by setting the stride to 0 in generic
umath loops. Like a scalar, a rank-1 array with stride=0 only needs a
buffer of size 1*itemsize, but currently numpy does not allow creation of
rank-1 arrays with a buffer smaller than size*itemsize:

>>> ndarray([5], strides=[0], buffer=array([1]))
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: buffer is too small for requested array

An array with 0 stride is a better alternative to x + zeros(n) than a
scalar or rank-0 x, because an array with zero stride knows its size. (With
the current umath implementation, adding two arrays with stride=0 would
still require n operations, but this would probably not be the case if BLAS
is used instead of a generic loop.)

I propose to make a few changes to the way zeros in strides are handled.
This looks like undocumented territory, so I don't think there are any
compatibility issues.

1. Change the buffer size requirements so that dimensions with zero stride
count as size=1.

2. Use strides provided to the ndarray even when buffer is not provided.
Currently they are silently ignored:

>>> ndarray([5], strides=[0]).strides
(4,)

3. Fix augmented assignment operators. Currently:

>>> x = zeros(5)
>>> x.strides = 0
>>> x += 1
>>> x
array([5, 5, 5, 5, 5])
>>> x += arange(5)
>>> x
array([15, 15, 15, 15, 15])

Desired:

>>> x = zeros(5)
>>> x.strides = 0
>>> x += 1
>>> x
array([1, 1, 1, 1, 1])
>>> x += arange(5)
>>> x
array([1, 2, 3, 4, 5])

This will probably require proper handling of the stride=0 case in the
output arguments of ufuncs in general, so this may be harder to get right
than the first two proposals.

4. Introduce xzeros and xones functions that will create stride=0 arrays
as a super-fast alternative to zeros and ones.

From oliphant.travis at ieee.org  Thu Feb 2 17:39:02 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Thu Feb 2 17:39:02 2006
Subject: [Numpy-discussion] Zeros in strides
In-Reply-To:
References:
Message-ID: <43E2B420.1060801@ieee.org>

Sasha wrote:

>A rank-1 array with strides=0 behaves almost like a scalar; in fact, scalar
>arithmetic is currently implemented by setting the stride to 0 in generic
>umath loops. Like a scalar, a rank-1 array with stride=0 only needs a
>buffer of size 1*itemsize, but currently numpy does not allow creation of
>rank-1 arrays with a buffer smaller than size*itemsize:

As you noted, broadcasting is actually done by setting strides equal to 0
in the affected dimensions. The changes you describe, however, require
serious thought with C-level explanations, because you will be changing
some fundamental assumptions that are made throughout the code.

For example, currently there is no way you can construct new memory for an
array and have different strides assigned (that's why strides are ignored
if no buffer is given). You would have to change the behavior of the
C-level function PyArray_NewFromDescr. You need to propose how exactly you
would change that.

Checking for strides that won't cause later segfaults can be tricky,
especially if you start allowing buffer-sizes to be different from array
dimensions. How do you propose to ensure that you won't walk outside of
allocated memory when somebody changes the strides later?

I'm concerned that your proposal has too many potential pitfalls. At least
you haven't addressed them sufficiently. My current inclination is to
simply disallow setting the strides attribute now that the misaligned
segments of code have been tested.

-Travis

From ndarray at mac.com  Thu Feb 2 18:00:01 2006
From: ndarray at mac.com (Sasha)
Date: Thu Feb 2 18:00:01 2006
Subject: [Numpy-discussion] Learning "strides"
In-Reply-To: <43E2A751.1060807@ieee.org>
References: <43E2A751.1060807@ieee.org>
Message-ID:

On 2/2/06, Travis Oliphant wrote:
> Sasha wrote:
>
> Of course strides have always been there, they've just never been
> visible from Python.
I know that strides were always part of the C-API, but I don't know if
they were exposed to Python in numarray. If they were, there is probably
some history of use. Can someone confirm or deny that?

> Allowing the user to set the strides may not be a good idea. It was
> done largely so that the code that deals with misaligned data could be
> tested.

Presently the settable strides attribute does not feel like an "experts
only" feature. (You've documented it in your book!)

> However, it also allows you a lot of flexibility for
> interacting with arbitrary data-buffers that might be useful, so I'm
> inclined to allow it if the possible problems can be fixed.

This is a great feature and I can see it being used to explain ndarrays to
novices. I don't think it should be regarded as "for experts only."

>>Looks like a bug. PyArray_CheckStrides only checks for one end of the
>>buffer.
>
> Right. PyArray_CheckStrides needs to be better or we can't allow
> negative strides.

Please let me know if you plan to change PyArray_CheckStrides so that we
don't duplicate effort.

>>3. "Fractional" strides:
>>I call "fractional" those strides that are not a multiple of "itemsize".
>
> In dealing with an arbitrary data-buffer, I could see this as being
> useful, so I'm not sure if disallowing it is a good idea.

Can you suggest a use-case? I cannot think of anything that cannot be
handled using a record-array view of the buffer.

> Again, setting strides is not something that should be done by the
> average user so I'm not as concerned about "forgetting" the units
> strides are in. If a user is going to be setting strides you have to
> assume they are being careful.

The problem is that many people (including myself) think that they know
what strides are when they come to numpy, because they used strides in
other libraries (e.g. BLAS). Most people expect element-based strides. A
footnote in your book, "Our definition of stride here is an element-based
stride, while the strides attribute returns a byte-based stride," also
suggests that element-based strides are more natural.

> A separate attribute called steps that uses element-sizes instead of
> byte-sizes is a possible idea.

Assuming the strides attribute is not used except for testing, would you
object to renaming the current byte-based strides to "byte_strides" and
implementing element-based "strides"? I would even suggest "_byte_strides"
as a clearly "don't use it unless you know what you are doing" name.

From ndarray at mac.com  Thu Feb 2 18:17:05 2006
From: ndarray at mac.com (Sasha)
Date: Thu Feb 2 18:17:05 2006
Subject: [Numpy-discussion] Zeros in strides
In-Reply-To: <43E2B420.1060801@ieee.org>
References: <43E2B420.1060801@ieee.org>
Message-ID:

On 2/2/06, Travis Oliphant wrote:
> The changes you describe, however, require serious thought with C-level
> explanations because you will be changing some fundamental assumptions
> that are made throughout the code.

I agree, but I would like to discuss this at the conceptual level first
and maybe hear from people not intimately familiar with the C code about
what they would expect from a zero stride.

> For example, currently there is no way you can construct new memory for
> an array and have different strides assigned (that's why strides are
> ignored if no buffer is given). You would have to change the behavior
> of the C-level function PyArray_NewFromDescr. You need to propose how
> exactly you would change that.

Sure. I've started working on a "proof of concept" patch and will post it
soon.
> Checking for strides that won't cause later segfaults can be tricky
> especially if you start allowing buffer-sizes to be different from array
> dimensions. How do you propose to ensure that you won't walk outside
> of allocated memory when somebody changes the strides later?

I think PyArray_CheckStrides would catch that, but I will have to test
that once I have some code ready.

> I'm concerned that your proposal has too many potential pitfalls. At
> least you haven't addressed them sufficiently. My current inclination
> is to simply disallow setting the strides attribute now that the
> misaligned segments of code have been tested.

That would be an unfortunate result of my post :-( I would suggest just to
disallow zero strides in PyArray_CheckStrides until I can convince you
that they are not that dangerous.

From oliphant.travis at ieee.org  Thu Feb 2 18:51:12 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Thu Feb 2 18:51:12 2006
Subject: [Numpy-discussion] Zeros in strides
In-Reply-To:
References: <43E2B420.1060801@ieee.org>
Message-ID: <43E2C50D.3090103@ieee.org>

Sasha wrote:

>Sure. I've started working on a "proof of concept" patch and will post it
>soon.

Great.

>>I'm concerned that your proposal has too many potential pitfalls. At
>>least you haven't addressed them sufficiently. My current inclination
>>is to simply disallow setting the strides attribute now that the
>>misaligned segments of code have been tested.
>
>That would be an unfortunate result of my post :-( I would suggest
>just to disallow zero strides in PyArray_CheckStrides until I can
>convince you that they are not that dangerous.

Inclinations to... and actual plans to... are quite different things :-)
So, I'm waiting and seeing. You may be on to something. Let's see what
others think and what you really have in mind.

-Travis

From oliphant.travis at ieee.org  Thu Feb 2 19:03:08 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Thu Feb 2 19:03:08 2006
Subject: [Numpy-discussion] Learning "strides"
In-Reply-To:
References: <43E2A751.1060807@ieee.org>
Message-ID: <43E2C7BB.8010808@ieee.org>

Sasha wrote:

>On 2/2/06, Travis Oliphant wrote:
>
>Please let me know if you plan to change PyArray_CheckStrides so that
>we don't duplicate effort.

I won't do anything with it in the near future.

>Can you suggest a use-case? I cannot think of anything that cannot be
>handled using a record-array view of the buffer.

Here's the issue. With records it is quite easy to generate strides that
are not integer multiples of the data. For example, a record
[('field1', 'f8'), ('field2', 'i2')] data-type would have floating point
data separated by 10 bytes. When you get a view of field1 (by getting that
attribute) you would get such "misaligned" data. Look at the following:

temp = array([(1.8,2),(1.7,3)], dtype='f8,i2')
temp['f1'].strides
(10,)

How would you represent that in the element-based strides report? So,
fractional strides are actually fundamental to the ability to have record
arrays.
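Spelling the byte arithmetic out a little further, an illustrative session;
the values assume the 10-byte record layout described above:

>>> from numpy import array
>>> temp = array([(1.8, 2), (1.7, 3)], dtype='f8,i2')
>>> temp.itemsize              # 8-byte float + 2-byte int per record
10
>>> temp['f0'].strides         # the float field steps 10 bytes at a time...
(10,)
>>> temp['f0'].itemsize        # ...over 8-byte elements: not an integer multiple
8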
> > It's easier to explain striding when you have contiguous chunks of memory of the same data-type, but record-arrays change that and require byte-based striding. >Assuming strides attribute is not used except for testing, would you >object to renaming current byte-based strides to "byte_strides" and >implementing element-based "strides"? > > I wouldn't have a problem with that, necessarily (though there is already an __array_strides__ attribute that is byte-based for the array interface --- except it returns None for C-style contiguous so we really don't need another attribute). The remaining issue is how will fractional strides be represented? -Travis From faltet at carabos.com Fri Feb 3 08:43:43 2006 From: faltet at carabos.com (Francesc Altet) Date: Fri Feb 3 08:43:43 2006 Subject: [Numpy-discussion] Beta support for Numpy in PyTables In-Reply-To: <43E3639C.7090708@web.de> References: <200602021919.31191.faltet@carabos.com> <43E3639C.7090708@web.de> Message-ID: <200602031742.35235.faltet@carabos.com> A Divendres 03 Febrer 2006 15:07, N. Volbers va escriure: > I tried to install the beta and discovered that it is not possible to > build w/o numarray. So is numpy just optional and numarray a requirement > or will it be possible to build pytables only with numpy support ? No, numarray is still a *requeriment* for compiling PyTables; NumPy and Numeric are *not needed* at all for compilation. However, if they are present (I mean, at run-time, not at compile-time), they can be used both to provide input data to be written to disk and to get output data read from disk. You can even have different objects with different flavors (currently "numarray", "numpy", "numeric" or "python") in the same PyTables file, so that you can retrieve different objects (numarray, Numpy, Numeric or pure Python) in the same session depending on its flavor (but of course, this is not for the faint-hearted ;-). It is the magic of array interface: http://numeric.scipy.org/array_interface.html that allows doing this in a very efficient manner. Cheers, -- >0,0< Francesc Altet ? ? http://www.carabos.com/ V V C?rabos Coop. V. ??Enjoy Data "-" From ndarray at mac.com Fri Feb 3 10:10:02 2006 From: ndarray at mac.com (Sasha) Date: Fri Feb 3 10:10:02 2006 Subject: [Numpy-discussion] Learning "strides" In-Reply-To: <6C7F52FD-C2A6-45D1-AB68-AE5D04C61BE5@local> References: <43E2A751.1060807@ieee.org> <43E2C7BB.8010808@ieee.org> <6C7F52FD-C2A6-45D1-AB68-AE5D04C61BE5@local> Message-ID: On Feb 2, 2006, at 10:02 PM, Travis Oliphant wrote: >> >> Please let me know if you plan to change PyArray_CheckStrides so that >> we don't duplicate effort. >> >> > I won't do anything with it in the near future. > Attached patch deals with negative strides and prohibits zero strides. I think we can agree that this is the right behavior while zero-stride semantics are being discussed. Since I am touching C- API, I would like you to take a look before I commit. Also I am not sure "self->data - new->data" is always the right was to compute offset in array_strides_set . 
--
sasha

-------------- next part --------------
Index: numpy/core/include/numpy/arrayobject.h
===================================================================
--- numpy/core/include/numpy/arrayobject.h      (revision 2053)
+++ numpy/core/include/numpy/arrayobject.h      (working copy)
@@ -74,7 +74,7 @@
 #define PY_SUCCEED 1
 
 /* Helpful to distinguish what is installed */
-#define NDARRAY_VERSION 0x00090403
+#define NDARRAY_VERSION 0x00090404
 
 /* Some platforms don't define bool, long long, or long double.
    Handle that here.
Index: numpy/core/src/arrayobject.c
===================================================================
--- numpy/core/src/arrayobject.c        (revision 2053)
+++ numpy/core/src/arrayobject.c        (working copy)
@@ -3517,7 +3517,7 @@
  */
 /*OBJECT_API*/
 static Bool
-PyArray_CheckStrides(int elsize, int nd, intp numbytes,
+PyArray_CheckStrides(int elsize, int nd, intp numbytes, intp offset,
                      intp *dims, intp *newstrides)
 {
     int i;
@@ -3526,7 +3526,17 @@
         numbytes = PyArray_MultiplyList(dims, nd) * elsize;
 
     for (i=0; i<nd; i++) {
-        if (newstrides[i]*(dims[i]-1)+elsize > numbytes) {
+        intp stride = newstrides[i];
+        if (stride > 0) {
+            if (offset + stride*(dims[i]-1)+elsize > numbytes) {
+                return FALSE;
+            }
+        }
+        else if (stride < 0) {
+            if (offset + stride*dims[i] < 0) {
+                return FALSE;
+            }
+        } else {
             return FALSE;
         }
     }
@@ -4064,10 +4074,8 @@
         }
     }
     else { /* buffer given -- use it */
-        buffer.len -= offset;
-        buffer.ptr += offset;
         if (dims.len == 1 && dims.ptr[0] == -1) {
-            dims.ptr[0] = buffer.len / itemsize;
+            dims.ptr[offset] = buffer.len / itemsize;
         }
         else if (buffer.len < itemsize* \
                  PyArray_MultiplyList(dims.ptr, dims.len)) {
@@ -4084,7 +4092,7 @@
             goto fail;
         }
         if (!PyArray_CheckStrides(itemsize, strides.len,
-                                  buffer.len,
+                                  buffer.len, offset,
                                   dims.ptr, strides.ptr)) {
             PyErr_SetString(PyExc_ValueError,
                             "strides is incompatible "\
@@ -4104,7 +4112,7 @@
         PyArray_NewFromDescr(subtype, descr,
                              dims.len, dims.ptr,
                              strides.ptr,
-                             (char *)buffer.ptr,
+                             offset + (char *)buffer.ptr,
                              buffer.flags, NULL);
         if (ret == NULL) {descr=NULL; goto fail;}
         PyArray_UpdateFlags(ret, UPDATE_ALL_FLAGS);
@@ -4222,7 +4230,8 @@
     numbytes = PyArray_MultiplyList(new->dimensions,
                                     new->nd)*new->descr->elsize;
 
-    if (!PyArray_CheckStrides(self->descr->elsize, self->nd, numbytes,
+    if (!PyArray_CheckStrides(self->descr->elsize, self->nd, numbytes,
+                              self->data - new->data,
                               self->dimensions, newstrides.ptr)) {
         PyErr_SetString(PyExc_ValueError, "strides is not "\
                         "compatible with available memory");
Index: numpy/core/tests/test_multiarray.py
===================================================================
--- numpy/core/tests/test_multiarray.py (revision 2053)
+++ numpy/core/tests/test_multiarray.py (working copy)
@@ -62,6 +62,34 @@
         assert_equal(self.one.dtype.str[1], 'i')
         assert_equal(self.three.dtype.str[1], 'f')
 
+    def check_stridesattr(self):
+        x = self.one
+        def make_array(size, offset, strides):
+            return ndarray([size], buffer=x,
+                           offset=offset*x.itemsize,
+                           strides=strides*x.itemsize)
+        assert_equal(make_array(4, 4, -1), array([4, 3, 2, 1]))
+        self.failUnlessRaises(ValueError, make_array, 4, 4, -2)
+        self.failUnlessRaises(ValueError, make_array, 4, 3, -1)
+        self.failUnlessRaises(ValueError, make_array, 8, 3, 1)
+        self.failUnlessRaises(ValueError, make_array, 8, 3, 0)
+
+    def check_set_stridesattr(self):
+        x = self.one
+        def make_array(size, offset, strides):
+            try:
+                r = ndarray([size], buffer=x, offset=offset*x.itemsize)
+            except:
+                pass
+            r.strides = strides=strides*x.itemsize
+            return r
+        assert_equal(make_array(4, 4, -1), array([4, 3, 2, 1]))
self.failUnlessRaises(ValueError, make_array, 4, 4, -2) + self.failUnlessRaises(ValueError, make_array, 4, 3, -1) + self.failUnlessRaises(ValueError, make_array, 8, 3, 1) + self.failUnlessRaises(ValueError, make_array, 8, 3, 0) + + class test_dtypedescr(ScipyTestCase): def check_construction(self): d1 = dtype('i4')

From matthew.brett at gmail.com Fri Feb 3 10:36:16 2006
From: matthew.brett at gmail.com (Matthew Brett)
Date: Fri Feb 3 10:36:16 2006
Subject: [Numpy-discussion] Huge performance hit for NaNs with Intel P3, P4
Message-ID: <1e2af89e0602031009s1c36f178ne2941ca0678c8f9f@mail.gmail.com>

Hi,

This is just to flag up a problem I ran into for matlab: Pentium 3s and 4s have very slow standard math performance with NaN values - for example, adding to a NaN value on my machine is about 22 times slower than adding to a non-NaN value. This can become a very big problem with matrix multiplication if there are a significant number of NaNs. I explained the problem here, for matlab and the software I have been working with:

http://www.mrc-cbu.cam.ac.uk/Imaging/Common/spm_intel_tune.shtml

To illustrate, I've attached a timing script, running on current svn numpy linked with a standard P4-optimized ATLAS library. It (dot) multiplies a 200x200 array of ones by a) another 200x200 array of ones and b) a 200x200 array of NaNs:

ones * ones: 0.017460
ones * NaNs: 2.323742
proportion: 133.090452

Happily, for the Pentium 4, you can solve the problem by forcing the chip to do floating point math with the SSE instructions, which do not have this NaN penalty. So, the solution was simply to recompile the ATLAS libraries with extra gcc flags forcing the use of SSE math (see the page above) - or use the Intel Math Kernel libraries, which appear to have already used this trick. Here's output from numpy linked to the recompiled ATLAS libraries:

ones * ones: 0.026638
ones * NaNs: 0.023987
proportion: 0.900473

I wonder if it would be worth considering distributing the recompiled libraries by default in any binary releases? Or including a test like this one in the benchmarks to warn users about this problem?

Best,

Matthew

-------------- next part --------------
A non-text attachment was scrubbed...
Name: nan_timer.py Type: text/x-python Size: 360 bytes Desc: not available URL:

From mithrandir42 at web.de Fri Feb 3 13:06:08 2006
From: mithrandir42 at web.de (N. Volbers)
Date: Fri Feb 3 13:06:08 2006
Subject: [Numpy-discussion] retrieving type objects for void array-scalar objects
Message-ID: <43E3C596.2010904@web.de>

Hello everyone!

I think I have finally understood the 'void array-scalar object', but now I need some help with the following. Assume I have an array, e.g.

>>> dtype = numpy.dtype({'names': ['name', 'weight'],'formats': ['U30', 'f4']})
>>> a = numpy.array([(u'Bill', 71.2), (u'Fred', 94.3)], dtype=dtype)

and this array is displayed in a graphical list. When the user modifies a value in the GUI, the value, which is a string, needs to be converted to the appropriate type, which in this example might either be a unicode string for the 'name' _or_ a float for the 'weight'. If the row already exists, I can get the type object easily:

>>> my_type = type(ds.array['weight'][0])

and using this type object, I can convert the string:

>>> value = my_type(user_value)

Is there some way to retrieve the type object directly from the array (not using any existing row) using only the name of the item?
I have checked the dtype attribute, but I could only get the character representation for the item types (e.g. 'f4').

Any help would be appreciated,

Niklas Volbers.

From ndarray at mac.com Fri Feb 3 13:43:01 2006
From: ndarray at mac.com (Sasha)
Date: Fri Feb 3 13:43:01 2006
Subject: [Numpy-discussion] Zeros in strides
In-Reply-To: <43E2C50D.3090103@ieee.org>
References: <43E2B420.1060801@ieee.org> <43E2C50D.3090103@ieee.org>
Message-ID:

On 2/2/06, Travis Oliphant wrote:
> Sasha wrote:
> >Sure. I've started working on a "proof of concept" patch and will post it soon.
> >
> Great.

Attached patch allows numpy to create memory-saving zero-stride arrays. Here is a sample session:

>>> from numpy import *
>>> x = ndarray([5], strides=0)
>>> x
array([12998768, 12998768, 12998768, 12998768, 12998768])
>>> x[0] = 0
>>> x
array([0, 0, 0, 0, 0])
>>> x.strides = 4
Traceback (most recent call last):
File "<stdin>", line 1, in ?
ValueError: strides is not compatible with available memory
>>> x.strides
(0,)
>>> x.data
Traceback (most recent call last):
File "<stdin>", line 1, in ?
AttributeError: cannot get single-segment buffer for discontiguous array
>>> exp(x)
array([ 1., 1., 1., 1., 1.])

# Only a single-element buffer is required for a zero-stride array:
>>> y = ones(1)
>>> z = ndarray([10], strides=0, buffer=y)
>>> z
array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1])

I probably missed some places where buffer size is computed as a product of dimensions, but it should not be hard to review the code for those if we agree that having zero-stride arrays is a good idea. Note that I did not attempt to change any behaviors; the only change is that zero-stride arrays do not use more memory than they need.

-------------- next part --------------
Index: numpy/core/src/arrayobject.c =================================================================== --- numpy/core/src/arrayobject.c (revision 2055) +++ numpy/core/src/arrayobject.c (working copy) @@ -3517,8 +3517,7 @@ For axes with a positive stride this function checks for a walk beyond the right end of the buffer, for axes with a negative stride, - it checks for a walk beyond the left end of the buffer. Zero strides - are disallowed.
*/ /*OBJECT_API*/ static Bool @@ -3532,27 +3531,17 @@ for (i=0; i 0) { + if (stride >= 0) { /* The last stride does not need to be fully inside the buffer, only its first elsize bytes */ if (offset + stride*(dims[i]-1)+elsize > numbytes) { return FALSE; } } - else if (stride < 0) { + else { if (offset + stride*dims[i] < 0) { return FALSE; } - } else { - /* XXX: Zero strides may be useful, but currently - XXX: allowing them would lead to strange results, - XXX: for example : - XXX: >>> x = arange(5) - XXX: >>> x.strides = 0 - XXX: >>> x += 1 - XXX: >>> x - XXX: array([5, 5, 5, 5, 5]) */ - return FALSE; } } return TRUE; @@ -3602,6 +3591,33 @@ } return itemsize; } +/* computes the buffer size needed to accomodate dims and strides */ +static intp +_array_buffer_size(int nd, intp *dims, intp *strides, intp itemsize) +{ + intp bufsize = 0, size; + int i; + for (i = 0; i < nd; ++i) { + if (dims[i] < 0) { + PyErr_Format(PyExc_ValueError, + "negative dimension (%d) for axis %d", + dims[i], i); + return -1; + } + if (strides[i] < 0) { + PyErr_Format(PyExc_ValueError, + "negative stride (%d) for axis %d", + strides[i], i); + return -1; + } + if (dims[i] == 0) + continue; + size = (dims[i] - 1)*strides[i] + itemsize; + if (size > bufsize) + bufsize = size; + } + return bufsize; +} /*OBJECT_API Generic new array creation routine. @@ -3768,13 +3784,8 @@ flags, &(self->flags)); } else { - if (data == NULL) { - PyErr_SetString(PyExc_ValueError, - "if 'strides' is given in " \ - "array creation, data must " \ - "be given too"); - goto fail; - } + sd = _array_buffer_size(nd, dims, strides, sd); + if (sd < 0) goto fail; memcpy(self->strides, strides, sizeof(intp)*nd); } } @@ -4092,7 +4103,7 @@ if (dims.len == 1 && dims.ptr[0] == -1) { dims.ptr[offset] = buffer.len / itemsize; } - else if (buffer.len < itemsize* \ + else if (strides.ptr == NULL && buffer.len < itemsize* \ PyArray_MultiplyList(dims.ptr, dims.len)) { PyErr_SetString(PyExc_TypeError, "buffer is too small for " \ @@ -4242,9 +4253,9 @@ if (PyArray_Check(new->base)) new = (PyArrayObject *)new->base; } - numbytes = PyArray_MultiplyList(new->dimensions, - new->nd)*new->descr->elsize; - + numbytes = _array_buffer_size(new->nd, new->dimensions, new->strides, + new->descr->elsize); + if (numbytes < 0) goto fail; if (!PyArray_CheckStrides(self->descr->elsize, self->nd, numbytes, self->data - new->data, self->dimensions, newstrides.ptr)) { From oliphant at ee.byu.edu Fri Feb 3 13:54:03 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Fri Feb 3 13:54:03 2006 Subject: [Numpy-discussion] Zeros in strides In-Reply-To: References: <43E2B420.1060801@ieee.org> <43E2C50D.3090103@ieee.org> Message-ID: <43E3D0D8.2060609@ee.byu.edu> Sasha wrote: >Attached patch allows numpy create memory-saving zero-stride arrays. > > > A good first cut. I'm very concerned about the speed of PyArray_NewFromDescr. So, I don't really want to make changes that will cause it to be slower for all cases unless absolutely essential. Could you give more examples of how you will be using these zero-stride arrays? What problem are they actually solving? I would also like to get more opinions about Sasha's proposal for zero-stride arrays. 
-Travis

From alexander.belopolsky at gmail.com Fri Feb 3 14:04:08 2006
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Fri Feb 3 14:04:08 2006
Subject: [Numpy-discussion] Learning "strides"
In-Reply-To: <43E2C7BB.8010808@ieee.org>
References: <43E2A751.1060807@ieee.org> <43E2C7BB.8010808@ieee.org>
Message-ID:

On 2/2/06, Travis Oliphant wrote:
> ...
> Here's the issue. With records it is quite easy to generate strides
> that are not integer multiples of the data. For example, a record
> [('field1', 'f8'),('field2', 'i2')] data-type would have floating point
> data separated by 10 bytes. When you get a view of field1 (by getting
> that attribute) you would get such "misaligned" data.
>
> Look at the following:
>
> temp = array([(1.8,2),(1.7,3)],dtype='f8,i2')
> temp['f1'].strides
> (10,)
>
> How would you represent that in the element-based strides report?

You are right. I cannot think of anything better than just byte-based strides in this case. Maybe we could add a restriction abs(strides[i]) >= itemsize? This will probably catch some of the more common mistakes that are due to using the number of elements instead of the number of bytes.

From jswhit at fastmail.fm Fri Feb 3 14:35:09 2006
From: jswhit at fastmail.fm (Jeff Whitaker)
Date: Fri Feb 3 14:35:09 2006
Subject: [Numpy-discussion] treating numpy arrays like lists is slow
Message-ID: <43E3DA64.5080506@fastmail.fm>

Hi:

I've noticed that code like this is really slow in numpy (0.9.4):

import numpy as NP
a = NP.ones(10000,'d')
a = [2.*a1 for a1 in a]

the last line takes 0.17 seconds on my G5, while for Numeric and numarray it takes only 0.01. Anyone know the reason for this?

-Jeff

-- Jeffrey S. Whitaker Phone : (303)497-6313 Meteorologist FAX : (303)497-6449 NOAA/OAR/PSD R/PSD1 Email : Jeffrey.S.Whitaker at noaa.gov 325 Broadway Office : Skaggs Research Cntr 1D-124 Boulder, CO, USA 80303-3328 Web : http://tinyurl.com/5telg

From ndarray at mac.com Fri Feb 3 15:03:16 2006
From: ndarray at mac.com (Sasha)
Date: Fri Feb 3 15:03:16 2006
Subject: [Numpy-discussion] Zeros in strides
In-Reply-To: <43E3D0D8.2060609@ee.byu.edu>
References: <43E2B420.1060801@ieee.org> <43E2C50D.3090103@ieee.org> <43E3D0D8.2060609@ee.byu.edu>
Message-ID:

On 2/3/06, Travis Oliphant wrote:
> I'm very concerned about the speed of PyArray_NewFromDescr. So, I
> don't really want to make changes that will cause it to be slower for
> all cases unless absolutely essential.
It is easy to change the code so that it only affects the branch in PyArray_NewFromDescr that currently raises an exception -- providing both strides but no buffer. There is no need to call _array_buffer_size if data is provided.

> Could you give more examples of how you will be using these zero-stride
> arrays? What problem are they actually solving?

Currently, when I need to represent a statistic that is constant across a population, I use scalars. In many cases this works because, thanks to the broadcasting rules, a scalar behaves almost like a vector with equal elements. With the changes introduced in numpy, generic code that works on both scalars and vectors is becoming increasingly easier to write, but there are some cases where scalars cannot replace a vector with equal elements. For example, if you want to combine data for two populations and the data comes as two scalars, you need to somehow know the size of each population to add to the size of the result. A zero-stride array would solve this problem: it takes little memory, but unlike a scalar it knows its size.

Another use that I was contemplating was to represent a per-row or per-column mask in ma. It is often the case that in a rectangular matrix data may be missing only for an entire row. It is tempting to use a rank-1 mask with an element for each row to represent this case. That will work fine, but then you would not be able to use vectors to specify either a per-row or a per-column mask. With a zero-stride array, you can use strides=(1,0) or strides=(0,1) and have the same memory use as with a vector.

From ndarray at mac.com Fri Feb 3 15:11:10 2006
From: ndarray at mac.com (Sasha)
Date: Fri Feb 3 15:11:10 2006
Subject: [Numpy-discussion] treating numpy arrays like lists is slow
In-Reply-To: <43E3DA64.5080506@fastmail.fm>
References: <43E3DA64.5080506@fastmail.fm>
Message-ID:

This is so because scalar math is very slow in numpy. This will improve with the introduction of the scalarmath module.

> python -m timeit -s "from numpy import float_; x = float_(2)" "2.*x"
100000 loops, best of 3: 15.8 usec per loop
> python -m timeit -s "x = 2." "2.*x"
1000000 loops, best of 3: 0.261 usec per loop

On 2/3/06, Jeff Whitaker wrote:
> Hi:
>
> I've noticed that code like this is really slow in numpy (0.9.4):
>
> import numpy as NP
> a = NP.ones(10000,'d')
> a = [2.*a1 for a1 in a]
>
> the last line takes 0.17 seconds on my G5, while for Numeric and
> numarray it takes only 0.01. Anyone know the reason for this?
>
> -Jeff
>
> -- 
> Jeffrey S. Whitaker Phone : (303)497-6313
> Meteorologist FAX : (303)497-6449
> NOAA/OAR/PSD R/PSD1 Email : Jeffrey.S.Whitaker at noaa.gov
> 325 Broadway Office : Skaggs Research Cntr 1D-124
> Boulder, CO, USA 80303-3328 Web : http://tinyurl.com/5telg
From jswhit at fastmail.fm Fri Feb 3 19:08:02 2006
From: jswhit at fastmail.fm (Jeff Whitaker)
Date: Fri Feb 3 19:08:02 2006
Subject: [Numpy-discussion] treating numpy arrays like lists is slow
Message-ID: <43E419F1.8070608@fastmail.fm>

Travis Oliphant wrote:
> Jeff Whitaker wrote:
>> Hi:
>> I've noticed that code like this is really slow in numpy (0.9.4):
>> import numpy as NP
>> a = NP.ones(10000,'d')
>> a = [2.*a1 for a1 in a]
>> the last line takes 0.17 seconds on my G5, while for Numeric and
>> numarray it takes only 0.01. Anyone know the reason for this?
>
> We could actually change this right now, before the introduction of
> scalar math, by using the standard float table for the corresponding
> array scalars. The only reason I didn't do this initially was that I
> wanted consistency in behavior for "division-by-zero" between arrays
> and scalars.
> Using the Python float math you will get divide-by-zero errors, whereas
> you don't (unless you ask for them) with numpy arrays.
>
> Thus, current scalars are treated as 0-d arrays in the internals and
> go through the entire ufunc machinery for every operation.
> Now, the real question is why are you doing this? Using arrays in
> this way defeats their purpose :-)
>
> What is wrong with 2*a? Now, of course there will be situations that
> require this.
>
> -Travis

Travis: Of course I know this is a dumb thing to do - but sometimes it does happen that a function that expects a list actually gets a rank-1 array. The workaround in that case is to just pass it a.tolist() instead of a.

-Jeff

-- Jeffrey S. Whitaker Phone : (303)497-6313 Meteorologist FAX : (303)497-6449 NOAA/OAR/PSD R/PSD1 Email : Jeffrey.S.Whitaker at noaa.gov 325 Broadway Office : Skaggs Research Cntr 1D-124 Boulder, CO, USA 80303-3328 Web : http://tinyurl.com/5telg

From tim.hochberg at cox.net Fri Feb 3 19:29:05 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Fri Feb 3 19:29:05 2006
Subject: [Numpy-discussion] Is there a known problem compiling numpy with VC7?
Message-ID: <43E41E52.6060805@cox.net>

Hi

I recently installed the Visual Studio .NET 2003 (AKA VC7) compiler and I took a stab at compiling numpy. I've tried previously with the free, toolkit version of VC7 with little success, but I was hoping this would be a piece of cake. No joy! It's quite possible that my compiler setup is gummed up by the previous existence of the toolkit compiler. A bunch of paths were set to this and that, and there may be some residue that is messing things up. However, I successfully compiled numarray 1.5 and a couple of my own extensions, so things *seem* OK. So before I go hunting, I thought I'd ask and see if there were some known issues with compiling numpy 0.9.4 with VC7.

The symptoms I'm seeing are, first, that it can't run configure. It can't find python24.lib. An abbreviated traceback is shown at the bottom.
I kludged my way past this by replacing line 33 of numpy/core/setup.py with the two lines:

python_lib = sysconfig.EXEC_PREFIX + '/libs'
result = config_cmd.try_run(tc,include_dirs=[python_include],library_dirs=[python_lib])

That got me a little farther, but I quickly ran into trouble compiling the multiarray module:

C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\bin\cl.exe /c /nologo /Ox /MD /W3 /GX /DNDEBUG -Ibuild\src\numpy\core\src -Inumpy\core\include -Ibuild\src\numpy\core -Inumpy\core\src -Inumpy\lib\..\core\include -IC:\Python24\include -IC:\Python24\PC /Tcnumpy\core\src\multiarraymodule.c /Fobuild\temp.win32-2.4\Release\numpy\core\src\multiarraymodule.obj
multiarraymodule.c
build\src\numpy\core\src\arraytypes.inc(5305) : error C2036: 'void *' : unknown size
build\src\numpy\core\src\arraytypes.inc(5885) : error C2036: 'void *' : unknown size
build\src\numpy\core\src\arraytypes.inc(6465) : error C2036: 'void *' : unknown size
...a bunch of warnings...
c:\Documents and Settings\End-user\Desktop\numpy\numpy-0.9.4\numpy\core\src\arrayobject.c(4049) : error C2036: 'void *' : unknown size
...some more warnings...
error: Command ""C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\bin\cl.exe" /c /nologo /Ox /MD /W3 /GX /DNDEBUG -Ibuild\src\numpy\core\src -Inumpy\core\include -Ibuild\src\numpy\core -Inumpy\core\src -Inumpy\lib\..\core\include -IC:\Python24\include -IC:\Python24\PC /Tcnumpy\core\src\multiarraymodule.c /Fobuild\temp.win32-2.4\Release\numpy\core\src\multiarraymodule.obj" failed with exit status 2

Anyway, like I said, my compiler could be broken, but if there is a known issue with VC7 or this rings a bell with anyone please let me know. I certainly wouldn't mind a hint.

-tim

Traceback from configure failure:
-----------------------------------------------------------------------------------------------------------------------
C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\bin\cl.exe /c /nologo /Ox /MD /W3 /GX /DNDEBUG -IC:\Python24\include -Inumpy\core\src -Inumpy\lib\..\core\include -IC:\Python24\include -IC:\Python24\PC /Tc_configtest.c /Fo_configtest.obj
C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\bin\link.exe /nologo /INCREMENTAL:NO _configtest.obj /OUT:_configtest.exe
LINK : fatal error LNK1104: cannot open file 'python24.lib'
failure.
removing: _configtest.c _configtest.obj
Traceback (most recent call last):
File "C:\Documents and Settings\End-user\Desktop\numpy\numpy-0.9.4\setup.py", line 73, in ?
setup_package()
File "C:\Documents and Settings\End-user\Desktop\numpy\numpy-0.9.4\setup.py", line 66, in setup_package
setup( **config.todict() )
File "C:\Documents and Settings\End-user\Desktop\numpy\numpy-0.9.4\numpy\distutils\core.py", line 93, in setup
return old_setup(**new_attr)
File "C:\Python24\lib\distutils\core.py", line 149, in setup
dist.run_commands()
File "C:\Python24\lib\distutils\dist.py", line 946, in run_commands
self.run_command(cmd)
File "C:\Python24\lib\distutils\dist.py", line 966, in run_command
cmd_obj.run()
File "C:\Python24\lib\distutils\command\build.py", line 112, in run
self.run_command(cmd_name)
File "C:\Python24\lib\distutils\cmd.py", line 333, in run_command
self.distribution.run_command(command)
File "C:\Python24\lib\distutils\dist.py", line 966, in run_command
cmd_obj.run()
File "C:\Documents and Settings\End-user\Desktop\numpy\numpy-0.9.4\numpy\distutils\command\build_src.py", line 86, in run
self.build_sources()
File "C:\Documents and Settings\End-user\Desktop\numpy\numpy-0.9.4\numpy\distutils\command\build_src.py", line 99, in build_sources
self.build_extension_sources(ext)
File "C:\Documents and Settings\End-user\Desktop\numpy\numpy-0.9.4\numpy\distutils\command\build_src.py", line 143, in build_extension_sources
sources = self.generate_sources(sources, ext)
File "C:\Documents and Settings\End-user\Desktop\numpy\numpy-0.9.4\numpy\distutils\command\build_src.py", line 199, in generate_sources
source = func(extension, build_dir)
File "numpy\core\setup.py", line 35, in generate_config_h
raise "ERROR: Failed to test configuration"
ERROR: Failed to test configuration
-----------------------------------------------------------------------------------------------------------------------

From faltet at carabos.com Sat Feb 4 01:34:02 2006
From: faltet at carabos.com (Francesc Altet)
Date: Sat Feb 4 01:34:02 2006
Subject: [Numpy-discussion] retrieving type objects for void array-scalar objects
In-Reply-To: <43E3C596.2010904@web.de>
References: <43E3C596.2010904@web.de>
Message-ID: <1139045585.7529.22.camel@localhost.localdomain>

On Friday 03 February 2006 at 22:05 +0100, N. Volbers wrote:
> >>> dtype = numpy.dtype({'names': ['name', 'weight'],'formats': ['U30', 'f4']})
> >>> a = numpy.array([(u'Bill', 71.2), (u'Fred', 94.3)], dtype=dtype)
> Is there some way to retrieve the type object directly from the array (not using any existing row) using only the name of the item?

To retrieve the type directly from the array, you can use a function like this:

def get_field_type_flat(descr, fname):
    """Get the type associated with a field named `fname`.

    If the field name is not found, None is returned.
    """
    for item in descr:
        if fname == item[0]:
            return numpy.typeDict[item[1][1]]
    return None

That one is very simple and fast. However, it can't deal with nested types. The next one is more general:

def get_field_type_nested(descr, fname):
    """Get the type associated with a field named `fname`.

    This function looks recursively into possibly nested descriptions.
    If the field is not found anywhere in the hierarchy, None is
    returned. If there are two names that are equal in the hierarchy,
    the first one (from top to bottom and from left to the right) found
    is returned.
""" for item in descr: descr = item[1] if fname == item[0]: return numpy.dtype(descr).type else: if isinstance(descr, list): return get_field_type(descr, fname) return None The drawback here is that you can not select a field that is named the same way and that lives in different levels of the hierarchy. For example, selecting 'name' in a type structure like this: +-----------+ |name |x | | +-----+ | |name | +-----+-----+ is ambiguous (in the algorithm implemented above, the top level 'name' would be selected). Addressing this problem would imply to define a way to univocally specify nested fields. Anyway, I'm attaching a file with several examples on these functions. HTH, -- >0,0< Francesc Altet http://www.carabos.com/ V V C?rabos Coop. V. Enjoy Data "-" -------------- next part -------------- A non-text attachment was scrubbed... Name: prova.py Type: text/x-python Size: 1778 bytes Desc: not available URL: From lmbrmxdr at webquill.com Sat Feb 4 13:26:03 2006 From: lmbrmxdr at webquill.com (Bragg Megan) Date: Sat Feb 4 13:26:03 2006 Subject: [Numpy-discussion] Hey numpy-discussion Message-ID: <219570723.20060204222530@lists.sourceforge.net> Hi, numpy-discussion. globe vice pleas grumbled freshness? slithering returning hardly false rent serves delighted empire leading scrutinizing sleeper ripens bent works mountain purpose? birthday maltre forecasters mania wrong categorically hysterics neighbour business singing linden decide saved she hidden redbearded curlingirons? hysterically nonetoofresh father discovered burdock insistence chain emperors lucid peeked exclaims ratify sparkles clatter listlessly ladder diddled naive habit sector seats expects -- Best Regards, Bragg Megan mailto:numpy-discussion at lists.sourceforge.net -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: wwdzjpx.gif Type: image/gif Size: 9269 bytes Desc: not available URL: From faltet at carabos.com Sun Feb 5 04:33:04 2006 From: faltet at carabos.com (Francesc Altet) Date: Sun Feb 5 04:33:04 2006 Subject: [Numpy-discussion] retrieving type objects for void array-scalar objects In-Reply-To: <43E3C596.2010904@web.de> References: <43E3C596.2010904@web.de> Message-ID: <1139142764.7534.20.camel@localhost.localdomain> El dv 03 de 02 del 2006 a les 22:05 +0100, en/na N. Volbers va escriure: > Is there some way to retrieve the type object directly from the array (not using any existing row) using only the name of the item? I have checked the dtype attribute, but I could only get the character representation for the item types (e.g. 'f4'). Ops, I've just discovered a new way to get the type in a simpler way: In [17]:dtype = numpy.dtype({'names': ['name', 'weight'],'formats': ['U30', 'f4']}) In [18]:a = numpy.array([(u'Bill', 71.2), (u'Fred', 94.3)], dtype=dtype) In [20]:a.dtype.fields['name'][0].type Out[20]: In [21]:a.dtype.fields['weight'][0].type Out[21]: For nested types, something like this should work: ntype = a.dtype.fields['name'][0].fields['nested_field'][0].type By the way, you will need numpy 0.9.5 (at least) for this to work. Incidentally, Travis, what do you think about allowing: In [30]:a.dtype.fields['weight'] Out[30]:dtype('0,0< Francesc Altet http://www.carabos.com/ V V C?rabos Coop. V. Enjoy Data "-" From jswhit at fastmail.fm Sun Feb 5 07:25:04 2006 From: jswhit at fastmail.fm (Jeff Whitaker) Date: Sun Feb 5 07:25:04 2006 Subject: [Numpy-discussion] how to get data out of an object array in pyrex? 
Message-ID: <43E6188F.20703@fastmail.fm>

Hi: I've successfully used the examples at http://www.scipy.org/Wiki/Cookbook/Pyrex_and_NumPy to access the data in a 'normal' numpy array, but have had no success adapting these examples to work with object arrays. I understand that the .data attribute holds pointers to the objects which actually contain the data in an object array, but how do you use those pointers to get the data in C/pyrex?

-Jeff

-- Jeffrey S. Whitaker Phone : (303)497-6313 Meteorologist FAX : (303)497-6449 NOAA/OAR/PSD R/PSD1 Email : Jeffrey.S.Whitaker at noaa.gov 325 Broadway Office : Skaggs Research Cntr 1D-124 Boulder, CO, USA 80303-3328 Web : http://tinyurl.com/5telg

From oliphant.travis at ieee.org Sun Feb 5 20:22:10 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Sun Feb 5 20:22:10 2006
Subject: [Numpy-discussion] Re: Numpy 0.9.4 install
In-Reply-To: <200602051655.01719.j.simons@planet.nl>
References: <200602051655.01719.j.simons@planet.nl>
Message-ID: <43E6CEBC.7070405@ieee.org>

Jan Simons @planet.nl wrote:
>Dear Travis,
>
>Thank you for all the work that you put into numerical Python. I believe that
>it makes Python applicable to serious numerical work.
>
>I just attempted to install the package on my Suse 10.0 system (which does
>have the (recent) python 2.4.1).
>

I think the problem with the rpm binary is that I built the binary rpm versions against a debug version of Python. Most people must install from source on Linux, because this is the first time somebody has complained, and I'm sure others would have stumbled on this. I've been using a debug version of Python for a few months. I will probably switch back soon, which should make these issues less of a problem. Try building from source directly.

Best,

-travis

From oliphant.travis at ieee.org Sun Feb 5 20:41:01 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Sun Feb 5 20:41:01 2006
Subject: [Numpy-discussion] how to get data out of an object array in pyrex?
In-Reply-To: <43E6188F.20703@fastmail.fm>
References: <43E6188F.20703@fastmail.fm>
Message-ID: <43E6D329.4060001@ieee.org>

Jeff Whitaker wrote:
>
> Hi: I've successfully used the examples at
> http://www.scipy.org/Wiki/Cookbook/Pyrex_and_NumPy to access the data
> in a 'normal' numpy array, but have had no success adapting these
> examples to work with object arrays. I understand that the .data
> attribute holds pointers to the objects which actually contain the
> data in an object array, but how do you use those pointers to get the
> data in C/pyrex?

You have a pointer to a PyObject * object in the data. Thus, data should be recast to PyObject **.
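For example, a minimal C sketch of that cast (illustrative only; it assumes a contiguous 1-d object array held in a PyArrayObject *arr, and the variable names are mine, not numpy's):

/* A sketch: treat the data of a contiguous 1-d object array as an
   array of PyObject* pointers. */
PyObject **items = (PyObject **)arr->data;
intp i, n = arr->dimensions[0];
for (i = 0; i < n; i++) {
    PyObject *item = items[i];   /* borrowed reference */
    Py_INCREF(item);             /* own it while working with it */
    /* ... use item ... */
    Py_DECREF(item);
}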
I don't know how to do that in PyRex. But, it's easy in C. In C, you will need to be concerned about reference counts. I don't know how pyrex handles this.

From jswhit at fastmail.fm Mon Feb 6 05:01:09 2006
From: jswhit at fastmail.fm (Jeff Whitaker)
Date: Mon Feb 6 05:01:09 2006
Subject: [Numpy-discussion] how to get data out of an object array in pyrex?
In-Reply-To: <43E6D329.4060001@ieee.org>
References: <43E6188F.20703@fastmail.fm> <43E6D329.4060001@ieee.org>
Message-ID: <43E7487C.2060600@fastmail.fm>

Travis Oliphant wrote:
> Jeff Whitaker wrote:
>
>> Hi: I've successfully used the examples at
>> http://www.scipy.org/Wiki/Cookbook/Pyrex_and_NumPy to access the data
>> in a 'normal' numpy array, but have had no success adapting these
>> examples to work with object arrays. I understand that the .data
>> attribute holds pointers to the objects which actually contain the
>> data in an object array, but how do you use those pointers to get the
>> data in C/pyrex?
>
> You have a pointer to a PyObject * object in the data. Thus, data
> should be recast to PyObject **. I don't know how to do that in PyRex.

Travis: Apparently not. If I try to do this pyrex says

115:25: Pointer base type cannot be a Python object

> But, it's easy in C.
> In C, you will need to be concerned about reference counts.

OK, I was hoping to avoid hand-coding an extension in C (which I'm woefully unqualified to do).

-Jeff

-- Jeffrey S. Whitaker Phone : (303)497-6313 Meteorologist FAX : (303)497-6449 NOAA/OAR/PSD R/PSD1 Email : Jeffrey.S.Whitaker at noaa.gov 325 Broadway Office : Skaggs Research Cntr 1D-124 Boulder, CO, USA 80303-3328 Web : http://tinyurl.com/5telg

From nicolist at limare.net Mon Feb 6 07:14:14 2006
From: nicolist at limare.net (Nico)
Date: Mon Feb 6 07:14:14 2006
Subject: [Numpy-discussion] new on the list
In-Reply-To: <20060206144906.627F28821B@sc8-sf-spam1.sourceforge.net>
References: <20060206144906.627F28821B@sc8-sf-spam1.sourceforge.net>
Message-ID: <43E76798.4050602@limare.net>

Hi.

I'm a new user of the numpy-discussion and scipy-user mailing-lists. So, as I usually do, here are a few words about me and my use of numpy/scipy.

I am a doctoral student in Paris; I will work on numerical analysis, mesh generation and image processing, and I intend to do the prototyping (and maybe everything) of my work with python. I recently chose python because...

- flexible and rich language for array manipulation
- seems a good language to help me write clean, clear, bug-free and reusable code
- seems possible to make a GUI frontend without too much pain
- seems OK to glue with various other C/fortran applications without too much pain
- free, as in free beer (I had to work on Matlab previously, and I don't like to force people to pay for an expensive licence if they are interested in my work)
- free, as in free speech (... I also had serious problems, needing compatibility of Matlab with a linux kernel not officially supported)

I use numpy/scipy on Debian/Ubuntu, building from the release tarballs. And I am currently reading the available documentation...

Last thing: What about a #scipy irc channel? I feel there are too many people on irc.freenode.org/#python for efficient use.

Happy coding!

-- Nico

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: OpenPGP digital signature URL:

From faltet at carabos.com Mon Feb 6 10:25:07 2006
From: faltet at carabos.com (Francesc Altet)
Date: Mon Feb 6 10:25:07 2006
Subject: [Numpy-discussion] Extent of unicode types in numpy
Message-ID: <1139250278.7538.52.camel@localhost.localdomain>

Hi,

I'm a bit surprised by the fact that unicode types are the only ones breaking the rule, in that they must be specified with a different number of bytes than they really take. For example:

In [120]:numpy.dtype([('x','c16')])
Out[120]:dtype([('x', '<c16')])

In [121]:numpy.dtype([('x','U16')])
Out[121]:dtype([('x', '<U64')])

that is, a 'U16' unicode type (16 characters) ends up taking 64 bytes (--> 64-bit issues?). OTOH, I thought that Python would represent unicode strings internally with 16-bit chars. Oh well, I'm a bit lost on this. Can anybody bring some light?

Cheers,

-- >0,0< Francesc Altet http://www.carabos.com/ V V Cárabos Coop. V. Enjoy Data "-"

From faltet at carabos.com Mon Feb 6 10:53:14 2006
From: faltet at carabos.com (Francesc Altet)
Date: Mon Feb 6 10:53:14 2006
Subject: [Numpy-discussion] Mapping protocol to nested types in descriptor
Message-ID: <1139251961.7538.73.camel@localhost.localdomain>

Hi,

I've implemented a simple mapping protocol in the descriptor type so that the user would be able to do:

In [138]:dtype = numpy.dtype([
 .....: ('x', '<i4', (2,)),
 .....: ('Info',[
 .....: ('name', '<U120'),
 .....: ('weight', '<f4')])])

In [139]:dtype['Info']
Out[139]:dtype([('name', '<U120'), ('weight', '<f4')])

In [140]:dtype['Info']['name'].type
Out[140]:<type 'numpy.unicode_'>

instead of the current:

In [141]:dtype.fields['Info'][0].name
Out[141]:'void3872'

In [142]:dtype.fields['Info'][0].fields['name'][0].type
Out[142]:<type 'numpy.unicode_'>

which I find cumbersome to type. Find the patch for this in the attachments.

OTOH, I've completed the tests for heterogeneous objects in test_numerictypes.py. Now, there is a better check for both flat and nested fields, as well as explicit checking of type descriptors (including tests for the new mapping interface in descriptors). So far, no more problems have been detected by the new tests :-). Please note that you will need the patch above applied in order to run the tests.

Travis, if you think that it would be better not to apply the patch, the tests can be easily adapted by changing lines like:

self.assert_(h.dtype['x'][0].name[:4] == 'void')

into:

self.assert_(h.dtype.fields['x'][0].name[:4] == 'void')

Cheers,

-- >0,0< Francesc Altet http://www.carabos.com/ V V Cárabos Coop. V. Enjoy Data "-"

-------------- next part --------------
A non-text attachment was scrubbed...
Name: arrayobject.c.patch Type: text/x-patch Size: 2654 bytes Desc: not available URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test_numerictypes.py Type: text/x-python Size: 12824 bytes Desc: not available URL:

From faltet at carabos.com Mon Feb 6 11:06:02 2006
From: faltet at carabos.com (Francesc Altet)
Date: Mon Feb 6 11:06:02 2006
Subject: [Numpy-discussion] Properties of fields in numpy
Message-ID: <1139252739.7538.84.camel@localhost.localdomain>

Hi,

I don't specially like the 'void*' typecasting that the types in fields are receiving in situations like:

In [143]:dtype = numpy.dtype([
 .....: ('x', '<i4', (2,)),
 .....: ('Info',[
 .....: ('name', '<U120'),
 .....: ('weight', '<f4')])])

In [147]:dtype.fields['x'][0].name
Out[147]:'void64'

where you can see that we have lost the information about the native type of the 'x' field. Rather, I'd expect something like 'int32' there.

Cheers,

-- >0,0< Francesc Altet http://www.carabos.com/ V V Cárabos Coop. V.
Enjoy Data "-"

From oliphant at ee.byu.edu Mon Feb 6 11:17:00 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Mon Feb 6 11:17:00 2006
Subject: [Numpy-discussion] Extent of unicode types in numpy
In-Reply-To: <1139250278.7538.52.camel@localhost.localdomain>
References: <1139250278.7538.52.camel@localhost.localdomain>
Message-ID: <43E7A062.9030508@ee.byu.edu>

Francesc Altet wrote:
>Hi,
>
>I'm a bit surprised by the fact that unicode types are the only ones
>breaking the rule, in that they must be specified with a different
>number of bytes than they really take. For example:
>

Yeah, it's a bit annoying. There are special checks throughout the code for this. The problem, though, is that sizeof(Py_UNICODE) can be 4 or 2 depending on how Python was compiled. Also, Python treats unicode and string characters as having the same length (even though internally, there is a different number of bytes required). So, I'm not sure exactly what to do, short of introducing a new code for "Unicode with specific number of bytes." I think the inconsistency should be removed, though. I'm just not sure how to do it.

-Travis

From oliphant at ee.byu.edu Mon Feb 6 11:21:01 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Mon Feb 6 11:21:01 2006
Subject: [Numpy-discussion] Properties of fields in numpy
In-Reply-To: <1139252739.7538.84.camel@localhost.localdomain>
References: <1139252739.7538.84.camel@localhost.localdomain>
Message-ID: <43E7A16D.3050801@ee.byu.edu>

Francesc Altet wrote:
>Hi,
>
>I don't specially like the 'void*' typecasting that the types in fields
>are receiving in situations like:
>
>In [143]:dtype = numpy.dtype([
> .....: ('x', '<i4', (2,)),
> .....: ('Info',[
> .....: ('name', '<U120'),
> .....: ('weight', '<f4')])])
>
>In [147]:dtype.fields['x'][0].name
>Out[147]:'void64'
>
>where you can see that we have lost the information about the native type
>of the 'x' field. Rather, I'd expect something like 'int32' there.
>

Well, it's actually there. Look at

dtype.fields['x'][0].subdtype[0]
dtype.fields['x'][0].subdtype[1]

The issue is that the base data-type of the 'x' field is void-64 (that's the dtype object the array "sees").

-Travis

From strawman at astraw.com Mon Feb 6 12:33:05 2006
From: strawman at astraw.com (Andrew Straw)
Date: Mon Feb 6 12:33:05 2006
Subject: [Numpy-discussion] how to get data out of an object array in pyrex?
In-Reply-To: <43E7487C.2060600@fastmail.fm>
References: <43E6188F.20703@fastmail.fm> <43E6D329.4060001@ieee.org> <43E7487C.2060600@fastmail.fm>
Message-ID: <43E7B254.3040200@astraw.com>

Hi Jeff,

I've significantly updated the page at http://scipy.org/Wiki/Cookbook/Pyrex_and_NumPy

Pyrex should be able to do everything you need.

I hope you find the revised page more useful. Please let me know (or fix the page) if you have any issues or questions.

Cheers!
Andrew

From jswhit at fastmail.fm Mon Feb 6 13:32:03 2006
From: jswhit at fastmail.fm (Jeff Whitaker)
Date: Mon Feb 6 13:32:03 2006
Subject: [Numpy-discussion] how to get data out of an object array in pyrex?
In-Reply-To: <43E7B254.3040200@astraw.com>
References: <43E6188F.20703@fastmail.fm> <43E6D329.4060001@ieee.org> <43E7487C.2060600@fastmail.fm> <43E7B254.3040200@astraw.com>
Message-ID: <43E7C03C.4060806@fastmail.fm>

Andrew Straw wrote:
> Hi Jeff,
>
> I've significantly updated the page at
> http://scipy.org/Wiki/Cookbook/Pyrex_and_NumPy
>
> Pyrex should be able to do everything you need.
>
> I hope you find the revised page more useful. Please let me know (or
> fix the page) if you have any issues or questions.
>
> Cheers!
> Andrew

Andrew: Thanks!
That looks like exactly what I need. -Jeff

-- Jeffrey S. Whitaker Phone : (303)497-6313 Meteorologist FAX : (303)497-6449 NOAA/OAR/PSD R/PSD1 Email : Jeffrey.S.Whitaker at noaa.gov 325 Broadway Office : Skaggs Research Cntr 1D-124 Boulder, CO, USA 80303-3328 Web : http://tinyurl.com/5telg

From oliphant at ee.byu.edu Mon Feb 6 14:16:02 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Mon Feb 6 14:16:02 2006
Subject: [Numpy-discussion] Extent of unicode types in numpy
In-Reply-To: <1139250278.7538.52.camel@localhost.localdomain>
References: <1139250278.7538.52.camel@localhost.localdomain>
Message-ID: <43E7CA57.2040907@ee.byu.edu>

Francesc Altet wrote:
>Hi,
>
>I'm a bit surprised by the fact that unicode types are the only ones
>breaking the rule, in that they must be specified with a different
>number of bytes than they really take. For example:
>

Right now, the array protocol typestring is a little ambiguous on unicode characters. Ideally, the array interface would describe what kind of Unicode characters are being dealt with so that 2-byte and 4-byte unicode characters have a different description in the typestring.

Python can be compiled with Unicode as either 2-byte or 4-byte. The 'U#' descriptor is supposed to be the Python unicode data-type with # representing the number of characters.
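For instance, a sketch of what this means in practice (added for illustration; the 16 assumes a UCS4 build of Python -- on a UCS2 build the itemsize would be 8):

>>> import numpy
>>> numpy.dtype('U4').itemsize # 4 characters, sizeof(Py_UNICODE) bytes each
16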
If this data-type is handed off to a Python that is compiled with a different representation for Unicode, then we have a problem.

Right now, the typestring value gives the number of bytes in the type. Thus, "U4" gives dtype("<U8") when sizeof(Py_UNICODE)==2, but on another system it could give dtype("<U16").

I know only a little bit about unicode. The full Unicode character is a 4-byte entity, but there are standard 2-byte (UTF-16) and even 1-byte (UTF-8) encoders.

I changed the source so that "<U4" is interpreted differently from "U4" (i.e. if you specify an endianness then you are being byte-conscious anyway and so the number is interpreted as a byte count, otherwise the number is interpreted as a length). This fixes issues on the same platform, but does not fix issues where data is saved out with one Python interpreter and read in by another with a different value of sizeof(Py_UNICODE).

From: tim.hochberg at cox.net (Tim Hochberg)
Subject: [Numpy-discussion] Extent of unicode types in numpy
In-Reply-To: <43E7CA57.2040907@ee.byu.edu>
Message-ID: <43E7F40F.7030303@cox.net>

This sounds like a mess. I'm not sure what the level of Unicode expertise is on this list (I certainly don't add to it), but I'd be tempted to raise this issue on PythonDev and see if anyone there has any good suggestions.
-Travis From oliphant.travis at ieee.org Mon Feb 6 20:04:08 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Mon Feb 6 20:04:08 2006 Subject: [Numpy-discussion] Is there a known problem compiling numpy with VC7? In-Reply-To: <43E81650.2040204@cox.net> References: <43E41E52.6060805@cox.net> <43E42B26.2090304@ee.byu.edu> <43E42C2C.6000208@cox.net> <43E45A95.6040601@ieee.org> <43E81650.2040204@cox.net> Message-ID: <43E81BFA.7060600@ieee.org> Tim Hochberg wrote: > > Just a little update on this: > > It appears that all (or almost all) of the checks in generate_config_h > must be failing. I would guess from a missing library or some such. I > will investigate some more and see what I find. > That shouldn't be a big problem. It just means that NumPy will provide the missing features instead of using the system functions. More problematic is the strange errors you are getting about void * not having a size. The line numbers you show are where we have variable declarations like register intp i Is it possible that integers the size of void * cannot be placed in a register?? -Travis From oliphant.travis at ieee.org Mon Feb 6 22:14:01 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Mon Feb 6 22:14:01 2006 Subject: [Numpy-discussion] Is there a known problem compiling numpy with VC7? In-Reply-To: <43E82930.7070103@cox.net> References: <43E41E52.6060805@cox.net> <43E42B26.2090304@ee.byu.edu> <43E42C2C.6000208@cox.net> <43E45A95.6040601@ieee.org> <43E81650.2040204@cox.net> <43E81BFA.7060600@ieee.org> <43E82930.7070103@cox.net> Message-ID: <43E83A88.9080607@ieee.org> Tim Hochberg wrote: > Travis Oliphant wrote: > >> Tim Hochberg wrote: >> >>> >>> Just a little update on this: >>> >>> It appears that all (or almost all) of the checks in >>> generate_config_h must be failing. I would guess from a missing >>> library or some such. I will investigate some more and see what I find. >>> >> That shouldn't be a big problem. It just means that NumPy will >> provide the missing features instead of using the system functions. >> More problematic is the strange errors you are getting about void * >> not having a size. The line numbers you show are where we have >> variable declarations like >> >> register intp i >> >> Is it possible that integers the size of void * cannot be placed in a >> register?? > > > OK, I think I found what causes the problem. What we have is lines like: > > for(i=0; i > where op is declared (void*). There shouldn't be anything like that. These should all be char *. Where did you see these? > > Of course, unfuncmodule then failed to compile. A quick peak shows > that it's throwing a lot of syntax errors. It appears to happen > whenever there's a longdouble function defined. For example: > > longdouble sinl(longdouble x) { > return (longdouble) sin((double)x); > } On your platform longdouble should be equivalent to double, so I'm not sure why this would fail. -Travis From oliphant.travis at ieee.org Mon Feb 6 22:40:06 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Mon Feb 6 22:40:06 2006 Subject: [Numpy-discussion] Need compilations with compilers other than gcc Message-ID: <43E840BE.5060204@ieee.org> We need to test numpy on other compilers besides gcc, so that we can ferret out any gnu-isms that we may be relying on. Anybody out there with compilers they are willing to try out and/or report on? 
Thanks, -Travis From oliphant.travis at ieee.org Mon Feb 6 23:17:04 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Mon Feb 6 23:17:04 2006 Subject: [Numpy-discussion] Extent of unicode types in numpy In-Reply-To: <43E7F40F.7030303@cox.net> References: <1139250278.7538.52.camel@localhost.localdomain> <43E7CA57.2040907@ee.byu.edu> <43E7F40F.7030303@cox.net> Message-ID: <43E8495C.9020008@ieee.org> > I'm way out of my depth here, but it really sounds like there needs to > be one descriptor for each type. Just for example "U" could be 2-byte > unicode and "V" (assuming it's not taken already) could be 4-byte > unicode. Then the size for a given descriptor would be constant and > things would be much less confusing. In current SVN, numpy assumes 'w' is 2-byte unicode and 'W' is 4-byte unicode in the array interface typestring. Right now these codes require that the number of bytes be specified explicitly (to satisfy the array interface requirement). There is still only 1 Unicode data-type on the platform and it has the size of Python's Py_UNICODE type. The character 'U' continues to be useful on data-type construction to stand for a unicode string of a specific character length. It's internal dtype representation will use 'w' or 'W' depending on how Python was compiled. This may not solve all issues, but at least it's a bit more consistent and solves the problem of dtype(dtype('U8').str) not producing the same datatype. It also solves the problem of unicode written out with one compilation of Python and attempted to be written in with another (it won't let you because only one of 'w#' or 'W#' is supported on a platform. -Travis From a.h.jaffe at gmail.com Tue Feb 7 01:10:03 2006 From: a.h.jaffe at gmail.com (Andrew Jaffe) Date: Tue Feb 7 01:10:03 2006 Subject: [Numpy-discussion] Re: Need compilations with compilers other than gcc In-Reply-To: <43E840BE.5060204@ieee.org> References: <43E840BE.5060204@ieee.org> Message-ID: Also, what is the status of gcc 4.0 support (on Mac OS X at least)? It's a bit of a pain to have to switch between the two (are there any other disadvantages?). Andrew Travis Oliphant wrote: > > We need to test numpy on other compilers besides gcc, so that we can > ferret out any gnu-isms that we may be relying on. > > Anybody out there with compilers they are willing to try out and/or > report on? > > Thanks, > > -Travis > > > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through log > files > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642 From faltet at carabos.com Tue Feb 7 02:55:10 2006 From: faltet at carabos.com (Francesc Altet) Date: Tue Feb 7 02:55:10 2006 Subject: [Numpy-discussion] Properties of fields in numpy In-Reply-To: <43E7E4FD.9050303@ee.byu.edu> References: <1139252739.7538.84.camel@localhost.localdomain> <43E7E4FD.9050303@ee.byu.edu> Message-ID: <200602071154.39285.faltet@carabos.com> A Dimarts 07 Febrer 2006 01:08, Travis Oliphant va escriure: > In SVN of numpy, the dtype objects now have a .base attribute and a > .shape attribute. > > The .shape attribute returns (1,) or the shape of the sub-array. Uh, it wouldn't be better to put .shape = 1 in case of a scalar field and (...) for a non-scalar field? Remember that this is the current convention for the numpy protocol. 
> The .base attribute returns the data-type object of the base-type, or a > new reference to self, if the object has no base.type. > > Thus, in current SVN > > dtype['x'].base.name would always give you what you want. Great. I like it. Thanks! -- >0,0< Francesc Altet ? ? http://www.carabos.com/ V V C?rabos Coop. V. ??Enjoy Data "-" From arnd.baecker at web.de Tue Feb 7 03:13:01 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Tue Feb 7 03:13:01 2006 Subject: [Numpy-discussion] how to get data out of an object array in pyrex? In-Reply-To: <43E7C03C.4060806@fastmail.fm> References: <43E6188F.20703@fastmail.fm> <43E6D329.4060001@ieee.org> <43E7487C.2060600@fastmail.fm> <43E7B254.3040200@astraw.com> <43E7C03C.4060806@fastmail.fm> Message-ID: On Mon, 6 Feb 2006, Jeff Whitaker wrote: > Andrew Straw wrote: > > > Hi Jeff, > > > > I've significantly updated the page at > > http://scipy.org/Wiki/Cookbook/Pyrex_and_NumPy > > > > Pyrex should be able to do everything you need. > > > > I hope you find the revised page more useful. Please let me know (or > > fix the page) if you have any issues or questions. > > > > Cheers! > > Andrew > > Andrew: Thanks! That looks like exactly what I need. -Jeff Very nice! Would it be better the policy that any runnable .py file is an attachment (see tst.py in http://scipy.org/Wiki/WikiSandBox) to the page, so that it can be easily downloaded? Presently one has to disable line numbers, copy the text, paste into an editor and save with the right file name... Best, Arnd From pearu at scipy.org Tue Feb 7 03:55:05 2006 From: pearu at scipy.org (Pearu Peterson) Date: Tue Feb 7 03:55:05 2006 Subject: [Numpy-discussion] how to get data out of an object array in pyrex? In-Reply-To: <43E7B254.3040200@astraw.com> References: <43E6188F.20703@fastmail.fm> <43E6D329.4060001@ieee.org> <43E7487C.2060600@fastmail.fm> <43E7B254.3040200@astraw.com> Message-ID: On Mon, 6 Feb 2006, Andrew Straw wrote: > I've significantly updated the page at > http://scipy.org/Wiki/Cookbook/Pyrex_and_NumPy FYI, numpy.distutils now supports building pyrex extension modules. See numpy/distutils/tests/pyrex_ext/ for a working example. In case of Cookbook/Pyrex_and_NumPy, the corresponding setup.py file is: #!/usr/bin/env python def configuration(parent_package='',top_path=None): from numpy.distutils.misc_util import Configuration config = Configuration('mypackage',parent_package,top_path) config.add_extension('pyrex_and_numpy', sources = ['test.pyx'], depends = ['c_python.pxd','c_numpy.pxd']) return config if __name__ == "__main__": from numpy.distutils.core import setup setup(**configuration(top_path='').todict()) And to build the package inplace, use python setup.py build_src build_ext --inplace Pearu From arnd.baecker at web.de Tue Feb 7 06:02:05 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Tue Feb 7 06:02:05 2006 Subject: [Numpy-discussion] Need compilations with compilers other than gcc In-Reply-To: <43E840BE.5060204@ieee.org> References: <43E840BE.5060204@ieee.org> Message-ID: Hi Travis, On Mon, 6 Feb 2006, Travis Oliphant wrote: > We need to test numpy on other compilers besides gcc, so that we can > ferret out any gnu-isms that we may be relying on. > > Anybody out there with compilers they are willing to try out and/or > report on? Alright, we might need the asbestos suite thing: Something ahead: I normally used python numpy/distutils/system_info.py lapack_opt to figure out which library numpy is going to use. 
With current svn I get the folloowing error: Traceback (most recent call last): File "numpy/distutils/system_info.py", line 111, in ? from exec_command import find_executable, exec_command, get_pythonexe File "/work/home/baecker/INSTALL_PYTHON5_icc/CompileDir/numpy/numpy/distutils/exec_command.py", line 56, in ? from numpy.distutils.misc_util import is_sequence ImportError: No module named numpy.distutils.misc_util Concerning icc compilation I used: export FC_VENDOR=Intel export F77=ifort export CC=icc export CXX=icc python setup.py config --compiler=intel install --prefix=$DESTnumpyDIR | tee ../build_log_numpy_${nr}.txt The build log shows 1393 warnings 3362 remarks Should I post them off-list or on scipy-dev? Trying to test the resulting numpy gives: In [1]: import numpy import core -> failed: /home/baecker/python2/scipy_icc5_lintst_n_N0/lib/python2.4/site-packages/numpy/core/multiarray.so: undefined symbol: ?1__serial_memmove import random -> failed: 'module' object has no attribute 'dtype' import lib -> failed: /home/baecker/python2/scipy_icc5_lintst_n_N0/lib/python2.4/site-packages/numpy/core/multiarray.so: undefined symbol: ?1__serial_memmove import linalg -> failed: /home/baecker/python2/scipy_icc5_lintst_n_N0/lib/python2.4/site-packages/numpy/linalg/lapack_lite.so: undefined symbol: ?1__serial_memmove import dft -> failed: /home/baecker/python2/scipy_icc5_lintst_n_N0/lib/python2.4/site-packages/numpy/core/multiarray.so: undefined symbol: ?1__serial_memmove --------------------------------------------------------------------------- exceptions.ImportError Traceback (most recent call last) /work/home/baecker/ /home/baecker/python2/scipy_icc5_lintst_n_N0/lib/python2.4/site-packages/numpy/__init__.py 43 44 test = ScipyTest('numpy').test ---> 45 import add_newdocs 46 47 __doc__ += """ /home/baecker/python2/scipy_icc5_lintst_n_N0/lib/python2.4/site-packages/numpy/add_newdocs.py ----> 2 from lib import add_newdoc 3 4 add_newdoc('numpy.core','dtypedescr', 5 [('fields', "Fields of the data-typedescr if any."), 6 ('alignment', "Needed alignment for this data-type"), /home/baecker/python2/scipy_icc5_lintst_n_N0/lib/python2.4/site-packages/numpy/lib/__init__.py 3 from numpy.version import version as __version__ 4 ----> 5 from type_check import * 6 from index_tricks import * 7 from function_base import * /home/baecker/python2/scipy_icc5_lintst_n_N0/lib/python2.4/site-packages/numpy/lib/type_check.py 6 'common_type'] 7 ----> 8 import numpy.core.numeric as _nx 9 from numpy.core.numeric import ndarray, asarray, array, isinf, isnan, \ 10 isfinite, signbit, ufunc, ScalarType, obj2sctype /home/baecker/python2/scipy_icc5_lintst_n_N0/lib/python2.4/site-packages/numpy/core/__init__.py 3 from numpy.version import version as __version__ 4 ----> 5 import multiarray 6 import umath 7 import numerictypes as nt ImportError: /home/baecker/python2/scipy_icc5_lintst_n_N0/lib/python2.4/site-packages/numpy/core/multiarray.so: undefined symbol: ?1__serial_memmove I already reported this a month ago with a bit more information on a possible solution http://aspn.activestate.com/ASPN/Mail/Message/scipy-dev/2983903 Best, Arnd From faltet at carabos.com Tue Feb 7 06:44:01 2006 From: faltet at carabos.com (Francesc Altet) Date: Tue Feb 7 06:44:01 2006 Subject: [Numpy-discussion] Extent of unicode types in numpy In-Reply-To: <43E8495C.9020008@ieee.org> References: <1139250278.7538.52.camel@localhost.localdomain> <43E7F40F.7030303@cox.net> <43E8495C.9020008@ieee.org> Message-ID: <200602071542.35593.faltet@carabos.com> A 
Dimarts 07 Febrer 2006 08:16, Travis Oliphant va escriure:
> In current SVN, numpy assumes 'w' is 2-byte unicode and 'W' is 4-byte
> unicode in the array interface typestring. Right now these codes
> require that the number of bytes be specified explicitly (to satisfy the
> array interface requirement). There is still only 1 Unicode data-type
> on the platform and it has the size of Python's Py_UNICODE type. The
> character 'U' continues to be useful on data-type construction to stand
> for a unicode string of a specific character length. Its internal dtype
> representation will use 'w' or 'W' depending on how Python was compiled.
>
> This may not solve all issues, but at least it's a bit more consistent
> and solves the problem of
>
> dtype(dtype('U8').str) not producing the same datatype.
>
> It also solves the problem of unicode written out with one build
> of Python and then read back in with another (it won't let you,
> because only one of 'w#' or 'W#' is supported on a given platform).

While I agree that this solution is more consistent, I must say that I'm not very comfortable with having to deal with two different widths for unicode characters. What bothers me is the lack of portability of unicode strings when saving them to disk from UCS4-enabled python interpreters and retrieving them with UCS2-enabled ones, in the context of PyTables (or any other database). Let's suppose that a user has a numpy object of type unicode that has been created in a python with UCS4. This would look like:

# UCS4-aware interpreter here
>>> numpy.array(u"\U000110fc", "U1")
array(u'\U000110fc', dtype=(unicode,4))

Now, suppose that you save this in a PyTables file (for example) and you want to regenerate it on a python interpreter compiled with UCS2. As the buffer on-disk has a fixed length, we are forced to use unicode types twice as large as containers for this data. So the net effect is that we will end up in the UCS2 interpreter with an object like:

# UCS2-aware interpreter here
>>> numpy.array(u"\U000110fc", "U2")
array(u'\U000110fc', dtype=(unicode,4))

which apparently is the same as the one above, but not quite. To begin with, the former is an array that is a unicode scalar with only *one* character, while the latter has *two* characters. But worse than that, the interpretation of the original content changes drastically on the UCS2 platform. For example, if we select the first and second characters of the string on the UCS2-aware platform, we have:

>>> numpy.array(u"\U000110fc", "U2")[()][0]
u'\ud804'
>>> numpy.array(u"\U000110fc", "U2")[()][1]
u'\udcfc'

which have nothing to do with the original \U000110fc character (I'd expect to get at least the truncated values \u0001 and \u10fc). I think this is because of the conventions that are used to represent 32-bit unicode characters in UTF-16 using a technique called "surrogate pairs" (see: http://www.unicode.org/glossary/).

All in all, my opinion is that allowing the coexistence of different sizes of unicode types in numpy would be a recipe for disaster when one wants to transport unicode characters between platforms with python interpreters compiled with different unicode sizes. Consequently I'd propose to support just one size of unicode in numpy, namely the 4-byte one, and if this size doesn't match that of the underlying python platform, then refuse to deliver native unicode objects if the user is asking for them.
Something like this would work:

# UCS2-aware interpreter here
>>> h=numpy.array(u"\U000110fc", "U1")
>>> h  # This is a 'true' 32-bit unicode array in numpy
array(u'\U000110fc', dtype=(unicode,4))
>>> h[()]  # Try to get a native unicode object in python
Traceback (most recent call last):
  File "", line 1, in ?
ValueError: unicode sizes in numpy and your python interpreter don't
match. Sorry, but you should get a UCS4-enabled python interpreter if
you want to successfully complete this operation.

As a bonus, we can get rid of the 'w' and 'W' typecodes that have been introduced a bit forcibly, IMO. I don't know, however, how difficult implementing this in numpy would be. Another option could be to refuse to compile numpy with UCS2-aware interpreters, though this sounds a bit extreme; see below.

OTOH, I'm not an expert in Unicode, but after googling a bit, I've found interesting recommendations about its use in Python. The first is from Uche Ogbuji in http://www.xml.com/pub/a/2005/06/15/py-xml.html. Here is the relevant excerpt:

"""
I also want to mention another general principle to keep in mind: if possible, use a Python install compiled to use UCS4 character storage [...] UCS4 uses more space to store characters, but there are some problems for XML processing in UCS2, which the Python core team is reluctant to address because the only known fixes would be too much of a burden on performance. Luckily, most distributors have heeded this advice and ship UCS4 builds of Python.
"""

So, it seems that the Python crew is not interested in solving problems with UCS2. Now, towards the end of PEP 261 ('Support for "wide" Unicode characters') one can read this as a final conclusion:

"""
This PEP represents the least-effort solution. Over the next several years, 32-bit Unicode characters will become more common and that may either convince us that we need a more sophisticated solution or (on the other hand) convince us that simply mandating wide Unicode characters is an appropriate solution.
"""

This PEP dates from 27-Jun-2001, so the "next several years" the author is referring to is now. In fact, the interpreters on my Debian-based Linux are both compiled with UCS4. Despite this, it seems that the default for compiling python is to use UCS2, given that you still need to pass the flag "--enable-unicode=ucs4" if you want to end up with a UCS4-enabled interpreter. I wonder why they are doing this if it can positively lead to problems with XML, as Uche Ogbuji said (?).

Anyway, I don't know if the recommendation of compiling Python with UCS4 is widespread enough among the different distributions, but people can easily check this with:

>>> len(buffer(u"u"))
4

If the output of this is 4 (as in my example), then the interpreter is using UCS4; if it is 2, it is using UCS2.

Finally, I agree that asking for help about these issues on the python list would be a good idea.

Cheers,

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Carabos Coop. V.   Enjoy Data
 "-"

From pearu at scipy.org Tue Feb 7 07:00:05 2006
From: pearu at scipy.org (Pearu Peterson)
Date: Tue Feb 7 07:00:05 2006
Subject: [Numpy-discussion] Need compilations with compilers other than gcc
In-Reply-To:
References: <43E840BE.5060204@ieee.org>
Message-ID:

On Tue, 7 Feb 2006, Arnd Baecker wrote:

> Alright, we might need the asbestos suit thing:
>
> Something ahead: I normally used
> python numpy/distutils/system_info.py lapack_opt
> to figure out which library numpy is going to use.
> With current svn I get the folloowing error: > > Traceback (most recent call last): > File "numpy/distutils/system_info.py", line 111, in ? > from exec_command import find_executable, exec_command, get_pythonexe > File > "/work/home/baecker/INSTALL_PYTHON5_icc/CompileDir/numpy/numpy/distutils/exec_command.py", > line 56, in ? > from numpy.distutils.misc_util import is_sequence > ImportError: No module named numpy.distutils.misc_util This occurs probably because numpy is not installed. > Concerning icc compilation I used: > > export FC_VENDOR=Intel This has no effect anymore. Use --fcompiler=intel instead. > export F77=ifort > export CC=icc > export CXX=icc > python setup.py config --compiler=intel install --prefix=$DESTnumpyDIR > | tee ../build_log_numpy_${nr}.txt There is no intel compiler. Allowed C compilers are unix,msvc,cygwin,mingw32,bcpp,mwerks,emx. Distutils should have given an exception when using --compiler=intel. If you are using IFC compiled blas/lapack libraries then --fcompiler=intel might produce importable extension modules (because then ifc is used for linking that knows about which intel libraries need be linked to a shared library). > Trying to test the resulting numpy gives: > > In [1]: import numpy > import core -> failed: > /home/baecker/python2/scipy_icc5_lintst_n_N0/lib/python2.4/site-packages/numpy/core/multiarray.so: > undefined symbol: ?1__serial_memmove > I already reported this a month ago with a bit more information > on a possible solution > http://aspn.activestate.com/ASPN/Mail/Message/scipy-dev/2983903 When Python is compiled with a different compiler than numpy (or any extension module) is going to be installed then proper libraries must be specified manually. Which libraries and flags are needed exactly, this is described in compilers manual. So, a recommended fix would be to build Python with icc and as a result correct libraries will be used for building 3rd party extension modules. Otherwise one has to read compilers manual, sections like about gcc-compatibility and linking might be useful. See also http://www.scipy.org/Wiki/FAQ#head-8371c35ef08b877875217aaac5489fc747b4aceb Pearu From arnd.baecker at web.de Tue Feb 7 07:27:17 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Tue Feb 7 07:27:17 2006 Subject: [Numpy-discussion] Need compilations with compilers other than gcc In-Reply-To: References: <43E840BE.5060204@ieee.org> Message-ID: On Tue, 7 Feb 2006, Pearu Peterson wrote: > On Tue, 7 Feb 2006, Arnd Baecker wrote: > > > Alright, we might need the asbestos suite thing: > > > > Something ahead: I normally used > > python numpy/distutils/system_info.py lapack_opt > > to figure out which library numpy is going to use. > > With current svn I get the folloowing error: > > > > Traceback (most recent call last): > > File "numpy/distutils/system_info.py", line 111, in ? > > from exec_command import find_executable, exec_command, get_pythonexe > > File > > "/work/home/baecker/INSTALL_PYTHON5_icc/CompileDir/numpy/numpy/distutils/exec_command.py", > > line 56, in ? > > from numpy.distutils.misc_util import is_sequence > > ImportError: No module named numpy.distutils.misc_util > > This occurs probably because numpy is not installed. Maybe I am wrong, but I thought that I could run the above command before any installation to see which libraries will be used. My installation notes on this give me the feeling that this used to work... > > Concerning icc compilation I used: > > > > export FC_VENDOR=Intel > > This has no effect anymore. 
Use --fcompiler=intel instead. OK - I have to confess that I am really confused about which options might work and which not. Is there a document which describes this? > > export F77=ifort > > export CC=icc > > export CXX=icc But these are still needed? > > python setup.py config --compiler=intel install --prefix=$DESTnumpyDIR > > | tee ../build_log_numpy_${nr}.txt > > There is no intel compiler. Allowed C compilers are > unix,msvc,cygwin,mingw32,bcpp,mwerks,emx. Distutils should have given an > exception when using --compiler=intel. > > If you are using IFC compiled blas/lapack libraries then --fcompiler=intel > might produce importable extension modules (because then ifc is used for > linking that knows about which intel libraries need be linked to a shared > library). For this test I haven't used any blas/lapack. But it is good to know. > > Trying to test the resulting numpy gives: > > > > In [1]: import numpy > > import core -> failed: > > /home/baecker/python2/scipy_icc5_lintst_n_N0/lib/python2.4/site-packages/numpy/core/multiarray.so: > > undefined symbol: ?1__serial_memmove > > > > > I already reported this a month ago with a bit more information > > on a possible solution > > http://aspn.activestate.com/ASPN/Mail/Message/scipy-dev/2983903 > > When Python is compiled with a different compiler than numpy (or any > extension module) is going to be installed then proper libraries must be > specified manually. Which libraries and flags are needed exactly, this is > described in compilers manual. > > So, a recommended fix would be to build Python with icc and as a > result correct libraries will be used for building 3rd party extension > modules. This would also mean that all dependent packages will have to be installed again, right? I am sorry but then I won't be able to help with icc at the moment as I am completely swamped with other stuff... > Otherwise one has to read compilers manual, sections like > about gcc-compatibility and linking might be useful. See also > http://www.scipy.org/Wiki/FAQ#head-8371c35ef08b877875217aaac5489fc747b4aceb I thought that supplying ``--libraries="irc"`` might cure the problem, but (quoting from http://aspn.activestate.com/ASPN/Mail/Message/scipy-dev/2983903 ) """ However, in the build log I only found -lirc for the config_tests but nowhere else. What should I do instead of the above? """ Best, Arnd From pearu at scipy.org Tue Feb 7 08:07:06 2006 From: pearu at scipy.org (Pearu Peterson) Date: Tue Feb 7 08:07:06 2006 Subject: [Numpy-discussion] Need compilations with compilers other than gcc In-Reply-To: References: <43E840BE.5060204@ieee.org> Message-ID: On Tue, 7 Feb 2006, Arnd Baecker wrote: > On Tue, 7 Feb 2006, Pearu Peterson wrote: > >> On Tue, 7 Feb 2006, Arnd Baecker wrote: >> >>> Alright, we might need the asbestos suite thing: >>> >>> Something ahead: I normally used >>> python numpy/distutils/system_info.py lapack_opt >>> to figure out which library numpy is going to use. >>> With current svn I get the folloowing error: >>> >>> Traceback (most recent call last): >>> File "numpy/distutils/system_info.py", line 111, in ? >>> from exec_command import find_executable, exec_command, get_pythonexe >>> File >>> "/work/home/baecker/INSTALL_PYTHON5_icc/CompileDir/numpy/numpy/distutils/exec_command.py", >>> line 56, in ? >>> from numpy.distutils.misc_util import is_sequence >>> ImportError: No module named numpy.distutils.misc_util >> >> This occurs probably because numpy is not installed. 
>
> Maybe I am wrong, but I thought that I could run the above
> command before any installation to see which
> libraries will be used.
> My installation notes on this give me the feeling that
> this used to work...

from numpy.distutils.misc_util import is_sequence, is_string

should be changed to

from misc_util import is_sequence, is_string

to fix this.

>>> Concerning icc compilation I used:
>>>
>>> export FC_VENDOR=Intel
>>
>> This has no effect anymore. Use --fcompiler=intel instead.
>
> OK - I have to confess that I am really confused about
> which options might work and which not.
> Is there a document which describes this?

The FC_VENDOR environment variable was used in old f2py a long time ago. When Fortran compiler support was moved to scipy_distutils, the --fcompiler option was introduced to the config, config_fc, build_ext, .. setup.py commands. One should use any of these commands to specify a Fortran compiler, and config_fc to change various Fortran compiler flags. See

python setup.py config_fc --help

for more information. For how to enhance C compiler options, see the standard Distutils documentation.

>>> export F77=ifort
>>> export CC=icc
>>> export CXX=icc
>
> But these are still needed?

Not for F77; using --fcompiler=.. should be enough. I am not sure about CC, CXX; I must try it out..

>> When Python is compiled with a different compiler than the one numpy (or any
>> extension module) is going to be built with, then the proper libraries must
>> be specified manually. Which libraries and flags are needed exactly is
>> described in the compiler's manual.
>>
>> So, a recommended fix would be to build Python with icc and as a
>> result correct libraries will be used for building 3rd party extension
>> modules.
>
> This would also mean that all dependent packages will have
> to be installed again, right?
> I am sorry but then I won't be able to help with icc at the moment
> as I am completely swamped with other stuff...
>
>> Otherwise one has to read the compiler's manual; sections
>> about gcc compatibility and linking might be useful. See also
>> http://www.scipy.org/Wiki/FAQ#head-8371c35ef08b877875217aaac5489fc747b4aceb
>
> I thought that supplying ``--libraries="irc"``
> might cure the problem, but
> (quoting from
> http://aspn.activestate.com/ASPN/Mail/Message/scipy-dev/2983903
> )
> """
> However, in the build log I only found -lirc for
> the config_tests but nowhere else.
> What should I do instead of the above?
> """

Try:

export CC=icc
python setup.py build build_ext -lirc

This will probably use gcc for linking but might fix the undefined symbol problems.

Pearu

From cjw at sympatico.ca Tue Feb 7 10:02:15 2006
From: cjw at sympatico.ca (Colin J. Williams)
Date: Tue Feb 7 10:02:15 2006
Subject: [Numpy-discussion] Is the Python types module superfluous?
In-Reply-To: <43E13502.6050207@ee.byu.edu>
References: <43DFD598.5000503@colorado.edu> <43E0309D.5050700@sympatico.ca> <43E03349.10208@ieee.org> <43E0DE8D.6020907@sympatico.ca> <20060201181434.0cbb368a.gerard.vermeulen@grenoble.cnrs.fr> <43E13502.6050207@ee.byu.edu>
Message-ID: <43E8CDCF.10303@sympatico.ca>

Travis Oliphant wrote:
> Gerard Vermeulen wrote:
>
>> On Wed, 01 Feb 2006 11:15:09 -0500
>> "Colin J.
Williams" wrote: >> >> [ currently numpy uses ndarray, with synonym ArrayType, for a >> multidimensional array ] >> >> >> >>> [Dbg]>>> import types >>> [Dbg]>>> dir(types) >>> ['BooleanType', 'BufferType', 'BuiltinFunctionType', >>> 'BuiltinMethodType', 'ClassType', 'CodeType', 'ComplexType', >>> 'DictProxyType', 'DictType', 'DictionaryType', 'EllipsisType', >>> 'FileType', 'FloatType', 'FrameType', 'FunctionType', 'GeneratorType', >>> 'Instance >>> Type', 'IntType', 'LambdaType', 'ListType', 'LongType', 'MethodType', >>> 'ModuleType', 'NoneType', 'NotImplementedType', 'ObjectType', >>> 'SliceType', 'StringType', 'StringTypes', 'TracebackType', 'TupleType', >>> 'TypeType', 'UnboundMethodType', 'UnicodeType', 'XRan >>> geType', '__builtins__', '__doc__', '__file__', '__name__'] >>> [Dbg]>>> >>> >>> >> >> >> Isn't the types module becoming superfluous? >> >> >> > That's the point I was trying to make. ArrayType is to ndarray as > DictionaryType is to dict. My understanding is that the use of > types.DictionaryType is discouraged. > > -Travis > I was simply trying to suggest that the name ArrayType is more appropriate name that ndbigarray or ndarray for the multidimensional array. Since the intent is, in the long run, to integrate numpy with the Python distribution, the use of a name in the style of the existing Python types would appear to be better. Is the types module becoming superfluous? I've cross posted to c.l.p to seek information on this. Colin W. From arnd.baecker at web.de Tue Feb 7 10:13:26 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Tue Feb 7 10:13:26 2006 Subject: [Numpy-discussion] Need compilations with compilers other than gcc In-Reply-To: References: <43E840BE.5060204@ieee.org> Message-ID: On Tue, 7 Feb 2006, Pearu Peterson wrote: [... /numpy/distutils/exec_command.py ...] > from numpy.distutils.misc_util import is_sequence, is_string > > should be changed to > > from misc_util import is_sequence, is_string > > to fix this. Making the same type of change in numpy/distutils/system_info.py worked if ATLAS is not used (`export ATLAS=None`). Otherwise I get: python numpy/distutils/system_info.py lapack_opt lapack_opt_info: lapack_mkl_info: mkl_info: NOT AVAILABLE NOT AVAILABLE atlas_threads_info: Setting PTATLAS=ATLAS system_info.atlas_threads_info Setting PTATLAS=ATLAS Setting PTATLAS=ATLAS FOUND: libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas'] library_dirs = ['/home/baecker/python2/lib/atlas'] language = f77 include_dirs = ['/usr/include'] Traceback (most recent call last): File "numpy/distutils/system_info.py", line 1693, in ? show_all() File "numpy/distutils/system_info.py", line 1689, in show_all r = c.get_info() File "/work/home/baecker/INSTALL_PYTHON5_icc/CompileDir/numpy/numpy/distutils/system_info.py", line 338, in get_info self.calc_info() File "/work/home/baecker/INSTALL_PYTHON5_icc/CompileDir/numpy/numpy/distutils/system_info.py", line 1123, in calc_info atlas_version = get_atlas_version(**version_info) File "/work/home/baecker/INSTALL_PYTHON5_icc/CompileDir/numpy/numpy/distutils/system_info.py", line 1028, in get_atlas_version from core import Extension, setup File "/work/home/baecker/INSTALL_PYTHON5_icc/CompileDir/numpy/numpy/distutils/core.py", line 12, in ? from numpy.distutils.extension import Extension ImportError: No module named numpy.distutils.extension numpy/distutils/core.py is full of `from numpy.distutils.command import ...`. 
> >>> Concerning icc compilation I used: > >>> > >>> export FC_VENDOR=Intel > >> > >> This has no effect anymore. Use --fcompiler=intel instead. > > > > OK - I have to confess that I am really confused about > > which options might work and which not. > > Is there a document which describes this? > > FC_VENDOR env. variable was used in old f2py long time ago. When Fortran > compiler support was moved to scipy_distutils, --fcompiler option was > introduced to config, config_fc, build_ext,.. setup.py commands. > One should use any of these commands to specify a Fortran compiler and > config_fc to change various Fortran compiler flags. See > python setup.py config_fc --help > for more information. > > How to enhance C compiler options, see standard Distutils documentation. > > >>> export F77=ifort > >>> export CC=icc > >>> export CXX=icc > > > > But these are still needed? > > No for F77, using --fcompiler=.. should be enough. I am not sure about CC, > CXX, must try it out.. > > >> When Python is compiled with a different compiler than numpy (or any > >> extension module) is going to be installed then proper libraries must be > >> specified manually. Which libraries and flags are needed exactly, this is > >> described in compilers manual. > >> > >> So, a recommended fix would be to build Python with icc and as a > >> result correct libraries will be used for building 3rd party extension > >> modules. > > > > This would also mean that all dependent packages will have > > to be installed again, right? > > I am sorry but then I won't be able to help with icc at the moment > > as I am completely swamped with other stuff... > > > >> Otherwise one has to read compilers manual, sections like > >> about gcc-compatibility and linking might be useful. See also > >> http://www.scipy.org/Wiki/FAQ#head-8371c35ef08b877875217aaac5489fc747b4aceb > > > > I thought that supplying ``--libraries="irc"`` > > might cure the problem, but > > (quoting from > > http://aspn.activestate.com/ASPN/Mail/Message/scipy-dev/2983903 > > ) > > """ > > However, in the build log I only found -lirc for > > the config_tests but nowhere else. > > What should I do instead of the above? > > """ > > Try: > > export CC=icc > python setup.py build build_ext -lirc > > This will probably use gcc for linking Yes, it does use gcc for linking. I also had to specify the location of `libirc`, export CC=icc python setup.py build build_ext -L/opt/intel/cc_90/lib/ -lirc followed by python setup.py config --fcompiler=intel install worked. On import I get another error import core -> failed: /home/baecker/python2/scipy_icc5_lintst_n_N3/lib/python2.4/site-packages/numpy/core/umath.so: undefined symbol: __libm_sincos import random -> failed: 'module' object has no attribute 'dtype' import lib -> failed: /home/baecker/python2/scipy_icc5_lintst_n_N3/lib/python2.4/site-packages/numpy/core/umath.so: undefined symbol: __libm_sincos import linalg -> failed: /opt/intel/fc_90/lib/libunwind.so.6: undefined symbol: ?1__serial_memmove import dft -> failed: /home/baecker/python2/scipy_icc5_lintst_n_N3/lib/python2.4/site-packages/numpy/core/umath.so: undefined symbol: __libm_sincos So it seems I will have to specify more libraries, Would this be the correct syntax: python setup.py build build_ext -L/opt/intel/cc_90/lib/:SomeOtherPath -lirc:someotherlibrary ? 
From ``python setup.py build build_ext --help``

  --libraries (-l)     external C libraries to link with
  --library-dirs (-L)  directories to search for external C
                       libraries (separated by ':')

it is not clear how to specify several libraries with "-l". But that did not work (neither did -lirc -lm).

> but might fix undefined symbol problems.

Many thanks, Arnd

From oliphant.travis at ieee.org Tue Feb 7 10:16:13 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Tue Feb 7 10:16:13 2006
Subject: [Numpy-discussion] Re: [SciPy-user] Replacing Numeric With NumPy
In-Reply-To:
References:
Message-ID: <43E8E39D.5020605@ieee.org>

Rich Shepard wrote:

> Last evening I downloaded numpy-0.9.4 and scipy-0.4.4. I have an earlier
> version of Numeric in /usr/lib/python2.4/site-packages/Numeric/. Should I
> remove all references to Numeric before installing NumPy?
>
> Rich

No need to do that. Numeric and NumPy (import numpy) can live happily together. With versions of Numeric above 24.0, they can even share the same data.

-Travis

From rshepard at appl-ecosys.com Tue Feb 7 10:18:40 2006
From: rshepard at appl-ecosys.com (Rich Shepard)
Date: Tue Feb 7 10:18:40 2006
Subject: [Numpy-discussion] Re: [SciPy-user] Replacing Numeric With NumPy
In-Reply-To: <43E8E39D.5020605@ieee.org>
References: <43E8E39D.5020605@ieee.org>
Message-ID:

On Tue, 7 Feb 2006, Travis Oliphant wrote:

> No need to do that. Numeric and NumPy (import numpy) can live happily
> together. With versions of Numeric above 24.0, they can even share the
> same data.

Travis,

Are there advantages to having both on the system? I read the Numeric manual a couple of times, but haven't looked deeply at the division between the two.

Many thanks,

Rich

--
Richard B. Shepard, Ph.D.              | Author of "Quantifying Environmental
Applied Ecosystem Services, Inc. (TM)  | Impact Assessments Using Fuzzy Logic"
Voice: 503-667-4517             Fax: 503-667-8863

From efiring at hawaii.edu Tue Feb 7 10:22:25 2006
From: efiring at hawaii.edu (Eric Firing)
Date: Tue Feb 7 10:22:25 2006
Subject: [Numpy-discussion] Extent of unicode types in numpy
In-Reply-To: <200602071542.35593.faltet@carabos.com>
References: <1139250278.7538.52.camel@localhost.localdomain> <43E7F40F.7030303@cox.net> <43E8495C.9020008@ieee.org> <200602071542.35593.faltet@carabos.com>
Message-ID: <43E8E529.6030308@hawaii.edu>

Francesc, Travis,

Francesc Altet wrote:
[...]
> All in all, my opinion is that allowing the coexistence of different
> sizes of unicode types in numpy would be a recipe for disaster when
> one wants to transport unicode characters between platforms with
> python interpreters compiled with different unicode sizes.

I agree--it would be a nightmare.

> Anyway, I don't know if the recommendation of compiling Python with
> UCS4 is widespread enough among the different distributions, but
> people can easily check this with:
>
> >>> len(buffer(u"u"))
> 4
>
> if the output of this is 4 (as in my example), then the interpreter is
> using UCS4; if it is 2, it is using UCS2.

No, it is not sufficiently widespread; Mandriva 2006 python is compiled for UCS2.
Eric From tim.hochberg at cox.net Tue Feb 7 10:34:09 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Tue Feb 7 10:34:09 2006 Subject: [Numpy-discussion] Extent of unicode types in numpy In-Reply-To: <43E8E529.6030308@hawaii.edu> References: <1139250278.7538.52.camel@localhost.localdomain> <43E7F40F.7030303@cox.net> <43E8495C.9020008@ieee.org> <200602071542.35593.faltet@carabos.com> <43E8E529.6030308@hawaii.edu> Message-ID: <43E8E7C8.3030206@cox.net> Eric Firing wrote: > Francesc, Travis, > > Francesc Altet wrote: > [...] > >> All in all, my opinion is that allowing the coexistence of different >> sizes of unicode types in numpy would be a receipt for disaster when >> one wants to transport unicode characters between platforms with >> python interpreters compiled with different unicode sizes. > > > I agree--it would be a nightmare. > > >> Anyway, I don't know if the recommendation of compiling Python with >> UCS4 is spread enough or not in the different distributions, but >> people can easily check this with: >> >> >>>>> len(buffer(u"u")) >>>> >> >> 4 >> >> if the output of this is 4 (as in my example), then the interpreter is >> using UCS4; if it is 2, it is using UCS2. > > > No, it is not sufficiently widespread; Mandriva 2006 python is > compiled for UCS2. Also the default build for MS Windows is compiled for UCS2. How about always storing data as UCS4 and converting it on the fly to UCS2 when extracting a python string from the array, if on a UCS2 python build. Isn't converting to UCS2 simply a matter of lopping off the top two bytes? If so, converting it should be simply a check that the value is not out of range, followed by the aforementioned lopping. -tim From gerard.vermeulen at grenoble.cnrs.fr Tue Feb 7 10:50:04 2006 From: gerard.vermeulen at grenoble.cnrs.fr (Gerard Vermeulen) Date: Tue Feb 7 10:50:04 2006 Subject: [Numpy-discussion] Extent of unicode types in numpy In-Reply-To: <200602071542.35593.faltet@carabos.com> References: <1139250278.7538.52.camel@localhost.localdomain> <43E7F40F.7030303@cox.net> <43E8495C.9020008@ieee.org> <200602071542.35593.faltet@carabos.com> Message-ID: <20060207194858.67cd0698.gerard.vermeulen@grenoble.cnrs.fr> On Tue, 7 Feb 2006 15:42:34 +0100 Francesc Altet wrote: > A Dimarts 07 Febrer 2006 08:16, Travis Oliphant va escriure: > > In current SVN, numpy assumes 'w' is 2-byte unicode and 'W' is 4-byte > > unicode in the array interface typestring. Right now these codes > > require that the number of bytes be specified explicitly (to satisfy the > > array interface requirement). There is still only 1 Unicode data-type > > on the platform and it has the size of Python's Py_UNICODE type. The > > character 'U' continues to be useful on data-type construction to stand > > for a unicode string of a specific character length. It's internal dtype > > representation will use 'w' or 'W' depending on how Python was compiled. > > > > This may not solve all issues, but at least it's a bit more consistent > > and solves the problem of > > > > dtype(dtype('U8').str) not producing the same datatype. > > > > It also solves the problem of unicode written out with one compilation > > of Python and attempted to be written in with another (it won't let you > > because only one of 'w#' or 'W#' is supported on a platform. > > While I agree that this solution is more consistent, I must say that > I'm not very confortable with having to deal with two different widths > for unicode characters. 
What bothers me is the lack portability of > unicode strings when saving them to disk in python interpreters > UCS4-enabled and retrieving with UCS2-enabled ones in the context of > PyTables (or any other database). Let's suppose that a user have a > numpy object of type unicode that has been created in a python with > UCS4. This would look like: > > # UCS4-aware interpreter here > >>> numpy.array(u"\U000110fc", "U1") > array(u'\U000110fc', dtype=(unicode,4)) > > Now, suppose that you save this in a PyTables file (for example) and > you want to regenerate it on a python interpreter compiled with UCS2. > As the buffer on-disk has a fixed length, we are forced to use unicode > types twice as larger as containers for this data. So the net effect > is that we will end in the UCS2 interpreter with an object like: > > # UCS2-aware interpreter here > >>> numpy.array(u"\U000110fc", "U2") > array(u'\U000110fc', dtype=(unicode,4)) > > which, apparently is the same than the one above, but not quite. To > begin with, the former is an array that is an unicode scalar with only > *one* character, while the later has *two* characters. But worse than > that, the interpretation of the original content changes drastically > in the UCS2 platform. For example, if we select the first and second > characters of the string in the UCS2-aware platform, we have: > > >>> numpy.array(u"\U000110fc", "U2")[()][0] > u'\ud804' > >>> numpy.array(u"\U000110fc", "U2")[()][1] > u'\udcfc' > > that have nothing to do with the original \U000110fc character (I'd > expect to get at least the truncated values \u0001 and \u10fc). I > think this is because of the conventions that are used to represent > 32-bit unicode characters in UTF-16 using a technique called > "surrogate pairs" (see: http://www.unicode.org/glossary/). > > All in all, my opinion is that allowing the coexistence of different > sizes of unicode types in numpy would be a receipt for disaster when > one wants to transport unicode characters between platforms with > python interpreters compiled with different unicode sizes. > Consequently I'd propose to suport just one size of unicode sizes in > numpy, namely, the 4-byte one, and if this size doesn't match the > underlying python platform, then refuse to deliver native unicode > objects if the user is asking for them. Something like would work: > > # UCS2-aware interpreter here > >>> h=numpy.array(u"\U000110fc", "U1") > >>> h # This is a 'true' 32-bit unicode array in numpy > array(u'\U000110fc', dtype=(unicode,4)) > >>> h[()] # Try to get a native unicode object in python > Traceback (most recent call last): > File "", line 1, in ? > ValueError: unicode sizes in numpy and your python interpreter doesn't > match. Sorry, but you should get an UCS4-enable python interpreter if > you want to successfully complete this operation. > > As a premium, we can get rid of the 'w' and 'W' typecodes that has > been introduced a bit forcedly, IMO. I don't know, however, how > difficult would be implementing this in numpy. Another option can be > to refuse to compile numpy with UCS2-aware interpreters, but this > sounds a bit extreme, but see below. > > OTOH, I'm not an expert in Unicode, but after googling a bit, I've > found interesting recommendations about its use in Python. The first > is from Uge Ubuchi in http://www.xml.com/pub/a/2005/06/15/py-xml.html. 
> Here is the relevant excerpt: > > """ > I also want to mention another general principle to keep in mind: if > possible, use a Python install compiled to use UCS4 character storage > [...] UCS4 uses more space to store characters, but there are some > problems for XML processing in UCS2, which the Python core team is > reluctant to address because the only known fixes would be too much of > a burden on performance. Luckily, most distributors have heeded this > advice and ship UCS4 builds of Python. > """ > > So, it seems that the Python crew is not interested in solving > problems with with UCS2. Now, towards the end of the PEP 261 ('Support > for "wide" Unicode characters') one can read this as a final > conclusion: > > """ > This PEP represents the least-effort solution. Over the next several > years, 32-bit Unicode characters will become more common and that may > either convince us that we need a more sophisticated solution or (on > the other hand) convince us that simply mandating wide Unicode > characters is an appropriate solution. > """ > > This PEP dates from 27-Jun-2001, so the "next several years" the > author is referring to is nowadays. In fact, the interpreters in my > Debian based Linux, are both compiled with UCS4. Despite of this, it > seems that the default for compiling python is using UCS2 provided > that you still need to pass the flag "--enable-unicode=ucs4" if you > want to end with a UCS4-enabled interpreter. I wonder why they are > doing this if that can positively lead to problems with XML as Uge > Ubuchi said (?). > > Anyway, I don't know if the recommendation of compiling Python with > UCS4 is spread enough or not in the different distributions, but > people can easily check this with: > > >>> len(buffer(u"u")) > 4 > > if the output of this is 4 (as in my example), then the interpreter is > using UCS4; if it is 2, it is using UCS2. > > Finally, I agree that asking for help about these issues in the python > list would be a good idea. > I have no good solution for this problem, but the standard Python on my 1-year old Mandrake is still UCS2 and I quote from PEP-261: Windows builds will be narrow for a while based on the fact that there have been few requests for wide characters, those requests are mostly from hard-core programmers with the ability to buy their own Python and Windows itself is strongly biased towards 16-bit characters. Suppose that is still true. Maybe Vista will change that. Wouldn't it be possible that numpy takes care of the "surrogate pairs" when transferring unicode strings from UCS2-interpreters to UCS4-ndarrays and vice-versa? It would be nice to be able to cast explicitly between UCS2- and UCS4- arrays, too. Requesting users to recompile their Python is a rather brutal solution :-) Gerard From oliphant.travis at ieee.org Tue Feb 7 11:09:06 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue Feb 7 11:09:06 2006 Subject: [Numpy-discussion] Is there a known problem compiling numpy with VC7? In-Reply-To: <43E8EE72.6070101@cox.net> References: <43E41E52.6060805@cox.net> <43E42B26.2090304@ee.byu.edu> <43E42C2C.6000208@cox.net> <43E45A95.6040601@ieee.org> <43E81650.2040204@cox.net> <43E81BFA.7060600@ieee.org> <43E82930.7070103@cox.net> <43E83FD3.90206@ieee.org> <43E8EE72.6070101@cox.net> Message-ID: <43E8EFE9.5040207@ieee.org> Tim Hochberg wrote: > > A couple of more minor issues. > > 1. numpy/random/mtrand/distributions.c needs M_PI defined if is not > already. 
I used the def from umathmodule.c:

> #ifndef M_PI
> #define M_PI 3.14159265358979323846264338328
> #endif
>
> 2. The math library m.lib was hardcoded into numpy/random/setup.py.
> I simply replaced ['m'] with [], which is probably not right in
> general. It should probably be grabbed from config.h.
>
> 3. This made it through all the compiling, but blew up on linking
> randomkit because several CryptXXX functions were not defined. I added
> 'Advapi32' to the libraries list. (In total, libraries went from ['m']
> to ['Advapi32'].)
>
> With this I got a full compile. I successfully imported numpy and
> added a couple of matrices. Hooray!
>
> Is there a way to run it through some regression tests? That seems
> like it should be the next step.

Let's see if we can't fix up the setup.py file to handle this common platform correctly....

import numpy
numpy.test(1,1)

-Travis

From oliphant.travis at ieee.org Tue Feb 7 11:12:08 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Tue Feb 7 11:12:08 2006
Subject: [Numpy-discussion] Re: [SciPy-user] Replacing Numeric With NumPy
In-Reply-To:
References: <43E8E39D.5020605@ieee.org>
Message-ID: <43E8F0B4.6090502@ieee.org>

Rich Shepard wrote:
> On Tue, 7 Feb 2006, Travis Oliphant wrote:
>
>> No need to do that. Numeric and NumPy (import numpy) can live happily
>> together. With versions of Numeric above 24.0, they can even share
>> the same data.
>
> Travis,
>
> Are there advantages to having both on the system? I read the Numeric
> manual a couple of times, but haven't looked deeply at the division
> between the two.

The only real advantage is to ease the transition burden. Several third-party libraries have not converted yet, so to use those you still need Numeric.

-Travis

From oliphant.travis at ieee.org Tue Feb 7 11:27:04 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Tue Feb 7 11:27:04 2006
Subject: [Numpy-discussion] Extent of unicode types in numpy
In-Reply-To: <20060207194858.67cd0698.gerard.vermeulen@grenoble.cnrs.fr>
References: <1139250278.7538.52.camel@localhost.localdomain> <43E7F40F.7030303@cox.net> <43E8495C.9020008@ieee.org> <200602071542.35593.faltet@carabos.com> <20060207194858.67cd0698.gerard.vermeulen@grenoble.cnrs.fr>
Message-ID: <43E8F449.2060600@ieee.org>

Gerard Vermeulen wrote:

>>While I agree that this solution is more consistent, I must say that
>>I'm not very comfortable with having to deal with two different widths
>>for unicode characters.
>>

Python itself hands us this difference. Is it really so different than the fact that python integers are either 32-bit or 64-bit depending on the platform?

Perhaps what this is telling us is that we do indeed need another data-type for 4-byte unicode. It's how we solve the problem of 32-bit or 64-bit integers (we have a 64-bit integer on all platforms).

Then in NumPy we can support going back and forth between UCS-2 (which we can then say is UTF-16) and UCS-4.

The issue with saving to disk is really one of encoding anyway. So, if PyTables wants to do this correctly, then it should be using a particular encoding anyway. The internal representation of Unicode should not technically matter, as it's only input and output that is important.

I won't support requiring a UCS-4 build of Python, though. That's too stringent. Most characters are contained within the 0th plane of UCS-2. For the additional characters (only up to 0x0010FFFF are defined), the surrogate pairs can be used.
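(An editorial aside, not part of the original message: the surrogate-pair rule is simple enough to sketch in a few lines of Python. The helper name is made up, and the pair it produces for U+110FC matches the u'\ud804', u'\udcfc' values Francesc observed earlier in the thread.)

>>> def to_surrogate_pair(cp):
...     # split a code point beyond the BMP (> 0xFFFF) into a UTF-16 pair
...     cp -= 0x10000
...     return 0xD800 + (cp >> 10), 0xDC00 + (cp & 0x3FF)
...
>>> [hex(u) for u in to_surrogate_pair(0x110FC)]
['0xd804', '0xdcfc']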
I think the best solution is to define separate UCS4 and UCS2 data-types and handle conversion between them using the casting functions. This is a bit of work to implement, but not too bad...

>Wouldn't it be possible that numpy takes care of the "surrogate pairs"
>when transferring unicode strings from UCS2-interpreters to UCS4-ndarrays
>and vice-versa?
>
>It would be nice to be able to cast explicitly between UCS2- and UCS4- arrays,
>too.
>
>Requesting users to recompile their Python is a rather brutal solution :-)
>

I agree. I much prefer an additional data-type since that is, after all, what UCS2 and UCS4 are... different data-types.

-Travis

From rshepard at appl-ecosys.com Tue Feb 7 11:32:03 2006
From: rshepard at appl-ecosys.com (Rich Shepard)
Date: Tue Feb 7 11:32:03 2006
Subject: [Numpy-discussion] Re: [SciPy-user] Replacing Numeric With NumPy
In-Reply-To: <43E8F0B4.6090502@ieee.org>
References: <43E8E39D.5020605@ieee.org> <43E8F0B4.6090502@ieee.org>
Message-ID:

On Tue, 7 Feb 2006, Travis Oliphant wrote:

> The only real advantage is to ease the transition burden. Several
> third-party libraries have not converted yet, so to use those you still
> need Numeric.

Thank you.

Rich

--
Richard B. Shepard, Ph.D.              | Author of "Quantifying Environmental
Applied Ecosystem Services, Inc. (TM)  | Impact Assessments Using Fuzzy Logic"
Voice: 503-667-4517             Fax: 503-667-8863

From faltet at carabos.com Tue Feb 7 12:09:05 2006
From: faltet at carabos.com (Francesc Altet)
Date: Tue Feb 7 12:09:05 2006
Subject: [Numpy-discussion] Extent of unicode types in numpy
In-Reply-To: <43E8F449.2060600@ieee.org>
References: <1139250278.7538.52.camel@localhost.localdomain> <43E7F40F.7030303@cox.net> <43E8495C.9020008@ieee.org> <200602071542.35593.faltet@carabos.com> <20060207194858.67cd0698.gerard.vermeulen@grenoble.cnrs.fr> <43E8F449.2060600@ieee.org>
Message-ID: <1139342874.7544.37.camel@localhost.localdomain>

El dt 07 de 02 del 2006 a les 12:26 -0700, en/na Travis Oliphant va escriure:
> Python itself hands us this difference. Is it really so different than
> the fact that python integers are either 32-bit or 64-bit depending on
> the platform?
>
> Perhaps what this is telling us is that we do indeed need another
> data-type for 4-byte unicode. It's how we solve the problem of 32-bit
> or 64-bit integers (we have a 64-bit integer on all platforms).

Agreed.

> Then in NumPy we can support going back and forth between UCS-2 (which
> we can then say is UTF-16) and UCS-4.

If this could be implemented, then excellent!

> The issue with saving to disk is really one of encoding anyway. So, if
> PyTables wants to do this correctly, then it should be using a
> particular encoding anyway.

The problem with unicode encodings is that most (I'm thinking of UTF-8 and UTF-16) choose (correct me if I'm wrong here) a technique of surrogate pairs when trying to encode values that don't fit in a single code unit (7 bits for UTF-8 and 15 bits for UTF-16), which leads to a *variable* length of the coded output. And this is precisely the point: PyTables (as NumPy itself, or any other piece of software with efficiency in mind) would require a *fixed* space for keeping data, not a space that can be bigger or smaller depending on the number of surrogate pairs that should be used to encode a certain unicode string.
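(Another editorial illustration, not from the original message: the variable width is easy to see from any interpreter, whatever its Py_UNICODE size, since the codecs do the splitting themselves.)

>>> len(u"a".encode("utf-8")), len(u"\U000110fc".encode("utf-8"))
(1, 4)
>>> len(u"a".encode("utf-16-be")), len(u"\U000110fc".encode("utf-16-be"))
(2, 4)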
But, if what you are saying is that NumPy would adopt a 32-bit unicode type internally and then do the appropriate conversion to/from the python interpreter, then this is perfect, because it is the buffer of NumPy that will be used to be written/read to/from disk, not the Python object, and the buffer of such a NumPy object meets the requisites to become an efficient buffer: fixed length *and* large enough to keep *every* Unicode character without a need to use encodings. > I think the best solution is to define separate UCS4 and UCS2 data-types > and handle conversion between them using the casting functions. This > is a bit of work to implement, but not too bad... Well, I don't understand well here. I thought that you were proposing a 32-bit unicode type for NumPy and then converting it appropriately to UCS2 (conversion to UCS4 wouldn't be necessary as it would be the same as the native NumPy unicode type) just in case that the user requires an scalar out of the NumPy object. But you are talking here about defining separate UCS4 and UCS2 data-types. I admit that I'm loosed here... Regards, -- >0,0< Francesc Altet http://www.carabos.com/ V V C?rabos Coop. V. Enjoy Data "-" From oliphant.travis at ieee.org Tue Feb 7 12:37:03 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue Feb 7 12:37:03 2006 Subject: [Numpy-discussion] Extent of unicode types in numpy In-Reply-To: <1139342874.7544.37.camel@localhost.localdomain> References: <1139250278.7538.52.camel@localhost.localdomain> <43E7F40F.7030303@cox.net> <43E8495C.9020008@ieee.org> <200602071542.35593.faltet@carabos.com> <20060207194858.67cd0698.gerard.vermeulen@grenoble.cnrs.fr> <43E8F449.2060600@ieee.org> <1139342874.7544.37.camel@localhost.localdomain> Message-ID: <43E904AC.5060002@ieee.org> Francesc Altet wrote: >El dt 07 de 02 del 2006 a les 12:26 -0700, en/na Travis Oliphant va >escriure: > > >>Python itself hands us this difference. Is it really so different then >>the fact that python integers are either 32-bit or 64-bit depending on >>the platform. >> >>Perhaps what this is telling us, is that we do indeed need another >>data-type for 4-byte unicode. It's how we solve the problem of 32-bit >>or 64-bit integers (we have a 64-bit integer on all platforms). >> >> > >Agreed. > > > >>Then in NumPy we can support going back and forth between UCS-2 (which >>we can then say is UTF-16) and UCS-4. >> >> > >If this could be implemented, then excellent! > > Sure it could be implemented. It's just a matter of effort. Python itself always defines a Py_UCS4 type even on UCS2 builds. We would just have to make sure Py_UCS2 is always defined as well. The biggest hassle is implementing the corresponding scalar type. The one corresponding to the build for Python comes free. The other would have to be implemented directly. >The problem with unicode encodings is that most (I'm thinking in UTF-8 >and UTF-16) choose (correct me if I'm wrong here) a technique of >surrogating pairs when trying to encode values that doesn't fit in a >single word (7 bits for UTF-8 and 15 bits for UTF-16), which brings to a >*variable* length of the coded output. And this is precisely the point: >PyTables (as NumPy itself, or any other piece of software with >efficiency in mind) would require a *fixed* space for keeping data, not >a space that can be bigger or smaller depending on the number of >surrogate pairs that should be used to encode a certain unicode string. 
> > You are correct that encoding introduces a variable byte-length per character (up to 6 for UTF-8 and up to 2 for UTF-16 I think). I've seen data-bases handle this by warning the user to make sure the size of their data area is large enough to handle their longest use case. You can still used fixed-sizes you just have to make sure they are large enough (or risk truncation). >But, if what you are saying is that NumPy would adopt a 32-bit unicode >type internally and then do the appropriate conversion to/from the >python interpreter, then this is perfect, because it is the buffer of >NumPy that will be used to be written/read to/from disk, not the Python >object, and the buffer of such a NumPy object meets the requisites to >become an efficient buffer: fixed length *and* large enough to keep >*every* Unicode character without a need to use encodings. > > I see the value in such a buffer, I really do. I'm just concerned about forcing everyone to use Python UCS4 builds. That is way too stringent. I'm afraid the only real solution is to implement a UCS2 and a UCS4 data-type. >Well, I don't understand well here. I thought that you were proposing a >32-bit unicode type for NumPy and then converting it appropriately to >UCS2 (conversion to UCS4 wouldn't be necessary as it would be the same >as the native NumPy unicode type) just in case that the user requires an >scalar out of the NumPy object. But you are talking here about defining >separate UCS4 and UCS2 data-types. I admit that I'm loosed here... > > > I suppose that is another approach: we could internally have all UNICODE data-types use 4-bytes and do the conversions necessary. But, it would still require us to do most of work of supporting two data-types. Currently, the unicode scalar object is a simple inheritance from Python's UNICODE data-type. That would have to change and the work to do that is most of the work to support two different data-types. So, if we are going to go through that effort. I would rather see the result be two different Unicode data-types supported. -Travis From oliphant.travis at ieee.org Tue Feb 7 16:52:04 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue Feb 7 16:52:04 2006 Subject: ***[Possible UCE]*** Re: [Numpy-discussion] Is there a known problem compiling numpy with VC7? Success In-Reply-To: <43E9235E.70004@cox.net> References: <43E41E52.6060805@cox.net> <43E42B26.2090304@ee.byu.edu> <43E42C2C.6000208@cox.net> <43E45A95.6040601@ieee.org> <43E81650.2040204@cox.net> <43E81BFA.7060600@ieee.org> <43E82930.7070103@cox.net> <43E83FD3.90206@ieee.org> <43E8EE72.6070101@cox.net> <43E8EFE9.5040207@ieee.org> <43E9235E.70004@cox.net> Message-ID: <43E94090.5080609@ieee.org> Tim Hochberg wrote: > > OK, I finally got it to pass all of the tests. The final two pieces of > the puzzle were using _isnan and _finite and then realizing the > _finite was not in fact the opposite of isinf. Thanks for finding this. I've updated the ufuncobject.h file with definitions for isinf, isfinite, and isnan. Presumably this should allow the SVN version of numpy to build. Let me know what happens. 
-Travis

From faltet at carabos.com Wed Feb 8 00:09:10 2006
From: faltet at carabos.com (Francesc Altet)
Date: Wed Feb 8 00:09:10 2006
Subject: [Numpy-discussion] Extent of unicode types in numpy
In-Reply-To: <43E904AC.5060002@ieee.org>
References: <1139250278.7538.52.camel@localhost.localdomain> <43E7F40F.7030303@cox.net> <43E8495C.9020008@ieee.org> <200602071542.35593.faltet@carabos.com> <20060207194858.67cd0698.gerard.vermeulen@grenoble.cnrs.fr> <43E8F449.2060600@ieee.org> <1139342874.7544.37.camel@localhost.localdomain> <43E904AC.5060002@ieee.org>
Message-ID: <1139386100.7534.35.camel@localhost.localdomain>

El dt 07 de 02 del 2006 a les 13:35 -0700, en/na Travis Oliphant va escriure:
> Sure it could be implemented. It's just a matter of effort. Python
> itself always defines a Py_UCS4 type even on UCS2 builds. We would just
> have to make sure Py_UCS2 is always defined as well.

Be careful with this because you can run into problems. For example, trying to import a numpy compiled with a UCS4 python from a UCS2 one gives me the following:

$ python
Python 2.4.2 (#1, Feb 8 2006, 08:16:44)
[GCC 4.0.3 20060115 (prerelease) (Debian 4.0.2-7)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
import core -> failed: /usr/lib/python2.4/site-packages/numpy/core/multiarray.so: undefined symbol: _PyUnicodeUCS4_IsWhitespace
import random -> failed: 'module' object has no attribute 'dtype'
import lib -> failed: /usr/lib/python2.4/site-packages/numpy/core/multiarray.so: undefined symbol: _PyUnicodeUCS4_IsWhitespace

Although I guess that this would not be a problem when using a numpy compiled with the proper interpreter. Just wanted to point this out.

> The biggest hassle is implementing the corresponding scalar type. The
> one corresponding to the build for Python comes free. The other would
> have to be implemented directly.

Yeah, it seems like we will end up implementing a new Unicode type entirely in NumPy one way or another.

> I've seen databases handle this by warning the user to make sure the
> size of their data area is large enough to handle their longest use
> case. You can still use fixed sizes; you just have to make sure they
> are large enough (or risk truncation).

Ok. I can admit that data can be truncated (you may end up with a corrupted Unicode string, but this is the responsibility of the user :-(). However, another thing that I feel uncomfortable with is the additional encoding/decoding steps that UCS2 potentially introduces for doing I/O. Well, perhaps this is faster than I suppose and I/O speed will not be too affected, but still...

>>Well, I don't understand well here. I thought that you were proposing a
>>32-bit unicode type for NumPy and then converting it appropriately to
>>UCS2 (conversion to UCS4 wouldn't be necessary as it would be the same
>>as the native NumPy unicode type) just in case that the user requires a
>>scalar out of the NumPy object. But you are talking here about defining
>>separate UCS4 and UCS2 data-types. I admit that I'm lost here...
>>
>>
>
> I suppose that is another approach: we could internally have all
> UNICODE data-types use 4-bytes and do the conversions necessary. But,
> it would still require us to do most of the work of supporting two
> data-types. Currently, the unicode scalar object is a simple
> inheritance from Python's UNICODE data-type. That would have to change
> and the work to do that is most of the work to support two different
> data-types.
> So, if we are going to go through that effort, I would rather see the
> result be two different Unicode data-types supported.

Ok. I see that you got my point. Well, maybe I'm wrong here, but my
proposal would result in implementing just one new data-type for 32-bit
unicode when the python platform is UCS2 aware. If, as you said above,
the Py_UCS4 type is always defined, even on UCS2 interpreters, that
should be relatively easy to do.

So, we can make all the NumPy unicode *arrays* based on this new type.
The NumPy unicode *scalars* will inherit directly from the native
Py_UCS2 type for this interpreter. Then, we just have to implement the
necessary conversions between UCS4 <--> UCS2 to communicate data from a
NumPy array into/from the scalar type.

The only drawback that I see in this approach is that you will end up
having UCS4 types in numpy ndarrays and UCS2 types when getting scalars
from them (however, the user will hardly notice this, IMO). The
advantage would be that NumPy arrays will always be UCS4 regardless of
the platform they are on, making access to their data from C much
easier and more portable (and yes, efficient!). Of course, if you are
using a UCS4 platform, then you can choose the same native Py_UCS4 type
for NumPy arrays and scalars and you are done.

Well, probably I've overlooked something, but I really think that this
would be a nice thing to do.

Regards,

-- 
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"


From oliphant.travis at ieee.org  Wed Feb  8 00:42:03 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Wed Feb  8 00:42:03 2006
Subject: [Numpy-discussion] Extent of unicode types in numpy
In-Reply-To: <1139386100.7534.35.camel@localhost.localdomain>
References: <1139250278.7538.52.camel@localhost.localdomain>
	<43E7F40F.7030303@cox.net> <43E8495C.9020008@ieee.org>
	<200602071542.35593.faltet@carabos.com>
	<20060207194858.67cd0698.gerard.vermeulen@grenoble.cnrs.fr>
	<43E8F449.2060600@ieee.org>
	<1139342874.7544.37.camel@localhost.localdomain>
	<43E904AC.5060002@ieee.org>
	<1139386100.7534.35.camel@localhost.localdomain>
Message-ID: <43E9AEAE.5020306@ieee.org>

Francesc Altet wrote:

>Ok. I see that you got my point. Well, maybe I'm wrong here, but my
>proposal would result in implementing just one new data-type for 32-bit
>unicode when the python platform is UCS2 aware. If, as you said above,
>the Py_UCS4 type is always defined, even on UCS2 interpreters, that
>should be relatively easy to do.
>

Hmm.  I think I'm beginning to like your idea.  We could in fact make
the NumPy Unicode type always UCS4 and then keep the Python Unicode
scalar.  On Python UCS2 builds the conversion would use UTF-16 to go to
the Python scalar (which would always inherit from the native unicode
type).

It would be one data-type where there was not an identical match in the
memory layout of the scalar and the array data-type, but because in
this case there are conversions to go back and forth, it may not
matter.

This would not be too difficult to implement, actually --- it would
require new functions to handle conversions in arraytypes.inc.src and
some modifications to PyArray_Scalar.

The only drawback is that all unicode arrays are now twice as large,
plus the aforementioned asymmetry between the data-type and the
array-scalar on Python UCS2 builds.

But, all in all, it sounds like a good plan.  If the time comes that
somebody wants to add a reduced-size UCS2 array of unicode characters,
then we can cross that bridge if and when it comes up.
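The UCS4-to-UTF-16 step mentioned above is mechanical. As a minimal
Python sketch of the surrogate-pair encoding (illustrative only; the
real conversion lives in C, and the function name here is made up):

def ucs4_to_utf16(code_point):
    # BMP code points fit in a single UTF-16 code unit
    if code_point < 0x10000:
        return [code_point]
    # supplementary code points split into a high/low surrogate pair
    code_point -= 0x10000
    high = 0xD800 | (code_point >> 10)
    low  = 0xDC00 | (code_point & 0x3FF)
    return [high, low]

# ucs4_to_utf16(0x10437) -> [0xD801, 0xDC37]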
I still like using explicit typecode characters in the array interface
to denote the UCS2 or the UCS4 data-type.  We could still change from
'W', 'w' to other characters...

>Well, probably I've overlooked something, but I really think that this
>would be a nice thing to do.
>

There are details in the scalar-array conversions (getitem and setitem)
that would have to be implemented, but it is possible.  The UCS4 -->
UTF-16 encoding is one of the easiest.  It's done in unicodeobject.h in
Python, but I'm not sure it's exposed other than going through the
interpreter.

Does this seem like a solution that everyone can live with?

-Travis


From gerard.vermeulen at grenoble.cnrs.fr  Wed Feb  8 01:30:02 2006
From: gerard.vermeulen at grenoble.cnrs.fr (Gerard Vermeulen)
Date: Wed Feb  8 01:30:02 2006
Subject: [Numpy-discussion] Extent of unicode types in numpy
In-Reply-To: <43E9AEAE.5020306@ieee.org>
References: <1139250278.7538.52.camel@localhost.localdomain>
	<43E7F40F.7030303@cox.net> <43E8495C.9020008@ieee.org>
	<200602071542.35593.faltet@carabos.com>
	<20060207194858.67cd0698.gerard.vermeulen@grenoble.cnrs.fr>
	<43E8F449.2060600@ieee.org>
	<1139342874.7544.37.camel@localhost.localdomain>
	<43E904AC.5060002@ieee.org>
	<1139386100.7534.35.camel@localhost.localdomain>
	<43E9AEAE.5020306@ieee.org>
Message-ID: <20060208102906.6191d180.gerard.vermeulen@grenoble.cnrs.fr>

On Wed, 08 Feb 2006 01:41:18 -0700
Travis Oliphant wrote:

> >Well, probably I've overlooked something, but I really think that this
> >would be a nice thing to do.
> >
>
> There are details in the scalar-array conversions (getitem and setitem)
> that would have to be implemented, but it is possible.  The UCS4 -->
> UTF-16 encoding is one of the easiest.  It's done in unicodeobject.h in
> Python, but I'm not sure it's exposed other than going through the
> interpreter.
>
> Does this seem like a solution that everyone can live with?
>

Yes.  The only point that worries me a little bit is that some problems
are limited by memory or memory bandwidth, and for those cases UCS2
arrays are better than UCS4 arrays.  I have run into memory problems
before, and I don't know if it will happen for unicode strings.

Time will tell.

Gerard


From faltet at carabos.com  Wed Feb  8 02:10:07 2006
From: faltet at carabos.com (Francesc Altet)
Date: Wed Feb  8 02:10:07 2006
Subject: [Numpy-discussion] Extent of unicode types in numpy
In-Reply-To: <43E9AEAE.5020306@ieee.org>
References: <1139250278.7538.52.camel@localhost.localdomain>
	<1139386100.7534.35.camel@localhost.localdomain>
	<43E9AEAE.5020306@ieee.org>
Message-ID: <200602081109.14883.faltet@carabos.com>

On Wednesday 08 February 2006 09:41, Travis Oliphant wrote:
> Hmm.  I think I'm beginning to like your idea.  We could in fact make

Good :-)

> the NumPy Unicode type always UCS4 and then keep the Python Unicode
> scalar.  On Python UCS2 builds the conversion would use UTF-16 to go to
> the Python scalar (which would always inherit from the native unicode
> type).

Yes, exactly.

> But, all in all, it sounds like a good plan.  If the time comes that
> somebody wants to add a reduced-size UCS2 array of unicode characters,
> then we can cross that bridge if and when it comes up.

Well, given the recommendations about migrating to 32-bit unicode
objects, I'd say that this would be a strange desire. If the problem is
memory consumption, the users can always choose regular 8-bit strings
(of course, without supporting completely general unicode characters).
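For reference, the truncation being accepted in this thread is the same
one that fixed-width 'S' fields already exhibit. A tiny sketch (exact
repr details may vary between versions):

import numpy
a = numpy.array(['abcdef'], dtype='S3')
# the value is silently truncated to the 3-byte field width
assert a[0] == 'abc'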
> I still like using explicit typecode characters in the array interface
> to denote the UCS2 or the UCS4 data-type.  We could still change from
> 'W', 'w' to other characters...

But, why do you want to do this? If the data type for unicode in arrays
is always UCS4, and in scalars is always determined by the Python
build, then why would we want to try to distinguish them with specific
type codes? At the C level there should be straightforward ways to
determine whether a scalar is UCS2 or UCS4 (just by looking at the
native Python type), and at the Python level there is no evident way to
distinguish (correct me if I'm wrong here) between a UCS2 and a UCS4
unicode string; in fact, the user will not notice the difference in
general (but see later).

Besides, having 'U' as the indicator for unicode is compatible with the
way Python expresses 32-bit unicode chars (i.e. \Uxxxxxxxx). So I find
that keeping 'U' for specifying unicode types would be more than
enough, and that introducing 'w' and 'W' (or whatever) will only
introduce unnecessary burden, IMO.

Moreover, if a user tries to know the type using the .dtype descriptor,
he will find that the type continues to be 'U' regardless of the build
he is using. Something like:

# We are in a UCS2 interpreter
In [30]: numpy.array([1],dtype="U2")[0].dtype
Out[30]: dtype('<U2')

[...] UCS2. I'm still wondering why this is not the default... :-/

Cheers,

-- 
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"


From a.h.jaffe at gmail.com  Wed Feb  8 04:36:04 2006
From: a.h.jaffe at gmail.com (Andrew Jaffe)
Date: Wed Feb  8 04:36:04 2006
Subject: [Numpy-discussion] GCC 4 (OS X?) support for numpy?
Message-ID:

Hi All,

[originally posted in a slightly off-topic thread, so I thought I'd try
here -- sorry for the duplication!]

What is the status of gcc 4.0 support (on Mac OS X at least)? It's a
bit of a pain to have to switch between the two compilers (are there
any other disadvantages?). As of my last attempt, numpy.test() fails
due to some machar issues, if I recall correctly.

Andrew


From stefan at sun.ac.za  Wed Feb  8 06:09:12 2006
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Wed Feb  8 06:09:12 2006
Subject: [Numpy-discussion] creating column vectors
Message-ID: <20060208141052.GB5734@alpha>

This is probably a silly question, but what is the best way of creating
column vectors? 'arange' always returns a row vector, on which you
cannot perform 'transpose' since it has only one dimension.

mat(arange(1,10)).transpose()

works, but seems a bit long-winded (in comparison to MATLAB's [1:10]').

I'd appreciate pointers in the right direction.

Regards
Stéfan


From svetosch at gmx.net  Wed Feb  8 06:35:19 2006
From: svetosch at gmx.net (Sven Schreiber)
Date: Wed Feb  8 06:35:19 2006
Subject: [Numpy-discussion] creating column vectors
In-Reply-To: <20060208141052.GB5734@alpha>
References: <20060208141052.GB5734@alpha>
Message-ID: <43EA0157.9000908@gmx.net>

Stefan van der Walt schrieb:
> This is probably a silly question, but what is the best way of
> creating column vectors? 'arange' always returns a row vector, on
> which you cannot perform 'transpose' since it has only one dimension.
>
> mat(arange(1,10)).transpose()
>

mat(range(1,10)).T is a bit shorter, but I would agree that doing
matrix algebra in numpy is not as natural as with explicitly
matrix-oriented languages; my understanding is that this is due to
numpy's broader (n-dimensional) scope.
Numpy-masters: Is there a way to set a user- or project-specific config
switch or something like that to always get matrix results when dealing
with 1d and 2d arrays? I think that would make numpy much more
attractive for people like Stefan and me coming from the 2d world.

cheers,
Sven


From luszczek at cs.utk.edu  Wed Feb  8 07:03:03 2006
From: luszczek at cs.utk.edu (Piotr Luszczek)
Date: Wed Feb  8 07:03:03 2006
Subject: [Numpy-discussion] creating column vectors
In-Reply-To: <43EA0157.9000908@gmx.net>
References: <20060208141052.GB5734@alpha> <43EA0157.9000908@gmx.net>
Message-ID: <200602081001.54312.luszczek@cs.utk.edu>

On Wednesday 08 February 2006 09:33, Sven Schreiber wrote:
> Stefan van der Walt schrieb:
> > This is probably a silly question, but what is the best way of
> > creating column vectors? 'arange' always returns a row vector, on
> > which you cannot perform 'transpose' since it has only one
> > dimension.
> >
> > mat(arange(1,10)).transpose()
>
> mat(range(1,10)).T is a bit shorter, but I would agree that doing
> matrix algebra in numpy is not as natural as with explicitly
> matrix-oriented languages; my understanding is that this is due to
> numpy's broader (n-dimensional) scope.
>
> Numpy-masters: Is there a way to set a user- or project-specific
> config switch or something like that to always get matrix results
> when dealing with 1d and 2d arrays? I think that would make numpy
> much more attractive for people like Stefan and me coming from the 2d
> world.

I'm not a master by far, but I have heard that question before. Isn't
the mlab module there for just that purpose?

It was explained to me that the problem with a "switch" is that the
same code will behave differently depending on which installation you
run it on. If you run it on my n-D installation it will do one thing,
and if you run it on your 2-D installation (with the 2D-world "switch"
enabled) you get subtly different results. It might become a
bug-hunting nightmare.

I think this is where Python's explicit-versus-implicit rule kicks in:

python -c 'import this'

Piotr


From gerard.vermeulen at grenoble.cnrs.fr  Wed Feb  8 07:22:04 2006
From: gerard.vermeulen at grenoble.cnrs.fr (Gerard Vermeulen)
Date: Wed Feb  8 07:22:04 2006
Subject: [Numpy-discussion] creating column vectors
In-Reply-To: <20060208141052.GB5734@alpha>
References: <20060208141052.GB5734@alpha>
Message-ID: <20060208162115.205360dd.gerard.vermeulen@grenoble.cnrs.fr>

On Wed, 8 Feb 2006 16:10:52 +0200
Stefan van der Walt wrote:

> This is probably a silly question, but what is the best way of
> creating column vectors? 'arange' always returns a row vector, on
> which you cannot perform 'transpose' since it has only one dimension.
>
> mat(arange(1,10)).transpose()
>
> works, but seems a bit long-winded (in comparison to MATLAB's [1:10]').
>
> I'd appreciate pointers in the right direction.
>

What about this?

arange(1, 10)[:, NewAxis]

Gerard


From arnd.baecker at web.de  Wed Feb  8 09:24:02 2006
From: arnd.baecker at web.de (Arnd Baecker)
Date: Wed Feb  8 09:24:02 2006
Subject: [Numpy-discussion] Need compilations with compilers other than gcc
In-Reply-To:
References: <43E840BE.5060204@ieee.org>
Message-ID:

On Tue, 7 Feb 2006, Arnd Baecker wrote:
> On Tue, 7 Feb 2006, Pearu Peterson wrote:
[...]
> > >> So, a recommended fix would be to build Python with icc and as a
> > >> result correct libraries will be used for building 3rd party
> > >> extension modules.

OK, I went for this.
With numpy.__version__ '0.9.5.2069' I get for numpy.test(10):

======================================================================
FAIL: check_basic (numpy.lib.function_base.test_function_base.test_cumprod)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/work/home/baecker/INSTALL_PYTHON_again_with_icc/Inst/lib/python2.4/site-packages/numpy/lib/tests/test_function_base.py", line 169, in check_basic
    1320, 6600, 26400],ctype))
  File "/work/home/baecker/INSTALL_PYTHON_again_with_icc/Inst/lib/python2.4/site-packages/numpy/testing/utils.py", line 156, in assert_array_equal
    assert cond,\
AssertionError:
Arrays are not equal (mismatch 57.1428571429%):
	Array 1: [ 1. 2. 20. 0. 0. 0. 0.]
	Array 2: [ 1.0000000000000000e+00 2.0000000000000000e+00 2.0000000000000000e+01 2.2000000000000000e+02 1.32000000000000...

======================================================================
FAIL: check_basic (numpy.lib.function_base.test_function_base.test_cumsum)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/work/home/baecker/INSTALL_PYTHON_again_with_icc/Inst/lib/python2.4/site-packages/numpy/lib/tests/test_function_base.py", line 128, in check_basic
    assert_array_equal(cumsum(a), array([1,3,13,24,30,35,39],ctype))
  File "/work/home/baecker/INSTALL_PYTHON_again_with_icc/Inst/lib/python2.4/site-packages/numpy/testing/utils.py", line 156, in assert_array_equal
    assert cond,\
AssertionError:
Arrays are not equal (mismatch 57.1428571429%):
	Array 1: [ 1. 3. 13. 11. 17. 5. 9.]
	Array 2: [ 1. 3. 13. 24. 30. 35. 39.]

======================================================================
FAIL: check_simple (numpy.lib.function_base.test_function_base.test_unwrap)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/work/home/baecker/INSTALL_PYTHON_again_with_icc/Inst/lib/python2.4/site-packages/numpy/lib/tests/test_function_base.py", line 273, in check_simple
    assert(all(diff(unwrap(rand(10)*100)) < pi))
[...]


From svetosch at gmx.net  Wed Feb  8 09:44:54 2006
From: svetosch at gmx.net (Sven Schreiber)
Date: Wed Feb  8 09:44:54 2006
Subject: [Numpy-discussion] creating column vectors
In-Reply-To: <20060208162115.205360dd.gerard.vermeulen@grenoble.cnrs.fr>
References: <20060208141052.GB5734@alpha>
	<20060208162115.205360dd.gerard.vermeulen@grenoble.cnrs.fr>
Message-ID: <43EA2E16.906@gmx.net>

Gerard Vermeulen schrieb:

>> mat(arange(1,10)).transpose()
>>
>> works, but seems a bit long-winded (in comparison to MATLAB's [1:10]').
>
> What about this?
>
> arange(1, 10)[:, NewAxis]
>

The numpy-book beats both of us (see my previous post) in terms of
minimal typing overhead by suggesting r_[1:10,'c'], which produces a
matrix type, very nice. Compared to [1:10]', that's quite good
already...

-sven


From stefan at sun.ac.za  Wed Feb  8 12:25:02 2006
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Wed Feb  8 12:25:02 2006
Subject: [Numpy-discussion] creating column vectors
In-Reply-To: <43EA2E16.906@gmx.net>
References: <20060208141052.GB5734@alpha>
	<20060208162115.205360dd.gerard.vermeulen@grenoble.cnrs.fr>
	<43EA2E16.906@gmx.net>
Message-ID: <20060208202658.GC5734@alpha>

On Wed, Feb 08, 2006 at 06:44:54PM +0100, Sven Schreiber wrote:
> Gerard Vermeulen schrieb:
>
> >> mat(arange(1,10)).transpose()
> >>
> >> works, but seems a bit long-winded (in comparison to MATLAB's [1:10]').
> >
> > What about this?
> >
> > arange(1, 10)[:, NewAxis]
> >
>
> The numpy-book beats both of us (see my previous post) in terms of
> minimal typing overhead by suggesting r_[1:10,'c'] which produces a
> matrix type, very nice.

Thanks for your effort, that's exactly what I was looking for! Time to
get hold of that book...
Cheers St?fan From Chris.Barker at noaa.gov Wed Feb 8 14:01:02 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed Feb 8 14:01:02 2006 Subject: [Numpy-discussion] creating column vectors In-Reply-To: <20060208141052.GB5734@alpha> References: <20060208141052.GB5734@alpha> Message-ID: <43EA69CC.3030504@noaa.gov> Stefan van der Walt wrote: > This is probably a silly question, but what is the best way of > creating column vectors? I do this: >>> import numpy as N >>> v = N.arange(10) >>> v array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) >>> v.shape = (-1,1) >>> v array([[0], [1], [2], [3], [4], [5], [6], [7], [8], [9]]) > 'arange' always returns a row vector no, it doesn't, it returns a 1-dimensional vector. > Numpy-masters: Is there a way to set a user- or project-specific config > switch or something like that to always get matrix results when dealing > with 1d and 2d arrays? I think that would make numpy much more > attractive for people like Stefan and me coming from the 2d world. numpy is not a Matlab clone, nor should it be. That's exactly why I use it! Take a little time to get used to it, and you'll become very glad that numpy works the way it does, rather than like Matlab. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From robert.kern at gmail.com Wed Feb 8 14:04:16 2006 From: robert.kern at gmail.com (Robert Kern) Date: Wed Feb 8 14:04:16 2006 Subject: [Numpy-discussion] arange(start, stop, step) and floating point (Ticket #8) Message-ID: <43EA6A34.2000202@gmail.com> Sasha (from the ticket comments): """I would prefer if arange, would do what range does and round step, possibly with a warning for fractional steps. In other words, arange(start, stop, step, dtype) should be an optimized version of array(range(start, stop, step), dtype). If this is not acceptable, I think arange(start,stop,step)[-1] < stop should be an invariant and floating point issues should be properly addressed. """ arange() does allow for fractional steps unlinke range(). You may fix the docstring if you like. However, I don't think it is possible to ensure that invariant in the face of floating point. That's why we have linspace(). -- Robert Kern robert.kern at gmail.com "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From oliphant at ee.byu.edu Wed Feb 8 14:40:19 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed Feb 8 14:40:19 2006 Subject: [Numpy-discussion] newunicode branch started to fix unicode to always be UCS4 Message-ID: <43EA72D5.8090508@ee.byu.edu> I've started a branch on SVN to fix the unicode implementation in NumPy so that internally all unicode arrays use UCS4. When a scalar is obtained it will be the Python unicode scalar and the required conversions (and data-copying) will be done. If anybody would like to help the branch is http://svn.scipy.org/svn/numpy/branches/newunicode -Travis From andorxor at gmx.de Wed Feb 8 14:47:20 2006 From: andorxor at gmx.de (Stephan Tolksdorf) Date: Wed Feb 8 14:47:20 2006 Subject: [Numpy-discussion] Constructing array from generator expression/iterator Message-ID: <43EA7441.4020500@gmx.de> Hi I'm new to Numpy and just stumbled over the following problem in Numpy 0.9.4: array(x**2 for x in range(10)) does not return what one (me) would suspect, i.e. array([x**2 for x in range(10)]) Is this expected behavior? 
Stephan From ndarray at mac.com Wed Feb 8 14:51:10 2006 From: ndarray at mac.com (Sasha) Date: Wed Feb 8 14:51:10 2006 Subject: [Numpy-discussion] arange(start, stop, step) and floating point (Ticket #8) In-Reply-To: <43EA6A34.2000202@gmail.com> References: <43EA6A34.2000202@gmail.com> Message-ID: On 2/8/06, Robert Kern wrote: > ... > arange() does allow for fractional steps unlinke range(). You may fix the > docstring if you like. However, I don't think it is possible to ensure that > invariant in the face of floating point. That's why we have linspace(). There is certainly a way to ensure that arange(..., stop, ...)[-1] < stop in the face of floating point -- just repeat start += step with start in a volatile double variable until it exceeds stop to get the length of the result. There might be an O(1) solution as well, but it may require some assumptions about the floating point unit. In any case, I can do one of the following depending on a vote: 1 (default). Document length=ceil((stop - start)/step) in the arange docstring 2. Change arange to be a fast equivalent of array(range(start, stop, step), dtype). 3. Change arange to ensure that arange(..., stop, ...)[-1] < stop. Please vote on 1-3. -- sasha From oliphant at ee.byu.edu Wed Feb 8 14:59:03 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed Feb 8 14:59:03 2006 Subject: [Numpy-discussion] Constructing array from generator expression/iterator In-Reply-To: <43EA7441.4020500@gmx.de> References: <43EA7441.4020500@gmx.de> Message-ID: <43EA773A.5010200@ee.byu.edu> Stephan Tolksdorf wrote: > Hi > > I'm new to Numpy and just stumbled over the following problem in Numpy > 0.9.4: > > array(x**2 for x in range(10)) > > does not return what one (me) would suspect, i.e. > > array([x**2 for x in range(10)]) > > Is this expected behavior? The array constructor does not current "understand" generators objects. It only understands sequence objects. It could be made to work but is based on code written long before there were generators. So, instead you get a 0-d Object-array containing the generator. Just use list comprehensions instead. -Travis From oliphant at ee.byu.edu Wed Feb 8 15:00:33 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed Feb 8 15:00:33 2006 Subject: [Numpy-discussion] arange(start, stop, step) and floating point (Ticket #8) In-Reply-To: References: <43EA6A34.2000202@gmail.com> Message-ID: <43EA779F.2080302@ee.byu.edu> Sasha wrote: >On 2/8/06, Robert Kern wrote: > > >> ... >>arange() does allow for fractional steps unlinke range(). You may fix the >>docstring if you like. However, I don't think it is possible to ensure that >>invariant in the face of floating point. That's why we have linspace(). >> >> > >There is certainly a way to ensure that arange(..., stop, ...)[-1] < >stop in the face of floating point -- just repeat start += step with >start in a volatile double variable until it exceeds stop to get the >length of the result. There might be an O(1) solution as well, but >it may require some assumptions about the floating point unit. > >In any case, I can do one of the following depending on a vote: > >1 (default). Document length=ceil((stop - start)/step) in the arange docstring > > +5 We can't really do anything else at this point since this behavior has been what is with us for a long time. 
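Spelling out the rule being documented (a Python sketch of the
semantics only; arange itself is implemented in C):

import math

def arange_length(start, stop, step):
    # number of elements arange produces
    return int(math.ceil((stop - start) / step))

# with fractional steps, floating-point rounding in the division can
# add an extra element, so the last value may land on or beyond stop:
print arange_length(0.5, 0.8, 0.1)   # may print 4 rather than 3
# linspace(0.5, 0.8, 4) pins the endpoint by construction instead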
-Travis From ndarray at mac.com Wed Feb 8 15:05:03 2006 From: ndarray at mac.com (Sasha) Date: Wed Feb 8 15:05:03 2006 Subject: [Numpy-discussion] Constructing array from generator expression/iterator In-Reply-To: <43EA7441.4020500@gmx.de> References: <43EA7441.4020500@gmx.de> Message-ID: Array constructor does not support arbitrary iterables. For example: >>> array(iter([1,2,3])) array(, dtype=object) In Numeric, it was not possible to try to iterate throught the object in array constructor because rank-0 arrays were iterable and would lead to infinite recursion. Since this problem was fixed in numpy, I don't see much of a problem in implementing such feature. On 2/8/06, Stephan Tolksdorf wrote: > Hi > > I'm new to Numpy and just stumbled over the following problem in Numpy > 0.9.4: > > array(x**2 for x in range(10)) > > does not return what one (me) would suspect, i.e. > > array([x**2 for x in range(10)]) > > Is this expected behavior? > > Stephan > > > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through log files > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From tim.hochberg at cox.net Wed Feb 8 15:41:31 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Wed Feb 8 15:41:31 2006 Subject: [Numpy-discussion] arange(start, stop, step) and floating point (Ticket #8) In-Reply-To: References: <43EA6A34.2000202@gmail.com> Message-ID: <43EA7FE7.2040902@cox.net> Sasha wrote: >On 2/8/06, Robert Kern wrote: > > >> ... >>arange() does allow for fractional steps unlinke range(). You may fix the >>docstring if you like. However, I don't think it is possible to ensure that >>invariant in the face of floating point. That's why we have linspace(). >> >> > >There is certainly a way to ensure that arange(..., stop, ...)[-1] < >stop in the face of floating point -- just repeat start += step with >start in a volatile double variable until it exceeds stop to get the >length of the result. There might be an O(1) solution as well, but >it may require some assumptions about the floating point unit. > > Isn't that bad numerically? That is, isn't (n*step) much more accurate than (step + step + ....)? It also seems needlessly inefficient; you should be able to to it in at most a few steps: length = (stop - start)/step while length * step < stop: length += 1 while length * step >= stop: length -= 1 Fix errors, convert to C and enjoy. It should normally take only a few tries to get the right N. I see that the internals of range use repeated adding to make the range. I imagine that is why you proposed the repeated adding. I think that results in error that's on the order of length ULP, while multiplying would result in error on the order of 1 ULP. So perhaps we should fix XXX_fill to be more accurate if nothing else. >In any case, I can do one of the following depending on a vote: > >1 (default). Document length=ceil((stop - start)/step) in the arange docstring > > That has the virtue of being easy to explain. >2. Change arange to be a fast equivalent of array(range(start, stop, >step), dtype). > > No thank you. >3. Change arange to ensure that arange(..., stop, ...)[-1] < stop. 
> > I see that Travis has vetoed this in any event, but perhaps we should fix up the fill functions to be more accurate and maybe most of the problem would just magically go away. -tim From ndarray at mac.com Wed Feb 8 16:05:26 2006 From: ndarray at mac.com (Sasha) Date: Wed Feb 8 16:05:26 2006 Subject: [Numpy-discussion] arange(start, stop, step) and floating point (Ticket #8) In-Reply-To: <43EA779F.2080302@ee.byu.edu> References: <43EA6A34.2000202@gmail.com> <43EA779F.2080302@ee.byu.edu> Message-ID: On 2/8/06, Travis Oliphant wrote: > +5 > > We can't really do anything else at this point since this behavior has > been what is with us for a long time. I guess this closes the dispute. I've commited a new docstring to SVN. From wbaxter at gmail.com Wed Feb 8 16:05:29 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Wed Feb 8 16:05:29 2006 Subject: [Numpy-discussion] creating column vectors In-Reply-To: <43EA69CC.3030504@noaa.gov> References: <20060208141052.GB5734@alpha> <43EA69CC.3030504@noaa.gov> Message-ID: On 2/9/06, Christopher Barker wrote: > > > numpy is not a Matlab clone, nor should it be. That's exactly why I use > it! Take a little time to get used to it, and you'll become very glad > that numpy works the way it does, rather than like Matlab. If you can spare the time, I'd love to hear you elaborate on that. What are some specifics that make you say 'thank goodness for numpy!'? If you have some good ones, I'd like to put them up on http://www.scipy.org/NumPy_for_Matlab_Addicts (of course you're more than welcome to cut out the middle man and just post them directly on the wiki there yourself...) --Bill Baxter -------------- next part -------------- An HTML attachment was scrubbed... URL: From svetosch at gmx.net Wed Feb 8 16:12:53 2006 From: svetosch at gmx.net (Sven Schreiber) Date: Wed Feb 8 16:12:53 2006 Subject: [Numpy-discussion] creating column vectors In-Reply-To: <43EA69CC.3030504@noaa.gov> References: <20060208141052.GB5734@alpha> <43EA69CC.3030504@noaa.gov> Message-ID: <43EA87EE.1030808@gmx.net> Christopher Barker schrieb: > > numpy is not a Matlab clone, nor should it be. That's exactly why I use > it! Take a little time to get used to it, and you'll become very glad > that numpy works the way it does, rather than like Matlab. > well, I have taken that time because I was already into python (glue everyting together, you know), but I bet you won't be very successful in the Gauss et al. camp with that marketing slogan... also, your statement does not sound very pythonic; the "you'll get used to it, and trust me, even if you don't understand it now, it's great afterwards"-approach sounds more like the pre-python era (you may insert a language of your choice here ;-) I don't see why numpy cannot preserve the features that are important to you (and which I know nothing about) and at the same time make life more intuitive and easier for 2d-dummies like myself -- in a lot of ways, it's already accomplished, I'd say it just needs the finishing touch. 
cheers, sven From oliphant at ee.byu.edu Wed Feb 8 16:17:31 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed Feb 8 16:17:31 2006 Subject: [Numpy-discussion] arange(start, stop, step) and floating point (Ticket #8) In-Reply-To: <43EA7FE7.2040902@cox.net> References: <43EA6A34.2000202@gmail.com> <43EA7FE7.2040902@cox.net> Message-ID: <43EA8867.5080109@ee.byu.edu> Tim Hochberg wrote: >> > > > I see that Travis has vetoed this in any event, but perhaps we should > fix up the fill functions to be more accurate and maybe most of the > problem would just magically go away. To do something different than arange has always done we need a new function, not change what arange does and thus potentially break lots of code. How do you propose to make the fill funtions more accurate? I'm certainly willing to see improvements there. -Travis From svetosch at gmx.net Wed Feb 8 16:29:06 2006 From: svetosch at gmx.net (Sven Schreiber) Date: Wed Feb 8 16:29:06 2006 Subject: [Numpy-discussion] creating column vectors In-Reply-To: References: <20060208141052.GB5734@alpha> <43EA69CC.3030504@noaa.gov> Message-ID: <43EA8B46.60202@gmx.net> Bill Baxter schrieb: > > If you can spare the time, I'd love to hear you elaborate on that. What > are some specifics that make you say 'thank goodness for numpy!'? If > you have some good ones, I'd like to put them up on > http://www.scipy.org/NumPy_for_Matlab_Addicts (of course you're more > than welcome to cut out the middle man and just post them directly on > the wiki there yourself...) > > --Bill Baxter Just one addition/correction for your page (sorry won't do it myself, all those different wiki engines/syntaxes...): a * b is only element-wise if a and b are not numpy-matrices, afaik that's the main reason why it's so important to know whether you're working with numpy-arrays or with its subclass numpy-matrix. -sven From wbaxter at gmail.com Wed Feb 8 16:45:22 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Wed Feb 8 16:45:22 2006 Subject: [Numpy-discussion] creating column vectors In-Reply-To: <43EA8B46.60202@gmx.net> References: <20060208141052.GB5734@alpha> <43EA69CC.3030504@noaa.gov> <43EA8B46.60202@gmx.net> Message-ID: Thanks. To be honest I wrote that page in the middle of composing my email to Chris just so something would be there when people clicked on the link. :-) I think my 1-minute draft of that chart needs a different organization, because, as you point out, things are different in NumPy depending on whether you have a Matrix or an Array. Maybe it should be a 3-way comparison of Matlab / NumPy Array / NumPy Matrix instead. --bb On 2/9/06, Sven Schreiber wrote: > > Bill Baxter schrieb: > > > > > If you can spare the time, I'd love to hear you elaborate on that. What > > are some specifics that make you say 'thank goodness for numpy!'? If > > you have some good ones, I'd like to put them up on > > http://www.scipy.org/NumPy_for_Matlab_Addicts (of course you're more > > than welcome to cut out the middle man and just post them directly on > > the wiki there yourself...) > > > > --Bill Baxter > > Just one addition/correction for your page (sorry won't do it myself, > all those different wiki engines/syntaxes...): a * b is only > element-wise if a and b are not numpy-matrices, afaik that's the main > reason why it's so important to know whether you're working with > numpy-arrays or with its subclass numpy-matrix. > -sven > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. 
Do you grep through log > files > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Wed Feb 8 17:02:22 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed Feb 8 17:02:22 2006 Subject: [Numpy-discussion] creating column vectors In-Reply-To: <43EA87EE.1030808@gmx.net> References: <20060208141052.GB5734@alpha> <43EA69CC.3030504@noaa.gov> <43EA87EE.1030808@gmx.net> Message-ID: <43EA93C4.7000904@noaa.gov> Sven Schreiber wrote: >> Take a little time to get used to it, and you'll become very glad >> that numpy works the way it does, rather than like Matlab. > I bet you won't be very successful in > the Gauss et al. camp with that marketing slogan... It's not a marketing slogan. It's a suggestion for someone that has already decided to learn Python+Numpy. Whenever you use something new, you shouldn't try to use it the same way that you use a different tool. We say the same thing to people that try to write Python like it's C. > does not sound very pythonic; the "you'll get used to it, and trust me, > even if you don't understand it now, it's great afterwards"-approach > sounds more like the pre-python era (you may insert a language of your > choice here ;-) The difference is that you really will like it better, not just get used to it. > I don't see why numpy cannot preserve the features that are important to > you (and which I know nothing about) and at the same time make life more > intuitive and easier for 2d-dummies like myself -- Because a matrix is not the same as an array. A matrix can be represented by a 2-d matrix, but a matrix can not represent an arbitrary n-d array (at least not easily!). If you're really doing a lot of linear algebra, then you want to use the matrix package. I haven't used it, but it should have a way to easily create a column vector for you. Python (and NumPy) is a much more powerful and flexible language than Matlab (Or gauss, or IDL, or...) Once you learn to use it, you will be happy you did. I was a major Matlab fan a while back. I spend 5 years in grad school using it, and did my entire dissertation with it. I've recently been helping a friend with some Matlab code, and I find it painful to use. You'll see. Or was that too smug? -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From ndarray at mac.com Wed Feb 8 20:10:02 2006 From: ndarray at mac.com (Sasha) Date: Wed Feb 8 20:10:02 2006 Subject: [Numpy-discussion] arange(start, stop, step) and floating point (Ticket #8) In-Reply-To: <43EA7FE7.2040902@cox.net> References: <43EA6A34.2000202@gmail.com> <43EA7FE7.2040902@cox.net> Message-ID: On 2/8/06, Tim Hochberg wrote: > Sasha wrote: > Isn't that bad numerically? That is, isn't (n*step) much more accurate > than (step + step + ....)? It does not matter whether n*step is more accurate than step+...+step. As long as arange uses stop+=step loop to fill in the values, the last element may exceed stop even if start + length*step does not. 
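A small illustration of that failure mode (values are for typical
IEEE-754 doubles):

start, stop, step = 0.0, 1.0, 0.1
acc = start
count = 0
while acc < stop:          # the stop += step style of loop
    acc += step
    count += 1
# after ten additions acc is 0.9999999999999999 < 1.0, so an
# eleventh element sneaks in even though 0.0 + 10*0.1 == 1.0
print count                # 11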
One may argue that filling with start + i*step is more accurate, but that will probably be much slower (even than my O(N) algorithm). > It also seems needlessly inefficient; I proposed O(N) algorithm just to counter Robert's argument that it is not possible to ensure the invariant. On the other hand I don't think it is that bad - I would expect the length computing loop to be much faster than the main loop that involves main memory. > you > should be able to to it in at most a few steps: > > length = (stop - start)/step > while length * step < stop: > length += 1 > while length * step >= stop: > length -= 1 > > Fix errors, convert to C and enjoy. It should normally take only a few > tries to get the right N. This will not work (even if you fix the error of missing start+ in the conditions :-): start + length*step < stop does not guarantee than start + step + ... + step < stop. > I see that the internals of range use repeated adding to make the range. > I imagine that is why you proposed the repeated adding. I think that > results in error that's on the order of length ULP, while multiplying > would result in error on the order of 1 ULP. So perhaps we should fix > XXX_fill to be more accurate if nothing else. > I don't think accuracy of XXX_fill for fractional steps is worth improving. In the cases where accuracy matters, one can always use integral step and multiply the result by a float. However, if anything is done to that end, I would suggest to generalize XXX_fill functions to allow accumulation be performed using a different type similarly to the way op.reduce and op.accumulate functions us their (new in numpy) dtype argument. > >3. Change arange to ensure that arange(..., stop, ...)[-1] < stop. > > > I see that Travis has vetoed this in any event, but perhaps we should > fix up the fill functions to be more accurate and maybe most of the > problem would just magically go away. The more I think about this, the more I am convinced that using arange with a non-integer step is a bad idea. Since making it illegal is not an option, I don't see much of a point in changing exactly how bad it is. Users who want fractional steps should just be educated about linspace. From tim.hochberg at cox.net Wed Feb 8 21:01:01 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Wed Feb 8 21:01:01 2006 Subject: [Numpy-discussion] arange(start, stop, step) and floating point (Ticket #8) In-Reply-To: References: <43EA6A34.2000202@gmail.com> <43EA7FE7.2040902@cox.net> Message-ID: <43EACC42.6030102@cox.net> Sasha wrote: >On 2/8/06, Tim Hochberg wrote: > > >>Sasha wrote: >>Isn't that bad numerically? That is, isn't (n*step) much more accurate >>than (step + step + ....)? >> >> > >It does not matter whether n*step is more accurate than step+...+step. > As long as arange uses stop+=step loop to fill in the values, the >last element may exceed stop even if start + length*step does not. >One may argue that filling with start + i*step is more accurate, but >that will probably be much slower (even than my O(N) algorithm). > > > >>It also seems needlessly inefficient; >> >> >I proposed O(N) algorithm just to counter Robert's argument that it is >not possible to ensure the invariant. On the other hand I don't think >it is that bad - I would expect the length computing loop to be much >faster than the main loop that involves main memory. 
> > > >>you >>should be able to to it in at most a few steps: >> >>length = (stop - start)/step >>while length * step < stop: >> length += 1 >>while length * step >= stop: >> length -= 1 >> >>Fix errors, convert to C and enjoy. It should normally take only a few >>tries to get the right N. >> >> > >This will not work (even if you fix the error of missing start+ in the >conditions :-): start + length*step < stop does not guarantee than >start + step + ... + step < stop. > > Indeed. When I first was thinking about this, I assumed that arange was computed as essentially start + range(0, n)*step. Not, as an accumulation. After I actually looked at what arange did, I failed to update my thinking -- my mistake, sorry. > > >>I see that the internals of range use repeated adding to make the range. >>I imagine that is why you proposed the repeated adding. I think that >>results in error that's on the order of length ULP, while multiplying >>would result in error on the order of 1 ULP. So perhaps we should fix >>XXX_fill to be more accurate if nothing else. >> >> >> > >I don't think accuracy of XXX_fill for fractional steps is worth improving. > > I would think that would depend on (a) how hard it is to do (I think the answer to that is not hard at all), (b) how much of a performance impact would it have (some, probably, since one's adding a multiply, and (c) how much one values the minor increase in accuracy versus whatever performance impact this might have. The change I was referring to would look more or less like: static void FLOAT_fill(float *buffer, intp length, void *ignored) { intp i; float start = buffer[0]; float delta = buffer[1]; delta -= start; /*start += (delta + delta); */ buffer += 2; for (i=2; iIn the cases where accuracy matters, one can always use integral step >and multiply the result by a float. However, if anything is done to >that end, I would suggest to generalize XXX_fill functions to allow >accumulation be performed using a different type similarly to the way >op.reduce and op.accumulate functions us their (new in numpy) dtype >argument. > > Really? That seems unnecessarily baroque. >>>3. Change arange to ensure that arange(..., stop, ...)[-1] < stop. >>> >>> >>> >>I see that Travis has vetoed this in any event, but perhaps we should >>fix up the fill functions to be more accurate and maybe most of the >>problem would just magically go away. >> >> > >The more I think about this, the more I am convinced that using arange >with a non-integer step is a bad idea. Since making it illegal is not >an option, I don't see much of a point in changing exactly how bad it >is. Users who want fractional steps should just be educated about >linspace. > > Are integer steps with noninteger start and stop safe? For that matter are integer steps safe for sufficiently large, floating point, but integral, values of start and stop. It seems like they might well not be, but I haven't thought it through very well. I suppose that even if this was technically unsafe, in practice it would probably be pretty hard to get into trouble in that way. 
Regards, -tim From oliphant.travis at ieee.org Wed Feb 8 21:37:03 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Wed Feb 8 21:37:03 2006 Subject: [Numpy-discussion] newunicode branch started to fix unicode to always be UCS4 In-Reply-To: <43EA72D5.8090508@ee.byu.edu> References: <43EA72D5.8090508@ee.byu.edu> Message-ID: <43EAD4D1.800@ieee.org> Travis Oliphant wrote: > > I've started a branch on SVN to fix the unicode implementation in > NumPy so that internally all unicode arrays use UCS4. When a scalar > is obtained it will be the Python unicode scalar and the required > conversions (and data-copying) will be done. > If anybody would like to help the branch is > Well, it turned out not to be too difficult. It is done. All Unicode arrays are now always 4-bytes-per character in NumPy. The length is specified in terms of characters (not bytes). This is different than other types, but it's consistent with the use of Unicode as characters. The array-scalar that a unicode array produces inherits directly from Python unicode type which has either 2 or 4 bytes depending on the build. On narrow builds where Python unicode is only 2-bytes, the 4-byte unicode is converted to 2-byte using surrogate pairs. There may be lingering bugs of course, so please try it out and report problems. -Travis From wbaxter at gmail.com Thu Feb 9 00:22:03 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Thu Feb 9 00:22:03 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki Message-ID: I added some content to the "NumPy/SciPy for Matlab users" page on the scipy wiki. But my knowledge of NumPy/SciPy isn't sufficient to fill in the whole chart of equivalents that I laid out. If folks who know both could browse by and maybe fill in a blank or two, that would be great. I think this will be a helpful "getting started" page for newbies to NumPy coming from matlab, like me. One of the most frustrating things is when you sit down and can't figure out how to do the most basic things that do in your sleep in another environment (like making a column vector). So hopefully this page will help. The URL is : http://www.scipy.org/Wiki/NumPy_for_Matlab_Addicts Thanks, Bill Baxter -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Thu Feb 9 01:15:00 2006 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu Feb 9 01:15:00 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: Message-ID: <1e2af89e0602090046g55f80ecdi26b24afc5dbe8a1d@mail.gmail.com> Hi Bill, On 2/9/06, Bill Baxter wrote: > I added some content to the "NumPy/SciPy for Matlab users" page on the scipy > wiki. Thanks a lot for doing this. Did you see this excellent reference? Maybe it would be useful to combine effort in some way? http://www.37mm.no/matlab-python-xref.html Best, Matthew From dd55 at cornell.edu Thu Feb 9 04:05:16 2006 From: dd55 at cornell.edu (Darren Dale) Date: Thu Feb 9 04:05:16 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: Message-ID: <200602090703.45993.dd55@cornell.edu> On Thursday 09 February 2006 3:21 am, Bill Baxter wrote: > I added some content to the "NumPy/SciPy for Matlab users" page on the > scipy wiki. > > But my knowledge of NumPy/SciPy isn't sufficient to fill in the whole chart > of equivalents that I laid out. > If folks who know both could browse by and maybe fill in a blank or two, > that would be great. 
I think this will be a helpful "getting started" page > for newbies to NumPy coming from matlab, like me. One of the most > frustrating things is when you sit down and can't figure out how to do the > most basic things that do in your sleep in another environment (like making > a column vector). So hopefully this page will help. > > The URL is : http://www.scipy.org/Wiki/NumPy_for_Matlab_Addicts I filled in a couple of places, where I could. I have a question about upcasting related to this example: a with elements less than 0.5 zeroed out: Matlab: a .* (a>0.5) NumPy: where(a<0.5, 0, a) I think numpy should be able to do a*a>0.5 as well, but instead one must do: a*(a>0.5).astype('i') Is it desireable to upcast bools in this case? I think so, one could always recover the current behavior by doing: (a*(a>0.5)).astype('?') Darren From martin.wiechert at gmx.de Thu Feb 9 04:14:28 2006 From: martin.wiechert at gmx.de (Martin Wiechert) Date: Thu Feb 9 04:14:28 2006 Subject: [Numpy-discussion] segfault when calling PyArray_DescrFromType Message-ID: <200602091304.59062.martin.wiechert@gmx.de> Hi list, I'm trying to build an C extension, which uses arrays. It builds, and I can import it from python, but the very first call to a numpy function ea = (PyObject *) PyArray_DescrFromType (PyArray_INT); gives me a segfault. I have absolutely no clue, but nm -l mymodule.so | grep rray gives 000026a0 b PyArray_API /usr/lib/python2.4/site-packages/numpy/core/include/numpy/__multiarray_api.h:316 and this line reads static void **PyArray_API=NULL; which looks suspicious to me. Something wrong with my setup.py? Any suggestions? Regards, Martin. From dd55 at cornell.edu Thu Feb 9 04:24:01 2006 From: dd55 at cornell.edu (Darren Dale) Date: Thu Feb 9 04:24:01 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: <200602090703.45993.dd55@cornell.edu> References: <200602090703.45993.dd55@cornell.edu> Message-ID: <200602090723.03522.dd55@cornell.edu> On Thursday 09 February 2006 7:03 am, Darren Dale wrote: > I have a question about upcasting related to this example: > > a with elements less than 0.5 zeroed out: > Matlab: a .* (a>0.5) > NumPy: where(a<0.5, 0, a) > > I think numpy should be able to do a*a>0.5 as well, but instead one must > do: a*(a>0.5).astype('i') > > Is it desireable to upcast bools in this case? I think so, one could always > recover the current behavior by doing: > (a*(a>0.5)).astype('?') oops: I should have been doing a*(a>0.5), the order of operations is important. My mistake. From gruben at bigpond.net.au Thu Feb 9 04:27:06 2006 From: gruben at bigpond.net.au (Gary Ruben) Date: Thu Feb 9 04:27:06 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: <1e2af89e0602090046g55f80ecdi26b24afc5dbe8a1d@mail.gmail.com> References: <1e2af89e0602090046g55f80ecdi26b24afc5dbe8a1d@mail.gmail.com> Message-ID: <43EB34F8.6030006@bigpond.net.au> Vidar's documentation is under a GNU Free Documentation License. This is probably a problem with incorporating it directly into the scipy site, although Vidar was at one point happy to incorporate the MATLAB parts into Perry Greenfield and Robert Jedrzejewski's interactive data analysis tutorial. This tutorial used to be on the numarray page but has now disappeared and hasn't quite found it's way onto the scipy site, although it may just be due to a broken link. I've added a link to Vidar's site to the wiki. Gary R. 
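Picking up Darren's masking example from above, the two spellings
compare like this (a sketch, with a made-up sample array):

import numpy
a = numpy.array([0.2, 0.7, 0.4, 0.9])
b1 = numpy.where(a < 0.5, 0, a)   # zero out the small elements
b2 = a * (a > 0.5)                # boolean mask upcast in the product
# both give [ 0.   0.7  0.   0.9]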
Matthew Brett wrote:
> Hi Bill,
>
> On 2/9/06, Bill Baxter wrote:
>> I added some content to the "NumPy/SciPy for Matlab users" page on the scipy
>> wiki.
>
> Thanks a lot for doing this. Did you see this excellent reference?
> Maybe it would be useful to combine effort in some way?
>
> http://www.37mm.no/matlab-python-xref.html
>
> Best,
>
> Matthew


From faltet at carabos.com  Thu Feb  9 04:50:03 2006
From: faltet at carabos.com (Francesc Altet)
Date: Thu Feb  9 04:50:03 2006
Subject: [Numpy-discussion] newunicode branch started to fix unicode to
	always be UCS4
In-Reply-To: <43EAD4D1.800@ieee.org>
References: <43EA72D5.8090508@ee.byu.edu> <43EAD4D1.800@ieee.org>
Message-ID: <200602091349.47672.faltet@carabos.com>

On Thursday 09 February 2006 06:36, Travis Oliphant wrote:
> Travis Oliphant wrote:
> > I've started a branch on SVN to fix the unicode implementation in
> > NumPy so that internally all unicode arrays use UCS4.  When a scalar
> > is obtained it will be the Python unicode scalar and the required
> > conversions (and data-copying) will be done.
> > If anybody would like to help the branch is
>
> Well, it turned out not to be too difficult.  It is done.

Oh my! If I hadn't met you in person I would tend to think that you are
not human ;-)

> All Unicode
> arrays are now always 4-bytes-per character in NumPy.  The length is
> specified in terms of characters (not bytes).  This is different than
> other types, but it's consistent with the use of Unicode as characters.

Yes, I think this is a good idea.

> The array-scalar that a unicode array produces inherits directly from
> Python unicode type which has either 2 or 4 bytes depending on the build.
>
> On narrow builds where Python unicode is only 2-bytes, the 4-byte
> unicode is converted to 2-byte using surrogate pairs.

Very good!

> There may be lingering bugs of course, so please try it out and report
> problems.

Well, I've tried it for a while and it seems to me that you did a very
good job! Just a little thing:

# Using an UCS4 interpreter here
>>> len(buffer(numpy.array("qsds", 'U4')[()]))
16
>>> numpy.array("qsds", 'U4')[()].dtype
dtype('<U4')
>>> len(buffer(numpy.array("qsds", 'U3')[()]))
12
>>> numpy.array("qsds", 'U3')[()].dtype
dtype('<U3')

# Using an UCS2 interpreter here
>>> len(buffer(numpy.array("qsds", 'U4')[()]))
8    # Fine
>>> numpy.array("qsds", 'U4')[()].dtype
dtype('<U4')
>>> len(buffer(numpy.array("qsds", 'U3')[()]))
6    # Fine
>>> numpy.array("qsds", 'U3')[()].dtype
dtype('<U3')

[...]

-- 
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"


From bsouthey at gmail.com  Thu Feb  9 05:57:04 2006
From: bsouthey at gmail.com (Bruce Southey)
Date: Thu Feb  9 05:57:04 2006
Subject: [Numpy-discussion] Matlab page on scipy wiki
In-Reply-To:
References:
Message-ID:

Hi,

The example of ndim to give the rank is not the same as the Matlab
function rank(a). See http://en.wikipedia.org/wiki/Rank_of_a_matrix for
the definition of rank that most people coming from Matlab would
expect, which is what rank(a) provides. I have not used the latest
numpy, but the equivalent function is not present in numarray/Numeric
(to my knowledge), so you have to compute it some other way, e.g. using
svd.

Regards
Bruce

On 2/9/06, Bill Baxter wrote:
> I added some content to the "NumPy/SciPy for Matlab users" page on the scipy wiki.
>
> But my knowledge of NumPy/SciPy isn't sufficient to fill in the whole chart of equivalents that I laid out.
> If folks who know both could browse by and maybe fill in a blank or two, that would be great.
I think this will be a helpful "getting started" page for newbies to NumPy coming from matlab, like me. One of the most frustrating things is when you sit down and can't figure out how to do the most basic things that do in your sleep in another environment (like making a column vector). So hopefully this page will help. > > The URL is : http://www.scipy.org/Wiki/NumPy_for_Matlab_Addicts > > Thanks, > Bill Baxter > From aisaac at american.edu Thu Feb 9 06:39:11 2006 From: aisaac at american.edu (Alan G Isaac) Date: Thu Feb 9 06:39:11 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: Message-ID: On Thu, 9 Feb 2006, Bruce Southey apparently wrote: > The example of ndim to give the rank is not the same as the Matlab > function rank(a). See > http://en.wikipedia.org/wiki/Rank_of_a_matrix for definition of rank > that I would think that most people would use if they use Matlab Coming from GAUSS and having never studies tensors, I was also surprised by the 'rank' terminology. I believe this is why Travis changed to ndim, which is less likely to confuse users coming from a linear algebra perspective. Unfortunately the SciPy book currently uses the term 'rank' in the two conflicting ways. (It uses 'rank' in the linear algebra sense only in the discussion of lstsq on p.145.) It might be helpful for the tensor sense to always be qualified as 'tensor rank'? Cheers, Alan Isaac From pau.gargallo at gmail.com Thu Feb 9 06:44:01 2006 From: pau.gargallo at gmail.com (Pau Gargallo) Date: Thu Feb 9 06:44:01 2006 Subject: [Numpy-discussion] ufuncs question Message-ID: <6ef8f3380602090643m382c0b46ndf025e39f67894c0@mail.gmail.com> hi all, i have a code the following code: def foo(x): '''takes a nd-array x and return another md-array''' do something return a md-array A = an array of nd-arrays #A has 1+n dimensions B = an array of md-arrays #B has 1+m dimensions for i in len(A): B[i] = foo(A[i]) and was wandering if there is an easy way to speed it up. I guess that something using ufuncs could be used (?). Something like B = ufunced_foo( A ). thanks in advance, pau From martin.wiechert at gmx.de Thu Feb 9 07:01:31 2006 From: martin.wiechert at gmx.de (Martin Wiechert) Date: Thu Feb 9 07:01:31 2006 Subject: [Numpy-discussion] Re: [SciPy-user] segfault when calling PyArray_DescrFromType In-Reply-To: <200602091141.51520.martin.wiechert@gmx.de> References: <200602091141.51520.martin.wiechert@gmx.de> Message-ID: <200602091552.11896.martin.wiechert@gmx.de> Found it (in the "old" docs). Must #define PY_ARRAY_UNIQUE_SYMBOL and call import_array (). Sorry to bother. Martin. On Thursday 09 February 2006 11:41, Martin Wiechert wrote: > Hi list, > > I'm trying to build an C extension, which uses arrays. It builds, and I can > import it from python, but the very first call to a numpy function > > ea = (PyObject *) PyArray_DescrFromType (PyArray_INT); > > gives me a segfault. > > I have absolutely no clue, but > > nm -l mymodule.so | grep rray > > gives > > 000026a0 b > PyArray_API > /usr/lib/python2.4/site-packages/numpy/core/include/numpy/__multiarray_api. >h:316 > > and this line reads > > static void **PyArray_API=NULL; > > which looks suspicious to me. Something wrong with my setup.py? > > Any suggestions? > > Regards, Martin. 
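For completeness, the svd route Bruce mentions can be sketched as
follows (assuming the usual (u, s, vh) return from numpy.linalg.svd;
the tolerance choice is up to the user):

import numpy

def matrix_rank(a, tol=1e-10):
    # linear-algebra rank: count the numerically nonzero singular values
    u, s, vh = numpy.linalg.svd(a)
    return int((s > tol * s[0]).sum())

# matrix_rank(numpy.array([[1., 2.], [2., 4.]])) -> 1, the rows
# being linearly dependent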
> > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.net > http://www.scipy.net/mailman/listinfo/scipy-user From oliphant.travis at ieee.org Thu Feb 9 08:11:04 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Feb 9 08:11:04 2006 Subject: [Numpy-discussion] Re: ***[Possible UCE]*** Re: ***[Possible UCE]*** Re: [Numpy-discussion] Is there a known problem compiling numpy with VC7? Success In-Reply-To: <43EA3DE0.1070608@cox.net> References: <43E41E52.6060805@cox.net> <43E42B26.2090304@ee.byu.edu> <43E42C2C.6000208@cox.net> <43E45A95.6040601@ieee.org> <43E81650.2040204@cox.net> <43E81BFA.7060600@ieee.org> <43E82930.7070103@cox.net> <43E83FD3.90206@ieee.org> <43E8EE72.6070101@cox.net> <43E8EFE9.5040207@ieee.org> <43E9235E.70004@cox.net> <43E94090.5080609@ieee.org> <43EA3DE0.1070608@cox.net> Message-ID: <43EB695D.9050501@ieee.org> Tim Hochberg wrote: > I'm attaching the two modified setup files. The first is > numpy/core/setup.py and the second is numpy/random/setup.py. I tried > to keep the modifications as minimal as possible. With these two setup > files, and adding M_PI to numpy\random\mtrand\distributions.c, numpy > compiles fine and passes all tests except for the test_minrelpath path > I mentioned in my last message. I'm trying to incorporate your changes.

1) M_PI was easy to fix.

2) In the core/setup.py file you sent, you add the line:

python_libs = join(distutils.sysconfig.EXEC_PREFIX, 'libs')

I'm not sure what this is supposed to do. What problem does it fix on your system? It makes no sense on mine as this becomes

python_libs = '/usr/libs'

which is not a directory.

3) For the setup.py file in random you are using Advapi for all win32 platforms. But, this seems to be a windows NT file or at least only needed when compiling with certain compilers. Mingw32 built just fine without it. So, I'm not sure how to handle this. Suggestions?
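A minimal sketch (not from the thread) of why point 2) looks different on the two platforms; the printed paths are examples:

import os
import distutils.sysconfig

# On Windows the import library python24.lib lives in <EXEC_PREFIX>\libs,
# so this join points at a real directory; on Linux the directory
# typically does not exist, which is the '/usr/libs' result noted above.
python_libs = os.path.join(distutils.sysconfig.EXEC_PREFIX, 'libs')
print python_libs                 # e.g. C:\Python24\libs or /usr/libs
print os.path.isdir(python_libs)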
-Travis > > -tim > >------------------------------------------------------------------------ > > >import imp >import os >import sys >import distutils.sysconfig >from os.path import join >from glob import glob >from distutils.dep_util import newer,newer_group > >def configuration(parent_package='',top_path=None): > from numpy.distutils.misc_util import Configuration,dot_join > from numpy.distutils.system_info import get_info > > config = Configuration('core',parent_package,top_path) > local_dir = config.local_path > codegen_dir = join(local_dir,'code_generators') > > generate_umath_py = join(codegen_dir,'generate_umath.py') > n = dot_join(config.name,'generate_umath') > generate_umath = imp.load_module('_'.join(n.split('.')), > open(generate_umath_py,'U'),generate_umath_py, > ('.py','U',1)) > > header_dir = join(*(config.name.split('.')+['include','numpy'])) > > def generate_config_h(ext, build_dir): > target = join(build_dir,'config.h') > if newer(__file__,target): > config_cmd = config.get_config_cmd() > print 'Generating',target > # > tc = generate_testcode(target) > from distutils import sysconfig > python_include = sysconfig.get_python_inc() > python_libs = join(distutils.sysconfig.EXEC_PREFIX, 'libs') > result = config_cmd.try_run(tc,include_dirs=[python_include], library_dirs=[python_libs]) > if not result: > raise "ERROR: Failed to test configuration" > moredefs = [] > > # > mathlibs = [] > tc = testcode_mathlib() > mathlibs_choices = [[],['m'],['cpml']] > mathlib = os.environ.get('MATHLIB') > if mathlib: > mathlibs_choices.insert(0,mathlib.split(',')) > for libs in mathlibs_choices: > if config_cmd.try_run(tc,libraries=libs): > mathlibs = libs > break > else: > raise "math library missing; rerun setup.py after setting the MATHLIB env variable" > ext.libraries.extend(mathlibs) > moredefs.append(('MATHLIB',','.join(mathlibs))) > > libs = mathlibs > kws_args = {'libraries':libs,'decl':0,'headers':['math.h']} > if config_cmd.check_func('expl', **kws_args): > moredefs.append('HAVE_LONGDOUBLE_FUNCS') > if config_cmd.check_func('expf', **kws_args): > moredefs.append('HAVE_FLOAT_FUNCS') > if config_cmd.check_func('log1p', **kws_args): > moredefs.append('HAVE_LOG1P') > if config_cmd.check_func('expm1', **kws_args): > moredefs.append('HAVE_EXPM1') > if config_cmd.check_func('asinh', **kws_args): > moredefs.append('HAVE_INVERSE_HYPERBOLIC') > if config_cmd.check_func('atanhf', **kws_args): > moredefs.append('HAVE_INVERSE_HYPERBOLIC_FLOAT') > if config_cmd.check_func('atanhl', **kws_args): > moredefs.append('HAVE_INVERSE_HYPERBOLIC_LONGDOUBLE') > if config_cmd.check_func('isnan', **kws_args): > moredefs.append('HAVE_ISNAN') > if config_cmd.check_func('isinf', **kws_args): > moredefs.append('HAVE_ISINF') > > if sys.version[:3] < '2.4': > kws_args['headers'].append('stdlib.h') > if config_cmd.check_func('strtod', **kws_args): > moredefs.append(('PyOS_ascii_strtod', 'strtod')) > > if moredefs: > target_f = open(target,'a') > for d in moredefs: > if isinstance(d,str): > target_f.write('#define %s\n' % (d)) > else: > target_f.write('#define %s %s\n' % (d[0],d[1])) > target_f.close() > else: > mathlibs = [] > target_f = open(target) > for line in target_f.readlines(): > s = '#define MATHLIB' > if line.startswith(s): > value = line[len(s):].strip() > if value: > mathlibs.extend(value.split(',')) > target_f.close() > > ext.libraries.extend(mathlibs) > > incl_dir = os.path.dirname(target) > if incl_dir not in config.numpy_include_dirs: > config.numpy_include_dirs.append(incl_dir) > > 
config.add_data_files((header_dir,target)) > return target > > def generate_api_func(header_file, module_name): > def generate_api(ext,build_dir): > target = join(build_dir, header_file) > script = join(codegen_dir, module_name + '.py') > if newer(script, target): > sys.path.insert(0, codegen_dir) > try: > m = __import__(module_name) > print 'executing',script > m.generate_api(build_dir) > finally: > del sys.path[0] > config.add_data_files((header_dir,target)) > return target > return generate_api > > generate_array_api = generate_api_func('__multiarray_api.h', > 'generate_array_api') > generate_ufunc_api = generate_api_func('__ufunc_api.h', > 'generate_ufunc_api') > > def generate_umath_c(ext,build_dir): > target = join(build_dir,'__umath_generated.c') > script = generate_umath_py > if newer(script,target): > f = open(target,'w') > f.write(generate_umath.make_code(generate_umath.defdict, > generate_umath.__file__)) > f.close() > return [] > > config.add_data_files(join('include','numpy','*.h')) > config.add_include_dirs('src') > > config.numpy_include_dirs.extend(config.paths('include')) > > deps = [join('src','arrayobject.c'), > join('src','arraymethods.c'), > join('src','scalartypes.inc.src'), > join('src','arraytypes.inc.src'), > join('src','_signbit.c'), > join('src','_isnan.c'), > join('include','numpy','*object.h'), > join(codegen_dir,'genapi.py'), > join(codegen_dir,'*.txt') > ] > > config.add_extension('multiarray', > sources = [join('src','multiarraymodule.c'), > generate_config_h, > generate_array_api, > join('src','scalartypes.inc.src'), > join('src','arraytypes.inc.src'), > join(codegen_dir,'generate_array_api.py'), > join('*.py') > ], > depends = deps, > ) > > config.add_extension('umath', > sources = [generate_config_h, > join('src','umathmodule.c.src'), > generate_umath_c, > generate_ufunc_api, > join('src','scalartypes.inc.src'), > join('src','arraytypes.inc.src'), > ], > depends = [join('src','ufuncobject.c'), > generate_umath_py, > join(codegen_dir,'generate_ufunc_api.py'), > ]+deps, > ) > > config.add_extension('_sort', > sources=[join('src','_sortmodule.c.src'), > generate_config_h, > generate_array_api, > ], > ) > > # Configure blasdot > blas_info = get_info('blas_opt',0) > #blas_info = {} > def get_dotblas_sources(ext, build_dir): > if blas_info: > return ext.depends[:1] > return None # no extension module will be built > > config.add_extension('_dotblas', > sources = [get_dotblas_sources], > depends=[join('blasdot','_dotblas.c'), > join('blasdot','cblas.h'), > ], > include_dirs = ['blasdot'], > extra_info = blas_info > ) > > > config.add_data_dir('tests') > config.make_svn_version_py() > > return config > >def testcode_mathlib(): > return """\ >/* check whether libm is broken */ >#include <math.h> >int main(int argc, char *argv[]) >{ > return exp(-720.)
> 1.0; /* typically an IEEE denormal */ >} >""" > >import sys >def generate_testcode(target): > if sys.platform == 'win32': > target = target.replace('\\','\\\\') > testcode = [r''' >#include <Python.h> >#include <limits.h> >#include <stdio.h> > >int main(int argc, char **argv) >{ > > FILE *fp; > > fp = fopen("'''+target+'''","w"); > '''] > > c_size_test = r''' >#ifndef %(sz)s > fprintf(fp,"#define %(sz)s %%d\n", sizeof(%(type)s)); >#else > fprintf(fp,"/* #define %(sz)s %%d */\n", %(sz)s); >#endif >''' > for sz, t in [('SIZEOF_SHORT', 'short'), > ('SIZEOF_INT', 'int'), > ('SIZEOF_LONG', 'long'), > ('SIZEOF_FLOAT', 'float'), > ('SIZEOF_DOUBLE', 'double'), > ('SIZEOF_LONG_DOUBLE', 'long double'), > ('SIZEOF_PY_INTPTR_T', 'Py_intptr_t'), > ]: > testcode.append(c_size_test % {'sz' : sz, 'type' : t}) > > testcode.append('#ifdef PY_LONG_LONG') > testcode.append(c_size_test % {'sz' : 'SIZEOF_LONG_LONG', > 'type' : 'PY_LONG_LONG'}) > testcode.append(c_size_test % {'sz' : 'SIZEOF_PY_LONG_LONG', > 'type' : 'PY_LONG_LONG'}) > > > testcode.append(r''' >#else > fprintf(fp, "/* PY_LONG_LONG not defined */\n"); >#endif >#ifndef CHAR_BIT > { > unsigned char var = 2; > int i=0; > while (var >= 2) { > var = var << 1; > i++; > } > fprintf(fp,"#define CHAR_BIT %d\n", i+1); > } >#else > fprintf(fp, "/* #define CHAR_BIT %d */\n", CHAR_BIT); >#endif > fclose(fp); > return 0; >} >''') > testcode = '\n'.join(testcode) > return testcode > >if __name__=='__main__': > from numpy.distutils.core import setup > setup(**configuration(top_path='').todict()) > > >------------------------------------------------------------------------ > >import sys >from os.path import join > >def configuration(parent_package='',top_path=None): > from numpy.distutils.misc_util import Configuration > config = Configuration('random',parent_package,top_path) > > # Configure mtrand > # Note that I'm mimicking the original behaviour of always using 'm' for > # the math library. This should probably use the logic from numpy/core/setup.py > # to choose the math libraries, but I'm going for minimal changes -- TAH > if sys.platform == "win32": > libraries = ['Advapi32'] > else: > libraries = ['m'] > config.add_extension('mtrand', > sources=[join('mtrand', x) for x in > ['mtrand.c', 'randomkit.c', 'initarray.c', > 'distributions.c']], > libraries=libraries, > depends = [join('mtrand','*.h'), > join('mtrand','*.pyx'), > join('mtrand','*.pxi'), > ] > ) > > return config > >if __name__ == '__main__': > from numpy.distutils.core import setup > setup(**configuration(top_path='').todict()) > > From tim.hochberg at cox.net Thu Feb 9 08:30:10 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Thu Feb 9 08:30:10 2006 Subject: [Numpy-discussion] Is there a known problem compiling numpy with VC7? Success In-Reply-To: <43EB695D.9050501@ieee.org> References: <43E41E52.6060805@cox.net> <43E42B26.2090304@ee.byu.edu> <43E42C2C.6000208@cox.net> <43E45A95.6040601@ieee.org> <43E81650.2040204@cox.net> <43E81BFA.7060600@ieee.org> <43E82930.7070103@cox.net> <43E83FD3.90206@ieee.org> <43E8EE72.6070101@cox.net> <43E8EFE9.5040207@ieee.org> <43E9235E.70004@cox.net> <43E94090.5080609@ieee.org> <43EA3DE0.1070608@cox.net> <43EB695D.9050501@ieee.org> Message-ID: <43EB6DF5.6010705@cox.net> Travis Oliphant wrote: > Tim Hochberg wrote: > >> I'm attaching the two modified setup files. The first is >> numpy/core/setup.py and the second is numpy/random/setup.py. I tried >> to keep the modifications as minimal as possible.
With these two >> setup files, and adding M_PI to numpy\random\mtrand\distributions.c, >> numpy compiles fine and passes all tests except for the >> test_minrelpath path I mentioned in my last message. > > > > I'm trying to incorporate your changes. Great. > 1) M_PI was easy to fix. > > 2) In the core/setup.py file you sent, you add the line: > > python_libs = join(distutils.sysconfig.EXEC_PREFIX, 'libs') > > I'm not sure what this is supposed to do. What problem does it fix on > your system? It makes no sense on mine as this becomes > > python_libs = '/usr/libs' > > which is not a directory. OK, we'll have to work out something that works for both. The issue here on windows is that compiling the testcode requires python.lib, and it doesn't get found unless that directory is specified. The problem is perhaps related to the following comment in system_info.py

if sys.platform == 'win32':        # line 116
    default_lib_dirs = ['C:\\']    # probably not very helpful...

In any event, it does seem like there should be a better way to find where python.lib lives, but I couldn't find it in my perusal of the distutils docs. > > 3) For the setup.py file in random you are using Advapi for all win32 > platforms. But, this seems to be a windows NT file I'm compiling on XP FWIW. > or at least only needed when compiling with certain compilers. > Mingw32 built just fine without it. So, I'm not sure how to handle > this. My guess, and it's only a guess because I use neither mingw nor the windows crypto stuff, is that defines are set differently by mingw so that the parts that need that library are not being compiled when you use mingw. The code in question is all guarded by:

#ifdef _WIN32
#ifndef RK_NO_WINCRYPT

As far as I can tell, RK_NO_WINCRYPT never gets defined anywhere, so the important test is for _WIN32. So, does mingw define _WIN32? If it does not, then that's what's going on. In that case, the proper test is probably to check if _WIN32 is defined by the compiler in question and include Advapi only then. If it does define _WIN32, then I dunno! -tim From strawman at astraw.com Thu Feb 9 09:00:03 2006 From: strawman at astraw.com (Andrew Straw) Date: Thu Feb 9 09:00:03 2006 Subject: [Numpy-discussion] segfault when calling PyArray_DescrFromType In-Reply-To: <200602091304.59062.martin.wiechert@gmx.de> References: <200602091304.59062.martin.wiechert@gmx.de> Message-ID: <43EB74DD.3050608@astraw.com> Martin Wiechert wrote: >Hi list, > >I'm trying to build a C extension, which uses arrays. It builds, and I can >import it from python, but the very first call to a numpy function > > ea = (PyObject *) PyArray_DescrFromType (PyArray_INT); > >gives me a segfault. > >I have absolutely no clue, but > >nm -l mymodule.so | grep rray > >gives > >000026a0 b >PyArray_API /usr/lib/python2.4/site-packages/numpy/core/include/numpy/__multiarray_api.h:316 > >and this line reads > >static void **PyArray_API=NULL; > >which looks suspicious to me. Something wrong with my setup.py? > >Any suggestions? > > Did you do import_array() beforehand? From oliphant.travis at ieee.org Thu Feb 9 09:30:03 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Feb 9 09:30:03 2006 Subject: [Numpy-discussion] Is there a known problem compiling numpy with VC7?
Success In-Reply-To: <43EB6DF5.6010705@cox.net> References: <43E41E52.6060805@cox.net> <43E42B26.2090304@ee.byu.edu> <43E42C2C.6000208@cox.net> <43E45A95.6040601@ieee.org> <43E81650.2040204@cox.net> <43E81BFA.7060600@ieee.org> <43E82930.7070103@cox.net> <43E83FD3.90206@ieee.org> <43E8EE72.6070101@cox.net> <43E8EFE9.5040207@ieee.org> <43E9235E.70004@cox.net> <43E94090.5080609@ieee.org> <43EA3DE0.1070608@cox.net> <43EB695D.9050501@ieee.org> <43EB6DF5.6010705@cox.net> Message-ID: <43EB7BE3.70706@ieee.org> Tim Hochberg wrote: >> >> >> I'm trying to incorporate your changes. > > > > Great. > > > OK, we'll have to work out something that works for both. The issue > here on windows is that compiling the testcode requires python.lib, > and it doesn't get found unless that directory is specified. The > problem is perhaps related to the following comment in system_info.py > > if sys.platform == 'win32': # line 116 > default_lib_dirs = ['C:\\'] # probably not very helpful... I added the change you made in setup.py to default_lib_dirs, here. See if this fixes it. >> >> 3) For the setup.py file in random you are using Advapi for all win32 >> platforms. But, this seems to be a windows NT file > > > I'm compiling on XP FWIW. > >> or at least only needed when compiling with certain compilers. >> Mingw32 built just fine without it. So, I'm not sure how to handle >> this. > I see now. On _WIN32 platforms it's using the registry instead of the file system to store things. I modified the random/setup.py script to test for _WIN32 in the compiler and add the dll to the list of libraries if it is found. I'm also reading the configuration file to determine MATHLIB. Can you try out the new SVN and see if it builds for you without modification. -Travis From tim.hochberg at cox.net Thu Feb 9 09:47:14 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Thu Feb 9 09:47:14 2006 Subject: [Numpy-discussion] arange(start, stop, step) and floating point (Ticket #8) In-Reply-To: <43EA8867.5080109@ee.byu.edu> References: <43EA6A34.2000202@gmail.com> <43EA7FE7.2040902@cox.net> <43EA8867.5080109@ee.byu.edu> Message-ID: <43EB7FE5.1050000@cox.net> Travis Oliphant wrote: > Tim Hochberg wrote: >>> >> >> >> I see that Travis has vetoed this in any event, but perhaps we should >> fix up the fill functions to be more accurate and maybe most of the >> problem would just magically go away. > > > > To do something different than arange has always done we need a new > function, not change what arange does and thus potentially break lots > of code. > > How do you propose to make the fill functions more accurate? I'm > certainly willing to see improvements there. OK, I experimented with this.
I replaced the code for @NAME@_fill in arraytypes.inc.src with:

/**begin repeat
#NAME=BYTE,UBYTE,SHORT,USHORT,INT,UINT,LONG,ULONG,LONGLONG,ULONGLONG,FLOAT,DOUBLE,LONGDOUBLE#
#typ=byte,ubyte,short,ushort,int,uint,long,ulong,longlong,ulonglong,float,double,longdouble#
*/
static void
@NAME@_fill(@typ@ *buffer, intp length, void *ignored)
{
    intp i;
    @typ@ start = buffer[0];
    @typ@ delta = buffer[1];
    delta -= start;
    buffer += 2;
    for (i=2; i<length; i++) {
        *buffer++ = start + i*delta;
    }
}
/**end repeat**/

Before the change:
arange(0, 100.65000000000001, 0.12) failed 100.65 >= 100.65
arange(0, 100.68000000000001, 0.12) failed 100.68 >= 100.68
arange(0, 100.65000000000001, 0.074999999999999997) failed 100.65 >= 100.65
arange(0, 100.26000000000001, 0.059999999999999998) failed 100.26 >= 100.26
arange(0, 100.62, 0.059999999999999998) failed 100.62 >= 100.62
arange(0, 100.68000000000001, 0.059999999999999998) failed 100.68 >= 100.68
arange(0, 100.98, 0.059999999999999998) failed 100.98 >= 100.98
arange(0, 100.5, 0.031914893617021274) failed 100.5 >= 100.5
arange(10000) took 2.25123220968 seconds for 100000 reps
arange(10000.0) took 4.31864636427 seconds for 100000 reps

After the change:
arange(10000) took 1.82795662577 seconds for 100000 reps
arange(10000.0) took 3.93278363591 seconds for 100000 reps

That is, not only did all of the incorrect end cases go away, it actually got marginally faster. Why it got faster I can't say, there's not much to be gained in second guessing an optimizing compiler. It's quite possible that this may be compiler dependent, so I'd be interested in the results with other compilers. Also, I only sampled a small chunk of the input space of arange, so if you have some other failing input values, please send them to me and I can test them and see if this change fixes them also. I didn't mess with the complex version of fill yet. Is that just there to support arange(0, 100, 2, dtype=complex), or is there some other use for _fill besides arange? Regards, -tim -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: testarange.py URL: From tim.hochberg at cox.net Thu Feb 9 10:34:03 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Thu Feb 9 10:34:03 2006 Subject: [Numpy-discussion] Is there a known problem compiling numpy with VC7? Success In-Reply-To: <43EB7BE3.70706@ieee.org> References: <43E41E52.6060805@cox.net> <43E42B26.2090304@ee.byu.edu> <43E42C2C.6000208@cox.net> <43E45A95.6040601@ieee.org> <43E81650.2040204@cox.net> <43E81BFA.7060600@ieee.org> <43E82930.7070103@cox.net> <43E83FD3.90206@ieee.org> <43E8EE72.6070101@cox.net> <43E8EFE9.5040207@ieee.org> <43E9235E.70004@cox.net> <43E94090.5080609@ieee.org> <43EA3DE0.1070608@cox.net> <43EB695D.9050501@ieee.org> <43EB6DF5.6010705@cox.net> <43EB7BE3.70706@ieee.org> Message-ID: <43EB8AFE.2060704@cox.net> Travis Oliphant wrote: > Tim Hochberg wrote: >>> >>> >>> I'm trying to incorporate your changes. >> >> >> >> >> Great. >> >> >> OK, we'll have to work out something that works for both. The issue >> here on windows is that compiling the testcode requires python.lib, >> and it doesn't get found unless that directory is specified. The >> problem is perhaps related to the following comment in system_info.py >> >> if sys.platform == 'win32': # line 116 >> default_lib_dirs = ['C:\\'] # probably not very helpful... > > > > I added the change you made in setup.py to default_lib_dirs, here. > See if this fixes it. >>> >>> 3) For the setup.py file in random you are using Advapi for all >>> win32 platforms. But, this seems to be a windows NT file >> >> >> >> I'm compiling on XP FWIW.
>> >>> or at least only needed when compiling with certain compilers. >>> Mingw32 built just fine without it. So, I'm not sure how to handle >>> this. >> >> > I see now. On _WIN32 platforms it's using the registry instead of the > file system to store things. I modified the random/setup.py script to > test for _WIN32 in the compiler and add the dll to the list of > libraries if it is found. I'm also reading the configuration file to > determine MATHLIB. > > Can you try out the new SVN and see if it builds for you without > modification. There's a shallow error in system_info.py

File "C:\Documents and Settings\End-user\Desktop\numpy\svn\numpy\numpy\distutils\system_info.py", line 118, in ?
    default_lib_dirs = ['C:\\',
NameError: name 'join' is not defined

Just replacing join with os.path.join fixed that. However, it didn't help. I had this fantasy that default_lib_dirs would get picked up automagically; however that does not happen. I still ended up putting:

from numpy.distutils import system_info
library_dirs = system_info.default_lib_dirs
result = config_cmd.try_run(tc, include_dirs=[python_include],
                            library_dirs=library_dirs)

into setup.py. Is that acceptable? It's not very elegant. The changes to setup.py in random and the M_PI seem to have worked since with the changes above it compiles and passes all of the tests except for the previously mentioned test_minrelpath. -tim From oliphant.travis at ieee.org Thu Feb 9 12:30:03 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Feb 9 12:30:03 2006 Subject: [Numpy-discussion] arange(start, stop, step) and floating point (Ticket #8) In-Reply-To: <43EB7FE5.1050000@cox.net> References: <43EA6A34.2000202@gmail.com> <43EA7FE7.2040902@cox.net> <43EA8867.5080109@ee.byu.edu> <43EB7FE5.1050000@cox.net> Message-ID: <43EBA615.8020101@ieee.org> Tim Hochberg wrote: > That is, not only did all of the incorrect end cases go away, it > actually got marginally faster. Why it got faster I can't say, there's > not much to be gained in second guessing an optimizing compiler. It's > quite possible that this may be compiler dependent, so I'd be > interested in the results with other compilers. Also, I only sampled a > small chunk of the input space of arange, so if you have some other > failing input values, please send them to me and I can test them and > see if this change fixes them also. > > I didn't mess with the complex version of fill yet. Is that just there > to support arange(0, 100, 2, dtype=complex), or is there some other > use for _fill besides arange? This is a simple change and one we could easily do. The complex versions are there to support complex arange. There are no other uses "currently" for fill. Although you could use it with two equal values to fill an array with the same thing quickly. I have yet to test a version of ones using fill against the current implementation which adds 1 to a zeros array. Thanks for the changes. -Travis From oliphant.travis at ieee.org Thu Feb 9 12:33:06 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Feb 9 12:33:06 2006 Subject: [Numpy-discussion] Is there a known problem compiling numpy with VC7?
Success In-Reply-To: <43EB8AFE.2060704@cox.net> References: <43E41E52.6060805@cox.net> <43E42B26.2090304@ee.byu.edu> <43E42C2C.6000208@cox.net> <43E45A95.6040601@ieee.org> <43E81650.2040204@cox.net> <43E81BFA.7060600@ieee.org> <43E82930.7070103@cox.net> <43E83FD3.90206@ieee.org> <43E8EE72.6070101@cox.net> <43E8EFE9.5040207@ieee.org> <43E9235E.70004@cox.net> <43E94090.5080609@ieee.org> <43EA3DE0.1070608@cox.net> <43EB695D.9050501@ieee.org> <43EB6DF5.6010705@cox.net> <43EB7BE3.70706@ieee.org> <43EB8AFE.2060704@cox.net> Message-ID: <43EBA6D5.9020906@ieee.org> Tim Hochberg wrote: > > > There's a shallow error in system_info.py > > File "C:\Documents and > Settings\End-user\Desktop\numpy\svn\numpy\numpy\distutils\system_info.py", > > line 118, in ? > default_lib_dirs = ['C:\\', > NameError: name 'join' is not defined > > Just replacing join with os.path.join fixed that. However, it didn't > help. I had this fantasy that default_lib_dirs would get picked up > automagically; however that does not happen. I still ended up putting: > > from numpy.distutils import system_info > library_dirs = system_info.default_lib_dirs > result = > config_cmd.try_run(tc,include_dirs=[python_include], > library_dirs=library_dirs) > > into setup.py. Is that acceptable? It's not very elegant. I think it's probably O.K. as long as it doesn't produce errors on other systems (and it doesn't on mine). > > The changes to setup.py in random and the M_PI seem to have worked > since with the changes above it compiles and passes all of the tests > except for the previously mentioned test_minrelpath. > I thought I fixed minrelpath too by doing a search and replace. Perhaps this did not help. -Travis From ndarray at mac.com Thu Feb 9 12:51:02 2006 From: ndarray at mac.com (Sasha) Date: Thu Feb 9 12:51:02 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: Message-ID: On 2/9/06, Alan G Isaac wrote: > Unfortunately the SciPy book currently uses the term 'rank' > in the two conflicting ways. (It uses 'rank' in the linear > algebra sense only in the discussion of lstsq on p.145.) > It might be helpful for the tensor sense to always be > qualified as 'tensor rank'? Another alternative would be "number of axes." I also find a glossary used by the J language (an APL descendant) useful in array discussions. See . Here is how J documentation explains the difference in their terminology and that of the C language: "What C calls an n-dimensional array of rank i×j×…×k is in J an array of rank n with axes of length i,j,…,k." From tim.hochberg at cox.net Thu Feb 9 13:44:06 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Thu Feb 9 13:44:06 2006 Subject: [Numpy-discussion] arange(start, stop, step) and floating point (Ticket #8) In-Reply-To: <43EBA615.8020101@ieee.org> References: <43EA6A34.2000202@gmail.com> <43EA7FE7.2040902@cox.net> <43EA8867.5080109@ee.byu.edu> <43EB7FE5.1050000@cox.net> <43EBA615.8020101@ieee.org> Message-ID: <43EBB782.1010509@cox.net> Travis Oliphant wrote: > Tim Hochberg wrote: >> That is, not only did all of the incorrect end cases go away, it >> actually got marginally faster. Why it got faster I can't say, >> there's not much to be gained in second guessing an optimizing >> compiler. It's quite possible that this may be compiler dependent, so >> I'd be interested in the results with other compilers.
Also, I only >> sampled a small chunk of the input space of arange, so if you have >> some other failing input values, please send them to me and I can >> test them and see if this change fixes them also. >> >> I didn't mess with the complex version of fill yet. Is that just >> there to support arange(0, 100, 2, dtype=complex), or is there some >> other use for _fill besides arange? > > > > This is a simple change and one we could easily do. The complex > versions are there to support complex arange. There are no other uses > "currently" for fill. Just for truth in advertising, after the last svn update I did, the speed advantage mostly went away:

# baseline
arange(10000) took 2.27355292363 seconds for 100000 reps
arange(10000.0) took 4.39404812623 seconds for 100000 reps
arange(10000,dtype=complex) took 4.01601209092 seconds for 100000 reps

# multiply instead of repeated add.
arange(10000) took 2.20859410903 seconds for 100000 reps
arange(10000.0) took 4.34652784083 seconds for 100000 reps
arange(10000,dtype=complex) took 6.02266433304 seconds for 100000 reps

I'm not sure if this is a result of the changes that you made stripping out the unneeded 'i' or if my machine was in some sort of different state or what. Note that I modified the complex fills as well now and they are much slower. Is it possible for delta to be complex? If not, we could speed up the complex case a little by exploiting the fact that delta.real is always zero. If, in addition, we can assume both that start.imag is zero and that the array is zeroed out to start with, we could speed things up some more. This seems like a no-brainer for floats (and a noop for ints) since it alleviates the problem of arange(start,stop,step)[-1] sometimes being >= stop without costing anything performance wise. (I don't know that it cures the problem, but it seems to make it a lot less likely). For complex the situation is more, uh, complex. I'd like to make the change, since in general I'd rather be right than fast. Still, it's a significant performance hit in this case. Thoughts? -tim > Although you could use it with two equal values to fill an array with > the same thing quickly. I have yet to test a version of ones using > fill against the current implementation which adds 1 to a zeros array. > > Thanks for the changes. > > -Travis > > > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through log > files > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > From pearu at scipy.org Thu Feb 9 14:35:07 2006 From: pearu at scipy.org (Pearu Peterson) Date: Thu Feb 9 14:35:07 2006 Subject: [Numpy-discussion] Is there a known problem compiling numpy with VC7?
Success In-Reply-To: <43EB8AFE.2060704@cox.net> References: <43E41E52.6060805@cox.net> <43E42B26.2090304@ee.byu.edu> <43E42C2C.6000208@cox.net> <43E45A95.6040601@ieee.org> <43E81650.2040204@cox.net> <43E81BFA.7060600@ieee.org> <43E82930.7070103@cox.net> <43E83FD3.90206@ieee.org> <43E8EE72.6070101@cox.net> <43E8EFE9.5040207@ieee.org> <43E9235E.70004@cox.net> <43E94090.5080609@ieee.org> <43EA3DE0.1070608@cox.net> <43EB695D.9050501@ieee.org> <43EB6DF5.6010705@cox.net> <43EB7BE3.70706@ieee.org> <43EB8AFE.2060704@cox.net> Message-ID: On Thu, 9 Feb 2006, Tim Hochberg wrote: > I had this fantasy that default_lib_dirs would get picked up > automagically; however that does not happen. I still ended up putting: > > from numpy.distutils import system_info > library_dirs = system_info.default_lib_dirs > result = config_cmd.try_run(tc,include_dirs=[python_include], > library_dirs=library_dirs) > > into setup.py. Is that acceptable? It's not very elegant. No, don't use system_info.default_lib_dirs. Use distutils.sysconfig.get_python_lib() to get the directory that contains the Python library. Pearu From tim.hochberg at cox.net Thu Feb 9 14:46:04 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Thu Feb 9 14:46:04 2006 Subject: [Numpy-discussion] Is there a known problem compiling numpy with VC7? Success In-Reply-To: References: <43E41E52.6060805@cox.net> <43E42B26.2090304@ee.byu.edu> <43E42C2C.6000208@cox.net> <43E45A95.6040601@ieee.org> <43E81650.2040204@cox.net> <43E81BFA.7060600@ieee.org> <43E82930.7070103@cox.net> <43E83FD3.90206@ieee.org> <43E8EE72.6070101@cox.net> <43E8EFE9.5040207@ieee.org> <43E9235E.70004@cox.net> <43E94090.5080609@ieee.org> <43EA3DE0.1070608@cox.net> <43EB695D.9050501@ieee.org> <43EB6DF5.6010705@cox.net> <43EB7BE3.70706@ieee.org> <43EB8AFE.2060704@cox.net> Message-ID: <43EBC5DF.9090709@cox.net> Pearu Peterson wrote: > > > On Thu, 9 Feb 2006, Tim Hochberg wrote: > >> I had this fantasy that default_lib_dirs would get picked up >> automagically; however that does not happen. I still ended up putting: >> >> from numpy.distutils import system_info >> library_dirs = system_info.default_lib_dirs >> result = >> config_cmd.try_run(tc,include_dirs=[python_include], >> library_dirs=library_dirs) >> >> into setup.py. Is that acceptable? It's not very elegant. > > > No, don't use system_info.default_lib_dirs. > > Use distutils.sysconfig.get_python_lib() to get the directory that > contains the Python library. That's the wrong library. Get_python_lib gives you the location of the python standard library, not the location of python24.lib. The former being python24/Lib (or python24/Lib/site-packages depending what options you feed get_python_lib) and the latter being python24/libs on my box. -tim From vidar+list at 37mm.no Thu Feb 9 16:00:15 2006 From: vidar+list at 37mm.no (Vidar Gundersen) Date: Thu Feb 9 16:00:15 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: <43EB34F8.6030006@bigpond.net.au> (Gary Ruben's message of "Thu, 09 Feb 2006 23:26:32 +1100") References: <1e2af89e0602090046g55f80ecdi26b24afc5dbe8a1d@mail.gmail.com> <43EB34F8.6030006@bigpond.net.au> Message-ID: ===== Original message from Gary Ruben | 9 Feb 2006: > Vidar's documentation is under a GNU Free Documentation License. This is > probably a problem with incorporating it directly into the scipy site, > although Vidar was at one point happy to incorporate the MATLAB parts
i'm not that familiar with the legal stuff, and i guess when first used a GPL/GFDL it always has to be? i also considered CC, but didn't want to spend a lot of time digging into legal stuff: i wanted to make the reference available and reusable to anyone. i don't mind, i wanted to achieve openness and encourage contributions and derivations, and be able to use these to improve and update the original reference. i need to update it with the new NumPy package, but i haven't taken the time to buy the manual and start looking into it yet. will including NumPy commands be a problem related to licensing on the NumPy documentation? also, i'd prefer to publish it on a more appropriate site (like scipy.org, sourceforge.net, or wherever useful) when i feel the documents are more complete. but note that this is not really a Numerical Python and Matlab thing, but a framework to get from math environment a to b. it could also (when i include NumPy) help transition between Numeric/numarray/NumPy: this can easily be generated as a separate reference (i use XSL and LaTeX). (although i did this to support my own transition from Matlab to non-commercial alternatives, e.g. Python and R/RPy, Gnuplot, etc for plotting.) thanks for cross-posting this to me, Gary. i'm jumping right into this list, so please be indulgent if i seem uninformed on late talks here. A brief observation on "NumPy for Matlab Addicts": The section "Some Key Differences" says nothing about the amount of routines found in Matlab toolboxes for Optimization, Control engineering, Wavelets, etc. for these there are no real alternatives. kind regards, Vidar Bronken Gundersen From wbaxter at gmail.com Thu Feb 9 16:02:15 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Thu Feb 9 16:02:15 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: Message-ID: Oh, yeh. I can see the problem with the phrase "rank of a matrix". It does sound like it means the linear algebra rank of rank/nullity fame. I changed the description on the page a bit. Thanks for catching that. --bb On 2/9/06, Bruce Southey wrote: > > Hi, > The example of ndim to give the rank is not the same as the Matlab > function rank(a). See > http://en.wikipedia.org/wiki/Rank_of_a_matrix for definition of rank > that I would think that most people would use if they use Matlab and > is provided by rank(a). > > I have not used the latest numpy but the equivalent function is not > present in numarray/Numeric (to my knowledge) so you have to find some > other way like using svd. > > Regards > Bruce > > On 2/9/06, Bill Baxter wrote: > > I added some content to the "NumPy/SciPy for Matlab users" page on the > scipy wiki. > > > > But my knowledge of NumPy/SciPy isn't sufficient to fill in the whole > chart of equivalents that I laid out. > > If folks who know both could browse by and maybe fill in a blank or two, > that would be great. I think this will be a helpful "getting started" page > for newbies to NumPy coming from matlab, like me. One of the most > frustrating things is when you sit down and can't figure out how to do the > most basic things that do in your sleep in another environment (like making > a column vector). So hopefully this page will help. > > > > The URL is : http://www.scipy.org/Wiki/NumPy_for_Matlab_Addicts > > > > Thanks, > > Bill Baxter > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From wbaxter at gmail.com Thu Feb 9 16:07:33 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Thu Feb 9 16:07:33 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: Message-ID: Some kind soul added 'svd' and 'inv' to the NumPy/SciPy columns, but those don't seem to be defined, at least for the versions of NumPy/SciPy that I have. Are they new? Or are they perhaps defined by a 3rd package in your environment? By the way, is there any python way to tell which package a symbol is coming from? --bb -------------- next part -------------- An HTML attachment was scrubbed... URL: From cookedm at physics.mcmaster.ca Thu Feb 9 16:17:02 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Thu Feb 9 16:17:02 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: (Bill Baxter's message of "Fri, 10 Feb 2006 09:06:49 +0900") References: Message-ID: Bill Baxter writes: > Some kind soul added 'svd' and 'inv' to the NumPy/SciPy columns, but those > don't seem to be defined, at least for the versions of NumPy/SciPy that I > have. Are they new? Or are they perhaps defined by a 3rd package in your > environment? They're in numpy.linalg. > By the way, is there any python way to tell which package a symbol is coming > from? Check its __module__ attribute. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From wbaxter at gmail.com Thu Feb 9 16:19:24 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Thu Feb 9 16:19:24 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: Message-ID: I find 'number of axes' to be even more confusing than 'dimension'. Both sound to me like they're talking about the number of components in a vector (e.g. 3-dimensional space vs 2-dimensional space), but axes more so. The word dimension has a lot of uses, and in most programming languages arrays are described as being one, two or three dimensional etc. So that makes sense. But I can't think of any common usages of axis that aren't related to vectors in a vector space. But that's just me. Seems like this debate probably came and went a long time ago. What is right probably depends mostly on what sort of math you spend your time doing. --bb On 2/10/06, Sasha wrote: > On 2/9/06, Alan G Isaac wrote: > > Unfortunately the SciPy book currently uses the term 'rank' > > in the two conflicting ways. (It uses 'rank' in the linear > > algebra sense only in the discussion of lstsq on p.145.) > > It might be helpful for the tensor sense to always be > > qualified as 'tensor rank'? > > Another alternative would be "number of axes." I also find a > glossary used by the J language (an APL descendant) useful in array > discussions. See > . > > Here is how J documentation explains the difference in their > terminology and that of the C language: "What C calls an n-dimensional > array of rank i×j×…×k is in J an array of rank n with axes of length > i,j,…,k." > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wbaxter at gmail.com Thu Feb 9 16:24:14 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Thu Feb 9 16:24:14 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: Message-ID: > Some kind soul added 'svd' and 'inv' to the NumPy/SciPy columns, but those > > don't seem to be defined, at least for the versions of NumPy/SciPy that > I > > have. Are they new?
Or are they perhaps defined by a 3rd package in > your > > environment? > > They're in numpy.linalg. Ooooh! Lots of goodies there! > By the way, is there any python way to tell which package a symbol is > coming > > from? > > Check its __module__ attribute. Ah, perfect. I see it's also mentioned in help(thing) for thing. Thanks. --bb -------------- next part -------------- An HTML attachment was scrubbed... URL: From Fernando.Perez at colorado.edu Thu Feb 9 16:35:03 2006 From: Fernando.Perez at colorado.edu (Fernando Perez) Date: Thu Feb 9 16:35:03 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: Message-ID: <43EBDF88.6040504@colorado.edu> [ regarding the way of describing arrays vs. matlab's matrices, and the use of 'dimension', 'rank', 'number of axes', etc.] Let's not introduce new terms where none are needed. These concepts have had well-established names (tensor rank and matrix rank) for a long time. It may be a good idea to add a local glossary page reminding anyone of what the definitions are, but for as long as I remember reading literature on these topics, the two terms have been fully unambiguous. A numpy array with length(array.shape)==d is closest to a rank d tensor (minus the geometric co/contravariance information). A d=2 array can be used to represent a matrix, and linear algebra operations can be performed on it; if a Matrix object is built out of it, a number of things (notably the * operator) are then performed in the linear algebra sense (and not element-wise). The rank of a matrix has nothing to do with the shape attribute of the underlying array, but with the number of non-zero singular values (and for floating-point matrices, is best defined up to a given tolerance). Since numpy is an n-dimensional array package, it may be convenient to introduce a matrix_rank() routine which does what matlab's rank() does for 2-d arrays and matrices, while raising an error for any other shape. This would also make it explicit that this operation is only well-defined for 2-d objects. My 1e-2, f From tim.hochberg at cox.net Thu Feb 9 17:07:21 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Thu Feb 9 17:07:21 2006 Subject: [Numpy-discussion] Is there a known problem compiling numpy with VC7? Success In-Reply-To: <43EBC5DF.9090709@cox.net> References: <43E41E52.6060805@cox.net> <43E42B26.2090304@ee.byu.edu> <43E42C2C.6000208@cox.net> <43E45A95.6040601@ieee.org> <43E81650.2040204@cox.net> <43E81BFA.7060600@ieee.org> <43E82930.7070103@cox.net> <43E83FD3.90206@ieee.org> <43E8EE72.6070101@cox.net> <43E8EFE9.5040207@ieee.org> <43E9235E.70004@cox.net> <43E94090.5080609@ieee.org> <43EA3DE0.1070608@cox.net> <43EB695D.9050501@ieee.org> <43EB6DF5.6010705@cox.net> <43EB7BE3.70706@ieee.org> <43EB8AFE.2060704@cox.net> <43EBC5DF.9090709@cox.net> Message-ID: <43EBE721.6000602@cox.net> Tim Hochberg wrote: > Pearu Peterson wrote: > >> >> >> On Thu, 9 Feb 2006, Tim Hochberg wrote: >> >>> I had this fantasy that default_lib_dirs would get picked up >>> automagically; however that does not happen. I still ended up putting: >>> >>> from numpy.distutils import system_info >>> library_dirs = system_info.default_lib_dirs >>> result = >>> config_cmd.try_run(tc,include_dirs=[python_include], >>> library_dirs=library_dirs) >>> >>> into setup.py. Is that acceptable? It's not very elegant. >> >> >> No, don't use system_info.default_lib_dirs. >> >> Use distutils.sysconfig.get_python_lib() to get the directory that >> contains the Python library. > > That's the wrong library.
Get_python_lib gives you the location of the > python standard library, not the location of python24.lib. The former > being python24/Lib (or python24/Lib/site-packages depending what > options you feed get_python_lib) and the latter being python24/libs on > my box. To follow up on this a little bit, I investigated how distutils itself finds python24.lib. It turns out that it is in build_ext.py, near line 168. The relevant code is:

# also Python's library directory must be appended to library_dirs
if os.name == 'nt':
    self.library_dirs.append(os.path.join(sys.exec_prefix, 'libs'))

Unfortunately, there's no obvious, clean way to extract the library information from there. You can grab it using the following magic formula:

from distutils.core import Distribution
from distutils.command import build_ext

be = build_ext.build_ext(Distribution())
be.finalize_options()
library_dirs = be.library_dirs

However, that seems worse than what we're doing now. I haven't actually tried this in the code either -- for all I know instantiating an extra Distribution may have some horrible side effect that I don't know about. If someone can come up with a cleaner way to get to this info, that'd be great, otherwise I'd say we might as well just keep things as they are for the time being. Regards, -tim From wbaxter at gmail.com Thu Feb 9 17:10:37 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Thu Feb 9 17:10:37 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: <43EBDF88.6040504@colorado.edu> References: <43EBDF88.6040504@colorado.edu> Message-ID: For what it's worth, matlab's rank function just calls svd, and returns the number of singular values greater than a tolerance. The implementation is a whopping 5 lines long. On 2/10/06, Fernando Perez wrote: > Since numpy is an n-dimensional array package, it may be convenient to > introduce a matrix_rank() routine which does what matlab's rank() does for 2-d > arrays and matrices, while raising an error for any other shape. This > would > also make it explicit that this operation is only well-defined for 2-d > objects. Or put it in numpy.linalg, which also makes it pretty clear what the scope is. --bb -------------- next part -------------- An HTML attachment was scrubbed... URL: From wbaxter at gmail.com Thu Feb 9 17:14:15 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Thu Feb 9 17:14:15 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: <43EBDF88.6040504@colorado.edu> Message-ID: Anyone know what the terms are for redistribution of applications built with Matlab? I searched around their site a bit but couldn't find anything conclusive. One page seemed to be saying there was a per-application fee for distributing a matlab-based application, but other pages made it sound more like it was no extra charge. If the former, then that's another point that should go in the 'key differences' section. --bb -------------- next part -------------- An HTML attachment was scrubbed... URL: From Fernando.Perez at colorado.edu Thu Feb 9 17:27:39 2006 From: Fernando.Perez at colorado.edu (Fernando Perez) Date: Thu Feb 9 17:27:39 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: <43EBDF88.6040504@colorado.edu> Message-ID: <43EBEBD8.4050705@colorado.edu> Bill Baxter wrote: > For what it's worth, matlab's rank function just calls svd, and returns the > number of singular values greater than a tolerance. The implementation is a > whopping 5 lines long.
Yup, and it would be pretty much the same 5 lines in numpy, with the same semantics. Here's a quick and dirty implementation for old-scipy (I don't have new-scipy on this box):

def matrix_rank(arr,tol=1e-8):
    """Return the matrix rank of an input array."""
    arr = scipy.asarray(arr)
    if len(arr.shape) != 2:
        raise ValueError('Input must be a 2-d array or Matrix object')
    svdvals = scipy.linalg.svdvals(arr)
    return sum(scipy.where(svdvals>tol,1,0))

If you really hate readability and error-checking, it's a one-liner :)

matrix_rank = lambda arr,tol=1e-8: sum(scipy.where(scipy.linalg.svdvals(arr)>tol,1,0))

Looks OK (RA is RandomArray from Numeric):

In [21]: matrix_rank([[1,0],[0,0]])
Out[21]: 1
In [22]: matrix_rank(RA.random((3,3)))
Out[22]: 3
In [23]: matrix_rank([[1,0],[0,0]])
Out[23]: 1
In [24]: matrix_rank([[1,0],[1,0]])
Out[24]: 1
In [25]: matrix_rank([[1,0],[0,1]])
Out[25]: 2
In [26]: matrix_rank(RA.random((3,3)),1e-1)
Out[26]: 2
In [48]: matrix_rank([[1,0],[1,1e-8]])
Out[48]: 1
In [49]: matrix_rank([[1,0],[1,1e-4]])
Out[49]: 2
In [50]: matrix_rank([[1,0],[1,1e-8]],1e-9)
Out[50]: 2

Cheers, f From aisaac at american.edu Thu Feb 9 17:33:00 2006 From: aisaac at american.edu (Alan G Isaac) Date: Thu Feb 9 17:33:00 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: Message-ID: On Fri, 10 Feb 2006, Bill Baxter apparently wrote: > Some kind soul added 'svd' and 'inv' to the NumPy/SciPy > columns, but those don't seem to be defined, at least > for the versions of NumPy/SciPy that I have.

Python 2.4.1 (#65, Mar 30 2005, 09:13:57) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from numpy.linalg import svd, inv
>>>

hth, Alan Isaac From ndarray at mac.com Thu Feb 9 17:41:39 2006 From: ndarray at mac.com (Sasha) Date: Thu Feb 9 17:41:39 2006 Subject: [Numpy-discussion] NumPy Glossary Was:Matlab page on scipy wiki Message-ID: I've created a rough draft of NumPy Glossary on the developer's wiki . Please comment/edit. When it is ready, we can move it to scipy.org. From Fernando.Perez at colorado.edu Thu Feb 9 17:49:35 2006 From: Fernando.Perez at colorado.edu (Fernando Perez) Date: Thu Feb 9 17:49:35 2006 Subject: [Numpy-discussion] NumPy Glossary Was:Matlab page on scipy wiki In-Reply-To: References: Message-ID: <43EBF101.3090401@colorado.edu> Sasha wrote: > I've created a rough draft of NumPy Glossary on the developer's wiki > . Please > comment/edit. When it is ready, we can move it to scipy.org. A humble suggestion: move it NOW. It will never 'be ready', and that's just the wiki way: put it in early, mark it at the top as a stub (so we don't falsely claim it to be in great shape when it isn't), and let it be improved in-place. The trac wiki should be for developers to work on pure development things, and it requires an SVN login (well, it doesn't right now, but this should be changed ASAP: spammers WILL show up sooner or later, and they will destroy the wiki. They did it to Enthought's and to IPython's in the past, they will also do it here; it's just a matter of time, and the cleanup later will be more work than running trac-admin now and closing wiki edit permissions for anonymous users). The public wiki is where this content should be: a non-developer can do a perfectly good job of contributing content here, so there's no reason to keep this material in the dev wiki.
Cheers, f From ndarray at mac.com Thu Feb 9 18:03:47 2006 From: ndarray at mac.com (Sasha) Date: Thu Feb 9 18:03:47 2006 Subject: [Numpy-discussion] NumPy Glossary Was:Matlab page on scipy wiki In-Reply-To: <43EBF101.3090401@colorado.edu> References: <43EBF101.3090401@colorado.edu> Message-ID: On 2/9/06, Fernando Perez wrote: > A humble suggestion: move it NOW. Done. See . From ndarray at mac.com Thu Feb 9 20:36:09 2006 From: ndarray at mac.com (Sasha) Date: Thu Feb 9 20:36:09 2006 Subject: [Numpy-discussion] arange(start, stop, step) and floating point (Ticket #8) In-Reply-To: <43EBB782.1010509@cox.net> References: <43EA6A34.2000202@gmail.com> <43EA7FE7.2040902@cox.net> <43EA8867.5080109@ee.byu.edu> <43EB7FE5.1050000@cox.net> <43EBA615.8020101@ieee.org> <43EBB782.1010509@cox.net> Message-ID: Well, my results are different. SVN r2087:

> python -m timeit -s "from numpy import arange" "arange(10000.0)"
10000 loops, best of 3: 21.1 usec per loop

SVN r2088:

> python -m timeit -s "from numpy import arange" "arange(10000.0)"
10000 loops, best of 3: 25.6 usec per loop

I am using gcc version 3.3.4 with the following flags: -msse2 -mfpmath=sse -fno-strict-aliasing -DNDEBUG -g -O3. The timing is consistent with the change in the DOUBLE_fill loop:

r2087:
1b8f0: f2 0f 11 08    movsd %xmm1,(%eax)
1b8f4: f2 0f 58 ca    addsd %xmm2,%xmm1
1b8f8: 83 c0 08       add $0x8,%eax
1b8fb: 39 c8          cmp %ecx,%eax
1b8fd: 72 f1          jb 1b8f0

r2088:
1b9d0: f2 0f 2a c2    cvtsi2sd %edx,%xmm0
1b9d4: 42             inc %edx
1b9d5: f2 0f 59 c1    mulsd %xmm1,%xmm0
1b9d9: f2 0f 58 c2    addsd %xmm2,%xmm0
1b9dd: f2 0f 11 00    movsd %xmm0,(%eax)
1b9e1: 83 c0 08       add $0x8,%eax
1b9e4: 39 ca          cmp %ecx,%edx
1b9e6: 7c e8          jl 1b9d0

The loop was 5 instructions before the change and 8 instructions after. It is possible that 387 FPU may do addition and multiplication in parallel and this is why you don't see the difference. Nevertheless, I would like to withdraw my prior objections. I think the code is now more numerically correct and that is worth the slow-down on some platforms. By the way, as I was playing with the code, I've also tried the recommendation of using a[i] instead of *p:

--- numpy/core/src/arraytypes.inc.src   (revision 2088)
+++ numpy/core/src/arraytypes.inc.src   (working copy)
@@ -1652,9 +1652,8 @@
         @typ@ start = buffer[0];
         @typ@ delta = buffer[1];
         delta -= start;
-        buffer += 2;
-        for (i=2; i<length; i++) {
-                *buffer++ = start + i*delta;
+        for (i=2; i<length; i++) {
+                buffer[i] = start + i*delta;
         }

This is one instruction less because "add $0x8,%eax" was eliminated and all pointer arithmetics and store (buffer[i] = ...) is now done in a single instruction "movsd %xmm0,(%edx,%eax,8)". The timing, however, did not change:

> python -m timeit -s "from numpy import arange" "arange(10000.0)"
10000 loops, best of 3: 25.6 usec per loop

My change may be worth committing because the C code is shorter and arguably more understandable (at least by Fortran addicts :-). Travis? On 2/9/06, Tim Hochberg wrote: > # baseline > arange(10000.0) took 4.39404812623 seconds for 100000 reps > # multiply instead of repeated add.
> arange(10000.0) took 4.34652784083 seconds for 100000 reps From jh at oobleck.astro.cornell.edu Thu Feb 9 21:08:01 2006 From: jh at oobleck.astro.cornell.edu (Joe Harrington) Date: Thu Feb 9 21:08:01 2006 Subject: [Numpy-discussion] Re: NumPy Glossary Was:Matlab page on scipy wiki In-Reply-To: <20060210044507.32E368884A@sc8-sf-spam1.sourceforge.net> (numpy-discussion-request@lists.sourceforge.net) References: <20060210044507.32E368884A@sc8-sf-spam1.sourceforge.net> Message-ID: <200602100507.k1A5753M023391@oobleck.astro.cornell.edu> > http://scipy.org/NumPyGlossary In general, such things should be birthed on the Developer_Zone page. This is where the front page directs people to go if they are interested in contributing. We're getting a lot of new interest now, so posting hidden pages on the mailing list will miss new talent. Items can move to Documentation (or wherever) when they are somewhat stable, and continue to grow there. There's now a link for this page under the heading DOCUMENTATION: Projects on the Developer_Zone page. --jh-- From Fernando.Perez at colorado.edu Thu Feb 9 21:12:02 2006 From: Fernando.Perez at colorado.edu (Fernando Perez) Date: Thu Feb 9 21:12:02 2006 Subject: [Numpy-discussion] Re: NumPy Glossary Was:Matlab page on scipy wiki In-Reply-To: <200602100507.k1A5753M023391@oobleck.astro.cornell.edu> References: <20060210044507.32E368884A@sc8-sf-spam1.sourceforge.net> <200602100507.k1A5753M023391@oobleck.astro.cornell.edu> Message-ID: <43EC208E.5000705@colorado.edu> Joe Harrington wrote: >>http://scipy.org/NumPyGlossary > > > In general, such things should be birthed on the Developer_Zone page. > This is where the front page directs people to go if they are > interested in contributing. We're getting a lot of new interest now, > so posting hidden pages on the mailing list will miss new talent. > Items can move to Documentation (or wherever) when they are somewhat > stable, and continue to grow there. Why? Did you read the argument I made for putting it on the main wiki? How are you going to get contributions on the dev wiki once anonymous edits are locked out (which they will hopefully be very soon, before the wiki is spammed out of recognition)? The less friction and committee-ness we impose on this whole thing, the better off we'll all be. Let's be _less_ bureaucratic, not more. f From aisaac at american.edu Thu Feb 9 21:18:10 2006 From: aisaac at american.edu (Alan G Isaac) Date: Thu Feb 9 21:18:10 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: <1e2af89e0602090046g55f80ecdi26b24afc5dbe8a1d@mail.gmail.com><43EB34F8.6030006@bigpond.net.au> Message-ID: On Fri, 10 Feb 2006, Vidar Gundersen apparently wrote: > i guess once something is first released under the GPL/GFDL > it always has to stay that way? If you own the copyright, you can license it any way you want at any time. You have already licensed it under the GFDL, but you can license it other ways as well. > i also considered CC, but didn't want to spend a lot of > time digging into legal stuff: i wanted to make the > reference available and reusable to anyone. If that is really the goal, then just include a statement placing it in the public domain. E.g., Copyright: This document has been placed in the public domain. If you want attribution, use an attribution license: http://creativecommons.org/licenses/by/2.5/ (Be sure to say what you want as attribution.) Cheers, Alan Isaac PS IANAL!
From jh at oobleck.astro.cornell.edu Thu Feb 9 21:32:01 2006
From: jh at oobleck.astro.cornell.edu (Joe Harrington)
Date: Thu Feb 9 21:32:01 2006
Subject: [Numpy-discussion] Re: NumPy Glossary Was:Matlab page on scipy wiki
In-Reply-To: <43EC208E.5000705@colorado.edu> (message from Fernando Perez on Thu, 09 Feb 2006 22:11:42 -0700)
References: <20060210044507.32E368884A@sc8-sf-spam1.sourceforge.net> <200602100507.k1A5753M023391@oobleck.astro.cornell.edu> <43EC208E.5000705@colorado.edu>
Message-ID: <200602100531.k1A5VFjs023566@oobleck.astro.cornell.edu>

>>>http://scipy.org/NumPyGlossary
>>
>> In general, such things should be birthed on the Developer_Zone page.
>> This is where the front page directs people to go if they are
>> interested in contributing. We're getting a lot of new interest now,
>> so posting hidden pages on the mailing list will miss new talent.
>> Items can move to Documentation (or wherever) when they are somewhat
>> stable, and continue to grow there.

>Why? Did you read the argument I made for putting it on the main wiki? How are
>you going to get contributions on the dev wiki once anonymous edits are locked
>out (which they will hopefully be very soon, before the wiki is spammed out of
>recognition)?

Fernando, that page *is* on the main wiki (I don't deal with the developers' wiki at all). Go to scipy.org, click on Developer Zone in the navigation tabs, scroll down to DOCUMENTATION: Projects.

There are two reasons to put it there. First, there are now many people who are looking for projects to do. This is where we can list stuff we want to call attention to as needing work. Once someone is happy with it, they can link it from the Documentation page as well, but it should also stay in Developer Zone until it's mature enough that we'd rather people spent their time on other projects. This is the "work on me first" page. Second, it might not belong on the Documentation page until it gets at least a little review for scope, correctness, and readability. Remember that too many stubs and apologies for being under construction will turn people away.

> The less friction and committee-ness we impose on this whole thing, the better
> off we'll all be. Let's be _less_ bureaucratic, not more.

It just takes one happy person (you?) to link it under Documentation (and one unhappy person to take it off, but that won't be me, in this case).

--jh--

From Fernando.Perez at colorado.edu Thu Feb 9 21:40:00 2006
From: Fernando.Perez at colorado.edu (Fernando Perez)
Date: Thu Feb 9 21:40:00 2006
Subject: [Numpy-discussion] Re: NumPy Glossary Was:Matlab page on scipy wiki
In-Reply-To: <200602100531.k1A5VFjs023566@oobleck.astro.cornell.edu>
References: <20060210044507.32E368884A@sc8-sf-spam1.sourceforge.net> <200602100507.k1A5753M023391@oobleck.astro.cornell.edu> <43EC208E.5000705@colorado.edu> <200602100531.k1A5VFjs023566@oobleck.astro.cornell.edu>
Message-ID: <43EC26F1.2050101@colorado.edu>

Joe Harrington wrote:
>>Why? Did you read the argument I made for putting it on the main wiki? How are
>>you going to get contributions on the dev wiki once anonymous edits are locked
>>out (which they will hopefully be very soon, before the wiki is spammed out of
>>recognition)?
>
> Fernando, that page *is* on the main wiki (I don't deal with the
> developers' wiki at all). Go to scipy.org, click on Developer Zone in
> the navigation tabs, scroll down to DOCUMENTATION: Projects.
I misunderstood something: I thought you wanted it moved over to the dev wiki, which is the first link on the DeveloperZone page. I read the DeveloperZone page (on the main wiki) and thought you wanted the glossary moved over to the pages linked there, and since the first ones are for the Trac wiki, I (mis)understood you wanted the glossary pushed over there. Sorry for the confusion.

Cheers, f

From mithrandir42 at web.de Thu Feb 9 22:02:02 2006
From: mithrandir42 at web.de (N. Volbers)
Date: Thu Feb 9 22:02:02 2006
Subject: [Numpy-discussion] Using ndarray for 2-dimensional, heterogeneous data
Message-ID: <43EC2C21.9020509@web.de>

Hello everyone,

I am re-thinking the design of my evaluation software, but I am not quite sure if I am making the right decision, so let me state my problem:

I am writing a simple evaluation program to read scientific (ASCII) data and plot it both via gnuplot and matplotlib. The data is typically very simple: numbers arranged in columns. Before numpy I was using Numeric arrays to store this data in a list of 1-dimensional arrays, e.g.:

a = [ array([1,2,3,4]), array([2.3,17.2,19.1,22.2]) ]

This layout made it very easy to add, remove or rearrange columns, because these were simple list operations. It also had the nice effect of allowing different data types for different columns. However, row access was hard and I had to use my own iterator object to do so.

When I read about heterogeneous arrays in numpy I started a new implementation which would store the same data as above like this:

b = numpy.array( [(1,2,3,4), (2.3,17.2,19.1,22.2)],
dtype={'names':['col1','col2'], 'formats': ['i2','f4']})

Row operations are much easier now, because I can use numpy's intrinsic capabilities. However column operations require creating a new array based on the old one.

Now I am wondering if the use of such an array has more drawbacks that I am not aware of. E.g. is it possible to mask values in such an array?

And is it slower to get a certain column by using b['col1'] than it would using a homogeneous array c and the notation c[:,0]?

Does anyone else use such a data layout and can report on problems with it?

Best regards,

Niklas Volbers.

From mithrandir42 at web.de Thu Feb 9 22:12:02 2006
From: mithrandir42 at web.de (N. Volbers)
Date: Thu Feb 9 22:12:02 2006
Subject: [Numpy-discussion] Using ndarray for 2-dimensional, heterogeneous data
In-Reply-To: <43EC2C21.9020509@web.de>
References: <43EC2C21.9020509@web.de>
Message-ID: <43EC2E8A.2040200@web.de>

N. Volbers wrote:
> Hello everyone,
>
> I am re-thinking the design of my evaluation software, but I am not
> quite sure if I am making the right decision, so let me state my problem:
>
> I am writing a simple evaluation program to read scientific (ASCII)
> data and plot it both via gnuplot and matplotlib. The data is
> typically very simple: numbers arranged in columns. Before numpy I was
> using Numeric arrays to store this data in a list of 1-dimensional
> arrays, e.g.:
>
> a = [ array([1,2,3,4]), array([2.3,17.2,19.1,22.2]) ]
>
> This layout made it very easy to add, remove or rearrange columns,
> because these were simple list operations. It also had the nice effect
> of allowing different data types for different columns. However, row
> access was hard and I had to use my own iterator object to do so.
>
> When I read about heterogeneous arrays in numpy I started a new
> implementation which would store the same data as above like this:
>
> b = numpy.array( [(1,2,3,4), (2.3,17.2,19.1,22.2)],
> dtype={'names':['col1','col2'], 'formats': ['i2','f4']})
>

Sorry, I meant of course

b = numpy.array( [(1,2.3), (2, 17.2), (3, 19.1), (4, 22.2)],
dtype={'names':['col1','col2'], 'formats': ['i2','f4']})

> Row operations are much easier now, because I can use numpy's
> intrinsic capabilities. However column operations require creating a
> new array based on the old one.
>
> Now I am wondering if the use of such an array has more drawbacks that
> I am not aware of. E.g. is it possible to mask values in such an array?
>
> And is it slower to get a certain column by using b['col1'] than it
> would using a homogeneous array c and the notation c[:,0]?
>
> Does anyone else use such a data layout and can report on problems
> with it?

The mathematical operations I want to use will be limited to operations acting on the column, e.g. creating a new column = b['col1'] + b['col2'] and such. So of course I am aware of the basic difference that slicing works differently if I have a heterogeneous array, due to the fact that each row is considered a single item.

Niklas.

From oliphant.travis at ieee.org Thu Feb 9 22:28:02 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Thu Feb 9 22:28:02 2006
Subject: [Numpy-discussion] arange(start, stop, step) and floating point (Ticket #8)
In-Reply-To:
References: <43EA6A34.2000202@gmail.com> <43EA7FE7.2040902@cox.net> <43EA8867.5080109@ee.byu.edu> <43EB7FE5.1050000@cox.net> <43EBA615.8020101@ieee.org> <43EBB782.1010509@cox.net>
Message-ID: <43EC3261.2060601@ieee.org>

Sasha wrote:

>Well, my results are different.
>
>SVN r2087:
>
>>python -m timeit -s "from numpy import arange" "arange(10000.0)"
>>
>10000 loops, best of 3: 21.1 usec per loop
>
>SVN r2088:
>
>>python -m timeit -s "from numpy import arange" "arange(10000.0)"
>>
>10000 loops, best of 3: 25.6 usec per loop
>
>I am using gcc version 3.3.4 with the following flags: -msse2
>-mfpmath=sse -fno-strict-aliasing -DNDEBUG -g -O3.
>
>The timing is consistent with the change in the DOUBLE_fill loop:
>
>r2087:
> 1b8f0: f2 0f 11 08   movsd %xmm1,(%eax)
> 1b8f4: f2 0f 58 ca   addsd %xmm2,%xmm1
> 1b8f8: 83 c0 08      add $0x8,%eax
> 1b8fb: 39 c8         cmp %ecx,%eax
> 1b8fd: 72 f1         jb 1b8f0
>
>r2088:
> 1b9d0: f2 0f 2a c2   cvtsi2sd %edx,%xmm0
> 1b9d4: 42            inc %edx
> 1b9d5: f2 0f 59 c1   mulsd %xmm1,%xmm0
> 1b9d9: f2 0f 58 c2   addsd %xmm2,%xmm0
> 1b9dd: f2 0f 11 00   movsd %xmm0,(%eax)
> 1b9e1: 83 c0 08      add $0x8,%eax
> 1b9e4: 39 ca         cmp %ecx,%edx
> 1b9e6: 7c e8         jl 1b9d0
>

Nice to see some real hacking on this list :-)

>My change may be worth committing because the C code is shorter and
>arguably more understandable (at least by Fortran addicts :-).
>Travis?
>

Yes, I think it's worth submitting. Most of the suggestions for pointer arithmetic for fast C code were developed when processors spent more time computing than fetching memory. Now it seems it's all about fetching memory intelligently. The buffer[i]= style is even recommended according to the AMD optimization book Sasha pointed out.

So, I say go ahead unless somebody can point out something we are missing...

-Travis

From pearu at scipy.org Thu Feb 9 23:15:02 2006
From: pearu at scipy.org (Pearu Peterson)
Date: Thu Feb 9 23:15:02 2006
Subject: [Numpy-discussion] Is there a known problem compiling numpy with VC7? Success
In-Reply-To: <43EBE721.6000602@cox.net>
References: <43E41E52.6060805@cox.net> <43E42B26.2090304@ee.byu.edu> <43E42C2C.6000208@cox.net> <43E45A95.6040601@ieee.org> <43E81650.2040204@cox.net> <43E81BFA.7060600@ieee.org> <43E82930.7070103@cox.net> <43E83FD3.90206@ieee.org> <43E8EE72.6070101@cox.net> <43E8EFE9.5040207@ieee.org> <43E9235E.70004@cox.net> <43E94090.5080609@ieee.org> <43EA3DE0.1070608@cox.net> <43EB695D.9050501@ieee.org> <43EB6DF5.6010705@cox.net> <43EB7BE3.70706@ieee.org> <43EB8AFE.2060704@cox.net> <43EBC5DF.9090709@cox.net> <43EBE721.6000602@cox.net>
Message-ID:

On Thu, 9 Feb 2006, Tim Hochberg wrote:

> Tim Hochberg wrote:
>
>> Pearu Peterson wrote:
>>
>>> On Thu, 9 Feb 2006, Tim Hochberg wrote:
>>>
>>>> I had this fantasy that default_lib_dirs would get picked up
>>>> automagically; however that does not happen. I still ended up putting:
>>>>
>>>> from numpy.distutils import system_info
>>>> library_dirs = system_info.default_lib_dirs
>>>> result = config_cmd.try_run(tc,include_dirs=[python_include],
>>>> library_dirs=library_dirs)
>>>>
>>>> into setup.py. Is that acceptable? It's not very elegant.
>>>
>>> No, don't use system_info.default_lib_dirs.
>>>
>>> Use distutils.sysconfig.get_python_lib() to get the directory that
>>> contains Python library.
>>
>> That's the wrong library. Get_python_lib gives you the location of the
>> python standard library, not the location of python24.lib. The former being
>> python24/Lib (or python24/Lib/site-packages depending what options you feed
>> get_python_lib) and the latter being python24/libs on my box.

Ok, but using system_info.default_lib_dirs is still wrong, this list is not designed for this purpose..

> To follow up on this a little bit, I investigated how distutils itself finds
> python24.lib. It turns out that it is in build_ext.py, near line 168. The
> relevant code is:
>
> # also Python's library directory must be appended to library_dirs
> if os.name == 'nt':
>     self.library_dirs.append(os.path.join(sys.exec_prefix, 'libs'))

Hmm, this should also be effective for numpy.distutils. self.library_dirs and other such attributes are used in the distutils.command.build_ext.run() method while our numpy.distutils.command.build_ext.run() doesn't. So, all we have to do is update the numpy.distutils.command.build_ext.run() method to resolve this issue. This should also fix the rpath issues that were reported on this list for certain platforms. I'll look at fixing it today..

Pearu

From wbaxter at gmail.com Thu Feb 9 23:22:03 2006
From: wbaxter at gmail.com (Bill Baxter)
Date: Thu Feb 9 23:22:03 2006
Subject: [Numpy-discussion] Re: NumPy Glossary Was:Matlab page on scipy wiki
In-Reply-To: <43EC26F1.2050101@colorado.edu>
References: <20060210044507.32E368884A@sc8-sf-spam1.sourceforge.net> <200602100507.k1A5753M023391@oobleck.astro.cornell.edu> <43EC208E.5000705@colorado.edu> <200602100531.k1A5VFjs023566@oobleck.astro.cornell.edu> <43EC26F1.2050101@colorado.edu>
Message-ID:

Well, I'm with Fernando. Wikis are meant for having people muck with them. I am much more annoyed, for instance, by the "NumPy Tutorial" teaser link on the main documentation page that goes to nowhere than I would be by a half-finished page that acknowledges it's half finished. If I'd known I was supposed to go to some Dev Zone and put in a provisional page or whatnot I probably wouldn't have made the NumPy for Matlab users page. Which is still half finished, thank-you-very-much, but nonetheless still contains useful information.
More useful information than the "NumPy Tutorial" link at least. :-)

--bb

On 2/10/06, Fernando Perez wrote:
> Joe Harrington wrote:
>
> >>Why? Did you read the argument I made for putting it on the main wiki? How are
> >>you going to get contributions on the dev wiki once anonymous edits are locked
> >>out (which they will hopefully be very soon, before the wiki is spammed out of
> >>recognition)?
> >
> > Fernando, that page *is* on the main wiki (I don't deal with the
> > developers' wiki at all). Go to scipy.org, click on Developer Zone in
> > the navigation tabs, scroll down to DOCUMENTATION: Projects.
>
> I misunderstood something: I thought you wanted it moved over to the dev wiki,
> which is the first link on the DeveloperZone page. I read the DeveloperZone
> page (on the main wiki) and thought you wanted the glossary moved over to the
> pages linked there, and since the first ones are for the Trac wiki, I
> (mis)understood you wanted the glossary pushed over there. Sorry for the
> confusion.
>
> Cheers,
>
> f

--
William V. Baxter III
OLM Digital
Kono Dens Building Rm 302
1-8-8 Wakabayashi Setagaya-ku
Tokyo, Japan 154-0023
+81 (3) 3422-3380
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From faltet at carabos.com Thu Feb 9 23:40:03 2006
From: faltet at carabos.com (Francesc Altet)
Date: Thu Feb 9 23:40:03 2006
Subject: [Numpy-discussion] Using ndarray for 2-dimensional, heterogeneous data
In-Reply-To: <43EC2E8A.2040200@web.de>
References: <43EC2C21.9020509@web.de> <43EC2E8A.2040200@web.de>
Message-ID: <1139557152.7537.9.camel@localhost.localdomain>

On Friday 10 February 2006 at 06:11 +0000, N. Volbers wrote:
> N. Volbers wrote:
> Sorry, I meant of course
>
> b = numpy.array( [(1,2.3), (2, 17.2), (3, 19.1), (4, 22.2)],
> dtype={'names':['col1','col2'], 'formats': ['i2','f4']})
>
> > Row operations are much easier now, because I can use numpy's
> > intrinsic capabilities. However column operations require creating a
> > new array based on the old one.

Yes, but this should be a pretty fast operation, as no data copy is implied in doing b['col1'], for example.

> > Now I am wondering if the use of such an array has more drawbacks that
> > I am not aware of. E.g. is it possible to mask values in such an array?

I'm not familiar with masked arrays, but my understanding is that such column arrays are the same as regular arrays, so I'd say yes.

> > And is it slower to get a certain column by using b['col1'] than it
> > would using a homogeneous array c and the notation c[:,0]?

Well, you should do some benchmarks, but I'd be surprised if there is a big speed difference.

> > Does anyone else use such a data layout and can report on problems
> > with it?

I use column data *a lot* in numarray and had no problems with this. With NumPy things should be similar in terms of stability.

> The mathematical operations I want to use will be limited to operations
> acting on the column, e.g. creating a new column = b['col1'] + b['col2']
> and such. So of course I am aware of the basic difference that slicing
> works differently if I have a heterogeneous array, due to the fact that
> each row is considered a single item.
Exactly, these array columns are the same as regular homogeneous arrays. The only difference is that there is a 'hole' between elements. However, this is handled internally by NumPy through the use of the stride.
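For example, here is a quick sketch using the dtype from your message (if I am not mistaken, the field access returns a strided view, although I have not timed or double-checked this):

import numpy

b = numpy.array([(1, 2.3), (2, 17.2), (3, 19.1), (4, 22.2)],
                dtype={'names': ['col1', 'col2'], 'formats': ['i2', 'f4']})

col = b['col1']   # a strided view on b's buffer, not a copy
col[0] = 99       # so writing through the column...
print b[0]        # ...shows up in the original record array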
My 2 cents,

--
>0,0< Francesc Altet     http://www.carabos.com/
V V Cárabos Coop. V.   Enjoy Data
 "-"

From pearu at scipy.org Fri Feb 10 03:10:03 2006
From: pearu at scipy.org (Pearu Peterson)
Date: Fri Feb 10 03:10:03 2006
Subject: [Numpy-discussion] BUG: numpy.put raises TypeError.
Message-ID:

Hi,

There seems to be a bug in the numpy.put function (i.e. the array.put method). Consider the following example:

a = array([0, 0, 0, 0, 0])
a.put([1.1],[2])         # works as documented
a.put(array([1.1]),[2])  # raises the following exception:
TypeError: array cannot be safely cast to required type

The bug seems to boil down to calling the PyArray_FromArray function, but I got a bit lost on debugging this issue..

Pearu

From pearu at scipy.org Fri Feb 10 03:30:06 2006
From: pearu at scipy.org (Pearu Peterson)
Date: Fri Feb 10 03:30:06 2006
Subject: [Numpy-discussion] BUG(?): array([None])==None test
Message-ID:

Hi,

While converting some Numeric based code to numpy, I noticed that in Numeric

array([None])==None

returns array([1]) while in numpy it returns False.

Is this expected behaviour or a bug?

Pearu

From svetosch at gmx.net Fri Feb 10 03:36:02 2006
From: svetosch at gmx.net (Sven Schreiber)
Date: Fri Feb 10 03:36:02 2006
Subject: [Numpy-discussion] Matlab page on scipy wiki
In-Reply-To: <43EBEBD8.4050705@colorado.edu>
References: <43EBDF88.6040504@colorado.edu> <43EBEBD8.4050705@colorado.edu>
Message-ID: <43EC7A6E.2020702@gmx.net>

Fernando Perez schrieb:
> Bill Baxter wrote:
>> For what it's worth, matlab's rank function just calls svd, and returns the
>> number of singular values greater than a tolerance. The implementation is a
>> whopping 5 lines long.
>
> Yup, and it would be pretty much the same 5 lines in numpy, with the
> same semantics.
>
> Here's a quick and dirty implementation for old-scipy (I don't have
> new-scipy on this box):

Is there any reason not to use the algorithm implicit in lstsq, as in:

rk = linalg.lstsq(M, ones(p))[2]

(where M is the matrix to check, and p==M.shape[0])

thanks,
sven

From stefan at sun.ac.za Fri Feb 10 04:24:01 2006
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Fri Feb 10 04:24:01 2006
Subject: [Numpy-discussion] Matlab page on scipy wiki
In-Reply-To:
References:
Message-ID: <20060210122242.GA21950@sun.ac.za>

On Thu, Feb 09, 2006 at 05:21:10PM +0900, Bill Baxter wrote:
> I added some content to the "NumPy/SciPy for Matlab users" page on the scipy
> wiki.
>
> But my knowledge of NumPy/SciPy isn't sufficient to fill in the whole chart of
> equivalents that I laid out.

One of my colleagues also asked about the shortest way to do array concatenation. In Octave that would be

[1, 0, 1:4, 0, 1]

Using numpy we currently do

concatenate([[1, 0], arange(1,5), [0, 1]])

or vstack(...)
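or, if I am not mistaken, hstack, which for 1-d pieces should amount to the same concatenation (a sketch, untested, assuming hstack accepts a mix of lists and arrays):

from numpy import arange, hstack
print hstack(([1, 0], arange(1, 5), [0, 1]))  # -> [1 0 1 2 3 4 0 1]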
The "+" operator now means something else, so you can't do

[1,0] + arange(1,5) + [0,1]

while

[1, 0, arange(1,5), 0, 1]

produces

[1, 0, array([1, 2, 3, 4]), 0, 1]

which can't be converted to an array by simply doing array([[1, 0, array([1, 2, 3, 4]), 0, 1]]).

I'll add it to the wiki once I know what the best method is.

Regards
Stéfan

From aisaac at american.edu Fri Feb 10 06:20:01 2006
From: aisaac at american.edu (Alan G Isaac)
Date: Fri Feb 10 06:20:01 2006
Subject: [Numpy-discussion] Matlab page on scipy wiki
In-Reply-To: <20060210122242.GA21950@sun.ac.za>
References: <20060210122242.GA21950@sun.ac.za>
Message-ID:

On Fri, 10 Feb 2006, Stefan van der Walt apparently wrote:
> In Octave that would be
> [1, 0, 1:4, 0, 1]
> Using numpy we currently do
> concatenate([[1, 0], arange(1,5), [0, 1]]) or
> vstack(...)

numpy.r_[1,0,range(1,5),0,1]

fwiw,
Alan Isaac

From cjw at sympatico.ca Fri Feb 10 07:14:07 2006
From: cjw at sympatico.ca (Colin J. Williams)
Date: Fri Feb 10 07:14:07 2006
Subject: [Numpy-discussion] Matlab page on scipy wiki
In-Reply-To:
References: <20060210122242.GA21950@sun.ac.za>
Message-ID: <43ECAD98.60802@sympatico.ca>

Alan G Isaac wrote:

>On Fri, 10 Feb 2006, Stefan van der Walt apparently wrote:
>>In Octave that would be
>>[1, 0, 1:4, 0, 1]
>>Using numpy we currently do
>>concatenate([[1, 0], arange(1,5), [0, 1]]) or
>>vstack(...)
>
>numpy.r_[1,0,range(1,5),0,1]
>
>fwiw,
>Alan Isaac

This seems to be a neat idea but not in the usual Python style.

>>> help(numpy.r_)
Help on concatenator in module numpy.lib.index_tricks object:

class concatenator(__builtin__.object)
 | Translates slice objects to concatenation along an axis.
 |
 | Methods defined here:
 |
 | __getitem__(self, key)
 |
 | __getslice__(self, i, j)
 |
 | __init__(self, axis=0, matrix=False)
 |
 | __len__(self)
 |
 | ----------------------------------------------------------------------
 | Data and other attributes defined here:
 |
 | __dict__ =
 | dictionary for instance variables (if defined)
 |
 | __weakref__ =
 | list of weak references to the object (if defined)

The help refers to concatenator; presumably r_ is a synonym, but that name is not available to the user:

>>> numpy.concatenator
Traceback (most recent call last):
 File "", line 1, in ?
AttributeError: 'module' object has no attribute 'concatenator'
>>>

If r_ is a class, couldn't it have a more mnemonic name and, in the usual Python style, start with an upper case letter?

help(numpy.r_.__init__)
Help on method __init__ in module numpy.lib.index_tricks:

__init__(self, axis=0, matrix=False) unbound numpy.lib.index_tricks.concatenator method
>>>

Colin W.

From jh at oobleck.astro.cornell.edu Fri Feb 10 08:17:02 2006
From: jh at oobleck.astro.cornell.edu (Joe Harrington)
Date: Fri Feb 10 08:17:02 2006
Subject: [Numpy-discussion] Re: NumPy Glossary Was:Matlab page on scipy wiki
In-Reply-To: <20060210072301.9608D12D10@sc8-sf-spam2.sourceforge.net> (numpy-discussion-request@lists.sourceforge.net)
References: <20060210072301.9608D12D10@sc8-sf-spam2.sourceforge.net>
Message-ID: <200602101616.k1AGG0Vj026623@oobleck.astro.cornell.edu>

Look, people are making this into something it isn't, and hating that. I'm the last person in the world to want rules and bureaucrazy! However, to make the wiki accomplish our goals, we need to agree on some basic standards of judgement that we apply to ourselves.
We did agree on some goals after SciPy '04: we want Python to be the environment of choice for EVERYONE to do numerical manipulation and data display, and to get there we need to present ourselves well (see the ASP roadmap, linked at the bottom of DevZone). I'm going to lay out my reasoning why a particular workflow will reach that goal. The idea is to take full advantage of the Wiki Way without having the liabilities of the Wiki Way. I'm sorry this is long. I have three proposals due Thursday and I don't have time to edit much.

Bill, your argument is that you want to see a work in progress, something you and anyone else can just go and contribute to whenever you see a need, and that benefits from others making contributions continually. That's great from the point of view of an experienced user of array math (in Python or otherwise) who is used to and interested in contributing in the Wiki Way. I'm sure most list members, even of the scipy-user list, are still in this category. It is therefore crucial for all of us to remember that such people are *not* the main viewers of the site, at least not in the future we said we hoped for, and probably not even now.

Most viewers are dyed-in-the-wool users and potential users. They want to see a clean, professional, straightforward site with the simple life laid out before them: bulletproof, current, binary installs for all their preferred platform(s); readable, grammatical, complete, current, well-assembled documentation for beginners and experts, both tutorial and reference; examples, screenshots, demos; lots of good, indexed, well-documented, shared software; and an active community. "Under construction" is an annoyance at best, and a deal-killer to many. They may contribute someday, but that's not in their minds yet. Recall that we have in our vision not just practicing scientists, but also secondary-school students and their teachers, students taking their only math or science course in college (under direction from their professors and TAs), and even photographers and others who would have need of the manipulations NumPy allows without necessarily understanding them.

Our failure to gather a large following to date is largely due to our not (yet) delivering on the site/project vision above. The reunification that is NumPy allows us to change that. There are a LOT of people lurking, waiting for things to clean up and professionalize. Once they jump in, they'll tell their peers, who have never heard of us, and so on.

The pure Wiki Way will never produce the site that will start this hoped-for avalanche of ordinary users. Wikis are always under construction, always bleeding-edge, always overdone in areas where their developers have interest, and always weak in crucial places where they don't. That said, wikis are *great* incubators. Wikis get individual subprojects completed fast and well by breaking down the barriers to contribution and review. DevZone tries to take advantage of the Wiki Way while still producing a polished site.

The point of DevZone is twofold: first, focus contributors looking for a project on the things that most need help: our weak spots, like documentation. I'm getting about an email a week from new people looking to help. This wasn't the case a month ago or before; it was more like 1-2 a year. Second, isolate the worst of "under construction" from the general view, so we don't look like a pile of stubs, go-nowhere links, and abandoned or outdated projects.
The second item probably makes more sense for larger projects (a user's guide, etc.) than for pages like the glossary.

The model for workflow is that projects are announced to the lists and immediately started in DevZone. When they reach a basic level of completeness and cleanliness, something like a 1.0 release, they get a link from the main site. When they are no longer in need of new helpers or when their development curve levels off, the DevZone link goes away.

Right now, it's pretty loose when to put the link in from the main site to a project's page. Anyone can do it. To avoid a Wiki War, we need to have some common vision for how much of a construction zone we want the main site to be.

The development model I'm discussing is a little like the Plone model, except that there is no elite administrator who decides when things move up to the main site. We dumped Plone for performance reasons, and because the administrators were too busy with their Enthought work to have time to do much on the site. But, the basic Plone idea of internal incubation is a good one.

In my ideal world, a group (potentially including everyone, at least to a small degree) would write a (large) doc and periodically ask the list to take a look and comment. At some point, they'd ask for objections to putting it on the main site. They'd try to satisfy any objections, and then put it up. I'd trust the authors of a smaller doc (that therefore was both easier to write and had more people willing to give a quick review) to make the decision to promote to the main site themselves. This is exactly what code developers do when cutting releases: ensure that when a version goes public, it is reasonably clean, consistent, and complete.

At the moment, some pages, most notably Documentation, are a mess. Clear out all the "incompletes", and those pages will look like Cookbook: cool but with gaping holes. I think that would be an improvement, particularly if we state on those pages that documents in early stages of construction are in DevZone, and provide a link. We could link those docs at the bottom of Documentation, but there is a point in a project's life cycle when it would be linked both from the main site and in DevZone. Do we want those projects to have two links on the same page? And do we really want all that "under construction" in people's faces?

In the future, we will have good docs and a more spanning set of recipes. At that point, if we have embraced the pure Wiki Way, we will have a hard time agreeing no longer to do early construction in the main site. The loads of construction between the gems will turn away many of the huge class of non-expert users. SciPy will thus gain fewer users, and therefore attract fewer contributors and grow more slowly.

My point now is to get our community culture to include a sense of professionalism and pride about what we present to the world. Unless you're a fool or you have no competition, you dress well for a job interview. We're not fools, and we have very healthy competition. The main site is the first impression we make on new users. My goal is to prevent it from being the last.

--jh--

From cjw at sympatico.ca Fri Feb 10 08:26:06 2006
From: cjw at sympatico.ca (Colin J. Williams)
Date: Fri Feb 10 08:26:06 2006
Subject: [Numpy-discussion] Matlab page on scipy wiki
In-Reply-To:
References:
Message-ID: <43EC96F5.6020500@sympatico.ca>

David M. Cooke wrote:
>Bill Baxter writes:
>
>>Some kind soul added 'svd' and 'inv' to the NumPy/SciPy columns, but those
>>don't seem to be defined, at least for the versions of NumPy/SciPy that I
>>have. Are they new? Or are they perhaps defined by a 3rd package in your
>>environment?
>
>They're in numpy.linalg.
>
>>By the way, is there any python way to tell which package a symbol is coming
>>from?
>
>Check its __module__ attribute.

Yes, but not all objects have this attribute and some do not yet have a docstring.

Colin W.

From bblais at bryant.edu Fri Feb 10 08:40:04 2006
From: bblais at bryant.edu (Brian Blais)
Date: Fri Feb 10 08:40:04 2006
Subject: [Numpy-discussion] gnuplot problem with numpy
Message-ID: <43ECC154.1000004@bryant.edu>

Hello,

I have been trying to use the Gnuplot1.7.py module, but it doesn't seem to work with numpy (although it works with Numeric). The following code plots two "identical" sets of data, but the numpy data gets rounded to the nearest integer when passed to Gnuplot. What is odd is that the offending code in utils.py is the function float_array(m), which does the conversion that I do in this script, but it doesn't seem to work. Any ideas?

#----------------------------
import numpy
import Numeric
import Gnuplot

g = Gnuplot.Gnuplot(debug=1)

dh=.1;
x=numpy.arange(dh,2+dh,dh,'d')
y1 = x**2
y2=y1

d1 = Gnuplot.Data(x, y1, title='numpy', with='points')  # doesn't work
d2 = Gnuplot.Data(Numeric.asarray(x,'f'), Numeric.asarray(y2,'f'),
                  title='Numeric', with='points')  # works

g.plot(d1,d2)
#----------------------------

thanks,

bb

--
-----------------
bblais at bryant.edu
http://web.bryant.edu/~bblais

From Fernando.Perez at colorado.edu Fri Feb 10 09:15:02 2006
From: Fernando.Perez at colorado.edu (Fernando Perez)
Date: Fri Feb 10 09:15:02 2006
Subject: [Numpy-discussion] Matlab page on scipy wiki
In-Reply-To: <43EC7A6E.2020702@gmx.net>
References: <43EBDF88.6040504@colorado.edu> <43EBEBD8.4050705@colorado.edu> <43EC7A6E.2020702@gmx.net>
Message-ID: <43ECC9D0.8070809@colorado.edu>

Sven Schreiber wrote:
> Fernando Perez schrieb:
>
>>Bill Baxter wrote:
>>
>>>For what it's worth, matlab's rank function just calls svd, and
>>>returns the
>>>number of singular values greater than a tolerance. The implementation is a
>>>whopping 5 lines long.
>>
>>Yup, and it would be pretty much the same 5 lines in numpy, with the
>>same semantics.
>>
>>Here's a quick and dirty implementation for old-scipy (I don't have
>>new-scipy on this box):
>
> Is there any reason not to use the algorithm implicit in lstsq, as in:
> rk = linalg.lstsq(M, ones(p))[2]

Simplicity? lstsq goes through a lot of contortions (needed for other reasons), and uses lapack's *gelss. If you read its man page:

PURPOSE
 DGELSS computes the minimum norm solution to a real linear least
 squares problem:

 Minimize 2-norm(| b - A*x |).

 using the singular value decomposition (SVD) of A. A is an M-by-N
 matrix which may be rank-deficient.

 Several right hand side vectors b and solution vectors x can be
 handled in a single call; they are stored as the columns of the
 M-by-NRHS right hand side matrix B and the N-by-NRHS solution
 matrix X.

 The effective rank of A is determined by treating as zero those
 singular values which are less than RCOND times the largest singular
 value.

So you've gone through all that extra complexity, to get back what a direct call to svd would give you (caveat: the quick version I posted used absolute tolerance, while this one is relative; that can be trivially fixed).
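For the record, the relative-tolerance version stays just as short. A sketch (untested here, assuming only that svd returns the singular values in decreasing order):

import numpy.linalg as linalg

def rank(A, rcond=1e-12):
    # Count the singular values greater than rcond times the largest one.
    # (Discarding u and vt is wasteful; if your svd can be told to skip
    # computing them, do so.)
    u, s, vt = linalg.svd(A)
    return int((s > s[0]*rcond).sum())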
Given that a direct SVD call fits the definition of what we are computing (a numerical estimation of a matrix rank), I completely fail to see the point of going through several additional layers of unnecessary complexity, which both add cost and obscure the intent of the calculation.

But perhaps I'm missing something...

Cheers,

f

From tim.hochberg at cox.net Fri Feb 10 09:41:12 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Fri Feb 10 09:41:12 2006
Subject: [Numpy-discussion] Is there a known problem compiling numpy with VC7? Success
In-Reply-To:
References: <43E41E52.6060805@cox.net> <43E42B26.2090304@ee.byu.edu> <43E42C2C.6000208@cox.net> <43E45A95.6040601@ieee.org> <43E81650.2040204@cox.net> <43E81BFA.7060600@ieee.org> <43E82930.7070103@cox.net> <43E83FD3.90206@ieee.org> <43E8EE72.6070101@cox.net> <43E8EFE9.5040207@ieee.org> <43E9235E.70004@cox.net> <43E94090.5080609@ieee.org> <43EA3DE0.1070608@cox.net> <43EB695D.9050501@ieee.org> <43EB6DF5.6010705@cox.net> <43EB7BE3.70706@ieee.org> <43EB8AFE.2060704@cox.net> <43EBC5DF.9090709@cox.net> <43EBE721.6000602@cox.net>
Message-ID: <43ECCFFF.2000209@cox.net>

Pearu Peterson wrote:

> On Thu, 9 Feb 2006, Tim Hochberg wrote:
>
>> Tim Hochberg wrote:
>>
>>> Pearu Peterson wrote:
>>>
>>>> On Thu, 9 Feb 2006, Tim Hochberg wrote:
>>>>
>>>>> I had this fantasy that default_lib_dirs would get picked up
>>>>> automagically; however that does not happen. I still ended up
>>>>> putting:
>>>>>
>>>>> from numpy.distutils import system_info
>>>>> library_dirs = system_info.default_lib_dirs
>>>>> result =
>>>>> config_cmd.try_run(tc,include_dirs=[python_include],
>>>>> library_dirs=library_dirs)
>>>>>
>>>>> into setup.py. Is that acceptable? It's not very elegant.
>>>>
>>>> No, don't use system_info.default_lib_dirs.
>>>>
>>>> Use distutils.sysconfig.get_python_lib() to get the directory that
>>>> contains Python library.
>>>
>>> That's the wrong library. Get_python_lib gives you the location of
>>> the python standard library, not the location of python24.lib. The
>>> former being python24/Lib (or python24/Lib/site-packages depending
>>> what options you feed get_python_lib) and the latter being
>>> python24/libs on my box.
>
> Ok, but using system_info.default_lib_dirs is still wrong, this list
> is not designed for this purpose..

OK.

>> To follow up on this a little bit, I investigated how distutils
>> itself finds python24.lib. It turns out that it is in build_ext.py,
>> near line 168. The relevant code is:
>>
>> # also Python's library directory must be appended to library_dirs
>> if os.name == 'nt':
>>     self.library_dirs.append(os.path.join(sys.exec_prefix,
>> 'libs'))
>
> Hmm, this should also be effective for numpy.distutils.
> self.library_dirs and other such attributes are used in the
> distutils.command.build_ext.run() method while our
> numpy.distutils.command.build_ext.run() doesn't. So, all we have to do
> is update the numpy.distutils.command.build_ext.run() method to resolve
> this issue. This should also fix the rpath issues that were reported on
> this list for certain platforms. I'll look at fixing it today..

While you're looking at it, keep in mind that the original failure that I was trying to fix occurs when numpy/core/setup.py calls config_cmd.try_run. I'm not certain, but I suspect that this isn't going to go through numpy.distutils.command.build_ext.
One strategy would be to put a function somewhere appropriate that returns these extra library directories and call it from both numpy.distutils.command.build_ext and numpy/core/setup.py. It could look like:

def get_extra_library_dirs():
    if os.name == 'nt':
        return [os.path.join(sys.exec_prefix, 'libs')]
    else:
        return []

I'm not sure what would be an appropriate place for it though.

-tim

> Pearu

From tim.hochberg at cox.net Fri Feb 10 11:17:01 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Fri Feb 10 11:17:01 2006
Subject: [Numpy-discussion] Test test_minrelpath.py on unix for me
Message-ID: <43ECE677.2040104@cox.net>

Could someone try the attached diff on a unixy system? It works under windows, but it's easy to mess up those \/'s.

Thanks,

-tim

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fix_minrelpathtests.diff
URL:

From nicolist at limare.net Fri Feb 10 12:25:04 2006
From: nicolist at limare.net (Nico)
Date: Fri Feb 10 12:25:04 2006
Subject: [Numpy-discussion] array shift and |=, copy, and backward operations
Message-ID: <43ECF661.1010003@limare.net>

Hi!

a = array([0,1,0,0])
a[1:] |= a[:-1]

gives the unexpected result [0 1 1 1] instead of [0 1 1 0] because python performs the |= on the first cell, then on the second, and so on.

I found two ways to get it right, with a copy:

b = a.copy()
a[1:] |= b[:-1]

or working backward:

a[-1:1:-1] |= a[-2:0:-1]

which is better, in terms of speed (and memory management), for large 3D arrays?

--
Nico

From nicolist at limare.net Fri Feb 10 12:47:02 2006
From: nicolist at limare.net (Nico)
Date: Fri Feb 10 12:47:02 2006
Subject: [Numpy-discussion] array shift and |=, copy, and backward operations
In-Reply-To: <43ECF661.1010003@limare.net>
References: <43ECF661.1010003@limare.net>
Message-ID: <43ECFB86.6030600@limare.net>

> a = array([0,1,0,0])
> a[1:] |= a[:-1]
>
> gives the unexpected result [0 1 1 1] instead of [0 1 1 0] because
> python performs the |= on the first cell, then on the second, and so on.
>
> I found two ways to get it right, with a copy:
> b = a.copy()
> a[1:] |= b[:-1]
>
> or working backward:
> a[-1:1:-1] |= a[-2:0:-1]

I finally noticed that

a = array([0,1,0,0])
a[1:] |= a[:-1] | False

also works, but I can't figure out why...

--
Nico

From tim.hochberg at cox.net Fri Feb 10 12:51:02 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Fri Feb 10 12:51:02 2006
Subject: [Numpy-discussion] array shift and |=, copy, and backward operations
In-Reply-To: <43ECF661.1010003@limare.net>
References: <43ECF661.1010003@limare.net>
Message-ID: <43ECFC86.8020603@cox.net>

Nico wrote:

>Hi!
>
>a = array([0,1,0,0])
>a[1:] |= a[:-1]
>
>gives the unexpected result [0 1 1 1] instead of [0 1 1 0] because
>python performs the |= on the first cell, then on the second, and so on.
>
>I found two ways to get it right, with a copy:
>b = a.copy()
>a[1:] |= b[:-1]
>
You could also do:

a[1:] = a[1:] | a[:-1]

This is nearly the same as the copy version, but uses very slightly less space and is clearer IMO.

>or working backward:
>a[-1:1:-1] |= a[-2:0:-1]
>
>which is better, in terms of speed (and memory management), for large 3D
>arrays?
>

The backwards version will be better in terms of memory usage and almost certainly in terms of speed as well, since it avoids an extra copy and should have better locality of reference (also because of no extra copy). It's a little obscure though. I'd be tempted to do something like:

a_rev = a[::-1]
a_rev[:-1] |= a_rev[1:]
# a holds result.

Another question / suggestion is: are you using a[0]? This looks like an operation where you may well be throwing away a[0] when you are done anyway. If that is the case, would it work to use:

a[:-1] |= a[1:]
a = a[:-1]

This last will give you the same result as the others except that the first value will be missing.

-tim

From perry at stsci.edu Fri Feb 10 12:55:01 2006
From: perry at stsci.edu (Perry Greenfield)
Date: Fri Feb 10 12:55:01 2006
Subject: [Numpy-discussion] array shift and |=, copy, and backward operations
In-Reply-To: <43ECFB86.6030600@limare.net>
References: <43ECF661.1010003@limare.net> <43ECFB86.6030600@limare.net>
Message-ID: <5f2eeb776ec6e857cc6c92c73f0249bc@stsci.edu>

On Feb 10, 2006, at 3:45 PM, Nico wrote:

> I finally noticed that
>
> a = array([0,1,0,0])
> a[1:] |= a[:-1] | False
>
> also works, but I can't figure out why...

Because the expression on the right generates a new copy, thus eliminating the problem of overwriting itself.

From Chris.Barker at noaa.gov Fri Feb 10 13:34:04 2006
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Fri Feb 10 13:34:04 2006
Subject: [Numpy-discussion] Matlab page on scipy wiki
In-Reply-To:
References:
Message-ID: <43ED06A6.7040600@noaa.gov>

Bill Baxter wrote:
> By the way, is there any python way to tell which package a symbol is coming
> from?

Yes. Don't use "import *". There is a long tradition of using NumPy this way:

from numpy import *

But now I always use it this way:

import numpy as N

(or nx, or whatever short name you want). I like it, because it's always clear where stuff is coming from. numpy's addition of a number of methods for what used to be functions helps make this more convenient too.

"Namespaces are one honking great idea -- let's do more of those!"

from: http://www.python.org/doc/Humor.html#zen

-Chris

--
Christopher Barker, Ph.D.
Oceanographer
NOAA/OR&R/HAZMAT (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker at noaa.gov

From Chris.Barker at noaa.gov Fri Feb 10 13:58:03 2006
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Fri Feb 10 13:58:03 2006
Subject: [Numpy-discussion] Matlab page on scipy wiki
In-Reply-To: <43ECAD98.60802@sympatico.ca>
References: <20060210122242.GA21950@sun.ac.za> <43ECAD98.60802@sympatico.ca>
Message-ID: <43ED0C5D.6030300@noaa.gov>

Colin J. Williams wrote:
>> numpy.r_[1,0,range(1,5),0,1]
> This seems to be a neat idea but not in the usual Python style.

Exactly. Couldn't it at least get a meaningful, but short, name?

And is there a way to use it to concatenate along another axis? I couldn't see a way.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer
NOAA/OR&R/HAZMAT (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker at noaa.gov

From oliphant at ee.byu.edu Fri Feb 10 14:18:01 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Fri Feb 10 14:18:01 2006
Subject: [Numpy-discussion] Matlab page on scipy wiki
In-Reply-To:
References: <20060210122242.GA21950@sun.ac.za>
Message-ID: <43ED10D0.1000405@ee.byu.edu>

Alan G Isaac wrote:

>On Fri, 10 Feb 2006, Stefan van der Walt apparently wrote:
>>In Octave that would be
>>[1, 0, 1:4, 0, 1]
>>Using numpy we currently do
>>concatenate([[1, 0], arange(1,5), [0, 1]]) or
>>vstack(...)
>
>numpy.r_[1,0,range(1,5),0,1]

or even faster

numpy.r_[1,0,1:5,0,1]

The whole point of r_ is to allow you to use slice notation to build ranges easily.
I wrote it precisely to make it easier to construct arrays in a similar style to what Matlab allows.

-Travis

From oliphant at ee.byu.edu Fri Feb 10 14:29:05 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Fri Feb 10 14:29:05 2006
Subject: [Numpy-discussion] Matlab page on scipy wiki
In-Reply-To: <43ED0C5D.6030300@noaa.gov>
References: <20060210122242.GA21950@sun.ac.za> <43ECAD98.60802@sympatico.ca> <43ED0C5D.6030300@noaa.gov>
Message-ID: <43ED138C.7040301@ee.byu.edu>

Christopher Barker wrote:

> Colin J. Williams wrote:
>
>>> numpy.r_[1,0,range(1,5),0,1]
>>
>> This seems to be a neat idea but not in the usual Python style.
>
> Exactly. Couldn't it at least get a meaningful, but short, name?

It is meaningful :-) r_ means row concatenation... (but, it has taken on more functionality than that). What name do you suggest?

> And is there a way to use it to concatenate along another axis? I
> couldn't see a way.

Yes, add a string at the end with the number of the axis you want to concatenate along. But, you have to have that axis to start with or the result is no different. The default is to concatenate along the last axis. Thus (the ndmin keyword forces the array to have a minimum number of dimensions --- prepended):

a = array([1,2,3],ndmin=2)
b = array([1,2,3],ndmin=2)
c = r_[a,b,'0']
print c
[[1 2 3]
 [1 2 3]]

print r_[a,b,'1']
[[1 2 3 1 2 3]]

-Travis

From ndarray at mac.com Fri Feb 10 14:31:05 2006
From: ndarray at mac.com (Sasha)
Date: Fri Feb 10 14:31:05 2006
Subject: [Numpy-discussion] Matlab page on scipy wiki
In-Reply-To: <43ED10D0.1000405@ee.byu.edu>
References: <20060210122242.GA21950@sun.ac.za> <43ED10D0.1000405@ee.byu.edu>
Message-ID:

On 2/10/06, Travis Oliphant wrote:
> The whole point of r_ is to allow you to use slice notation to build
> ranges easily. I wrote it precisely to make it easier to construct
> arrays in a similar style to what Matlab allows.

Maybe it is just me, but r_ is rather unintuitive. I would expect something like this to be called "c" for "combine" or "concatenate." This is the name used by S+ and R.
From R manual:
"""
c                 package:base                 R Documentation

Combine Values into a Vector or List
...
Examples:
     c(1,7:9)
...
"""

From ndarray at mac.com Fri Feb 10 14:50:01 2006
From: ndarray at mac.com (Sasha)
Date: Fri Feb 10 14:50:01 2006
Subject: [Numpy-discussion] Matlab page on scipy wiki
In-Reply-To:
References: <20060210122242.GA21950@sun.ac.za> <43ED10D0.1000405@ee.byu.edu>
Message-ID:

To tell you the truth I dislike the trailing underscore much more than the choice of letter. In my code I will probably be renaming all these foo_ to delete the underscore; foo_(...) or foo_[...] is way too ugly for my taste. However I fully admit that it is just a matter of taste and it is trivial to rename things on import in Python.

PS: Trailing underscore reminds me of C++ - the language that I happily live without :-)

On 2/10/06, Ryan Krauss wrote:
> The problem is that c_ at least used to mean "column concatenate" and
> concatenate is too long to type.
>
> On 2/10/06, Sasha wrote:
> > On 2/10/06, Travis Oliphant wrote:
> > > The whole point of r_ is to allow you to use slice notation to build
> > > ranges easily. I wrote it precisely to make it easier to construct
> > > arrays in a similar style to what Matlab allows.
> >
> > Maybe it is just me, but r_ is rather unintuitive. I would expect
> > something like this to be called "c" for "combine" or "concatenate."
> > This is the name used by S+ and R.
> >
> > From R manual:
> > """
> > c                 package:base                 R Documentation
> > Combine Values into a Vector or List
> > ...
> > Examples:
> >      c(1,7:9)
> > ...
> > """

From ndarray at mac.com Fri Feb 10 14:58:02 2006
From: ndarray at mac.com (Sasha)
Date: Fri Feb 10 14:58:02 2006
Subject: [Numpy-discussion] Matlab page on scipy wiki
In-Reply-To:
References: <20060210122242.GA21950@sun.ac.za> <43ED10D0.1000405@ee.byu.edu>
Message-ID:

Actually, what would be wrong with a single letter "c" or "r" for the concatenator? NumPy already has one single-letter global identifier - "e", so it will not be against any naming standard. I don't think either "c" or "r" will conflict with anything in the standard library. I would still prefer "c" because "r" is taken by RPy.

On 2/10/06, Sasha wrote:
> To tell you the truth I dislike the trailing underscore much more than the
> choice of letter. In my code I will probably be renaming all these
> foo_ to delete the underscore; foo_(...) or foo_[...] is way too ugly
> for my taste. However I fully admit that it is just a matter of taste
> and it is trivial to rename things on import in Python.
>
> PS: Trailing underscore reminds me of C++ - the language that I
> happily live without :-)
>
> On 2/10/06, Ryan Krauss wrote:
> > The problem is that c_ at least used to mean "column concatenate" and
> > concatenate is too long to type.
> >
> > On 2/10/06, Sasha wrote:
> > > On 2/10/06, Travis Oliphant wrote:
> > > > The whole point of r_ is to allow you to use slice notation to build
> > > > ranges easily. I wrote it precisely to make it easier to construct
> > > > arrays in a similar style to what Matlab allows.
> > >
> > > Maybe it is just me, but r_ is rather unintuitive. I would expect
> > > something like this to be called "c" for "combine" or "concatenate."
> > > This is the name used by S+ and R.
> > >
> > > From R manual:
> > > """
> > > c                 package:base                 R Documentation
> > > Combine Values into a Vector or List
> > > ...
> > > Examples:
> > >      c(1,7:9)
> > > ...
> > > """

From efiring at hawaii.edu Fri Feb 10 14:59:03 2006
From: efiring at hawaii.edu (Eric Firing)
Date: Fri Feb 10 14:59:03 2006
Subject: [Numpy-discussion] Matlab page on scipy wiki
In-Reply-To: <43ED138C.7040301@ee.byu.edu>
References: <20060210122242.GA21950@sun.ac.za> <43ECAD98.60802@sympatico.ca> <43ED0C5D.6030300@noaa.gov> <43ED138C.7040301@ee.byu.edu>
Message-ID: <43ED1A71.1080605@hawaii.edu>

Travis Oliphant wrote:
> Christopher Barker wrote:
>
>> Colin J. Williams wrote:
>>
>>>> numpy.r_[1,0,range(1,5),0,1]
>>>
>>> This seems to be a neat idea but not in the usual Python style.
>>
>> Exactly. Couldn't it at least get a meaningful, but short, name?
>
> It is meaningful :-) r_ means row concatenation... (but, it has taken
> on more functionality than that). What name do you suggest?

"cat"? "rcat"? "catr"? "catter"?

Eric

From ndarray at mac.com Fri Feb 10 15:16:01 2006
From: ndarray at mac.com (Sasha)
Date: Fri Feb 10 15:16:01 2006
Subject: [Numpy-discussion] Matlab page on scipy wiki
In-Reply-To: <43ED1A71.1080605@hawaii.edu>
References: <20060210122242.GA21950@sun.ac.za> <43ECAD98.60802@sympatico.ca> <43ED0C5D.6030300@noaa.gov> <43ED138C.7040301@ee.byu.edu> <43ED1A71.1080605@hawaii.edu>
Message-ID:

I would be against any meaningful name because it will look too much like a function and people will be trying to use (...) instead of [...] with it. A single-letter identifier will look more like syntax and the concatenator is really just a clever way to take advantage of Python syntax that recognizes slices inside []. Novices may just think that something like c[1:3,9:20] is an array literal like r"xyz" for raw strings (another argument against "r"!).

On 2/10/06, Eric Firing wrote:
> Travis Oliphant wrote:
> > Christopher Barker wrote:
> >
> >> Colin J. Williams wrote:
> >>
> >>>> numpy.r_[1,0,range(1,5),0,1]
> >>>
> >>> This seems to be a neat idea but not in the usual Python style.
> >>
> >> Exactly. Couldn't it at least get a meaningful, but short, name?
> >
> > It is meaningful :-) r_ means row concatenation... (but, it has taken
> > on more functionality than that). What name do you suggest?
>
> "cat"? "rcat"? "catr"? "catter"?
>
> Eric
To me it looks very "pythonic": [...] already has a meaning of list literal and python uses single-letter modifier in string literals to denote raw strings. In other words a[...] is to [...] what r"..." is to "...". The catenator can probably be generalized to cover all use cases of the "array" constructor. For example: a(shape=(2,3))[1:3,5:9] may return array([[1,2,5],[6,7,8]]) a(shape=(2,3))[1] may return ones((2,3)) a(shape=(2,3))[...] may return empty((2,3)) a(shape=(2,3))[1, 2, ...] may return array([[1,2,1],[2,1,2]]) dtype and other array(...) arguments can be passed similarly to shape above. If this syntax proves successful, ndarray repr may be changed to return "a[...]" instead of "array([...])" and thus make new users immediately aware of this way to represent arrays. From cookedm at physics.mcmaster.ca Fri Feb 10 16:12:03 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Fri Feb 10 16:12:03 2006 Subject: [Numpy-discussion] Re: NumPy Glossary Was:Matlab page on scipy wiki In-Reply-To: <200602101616.k1AGG0Vj026623@oobleck.astro.cornell.edu> (Joe Harrington's message of "Fri, 10 Feb 2006 11:16:00 -0500") References: <20060210072301.9608D12D10@sc8-sf-spam2.sourceforge.net> <200602101616.k1AGG0Vj026623@oobleck.astro.cornell.edu> Message-ID: Joe Harrington writes: > [...] Put this it on the wiki (seriously). Another thing to look at is the "Producing Open Source Software" book that's been mentioned before (http://producingoss.com/). There's a section on wiki's that useful to keep in mind at http://producingoss.com/html-chunk/index.html -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From oliphant at ee.byu.edu Fri Feb 10 16:52:02 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Fri Feb 10 16:52:02 2006 Subject: [Numpy-discussion] BUG(?): array([None])==None test In-Reply-To: References: Message-ID: <43ED3525.3020502@ee.byu.edu> Pearu Peterson wrote: > > Hi, > > While converting some Numeric based code to numpy, I noticed that > in Numeric > > array([None])==None > > returns array([1]) while in numpy it returns False. > > Is this expected behaviour or a bug? It's expected behavior. If you do an equality test on None then False is returned while True is returned on an inequality test to None. -Travis From gruben at bigpond.net.au Fri Feb 10 17:30:05 2006 From: gruben at bigpond.net.au (Gary Ruben) Date: Fri Feb 10 17:30:05 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: <20060210122242.GA21950@sun.ac.za> <43ED10D0.1000405@ee.byu.edu> Message-ID: <43ED3DDB.4030806@bigpond.net.au> Sasha wrote: > On 2/10/06, Travis Oliphant wrote: >> The whole point of r_ is to allow you to use slice notation to build >> ranges easily. I wrote it precisely to make it easier to construct >> arrays in a simliar style that Matlab allows. > > Maybe it is just me, but r_ is rather unintuitive. I would expect > something like this to be called "c" for "combine" or "concatenate." > This is the name used by S+ and R. I agree that c or c_ (don't care which) is more intuitive but I can understand why it's ended up as it has. Even v or v_ for 'vector' or a or a_ for 'array' would also make sense to me. I must say that Travis's example numpy.r_[1,0,1:5,0,1] highlights my pet hate with python - that the upper limit on an integer range is non-inclusive. I'm sure the BDFL has some excuse for this silliness. 
Gary R From ndarray at mac.com Fri Feb 10 18:37:04 2006 From: ndarray at mac.com (Sasha) Date: Fri Feb 10 18:37:04 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: <43ED3DDB.4030806@bigpond.net.au> References: <20060210122242.GA21950@sun.ac.za> <43ED10D0.1000405@ee.byu.edu> <43ED3DDB.4030806@bigpond.net.au> Message-ID: On 2/10/06, Gary Ruben wrote: >... I must say that Travis's > example numpy.r_[1,0,1:5,0,1] highlights my pet hate with python - that > the upper limit on an integer range is non-inclusive. In this case you must hate that an integer range starts at 0 (I don't think you would want len(range(10)) to be 11). If this is the case, I don't blame you: it is silly to start counting at 0, but algorithmically it is quite natural. Semi-closed integer ranges have many algorithmic advantages as well such as length = (stop - start)/step, empty range can be recognized by start=stop test regardless of step, adjacent ranges - start2=stop1 (again no need to know step) etc. > I'm sure the BDFL has some excuse for this silliness. Maybe he does not like Fortran :-) PS: What's your second favorite language (I assume that python is the first :-)? From gruben at bigpond.net.au Fri Feb 10 19:42:00 2006 From: gruben at bigpond.net.au (Gary Ruben) Date: Fri Feb 10 19:42:00 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: <20060210122242.GA21950@sun.ac.za> <43ED10D0.1000405@ee.byu.edu> <43ED3DDB.4030806@bigpond.net.au> Message-ID: <43ED5CCA.8000805@bigpond.net.au> Sasha wrote: > On 2/10/06, Gary Ruben wrote: >> ... I must say that Travis's >> example numpy.r_[1,0,1:5,0,1] highlights my pet hate with python - that >> the upper limit on an integer range is non-inclusive. > > In this case you must hate that an integer range starts at 0 (I don't > think you would want len(range(10)) to be 11). Actually, that wouldn't bother me and I'm not really fussed by whether a language chooses 0 or 1 based integer ranges, as long as you can override the default, but 0 seems more natural for any programming language. > If this is the case, > I don't blame you: it is silly to start counting at 0, but > algorithmically it is quite natural. Semi-closed integer ranges have > many algorithmic advantages as well such as length = (stop - > start)/step, empty range can be recognized by start=stop test > regardless of step, adjacent ranges - start2=stop1 (again no need to > know step) etc. Thanks for the explanation Sasha. It does make some sense in terms of your examples, but I'll remain unconvinced. >> I'm sure the BDFL has some excuse for this silliness. > > Maybe he does not like Fortran :-) > > PS: What's your second favorite language (I assume that python is the first :-)? It's not Fortran-77! If I say it's Object Pascal (i.e. Delphi) you may begin to see where my range specifier preference comes from. Pascal lets you define things like enumeration type ranges like Monday..Friday. It would seem nonsensical to define the range of working weekdays as Monday..Saturday. I'm pretty competent with C, less-so with C++ and I've totally missed out on Java. One day I might have a play with Haskell and Ruby. Actually I see that Ruby sidesteps my pet hate by providing both types of range specifiers. I can't see myself defecting to the enemy just because of this though, Gary R. 
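The half-open-range identities Sasha lists above are easy to check in plain
Python; a small self-contained sketch (nothing numpy-specific is assumed):

# Why [start, stop) ranges are algorithmically convenient.
start, stop, step = 2, 12, 2
r = range(start, stop, step)                # 2, 4, 6, 8, 10
assert len(r) == (stop - start) // step     # length = (stop - start)/step
assert len(range(5, 5)) == 0                # empty when start == stop,
assert len(range(5, 5, -3)) == 0            # whatever the step is
first, second = range(0, 4), range(4, 8)    # adjacent when start2 == stop1
assert list(first) + list(second) == list(range(0, 8))
print("all identities hold")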
From ndarray at mac.com Fri Feb 10 20:39:07 2006 From: ndarray at mac.com (Sasha) Date: Fri Feb 10 20:39:07 2006 Subject: [Numpy-discussion] NumPy Glossary: a request for review Message-ID: Sorry for cross posting. This request clearly relevant to the NumPy list, but Wiki instructs that such requests should be posted on scipy-dev. Please review http://scipy.org/NumPyGlossary . From gruben at bigpond.net.au Fri Feb 10 21:04:02 2006 From: gruben at bigpond.net.au (Gary Ruben) Date: Fri Feb 10 21:04:02 2006 Subject: [Numpy-discussion] NumPy Glossary: a request for review In-Reply-To: References: Message-ID: <43ED700B.10903@bigpond.net.au> Hi Sasha, A couple of points: Stride The distance (in bytes) between the two concecutive elements along an axis. Stride isn't the distance in bytes is it? Isn't it just the index increment or alternatively the distance in terms of the multiplier of the word length of the contained type. Also, a slight typo: concecutive -> consecutive. Record A composite element of an array similar to C struct. This implies that you can contain different types in a record, which I think is only true if you have an object array. Everything else looks OK. Gary R. Sasha wrote: > Sorry for cross posting. This request clearly relevant to the NumPy > list, but Wiki instructs that such requests should be posted on > scipy-dev. Please review http://scipy.org/NumPyGlossary . From cookedm at physics.mcmaster.ca Fri Feb 10 21:20:03 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Fri Feb 10 21:20:03 2006 Subject: [Numpy-discussion] NumPy Glossary: a request for review In-Reply-To: <43ED700B.10903@bigpond.net.au> (Gary Ruben's message of "Sat, 11 Feb 2006 16:03:07 +1100") References: <43ED700B.10903@bigpond.net.au> Message-ID: Gary Ruben writes: > Hi Sasha, > A couple of points: > > Stride > The distance (in bytes) between the two concecutive elements > along an axis. > > Stride isn't the distance in bytes is it? Isn't it just the index > increment or alternatively the distance in terms of the multiplier of > the word length of the contained type. Also, a slight typo: > concecutive -> consecutive. In numpy usage, it's bytes. It's particularly important when you've got a record array of mixed types. Travis's example is temp = array([(1.8,2),(1.7,3)],dtype='f8,i2') temp['f1'].strides (10,) > Record > A composite element of an array similar to C struct. > > This implies that you can contain different types in a record, which I > think is only true if you have an object array. Nope; see above. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From ndarray at mac.com Fri Feb 10 21:31:02 2006 From: ndarray at mac.com (Sasha) Date: Fri Feb 10 21:31:02 2006 Subject: [Numpy-discussion] NumPy Glossary: a request for review In-Reply-To: <43ED700B.10903@bigpond.net.au> References: <43ED700B.10903@bigpond.net.au> Message-ID: On 2/11/06, Gary Ruben wrote: > Stride isn't the distance in bytes is it? Isn't it just the index > increment or alternatively the distance in terms of the multiplier of > the word length of the contained type. Unfortunately it is in bytes and Travis convinced me that there is no way to change it. > Also, a slight typo: concecutive -> consecutive. I've changed that. In the future, please just edit the wiki for obvious misspellings. 
Spell check does not work for me on that wiki and English is not my first language, so any spelling/grammar corrections are more than welcome. > > Record > A composite element of an array similar to C struct. > > This implies that you can contain different types in a record, which I > think is only true if you have an object array. Record arrays is a new feature in numpy. I think what I wrote is correct, but this entry will definitely benefit from a review by someone familiar with record arrays since I am not. From gruben at bigpond.net.au Fri Feb 10 21:56:02 2006 From: gruben at bigpond.net.au (Gary Ruben) Date: Fri Feb 10 21:56:02 2006 Subject: [Numpy-discussion] NumPy Glossary: a request for review In-Reply-To: References: <43ED700B.10903@bigpond.net.au> Message-ID: <43ED7C3D.6070705@bigpond.net.au> David Cooke corrected my misconceptions, so the glossary all looks good to me. Gary R. Sasha wrote: > On 2/11/06, Gary Ruben wrote: >> Stride isn't the distance in bytes is it? Isn't it just the index >> increment or alternatively the distance in terms of the multiplier of >> the word length of the contained type. > > Unfortunately it is in bytes and Travis convinced me that there is no > way to change it. > >> Also, a slight typo: concecutive -> consecutive. > > I've changed that. In the future, please just edit the wiki for > obvious misspellings. Spell check does not work for me on that wiki > and English is not my first language, so any spelling/grammar > corrections are more than welcome. > >> Record >> A composite element of an array similar to C struct. >> >> This implies that you can contain different types in a record, which I >> think is only true if you have an object array. > > Record arrays is a new feature in numpy. I think what I wrote is > correct, but this entry will definitely benefit from a review by > someone familiar with record arrays since I am not. From zpincus at stanford.edu Fri Feb 10 22:41:01 2006 From: zpincus at stanford.edu (Zachary Pincus) Date: Fri Feb 10 22:41:01 2006 Subject: [Numpy-discussion] NumPy Glossary: a request for review In-Reply-To: <43ED7C3D.6070705@bigpond.net.au> References: <43ED700B.10903@bigpond.net.au> <43ED7C3D.6070705@bigpond.net.au> Message-ID: The "broadcasting" entry is somewhat unclear in terms of what "conforming" array shapes are. Perhaps "compatible shapes" would be better, coupled with an example or two of "compatible" shapes, and/or a precise definition of how compatibility is determined. Zach Pincus Program in Biomedical Informatics and Department of Biochemistry Stanford University School of Medicine On Feb 10, 2006, at 9:55 PM, Gary Ruben wrote: > David Cooke corrected my misconceptions, so the glossary all looks > good to me. > > Gary R. > > Sasha wrote: >> On 2/11/06, Gary Ruben wrote: >>> Stride isn't the distance in bytes is it? Isn't it just the index >>> increment or alternatively the distance in terms of the >>> multiplier of >>> the word length of the contained type. >> Unfortunately it is in bytes and Travis convinced me that there is no >> way to change it. >>> Also, a slight typo: concecutive -> consecutive. >> I've changed that. In the future, please just edit the wiki for >> obvious misspellings. Spell check does not work for me on that wiki >> and English is not my first language, so any spelling/grammar >> corrections are more than welcome. >>> Record >>> A composite element of an array similar to C struct. 
>>> >>> This implies that you can contain different types in a record, >>> which I >>> think is only true if you have an object array. >> Record arrays is a new feature in numpy. I think what I wrote is >> correct, but this entry will definitely benefit from a review by >> someone familiar with record arrays since I am not. > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through > log files > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD > SPLUNK! > http://sel.as-us.falkag.net/sel? > cmd=lnk&kid=103432&bid=230486&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion From ndarray at mac.com Fri Feb 10 23:10:02 2006 From: ndarray at mac.com (Sasha) Date: Fri Feb 10 23:10:02 2006 Subject: [Numpy-discussion] NumPy Glossary: a request for review In-Reply-To: References: <43ED700B.10903@bigpond.net.au> <43ED7C3D.6070705@bigpond.net.au> Message-ID: I updated the "broadcasting" entry. I don't think examples belong to a glossary. I think a glossary should be more like a quick reference rather than a tutorial. Unfortunately the broadcasting is one of those concepts that will never be clear without examples. On 2/11/06, Zachary Pincus wrote: > The "broadcasting" entry is somewhat unclear in terms of what > "conforming" array shapes are. Perhaps "compatible shapes" would be > better, coupled with an example or two of "compatible" shapes, and/or > a precise definition of how compatibility is determined. From wbaxter at gmail.com Sat Feb 11 03:59:01 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Sat Feb 11 03:59:01 2006 Subject: [Numpy-discussion] Re: NumPy Glossary Was:Matlab page on scipy wiki In-Reply-To: References: <20060210072301.9608D12D10@sc8-sf-spam2.sourceforge.net> <200602101616.k1AGG0Vj026623@oobleck.astro.cornell.edu> Message-ID: Definitely a very clear and convincing explanation, Joe. I guess my problem is mostly that I pretty much just walked in here, and I'm wondering how was I supposed to know that new things go in this DevZone? I saw a wiki, it didn't have the page I wished were there, so I registered and added it, figuring that's how Wikis are supposed to work. Seems like the information in your email needs to be communicated to new registrants to the wiki. And maybe permissions for creating new pages limited to the DevZone? --bb On 2/11/06, David M. Cooke wrote: > > Joe Harrington writes: > > > [...] > > Put this it on the wiki (seriously). > > Another thing to look at is the "Producing Open Source Software" book > that's been mentioned before (http://producingoss.com/). There's a > section on wiki's that useful to keep in mind at > http://producingoss.com/html-chunk/index.html > > -- -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From wbaxter at gmail.com  Sat Feb 11 04:06:02 2006
From: wbaxter at gmail.com (Bill Baxter)
Date: Sat Feb 11 04:06:02 2006
Subject: [Numpy-discussion] Re: NumPy Glossary Was:Matlab page on scipy wiki
In-Reply-To: <200602101616.k1AGG0Vj026623@oobleck.astro.cornell.edu>
References: <20060210072301.9608D12D10@sc8-sf-spam2.sourceforge.net>
	<200602101616.k1AGG0Vj026623@oobleck.astro.cornell.edu>
Message-ID: 

On the point of professionalism, I'd like to change the matlab page's title
from "NumPy for Matlab Addicts" to simply "NumPy for Matlab Users". It's
been bugging me since I put it up there initially... but I'm not really
sure how to change the name of a page in the wiki.

On 2/11/06, Joe Harrington wrote:
>
> My point now is to get our community culture to include a
> sense of professionalism and pride about what we present to the world.
> Unless you're a fool or you have no competition, you dress well for a
> job interview. We're not fools, and we have very healthy competition.
> The main site is the first impression we make on new users. My goal
> is to prevent it from being the last.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From mithrandir42 at web.de  Sat Feb 11 04:46:02 2006
From: mithrandir42 at web.de (N. Volbers)
Date: Sat Feb 11 04:46:02 2006
Subject: [Numpy-discussion] dtype names and titles
Message-ID: <43EDDC6C.6040005@web.de>

I continue to learn all about the heterogeneous arrays...

When I was reading through the records.py code I discovered that besides
the 'names' and 'formats' for the fields of a numpy array you can also
specify 'titles'. Playing around with this feature I discovered a bug:

>>> import numpy
>>> mydata = [(1,1), (2,4), (3,9)]
>>> mytype = {'names': ['col1','col2'], 'formats':['i2','f4'], 'titles': ['col2', 'col1']}
>>> b = numpy.array( mydata, dtype=mytype)
>>> print b
[(1.0, 1.0) (4.0, 4.0) (9.0, 9.0)]

This seems to be caused by the fact that you can access a field by both
the name and the field title. Why would you want to have two names anyway?

By the way, is there an easy way to access a field vector by its index?
Right now I retrieve the field name from dtype.fields[-1][index] and then
return the 'column' by using myarray[name].

Best regards,

Niklas.

From pearu at scipy.org  Sat Feb 11 07:21:02 2006
From: pearu at scipy.org (Pearu Peterson)
Date: Sat Feb 11 07:21:02 2006
Subject: [Numpy-discussion] Test test_minrelpath.py on unix for me
In-Reply-To: <43ECE677.2040104@cox.net>
References: <43ECE677.2040104@cox.net>
Message-ID: 

On Fri, 10 Feb 2006, Tim Hochberg wrote:
>
> Could someone try the attached diff on a unixy system? It works under
> windows, but it's easy to mess up those \/'s.

Hmm, minrelpath does not need

  if os.sep != '/':
      path = path.replace('/',os.sep)

Functions that call minrelpath (see njoin, for instance) have already
applied this codelet.

I have committed the tests fix with some modifications to svn, tested
on Linux.

Pearu

From tim.hochberg at cox.net  Sat Feb 11 07:37:02 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Sat Feb 11 07:37:02 2006
Subject: [Numpy-discussion] Test test_minrelpath.py on unix for me
In-Reply-To: 
References: <43ECE677.2040104@cox.net>
Message-ID: <43EE0466.8070608@cox.net>

Pearu Peterson wrote:
>
> On Fri, 10 Feb 2006, Tim Hochberg wrote:
>
>> Could someone try the attached diff on a unixy system? It works under
>> windows, but it's easy to mess up those \/'s.
> > > Hmm, minrelpath does not need > > if os.sep != '/': > path = path.replace('/',os.sep) > > In functions (see njoin, for instance) that call minrelpath already > have applied this codelet. > > I have commited the tests fix with some modifications to svn, tested > on Linux. That seems to do the trick under VC7 as well. Thanks, -tim > > Pearu > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through log > files > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > From ndarray at mac.com Sat Feb 11 08:03:04 2006 From: ndarray at mac.com (Sasha) Date: Sat Feb 11 08:03:04 2006 Subject: [Numpy-discussion] NumPy Glossary: a request for review In-Reply-To: References: <43ED700B.10903@bigpond.net.au> <43ED7C3D.6070705@bigpond.net.au> Message-ID: At this point the only change I would like to make to the glossary page is to rename it to NumPy_Glossary. I don't have a permission to change file names on the wiki, so I have to defer this task to someone else. I don't have any view on where this page should be linked, so I will not make any more changes relating to this page. From david.trem at gmail.com Sat Feb 11 09:05:03 2006 From: david.trem at gmail.com (David TREMOUILLES) Date: Sat Feb 11 09:05:03 2006 Subject: [Numpy-discussion] gnuplot problem with numpy In-Reply-To: <43ECC154.1000004@bryant.edu> References: <43ECC154.1000004@bryant.edu> Message-ID: <129e1cd10602110904i9177e55t@mail.gmail.com> Hello, Maybe you have to upgrade your numeric to 24.2 Refer to the recent thread in gnuplot-py-user list: http://sourceforge.net/mailarchive/forum.php?forum_id=11272&max_rows=25&style=nested&viewmonth=200602 David 2006/2/10, Brian Blais : > > Hello, > > I have been trying to use the Gnuplot1.7.py module, but it doesn't seem to > work with > numpy (although it works with Numeric). The following code plots two > "identical" > sets of data, but the numpy data gets rounded to the nearest integer when > passed to > Gnuplot. What is odd is that the offending code in utils.py, is the > function > float_array(m), which does the conversion that I do in this script, but it > doesn't > seem to work. Any ideas? > > #---------------------------- > import numpy > import Numeric > import Gnuplot > > g = Gnuplot.Gnuplot(debug=1) > dh=.1; > x=numpy.arange(dh,2+dh,dh,'d') > y1 = x**2 > > > y2=y1 > > d1 = Gnuplot.Data(x, y1, > title='numpy', > with='points') # doesn't work > d2 = Gnuplot.Data(Numeric.asarray(x,'f'), Numeric.asarray(y2,'f'), > title='Numeric', > with='points') # works > > g.plot(d1,d2) > > #---------------------------- > > > > > thanks, > > bb > > -- > ----------------- > > bblais at bryant.edu > http://web.bryant.edu/~bblais > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through log > files > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! 
> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From pearu at scipy.org  Sat Feb 11 09:08:01 2006
From: pearu at scipy.org (Pearu Peterson)
Date: Sat Feb 11 09:08:01 2006
Subject: [Numpy-discussion] More numpy and Numeric differences
Message-ID: 

I have created a wiki page

  http://scipy.org/PearuPeterson/NumpyVersusNumeric

that reports my findings on how numpy and Numeric behave on various
corner cases. Travis O., could you take a look at it?
Here is the most recent addition:

"""
Clipping integer array with Inf

In Numeric (v24.2) clip returns a float array:

>>> from Numeric import *
>>> Inf = array(1.0)/0
>>> clip([1,2],0,Inf)
array([ 1.,  2.])

In numpy (v0.9.5.2092) an OverflowError is raised:

>>> from numpy import *
>>> clip([1,2],0,Inf)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/local/lib/python2.3/site-packages/numpy/core/oldnumeric.py", line 336, in clip
    return asarray(m).clip(m_min, m_max)
OverflowError: cannot convert float infinity to long

Comment: is it a numpy bug? Should clip take an optional dtype argument and
return asarray(m, dtype=dtype).clip(m_min, m_max)? Then array.clip should
also have an optional dtype..
"""

Pearu
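Until that question is settled, there is a user-side workaround for the
overflow above: promote the integer input to float before clipping, so the
Inf bound never has to be cast to an integer dtype. A minimal sketch
(illustrative only, not a fix to numpy's clip itself):

import numpy as np

# The OverflowError comes from casting the float bound (Inf) to the
# array's integer dtype, so promote the array to float first.
m = np.asarray([1, 2], dtype=float)
print(np.clip(m, 0, np.inf))   # -> [ 1.  2.], matching Numeric's result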
From charlesr.harris at gmail.com  Sat Feb 11 09:35:01 2006
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sat Feb 11 09:35:01 2006
Subject: [Numpy-discussion] arange(start, stop, step) and floating point (Ticket #8)
In-Reply-To: <43EC3261.2060601@ieee.org>
References: <43EA6A34.2000202@gmail.com> <43EA7FE7.2040902@cox.net>
	<43EA8867.5080109@ee.byu.edu> <43EB7FE5.1050000@cox.net>
	<43EBA615.8020101@ieee.org> <43EBB782.1010509@cox.net>
	<43EC3261.2060601@ieee.org>
Message-ID: 

On 2/9/06, Travis Oliphant wrote:
> Sasha wrote:
>
> >Well, my results are different.
> >
> [snip]
> Yes, I think it's worth submitting. Most of the suggestions for
> pointer-arithmetic for fast C-code were developed when processors spent
> more time computing than fetching memory. Now it seems it's all about
> fetching memory intelligently.
>
> The buffer[i]= style is even recommended according to the
> AMD-optimization book Sasha pointed out.

Pointers vs indexing is architecture and compiler dependent. My own
experience is that recent gcc compilers produce better indexing code than
they used to, and the indexing instructions on newer cpus are faster. When
I wrote the sorting routines pointers were faster, so I used them for
quicksort and mergesort. Now I think indexing is faster and I am tempted
to change the code. Indexing also looks cleaner to me.

Chuck

From martin.wiechert at gmx.de  Sat Feb 11 10:52:02 2006
From: martin.wiechert at gmx.de (Martin Wiechert)
Date: Sat Feb 11 10:52:02 2006
Subject: [Numpy-discussion] bug with NO_IMPORT_ARRAY / PY_ARRAY_UNIQUE_SYMBOL? was Re: [SciPy-user] segfault when calling PyArray_DescrFromType
In-Reply-To: <43EB6436.1050307@ieee.org>
References: <200602091141.51520.martin.wiechert@gmx.de>
	<200602091552.11896.martin.wiechert@gmx.de> <43EB6436.1050307@ieee.org>
Message-ID: <200602111942.50704.martin.wiechert@gmx.de>

Hi Travis,

thanks for your help! I think there is a small bug with NO_IMPORT_ARRAY /
PY_ARRAY_UNIQUE_SYMBOL in numpy-0.9.4. For ease of reference I've pasted
part of __multiarray_api.h below. The problem I ran into is that my
"non-importing" source files, the ones defining NO_IMPORT_ARRAY, cannot
see PyArray_API, because they obviously cannot know which name I chose in
the importing file. E.g.
I do #define PY_ARRAY_UNIQUE_SYMBOL my_name in the file which calls import_array (). Then the object generated will not have the symbol PyArray_API, because PyArray_API is replaced with my_name. But the sources with NO_IMPORT_ARRAY look for PyArray_API, because for them it is not replaced. Indeed inserting #define PyArray_API my_name into these files seems to fix the problem for me. Regards, Martin. #if defined(PY_ARRAY_UNIQUE_SYMBOL) #define PyArray_API PY_ARRAY_UNIQUE_SYMBOL #endif #if defined(NO_IMPORT) || defined(NO_IMPORT_ARRAY) extern void **PyArray_API; #else #if defined(PY_ARRAY_UNIQUE_SYMBOL) void **PyArray_API; #else static void **PyArray_API=NULL; #endif #endif On Thursday 09 February 2006 16:48, Travis Oliphant wrote: > Martin Wiechert wrote: > >Found it (in the "old" docs). > >Must #define PY_ARRAY_UNIQUE_SYMBOL and call import_array (). > > To be clear, you must call import_array() in the modules init function. > This is the only requirement. > > You only have to define PY_ARRAY_UNIQUE_SYMBOL if your extension module > uses more than one file. In the files without the module initialization > code you also have to define NO_IMPORT_ARRAY. > > -Travis From oliphant.travis at ieee.org Sat Feb 11 14:24:04 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Sat Feb 11 14:24:04 2006 Subject: [Numpy-discussion] dtype names and titles In-Reply-To: <43EDDC6C.6040005@web.de> References: <43EDDC6C.6040005@web.de> Message-ID: <43EE63E1.10601@ieee.org> N. Volbers wrote: > I continue to learn all about the heterogeneous arrays... > > When I was reading through the records.py code I discovered that > besides the 'names' and 'formats' for the fields of a numpy array you > can also specify 'titles'. Playing around with this feature I > discovered a bug: > > >>> import numpy > >>> mydata = [(1,1), (2,4), (3,9)] > >>> mytype = {'names': ['col1','col2'], 'formats':['i2','f4'], > 'titles': ['col2', 'col1']} > >>> b = numpy.array( mydata, dtype=mytype) > >>> print b > [(1.0, 1.0) (4.0, 4.0) (9.0, 9.0)] > > This seems to be caused by the fact that you can access a field by > both the name and the field title. Why would you want to have two > names anyway? This lets you use attribute look up on the names but have the titles be the "true name" of the field. I've fixed this in SVN, so that it raises an error when the titles have the same names as the columns. Thanks for the test. -Travis From oliphant.travis at ieee.org Sat Feb 11 14:53:01 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Sat Feb 11 14:53:01 2006 Subject: [Numpy-discussion] More numpy and Numeric differences In-Reply-To: References: Message-ID: <43EE6AAB.9010408@ieee.org> Pearu Peterson wrote: > > I have created a wiki page > > http://scipy.org/PearuPeterson/NumpyVersusNumeric > > that reports my findings on how numpy and Numeric behave on various > corner cases. Travis O., could you take a look at it? > Here is the most recent addition: > I fixed the put issue. The problem with clip is actually in choose (clip is just a specific application of choose). The problem is in PyArray_ConvertToCommonType. You have an integer array, an integer scalar, and a floating-point scalar. I think the rules implemented in PyArray_ConvertToCommonType are not allowing the scalar to dictate anything. But, this should clearly be changed to allow scalars of different "kinds" to up-cast the array. This would be consistent with the umath module. So, PyArray_ConvertToCommonType needs to be improved. 
This will have an impact on several other functions that use this C-API. -Travis From mithrandir42 at web.de Sun Feb 12 01:03:04 2006 From: mithrandir42 at web.de (N. Volbers) Date: Sun Feb 12 01:03:04 2006 Subject: [Numpy-discussion] dtype names and titles In-Reply-To: <43EE63E1.10601@ieee.org> References: <43EDDC6C.6040005@web.de> <43EE63E1.10601@ieee.org> Message-ID: <43EEF995.10706@web.de> (sorry Travis, I accidentally replied first to you directly, and not to the list) Travis Oliphant wrote: > N. Volbers wrote: > >> [...]This seems to be caused by the fact that you can access a field >> by both the name and the field title. Why would you want to have two >> names anyway? > > > This lets you use attribute look up on the names but have the titles > be the "true name" of the field. > I still don't understand the reason for keeping two different names. IMHO it adds some extra complexity and might be a potential source for errors. If I keep an extra title in the array, then I think I should be allowed to name it whatever I want. If this is not the case, then I would be better off to just have unique field names and keep my extra information about the fields in a separate dictionary with the field names as keys and the extra information as value. This is my current approach, which works quite well; unfortunately the extra information is not saved in the array itself. Is anybody actually using both names and titles? Best regards, Niklas. From bblais at bryant.edu Sun Feb 12 05:03:02 2006 From: bblais at bryant.edu (Brian Blais) Date: Sun Feb 12 05:03:02 2006 Subject: [Numpy-discussion] gnuplot problem with numpy In-Reply-To: <129e1cd10602110904i9177e55t@mail.gmail.com> References: <43ECC154.1000004@bryant.edu> <129e1cd10602110904i9177e55t@mail.gmail.com> Message-ID: <43EF3156.9040601@bryant.edu> David TREMOUILLES wrote: > Hello, > Maybe you have to upgrade your numeric to 24.2 bingo! thanks. I had already upgraded my numpy, and since I kept seeing "numpy=Numeric" in many threads, I didn't think to upgrade that as well. thanks, Brian Blais -- ----------------- bblais at bryant.edu http://web.bryant.edu/~bblais From gerard.vermeulen at grenoble.cnrs.fr Sun Feb 12 05:26:07 2006 From: gerard.vermeulen at grenoble.cnrs.fr (Gerard Vermeulen) Date: Sun Feb 12 05:26:07 2006 Subject: [Numpy-discussion] ANN: first release of IVuPy-0.1 Message-ID: <20060212142525.45b6b924.gerard.vermeulen@grenoble.cnrs.fr> I am proud to announce IVuPy-0.1 (I-View-Py). IVuPy is a Python extension module developed to write Python programs for 3D visualization of large data sets using Qt and PyQt. Python is extended by IVuPy with more than 600 classes of two of the Coin3D C++ class libraries: Coin and SoQt. Coin is compatible with the Open Inventor API. Open Inventor is an object-oriented 3D toolkit built on OpenGL that provides a 3D scene database, a built-in event model for user interaction, and the ability to print objects and exchange data with other graphics formats. The SoQt library interfaces Coin to Qt. See http://www.coin3d.org for more information on Coin3D. IVuPy requires at least one of the Numerical Python extension modules: NumPy, Numeric, or numarray (IVuPy works with all of them at once). Data transfer between the Numerical Python arrays and the Coin data structures has been implemented by copying. The design of the Open Inventor API favors ease of use over performance. The API is a natural match for Python, and in my opinion it is fun to program with IVuPy. 
The performance penalty of the design choice is small. The first example at http://ivupy.sourceforge.net/examples.html demonstrates this: NumPy calculates a surface with a million nodes in 1.7 seconds and Coin3D redisplays the surface in 0.3 seconds on my Linux system with a 3.6 GHz Pentium and a nVidea graphics card (NV41.1). The Inventor Mentor ( http://www.google.com/search?q=inventor+mentor ) is essential for learning IVuPy. The IVuPy documentation supplements the Inventor Mentor. IVuPy includes all C++ examples from the Inventor Mentor and their Python translations. There are also more advanced examples to show the integration of IVuPy and PyQt. IVuPy has been used for almost 6 months on Linux and Windows in the development of a preprocessor for a finite element flow solver and has been proven to be very stable. Prerequisites for IVuPy are: - Python-2.4.x or -2.3.x - at least one of NumPy, numarray, or Numeric - Qt-3.3.x, -3.2.x, or -3.1.x - SIP-4.3.x or -4.2.1 - PyQt-3.15.x or -3.14.1 - Coin-2.4.4 or -2.4.3 - SoQt-1.3.0 or -1.2.0 IVuPy is licensed under the terms of the GPL. Contact me, if the GPL is an obstacle for you. http://ivupy.sourceforge.net is the home page of IVuPy. Have fun -- Gerard Vermeulen From faltet at carabos.com Mon Feb 13 01:03:03 2006 From: faltet at carabos.com (Francesc Altet) Date: Mon Feb 13 01:03:03 2006 Subject: [Numpy-discussion] dtype names and titles In-Reply-To: <43EEF995.10706@web.de> References: <43EDDC6C.6040005@web.de> <43EE63E1.10601@ieee.org> <43EEF995.10706@web.de> Message-ID: <1139821316.7532.5.camel@localhost.localdomain> El dg 12 de 02 del 2006 a les 09:02 +0000, en/na N. Volbers va escriure: > I still don't understand the reason for keeping two different names. > IMHO it adds some extra complexity and might be a potential source for > errors. If I keep an extra title in the array, then I think I should be > allowed to name it whatever I want. If this is not the case, then I > would be better off to just have unique field names and keep my extra > information about the fields in a separate dictionary with the field > names as keys and the extra information as value. This is my current > approach, which works quite well; unfortunately the extra information is > not saved in the array itself. Yes. I agree that accessing fields by both name and title might become a common source of confusion. So, in order to avoid problems in the future, I wouldn't let the users to access the fields by title. > > Is anybody actually using both names and titles? Not me. -- >0,0< Francesc Altet http://www.carabos.com/ V V C?rabos Coop. V. Enjoy Data "-" From aisaac at american.edu Mon Feb 13 05:30:07 2006 From: aisaac at american.edu (Alan G Isaac) Date: Mon Feb 13 05:30:07 2006 Subject: [Numpy-discussion] dtype names and titles In-Reply-To: <1139821316.7532.5.camel@localhost.localdomain> References: <43EDDC6C.6040005@web.de> <43EE63E1.10601@ieee.org><43EEF995.10706@web.de><1139821316.7532.5.camel@localhost.localdomain> Message-ID: >> Is anybody actually using both names and titles? On Mon, 13 Feb 2006, Francesc Altet apparently wrote: > Not me. Is the "title" the appropriate storage for the "displayname" for fields that are to be plotted? Or not? 
Thanks, Alan Isaac From faltet at carabos.com Mon Feb 13 06:08:15 2006 From: faltet at carabos.com (Francesc Altet) Date: Mon Feb 13 06:08:15 2006 Subject: [Numpy-discussion] dtype names and titles In-Reply-To: References: <43EDDC6C.6040005@web.de> <43EE63E1.10601@ieee.org> <43EEF995.10706@web.de><1139821316.7532.5.camel@localhost.localdomain> Message-ID: <1139839637.7532.14.camel@localhost.localdomain> El dl 13 de 02 del 2006 a les 08:35 -0500, en/na Alan G Isaac va escriure: > >> Is anybody actually using both names and titles? > > On Mon, 13 Feb 2006, Francesc Altet apparently wrote: > > Not me. > > Is the "title" the appropriate storage for the "displayname" > for fields that are to be plotted? Or not? Uh, yes. Perhaps I messed up things. Of course it is interesting to have both a name and a title. What I tried to mean is that accessing fields by *both* names and titles might introduce confusion. For example, allowing: >>> mydata = [(1,1), (2,4), (3,9)] >>> mytype = {'names': ['col1','col2'], 'formats':['i2','f4'],'titles': ['col 2', 'col 1']} >>> b = numpy.array( mydata, dtype=mytype) >>> b array([(1, 1.0), (2, 4.0), (3, 9.0)], dtype=(void,6)) >>> b['col1'] array([1, 2, 3], dtype=int16) >>> b['col 2'] array([1, 2, 3], dtype=int16) seems quite strange to me. My point is that I think that keys in arrays for accessing fields should be unique, and thus, I'd remove the last sentence as a valid one. But of course I think that having both names and titles is a good thing. Sorry for the confusion. Cheers, -- >0,0< Francesc Altet http://www.carabos.com/ V V C?rabos Coop. V. Enjoy Data "-" From wbaxter at gmail.com Mon Feb 13 06:46:00 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Mon Feb 13 06:46:00 2006 Subject: [Numpy-discussion] matplotlib Message-ID: Anyone know if matplotlib is supposed to work with the new NumPy or if there is work afoot to make it work? It seems to truncate all numpy.array and numpy.matrix inputs to integer values: import matplotlib matplotlib.interactive(True) matplotlib.use('WXAgg') import matplotlib.pylab as g g.plot(rand(5),rand(5),'bo') just puts a dot at (0,0), while this g.plot(rand(5)*10,rand(5)*10,'bo') generates a plot of 5 points but all at integer coordinates. --bb -------------- next part -------------- An HTML attachment was scrubbed... URL: From ryanlists at gmail.com Mon Feb 13 06:53:16 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Mon Feb 13 06:53:16 2006 Subject: [Numpy-discussion] matplotlib In-Reply-To: References: Message-ID: What version are you using? I know that CVS matplotlib works with numpy and I think the latest releases do as well. I think the current version is 0.86.2 On 2/13/06, Bill Baxter wrote: > Anyone know if matplotlib is supposed to work with the new NumPy or if there > is work afoot to make it work? > It seems to truncate all numpy.array and numpy.matrix inputs to integer > values: > > import matplotlib > matplotlib.interactive(True) > matplotlib.use('WXAgg') > import matplotlib.pylab as g > > g.plot(rand(5),rand(5),'bo') > > just puts a dot at (0,0), while this > > g.plot(rand(5)*10,rand(5)*10,'bo') > > generates a plot of 5 points but all at integer coordinates. > > > --bb > From wbaxter at gmail.com Mon Feb 13 07:07:01 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Mon Feb 13 07:07:01 2006 Subject: [Numpy-discussion] matplotlib In-Reply-To: References: Message-ID: I've got 0.86.2. It looks like if I do 'import pylab as g' it doesn't work, but 'from pylab import *' does for some reason. 
--bb On 2/13/06, Ryan Krauss wrote: > > What version are you using? I know that CVS matplotlib works with > numpy and I think the latest releases do as well. I think the current > version is 0.86.2 > > On 2/13/06, Bill Baxter wrote: > > Anyone know if matplotlib is supposed to work with the new NumPy or if > there > > is work afoot to make it work? > > It seems to truncate all numpy.array and numpy.matrix inputs to integer > > values: > > > > import matplotlib > > matplotlib.interactive(True) > > matplotlib.use('WXAgg') > > import matplotlib.pylab as g > > > > g.plot(rand(5),rand(5),'bo') > > > > just puts a dot at (0,0), while this > > > > g.plot(rand(5)*10,rand(5)*10,'bo') > > > > generates a plot of 5 points but all at integer coordinates. > > > > > > --bb > > > -- William V. Baxter III OLM Digital Kono Dens Building Rm 302 1-8-8 Wakabayashi Setagaya-ku Tokyo, Japan 154-0023 +81 (3) 3422-3380 -------------- next part -------------- An HTML attachment was scrubbed... URL: From jdhunter at ace.bsd.uchicago.edu Mon Feb 13 07:13:08 2006 From: jdhunter at ace.bsd.uchicago.edu (John Hunter) Date: Mon Feb 13 07:13:08 2006 Subject: [Numpy-discussion] matplotlib In-Reply-To: (Bill Baxter's message of "Mon, 13 Feb 2006 23:45:00 +0900") References: Message-ID: <87vevj4818.fsf@peds-pc311.bsd.uchicago.edu> >>>>> "Bill" == Bill Baxter writes: Bill> Anyone know if matplotlib is supposed to work with the new Bill> NumPy or if there is work afoot to make it work? It seems Bill> to truncate all numpy.array and numpy.matrix inputs to Bill> integer values: You're script as posted is incomplete import matplotlib matplotlib.interactive(True) matplotlib.use('WXAgg') import matplotlib.pylab as g g.plot(rand(5),rand(5),'bo') where for example is rand coming from? My guess is you have an import statement you are not showing us. If you are using a recent numpy and matplotlib, and set numerix to numpy in your matplotlib rc file (~/.matplotlib/matplotlibrc) everything should work if you get your array symbols from pylab, numpy or matplotlib.numerix (all of which will get their symbols from numpy....) JDH From wbaxter at gmail.com Mon Feb 13 07:29:02 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Mon Feb 13 07:29:02 2006 Subject: [Numpy-discussion] matplotlib In-Reply-To: <87vevj4818.fsf@peds-pc311.bsd.uchicago.edu> References: <87vevj4818.fsf@peds-pc311.bsd.uchicago.edu> Message-ID: from numpy import * was the only line missing, called before the rest. It seems to work fine if I use from pylab import * instead of import pylab as g And actually if I do both in this order: import pylab as g from pylab import * then plot() and g.plot() both do the right thing (no truncating of floats). Seems as if there's some initialization code that only gets run with the 'from pylab import *' version. --bb On 2/14/06, John Hunter wrote: > > >>>>> "Bill" == Bill Baxter writes: > > Bill> Anyone know if matplotlib is supposed to work with the new > Bill> NumPy or if there is work afoot to make it work? It seems > Bill> to truncate all numpy.array and numpy.matrix inputs to > Bill> integer values: > > You're script as posted is incomplete > > import matplotlib > matplotlib.interactive(True) > matplotlib.use('WXAgg') > import matplotlib.pylab as g > > g.plot(rand(5),rand(5),'bo') > > where for example is rand coming from? My guess is you have an import > statement you are not showing us. 
> > If you are using a recent numpy and matplotlib, and set numerix to > numpy in your matplotlib rc file (~/.matplotlib/matplotlibrc) > everything should work if you get your array symbols from pylab, numpy > or matplotlib.numerix (all of which will get their symbols from > numpy....) > > JDH > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jdhunter at ace.bsd.uchicago.edu Mon Feb 13 07:34:05 2006 From: jdhunter at ace.bsd.uchicago.edu (John Hunter) Date: Mon Feb 13 07:34:05 2006 Subject: [Numpy-discussion] matplotlib In-Reply-To: (Bill Baxter's message of "Tue, 14 Feb 2006 00:28:07 +0900") References: <87vevj4818.fsf@peds-pc311.bsd.uchicago.edu> Message-ID: <87pslr2shy.fsf@peds-pc311.bsd.uchicago.edu> >>>>> "Bill" == Bill Baxter writes: Bill> from numpy import * was the only line missing, called before Bill> the rest. It seems to work fine if I use from pylab import Bill> * instead of import pylab as g Bill> And actually if I do both in this order: import pylab as g Bill> from pylab import * Bill> Seems as if there's some Bill> initialization code that only gets run with the 'from pylab Bill> import *' version. As far as I know that is a python impossibility, unless perhaps you do some deep dark magic that is beyond my grasp. pylab doesn't know how it is imported. Are you sure you have your numerix set properly? I suggest creating two free standing scripts, one with the problem and one without, and running both with --verbose-helpful to make sure that your settings are what you think they are. If you verify that numerix is set properly and still see the problem, I would like to see both scripts in case it is exposing a problem with matplotlib. Of course, doing multiple import * commands is a recipe for long term pain, especially with packages that have so much overlapping namespace and numpy/scipy/pylab. JDH From wbaxter at gmail.com Mon Feb 13 07:59:03 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Mon Feb 13 07:59:03 2006 Subject: [Numpy-discussion] matplotlib In-Reply-To: <87pslr2shy.fsf@peds-pc311.bsd.uchicago.edu> References: <87vevj4818.fsf@peds-pc311.bsd.uchicago.edu> <87pslr2shy.fsf@peds-pc311.bsd.uchicago.edu> Message-ID: Ah, ok. You're right. Doing from pylab import * was actually just overwriting the definition of array and rand() to be those from Numeric, which pylab was picking to use by default. I guess my expectation was that pylab would default to using the best numerical package installed. With "numerix : numpy" in my ~/.matplotlib/matplotlibrc file, it seems to be working properly now. Thanks for the help! --bb On 2/14/06, John Hunter wrote: > > >>>>> "Bill" == Bill Baxter writes: > > Bill> from numpy import * was the only line missing, called before > Bill> the rest. It seems to work fine if I use from pylab import > Bill> * instead of import pylab as g > > Bill> And actually if I do both in this order: import pylab as g > Bill> from pylab import * > > Bill> Seems as if there's some > Bill> initialization code that only gets run with the 'from pylab > Bill> import *' version. > > As far as I know that is a python impossibility, unless perhaps you do > some deep dark magic that is beyond my grasp. pylab doesn't know how > it is imported. > > Are you sure you have your numerix set properly? I suggest creating > two free standing scripts, one with the problem and one without, and > running both with --verbose-helpful to make sure that your settings > are what you think they are. 
If you verify that numerix is set > properly and still see the problem, I would like to see both scripts > in case it is exposing a problem with matplotlib. > > Of course, doing multiple import * commands is a recipe for long term > pain, especially with packages that have so much overlapping namespace > and numpy/scipy/pylab. > > JDH > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ryanlists at gmail.com Mon Feb 13 09:01:03 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Mon Feb 13 09:01:03 2006 Subject: [Numpy-discussion] matplotlib In-Reply-To: References: <87vevj4818.fsf@peds-pc311.bsd.uchicago.edu> <87pslr2shy.fsf@peds-pc311.bsd.uchicago.edu> Message-ID: The point of the numerix setting in the rc file is that matplotlib can't tell you what is the best numerical package to use for your problem. On 2/13/06, Bill Baxter wrote: > Ah, ok. You're right. Doing from pylab import * was actually just > overwriting the definition of array and rand() to be those from Numeric, > which pylab was picking to use by default. I guess my expectation was that > pylab would default to using the best numerical package installed. > > With "numerix : numpy" in my ~/.matplotlib/matplotlibrc file, it seems to be > working properly now. > > Thanks for the help! > > --bb > > On 2/14/06, John Hunter wrote: > > >>>>> "Bill" == Bill Baxter writes: > > > > Bill> from numpy import * was the only line missing, called before > > Bill> the rest. It seems to work fine if I use from pylab import > > Bill> * instead of import pylab as g > > > > Bill> And actually if I do both in this order: import pylab as g > > Bill> from pylab import * > > > > Bill> Seems as if there's some > > Bill> initialization code that only gets run with the 'from pylab > > Bill> import *' version. > > > > As far as I know that is a python impossibility, unless perhaps you do > > some deep dark magic that is beyond my grasp. pylab doesn't know how > > it is imported. > > > > Are you sure you have your numerix set properly? I suggest creating > > two free standing scripts, one with the problem and one without, and > > running both with --verbose-helpful to make sure that your settings > > are what you think they are. If you verify that numerix is set > > properly and still see the problem, I would like to see both scripts > > in case it is exposing a problem with matplotlib. > > > > Of course, doing multiple import * commands is a recipe for long term > > pain, especially with packages that have so much overlapping namespace > > and numpy/scipy/pylab. > > > > JDH > > > > > From ryanlists at gmail.com Mon Feb 13 09:45:04 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Mon Feb 13 09:45:04 2006 Subject: [Numpy-discussion] indexing problem Message-ID: I am having a problem with indexing an array and not getting the expected scalar behavior for complex128scalar: In [44]: c Out[44]: array([ 3.31781200e+06, 2.20157529e+13, 1.46088259e+20, 9.69386754e+26, 6.43248601e+33, 4.26835585e+40, 2.83232045e+47, 1.87942136e+54, 1.24711335e+61, 8.27537526e+67]) In [45]: s=c[-1]*1.0j In [46]: type(s) Out[46]: In [47]: s**2 Out[47]: (-6.848183561893313e+135+8.3863291020365108e+119j) In [48]: s=8.27537526e+67*1.0j In [49]: type(s) Out[49]: In [50]: s**2 Out[50]: (-6.8481835693820068e+135+0j) Why does result 47 have a non-zero imaginary part? 
Ryan From ryanlists at gmail.com Mon Feb 13 09:54:02 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Mon Feb 13 09:54:02 2006 Subject: [Numpy-discussion] Re: indexing problem In-Reply-To: References: Message-ID: This may only be a problem for ridiculously large numbers. I actually meant to be dealing with these values: In [75]: d Out[75]: array([ 246.74011003, 986.96044011, 2220.66099025, 3947.84176044, 6168.50275068, 8882.64396098, 12090.26539133, 15791.36704174, 19985.94891221, 24674.01100272]) In [76]: s=d[-1]*1.0j In [77]: s Out[77]: 24674.011002723393j In [78]: type(s) Out[78]: In [79]: s**2 Out[79]: (-608806818.96251547+7.4554869875188623e-08j) So perhaps the previous difference of 26 orders of magnitude really did mean that the imaginary part was negligibly small, that just got obscured by the fact that the real part was order 1e+135. On 2/13/06, Ryan Krauss wrote: > I am having a problem with indexing an array and not getting the > expected scalar behavior for complex128scalar: > > In [44]: c > Out[44]: > array([ 3.31781200e+06, 2.20157529e+13, 1.46088259e+20, > 9.69386754e+26, 6.43248601e+33, 4.26835585e+40, > 2.83232045e+47, 1.87942136e+54, 1.24711335e+61, > 8.27537526e+67]) > > In [45]: s=c[-1]*1.0j > > In [46]: type(s) > Out[46]: > > In [47]: s**2 > Out[47]: (-6.848183561893313e+135+8.3863291020365108e+119j) > > In [48]: s=8.27537526e+67*1.0j > > In [49]: type(s) > Out[49]: > > In [50]: s**2 > Out[50]: (-6.8481835693820068e+135+0j) > > Why does result 47 have a non-zero imaginary part? > > Ryan > From russel at appliedminds.com Mon Feb 13 10:08:13 2006 From: russel at appliedminds.com (Russel Howe) Date: Mon Feb 13 10:08:13 2006 Subject: [Numpy-discussion] String array equality test does not broadcast Message-ID: <99AE3D0B-E106-4EE0-B70C-E84DE1BCC2C7@appliedminds.com> I am converting some numarray code to numpy and I noticed this behavior: >>> from numpy import * >>> sta=array(['abc', 'def', 'ghi']) >>> stb=array(['abc', 'jkl', 'ghi']) >>> sta==stb False I expected the same as this: >>> a1=array([1,2,3]) >>> a2=array([1,4,3]) >>> a1==a2 array([True, False, True], dtype=bool) I am trying to figure out how to fix this now... From chanley at stsci.edu Mon Feb 13 10:57:03 2006 From: chanley at stsci.edu (Christopher Hanley) Date: Mon Feb 13 10:57:03 2006 Subject: [Numpy-discussion] dtype names and titles In-Reply-To: <43EDDC6C.6040005@web.de> References: <43EDDC6C.6040005@web.de> Message-ID: <43F0D640.3000308@stsci.edu> N. Volbers wrote: > > By the way, is there an easy way to access a field vector by its index? > Right now I retrieve the field name from dtype.fields[-1][index] and > then return the 'column' by using myarray[name]. Travis, Perhaps we could add a field method to recarray like numarray's? This would allow access by both field name and "column" index. This would be nice for people who are using this convention and are making the switch from numarray. Chris From Fernando.Perez at colorado.edu Mon Feb 13 11:06:01 2006 From: Fernando.Perez at colorado.edu (Fernando Perez) Date: Mon Feb 13 11:06:01 2006 Subject: [Numpy-discussion] matplotlib In-Reply-To: <87pslr2shy.fsf@peds-pc311.bsd.uchicago.edu> References: <87vevj4818.fsf@peds-pc311.bsd.uchicago.edu> <87pslr2shy.fsf@peds-pc311.bsd.uchicago.edu> Message-ID: <43F0D851.7000807@colorado.edu> John Hunter wrote: >>>>>>"Bill" == Bill Baxter writes: > > > Bill> from numpy import * was the only line missing, called before > Bill> the rest. 
It seems to work fine if I use from pylab import > Bill> * instead of import pylab as g > > Bill> And actually if I do both in this order: import pylab as g > Bill> from pylab import * > > Bill> Seems as if there's some > Bill> initialization code that only gets run with the 'from pylab > Bill> import *' version. > > As far as I know that is a python impossibility, unless perhaps you do > some deep dark magic that is beyond my grasp. pylab doesn't know how > it is imported. Actually, a little creative use of sys._getframe() can tell you that, in some instances (if the import was done via pure python and the source can be found via inspect, it will fail for extension code and if inspect runs into trouble). If you _really_ want, you can also use dis.dis() on the frame above you and analyze the bytecode. But I seriously doubt matplotlib goes to such unpleasant extremes in this case :) Cheers, f ps - for the morbidly curious, here's how to do this: planck[import_tricks]> cat all.py from trick import * planck[import_tricks]> cat mod.py import trick planck[import_tricks]> cat trick.py import sys,dis f = sys._getframe(1) print f.f_code print dis.dis(f.f_code) planck[import_tricks]> python all.py 1 0 LOAD_CONST 0 (('*',)) 3 IMPORT_NAME 0 (trick) 6 IMPORT_STAR 7 LOAD_CONST 1 (None) 10 RETURN_VALUE None planck[import_tricks]> python mod.py 1 0 LOAD_CONST 0 (None) 3 IMPORT_NAME 0 (trick) 6 STORE_NAME 0 (trick) 9 LOAD_CONST 0 (None) 12 RETURN_VALUE None #### Since the code object has a file path and line number, you could fetch that and look at the source directly instead of dealing with the bytecode. From jdhunter at ace.bsd.uchicago.edu Mon Feb 13 11:09:05 2006 From: jdhunter at ace.bsd.uchicago.edu (John Hunter) Date: Mon Feb 13 11:09:05 2006 Subject: [Numpy-discussion] matplotlib In-Reply-To: <43F0D851.7000807@colorado.edu> (Fernando Perez's message of "Mon, 13 Feb 2006 12:04:49 -0700") References: <87vevj4818.fsf@peds-pc311.bsd.uchicago.edu> <87pslr2shy.fsf@peds-pc311.bsd.uchicago.edu> <43F0D851.7000807@colorado.edu> Message-ID: <87d5hrccil.fsf@peds-pc311.bsd.uchicago.edu> >>>>> "Fernando" == Fernando Perez writes: Fernando> Actually, a little creative use of sys._getframe() can Fernando> tell you that, in some instances (if the import was done Which is why I wrote "as far as I know..." because in real life almost nothing is impossible in python if you are willing to get in and inspect and modify the stack. Fernando> But I seriously doubt matplotlib goes to such unpleasant Fernando> extremes in this case :) No, we'll leave that kind of magic to you and the ipython crew :-) JDH From tim.hochberg at cox.net Mon Feb 13 11:31:05 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Mon Feb 13 11:31:05 2006 Subject: [Numpy-discussion] Re: indexing problem Message-ID: <43F0DE44.9040506@cox.net> I've been trying to look into the problem described below, but I just can't find where complex multiplication is being done (all the other multiplication, but not complex). Could someone with a grasp of the innards of numpy please point me in the right direction? Thanks, -tim Ryan Krauss wrote: >This may only be a problem for ridiculously large numbers. 
I actually >meant to be dealing with these values: > >In [75]: d >Out[75]: >array([ 246.74011003, 986.96044011, 2220.66099025, 3947.84176044, > 6168.50275068, 8882.64396098, 12090.26539133, 15791.36704174, > 19985.94891221, 24674.01100272]) > >In [76]: s=d[-1]*1.0j > >In [77]: s >Out[77]: 24674.011002723393j > >In [78]: type(s) >Out[78]: > >In [79]: s**2 >Out[79]: (-608806818.96251547+7.4554869875188623e-08j) > >So perhaps the previous difference of 26 orders of magnitude really >did mean that the imaginary part was negligibly small, that just got >obscured by the fact that the real part was order 1e+135. > >On 2/13/06, Ryan Krauss wrote: > > >>I am having a problem with indexing an array and not getting the >>expected scalar behavior for complex128scalar: >> >>In [44]: c >>Out[44]: >>array([ 3.31781200e+06, 2.20157529e+13, 1.46088259e+20, >> 9.69386754e+26, 6.43248601e+33, 4.26835585e+40, >> 2.83232045e+47, 1.87942136e+54, 1.24711335e+61, >> 8.27537526e+67]) >> >>In [45]: s=c[-1]*1.0j >> >>In [46]: type(s) >>Out[46]: >> >>In [47]: s**2 >>Out[47]: (-6.848183561893313e+135+8.3863291020365108e+119j) >> >>In [48]: s=8.27537526e+67*1.0j >> >>In [49]: type(s) >>Out[49]: >> >>In [50]: s**2 >>Out[50]: (-6.8481835693820068e+135+0j) >> >>Why does result 47 have a non-zero imaginary part? >> >>Ryan >> >> >> > > >------------------------------------------------------- >This SF.net email is sponsored by: Splunk Inc. Do you grep through log files >for problems? Stop! Download the new AJAX search engine that makes >searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! >http://sel.as-us.falkag.net/sel?cmd=k&kid3432&bid#0486&dat1642 >_______________________________________________ >Numpy-discussion mailing list >Numpy-discussion at lists.sourceforge.net >https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > > > From aisaac at american.edu Mon Feb 13 12:13:01 2006 From: aisaac at american.edu (Alan G Isaac) Date: Mon Feb 13 12:13:01 2006 Subject: [Numpy-discussion] dtype names and titles In-Reply-To: <1139839637.7532.14.camel@localhost.localdomain> References: <43EDDC6C.6040005@web.de> <43EE63E1.10601@ieee.org><43EEF995.10706@web.de><1139821316.7532.5.camel@localhost.localdomain><1139839637.7532.14.camel@localhost.localdomain> Message-ID: On Mon, 13 Feb 2006, Francesc Altet apparently wrote: > My point is that I think that keys in arrays for accessing > fields should be unique > But of course I think that having both names and titles is > a good thing. OK. We're in agreement then. Thanks, Alan From oliphant at ee.byu.edu Mon Feb 13 13:59:01 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Mon Feb 13 13:59:01 2006 Subject: [Numpy-discussion] String array equality test does not broadcast In-Reply-To: <99AE3D0B-E106-4EE0-B70C-E84DE1BCC2C7@appliedminds.com> References: <99AE3D0B-E106-4EE0-B70C-E84DE1BCC2C7@appliedminds.com> Message-ID: <43F100F6.1020200@ee.byu.edu> Russel Howe wrote: > I am converting some numarray code to numpy and I noticed this behavior: > > >>> from numpy import * > >>> sta=array(['abc', 'def', 'ghi']) > >>> stb=array(['abc', 'jkl', 'ghi']) > >>> sta==stb > False > > I expected the same as this: > >>> a1=array([1,2,3]) > >>> a2=array([1,4,3]) > >>> a1==a2 > array([True, False, True], dtype=bool) > > I am trying to figure out how to fix this now... Equality testing on string arrays does not work (equality testing uses ufuncs internally which are not supported generally for flexible arrays). You must use chararray's. 
Thus, sta.view(chararray) == stb.view(chararray) Or create chararray's from the beginning: sta = char.array(['abc','def','ghi']) stb = char.array(['abc','jkl','ghi']) Char arrays are a special subclass of the ndarray that give arrays all the methods of strings (and unicode) elements and allow (rich) comparison operations. -Travis From oliphant at ee.byu.edu Mon Feb 13 14:08:03 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Mon Feb 13 14:08:03 2006 Subject: [Numpy-discussion] Re: indexing problem In-Reply-To: <43F0DE44.9040506@cox.net> References: <43F0DE44.9040506@cox.net> Message-ID: <43F10318.5090507@ee.byu.edu> Tim Hochberg wrote: > > I've been trying to look into the problem described below, but I just > can't find where complex multiplication is being done (all the other > multiplication, but not complex). Could someone with a grasp of the > innards of numpy please point me in the right direction? Look in the build directory for __umath_generated.c. In there you will see that multiplication for complex numbers is done using PyUFunc_FF_F and friends (i.e. using a generic interface for wrapping a "scalar" function). The scalar function wrapped into a ufunc vectorized function is given in multiply_data. In that file you should see it present as nc_prodf, nc_prod, nc_prodl. nc_prod and friends are implemented in umathmodule.c.src -Travis From russel at appliedminds.com Mon Feb 13 14:43:14 2006 From: russel at appliedminds.com (Russel Howe) Date: Mon Feb 13 14:43:14 2006 Subject: [Numpy-discussion] String array equality test does not broadcast In-Reply-To: <43F100F6.1020200@ee.byu.edu> References: <99AE3D0B-E106-4EE0-B70C-E84DE1BCC2C7@appliedminds.com> <43F100F6.1020200@ee.byu.edu> Message-ID: OK, Thanks. Russel > Equality testing on string arrays does not work (equality testing > uses ufuncs internally which are not supported generally for > flexible arrays). You must use chararray's. > > Thus, > > sta.view(chararray) == stb.view(chararray) > > Or create chararray's from the beginning: > > sta = char.array(['abc','def','ghi']) > stb = char.array(['abc','jkl','ghi']) > > Char arrays are a special subclass of the ndarray that give arrays > all the methods of strings (and unicode) elements and allow (rich) > comparison operations. > > -Travis > > > > > > > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through > log files > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD > SPLUNK! > http://sel.as-us.falkag.net/sel? > cmd=lnk&kid=103432&bid=230486&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion From tim.hochberg at cox.net Mon Feb 13 14:52:01 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Mon Feb 13 14:52:01 2006 Subject: [Numpy-discussion] Re: indexing problem In-Reply-To: <43F10318.5090507@ee.byu.edu> References: <43F0DE44.9040506@cox.net> <43F10318.5090507@ee.byu.edu> Message-ID: <43F10D4B.9050501@cox.net> Travis Oliphant wrote: > Tim Hochberg wrote: > >> >> I've been trying to look into the problem described below, but I just >> can't find where complex multiplication is being done (all the other >> multiplication, but not complex). Could someone with a grasp of the >> innards of numpy please point me in the right direction? 
> > Look in the build directory for __umath_generated.c. In there you > will see that multiplication for complex numbers is done using > PyUFunc_FF_F and friends (i.e. using a generic interface for wrapping > a "scalar" function). The scalar function wrapped into a ufunc > vectorized function is given in multiply_data. In that file you > should see it present as nc_prodf, nc_prod, nc_prodl. > > nc_prod and friends are implemented in umathmodule.c.src Thanks Travis, it would have taken me a while to track them down. As it turns out I was going off on the wrong track as I'll report in my next message. -tim From tim.hochberg at cox.net Mon Feb 13 15:05:05 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Mon Feb 13 15:05:05 2006 Subject: [Numpy-discussion] Re: indexing problem In-Reply-To: References: <43F0DBA3.9010405@cox.net> Message-ID: <43F11086.308@cox.net> >> >>Ryan Krauss wrote: >> >> >> >>>This may only be a problem for ridiculously large numbers. I actually >>>meant to be dealing with these values: >>> >>>In [75]: d >>>Out[75]: >>>array([ 246.74011003, 986.96044011, 2220.66099025, 3947.84176044, >>> 6168.50275068, 8882.64396098, 12090.26539133, 15791.36704174, >>> 19985.94891221, 24674.01100272]) >>> >>>In [76]: s=d[-1]*1.0j >>> >>>In [77]: s >>>Out[77]: 24674.011002723393j >>> >>>In [78]: type(s) >>>Out[78]: >>> >>>In [79]: s**2 >>>Out[79]: (-608806818.96251547+7.4554869875188623e-08j) >>> >>>So perhaps the previous difference of 26 orders of magnitude really >>>did mean that the imaginary part was negligibly small, that just got >>>obscured by the fact that the real part was order 1e+135. >>> >>>On 2/13/06, Ryan Krauss wrote: >>> >>> I got myself all tied up in a knot over this because I couldn't figure out how multiplying two purely complex numbers was going to result in something with a complex portion. Since I couldn't find the complex routines my imagination went wild: perhaps, I thought, numpy uses the complex multiplication routine that uses 3 multiplies instead of the more straightforward one that uses 4 multiplies, etc, etc. None of these panned out, and of course they all evaporated when I got pointed to the code that implements this which is pure vanilla. All the time I was overlooking the obvious: Ryan is using s**2, not s*s. So the obvious answer is that he's just seeing normal error in the function that is implementing pow. If this inaccuracy is a problem, I'd just replace s**2 with s*s. It will probably be both faster and more accurate anyway. Foolishly, -tim From Chris.Barker at noaa.gov Mon Feb 13 15:07:06 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon Feb 13 15:07:06 2006 Subject: [Numpy-discussion] NumPy Glossary: a request for review In-Reply-To: References: <43ED700B.10903@bigpond.net.au> <43ED7C3D.6070705@bigpond.net.au> Message-ID: <43F110D6.6060302@noaa.gov> Sasha wrote: > I updated the "broadcasting" entry. I don't think examples belong in a > glossary. I think a glossary should be more like a quick reference > rather than a tutorial. Unfortunately broadcasting is one of > those concepts that will never be clear without examples. Then a wiki page on broadcasting may be in order, and the glossary could link to it. -Chris -- Christopher Barker, Ph.D.
Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From ryanlists at gmail.com Mon Feb 13 15:16:03 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Mon Feb 13 15:16:03 2006 Subject: [Numpy-discussion] Re: indexing problem In-Reply-To: <43F11086.308@cox.net> References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> Message-ID: At the risk of sounding silly, can you explain to me in simple terms why s**2 is less accurate than s*s. I can sort of intuitively appreciate that that would be true, but would like just a little more detail. Thanks, Ryan On 2/13/06, Tim Hochberg wrote: > > >> > >>Ryan Krauss wrote: > >> > >> > >> > >>>This may only be a problem for ridiculously large numbers. I actually > >>>meant to be dealing with these values: > >>> > >>>In [75]: d > >>>Out[75]: > >>>array([ 246.74011003, 986.96044011, 2220.66099025, 3947.84176044, > >>> 6168.50275068, 8882.64396098, 12090.26539133, 15791.36704174, > >>> 19985.94891221, 24674.01100272]) > >>> > >>>In [76]: s=d[-1]*1.0j > >>> > >>>In [77]: s > >>>Out[77]: 24674.011002723393j > >>> > >>>In [78]: type(s) > >>>Out[78]: > >>> > >>>In [79]: s**2 > >>>Out[79]: (-608806818.96251547+7.4554869875188623e-08j) > >>> > >>>So perhaps the previous difference of 26 orders of magnitude really > >>>did mean that the imaginary part was negligibly small, that just got > >>>obscured by the fact that the real part was order 1e+135. > >>> > >>>On 2/13/06, Ryan Krauss wrote: > >>> > >>> > > I got myself all tied up in a knot over this because I couldn't figure > out how multiplying two purely complex numbers was going to result in > something with a complex portion. Since I couldn't find the complex > routines my imagination went wild: perhaps, I thought, numpy uses the > complex multiplication routine that uses 3 multiplies instead of the > more straightforward one that uses 4 multiplies, etc, etc. None of these > panned out, and of course they all evaporated when I got pointed to the > code that implements this which is pure vanilla. All the time I was > overlooking the obvious: > > Ryan is using s**2, not s*s. > > So the obvious answer is that he's just seeing normal error in the > function that is implementing pow. > > If this inaccuracy is a problem, I'd just replace s**2 with s*s. It will > probably be both faster and more accurate anyway. > > Foolishly, > > -tim > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through log files > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From tim.hochberg at cox.net Mon Feb 13 15:34:01 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Mon Feb 13 15:34:01 2006 Subject: [Numpy-discussion] Re: indexing problem In-Reply-To: References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> Message-ID: <43F11716.9050204@cox.net> Ryan Krauss wrote: >At the risk of sounding silly, can you explain to me in simple terms >why s**2 is less accurate than s*s.
I can sort of intuitively >appreciate that that would be true, but would like just a little >more detail. > > I don't know that it has to be *less* accurate, although it's unlikely to be more accurate since s*s should be nearly as accurate as you get with floating point. Multiplying two complex numbers in numpy is done in the most straightforward way imaginable: result.real = z1.real*z2.real - z1.imag*z2.imag result.imag = z1.real*z2.imag + z1.imag*z2.real The individual results lose very little precision and the overall result will be nearly exact to within the limits of floating point. On the other hand, s**2 is being calculated by a completely different route. Something that will look like: result = pow(s, 2.0) Pow is some general function that computes the value of s to any power. As such it's a lot more complicated than the above simple expression. I don't think that there's any reason in principle that pow(s,2) couldn't be as accurate as s*s, but there is a tradeoff between accuracy, speed and simplicity of implementation. That being said, it may be worthwhile having a look at complex pow and see if there's anything suspicious that might make the error larger than it needs to be. If all of that sounds a little bit like "I don't really know", there's some of that in there too.
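To make that multiply concrete, here it is written out as a quick Python sketch (just an illustration of the expression above, not numpy's actual C loop):

def cmul(z1, z2):
    # Four real multiplies and two adds; each component is
    # computed to nearly full double precision.
    real = z1.real*z2.real - z1.imag*z2.imag
    imag = z1.real*z2.imag + z1.imag*z2.real
    return complex(real, imag)

s = 24674.011002723393j
print(cmul(s, s))   # imaginary part is exactly zero here, since s.real == 0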
Regards, -tim From cookedm at physics.mcmaster.ca Mon Feb 13 16:22:05 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Mon Feb 13 16:22:05 2006 Subject: [Numpy-discussion] Re: indexing problem In-Reply-To: <43F11716.9050204@cox.net> (Tim Hochberg's message of "Mon, 13 Feb 2006 16:32:38 -0700") References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> Message-ID: Tim Hochberg writes: > Ryan Krauss wrote: > >>At the risk of sounding silly, can you explain to me in simple terms >>why s**2 is less accurate than s*s. I can sort of intuitively >>appreciate that that would be true, but would like just a little >>more detail. >> >> > I don't know that it has to be *less* accurate, although it's unlikely > to be more accurate since s*s should be nearly as accurate as you get > with floating point. Multiplying two complex numbers in numpy is done > in the most straightforward way imaginable: > > result.real = z1.real*z2.real - z1.imag*z2.imag > result.imag = z1.real*z2.imag + z1.imag*z2.real > > The individual results lose very little precision and the overall > result will be nearly exact to within the limits of floating point. > > On the other hand, s**2 is being calculated by a completely different > route. Something that will look like: > > result = pow(s, 2.0) > > Pow is some general function that computes the value of s to any > power. As such it's a lot more complicated than the above simple > expression. I don't think that there's any reason in principle that > pow(s,2) couldn't be as accurate as s*s, but there is a tradeoff > between accuracy, speed and simplicity of implementation. On a close tangent, I had a patch at one point for Numeric (never committed) that did pow(s, 2.0) (= s**2) actually as s*s at the C level (no pow), which helped a lot in speed (currently, s**2 is slower than s*s). I should have another look at that. The difference in speed is pretty bad: for an array of 100 complex elements, s**2 is 68.4 usec/loop as opposed to s*s with 4.13 usec/loop on my machine. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From oliphant at ee.byu.edu Mon Feb 13 16:34:07 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Mon Feb 13 16:34:07 2006 Subject: [Numpy-discussion] Re: ***[Possible UCE]*** Test units on unicode types In-Reply-To: <1139858712.7532.33.camel@localhost.localdomain> References: <1139858712.7532.33.camel@localhost.localdomain> Message-ID: <43F12540.4050006@ee.byu.edu> Francesc Altet wrote: >Hi Travis, >I've finished a series of tests on your recent new implementation of >unicode types in NumPy.
They discovered a couple of issues in Numpy: one >is clearly a bug that shows up in UCS2 builds (see the patch attached). >The other, well, it is not clear to me if it is a bug or not: > > Thanks very much for these tests... They are very, very useful. I recently realized that the getitem material must make copies for misaligned data because the convert to UCS2 functions expect aligned data (on Solaris anyway it would cause a segfault). You caught an obvious mistake in that code. >>> ia1 = numpy.array([1]) >>> type(ia1) <type 'numpy.ndarray'> >>> type(ia1.view()) <type 'numpy.ndarray'> However, for 0-dimensional arrays: >>> ia0 = numpy.array(1) >>> type(ia0) <type 'numpy.ndarray'> >>> type(ia0.view()) <type 'int32scalar'> !!!!!! Do you think that this is a bug or a feature? My opinion is that it is a bug, but maybe I'm wrong. In fact, this has a very bad effect on unicode objects in UCS2 interpreters: Almost all of the methods right now return scalars instead of 0-dimensional arrays on purpose. This was intentional because 0-dimensional arrays were not supposed to be handed around in Python. But, we were unable to completely eliminate them at this point. So, I suppose, though, that there are a select few methods that should not automatically convert 0-dimensional arrays to the equivalent scalar. .copy() is one of them *already changed* .view() is probably another *should be changed*. If you can think of other methods that should not return scalars instead of 0-dimensional arrays, post it. -Travis From wbaxter at gmail.com Mon Feb 13 16:35:03 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Mon Feb 13 16:35:03 2006 Subject: [Numpy-discussion] matplotlib In-Reply-To: References: <87vevj4818.fsf@peds-pc311.bsd.uchicago.edu> <87pslr2shy.fsf@peds-pc311.bsd.uchicago.edu> Message-ID: Sorry, I wasn't very clear. My thinking was like this: - matplotlib web pages don't mention support for numpy anywhere, just numeric and numarray - matplotlib web page says that the default is to use numeric - numpy is basically the successor to numeric plus numarray functionality - conclusion: if matplotlib actually does support numpy, and the web pages are just out of date, then probably numpy would now be the default instead of numeric, since it is the successor to numeric. But apparently there's a flaw in that thinking somewhere. --bb On 2/14/06, Ryan Krauss wrote: > The point of the numerix setting in the rc file is that matplotlib > can't tell you what is the best numerical package to use for your > problem. > > On 2/13/06, Bill Baxter wrote: > > Ah, ok. You're right. Doing from pylab import * was actually just > > overwriting the definition of array and rand() to be those from Numeric, > > which pylab was picking to use by default. I guess my expectation was that > > pylab would default to using the best numerical package installed. > > > > With "numerix : numpy" in my ~/.matplotlib/matplotlibrc file, it seems to be > > working properly now. > > > > Thanks for the help! > > > > --bb > > > > On 2/14/06, John Hunter wrote: > > > >>>>> "Bill" == Bill Baxter writes: > > > > > > Bill> from numpy import * was the only line missing, called before > > > Bill> the rest. It seems to work fine if I use from pylab import > > > Bill> * instead of import pylab as g > > > > > > Bill> And actually if I do both in this order: import pylab as g > > > Bill> from pylab import * > > > > > > Bill> Seems as if there's some > > > Bill> initialization code that only gets run with the 'from pylab > > > Bill> import *' version.
> > > As far as I know that is a python impossibility, unless perhaps you do > > > some deep dark magic that is beyond my grasp. pylab doesn't know how > > > it is imported. > > > > > > Are you sure you have your numerix set properly? I suggest creating > > > two free standing scripts, one with the problem and one without, and > > > running both with --verbose-helpful to make sure that your settings > > > are what you think they are. If you verify that numerix is set > > > properly and still see the problem, I would like to see both scripts > > > in case it is exposing a problem with matplotlib. > > > > > > Of course, doing multiple import * commands is a recipe for long term > > > pain, especially with packages that have so much overlapping namespace > > > and numpy/scipy/pylab. > > > > > > JDH > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through log files > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! > http://sel.as-us.falkag.net/sel?cmdlnk&kid3432&bid#0486&dat1642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > -- William V. Baxter III OLM Digital Kono Dens Building Rm 302 1-8-8 Wakabayashi Setagaya-ku Tokyo, Japan 154-0023 +81 (3) 3422-3380 From ndarray at mac.com Mon Feb 13 17:46:18 2006 From: ndarray at mac.com (Sasha) Date: Mon Feb 13 17:46:18 2006 Subject: [Numpy-discussion] NumPy Glossary: a request for review In-Reply-To: <43F110D6.6060302@noaa.gov> References: <43ED700B.10903@bigpond.net.au> <43ED7C3D.6070705@bigpond.net.au> <43F110D6.6060302@noaa.gov> Message-ID: On 2/13/06, Christopher Barker wrote: > Then a wiki page on broadcasting may be in order, and the glossary could > link to it. I don't think a glossary should link to anything. I envisioned the glossary as a way to resolve ambiguities for people who already know more than one meaning of the terms. However, if others think a link to more detailed explanation belongs in glossary entries, the natural destination of the link would be a page in Travis' book. Travis, can you suggest a future-proof way to refer to a page in your book? From tim.hochberg at cox.net Mon Feb 13 17:47:04 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Mon Feb 13 17:47:04 2006 Subject: [Numpy-discussion] Re: indexing problem In-Reply-To: <43F11716.9050204@cox.net> References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> Message-ID: <43F13655.5010907@cox.net> Tim Hochberg wrote: > Ryan Krauss wrote: > >> At the risk of sounding silly, can you explain to me in simple terms >> why s**2 is less accurate than s*s. I can sort of intuitively >> appreciate that that would be true, but would like just a little >> more detail. >> >> > I don't know that it has to be *less* accurate, although it's unlikely > to be more accurate since s*s should be nearly as accurate as you get > with floating point.
Multiplying two complex numbers in numpy is done > in the most straightforward way imaginable: > > result.real = z1.real*z2.real - z1.imag*z2.imag > result.imag = z1.real*z2.imag + z1.imag*z2.real > > The individual results lose very little precision and the overall > result will be nearly exact to within the limits of floating point. > > On the other hand, s**2 is being calculated by a completely different > route. Something that will look like: > > result = pow(s, 2.0) > > Pow is some general function that computes the value of s to any > power. As such it's a lot more complicated than the above simple > expression. I don't think that there's any reason in principle that > pow(s,2) couldn't be as accurate as s*s, but there is a tradeoff > between accuracy, speed and simplicity of implementation. > > That being said, it may be worthwhile having a look at complex pow and > see if there's anything suspicious that might make the error larger > than it needs to be. > > If all of that sounds a little bit like "I don't really know", there's some > of that in there too. To add a little more detail to this, the formula that numpy uses to compute a**b is: exp(b*log(a)) where: log(x) = log(|x|) + 1j*arctan2(x.imag, x.real) exp(x) = exp(x.real) * (cos(x.imag) + 1j*sin(x.imag)) With these definitions in hand, it should be apparent what's happening when *a* = |a|j and *b* = 2. First, let's compute 2*log(a) for *a* = 24674.011002723393j: 2*log(a) = (20.227011565110185+3.1415926535897931j) Now it's clear what's happening: ideally the sine of the imaginary part of the above number should be zero. However: sin(3.1415926535897931) = 1.2246063538223773e-016 And this in turn leads to the error we see here.
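That whole chain is easy to reproduce with Python's cmath module (a sketch of the formulas above, not numpy's internals):

import cmath

a = 24674.011002723393j
w = 2 * cmath.log(a)   # (20.227011565110185+3.1415926535897931j)
print(cmath.exp(w))    # ~(-608806818.96+7.46e-08j), matching Out[79] above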
-tim From wbaxter at gmail.com Mon Feb 13 17:49:15 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Mon Feb 13 17:49:15 2006 Subject: [Numpy-discussion] avoiding the matrix copy performance hit Message-ID: Is there any way to get around this timing difference? >>> import timeit >>> t1 = timeit.Timer("a = zeros((1000,1000),'d'); a += 1.;", 'from numpy import zeros,mat') >>> t2 = timeit.Timer("a = mat(zeros((1000,1000),'d')); a += 1.;", 'from numpy import zeros,mat') >>> t1.timeit(100) 1.8391627591141742 >>> t2.timeit(100) 3.2988266117713465 Copying all the data of the input array seems wasteful when the array is just going to go out of scope. Or is this not something to be concerned about? It seems like a copy-by-reference version of mat() would be useful. Really I can't imagine any case when I'd want a matrix and the original version of the array both hanging around as separate copies. I can imagine either 1) the array is just a temp and I won't ever need it again or 2) temporarily wanting a "matrix view" on the array's data to do some linalg, after which I'll go back to using the original (now modified) array as an array again. --bill From tim.hochberg at cox.net Mon Feb 13 18:02:02 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Mon Feb 13 18:02:02 2006 Subject: [Numpy-discussion] avoiding the matrix copy performance hit In-Reply-To: References: Message-ID: <43F139CD.8010407@cox.net> Bill Baxter wrote: > Is there any way to get around this timing difference?
> * > >>> import timeit > ** >>> t1 = timeit.Timer("a = zeros((1000,1000),'d'); a += 1.;", > 'from numpy import zeros,mat') > ** >>> t2 = timeit.Timer("a = mat(zeros((1000,1000),'d')); a += > 1.;", 'from numpy import zeros,mat') > >>> **t1.timeit(100) > 1.8391627591141742 > >>> t2.timeit(100) > 3.2988266117713465 > > *Copying all the data of the input array seems wasteful when the array > is just going to go out of scope. Or is this not something to be > concerned about? You could try using copy=False: >>> import timeit >>> t1 = timeit.Timer("a = zeros((1000,1000),'d'); a += 1.;", 'from numpy import zeros,mat') >>> t2 = timeit.Timer("a = mat(zeros((1000,1000),'d'), copy=False); a += 1.;", 'from numpy import z eros,mat') >>> t1.timeit(100) 3.6538127052460578 >>> t2.timeit(100) 3.6567186611706237 I'd also like to point out that your computer appears to be much faster than mine. -tim > > It seems like a copy-by-reference version of mat() would be useful. > Really I can't imagine any case when I'd want both a matrix and the > original version of the array both hanging around as separate copies. > I can imagine either 1) the array is just a temp and I won't ever need > it again or 2) temporarily wanting a "matrix view" on the array's data > to do some linalg, after which I'll go back to using the original (now > modified) array as an array again. > > --bill From tim.hochberg at cox.net Mon Feb 13 18:07:02 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Mon Feb 13 18:07:02 2006 Subject: [Numpy-discussion] Re: indexing problem In-Reply-To: References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> Message-ID: <43F13B20.3000301@cox.net> David M. Cooke wrote: >Tim Hochberg writes: > > > >>Ryan Krauss wrote: >> >> >> >>>At the risk of sounding silly, can you explain to me in simple terms >>>why s**2 is less accurate than s*s. I can sort of intuitively >>>appreciate that that would be true, but but might like just a little >>>more detail. >>> >>> >>> >>> >>I don't know that it has to be *less* accurate, although it's unlikely >>to be more accurate since s*s should be nearly as accurate as you get >>with floating point. Multiplying two complex numbers in numpy is done >>in the most straightforward way imaginable: >> >> result.real = z1.real*z2.real - z1.imag*z2.imag >> result.image = z1.real*z2.imag + z1.imag*z2.real >> >>The individual results lose very little precision and the overall >>result will be nearly exact to within the limits of floating point. >> >>On the other hand, s**2 is being calculated by a completely different >>route. Something that will look like: >> >> result = pow(s, 2.0) >> >>Pow is some general function that computes the value of s to any >>power. As such it's a lot more complicated than the above simple >>expression. I don't think that there's any reason in principle that >>pow(s,2) couldn't be as accurate as s*s, but there is a tradeoff >>between accuracy, speed and simplicity of implementation. >> >> > >On a close tangent, I had a patch at one point for Numeric (never >committed) that did pow(s, 2.0) (= s**2) actually as s*s at the C level (no >pow), which helped a lot in speed (currently, s**2 is slower than s*s). > >I should have another look at that. The difference is speed is pretty >bad: for an array of 100 complex elements, s**2 is 68.4 usec/loop as >opposed to s*s with 4.13 usec/loop on my machine. > > Python's complex object also special cases integer powers. 
Which is why you won't see the inaccuracy that started this thread using basic complex objects. However, I'm not convinced this is a good idea for numpy. This would introduce a discontinuity in a**b that could cause problems in some cases. If, for instance, one were running an iterative solver of some sort (something I've been known to do), and b was a free variable, it could get stuck at b = 2 since things would go nonmonotonic there. I would recommend that we just prominently document that x*x is faster and more accurate than x**2 and that people should use x*x where that's a concern. -tim From ryanlists at gmail.com Mon Feb 13 18:11:01 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Mon Feb 13 18:11:01 2006 Subject: [Numpy-discussion] Re: indexing problem In-Reply-To: <43F13B20.3000301@cox.net> References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> <43F13B20.3000301@cox.net> Message-ID: I agree. I already made that change in my code. Ryan On 2/13/06, Tim Hochberg wrote: > David M. Cooke wrote: > >Tim Hochberg writes: [...] > Python's complex object also special cases integer powers. Which is why > you won't see the inaccuracy that started this thread using basic > complex objects. > > However, I'm not convinced this is a good idea for numpy. This would > introduce a discontinuity in a**b that could cause problems in some > cases. If, for instance, one were running an iterative solver of some > sort (something I've been known to do), and b was a free variable, it > could get stuck at b = 2 since things would go nonmonotonic there. I > would recommend that we just prominently document that x*x is faster and > more accurate than x**2 and that people should use x*x where that's a > concern.
> > -tim > > > > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through log files > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From ryanlists at gmail.com Mon Feb 13 18:14:02 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Mon Feb 13 18:14:02 2006 Subject: [Numpy-discussion] avoiding the matrix copy performance hit In-Reply-To: <43F139CD.8010407@cox.net> References: <43F139CD.8010407@cox.net> Message-ID: You both seem to have cooler computers than I do: In [19]: t1.timeit(100) Out[19]: 4.9827449321746826 In [20]: t2.timeit(100) Out[20]: 4.9990239143371582 On 2/13/06, Tim Hochberg wrote: > Bill Baxter wrote: > > > Is there anyway to get around this timing difference? > > * > > >>> import timeit > > ** >>> t1 = timeit.Timer("a = zeros((1000,1000),'d'); a += 1.;", > > 'from numpy import zeros,mat') > > ** >>> t2 = timeit.Timer("a = mat(zeros((1000,1000),'d')); a += > > 1.;", 'from numpy import zeros,mat') > > >>> **t1.timeit(100) > > 1.8391627591141742 > > >>> t2.timeit(100) > > 3.2988266117713465 > > > > *Copying all the data of the input array seems wasteful when the array > > is just going to go out of scope. Or is this not something to be > > concerned about? > > You could try using copy=False: > > >>> import timeit > >>> t1 = timeit.Timer("a = zeros((1000,1000),'d'); a += 1.;", 'from > numpy import zeros,mat') > >>> t2 = timeit.Timer("a = mat(zeros((1000,1000),'d'), copy=False); a > += 1.;", 'from numpy import z > eros,mat') > >>> t1.timeit(100) > 3.6538127052460578 > >>> t2.timeit(100) > 3.6567186611706237 > > I'd also like to point out that your computer appears to be much faster > than mine. > > -tim > > > > > > It seems like a copy-by-reference version of mat() would be useful. > > Really I can't imagine any case when I'd want both a matrix and the > > original version of the array both hanging around as separate copies. > > I can imagine either 1) the array is just a temp and I won't ever need > > it again or 2) temporarily wanting a "matrix view" on the array's data > > to do some linalg, after which I'll go back to using the original (now > > modified) array as an array again. > > > > --bill > > > > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through log files > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! 
> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From gruben at bigpond.net.au Mon Feb 13 18:19:01 2006 From: gruben at bigpond.net.au (Gary Ruben) Date: Mon Feb 13 18:19:01 2006 Subject: [Numpy-discussion] Re: indexing problem In-Reply-To: <43F13B20.3000301@cox.net> References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> <43F13B20.3000301@cox.net> Message-ID: <43F13DE0.8040309@bigpond.net.au> Tim Hochberg wrote: > However, I'm not convinced this is a good idea for numpy. This would > introduce a discontinuity in a**b that could cause problems in some > cases. If, for instance, one were running an iterative solver of some > sort (something I've been known to do), and b was a free variable, it > could get stuck at b = 2 since things would go nonmonotonic there. I don't quite understand the problem here. Tim says Python special cases integer powers but then talks about the problem when b is a floating type. I think special casing x**2 and maybe even x**3 when the power is an integer is still a good idea. Gary R. From wbaxter at gmail.com Mon Feb 13 18:27:04 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Mon Feb 13 18:27:04 2006 Subject: [Numpy-discussion] avoiding the matrix copy performance hit In-Reply-To: <43F139CD.8010407@cox.net> References: <43F139CD.8010407@cox.net> Message-ID: On 2/14/06, Tim Hochberg wrote: > > Bill Baxter wrote: > > > Copying all the data of the input array seems wasteful when the array > > is just going to go out of scope. Or is this not something to be > > concerned about? > > You could try using copy=False: Lovely. That does the trick. And the syntax isn't so bad after defining a little helper like: def matr(a): return mat(a,copy=False) >>> t1.timeit(100) > 3.6538127052460578 > >>> t2.timeit(100) > 3.6567186611706237 > > I'd also like to point out that your computer appears to be much faster > than mine. Duly noted. :-) -tim --Bill From wbaxter at gmail.com Mon Feb 13 18:55:01 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Mon Feb 13 18:55:01 2006 Subject: [Numpy-discussion] number ranges (was Re: Matlab page on scipy wiki) Message-ID: On 2/11/06, Gary Ruben wrote: > Sasha wrote: > > On 2/10/06, Gary Ruben wrote: > >> ... I must say that Travis's > >> example numpy.r_[1,0,1:5,0,1] highlights my pet hate with python - that > >> the upper limit on an integer range is non-inclusive. > > > > In this case you must hate that an integer range starts at 0 (I don't > > think you would want len(range(10)) to be 11). First, I think the range() function in python is ugly to begin with. Why can't python just support range notation directly like 'for a in 0:10'. Or with 0..10 or 0...10 syntax. That seems to make a lot more sense to me than having to call a named function. Anyway, that's a python pet peeve, and python's probably not going to change something so fundamental... Second, sometimes zero-based, non-inclusive ranges are handy, and sometimes one-based inclusive ranges are handy. For array indexing, I personally like zero based. But sometimes I just want a list of N numbers like a human would write it, from 1 to N, and in those cases it seems really odd for N+1 to show up. This is a place where numpy could do something.
I think it would be nice if numpy had something like an 'irange' (inclusive range) function to complement the 'arange' function. They would act pretty much the same, except irange(5) would return [1,2,3,4,5], and irange(1,5) would return [1,2,3,4,5]. Anyway, I think I'm going to put a little irange function in my setup (a sketch of one appears after this exchange). --Bill From tim.hochberg at cox.net Mon Feb 13 19:20:02 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Mon Feb 13 19:20:02 2006 Subject: [Numpy-discussion] number ranges (was Re: Matlab page on scipy wiki) In-Reply-To: References: Message-ID: <43F14C18.1040600@cox.net> Bill Baxter wrote: > On 2/11/06, *Gary Ruben* > wrote: > > Sasha wrote: > > On 2/10/06, Gary Ruben > wrote: > >> ... I must say that Travis's > >> example numpy.r_[1,0,1:5,0,1] highlights my pet hate with > python - that > >> the upper limit on an integer range is non-inclusive. > > > > In this case you must hate that an integer range starts at 0 (I > don't > > think you would want len(range(10)) to be 11). > > > First, I think the range() function in python is ugly to begin with. > Why can't python just support range notation directly like 'for a in > 0:10'. Or with 0..10 or 0...10 syntax. That seems to make a lot more > sense to me than having to call a named function. Anyway, that's a > python pet peeve, and python's probably not going to change something > so fundamental... > > Second, sometimes zero-based, non-inclusive ranges are handy, and > sometimes one-based inclusive ranges are handy. For array indexing, I > personally like zero based. But sometimes I just want a list of N > numbers like a human would write it, from 1 to N, and in those cases > it seems really odd for N+1 to show up. > > This is a place where numpy could do something. I think it would be > nice if numpy had something like an 'irange' (inclusive range) > function to complement the 'arange' function. They would act pretty > much the same, except irange(5) would return [1,2,3,4,5], and > irange(1,5) would return [1,2,3,4,5]. > > Anyway, I think I'm going to put a little irange function in my setup. FWIW, I'd recommend a different name. irange sounds like it belongs in the itertools module with ifilter, islice, izip, etc. Perhaps, rangei would work, although admittedly it's harder to see. Maybe crange for closed range (versus half-open range)? I dunno, but irange seems like it's gonna confuse someone, if not you, then other people who end up looking at your code. -tim From jdhunter at ace.bsd.uchicago.edu Mon Feb 13 19:38:03 2006 From: jdhunter at ace.bsd.uchicago.edu (John Hunter) Date: Mon Feb 13 19:38:03 2006 Subject: [Numpy-discussion] number ranges In-Reply-To: (Bill Baxter's message of "Tue, 14 Feb 2006 11:54:15 +0900") References: Message-ID: <87zmkuwrgt.fsf@peds-pc311.bsd.uchicago.edu> >>>>> "Bill" == Bill Baxter writes: Bill> This is a place where numpy could do something. I think it Bill> would be nice if numpy had something like an 'irange' Bill> (inclusive range) function to complement the 'arange' In my view, this goes a bit against the spirit of the Zen of python ('import this'): There should be one -- and preferably only one -- obvious way to do it. since there is an obvious way to get the range 1..6. JDH
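For concreteness, the little inclusive-range helper being discussed might look something like this (a rough sketch of the behavior Bill describes, with the single-argument form starting at 1 rather than 0 as in his examples; not an actual numpy function):

def irange(start, stop=None, step=1):
    # Inclusive integer range: irange(5) -> [1, 2, 3, 4, 5]
    # and irange(1, 5) -> [1, 2, 3, 4, 5].
    if stop is None:
        start, stop = 1, start
    return range(start, stop + step, step)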
From cjw at sympatico.ca Mon Feb 13 19:41:06 2006 From: cjw at sympatico.ca (Colin J. Williams) Date: Mon Feb 13 19:41:06 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: <20060210122242.GA21950@sun.ac.za> <43ED10D0.1000405@ee.byu.edu> Message-ID: <43F15148.3060304@sympatico.ca> Sasha wrote: >Actually, what would be wrong with a single letter "c" or "r" for the >concatenator? NumPy already has one single-letter global identifier - >"e", so it will not be against any naming standard. I don't think >either "c" or "r" will conflict with anything in the standard library. > I would still prefer "c" because "r" is taken by RPy. > > It seems to me that a single letter would only be appropriate for a function which has a very high frequency of use. I used M for the matrix constructor for my numarray based package. Why not rowCat for row catenate or colCat for column catenate - I've never understood why concatenate is used more commonly. Colin W. > >On 2/10/06, Sasha wrote: > > >>To tell you the truth I dislike trailing underscore much more than the >>choice of letter. In my code I will probably be renaming all these >>foo_ to delete the underscore foo_(...) or foo_[...] is way too ugly >>for my taste. However I fully admit that it is just a matter of taste >>and it is trivial to rename things on import in Python. >> >>PS: Trailing underscore reminds me of C++ - the language that I >>happily live without :-) >> >>On 2/10/06, Ryan Krauss wrote: >> >> >>>The problem is that c_ at least used to mean "column concatenate" and >>>concatenate is too long to type. >>> >>>On 2/10/06, Sasha wrote: >>> >>> >>>>On 2/10/06, Travis Oliphant wrote: >>>> >>>> >>>>>The whole point of r_ is to allow you to use slice notation to build >>>>>ranges easily. I wrote it precisely to make it easier to construct >>>>>arrays in a similar style that Matlab allows. >>>>> >>>>> >>>>Maybe it is just me, but r_ is rather unintuitive. I would expect >>>>something like this to be called "c" for "combine" or "concatenate." >>>>This is the name used by S+ and R. >>>> >>>>From R manual: >>>>""" >>>>c package:base R Documentation >>>>Combine Values into a Vector or List >>>>... >>>>Examples: >>>> c(1,7:9) >>>>... >>>>""" >>>> >>>> >>>>------------------------------------------------------- >>>>This SF.net email is sponsored by: Splunk Inc. Do you grep through log files >>>>for problems? Stop! Download the new AJAX search engine that makes >>>>searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! >>>>http://sel.as-us.falkag.net/sel?cmdlnk&kid3432&bid#0486&dat1642 >>>>_______________________________________________ >>>>Numpy-discussion mailing list >>>>Numpy-discussion at lists.sourceforge.net >>>>https://lists.sourceforge.net/lists/listinfo/numpy-discussion >>>> >>>> >>>> > > >------------------------------------------------------- >This SF.net email is sponsored by: Splunk Inc. Do you grep through log files >for problems? Stop! Download the new AJAX search engine that makes >searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! >http://sel.as-us.falkag.net/sel?cmd=k&kid3432&bid#0486&dat1642 >_______________________________________________ >Numpy-discussion mailing list >Numpy-discussion at lists.sourceforge.net >https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > > From cjw at sympatico.ca Mon Feb 13 19:55:02 2006 From: cjw at sympatico.ca (Colin J.
Williams) Date: Mon Feb 13 19:55:02 2006 Subject: [Numpy-discussion] dtype names and titles In-Reply-To: <43EE63E1.10601@ieee.org> References: <43EDDC6C.6040005@web.de> <43EE63E1.10601@ieee.org> Message-ID: <43F15469.10209@sympatico.ca> Travis Oliphant wrote: > N. Volbers wrote: > >> I continue to learn all about the heterogeneous arrays... >> >> When I was reading through the records.py code I discovered that >> besides the 'names' and 'formats' for the fields of a numpy array you >> can also specify 'titles'. Playing around with this feature I >> discovered a bug: >> >> >>> import numpy >> >>> mydata = [(1,1), (2,4), (3,9)] >> >>> mytype = {'names': ['col1','col2'], 'formats':['i2','f4'], >> 'titles': ['col2', 'col1']} >> >>> b = numpy.array( mydata, dtype=mytype) >> >>> print b >> [(1.0, 1.0) (4.0, 4.0) (9.0, 9.0)] >> >> This seems to be caused by the fact that you can access a field by >> both the name and the field title. Why would you want to have two >> names anyway? > > > This lets you use attribute look up on the names but have the titles > be the "true name" of the field. Isn't it better to use the name as the identifier and the title as an external label? e.g. As a column heading when pretty-printing. It seems to me that permitting either the name or the title as an object accessor is potentially confusing. Colin W. > > I've fixed this in SVN, so that it raises an error when the titles > have the same names as the columns. > > Thanks for the test. > > > -Travis > > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through log > files > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From cjw at sympatico.ca Mon Feb 13 19:56:09 2006 From: cjw at sympatico.ca (Colin J. Williams) Date: Mon Feb 13 19:56:09 2006 Subject: [Numpy-discussion] More numpy and Numeric differences In-Reply-To: <43EE6AAB.9010408@ieee.org> References: <43EE6AAB.9010408@ieee.org> Message-ID: <43F154CF.7070403@sympatico.ca> Travis Oliphant wrote: > Pearu Peterson wrote: > >> >> I have created a wiki page >> >> http://scipy.org/PearuPeterson/NumpyVersusNumeric >> >> that reports my findings on how numpy and Numeric behave on various >> corner cases. Travis O., could you take a look at it? >> Here is the most recent addition: >> > I fixed the put issue. The problem with clip is actually in choose > (clip is just a specific application of choose). > The problem is in PyArray_ConvertToCommonType. You have an integer > array, an integer scalar, and a floating-point scalar. > I think the rules implemented in PyArray_ConvertToCommonType are not > allowing the scalar to dictate anything. But, this should clearly be > changed to allow scalars of different "kinds" to up-cast the array. > This would be consistent with the umath module. > > So, PyArray_ConvertToCommonType needs to be improved. This will have > an impact on several other functions that use this C-API. > > -Travis > A numarray vs numpy would be helpful for some of us. Colin W. From cjw at sympatico.ca Mon Feb 13 20:09:05 2006 From: cjw at sympatico.ca (Colin J. 
Williams) Date: Mon Feb 13 20:09:05 2006 Subject: [Numpy-discussion] number ranges (was Re: Matlab page on scipy wiki) In-Reply-To: <43F14C18.1040600@cox.net> References: <43F14C18.1040600@cox.net> Message-ID: <43F157C0.1040209@sympatico.ca> Tim Hochberg wrote: > Bill Baxter wrote: > >> On 2/11/06, *Gary Ruben* > > wrote: >> >> Sasha wrote: >> > On 2/10/06, Gary Ruben > > wrote: >> >> ... I must say that Travis's >> >> example numpy.r_[1,0,1:5,0,1] highlights my pet hate with >> python - that >> >> the upper limit on an integer range is non-inclusive. >> > >> > In this case you must hate that an integer range starts at 0 (I >> don't >> > think you would want len(range(10)) to be 11). >> >> >> First, I think the range() function in python is ugly to begin with. >> Why can't python just support range notation directly like 'for a in >> 0:10'. Or with 0..10 or 0...10 syntax. That seems to make a lot >> more sense to me than having to call a named function. Anyway, >> that's a python pet peeve, and python's probably not going to change >> something so fundamental... >> >> Second, sometimes zero-based, non-inclusive ranges are handy, and >> sometimes one-based inclusive ranges are handy. For array indexing, >> I personally like zero based. But sometimes I just want a list of N >> numbers like a human would write it, from 1 to N, and in those cases >> it seems really odd for N+1 to show up. >> >> This is a place where numpy could do something. I think it would be >> nice if numpy had something like an 'irange' (inclusive range) >> function to complement the 'arange' function. They would act pretty >> much the same, except irange(5) would return [1,2,3,4,5], and >> irange(1,5) would return [1,2,3,4,5]. >> >> Anyway, I think I'm going to put a little irange function in my setup. > > > > FWIW, I'd recomend a different name. irange sounds like it belongs in > the itertools module with ifilter, islice, izip, etc. Perhaps, rangei > would work, although admittedly it's harder to see. Maybe crange for > closed range (versus half-open range)? I dunno, but irange seems like > it's gonna confuse someone, if not you, then other people who end up > looking at your code. > > -tim Wouldn't it be nice if we could express range(a, b, c) as a:b:c? Colin W. From cookedm at physics.mcmaster.ca Mon Feb 13 20:14:01 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Mon Feb 13 20:14:01 2006 Subject: [Numpy-discussion] Re: indexing problem In-Reply-To: <43F13DE0.8040309@bigpond.net.au> (Gary Ruben's message of "Tue, 14 Feb 2006 13:18:08 +1100") References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> <43F13B20.3000301@cox.net> <43F13DE0.8040309@bigpond.net.au> Message-ID: Gary Ruben writes: > Tim Hochberg wrote: > >> However, I'm not convinced this is a good idea for numpy. This would >> introduce a discontinuity in a**b that could cause problems in some >> cases. If, for instance, one were running an iterative solver of >> some sort (something I've been known to do), and b was a free >> variable, it could get stuck at b = 2 since things would go >> nonmonotonic there. > > I don't quite understand the problem here. Tim says Python special > cases integer powers but then talks about the problem when b is a > floating type. I think special casing x**2 and maybe even x**3 when > the power is an integer is still a good idea. 
Well, what I had done with Numeric did special case x**0, x**1, x**(-1), x**0.5, x**2, x**3, x**4, and x**5, and only when the exponent was a scalar (so x**y where y was an array wouldn't be). I think this is very useful, as I don't want to microoptimize my code to x*x instead of x**2. The reason for just scalar exponents was so choosing how to do the power was lifted out of the inner loop. With that, x**2 was as fast as x*x. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From tim.hochberg at cox.net Mon Feb 13 20:16:04 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Mon Feb 13 20:16:04 2006 Subject: [Numpy-discussion] String array equality test does not broadcast In-Reply-To: <43F100F6.1020200@ee.byu.edu> References: <99AE3D0B-E106-4EE0-B70C-E84DE1BCC2C7@appliedminds.com> <43F100F6.1020200@ee.byu.edu> Message-ID: <43F1593C.9080203@cox.net> Travis Oliphant wrote: > Russel Howe wrote: > >> I am converting some numarray code to numpy and I noticed this behavior: >> >> >>> from numpy import * >> >>> sta=array(['abc', 'def', 'ghi']) >> >>> stb=array(['abc', 'jkl', 'ghi']) >> >>> sta==stb >> False >> >> I expected the same as this: >> >>> a1=array([1,2,3]) >> >>> a2=array([1,4,3]) >> >>> a1==a2 >> array([True, False, True], dtype=bool) >> >> I am trying to figure out how to fix this now... > > > > Equality testing on string arrays does not work (equality testing uses > ufuncs internally which are not supported generally for flexible > arrays). You must use chararray's. Should string arrays then perhaps raise an exception here to keep people out of trouble? -tim > > Thus, > > sta.view(chararray) == stb.view(chararray) > > Or create chararray's from the beginning: > > sta = char.array(['abc','def','ghi']) > stb = char.array(['abc','jkl','ghi']) > > Char arrays are a special subclass of the ndarray that give arrays all > the methods of strings (and unicode) elements and allow (rich) > comparison operations. > > -Travis > > > > > > > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through log > files > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > From wbaxter at gmail.com Mon Feb 13 20:22:06 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Mon Feb 13 20:22:06 2006 Subject: [Numpy-discussion] number ranges (was Re: Matlab page on scipy wiki) In-Reply-To: <43F14C18.1040600@cox.net> References: <43F14C18.1040600@cox.net> Message-ID: Thanks for the suggestion. Didn't realize there were all those other common i* functions. Maybe arangei is better. It gives more indication that it's mostly like arange, but different. --bb On 2/14/06, Tim Hochberg wrote: > > > FWIW, I'd recomend a different name. irange sounds like it belongs in > the itertools module with ifilter, islice, izip, etc. Perhaps, rangei > would work, although admittedly it's harder to see. Maybe crange for > closed range (versus half-open range)? I dunno, but irange seems like > it's gonna confuse someone, if not you, then other people who end up > looking at your code. 
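For reference, a minimal sketch of what such a helper could look like (the name arangei and the count-from-1 single-argument behaviour are taken from the discussion above; the implementation itself is just one plausible way to write it, and it inherits arange's usual caveats for float steps):

import numpy

def arangei(start, stop=None, step=1, dtype=None):
    # Inclusive variant of arange: arangei(1, 5) -> array([1, 2, 3, 4, 5]),
    # and arangei(5) counts from 1, giving the same result.
    if stop is None:
        start, stop = 1, start
    return numpy.arange(start, stop + step, step, dtype)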
> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gruben at bigpond.net.au Mon Feb 13 20:33:12 2006 From: gruben at bigpond.net.au (Gary Ruben) Date: Mon Feb 13 20:33:12 2006 Subject: [Numpy-discussion] number ranges (was Re: Matlab page on scipy wiki) In-Reply-To: References: Message-ID: <43F15D5A.9060202@bigpond.net.au> I agree with Bill's comments. The other languages I've used have used closed/inclusive ranges, which made a lot of sense because they supported enumerated types. The Python Zen guideline breaks down for me because it never felt 'obvious' to me and the principle of least surprise told me to expect closed ranges. I can see the arguments for both though. I discovered that Ruby has both 0..10 and 0...10 syntax depending on whether you want open or closed ranges. I don't know if numpy should break with Python convention here and try to supply a closed range specifier because there should probably be a backup Zen guideline saying 'There should be one -- and preferably only one -- way to do it, even if it's not obvious.' If there was one, I'd use it in preference though. Gary R. From oliphant.travis at ieee.org Mon Feb 13 20:35:13 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Mon Feb 13 20:35:13 2006 Subject: [Numpy-discussion] avoiding the matrix copy performance hit In-Reply-To: <43F139CD.8010407@cox.net> References: <43F139CD.8010407@cox.net> Message-ID: <43F15DDF.1030807@ieee.org> Tim Hochberg wrote: > Bill Baxter wrote: > >> Is there anyway to get around this timing difference? >> * >> >>> import timeit >> ** >>> t1 = timeit.Timer("a = zeros((1000,1000),'d'); a += 1.;", >> 'from numpy import zeros,mat') >> ** >>> t2 = timeit.Timer("a = mat(zeros((1000,1000),'d')); a += >> 1.;", 'from numpy import zeros,mat') >> >>> **t1.timeit(100) >> 1.8391627591141742 >> >>> t2.timeit(100) >> 3.2988266117713465 >> >> *Copying all the data of the input array seems wasteful when the >> array is just going to go out of scope. Or is this not something to >> be concerned about? > I think I originally tried to make mat *not* return a copy, but this actually broke code in SciPy. So, I left the default as it was as a copy on input. There is an *asmatrix* command that does not return a copy... -Travis From oliphant.travis at ieee.org Mon Feb 13 20:38:15 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Mon Feb 13 20:38:15 2006 Subject: [Numpy-discussion] String array equality test does not broadcast In-Reply-To: <43F1593C.9080203@cox.net> References: <99AE3D0B-E106-4EE0-B70C-E84DE1BCC2C7@appliedminds.com> <43F100F6.1020200@ee.byu.edu> <43F1593C.9080203@cox.net> Message-ID: <43F15E9B.7040706@ieee.org> Tim Hochberg wrote: > Travis Oliphant wrote: > >> Russel Howe wrote: >> >>> I am converting some numarray code to numpy and I noticed this >>> behavior: >>> >>> >>> from numpy import * >>> >>> sta=array(['abc', 'def', 'ghi']) >>> >>> stb=array(['abc', 'jkl', 'ghi']) >>> >>> sta==stb >>> False >>> >>> I expected the same as this: >>> >>> a1=array([1,2,3]) >>> >>> a2=array([1,4,3]) >>> >>> a1==a2 >>> array([True, False, True], dtype=bool) >>> >>> I am trying to figure out how to fix this now... >> >> >> >> >> Equality testing on string arrays does not work (equality testing >> uses ufuncs internally which are not supported generally for flexible >> arrays). You must use chararray's. > > > Should string arrays then perhaps raise an exception here to keep > people out of trouble? Probably. 
The equal (not_equal) rich comparison code has some left-over stuff from Numeric which is implemented so that if the ufunc equal (not_equal) failed False (True) was returned. I did not special-case the string arrays in this code. -Travis > > -tim > > >> >> Thus, >> >> sta.view(chararray) == stb.view(chararray) >> >> Or create chararray's from the beginning: >> >> sta = char.array(['abc','def','ghi']) >> stb = char.array(['abc','jkl','ghi']) >> >> Char arrays are a special subclass of the ndarray that give arrays >> all the methods of strings (and unicode) elements and allow (rich) >> comparison operations. >> >> -Travis >> >> >> >> >> >> >> >> >> ------------------------------------------------------- >> This SF.net email is sponsored by: Splunk Inc. Do you grep through >> log files >> for problems? Stop! Download the new AJAX search engine that makes >> searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! >> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642 >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at lists.sourceforge.net >> https://lists.sourceforge.net/lists/listinfo/numpy-discussion >> >> > > > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through log > files > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion From gruben at bigpond.net.au Mon Feb 13 20:50:02 2006 From: gruben at bigpond.net.au (Gary Ruben) Date: Mon Feb 13 20:50:02 2006 Subject: [Numpy-discussion] Re: indexing problem In-Reply-To: References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> <43F13B20.3000301@cox.net> <43F13DE0.8040309@bigpond.net.au> Message-ID: <43F16154.2050907@bigpond.net.au> Hi David, So, I think what you had done would be OK provided you removed the x**0.5 case to avoid the problem Tim raised and checked that the exponent is an integer, not just a scalar. Does anyone see a problem with this approach. Gary R. David M. Cooke wrote: > Gary Ruben writes: > >> Tim Hochberg wrote: >> >>> However, I'm not convinced this is a good idea for numpy. This would >>> introduce a discontinuity in a**b that could cause problems in some >>> cases. If, for instance, one were running an iterative solver of >>> some sort (something I've been known to do), and b was a free >>> variable, it could get stuck at b = 2 since things would go >>> nonmonotonic there. >> I don't quite understand the problem here. Tim says Python special >> cases integer powers but then talks about the problem when b is a >> floating type. I think special casing x**2 and maybe even x**3 when >> the power is an integer is still a good idea. > > Well, what I had done with Numeric did special case x**0, x**1, > x**(-1), x**0.5, x**2, x**3, x**4, and x**5, and only when the > exponent was a scalar (so x**y where y was an array wouldn't be). I > think this is very useful, as I don't want to microoptimize my code to > x*x instead of x**2. The reason for just scalar exponents was so > choosing how to do the power was lifted out of the inner loop. With > that, x**2 was as fast as x*x. 
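A quick way to probe the monotonicity concern is to evaluate a**b for exponents just around 2 and check that nothing jumps (a throwaway sketch, not code from the thread; for each fixed a the printed values should vary monotonically in b):

import numpy

a = numpy.array([0.5, 1.5, 10.0])
for b in (2.0 - 1e-12, 2.0, 2.0 + 1e-12):
    print b, a**b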
> From tim.hochberg at cox.net Mon Feb 13 21:18:12 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Mon Feb 13 21:18:12 2006 Subject: [Numpy-discussion] Re: indexing problem In-Reply-To: References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> <43F13B20.3000301@cox.net> <43F13DE0.8040309@bigpond.net.au> Message-ID: <43F167C0.7040806@cox.net> David M. Cooke wrote: >Gary Ruben writes: > > > >>Tim Hochberg wrote: >> >> >> >>>However, I'm not convinced this is a good idea for numpy. This would >>>introduce a discontinuity in a**b that could cause problems in some >>>cases. If, for instance, one were running an iterative solver of >>>some sort (something I've been known to do), and b was a free >>>variable, it could get stuck at b = 2 since things would go >>>nonmonotonic there. >>> >>> >>I don't quite understand the problem here. Tim says Python special >>cases integer powers but then talks about the problem when b is a >>floating type. I think special casing x**2 and maybe even x**3 when >>the power is an integer is still a good idea. >> >> > >Well, what I had done with Numeric did special case x**0, x**1, >x**(-1), x**0.5, x**2, x**3, x**4, and x**5, and only when the >exponent was a scalar (so x**y where y was an array wouldn't be). I >think this is very useful, as I don't want to microoptimize my code to >x*x instead of x**2. The reason for just scalar exponents was so >choosing how to do the power was lifted out of the inner loop. With >that, x**2 was as fast as x*x. > > This is getting harder to object to since, try as I might I can't get a**b to go nonmontonic in the vicinity of b==2. I run out of floating point resolution before the slight shift due to special casing at 2 results in nonmonoticity. I suspect that I could manage it with enough work, but it would require some unlikely function of a**b. I'm not sure if I'm really on board with this, but let me float a slightly modified proposal anyway: 1. numpy.power stays as it is now. That way in the rare case that someone runs into trouble they can drop back to power. Alternatively there could be rawpower and power where rawpower has the current behaviour. While the name rawpower sounds cool/cheesy, power is used infrequently enough that I doubt it matters whether it has these special case optimazations. 2, Don't distinguish between scalars and arrays -- that just makes things harder to explain. 3. Python itself special cases all integral powers between -100 and 100. Beg/borrow/steal their code. This makes it easier to explain since all smallish integer powers are just automagically faster. 4. Is the performance advantage of special casing a**0.5 signifigant? If so use the above trick to special case all half integral and integral powers between -N and N. Since sqrt probably chews up some time the cutoff. The cutoff probably shifts somewhat if we're optimizing half integral as well as integral powers. Perhaps N would be 32 or 64. The net result of this is that a**b would be computed using a combination of repeated multiplication and sqrt for real integral and half integral values of b between -N and N. That seems simpler to explain and somewhat more useful as well. It sounds like a fun project although I'm not certain yet that it's a good idea. 
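As a point of reference, the repeated-multiplication trick behind that kind of special-casing looks roughly like this in pure Python (a sketch of the idea only, not the actual CPython or numpy code):

def intpow(x, n):
    # Exponentiation by squaring: O(log n) multiplies instead of a
    # transcendental pow() call.
    if n < 0:
        return 1.0 / intpow(x, -n)
    result = 1.0
    while n:
        if n & 1:
            result = result * x
        x = x * x
        n >>= 1
    return result

For exponents in the range being discussed this costs at most a handful of multiplies, which is why it can beat a general pow() call.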
-tim From ryanlists at gmail.com Mon Feb 13 21:21:01 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Mon Feb 13 21:21:01 2006 Subject: [Numpy-discussion] Re: indexing problem In-Reply-To: <43F16154.2050907@bigpond.net.au> References: <43F11086.308@cox.net> <43F11716.9050204@cox.net> <43F13B20.3000301@cox.net> <43F13DE0.8040309@bigpond.net.au> <43F16154.2050907@bigpond.net.au> Message-ID: I think that would be great, but is there any chance there would be a problem with the scenario Tim posted earlier: If a script was running some sort of optimization on x**y, could the y value ever actually be returned as an integer and could that throw off the optimization if round off error caused the float version returned a significantly different value than the integer version? Ryan On 2/13/06, Gary Ruben wrote: > Hi David, > So, I think what you had done would be OK provided you removed the > x**0.5 case to avoid the problem Tim raised and checked that the > exponent is an integer, not just a scalar. > Does anyone see a problem with this approach. > Gary R. > > David M. Cooke wrote: > > Gary Ruben writes: > > > >> Tim Hochberg wrote: > >> > >>> However, I'm not convinced this is a good idea for numpy. This would > >>> introduce a discontinuity in a**b that could cause problems in some > >>> cases. If, for instance, one were running an iterative solver of > >>> some sort (something I've been known to do), and b was a free > >>> variable, it could get stuck at b = 2 since things would go > >>> nonmonotonic there. > >> I don't quite understand the problem here. Tim says Python special > >> cases integer powers but then talks about the problem when b is a > >> floating type. I think special casing x**2 and maybe even x**3 when > >> the power is an integer is still a good idea. > > > > Well, what I had done with Numeric did special case x**0, x**1, > > x**(-1), x**0.5, x**2, x**3, x**4, and x**5, and only when the > > exponent was a scalar (so x**y where y was an array wouldn't be). I > > think this is very useful, as I don't want to microoptimize my code to > > x*x instead of x**2. The reason for just scalar exponents was so > > choosing how to do the power was lifted out of the inner loop. With > > that, x**2 was as fast as x*x. > > > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through log files > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From cookedm at physics.mcmaster.ca Mon Feb 13 21:46:02 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Mon Feb 13 21:46:02 2006 Subject: [Numpy-discussion] Re: indexing problem In-Reply-To: <43F167C0.7040806@cox.net> (Tim Hochberg's message of "Mon, 13 Feb 2006 22:16:48 -0700") References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> <43F13B20.3000301@cox.net> <43F13DE0.8040309@bigpond.net.au> <43F167C0.7040806@cox.net> Message-ID: Tim Hochberg writes: > David M. Cooke wrote: > >>Gary Ruben writes: >> >> >> >>>Tim Hochberg wrote: >>> >>> >>> >>>>However, I'm not convinced this is a good idea for numpy. 
This would >>>>introduce a discontinuity in a**b that could cause problems in some >>>>cases. If, for instance, one were running an iterative solver of >>>>some sort (something I've been known to do), and b was a free >>>>variable, it could get stuck at b = 2 since things would go >>>>nonmonotonic there. >>>> >>>> >>>I don't quite understand the problem here. Tim says Python special >>>cases integer powers but then talks about the problem when b is a >>>floating type. I think special casing x**2 and maybe even x**3 when >>>the power is an integer is still a good idea. >>> >>> >> >>Well, what I had done with Numeric did special case x**0, x**1, >>x**(-1), x**0.5, x**2, x**3, x**4, and x**5, and only when the >>exponent was a scalar (so x**y where y was an array wouldn't be). I >>think this is very useful, as I don't want to microoptimize my code to >>x*x instead of x**2. The reason for just scalar exponents was so >>choosing how to do the power was lifted out of the inner loop. With >>that, x**2 was as fast as x*x. >> >> > This is getting harder to object to since, try as I might I can't get > a**b to go nonmontonic in the vicinity of b==2. I run out of floating > point resolution before the slight shift due to special casing at 2 > results in nonmonoticity. I suspect that I could manage it with enough > work, but it would require some unlikely function of a**b. I'm not > sure if I'm really on board with this, but let me float a slightly > modified proposal anyway: > > 1. numpy.power stays as it is now. That way in the rare case that > someone runs into trouble they can drop back to power. Alternatively > there could be rawpower and power where rawpower has the current > behaviour. While the name rawpower sounds cool/cheesy, power is used > infrequently enough that I doubt it matters whether it has these > special case optimazations. +1 > > 2, Don't distinguish between scalars and arrays -- that just makes > things harder to explain. Makes the optimizations better, though. > 3. Python itself special cases all integral powers between -100 and > 100. Beg/borrow/steal their code. This makes it easier to explain > since all smallish integer powers are just automagically faster. > > 4. Is the performance advantage of special casing a**0.5 > signifigant? If so use the above trick to special case all half > integral and integral powers between -N and N. Since sqrt probably > chews up some time the cutoff. The cutoff probably shifts somewhat if > we're optimizing half integral as well as integral powers. Perhaps N > would be 32 or 64. > > The net result of this is that a**b would be computed using a > combination of repeated multiplication and sqrt for real integral and > half integral values of b between -N and N. That seems simpler to > explain and somewhat more useful as well. > > It sounds like a fun project although I'm not certain yet that it's a > good idea. 
Basically, my Numeric code looked like this:

#define POWER_UFUNC3(prefix, basetype, exptype, outtype) \
static void prefix##_power(char **args, int *dimensions, \
                           int *steps, void *func) { \
    int i, cis1=steps[0], cis2=steps[1], cos=steps[2], n=dimensions[0]; \
    int is1=cis1/sizeof(basetype); \
    int is2=cis2/sizeof(exptype); \
    int os=cos/sizeof(outtype); \
    basetype *i1 = (basetype *)(args[0]); \
    exptype *i2=(exptype *)(args[1]); \
    outtype *op=(outtype *)(args[2]); \
    if (is2 == 0) { \
        exptype exponent = i2[0]; \
        if (POWER_equal(exponent, 0.0)) { \
            for (i = 0; i < n; i++, op += os) { \
                POWER_one((*op)) \
            } \
        } else if (POWER_equal(exponent, 1.0)) { \
            for (i = 0; i < n; i++, i1 += is1, op += os) { \
                *op = *i1; \
            } \
        } else if (POWER_equal(exponent, 2.0)) { \
            for (i = 0; i < n; i++, i1 += is1, op += os) { \
                POWER_square((*op),(*i1)) \
            } \
        } else if (POWER_equal(exponent, -1.0)) { \
            for (i = 0; i < n; i++, i1 += is1, op += os) { \
                POWER_inverse((*op),(*i1)) \
            } \
        } else if (POWER_equal(exponent, 3.0)) { \
            for (i = 0; i < n; i++, i1 += is1, op += os) { \
                POWER_cube((*op),(*i1)) \
            } \
        } else if (POWER_equal(exponent, 4.0)) { \
            for (i = 0; i < n; i++, i1 += is1, op += os) { \
                POWER_fourth((*op),(*i1)) \
            } \
        } else if (POWER_equal(exponent, 0.5)) { \
            for (i = 0; i < n; i++, i1 += is1, op += os) { \
                POWER_sqrt((*op),(*i1)) \
            } \
        } else { \
            for (i = 0; i < n; i++, i1 += is1, op += os) { \
                POWER_pow((*op), (*i1), (exponent)) \
            } \
        } \
    } else { \
        for (i = 0; i < n; i++, i1 += is1, i2 += is2, op += os) { \
            POWER_pow((*op), (*i1), (*i2)) \
        } \
    } \
}
#define POWER_UFUNC(prefix, type) POWER_UFUNC3(prefix, type, type, type)

#define FTYPE float
#define POWER_equal(x,y) x == y
#define POWER_one(o) o = 1.0;
#define POWER_square(o,x) o = x*x;
#define POWER_inverse(o,x) o = 1.0 / x;
#define POWER_cube(o,x) FTYPE y=x; o = y*y*y;
#define POWER_fourth(o,x) FTYPE y=x, s = y*y; o = s * s;
#define POWER_sqrt(o,x) o = sqrt(x);
#define POWER_pow(o,x,n) o = pow(x, n);
POWER_UFUNC(FLOAT, float)
POWER_UFUNC3(FLOATD, float, double, float)

plus similar definitions for float, double, complex float, and complex double. Using the POWER_square, etc. macros means the complex case was easy to add.

The speed comes from the inlining of how to do the power _outside_ the inner loop. The reason x**2, etc. are slower currently is that there is a function call in the inner loop. Your C library's pow() function and mine most likely do something like I have above, for a single case: pow(x, 2.0) is calculated as x*x. However, each time through, it has to decide _how_ to do it.

That's why I limited the optimization to scalar exponents: array exponents would mean it's about as slow as the pow() call, even if the checks were inlined into the loop. It would probably be even slower for the non-optimized case, as you'd check for the special exponents, then call pow() if it fails (which would likely recheck the exponents).

Maybe a simple way to add this is to rewrite x.__pow__() as something like the C equivalent of

def __pow__(self, p):
    if p is not a scalar:
        return power(self, p)
    elif p == 1:
        return self
    elif p == 2:
        return square(self)
    elif p == 3:
        return cube(self)
    elif p == 4:
        return power_4(self)
    elif p == 0:
        return ones(self.shape, dtype=self.dtype)
    elif p == -1:
        return 1.0/self
    elif p == 0.5:
        return sqrt(self)

and add ufuncs square, cube, power_4 (etc.).

-- 
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/
|cookedm at physics.mcmaster.ca

From thorin at gmail.com Mon Feb 13 22:33:01 2006
From: thorin at gmail.com (Curtis Spencer)
Date: Mon Feb 13 22:33:01 2006
Subject: [Numpy-discussion] Python Equivalent of Matlab lpc
Message-ID:

Hi, I am trying to pull cepstral coefficients from wav files for a speech recognizer, and I am wondering if there is a Python equivalent of the lpc function in Matlab? If not, does anyone know of any other good ways to featurize speech vectors with Python's included functions?

Thanks,
Curtis

From arnd.baecker at web.de Tue Feb 14 02:55:02 2006
From: arnd.baecker at web.de (Arnd Baecker)
Date: Tue Feb 14 02:55:02 2006
Subject: [Numpy-discussion] Re: indexing problem
In-Reply-To: References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> <43F13B20.3000301@cox.net> <43F13DE0.8040309@bigpond.net.au>
Message-ID:

On Mon, 13 Feb 2006, David M. Cooke wrote:

> Gary Ruben writes:
> > Tim Hochberg wrote:
> >> However, I'm not convinced this is a good idea for numpy. This would
> >> introduce a discontinuity in a**b that could cause problems in some
> >> cases. If, for instance, one were running an iterative solver of
> >> some sort (something I've been known to do), and b was a free
> >> variable, it could get stuck at b = 2 since things would go
> >> nonmonotonic there.
> >
> > I don't quite understand the problem here. Tim says Python special
> > cases integer powers but then talks about the problem when b is a
> > floating type. I think special casing x**2 and maybe even x**3 when
> > the power is an integer is still a good idea.
>
> Well, what I had done with Numeric did special case x**0, x**1,
> x**(-1), x**0.5, x**2, x**3, x**4, and x**5, and only when the
> exponent was a scalar (so x**y where y was an array wouldn't be). I
> think this is very useful, as I don't want to microoptimize my code to
> x*x instead of x**2.
The reason for just scalar exponents was so > choosing how to do the power was lifted out of the inner loop. With > that, x**2 was as fast as x*x. +1 from me for the special casing. A speed improvement of more than a factor 16 (from David's numbers) is something relevant! Best, Arnd From cimrman3 at ntc.zcu.cz Tue Feb 14 05:16:01 2006 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Tue Feb 14 05:16:01 2006 Subject: [Numpy-discussion] transpose function Message-ID: <43F1D7F5.6060807@ntc.zcu.cz> Just now I have stumbled on a non-intuitive thing with the transpose function. The help says: """ Help on function transpose in module numpy.core.oldnumeric: transpose(a, axes=None) transpose(a, axes=None) returns array with dimensions permuted according to axes. If axes is None (default) returns array with dimensions reversed. """ There are many functions in scipy that accept the 'axis' argument which is a single integer number. I have overlooked that here it is 'axes' and see what happens (I expected 'flipping' the array around the given single axis, well...): import scipy as nm In [26]:b = nm.zeros( (3,4) ) In [27]:b Out[27]: array([[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]) In [29]:nm.transpose( b, 0 ) Out[29]:array([0, 0, 0]) In [30]:nm.transpose( b, 1 ) Out[30]:array([0, 0, 0, 0]) So I propose either to replace 'axes' with 'order' or give an example in the docstring. It would be also good to raise an exception when the length of the 'axes' argument does not match the array rank and/or does not contain a permutation (no repetitions) of relevant indices. What do the gurus think? r. From tim.hochberg at cox.net Tue Feb 14 08:58:03 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Tue Feb 14 08:58:03 2006 Subject: [Numpy-discussion] Re: indexing problem In-Reply-To: References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> <43F13B20.3000301@cox.net> <43F13DE0.8040309@bigpond.net.au> <43F167C0.7040806@cox.net> Message-ID: <43F20BFE.5030100@cox.net> David M. Cooke wrote: >Tim Hochberg writes: > > > >>David M. Cooke wrote: >> >> >> >>>Gary Ruben writes: >>> >>> >>> >>> >>> >>>>Tim Hochberg wrote: >>>> >>>> >>>> >>>> >>>> >>>>>However, I'm not convinced this is a good idea for numpy. This would >>>>>introduce a discontinuity in a**b that could cause problems in some >>>>>cases. If, for instance, one were running an iterative solver of >>>>>some sort (something I've been known to do), and b was a free >>>>>variable, it could get stuck at b = 2 since things would go >>>>>nonmonotonic there. >>>>> >>>>> >>>>> >>>>> >>>>I don't quite understand the problem here. Tim says Python special >>>>cases integer powers but then talks about the problem when b is a >>>>floating type. I think special casing x**2 and maybe even x**3 when >>>>the power is an integer is still a good idea. >>>> >>>> >>>> >>>> >>>Well, what I had done with Numeric did special case x**0, x**1, >>>x**(-1), x**0.5, x**2, x**3, x**4, and x**5, and only when the >>>exponent was a scalar (so x**y where y was an array wouldn't be). I >>>think this is very useful, as I don't want to microoptimize my code to >>>x*x instead of x**2. The reason for just scalar exponents was so >>>choosing how to do the power was lifted out of the inner loop. With >>>that, x**2 was as fast as x*x. >>> >>> >>> >>> >>This is getting harder to object to since, try as I might I can't get >>a**b to go nonmontonic in the vicinity of b==2. 
I run out of floating >>point resolution before the slight shift due to special casing at 2 >>results in nonmonoticity. I suspect that I could manage it with enough >>work, but it would require some unlikely function of a**b. I'm not >>sure if I'm really on board with this, but let me float a slightly >>modified proposal anyway: >> >> 1. numpy.power stays as it is now. That way in the rare case that >>someone runs into trouble they can drop back to power. Alternatively >>there could be rawpower and power where rawpower has the current >>behaviour. While the name rawpower sounds cool/cheesy, power is used >>infrequently enough that I doubt it matters whether it has these >>special case optimazations. >> >> > >+1 > > > >> 2, Don't distinguish between scalars and arrays -- that just makes >>things harder to explain. >> >> > >Makes the optimizations better, though. > > Ah, Because you can hoist all the checks for what type of optimization to do, if any, out of the core loop, right? That's a good point. Still I'm not keen on a**b having different performance *and* different results depending on whether b is a scalar or matrix. The first thing to do is to measure how much overhead doing the optimization element by element is going to add. Assuming that it's signifigant that leaves us with the familiar dilema: fast, simple or general purpose; pick any two. 1. Do what I've proposed: optimize things at the c_pow level. This is general purpose and relatively simple to implement (since we can steal most of the code from complexobject.c). It may have a signifigant speed penalty versus 2 though: 2. Do what you've proposed: optimize things at the ufunc level. This fast and relatively simple to implement. It's more limited in scope and a bit harder to explain than 2. 3. Do both. This is straightforward, but adds a bunch of extra code paths with all the attendant required testing and possibility for bugs. So, fast, general purpose, but not simple. > > >> 3. Python itself special cases all integral powers between -100 and >>100. Beg/borrow/steal their code. This makes it easier to explain >>since all smallish integer powers are just automagically faster. >> >> 4. Is the performance advantage of special casing a**0.5 >>signifigant? If so use the above trick to special case all half >>integral and integral powers between -N and N. Since sqrt probably >>chews up some time the cutoff. The cutoff probably shifts somewhat if >>we're optimizing half integral as well as integral powers. Perhaps N >>would be 32 or 64. >> >>The net result of this is that a**b would be computed using a >>combination of repeated multiplication and sqrt for real integral and >>half integral values of b between -N and N. That seems simpler to >>explain and somewhat more useful as well. >> >>It sounds like a fun project although I'm not certain yet that it's a >>good idea. 
>>
>>
>
>Basically, my Numeric code looked like this:
>
>#define POWER_UFUNC3(prefix, basetype, exptype, outtype) \
>static void prefix##_power(char **args, int *dimensions, \
>                           int *steps, void *func) { \
>    int i, cis1=steps[0], cis2=steps[1], cos=steps[2], n=dimensions[0]; \
>    int is1=cis1/sizeof(basetype); \
>    int is2=cis2/sizeof(exptype); \
>    int os=cos/sizeof(outtype); \
>    basetype *i1 = (basetype *)(args[0]); \
>    exptype *i2=(exptype *)(args[1]); \
>    outtype *op=(outtype *)(args[2]); \
>    if (is2 == 0) { \
>        exptype exponent = i2[0]; \
>        if (POWER_equal(exponent, 0.0)) { \
>            for (i = 0; i < n; i++, op += os) { \
>                POWER_one((*op)) \
>            } \
>        } else if (POWER_equal(exponent, 1.0)) { \
>            for (i = 0; i < n; i++, i1 += is1, op += os) { \
>                *op = *i1; \
>            } \
>        } else if (POWER_equal(exponent, 2.0)) { \
>            for (i = 0; i < n; i++, i1 += is1, op += os) { \
>                POWER_square((*op),(*i1)) \
>            } \
>        } else if (POWER_equal(exponent, -1.0)) { \
>            for (i = 0; i < n; i++, i1 += is1, op += os) { \
>                POWER_inverse((*op),(*i1)) \
>            } \
>        } else if (POWER_equal(exponent, 3.0)) { \
>            for (i = 0; i < n; i++, i1 += is1, op += os) { \
>                POWER_cube((*op),(*i1)) \
>            } \
>        } else if (POWER_equal(exponent, 4.0)) { \
>            for (i = 0; i < n; i++, i1 += is1, op += os) { \
>                POWER_fourth((*op),(*i1)) \
>            } \
>        } else if (POWER_equal(exponent, 0.5)) { \
>            for (i = 0; i < n; i++, i1 += is1, op += os) { \
>                POWER_sqrt((*op),(*i1)) \
>            } \
>        } else { \
>            for (i = 0; i < n; i++, i1 += is1, op += os) { \
>                POWER_pow((*op), (*i1), (exponent)) \
>            } \
>        } \
>    } else { \
>        for (i = 0; i < n; i++, i1 += is1, i2 += is2, op += os) { \
>            POWER_pow((*op), (*i1), (*i2)) \
>        } \
>    } \
>}
>#define POWER_UFUNC(prefix, type) POWER_UFUNC3(prefix, type, type, type)
>
>#define FTYPE float
>#define POWER_equal(x,y) x == y
>#define POWER_one(o) o = 1.0;
>#define POWER_square(o,x) o = x*x;
>#define POWER_inverse(o,x) o = 1.0 / x;
>#define POWER_cube(o,x) FTYPE y=x; o = y*y*y;
>#define POWER_fourth(o,x) FTYPE y=x, s = y*y; o = s * s;
>#define POWER_sqrt(o,x) o = sqrt(x);
>#define POWER_pow(o,x,n) o = pow(x, n);
>POWER_UFUNC(FLOAT, float)
>POWER_UFUNC3(FLOATD, float, double, float)
>
>plus similar definitions for float, double, complex float, and
>complex double. Using the POWER_square, etc. macros means the complex
>case was easy to add.
>
>The speed comes from the inlining of how to do the power _outside_ the
>inner loop. The reason x**2, etc. are slower currently is that there is
>a function call in the inner loop. Your C library's pow() function and
>mine most likely do something like I have above, for a single case:
>pow(x, 2.0) is calculated as x*x. However, each time through, it has to
>decide _how_ to do it.
>
Part of our difference in perspective comes from the fact that I've just been staring at the guts of complex power. In this case you always have function calls at present, even for s*s. (At least I'm fairly certain that doesn't get inlined, although I haven't checked.) Since much of the work I do is with complex matrices, it's appropriate that I focus on this.

Have you measured the effect of a function call on the speed here, or is that just an educated guess? If it's an educated guess, it's probably worth determining how much of a speed hit the function call actually causes. I was going to try to get a handle on this by comparing multiplication of Complex numbers (which requires a function call plus more math) with multiplication of Floats, which does not. Perversely, the Complex multiplication came out marginally faster, which is hard to explain whichever way you look at it.

>>> timeit.Timer("a*b", "from numpy import arange; a = arange(10000)+0j; b = arange(10000)+0j").timeit(10000)
3.2974959107959876
>>> timeit.Timer("a*b", "from numpy import arange; a = arange(10000); b = arange(10000)").timeit(10000)
3.4541194481425919

>That's why I limited the optimization to scalar exponents: array
>exponents would mean it's about as slow as the pow() call, even if the
>checks were inlined into the loop. It would probably be even slower
>for the non-optimized case, as you'd check for the special exponents,
>then call pow() if it fails (which would likely recheck the exponents).
>
Again, here I'm thinking of the complex case. In that case at least, I don't think that the non-optimized case would take a noticeable speed hit. I would put it into pow itself, which already special cases a==0 and b==0. For float pow it might, but that's already slow, so I doubt that it would make much difference.

>Maybe a simple way to add this is to rewrite x.__pow__() as something
>like the C equivalent of
>
>def __pow__(self, p):
>    if p is not a scalar:
>        return power(self, p)
>    elif p == 1:
>        return self
>    elif p == 2:
>        return square(self)
>    elif p == 3:
>        return cube(self)
>    elif p == 4:
>        return power_4(self)
>    elif p == 0:
>        return ones(self.shape, dtype=self.dtype)
>    elif p == -1:
>        return 1.0/self
>    elif p == 0.5:
>        return sqrt(self)
>
>and add ufuncs square, cube, power_4 (etc.).

It sounds like we need to benchmark some stuff and see what we come up with. One approach would be for each of us to implement this for one type (say float) and see how the approaches compare speed-wise. That's not entirely fair, as my approach will do much better at complex than float I believe, but it's certainly easier.

regards,

-tim

From Chris.Barker at noaa.gov Tue Feb 14 11:43:02 2006
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Tue Feb 14 11:43:02 2006
Subject: [Numpy-discussion] NumPy Glossary: a request for review
In-Reply-To: References: <43ED700B.10903@bigpond.net.au> <43ED7C3D.6070705@bigpond.net.au> <43F110D6.6060302@noaa.gov>
Message-ID: <43F2328E.8090306@noaa.gov>

Sasha wrote:
> However, if others think a link to more detailed explanation belongs
> to glossary entries, the natural destination of the link would be a
> page in Travis' book.

Not unless that glossary is part of the book. Links in the Wiki should point to the wiki, or maybe to other openly available sources on the web. A link isn't critical, but a page about broadcasting would still be nice; it's a great feature of numpy.

-Chris

-- 
Christopher Barker, Ph.D.
Oceanographer

NOAA/OR&R/HAZMAT           (206) 526-6959 voice
7600 Sand Point Way NE     (206) 526-6329 fax
Seattle, WA 98115          (206) 526-6317 main reception

Chris.Barker at noaa.gov

From oliphant.travis at ieee.org Tue Feb 14 15:02:03 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Tue Feb 14 15:02:03 2006
Subject: [Numpy-discussion] New release of NumPy coming
Message-ID: <43F26137.5040901@ieee.org>

I'd like to make a new release of NumPy in the next day or two. If there are any outstanding issues, please let me know.

-Travis

From cookedm at physics.mcmaster.ca Tue Feb 14 15:13:04 2006
From: cookedm at physics.mcmaster.ca (David M.
Cooke)
Date: Tue Feb 14 15:13:04 2006
Subject: [Numpy-discussion] optimizing power() for complex and real cases (was: indexing problem)
In-Reply-To: <43F20BFE.5030100@cox.net> (Tim Hochberg's message of "Tue, 14 Feb 2006 09:57:34 -0700")
References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> <43F13B20.3000301@cox.net> <43F13DE0.8040309@bigpond.net.au> <43F167C0.7040806@cox.net> <43F20BFE.5030100@cox.net>
Message-ID:

[changed subject to reflect this better]

Tim Hochberg writes:
> David M. Cooke wrote:
>>Tim Hochberg writes:
>>>David M. Cooke wrote:
>>> 2, Don't distinguish between scalars and arrays -- that just makes
>>>things harder to explain.
>>Makes the optimizations better, though.
>>
> Ah, because you can hoist all the checks for what type of optimization
> to do, if any, out of the core loop, right? That's a good point. Still
> I'm not keen on a**b having different performance *and* different
> results depending on whether b is a scalar or matrix. The first thing
> to do is to measure how much overhead doing the optimization element
> by element is going to add. Assuming that it's significant, that leaves
> us with the familiar dilemma: fast, simple or general purpose; pick any
> two.
>
> 1. Do what I've proposed: optimize things at the c_pow level. This is
> general purpose and relatively simple to implement (since we can steal
> most of the code from complexobject.c). It may have a significant
> speed penalty versus 2 though:
>
> 2. Do what you've proposed: optimize things at the ufunc level. This is
> fast and relatively simple to implement. It's more limited in scope
> and a bit harder to explain than 1.
>
> 3. Do both. This is straightforward, but adds a bunch of extra code
> paths with all the attendant required testing and possibility for
> bugs. So, fast, general purpose, but not simple.

Start with #1, then try #2. The problem with #2 is that you still have to include #1: if you're doing x**y when y is an array, then you have to do the if (y==2) etc. checks in your inner loop anyways. In that case, you might as well do it in nc_pow. At that point, it may be better to move the #1 optimization to the level of x.__pow__ (see below).

>>The speed comes from the inlining of how to do the power _outside_ the
>>inner loop. The reason x**2, etc. are slower currently is that there is
>>a function call in the inner loop. Your C library's pow() function and
>>mine most likely do something like I have above, for a single case:
>>pow(x, 2.0) is calculated as x*x. However, each time through, it has to
>>decide _how_ to do it.
>>
> Part of our difference in perspective comes from the fact that I've
> just been staring at the guts of complex power. In this case you
> always have function calls at present, even for s*s. (At least I'm
> fairly certain that doesn't get inlined, although I haven't checked.)
> Since much of the work I do is with complex matrices, it's
> appropriate that I focus on this.

Ah, ok, now things are clicking. Complex power is going to be harder, because of making sure that going from x**2.001 to x**2 doesn't do some funny complex branch-cut stuff (I work in reals all the time :-). For the real numbers, these types of optimizations *are* a big win, and don't have the same type of continuity problems. I'll put them into numpy soon.

> Have you measured the effect of a function call on the speed here, or
> is that just an educated guess? If it's an educated guess, it's
> probably worth determining how much of a speed hit the function call
> actually causes. I was going to try to get a handle on this by
> comparing multiplication of Complex numbers (which requires a function
> call plus more math) with multiplication of Floats, which does not.
> Perversely, the Complex multiplication came out marginally faster,
> which is hard to explain whichever way you look at it.
>
>>>> timeit.Timer("a*b", "from numpy import arange; a = arange(10000)+0j; b = arange(10000)+0j").timeit(10000)
> 3.2974959107959876
>>>> timeit.Timer("a*b", "from numpy import arange; a = arange(10000); b = arange(10000)").timeit(10000)
> 3.4541194481425919

You're not multiplying floats in the last one: you're multiplying integers. You either need to use a = arange(10000.0), or a = arange(10000.0, dtype=float) (to be more specific). Your integer numbers are about 3x better than mine, though (difference in architecture, maybe? I'm on an Athlon64).

>>That's why I limited the optimization to scalar exponents: array
>>exponents would mean it's about as slow as the pow() call, even if the
>>checks were inlined into the loop. It would probably be even slower
>>for the non-optimized case, as you'd check for the special exponents,
>>then call pow() if it fails (which would likely recheck the exponents).
>>
> Again, here I'm thinking of the complex case. In that case at least, I
> don't think that the non-optimized case would take a noticeable speed
> hit. I would put it into pow itself, which already special cases a==0
> and b==0. For float pow it might, but that's already slow, so I doubt
> that it would make much difference.

It does make a bit of difference with float pow: the general case slows down a bit.

>>Maybe a simple way to add this is to rewrite x.__pow__() as something
>>like the C equivalent of
>>
>>def __pow__(self, p):
>>    if p is not a scalar:
>>        return power(self, p)
>>    elif p == 1:
>>        return self
>>    elif p == 2:
>>        return square(self)
>>    elif p == 3:
>>        return cube(self)
>>    elif p == 4:
>>        return power_4(self)
>>    elif p == 0:
>>        return ones(self.shape, dtype=self.dtype)
>>    elif p == -1:
>>        return 1.0/self
>>    elif p == 0.5:
>>        return sqrt(self)
>>
>>and add ufuncs square, cube, power_4 (etc.).
>
> It sounds like we need to benchmark some stuff and see what we come up
> with. One approach would be for each of us to implement this for one
> type (say float) and see how the approaches compare speed-wise. That's
> not entirely fair, as my approach will do much better at complex than
> float I believe, but it's certainly easier.

The way the ufuncs are templated, we can split out the complex routines easily enough. Here's what I propose:

- add a square() ufunc, where square(x) == x*x (but faster of course)
- I'll fiddle with the floats
- you fiddle with the complex numbers :-)

I've created a new branch in svn, at http://svn.scipy.org/svn/numpy/branches/power_optimization to do this fiddling. The changes I mention below are all checked in as revision 2104 (http://projects.scipy.org/scipy/numpy/changeset/2104).

I've added a square() ufunc to the power_optimization branch because I'd argue that it's probably *the* most common use of **. I've implemented it, and it's as fast as a*a for reals, and runs in 2/3 the time of a*a for complex (which makes sense: squaring a complex number takes 3 real multiplications, while multiplying takes 4 in the (simple) scheme [1]).
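The 3-versus-4 count is easy to see by writing the squaring out (an illustrative sketch, not code from the branch):

def csquare(re, im):
    # (re + im*j)**2 = (re*re - im*im) + 2*re*im*j: three real
    # multiplications (re*re, im*im, re*im); the factor of 2 is a cheap add.
    return re*re - im*im, 2.0*re*im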
The next step I'd suggest is special-casing x.__pow__, like I suggest above. We could just test for integer scalar exponents (0, 1, 2), and just special-case those (returning ones(), x.copy(), square(x)), and leave all the rest to power(). I've also checked in code to the power_optimization branch that special cases power(x, ), or anytime the basic ufunc gets called with a stride of 0 for the exponent. It doesn't do complex x, so no problems on your side, but it's a good chunk faster for this case than what we've got now. One reason I'm also looking at adding square() is because my optimization of power() makes x**2 run (only) 1.5 slower than x*x (and I can't for the life of me see where that 0.5 is coming from! It should be 1.0 like square()!). [1] which brings up another point. Would using the 3-multiplication version for complex multiplication be good? There might be some effects with cancellation errors due to the extra subtractions... -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From ndarray at mac.com Tue Feb 14 15:21:05 2006 From: ndarray at mac.com (Sasha) Date: Tue Feb 14 15:21:05 2006 Subject: [Numpy-discussion] New release of NumPy coming In-Reply-To: <43F26137.5040901@ieee.org> References: <43F26137.5040901@ieee.org> Message-ID: I was deliquent checking in ma code that takes advantage of context in __array__ . Will try to do it tomorrow. (Need to add more tests.) On 2/14/06, Travis Oliphant wrote: > > I'd like to make a new release of NumPy in the next day or two. If > there are any outstanding issues, please let me know. > > -Travis > > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through log files > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From strawman at astraw.com Tue Feb 14 21:08:02 2006 From: strawman at astraw.com (Andrew Straw) Date: Tue Feb 14 21:08:02 2006 Subject: [Numpy-discussion] New release of NumPy coming In-Reply-To: <43F26137.5040901@ieee.org> References: <43F26137.5040901@ieee.org> Message-ID: <43F2B6FC.7040508@astraw.com> Travis Oliphant wrote: > > I'd like to make a new release of NumPy in the next day or two. If > there are any outstanding issues, please let me know. > Here's one: http://projects.scipy.org/scipy/numpy/ticket/4 From strawman at astraw.com Tue Feb 14 22:24:02 2006 From: strawman at astraw.com (Andrew Straw) Date: Tue Feb 14 22:24:02 2006 Subject: [Numpy-discussion] Re: NumPy Glossary Was:Matlab page on scipy wiki In-Reply-To: References: <20060210072301.9608D12D10@sc8-sf-spam2.sourceforge.net> <200602101616.k1AGG0Vj026623@oobleck.astro.cornell.edu> Message-ID: <43F2C8C6.5030909@astraw.com> Bill Baxter wrote: > On the point of professionalism, I'd like to change the matlab page's > title from "NumPy for Matlab Addicts" to simply "NumPy for Matlab > Users". It's been bugging me since I put it up there initially... but > I'm not really sure how to chage the name of a page in the wiki. I went ahead and renamed the page and created a redirect from the old page. 
From pearu at scipy.org Wed Feb 15 00:10:08 2006 From: pearu at scipy.org (Pearu Peterson) Date: Wed Feb 15 00:10:08 2006 Subject: [Numpy-discussion] New release of NumPy coming In-Reply-To: <43F2B6FC.7040508@astraw.com> References: <43F26137.5040901@ieee.org> <43F2B6FC.7040508@astraw.com> Message-ID: On Tue, 14 Feb 2006, Andrew Straw wrote: > Travis Oliphant wrote: > >> >> I'd like to make a new release of NumPy in the next day or two. If there >> are any outstanding issues, please let me know. >> > Here's one: > http://projects.scipy.org/scipy/numpy/ticket/4 Fixed in svn. Pearu From arnd.baecker at web.de Wed Feb 15 01:18:03 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Wed Feb 15 01:18:03 2006 Subject: [Numpy-discussion] New release of NumPy coming In-Reply-To: <43F26137.5040901@ieee.org> References: <43F26137.5040901@ieee.org> Message-ID: On Tue, 14 Feb 2006, Travis Oliphant wrote: > I'd like to make a new release of NumPy in the next day or two. If > there are any outstanding issues, please let me know. It seems that the icc stuff has fallen through the cracks, http://article.gmane.org/gmane.comp.python.numeric.general/3517/ (it is not relevant to me at this point - it was only a test after your request for compilations with compilers other than gcc ;-). Best, Arnd From faltet at carabos.com Wed Feb 15 02:02:27 2006 From: faltet at carabos.com (Francesc Altet) Date: Wed Feb 15 02:02:27 2006 Subject: [Numpy-discussion] New release of NumPy coming In-Reply-To: <43F26137.5040901@ieee.org> References: <43F26137.5040901@ieee.org> Message-ID: <200602151100.37229.faltet@carabos.com> A Dimecres 15 Febrer 2006 00:01, Travis Oliphant va escriure: > I'd like to make a new release of NumPy in the next day or two. If > there are any outstanding issues, please let me know. Well, I've run all the recently added tests for unicode in both UCS4 and UCS2 platforms and all passes flawlessly. Very good :-) However, there are some problem when trying to load the unicode tests when running the complete suite: In [1]: import numpy In [2]: numpy.test(1) Found 11 tests for numpy.core.umath Found 8 tests for numpy.lib.arraysetops Found 26 tests for numpy.core.ma Found 6 tests for numpy.core.records Found 14 tests for numpy.core.numeric Found 4 tests for numpy.distutils.misc_util Found 3 tests for numpy.lib.getlimits Found 30 tests for numpy.core.numerictypes Found 9 tests for numpy.lib.twodim_base Found 1 tests for numpy.core.oldnumeric Found 44 tests for numpy.lib.shape_base Found 4 tests for numpy.lib.index_tricks Found 42 tests for numpy.lib.type_check Found 3 tests for numpy.dft.helper Warning: !! FAILURE importing tests for /usr/lib/python2.3/site-packages/numpy/core/tests/test_multiarray.py:195: ImportError: No module named test_unicode (in ?) Found 7 tests for numpy.core.defmatrix Found 33 tests for numpy.lib.function_base Found 0 tests for __main__ ....................................................................................................................................................................................................................................................... ---------------------------------------------------------------------- Ran 247 tests in 0.951s OK I've been trying to see how to correctly load the unicode tests, but failed miserably. Perhaps Pearu can tell us about the correct way to do that. Thanks, >0,0< Francesc Altet ? ? http://www.carabos.com/ V V C?rabos Coop. V. 
Enjoy Data
"-"

From schofield at ftw.at Wed Feb 15 02:32:02 2006
From: schofield at ftw.at (Ed Schofield)
Date: Wed Feb 15 02:32:02 2006
Subject: [Numpy-discussion] avoiding the matrix copy performance hit
In-Reply-To: <43F15DDF.1030807@ieee.org>
References: <43F139CD.8010407@cox.net> <43F15DDF.1030807@ieee.org>
Message-ID: <43F302F8.7080200@ftw.at>

Travis Oliphant wrote:
> I think I originally tried to make mat *not* return a copy, but this
> actually broke code in SciPy. So, I left the default as it was as a
> copy on input. There is an *asmatrix* command that does not return a
> copy...

All SciPy's unit tests actually pass with a default of copy=False in the matrix constructor. So SciPy needn't be the blocker here. I'd like to cast a vote for not copying by default, in the interests of efficiency and, as Bill Baxter argued, usefulness.

-- Ed

From pearu at scipy.org Wed Feb 15 03:20:01 2006
From: pearu at scipy.org (Pearu Peterson)
Date: Wed Feb 15 03:20:01 2006
Subject: [Numpy-discussion] New release of NumPy coming
In-Reply-To: <200602151100.37229.faltet@carabos.com>
References: <43F26137.5040901@ieee.org> <200602151100.37229.faltet@carabos.com>
Message-ID:

On Wed, 15 Feb 2006, Francesc Altet wrote:
> I've been trying to see how to correctly load the unicode tests, but
> failed miserably. Perhaps Pearu can tell us about the correct way to
> do that.

I have fixed it in svn. When importing modules from the tests/ directory, one must surround the corresponding import statements with set_local_path() and restore_path() calls.

Regards,
Pearu

From faltet at carabos.com Wed Feb 15 05:07:03 2006
From: faltet at carabos.com (Francesc Altet)
Date: Wed Feb 15 05:07:03 2006
Subject: [Numpy-discussion] New release of NumPy coming
In-Reply-To: References: <43F26137.5040901@ieee.org> <200602151100.37229.faltet@carabos.com>
Message-ID: <200602151405.47640.faltet@carabos.com>

On Wednesday 15 February 2006 11:18, Pearu Peterson wrote:
> On Wed, 15 Feb 2006, Francesc Altet wrote:
>> I've been trying to see how to correctly load the unicode tests, but
>> failed miserably. Perhaps Pearu can tell us about the correct way to
>> do that.
>
> I have fixed it in svn. When importing modules from tests/ directory, one
> must surround the corresponding import statements with set_local_path()
> and restore_path() calls.

Ah, ok. Is there any place where this is explained, or do we have to use the source to figure out these sorts of things?

Thanks,

-- 
>0,0< Francesc Altet     http://www.carabos.com/
V V Cárabos Coop. V.   Enjoy Data
"-"

From arnd.baecker at web.de Wed Feb 15 05:25:04 2006
From: arnd.baecker at web.de (Arnd Baecker)
Date: Wed Feb 15 05:25:04 2006
Subject: [Numpy-discussion] numpy/scipy transition
Message-ID:

Hi,

concerning the transition from Numeric/scipy to the new numpy/scipy, I have a couple of points for which I would be very interested in any advice/suggestions:

As some of you might know, we are running a computational physics course using python+Numeric+scipy+ipython. In six weeks the course will be run again, and we are facing the question of whether to switch to numpy/scipy right now or to delay this for one more year.

Reasons to switch
+ numpy is better in several respects compared to Numeric
+ numpy/scipy installs more easily on newer machines
+ students will learn the most recent tools
+ extensive and alive documentation on www.scipy.org

Reasons not to switch
- there is no enthought edition yet (right?)
- there are only packages for a few platforms/distributions
- we need scipy.xplt
  (matplotlib is still no option at this point)

Discussion/Background:

To us the two main show-stoppers are scipy.xplt and the question about an Enthought Edition for Windows. For the Pool of PCs where the tutorial groups are to be held, it won't be a problem to install numpy/scipy in such a way that scipy.sandbox.xplt is visible as scipy.xplt (at least I hope). However, the students will either have windows (around 80%) or Linux at home. For windows users we have used the Enthought Edition (http://code.enthought.com/enthon/), and linux users were pointed to available packages for their machines or to install Numeric/scipy themselves.

Concerning xplt, another option might be to install scipy.sandbox.xplt in such a way that an `import xplt` would work. If that is possible, we could try to supply `xplt` separately for some of the distributions, and maybe also for windows (which I don't use, so I have no idea how difficult that would be).

If something like this was possible, the main question is whether a new enthon distribution with new numpy/scipy/ipython and all the other niceties of mayavi/VTK/wxPython/.... will come out in the near future?

I would really love to use the new numpy/scipy - so any ideas are very welcome!

Best, Arnd

From manouchk at gmail.com Wed Feb 15 07:19:05 2006
From: manouchk at gmail.com (manouchk)
Date: Wed Feb 15 07:19:05 2006
Subject: [Numpy-discussion] compressed method in doc
Message-ID: <200602151326.49403.manouchk@gmail.com>

Hi,

First of all, I'm quite new to python and numpy (maybe too new!)... I was looking for a convenient way to convert a (1D) masked array to a numeric array which only contains the unmasked values. I spent several hours figuring out that the method "compressed" was the exact thing I was looking for! So I'm wondering: what is the common way for a beginner to find the right method (if there is one)? Look at the help of all methods (using tab completion to find them)?

I hope the question is not too basic for the numpy mailing-list!

Emmanuel

From chanley at stsci.edu Wed Feb 15 08:59:16 2006
From: chanley at stsci.edu (Christopher Hanley)
Date: Wed Feb 15 08:59:16 2006
Subject: [Numpy-discussion] field method in recarray
Message-ID: <43F35DB6.6070905@stsci.edu>

Hi Travis,

I have added a field method to recarray. This allows field access via either field name or index number. Example below:

In [1]: from numpy import rec
In [2]: r = rec.fromrecords([[456,'dbe',1.2],[2,'de',1.3]],names='col1,col2,col3')
In [3]: r.field('col1')
Out[3]: array([456, 2])
In [4]: r.field(0)
Out[4]: array([456, 2])
In [5]: r.field(0)[1]=1000
In [6]: r.field(0)
Out[6]: array([ 456, 1000])

Chris

From cwmoad at gmail.com Wed Feb 15 09:05:05 2006
From: cwmoad at gmail.com (Charlie Moad)
Date: Wed Feb 15 09:05:05 2006
Subject: [Numpy-discussion] distutils env variables
Message-ID: <6382066a0602150903t3d0667bes55165f23843f7b7b@mail.gmail.com>

So numpy's distutils hijacking doesn't seem to respect environment variables. When I try something like...

CPPFLAGS="-I/extra/include/path" python setup.py build

... it seems ignored. Is this intentional, or just missing functionality?
Thanks,
Charlie

From tim.hochberg at cox.net Wed Feb 15 09:15:02 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Wed Feb 15 09:15:02 2006
Subject: [Numpy-discussion] avoiding the matrix copy performance hit
In-Reply-To: <43F302F8.7080200@ftw.at>
References: <43F139CD.8010407@cox.net> <43F15DDF.1030807@ieee.org> <43F302F8.7080200@ftw.at>
Message-ID: <43F36174.3090703@cox.net>

Ed Schofield wrote:
>Travis Oliphant wrote:
>>I think I originally tried to make mat *not* return a copy, but this
>>actually broke code in SciPy. So, I left the default as it was as a
>>copy on input. There is an *asmatrix* command that does not return a
>>copy...
>
>All SciPy's unit tests actually pass with a default of copy=False in the
>matrix constructor. So SciPy needn't be the blocker here. I'd like to
>cast a vote for not copying by default, in the interests of efficiency
>and, as Bill Baxter argued, usefulness.

I would like to cast a vote for keeping the behaviour the same. Note that:

mat([[1,2,3], [3,4,5]])

will always create a copy of its data by necessity. Which means that changing the default to copy=False would mean that some data gets copied while other data does not, potentially leading to subtle bugs. I strongly disapprove of this sort of inconsistent behaviour (don't get me started on reshape!). In this situation, people should just use asmatrix.

-tim

>-- Ed

From tim.hochberg at cox.net Wed Feb 15 09:37:02 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Wed Feb 15 09:37:02 2006
Subject: [Numpy-discussion] New release of NumPy coming
In-Reply-To: <43F26137.5040901@ieee.org>
References: <43F26137.5040901@ieee.org>
Message-ID: <43F36675.7040501@cox.net>

Travis Oliphant wrote:
> I'd like to make a new release of NumPy in the next day or two. If
> there are any outstanding issues, please let me know.

Just a datapoint: I just compiled a clean checkout here using VC7 on windows XP and it compiled and passed all tests.

-tim

From tim.hochberg at cox.net Wed Feb 15 10:03:04 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Wed Feb 15 10:03:04 2006
Subject: [Numpy-discussion] optimizing power() for complex and real cases
In-Reply-To: References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> <43F13B20.3000301@cox.net> <43F13DE0.8040309@bigpond.net.au> <43F167C0.7040806@cox.net> <43F20BFE.5030100@cox.net>
Message-ID: <43F36CB6.5050004@cox.net>

David M. Cooke wrote:
>[changed subject to reflect this better]
>
>Tim Hochberg writes:
>>David M. Cooke wrote:
>>>Tim Hochberg writes:
>>>>David M. Cooke wrote:
>>>>2. Don't distinguish between scalars and arrays -- that just makes
>>>>things harder to explain.
>>>
>>>Makes the optimizations better, though.
>>
>>Ah, because you can hoist all the checks for what type of optimization
>>to do, if any, out of the core loop, right? That's a good point.
Still,
>>I'm not keen on a**b having different performance *and* different
>>results depending on whether b is a scalar or matrix. The first thing
>>to do is to measure how much overhead doing the optimization element
>>by element is going to add. Assuming that it's significant, that leaves
>>us with the familiar dilemma: fast, simple or general purpose; pick any
>>two.
>>
>>1. Do what I've proposed: optimize things at the c_pow level. This is
>>general purpose and relatively simple to implement (since we can steal
>>most of the code from complexobject.c). It may have a significant
>>speed penalty versus 2 though:
>>
>>2. Do what you've proposed: optimize things at the ufunc level. This
>>is fast and relatively simple to implement. It's more limited in scope
>>and a bit harder to explain than 1.
>>
>>3. Do both. This is straightforward, but adds a bunch of extra code
>>paths with all the attendant required testing and possibility for
>>bugs. So, fast, general purpose, but not simple.

>Start with #1, then try #2. The problem with #2 is that you still have
>to include #1: if you're doing x**y when y is an array, then you have
>to do if (y==2) etc. checks in your inner loop anyways. In that case, you
>might as well do it in nc_pow. At that point, it may be better to move
>the #1 optimization to the level of x.__pow__ (see below).

OK.

>>>The speed comes from the inlining of how to do the power _outside_ the
>>>inner loop. The reason x**2, etc. are slower currently is there is a
>>>function call in the inner loop. Yours and my C library's pow()
>>>function most likely does something like I have above, for a single
>>>case: pow(x, 2.0) is calculated as x*x. However, each time through it
>>>has to decide _how_ to do it.

>>Part of our difference in perspective comes from the fact that I've
>>just been staring at the guts of complex power. In this case you
>>always have function calls at present, even for s*s. (At least I'm
>>fairly certain that doesn't get inlined, although I haven't checked.)
>>Since much of the work I do is with complex matrices, it's
>>appropriate that I focus on this.

>Ah, ok, now things are clicking. Complex power is going to be harder,
>because one has to make sure that going from x**2.001 to x**2 doesn't do
>some funny complex branch cut stuff (I work in reals all the time :-)

We're always dealing with the principal branch though, so probably we can just ignore any branch cut issues. We'll see, I suppose.

>For the real numbers, these types of optimizations *are* a big win, and
>don't have the same type of continuity problems. I'll put them into numpy soon.

Complex power is something like thirty times slower than s*s, so there is some room for optimization there. Some peril too though, as you note.

>>Have you measured the effect of a function call on the speed here, or
>>is that just an educated guess? If it's an educated guess, it's
>>probably worth determining how much of a speed hit the function call
>>actually causes. I was going to try to get a handle on this by
>>comparing multiplication of Complex numbers (which requires a function
>>call plus more math) with multiplication of Floats, which does not.
>>Perversely, the Complex multiplication came out marginally faster,
>>which is hard to explain whichever way you look at it.
>> >>> timeit.Timer("a*b", "from numpy import arange; a = arange(10000)+0j; b = arange(10000)+0j").timeit(10000)
>> 3.2974959107959876
>> >>> timeit.Timer("a*b", "from numpy import arange; a = arange(10000); b = arange(10000)").timeit(10000)
>> 3.4541194481425919

>You're not multiplying floats in the last one: you're multiplying
>integers. You either need to use a = arange(10000.0), or a =
>arange(10000.0, dtype=float) (to be more specific).

Doh! Now that's embarrassing. Well, when I actually measure float multiplication, it's between two and ten times as fast. For small arrays (N=1000) the difference is relatively small (3.5x), I assume because the setup overhead starts to dominate. For midsized arrays (N=10,000) the difference is larger (10x). For large arrays (N=100,000) the difference becomes small (2x). Presumably the memory is no longer fitting in the cache and I'm having memory bandwidth issues.

>Your integer numbers are about 3x better than mine, though (difference
>in architecture, maybe? I'm on an Athlon64).

I'm on a P4.

>>>That's why I limited the optimization to scalar exponents: array
>>>exponents would mean it's about as slow as the pow() call, even if the
>>>checks were inlined into the loop. It would probably be even slower
>>>for the non-optimized case, as you'd check for the special exponents,
>>>then call pow() if it fails (which would likely recheck the exponents).

>>Again, here I'm thinking of the complex case. In that case at least, I
>>don't think that the non-optimized case would take a noticeable speed
>>hit. I would put it into pow itself, which already special cases a==0
>>and b==0. For float pow it might, but that's already slow, so I doubt
>>that it would make much difference.

>It does make a bit of difference with float pow: the general case
>slows down a bit.

OK. I was hoping that the difference would not be noticeable. I suspect that in the complex pow case, that will be the case since complex pow is so slow to begin with and since it already is doing some testing on the exponent.

>>>Maybe a simple way to add this is to rewrite x.__pow__() as something
>>>like the C equivalent of
>>>
>>>def __pow__(self, p):
>>>    if p is not a scalar:
>>>        return power(self, p)
>>>    elif p == 1:
>>>        return self   # i.e. x itself (a copy in practice)
>>>    elif p == 2:
>>>        return square(self)
>>>    elif p == 3:
>>>        return cube(self)
>>>    elif p == 4:
>>>        return power_4(self)
>>>    elif p == 0:
>>>        return ones(self.shape, dtype=self.dtype)
>>>    elif p == -1:
>>>        return 1.0/self
>>>    elif p == 0.5:
>>>        return sqrt(self)
>>>
>>>and add ufuncs square, cube, power_4 (etc.).

>>It sounds like we need to benchmark some stuff and see what we come up
>>with. One approach would be for each of us to implement this for one
>>type (say float) and see how the approaches compare speed-wise. That's
>>not entirely fair, as my approach will do much better at complex than
>>float I believe, but it's certainly easier.

>The way the ufuncs are templated, we can split out the complex
>routines easily enough.
>
>Here's what I propose:
>
>- add a square() ufunc, where square(x) == x*x (but faster, of course)
>- I'll fiddle with the floats
>- you fiddle with the complex numbers :-)
>
>I've created a new branch in svn, at
>http://svn.scipy.org/svn/numpy/branches/power_optimization
>to do this fiddling. The changes below I mention are all checked in as
>revision 2104 (http://projects.scipy.org/scipy/numpy/changeset/2104).
>I've added a square() ufunc to the power_optimization branch because
>I'd argue that it's probably *the* most common use of **. I've
>implemented it, and it's as fast as a*a for reals, and runs in 2/3 the
>time of a*a for complex (which makes sense: squaring a complex number
>has 3 real multiplications, while multiplying has 4 in the (simple)
>scheme [1]).
>
>At least with square(), there's no argument about continuity, as it
>only squares :-).

Actually, that's not entirely true. This gets back to the odd inaccuracy that started this thread:

>>> array([1234567j])**2
array([ -1.52415568e+12+0.00018665j])

If you special-case this, the extraneous imaginary value will vanish, but raising things to the 2.000001 power or 1.999999 power will still be off by a similar amount. I played with this a bunch, though, and I couldn't come up with a plausible way for this to make things break. I suspect I could come up with an implausible one though.

[some time passes while I sleep and otherwise try to live a normal life....]

>The next step I'd suggest is special-casing x.__pow__, like I suggest
>above. We could just test for integer scalar exponents (0, 1, 2), and
>just special-case those (returning ones(), x.copy(), square(x)), and
>leave all the rest to power().

As I've been thinking about this some more, I think the correct thing to do is not to mess with the power ufuncs at all. Rather, in x.__pow__ (since I don't know that there's anywhere else to do it), after the above checks, check the types of the arrays, and in the cases where the first argument is float or complex and the second argument is some sort of integer array, dispatch to some other helper function instead of the normal pow ufunc. In other words, optimize:

A**2, A**2.0, A**(2.0+0j), etc

and

A**array([1,2,3])

but not

A**array([1.0, 2.0, 3.0])

I think that this takes care of the optimization slowing down power for general floats and optimizes the only array-array case that really matters.

>I've also checked in code to the power_optimization branch that
>special cases power(x, <scalar>), or anytime the basic ufunc
>gets called with a stride of 0 for the exponent. It doesn't do complex
>x, so no problems on your side, but it's a good chunk faster for this
>case than what we've got now. One reason I'm also looking at adding
>square() is because my optimization of power() makes x**2 run (only)
>1.5x slower than x*x (and I can't for the life of me see where that 0.5
>is coming from! It should be 1.0x like square()!).

I just checked out your branch and I'll fiddle with the complex stuff as I've got time. I've got relatives in town this week, so my extra cycles just dropped precipitously.

>[1] which brings up another point. Would using the 3-multiplication
>version for complex multiplication be good? There might be some
>effects with cancellation errors due to the extra subtractions...

I'm inclined to leave this be for now, both because I'm unsure of the rounding issues and because I'm not sure it would actually be faster. It has one less multiplication, but several more additions, so it would depend on the relative speed of add/sub versus multiplication and how things end up getting scheduled in the FP pipeline. At some point it's probably worth trying; if it turns out to be significantly faster we can think about rounding then. If it's not faster then no need to think.
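For concreteness, here is a minimal Python sketch of the 3-multiplication scheme in question (Gauss's trick) -- purely illustrative, not what numpy's inner loop does:

def cmul3(a, b):
    # Complex multiply with three real multiplications instead of four,
    # traded for extra additions/subtractions -- which is exactly where
    # the cancellation worry comes from.
    k1 = b.real * (a.real + a.imag)
    k2 = a.real * (b.imag - b.real)
    k3 = a.imag * (b.real + b.imag)
    return complex(k1 - k3, k1 + k2)

A quick sanity check: cmul3(1+2j, 3+4j) gives (-5+10j), the same as (1+2j)*(3+4j).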
-tim

From oliphant.travis at ieee.org Wed Feb 15 10:24:02 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Wed Feb 15 10:24:02 2006
Subject: [Numpy-discussion] numpy/scipy transition
In-Reply-To: References: Message-ID: <43F37183.3000103@ieee.org>

Arnd Baecker wrote:
>Reasons not to switch
>- there is no enthought edition yet (right?)
>- there are only packages for a few platforms/distributions
>- we need scipy.xplt
>  (matplotlib is still no option at this point)
>
>Discussion/Background:
>
>To us the two main show-stoppers are scipy.xplt and the question
>about an Enthought Edition for Windows.
>For the Pool of PCs where the tutorial groups are to be held,
>it won't be a problem to install numpy/scipy in such
>a way that scipy.sandbox.xplt is visible as scipy.xplt
>(at least I hope).
>However, the students will either have windows (around 80%)
>or Linux at home. For windows users we have used
>the Enthought Edition (http://code.enthought.com/enthon/)
>and linux users were pointed to
>available packages for their machines or to install Numeric/scipy
>themselves.

As long as there are binaries for all the packages, just having a list of Windows installers can also work. Were you using all of what is in the Enthon edition?

>Concerning xplt another option might be to
>install scipy.sandbox.xplt in such a way
>that an `import xplt` would work. If that is possible we could
>try to supply `xplt` separately for some of the distributions,
>and maybe also for windows (which I don't use, so I have
>no idea how difficult that would be).

I don't think that would be hard at all. You can just run python setup.py bdist_wininst from within the sandbox/xplt directory and get a windows installer.

>If something like this was possible, the main question is
>whether a new enthon distribution with new numpy/scipy/ipython
>and all the other niceties of mayavi/VTK/wxPython/....
>will come out in the near future?

I have no idea about that one. But it sounds like the guy (Joe) at Enthought who did most of the work on the Enthon distribution is no longer as available for them, so I'm not sure...

-Travis

From stefan at sun.ac.za Wed Feb 15 10:38:05 2006
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Wed Feb 15 10:38:05 2006
Subject: [Numpy-discussion] possible bug in dtype for records
Message-ID: <20060215184056.GA31926@alpha>

Using

In [3]: numpy.__version__
Out[3]: '0.9.5.2024'

I see the following:

In [4]: import numpy as N

In [5]: ctype = N.dtype({'names': ('x', 'y', 'z'), 'formats' : [N.float32, N.float32, N.float32]})

In [6]: ctype
Out[6]: dtype([('x', '<f4'), ('y', '<f4'), ('z', '<f4')])

In [7]: N.array([(1,2,3), (4,5,6)], dtype=ctype)
Segmentation fault

However, when I use a mutable list for defining dtype, i.e.

'names': ['x', 'y', 'z'] instead of
'names': ('x', 'y', 'z')

it works fine.

Is this expected behaviour?

From oliphant at ee.byu.edu Wed Feb 15 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Subject: [Numpy-discussion] possible bug in dtype for records
In-Reply-To: <20060215184056.GA31926@alpha>
References: <20060215184056.GA31926@alpha>
Message-ID: <43F37C1E.4010409@ee.byu.edu>

Stefan van der Walt wrote:
>Using
>
>In [3]: numpy.__version__
>Out[3]: '0.9.5.2024'
>
>I see the following:
>
>In [4]: import numpy as N
>
>In [5]: ctype = N.dtype({'names': ('x', 'y', 'z'), 'formats' : [N.float32, N.float32, N.float32]})
>
>In [6]: ctype
>Out[6]: dtype([('x', '<f4'), ('y', '<f4'), ('z', '<f4')])
>
>In [7]: N.array([(1,2,3), (4,5,6)], dtype=ctype)
>Segmentation fault

A segmentation fault is never expected behavior. Thanks for pointing this out. I'll see if I can figure out what is wrong.
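For anyone hitting this before a fix lands, the list spelling from the report above works -- a minimal sketch of the workaround:

import numpy as N
# a list (rather than a tuple) for 'names' avoids the crash for now
ctype = N.dtype({'names': ['x', 'y', 'z'], 'formats': [N.float32]*3})
a = N.array([(1, 2, 3), (4, 5, 6)], dtype=ctype)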
-Travis

From oliphant at ee.byu.edu Wed Feb 15 11:27:14 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Wed Feb 15 11:27:14 2006
Subject: [Numpy-discussion] possible bug in dtype for records
In-Reply-To: <20060215184056.GA31926@alpha>
References: <20060215184056.GA31926@alpha>
Message-ID: <43F38054.4000704@ee.byu.edu>

Stefan van der Walt wrote:
>Using
>
>In [3]: numpy.__version__
>Out[3]: '0.9.5.2024'
>
>I see the following:
>
>In [4]: import numpy as N
>
>In [5]: ctype = N.dtype({'names': ('x', 'y', 'z'), 'formats' : [N.float32, N.float32, N.float32]})
>
>In [6]: ctype
>Out[6]: dtype([('x', '<f4'), ('y', '<f4'), ('z', '<f4')])
>
>In [7]: N.array([(1,2,3), (4,5,6)], dtype=ctype)
>Segmentation fault
>
>However, when I use a mutable list for defining dtype, i.e.
>
>'names': ['x', 'y', 'z'] instead of
>'names': ('x', 'y', 'z')
>
>it works fine.
>
>Is this expected behaviour?

Got it. Basically, the VOID_setitem code expected the special -1 entry in the fields dictionary to be a list, but this was never enforced, so when you entered a tuple like this, problems arose. I changed it in SVN so that the -1 entry is always a tuple (you can't change the field names anyway unless you define a new data-type), so it's more aptly described as a tuple.

Thanks again for finding this...

-Travis

From viznut at charter.net Wed Feb 15 11:43:03 2006
From: viznut at charter.net (Randall Hopper)
Date: Wed Feb 15 11:43:03 2006
Subject: [Numpy-discussion] Numeric.identity(4) failure
Message-ID:

Here on 64-bit Linux, I get strange Python errors with some Numeric functions (see below). Has this been fixed in more recent versions of Numeric? Currently running python-numeric-23.7-3, which is what comes with SuSE 9.3.

Thanks,

Randall

Python 2.4 (#1, Mar 22 2005, 18:42:42)
[GCC 3.3.5 20050117 (prerelease) (SUSE Linux)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import Numeric
>>> Numeric.identity(4)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/lib64/python2.4/site-packages/Numeric/Numeric.py", line 604, in identity
    return resize(array([1]+n*[0],typecode=typecode), (n,n))
  File "/usr/lib64/python2.4/site-packages/Numeric/Numeric.py", line 398, in resize
    return reshape(a, new_shape)
ValueError: total size of new array must be unchanged

From Chris.Barker at noaa.gov Wed Feb 15 12:00:08 2006
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Wed Feb 15 12:00:08 2006
Subject: [Numpy-discussion] numpy/scipy transition
In-Reply-To: References: Message-ID: <43F3881E.6060101@noaa.gov>

Arnd Baecker wrote:
> - we need scipy.xplt
>   (matplotlib is still no option at this point)

Why not? Just curious.

-Chris

-- 
Christopher Barker, Ph.D.
Oceanographer
NOAA/OR&R/HAZMAT
(206) 526-6959 voice
7600 Sand Point Way NE
(206) 526-6329 fax
Seattle, WA 98115
(206) 526-6317 main reception
Chris.Barker at noaa.gov

From oliphant at ee.byu.edu Wed Feb 15 13:22:18 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Wed Feb 15 13:22:18 2006
Subject: [Numpy-discussion] Re: [SciPy-user] Where do I put site.cfg? What should it contain? Can I build w/o atlas?
In-Reply-To: <43F3837B.5040603@gmail.com>
References: <17395.33239.806646.855808@montanaro.dyndns.org> <43F3837B.5040603@gmail.com>
Message-ID: <43F39B20.50502@ee.byu.edu>

Robert Kern wrote:
>skip at pobox.com wrote:
>>I'm trying to build numpy from an svn sandbox (just updated a couple minutes
>>ago). If I grub around in numpy/distutils/system_info.py it says something
>>about creating a site.cfg file with (for example) information about locating
>>atlas.
>>It says nothing about where this file belongs.
>
>Sure it does. "The file 'site.cfg' in the same directory as this module is read
>for configuration options." I think it's a really bad place for it to be, but
>that is the state of affairs right now.

So, in particular, does this mean that it is read from (relative to the location of the main setup.py file)

numpy/distutils/site.cfg ??

Yes, that is a bad place. We need some suggestions as to where site.cfg should be read from.

I think you can set the environment variable ATLAS to 'None' and it will ignore ATLAS... I believe this is true of any of the configuration options.

-Travis

From oliphant at ee.byu.edu Wed Feb 15 13:27:14 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Wed Feb 15 13:27:14 2006
Subject: [Numpy-discussion] Re: [SciPy-user] Where do I put site.cfg? What should it contain? Can I build w/o atlas?
In-Reply-To: <17395.33239.806646.855808@montanaro.dyndns.org>
References: <17395.33239.806646.855808@montanaro.dyndns.org>
Message-ID: <43F39C96.6080506@ee.byu.edu>

skip at pobox.com wrote:
>I'm trying to build numpy from an svn sandbox (just updated a couple minutes
>ago). If I grub around in numpy/distutils/system_info.py it says something
>about creating a site.cfg file with (for example) information about locating
>atlas. It says nothing about where this file belongs. I took a stab and
>placed it in my numpy source tree, right next to setup.py, with these lines:
>
>Failing all this, is there some way to build numpy/scipy without atlas? At
>this point I just want the damn thing to build. I'll worry about
>performance later (if at all).

Yes. You need to set the appropriate environment variables to 'None'. In particular, on my system (which is multithreaded and has a BLAS picked up from /usr/lib and an unthreaded ATLAS that the system will find)

export PTATLAS='None'
export ATLAS='None'
export BLAS='None'

did the trick.

-Travis

From robert.kern at gmail.com Wed Feb 15 13:46:10 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Wed Feb 15 13:46:10 2006
Subject: [Numpy-discussion] Re: [SciPy-user] Where do I put site.cfg? What should it contain? Can I build w/o atlas?
In-Reply-To: <43F39B20.50502@ee.byu.edu>
References: <17395.33239.806646.855808@montanaro.dyndns.org> <43F3837B.5040603@gmail.com> <43F39B20.50502@ee.byu.edu>
Message-ID: <43F3A0FD.4000004@gmail.com>

Travis Oliphant wrote:
> So, in particular, does this mean that it is read from (relative to the
> location of the main setup.py file)
>
> numpy/distutils/site.cfg ??

Yes. And in the case of scipy, it needs to be in the *installed* numpy/distutils directory.

> Yes, that is a bad place. We need some suggestions as to where
> site.cfg should be read from.

Next to the setup.py that *invokes* numpy.distutils. AFAICT using os.getcwd() in system_info.py will give you this directory even if you start running the script from a different directory (e.g. "python ~/svn/scipy/setup.py install"). I can check this in if we agree that this is what we want.

-- 
Robert Kern
robert.kern at gmail.com

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter

From cookedm at physics.mcmaster.ca Wed Feb 15 14:06:02 2006
From: cookedm at physics.mcmaster.ca (David M. Cooke)
Date: Wed Feb 15 14:06:02 2006
Subject: [Numpy-discussion] Re: [SciPy-user] Where do I put site.cfg? What should it contain? Can I build w/o atlas?
In-Reply-To: <43F3A0FD.4000004@gmail.com>
References: <17395.33239.806646.855808@montanaro.dyndns.org> <43F3837B.5040603@gmail.com> <43F39B20.50502@ee.byu.edu> <43F3A0FD.4000004@gmail.com>
Message-ID: <20060215220433.GA30194@arbutus.physics.mcmaster.ca>

On Wed, Feb 15, 2006 at 03:45:33PM -0600, Robert Kern wrote:
> Travis Oliphant wrote:
>
> > So, in particular, does this mean that it is read from (relative to the
> > location of the main setup.py file)
> >
> > numpy/distutils/site.cfg ??
>
> Yes. And in the case of scipy, it needs to be in the *installed* numpy/distutils
> directory.
>
> > Yes, that is a bad place. We need some suggestions as to where
> > site.cfg should be read from.
>
> Next to the setup.py that *invokes* numpy.distutils. AFAICT using os.getcwd() in
> system_info.py will give you this directory even if you start running the script
> from a different directory (e.g. "python ~/svn/scipy/setup.py install"). I can
> check this in if we agree that this is what we want.

As a note: Python's distutils looks for distutils.cfg in its installed location (/usr/lib/python2.4/distutils or whatever), then in ~/.pydistutils.cfg (or $HOME/pydistutils.cfg on non-Posix systems like Windows), then for setup.cfg in the current directory. Keys in later files override ones in earlier files.

-- 
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/
|cookedm at physics.mcmaster.ca

From oliphant at ee.byu.edu Wed Feb 15 14:15:01 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Wed Feb 15 14:15:01 2006
Subject: [Numpy-discussion] Re: [SciPy-user] Where do I put site.cfg? What should it contain? Can I build w/o atlas?
In-Reply-To: <43F3A0FD.4000004@gmail.com>
References: <17395.33239.806646.855808@montanaro.dyndns.org> <43F3837B.5040603@gmail.com> <43F39B20.50502@ee.byu.edu> <43F3A0FD.4000004@gmail.com>
Message-ID: <43F3A7A4.6040703@ee.byu.edu>

Robert Kern wrote:
>Travis Oliphant wrote:
>>So, in particular, does this mean that it is read from (relative to the
>>location of the main setup.py file)
>>
>>numpy/distutils/site.cfg ??
>
>Yes. And in the case of scipy, it needs to be in the *installed* numpy/distutils
>directory.
>
>>Yes, that is a bad place. We need some suggestions as to where
>>site.cfg should be read from.
>
>Next to the setup.py that *invokes* numpy.distutils. AFAICT using os.getcwd() in
>system_info.py will give you this directory even if you start running the script
>from a different directory (e.g. "python ~/svn/scipy/setup.py install"). I can
>check this in if we agree that this is what we want.

I've started this process already. I think a useful search order is

1) next to the current setup.py --- os.getcwd() is probably better than what I did (backing up the frame until you can't go back anymore and getting the __file__ from that frame). Incidentally, it looks like a site.cfg present there is already copied to numpy/distutils on install --- it looks like it's just not used for the numpy build itself.

2) in the user's "HOME" directory --- not sure how to implement that.

3) in the system-wide directory (what is currently done --- except when you are installing numpy that means it has to be in numpy/distutils/site.cfg).

I created a get_site_cfg() function in system_info where this searching can be done. Feel free to change it as appropriate.
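For anyone who hasn't written one yet, a site.cfg for this might look something like the following (just a sketch -- the paths are placeholders, and the exact section and key names should be checked against the docstring in system_info.py):

[atlas]
library_dirs = /usr/local/lib/atlas
include_dirs = /usr/local/include/atlas
atlas_libs = lapack, f77blas, cblas, atlas

With the search order above, the same kind of file could then live system-wide, in HOME, or next to the setup.py being run, with later copies overriding earlier ones.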
-Travis

From oliphant at ee.byu.edu Wed Feb 15 14:16:02 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Wed Feb 15 14:16:02 2006
Subject: [Numpy-discussion] Re: [SciPy-user] Where do I put site.cfg? What should it contain? Can I build w/o atlas?
In-Reply-To: <20060215220433.GA30194@arbutus.physics.mcmaster.ca>
References: <17395.33239.806646.855808@montanaro.dyndns.org> <43F3837B.5040603@gmail.com> <43F39B20.50502@ee.byu.edu> <43F3A0FD.4000004@gmail.com> <20060215220433.GA30194@arbutus.physics.mcmaster.ca>
Message-ID: <43F3A800.4020508@ee.byu.edu>

David M. Cooke wrote:
>On Wed, Feb 15, 2006 at 03:45:33PM -0600, Robert Kern wrote:
>>Travis Oliphant wrote:
>>>So, in particular, does this mean that it is read from (relative to the
>>>location of the main setup.py file)
>>>
>>>numpy/distutils/site.cfg ??
>>
>>Yes. And in the case of scipy, it needs to be in the *installed* numpy/distutils
>>directory.
>>
>>>Yes, that is a bad place. We need some suggestions as to where
>>>site.cfg should be read from.
>>
>>Next to the setup.py that *invokes* numpy.distutils. AFAICT using os.getcwd() in
>>system_info.py will give you this directory even if you start running the script
>>from a different directory (e.g. "python ~/svn/scipy/setup.py install"). I can
>>check this in if we agree that this is what we want.
>
>As a note: Python's distutils looks for distutils.cfg in its installed
>location (/usr/lib/python2.4/distutils or whatever), then in
>~/.pydistutils.cfg (or $HOME/pydistutils.cfg on non-Posix systems like
>Windows), then for setup.cfg in the current directory. Keys in later files
>override ones in earlier files.

I think this is a good plan. However, what I've started doesn't implement the overriding process properly. Anybody want to take a stab at that? It would be nice if it could get into this next release.

-Travis

From cookedm at physics.mcmaster.ca Wed Feb 15 14:45:07 2006
From: cookedm at physics.mcmaster.ca (David M. Cooke)
Date: Wed Feb 15 14:45:07 2006
Subject: [Numpy-discussion] Re: [SciPy-user] Where do I put site.cfg? What should it contain? Can I build w/o atlas?
In-Reply-To: <43F3A7A4.6040703@ee.byu.edu>
References: <17395.33239.806646.855808@montanaro.dyndns.org> <43F3837B.5040603@gmail.com> <43F39B20.50502@ee.byu.edu> <43F3A0FD.4000004@gmail.com> <43F3A7A4.6040703@ee.byu.edu>
Message-ID: <20060215224318.GA30435@arbutus.physics.mcmaster.ca>

On Wed, Feb 15, 2006 at 03:13:56PM -0700, Travis Oliphant wrote:
> Robert Kern wrote:
> I've started this process already. I think a useful search order is
>
> 1) next to the current setup.py --- os.getcwd() is probably better than
> what I did (backing up the frame until you can't go back anymore and
> getting the __file__ from that frame). Incidentally, it looks like a
> site.cfg present there is already copied to numpy/distutils on install
> --- it looks like it's just not used for the numpy build itself.
>
> 2) in the user's "HOME" directory --- not sure how to implement that.

Have a look at distutils.dist for the Distribution.find_config_files method. Also, the parse_config_files method reads the config options in a way that keeps track of which filenames they come from.

> 3) in the system-wide directory (what is currently done --- except when
> you are installing numpy that means it has to be in
> numpy/distutils/site.cfg).

-- 
|>|\/|<
/--------------------------------------------------------------------------\
|David M.
Cooke http://arbutus.physics.mcmaster.ca/dmc/
|cookedm at physics.mcmaster.ca

From gruben at bigpond.net.au Wed Feb 15 15:27:08 2006
From: gruben at bigpond.net.au (Gary Ruben)
Date: Wed Feb 15 15:27:08 2006
Subject: [Numpy-discussion] optimizing power() for complex and real cases
In-Reply-To: <43F36CB6.5050004@cox.net>
References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> <43F13B20.3000301@cox.net> <43F13DE0.8040309@bigpond.net.au> <43F167C0.7040806@cox.net> <43F20BFE.5030100@cox.net> <43F36CB6.5050004@cox.net>
Message-ID: <43F3B8A9.3000507@bigpond.net.au>

Tim Hochberg wrote:
> As I've been thinking about this some more, I think the correct thing to
> do is not to mess with the power ufuncs at all. Rather, in x.__pow__
> (since I don't know that there's anywhere else to do it), after the
> above checks, check the types of the arrays, and in the cases where the
> first argument is float or complex and the second argument is some
> sort of integer array, dispatch to some other helper function instead
> of the normal pow ufunc. In other words, optimize:
>
> A**2, A**2.0, A**(2.0+0j), etc
>
> and
>
> A**array([1,2,3])
>
> but not
>
> A**array([1.0, 2.0, 3.0])
>
> I think that this takes care of the optimization slowing down power for
> general floats and optimizes the only array-array case that really matters.

I think this might still be a tiny bit dangerous despite not observing monotonicity problems; I would be a bit more conservative and change it to:

optimize:

A**2, A**(2+0j), etc

and

A**array([1,2,3])

but not

A**array([1.0, 2.0, 3.0]), A**2.0, A**(2.0+0j)

-- Gary R.

From tim.hochberg at cox.net Wed Feb 15 16:42:02 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Wed Feb 15 16:42:02 2006
Subject: [Numpy-discussion] optimizing power() for complex and real cases
In-Reply-To: <43F3B8A9.3000507@bigpond.net.au>
References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> <43F13B20.3000301@cox.net> <43F13DE0.8040309@bigpond.net.au> <43F167C0.7040806@cox.net> <43F20BFE.5030100@cox.net> <43F36CB6.5050004@cox.net> <43F3B8A9.3000507@bigpond.net.au>
Message-ID: <43F3CA12.4000907@cox.net>

Gary Ruben wrote:
> Tim Hochberg wrote:
>> As I've been thinking about this some more, I think the correct thing
>> to do is not to mess with the power ufuncs at all. Rather, in
>> x.__pow__ (since I don't know that there's anywhere else to do it),
>> after the above checks, check the types of the arrays, and in the cases
>> where the first argument is float or complex and the second argument
>> is some sort of integer array, dispatch to some other helper function
>> instead of the normal pow ufunc. In other words, optimize:
>>
>> A**2, A**2.0, A**(2.0+0j), etc
>>
>> and
>>
>> A**array([1,2,3])
>>
>> but not
>>
>> A**array([1.0, 2.0, 3.0])
>>
>> I think that this takes care of the optimization slowing down power
>> for general floats and optimizes the only array-array case that
>> really matters.
>
> I think this might still be a tiny bit dangerous despite not observing
> monotonicity problems; I would be a bit more conservative and change
> it to:
>
> optimize:
>
> A**2, A**(2+0j), etc

I'm guessing here that you did not mean to include (2+0j) on both lists and that, in fact, you wanted not to optimize on complex exponents. So, optimize:

A**-1, A**0, A**1, A**2, etc.

> and
> A**array([1,2,3])
> but not
> A**array([1.0, 2.0, 3.0]), A**2.0, A**(2.0+0j)

That makes sense. It's safer and easier to explain: "numpy optimizes raising matrices (and possibly scalars) to integer powers". The only sticking point that I see is if David is still interested in optimizing A**0.5; that's not going to mesh with this. On the other hand, perhaps he can be persuaded that sqrt(A) is just as good. After all, it's only one more character long ;)

-tim

From oliphant.travis at ieee.org Wed Feb 15 17:10:13 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Wed Feb 15 17:10:13 2006
Subject: [Numpy-discussion] Re: New release of NumPy coming
In-Reply-To: <200602151405.47640.faltet@carabos.com>
References: <43F26137.5040901@ieee.org> <200602151100.37229.faltet@carabos.com> <200602151405.47640.faltet@carabos.com>
Message-ID:

Francesc Altet wrote:
> On Wednesday 15 February 2006 11:18, Pearu Peterson wrote:
>> On Wed, 15 Feb 2006, Francesc Altet wrote:
>>> I've been trying to see how to correctly load the unicode tests, but
>>> failed miserably. Perhaps Pearu can tell us about the correct way to
>>> do that.
>>
>> I have fixed it in svn. When importing modules from tests/ directory, one
>> must surround the corresponding import statements with set_local_path()
>> and restore_path() calls.
>
> Ah, ok. Is there any place where this is explained, or do we have to use
> the source to figure out these sorts of things?

Good thing to put in the doc subdirectory.... I was obviously not certain about the use of set_local_path() myself until Pearu corrected me.

-Travis

From harrison.ian at gmail.com Wed Feb 15 17:21:17 2006
From: harrison.ian at gmail.com (Ian Harrison)
Date: Wed Feb 15 17:21:17 2006
Subject: [Numpy-discussion] cross product of two 3xn arrays
Message-ID: <764834db0602151720r77370d7bx327e2a6dca954dbe@mail.gmail.com>

Hello,

I have two groups of 3x1 arrays that are arranged into two larger 3xn arrays. Each of the 3x1 sub-arrays represents a vector in 3D space. In Matlab, I'd use the function cross() to calculate the cross product of the corresponding 'vectors' from each array. In other words:

if ai and bi are 3x1 column vectors:

A = [ a1 a2 a3 ]

B = [ b1 b2 b3 ]

C = A x B = [ (a1 x b1) (a2 x b2) (a3 x b3) ]

Could someone suggest a clean way to do this? I suppose I could write a for loop to cycle through each pair of vectors and send them to numpy's cross(), but since I'm new to python/scipy/numpy, I'm guessing that there's probably a better method that I'm overlooking.

Thanks,
Ian

From gruben at bigpond.net.au Wed Feb 15 17:25:11 2006
From: gruben at bigpond.net.au (Gary Ruben)
Date: Wed Feb 15 17:25:11 2006
Subject: [Numpy-discussion] optimizing power() for complex and real cases
In-Reply-To: <43F3CA12.4000907@cox.net>
References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> <43F13B20.3000301@cox.net> <43F13DE0.8040309@bigpond.net.au> <43F167C0.7040806@cox.net> <43F20BFE.5030100@cox.net> <43F36CB6.5050004@cox.net> <43F3B8A9.3000507@bigpond.net.au> <43F3CA12.4000907@cox.net>
Message-ID: <43F3D43B.6090301@bigpond.net.au>

Tim Hochberg wrote:
>> optimize:
>>
>> A**2, A**(2+0j), etc
>
> I'm guessing here that you did not mean to include (2+0j) on both lists
> and that, in fact, you wanted not to optimize on complex exponents.

Oops. The complex index in the optimise list has integer parts. I assumed Python distinguished between complex numbers with integer parts and those with real (floating-point) parts, but it doesn't, so you're correct about which cases I'd vote to optimise.

> So, optimize:
>
> A**-1, A**0, A**1, A**2, etc.
>
>> and
>> A**array([1,2,3])
>> but not
>> A**array([1.0, 2.0, 3.0]), A**2.0, A**(2.0+0j)
>
> That makes sense. It's safer and easier to explain: "numpy optimizes
> raising matrices (and possibly scalars) to integer powers". The only
> sticking point that I see is if David is still interested in optimizing
> A**0.5; that's not going to mesh with this. On the other hand, perhaps
> he can be persuaded that sqrt(A) is just as good. After all, it's only
> one more character long ;)
>
> -tim

From skip at pobox.com Wed Feb 15 17:39:32 2006
From: skip at pobox.com (skip at pobox.com)
Date: Wed Feb 15 17:39:32 2006
Subject: [Numpy-discussion] Re: [SciPy-user] Where do I put site.cfg? What should it contain? Can I build w/o atlas?
In-Reply-To: <43F39B20.50502@ee.byu.edu>
References: <17395.33239.806646.855808@montanaro.dyndns.org> <43F3837B.5040603@gmail.com> <43F39B20.50502@ee.byu.edu>
Message-ID: <17395.55188.130052.595605@montanaro.dyndns.org>

Travis> Yes, that is a bad place. We need some suggestions as to where
Travis> site.cfg should be read from.

First place to look should be `pwd`.

Travis> I think you can set the environment variable ATLAS to 'None' and
Travis> it will ignore ATLAS...

Thank you, thank you, thank you. I now have numpy built... I'll tackle the rest of scipy mañana.

Skip

From gruben at bigpond.net.au Wed Feb 15 18:04:02 2006
From: gruben at bigpond.net.au (Gary Ruben)
Date: Wed Feb 15 18:04:02 2006
Subject: [Numpy-discussion] cross product of two 3xn arrays
In-Reply-To: <764834db0602151720r77370d7bx327e2a6dca954dbe@mail.gmail.com>
References: <764834db0602151720r77370d7bx327e2a6dca954dbe@mail.gmail.com>
Message-ID: <43F3DD7B.1010107@bigpond.net.au>

This *almost* does what you want, I think. I can't see a neat way to give column vectors in the solution:

In [21]: a=array([[[1],[2],[3]],[[4],[5],[6]],[[7],[8],[9]]])
In [22]: b=array([[[1],[2],[4]],[[4],[5],[7]],[[7],[8],[10]]])
In [24]: cross(a.transpose(),b.transpose())
Out[24]:
array([[[ 2, -1, 0],
        [ 5, -4, 0],
        [ 8, -7, 0]]])

Gary R.

Ian Harrison wrote:
> Hello,
>
> I have two groups of 3x1 arrays that are arranged into two larger 3xn
> arrays. Each of the 3x1 sub-arrays represents a vector in 3D space. In
> Matlab, I'd use the function cross() to calculate the cross product of
> the corresponding 'vectors' from each array. In other words:
>
> if ai and bi are 3x1 column vectors:
>
> A = [ a1 a2 a3 ]
>
> B = [ b1 b2 b3 ]
>
> C = A x B = [ (a1 x b1) (a2 x b2) (a3 x b3) ]
>
> Could someone suggest a clean way to do this? I suppose I could write
> a for loop to cycle through each pair of vectors and send them to
> numpy's cross(), but since I'm new to python/scipy/numpy, I'm guessing
> that there's probably a better method that I'm overlooking.
>
> Thanks,
> Ian

From oliphant.travis at ieee.org Wed Feb 15 19:16:04 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Wed Feb 15 19:16:04 2006
Subject: [Numpy-discussion] cross product of two 3xn arrays
In-Reply-To: <764834db0602151720r77370d7bx327e2a6dca954dbe@mail.gmail.com>
References: <764834db0602151720r77370d7bx327e2a6dca954dbe@mail.gmail.com>
Message-ID: <43F3EE4E.6030301@ieee.org>

Ian Harrison wrote:
>Hello,
>
>I have two groups of 3x1 arrays that are arranged into two larger 3xn
>arrays. Each of the 3x1 sub-arrays represents a vector in 3D space. In
>Matlab, I'd use the function cross() to calculate the cross product of
>the corresponding 'vectors' from each array. In other words:

Help on function cross in module numpy.core.numeric:

cross(a, b, axisa=-1, axisb=-1, axisc=-1)
    Return the cross product of two (arrays of) vectors.

    The cross product is performed over the last axis of a and b by default,
    and can handle axes with dimensions 2 and 3. For a dimension of 2,
    the z-component of the equivalent three-dimensional cross product is
    returned.

It's the axisa, axisb, and axisc that you are interested in.

The default is to assume you have Nx3 arrays and return an Nx3 array. But you can change the axis used to find vectors.

cross(A,B,axisa=0,axisb=0,axisc=0)

will do what you want. I suppose a single axis= argument might be useful as well for the common situation of having all the other axis arguments be the same.

-Travis

>if ai and bi are 3x1 column vectors:
>
>A = [ a1 a2 a3 ]
>
>B = [ b1 b2 b3 ]
>
>C = A x B = [ (a1 x b1) (a2 x b2) (a3 x b3) ]
>
>Could someone suggest a clean way to do this? I suppose I could write
>a for loop to cycle through each pair of vectors and send them to
>numpy's cross(), but since I'm new to python/scipy/numpy, I'm guessing
>that there's probably a better method that I'm overlooking.
>
>Thanks,
>Ian

From oliphant.travis at ieee.org Wed Feb 15 21:58:02 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Wed Feb 15 21:58:02 2006
Subject: [Numpy-discussion] Release tomorrow
Message-ID: <43F41440.2040504@ieee.org>

I just made some changes to the system_info file to parse site.cfg from the three standard locations (in order)

1) System-wide (the location of numpy/distutils/system_info.py)
2) User's HOME directory
3) Current working directory

I'm assuming the config parser will update the appropriate sections as later files are read.

I'd like to make a release tomorrow but would like a code review on my changes in this section beforehand. If somebody who uses site.cfg could try out the SVN version and see if it works as expected, that would be great.

-Travis

From gruben at bigpond.net.au Thu Feb 16 01:00:04 2006
From: gruben at bigpond.net.au (Gary Ruben)
Date: Thu Feb 16 01:00:04 2006
Subject: [Numpy-discussion] cross product of two 3xn arrays
In-Reply-To: <43F3EE4E.6030301@ieee.org>
References: <764834db0602151720r77370d7bx327e2a6dca954dbe@mail.gmail.com> <43F3EE4E.6030301@ieee.org>
Message-ID: <43F43F02.1090001@bigpond.net.au>

Hi Travis,

Have you tested this? It appears to give the wrong answer on my system. I expect to get from this

In [21]: a=array([[[1],[2],[3]],[[4],[5],[6]],[[7],[8],[9]]])
In [22]: b=array([[[1],[2],[4]],[[4],[5],[7]],[[7],[8],[10]]])

the solution

Out[24]:
array([[[ 2], [-1], [ 0]],
       [[ 5], [-4], [ 0]],
       [[ 8], [-7], [ 0]]])

i.e. the same as my example but with column vectors instead of rows, but doing

cross(a,b,axisa=0,axisb=0,axisc=0)

gives

Out[15]:
array([[[ 0], [ 0], [-3]],
       [[ 0], [ 0], [ 6]],
       [[ 0], [ 0], [-3]]])

Gary R.

Travis Oliphant wrote:
> Ian Harrison wrote:
>> Hello,
>>
>> I have two groups of 3x1 arrays that are arranged into two larger 3xn
>> arrays.
>> Each of the 3x1 sub-arrays represents a vector in 3D space. In
>> Matlab, I'd use the function cross() to calculate the cross product of
>> the corresponding 'vectors' from each array. In other words:
>
> Help on function cross in module numpy.core.numeric:
>
> cross(a, b, axisa=-1, axisb=-1, axisc=-1)
>     Return the cross product of two (arrays of) vectors.
>
>     The cross product is performed over the last axis of a and b by default,
>     and can handle axes with dimensions 2 and 3. For a dimension of 2,
>     the z-component of the equivalent three-dimensional cross product is
>     returned.
>
> It's the axisa, axisb, and axisc that you are interested in.
>
> The default is to assume you have Nx3 arrays and return an Nx3 array.
> But you can change the axis used to find vectors.
>
> cross(A,B,axisa=0,axisb=0,axisc=0)
>
> will do what you want. I suppose a single axis= argument might be
> useful as well for the common situation of having all the other axis
> arguments be the same.
>
> -Travis

From arnd.baecker at web.de Thu Feb 16 01:47:03 2006
From: arnd.baecker at web.de (Arnd Baecker)
Date: Thu Feb 16 01:47:03 2006
Subject: [Numpy-discussion] numpy/scipy transition
In-Reply-To: <43F37183.3000103@ieee.org>
References: <43F37183.3000103@ieee.org>
Message-ID:

Hi,

On Wed, 15 Feb 2006, Travis Oliphant wrote:
> Arnd Baecker wrote:
>
> >Reasons not to switch
> >- there is no enthought edition yet (right?)
> >- there are only packages for a few platforms/distributions
> >- we need scipy.xplt
> >  (matplotlib is still no option at this point)
> >
> >Discussion/Background:
> >
> >To us the two main show-stoppers are scipy.xplt and the question
> >about an Enthought Edition for Windows.
> >For the Pool of PCs where the tutorial groups are to be held,
> >it won't be a problem to install numpy/scipy in such
> >a way that scipy.sandbox.xplt is visible as scipy.xplt
> >(at least I hope).
> >However, the students will either have windows (around 80%)
> >or Linux at home. For windows users we have used
> >the Enthought Edition (http://code.enthought.com/enthon/)
> >and linux users were pointed to
> >available packages for their machines or to install Numeric/scipy
> >themselves.
>
> As long as there are binaries for all the packages, just having a list
> of Windows installers can also work.
> Were you using all of what is in
> the Enthon edition?

Of course not (there is so much stuff ;-), but the students made good use of VPython, and VTK/MayaVi was also well received. So the bare minimum might be python+numpy/scipy/ipython/VPython (+ any windows-specific stuff?) + maybe VTK and MayaVi

> >Concerning xplt another option might be to
> >install scipy.sandbox.xplt in such a way
> >that an `import xplt` would work. If that is possible we could
> >try to supply `xplt` separately for some of the distributions,
> >and maybe also for windows (which I don't use, so I have
> >no idea how difficult that would be).
>
> I don't think that would be hard at all. You can just run python
> setup.py bdist_wininst from within the sandbox/xplt directory and get a
> windows installer.

OK, some in our group have much better knowledge about windows, so I will ask them to test this approach.

> >If something like this was possible, the main question is
> >whether a new enthon distribution with new numpy/scipy/ipython
> >and all the other niceties of mayavi/VTK/wxPython/....
> >will come out in the near future?
>
> I have no idea about that one.
> But it sounds like the guy (Joe) at
> Enthought who did most of the work on the Enthon distribution is no
> longer as available for them, so I'm not sure...

I see - could this somehow be turned into a community effort? Surely it is a non-trivial task - in particular, the monolithic structure of an all-in-one download package seems to make upgrades of individual components difficult. Would a super-installer calling the individual package installers be possible? (Or does anything like a package management system for windows exist?) You see, I don't know anything about Windows, so I had better shut up on this ;-).

Many thanks,

Arnd

From pearu at scipy.org Thu Feb 16 02:32:03 2006
From: pearu at scipy.org (Pearu Peterson)
Date: Thu Feb 16 02:32:03 2006
Subject: [Numpy-discussion] New release of NumPy coming
In-Reply-To: <200602151405.47640.faltet@carabos.com>
References: <43F26137.5040901@ieee.org> <200602151100.37229.faltet@carabos.com> <200602151405.47640.faltet@carabos.com>
Message-ID:

On Wed, 15 Feb 2006, Francesc Altet wrote:
> On Wednesday 15 February 2006 11:18, Pearu Peterson wrote:
>> On Wed, 15 Feb 2006, Francesc Altet wrote:
>>> I've been trying to see how to correctly load the unicode tests, but
>>> failed miserably. Perhaps Pearu can tell us about the correct way to
>>> do that.
>>
>> I have fixed it in svn. When importing modules from tests/ directory, one
>> must surround the corresponding import statements with set_local_path()
>> and restore_path() calls.
>
> Ah, ok. Is there any place where this is explained, or do we have to use
> the source to figure out these sorts of things?

There is a single note about set_local_path in numpy/doc/DISTUTILS.txt. I agree that it would be nice to have Howtos such as

"Howto write scipy-styled setup.py"
"Howto write scipy-styled unit tests"

etc.

Pearu

From faltet at carabos.com Thu Feb 16 02:57:07 2006
From: faltet at carabos.com (Francesc Altet)
Date: Thu Feb 16 02:57:07 2006
Subject: [Numpy-discussion] [ANN] PyTables (A Hierarchical Database) 1.2.2 is out
Message-ID: <200602161156.46810.faltet@carabos.com>

===========================
Announcing PyTables 1.2.2
===========================

This is a maintenance version. Some important improvements and bug fixes have been addressed in it. Go to the PyTables web site for downloading the beast:

http://pytables.sourceforge.net/

or keep reading for more info about the new features and bugs fixed in this version.

Changes more in depth
=====================

Improvements:

- Multidimensional arrays of strings are now supported as node attributes. They just need to be wrapped into ``CharArray`` objects (see the ``numarray.strings`` module, and the sketch below).

- The limit of 512 KB for row sizes in tables has been removed. Now, there is no limit on the row size.

- When table row iterators are used in non-iterator contexts, a warning is now issued recommending that they be used in iterator contexts. Before, when these iterators were used that way, a record read from an arbitrary place in memory was printed, giving a nonsensical record as a result.

- Compression libraries are now dynamically loaded as different extension modules, so there is no longer a need to produce several binary packages supporting different sets of compressors.

Bug fixes:

- Solved a leak that showed up when reading VLArray data. The problem was due to the usage of different heaps (C and Python) of memory. Thanks to Russel Howe for reporting this and providing an initial patch.
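Example: a minimal sketch of the new string-attribute support mentioned above (file and node names are made up, and the exact ``numarray.strings`` spelling should be checked against its documentation):

import tables
import numarray.strings as strings

fileh = tables.openFile("demo.h5", mode="w")
arr = fileh.createArray("/", "arr", [1, 2, 3])
# a 2x2 array of strings, wrapped as a CharArray, stored as an attribute
arr.attrs.labels = strings.array([["ab", "cd"], ["ef", "gh"]])
fileh.close()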
Known issues: - Time datatypes are non-portable between big-endian and little-endian architectures. This is ultimately a consequence of an HDF5 limitation. See SF bug #1234709 for more info. Backward-incompatible changes: - Please, see RELEASE-NOTES.txt file. Important notes for Windows users ================================= If you want to use PyTables with Python 2.4 on Windows platforms, you will need to get the HDF5 library compiled for MSVC 7.1, aka .NET 2003. It can be found at: ftp://ftp.ncsa.uiuc.edu/HDF/HDF5/current/bin/windows/5-165-win-net.ZIP Users of Python 2.3 on Windows will have to download the version of HDF5 compiled with MSVC 6.0, available at: ftp://ftp.ncsa.uiuc.edu/HDF/HDF5/current/bin/windows/5-165-win.ZIP Also, note that support for the UCL compressor has not been added to the binary build of PyTables for Windows because of memory problems (perhaps some bad interaction between UCL and something else). UCL support might eventually be dropped, so, please, refrain from creating datasets compressed with it. What it is ========== **PyTables** is a package for managing hierarchical datasets, designed to efficiently cope with extremely large amounts of data (with support for full 64-bit file addressing). It features an object-oriented interface that, combined with C extensions for the performance-critical parts of the code, makes it a very easy-to-use tool for high performance data storage and retrieval. PyTables runs on top of the HDF5 library and the numarray package (Numeric is also supported and NumPy support is coming along) for achieving maximum throughput and convenient use. Besides, PyTables I/O for table objects is buffered, implemented in C and carefully tuned so that you can reach much better performance with PyTables than with your own home-grown wrappings to the HDF5 library. PyTables sports indexing capabilities as well, allowing selections in tables exceeding one billion rows in just seconds. Platforms ========= This version has been extensively checked on quite a few platforms, like Linux on Intel32 (Pentium), Win on Intel32 (Pentium), Linux on Intel64 (Itanium2), FreeBSD on AMD64 (Opteron), Linux on PowerPC and MacOSX on PowerPC. For other platforms, chances are that the code can be easily compiled and run without further issues. Please, contact us in case you are experiencing problems. Resources ========= Go to the PyTables web site for more details: http://pytables.sourceforge.net/ About the HDF5 library: http://hdf.ncsa.uiuc.edu/HDF5/ About numarray: http://www.stsci.edu/resources/software_hardware/numarray To know more about the company behind the PyTables development, see: http://www.carabos.com/ Acknowledgments =============== Thanks to the various users who provided feature improvements, patches, bug reports, support and suggestions. See the THANKS file in the distribution package for an (incomplete) list of contributors. Many thanks also to SourceForge, which has helped to make and distribute this package! And last but not least, a big thank you to THG (http://www.hdfgroup.org/) for sponsoring many of the new features recently introduced in PyTables. Share your experience ===================== Let us know of any bugs, suggestions, gripes, kudos, etc. you may have. ---- **Enjoy data!** -- The PyTables Team -- >0,0< Francesc Altet http://www.carabos.com/ V V Cárabos Coop. V.
Enjoy Data "-" From arnd.baecker at web.de Thu Feb 16 04:39:01 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Thu Feb 16 04:39:01 2006 Subject: [Numpy-discussion] numpy/scipy transition In-Reply-To: <43F3881E.6060101@noaa.gov> References: <43F3881E.6060101@noaa.gov> Message-ID: Hi Chris, On Wed, 15 Feb 2006, Christopher Barker wrote: > Arnd Baecker wrote: > > - we need scipy.xplt > > (matplotlib is still no option at this point) > > Why not? just curious. That's a slightly longer story, but since you asked ;-): First, I should emphasize that I really think that matplotlib is extremely good - the quality of the plots is superb! It is also used a lot in our group for research. However, in our opinion we cannot use matplotlib for our course, for two main reasons: (a) some (for us crucial) bugs, and (b) speed. Concerning (a), the most crucial problem is the double-buffering problem. This did not exist with matplotlib 0.82 and has been reported several times, the last report being http://sourceforge.net/mailarchive/forum.php?thread_id=9559204&forum_id=33405 The presently suggested work-around is to use TkAgg as backend. However, the TkAgg backend is slower than any other backend. We cannot tell our students to use TkAgg for one problem and switch to WXAgg for another problem - they already struggle enough with learning python (there are several first-time programmers as well) and the hard physics problems we give them ;-). To us this double-buffering problem is show-stopper number one. Unfortunately, I don't understand the internals of matplotlib well enough to help with tracking this one down. There are a couple of further problems which we reported, but which have fallen through the cracks - no complaint, that's how things are. But it is (from our point of view) not worth talking about them again as long as the double-buffering problem is still there. On the speed side (b): we have been using scipy.xplt, and even that (though generally considered to be really fast) is not as fast as, for example, pgplot ;-). In addition, many of our students run older machines starting from PIIIs (I think the PIIs are gone by now, but two years ago quite a few still used them). So this is something to be kept in mind when talking about speed. We hired a student to do the conversion of our exercises from scipy.xplt to matplotlib and look into some of the speed issues. With John Hunter's help this got pretty far, http://sourceforge.net/mailarchive/forum.php?thread_id=8153459&forum_id=33405 http://sourceforge.net/mailarchive/forum.php?thread_id=8185639&forum_id=33405 http://sourceforge.net/mailarchive/forum.php?thread_id=8243168&forum_id=33405 http://sourceforge.net/mailarchive/forum.php?thread_id=8346924&forum_id=33405 http://sourceforge.net/mailarchive/forum.php?thread_id=8498518&forum_id=33405 http://sourceforge.net/mailarchive/forum.php?thread_id=8728580&forum_id=33405 I think that there was no further message after this, and the whole approach has not yet been incorporated into MPL. My impression was that it was very close to a good solution, and I would be willing to take up this issue again if there is a chance that it gets integrated into MPL. So, presumably, many of the speed issues could be resolved. The price to be paid is, in some cases, a factor of two more lines of code for the plotting (compared to scipy.xplt). By using a bit more encapsulation, this could surely be overcome.
Ok, I hope I could roughly explain why we think that we cannot yet use Matplotlib - it is really almost there, so I remain very optimistic that at least next year we will be using it as the default plotting environment. Best, Arnd From oliphant.travis at ieee.org Thu Feb 16 06:21:15 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Feb 16 06:21:15 2006 Subject: [Numpy-discussion] cross product of two 3xn arrays In-Reply-To: <43F43F02.1090001@bigpond.net.au> References: <764834db0602151720r77370d7bx327e2a6dca954dbe@mail.gmail.com> <43F3EE4E.6030301@ieee.org> <43F43F02.1090001@bigpond.net.au> Message-ID: <43F48A2D.80705@ieee.org> Gary Ruben wrote: > I expect to get from this > > In [21]: a=array([[[1],[2],[3]],[[4],[5],[6]],[[7],[8],[9]]]) > In [22]: b=array([[[1],[2],[4]],[[4],[5],[7]],[[7],[8],[10]]]) > > the solution > > Out[24]: > array([[[ 2], > [-1], > [ 0]], > > [[ 5], > [-4], > [ 0]], > > [[ 8], > [-7], > [ 0]]]) Why do you expect to get this solution with axis=0? Remember axis=0 thinks the vectors are formed in the 0th dimension. Thus a[:,0,0] and a[:,1,0] and a[:,2,0] are the vectors you are using. You appear to be thinking of the vectors in the axis=1 dimension where the vectors would be a[0,:,0], a[1,:,0], a[2,:,0] But this is specified with axis=1 (there is a single axis argument available now in SVN which means axisa=axisb=axisc=axis) Thus, cross(a,b,axis=1) Gives the solution I think you are after. -Travis From travis at enthought.com Thu Feb 16 11:18:05 2006 From: travis at enthought.com (Travis N. Vaught) Date: Thu Feb 16 11:18:05 2006 Subject: [Numpy-discussion] ANN: Python Enthought Edition Version 0.9.2 Released Message-ID: <43F4CFB4.1080305@enthought.com> Enthought is pleased to announce the release of Python Enthought Edition Version 0.9.2 (http://code.enthought.com/enthon/) -- a python distribution for Windows. This is a kitchen-sink-included Python distribution including the following packages/tools out of the box: Numeric 24.2 SciPy 0.3.3 IPython 0.6.15 Enthought Tool Suite 1.0.2 wxPython 2.6.1.0 PIL 1.1.4 mingw 20030504-1 f2py 2.45.241_1926 MayaVi 1.5 Scientific Python 2.4.5 VTK 4.4 and many more... 0.9.2 Release Notes Summary --------------------------- Version 0.9.2 of Python Enthought Edition is the first to include the Enthought Tool Suite Package (http://code.enthought.com/ets/). Other changes include upgrading to Numeric 24.2, including MayaVi 1.5 (rather than 1.3) and removing a standalone PyCrust package in favor of the one included with wxPython. Also, elementtree and celementtree have been added to the distribution. Notably, this release is still based on Python 2.3.5 and still includes SciPy 0.3.3. You'll also notice that we have changed the version numbering to a major.minor.point format (from a build number format). see full release notes at: http://code.enthought.com/release/changelog-enthon0.9.2.shtml Best, Travis N. Vaught From stefan at sun.ac.za Thu Feb 16 11:27:05 2006 From: stefan at sun.ac.za (Stefan van der Walt) Date: Thu Feb 16 11:27:05 2006 Subject: [Numpy-discussion] storage for records Message-ID: <20060216192556.GA20396@alpha> Is there any way to control the underlying storage for a record? I am trying to use Travis' earlier example of an image with named fields: dt = N.dtype(' References: <20060216192556.GA20396@alpha> Message-ID: <43F4E947.9070409@ieee.org> Stefan van der Walt wrote: >Is there any way to control the underlying storage for a record? 
> >I am trying to use Travis' earlier example of an image with named fields: > >dt = N.dtype('img = N.array(N.empty((rows,columns)), dtype=dt) > >Using this, I can access the different bands of the image using > >img['r'], img['g'], img['b'] (but not img.r as mentioned in some of >the posts). > > Attribute lookup (img.r) is the purpose of the record array subclass. rimg = img.view(numpy.recarray) rimg.r --- will now work. >'img' itself is a matrix of similar dimension as img['r'], but >contains the combined items of type '<f12'. > Beware that the 'However, I would like to store the image as a 3xMxN array, with the r, >g and b bands being contained in > >img[0], img[1] and img[2] > > You don't need a record array to do that. Just define your array as a 3xMxN array of floats. But, you could just re-define the data-type as img2 = img.view(('f4',3)) -- if img is MxN then img2 is MxNx3. or use rimg.field(0) --- field was recently added to record-arrays. >Is there a way to construct the record so that this structure is used >for storage? Further, how do I specify the dtype above, i.e. > >N.dtype(' >in the style > >N.dtype({'names' : ['r','g','b'], 'formats': ['f4','f4','f4']}) > > > >(how do I specify that the combined type is 'f12')? > > Use a tuple: N.dtype((('f4',3), {'names' : ['r','g','b'], 'formats': ['f4','f4','f4']})). I was looking at the code implementing array_new in arrayobject.c and for a while I could not convince myself that it handles ref. counts correctly. The cleanup code (at the "fail:" label) contains Py_XDECREF(descr), meaning that descr is unreferenced on failure unless it is NULL. This makes sense because descr is created inside array_new by PyArray_DescrConverter, but if the failure is detected in PyArg_ParseTupleAndKeywords, descr may be NULL. What was puzzling to me was that failures of PyArray_NewFromDescr are handled by "if (ret == NULL) {descr=NULL;goto fail;}", which sets descr to NULL before jumping to cleanup. As I investigated further, I've discovered the following helpful comment preceding PyArray_NewFromDescr: /* steals a reference to descr (even on failure) */ which explains why descr=NULL is necessary. I wonder what was the motivation for this design choice. I don't think this is a natural behavior for python C-API functions. I am not proposing to make any changes, just curious about the design. From oliphant.travis at ieee.org Thu Feb 16 13:51:05 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Feb 16 13:51:05 2006 Subject: [Numpy-discussion] Reference counting question In-Reply-To: References: Message-ID: <43F4F3A1.7000400@ieee.org> Sasha wrote: >I was looking at the code implementing array_new in arrayobject.c and >for a while I could not convince myself that it handles ref. counts >correctly. The cleanup code (at the "fail:" label) contains >Py_XDECREF(descr), meaning that descr is unreferenced on failure >unless it is NULL. This makes sense because descr is created inside >array_new by PyArray_DescrConverter, but if the failure is detected >in PyArg_ParseTupleAndKeywords, descr may be NULL. What was >puzzling to me was that failures of PyArray_NewFromDescr are handled by "if >(ret == NULL) {descr=NULL;goto fail;}", which sets descr to NULL before >jumping to cleanup. As I investigated further, I've discovered the >following helpful comment preceding PyArray_NewFromDescr: /* steals a >reference to descr (even on failure) */ which explains why descr=NULL >is necessary. > >I wonder what was the motivation for this design choice. I don't >think this is a natural behavior for python C-API functions.
I am not >proposing to make any changes, just curious about the design. > > The PyArray_Descr structure never used to be a Python object. Now it is. There is the C-API function PyArray_DescrFromType that used to just return a C-structure but now returns a reference-counted Python object. People are not used to reference counting the PyArray_Descr objects. The easiest way to make this work in my mind was to have the functions that use the Descr object steal a reference, because ultimately the Descr object's purpose is to reside in an array. It is created for the purpose of being a member of an array structure, which therefore steals its reference. As an example, with this design you can write (and there are macros that do) PyArray_NewFromDescr(...., PyArray_DescrFromType(type_num), ....) and not create reference-count leaks. -Travis From stefan at sun.ac.za Thu Feb 16 13:55:05 2006 From: stefan at sun.ac.za (Stefan van der Walt) Date: Thu Feb 16 13:55:05 2006 Subject: [Numpy-discussion] storage for records In-Reply-To: <43F4E947.9070409@ieee.org> References: <20060216192556.GA20396@alpha> <43F4E947.9070409@ieee.org> Message-ID: <20060216215403.GH20396@alpha> On Thu, Feb 16, 2006 at 02:06:15PM -0700, Travis Oliphant wrote: > Stefan van der Walt wrote: > > >Is there any way to control the underlying storage for a record? > > > >I am trying to use Travis' earlier example of an image with named fields: > > > >dt = N.dtype(' >img = N.array(N.empty((rows,columns)), dtype=dt) > > > >Using this, I can access the different bands of the image using > > > >img['r'], img['g'], img['b'] (but not img.r as mentioned in some of > >the posts). > > > > > Attribute lookup (img.r) is the purpose of the record array subclass. > > rimg = img.view(numpy.recarray) > > rimg.r --- will now work. Thanks for the quick response! This is very useful information. Regards Stéfan From gruben at bigpond.net.au Thu Feb 16 15:01:04 2006 From: gruben at bigpond.net.au (Gary Ruben) Date: Thu Feb 16 15:01:04 2006 Subject: [Numpy-discussion] cross product of two 3xn arrays In-Reply-To: <43F48A2D.80705@ieee.org> References: <764834db0602151720r77370d7bx327e2a6dca954dbe@mail.gmail.com> <43F3EE4E.6030301@ieee.org> <43F43F02.1090001@bigpond.net.au> <43F48A2D.80705@ieee.org> Message-ID: <43F50402.2000009@bigpond.net.au> Thanks Travis, I think this is what Ian was asking for (axis=1 rather than axis=0). I was confused by your previous reply in this thread, which I blindly followed without thinking about it. >(there is a single axis argument available now in SVN which means > axisa=axisb=axisc=axis) Nice addition. Gary R. From oliphant.travis at ieee.org Thu Feb 16 16:11:08 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Feb 16 16:11:08 2006 Subject: [Numpy-discussion] Release of NumPy 0.9.5 Message-ID: <43F5147A.3020202@ieee.org> I'm pleased to announce the release of NumPy 0.9.5. The release notes and download site can be found at http://www.scipy.org Best regards, -Travis Oliphant From bblais at bryant.edu Thu Feb 16 18:41:01 2006 From: bblais at bryant.edu (Brian Blais) Date: Thu Feb 16 18:41:01 2006 Subject: [Numpy-discussion] calculating matrix values at particular indices? Message-ID: <43F53719.50907@bryant.edu> Hello, In my attempt to learn python, migrating from matlab, I have the following problem.
Here is what I want to do, (with the wrong syntax): from numpy import * t=arange(0,20,.1) x=zeros(len(t),'f') idx=where(t>5) tau=5 x[idx]=exp(-t[idx]/tau) # <---this line is wrong (gives a TypeError) #------------------ what is the best way to replace the wrong line with something that works: replace all of the values of x at the indices idx with exp(-t/tau) for values of t at indices idx? I do this all the time in matlab scripts, but I don't know what the pythonic preferred method is. thanks, bb -- ----------------- bblais at bryant.edu http://web.bryant.edu/~bblais From bblais at bryant.edu Thu Feb 16 18:47:03 2006 From: bblais at bryant.edu (Brian Blais) Date: Thu Feb 16 18:47:03 2006 Subject: [Numpy-discussion] Re: calculating on matrix indices In-Reply-To: References: Message-ID: <43F5388A.7010905@bryant.edu> Colin J. Williams wrote: > Brian Blais wrote: >> In my attempt to learn python, migrating from matlab, I have the >> following problem. Here is what I want to do, (with the wrong syntax): >> >> from numpy import * >> >> t=arange(0,20,.1) >> x=zeros(len(t),'f') >> >> idx=(t>5) # <---this produces a Boolean array, probably not what you want. >> tau=5 >> x[idx]=exp(-t[idx]/tau) # <---this line is wrong (gives a TypeError) >> > What are you trying to do? It is most unlikely that you need Boolean > values in x[idx] > in this example, as in many that I would do in matlab, I want to replace part of a vector with values from another vector. In this case, I want x to be zero from t=0 to 5, and then have a value of exp(-t/tau) for t>5. I could do it with an explicit for-loop, but that would be both inefficient and unpython-like. For those who know matlab, what I am doing here is: t=0:.1:20; idx=find(t>5); tau=5; x=zeros(size(t)); x(idx)=exp(-t(idx)/tau) is that clearer? I am sure there is a nice method to do this in python, but I haven't found it in the python or numpy docs. thanks, bb -- ----------------- bblais at bryant.edu http://web.bryant.edu/~bblais From oliphant.travis at ieee.org Thu Feb 16 19:02:02 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Feb 16 19:02:02 2006 Subject: [Numpy-discussion] Re: calculating on matrix indices In-Reply-To: <43F5388A.7010905@bryant.edu> References: <43F5388A.7010905@bryant.edu> Message-ID: <43F53C87.6040903@ieee.org> Brian Blais wrote: > Colin J. Williams wrote: > >> Brian Blais wrote: >> >>> In my attempt to learn python, migrating from matlab, I have the >>> following problem. Here is what I want to do, (with the wrong syntax): >>> >>> from numpy import * >>> >>> t=arange(0,20,.1) >>> x=zeros(len(t),'f') >>> >>> idx=(t>5) # <---this produces a Boolean array, >>> probably not what you want. >>> tau=5 >>> x[idx]=exp(-t[idx]/tau) # <---this line is wrong (gives a TypeError) >>> >> What are you trying to do? It is most unlikely that you need Boolean >> values in x[idx] >> > > in this example, as in many that I would do in matlab, I want to > replace part of a vector with values from another vector. In this > case, I want x to be zero from t=0 to 5, and then have a value of > exp(-t/tau) for t>5. I could do it with an explicit for-loop, but > that would be both inefficient and unpython-like. For those who know > matlab, what I am doing here is: > from numpy import * t = r_[0:20:0.1] idx = t>5 tau = 5 x = zeros_like(t) x[idx] = exp(-t[idx]/tau) Should do it.
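For reference, here is the whole working pattern as a self-contained script (everything stays float64, NumPy's default for arange output, so the masked assignment casts safely):

from numpy import arange, zeros_like, exp

t = arange(0, 20, 0.1)       # float64 by default
tau = 5.0
x = zeros_like(t)            # same dtype as t
idx = t > 5                  # boolean mask
x[idx] = exp(-t[idx] / tau)  # fill only the masked entries; x stays 0 elsewhere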
-Travis From ryanlists at gmail.com Thu Feb 16 19:16:06 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Thu Feb 16 19:16:06 2006 Subject: [Numpy-discussion] Re: calculating on matrix indices In-Reply-To: <43F53C87.6040903@ieee.org> References: <43F5388A.7010905@bryant.edu> <43F53C87.6040903@ieee.org> Message-ID: Brian's example works fine for me as long as x=zeros(len(t),'d') for some reason the error seems to come from assignment of a double array to a single precision array: In [318]: x=zeros(len(t),'f') In [319]: t=arange(0,20,.1) In [320]: idx=where(t>5) In [321]: x[idx]=exp(-t[idx]/tau) --------------------------------------------------------------------------- exceptions.TypeError Traceback (most recent call last) /home/ryan/thesis/accuracy/ TypeError: array cannot be safely cast to required type In [322]: x=zeros(len(t),'d') In [323]: x[idx]=exp(-t[idx]/tau) In [324]: Ryan On 2/16/06, Travis Oliphant wrote: > Brian Blais wrote: > > > Colin J. Williams wrote: > > > >> Brian Blais wrote: > >> > >>> In my attempt to learn python, migrating from matlab, I have the > >>> following problem. Here is what I want to do, (with the wrong syntax): > >>> > >>> from numpy import * > >>> > >>> t=arange(0,20,.1) > >>> x=zeros(len(t),'f') > >>> > >>> idx=(t>5) # <---this produces a Boolean array, > >>> probably not what you want. > >>> tau=5 > >>> x[idx]=exp(-t[idx]/tau) # <---this line is wrong (gives a TypeError) > >>> > >> What are you trying to do? It is most unlikely that you need Boolean > >> values in x[idx] > >> > > > > in this example, as in many that I would do in matlab, I want to > > replace part of a vector with values from another vector. In this > > case, I want x to be zero from t=0 to 5, and then have a value of > > exp(-t/tau) for t>5. I could do it with an explicit for-loop, but > > that would be both inefficient and unpython-like. For those who know > > matlab, what I am doing here is: > > > from numpy import * > > > t = r_[0:20:0.1] > idx = t>5 > tau = 5 > x = zeros_like(t) > x[idx] = exp(-t[idx]/tau) > > > Should do it. > > -Travis > > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through log files > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From ndarray at mac.com Thu Feb 16 19:16:07 2006 From: ndarray at mac.com (Sasha) Date: Thu Feb 16 19:16:07 2006 Subject: [Numpy-discussion] calculating matrix values at particular indices? In-Reply-To: <43F53719.50907@bryant.edu> References: <43F53719.50907@bryant.edu> Message-ID: You should make t and x the same type: either add dtype='f' to arange or change dtype='f' to dtype='d' in zeros. On 2/16/06, Brian Blais wrote: > Hello, > > In my attempt to learn python, migrating from matlab, I have the following problem. 
> Here is what I want to do, (with the wrong syntax): > > from numpy import * > > t=arange(0,20,.1) > x=zeros(len(t),'f') > > idx=where(t>5) > tau=5 > x[idx]=exp(-t[idx]/tau) # <---this line is wrong (gives a TypeError) > > #------------------ > > what is the best way to replace the wrong line with something that works: replace all > of the values of x at the indices idx with exp(-t/tau) for values of t at indices idx? > > I do this all the time in matlab scripts, but I don't know that the pythonic > preferred method is. > > > > thanks, > > bb > > > -- > ----------------- > > bblais at bryant.edu > http://web.bryant.edu/~bblais > > > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through log files > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From harrison.ian at gmail.com Thu Feb 16 19:21:03 2006 From: harrison.ian at gmail.com (Ian Harrison) Date: Thu Feb 16 19:21:03 2006 Subject: [Numpy-discussion] cross product of two 3xn arrays In-Reply-To: <43F3EE4E.6030301@ieee.org> References: <764834db0602151720r77370d7bx327e2a6dca954dbe@mail.gmail.com> <43F3EE4E.6030301@ieee.org> Message-ID: <764834db0602161920r59568392m611c7abd9a895900@mail.gmail.com> On 2/15/06, Travis Oliphant wrote: > Ian Harrison wrote: > > >Hello, > > > >I have two groups of 3x1 arrays that are arranged into two larger 3xn > >arrays. Each of the 3x1 sub-arrays represents a vector in 3D space. In > >Matlab, I'd use the function cross() to calculate the cross product of > >the corresponding 'vectors' from each array. In other words: > > > > > > > Help on function cross in module numpy.core.numeric: > > cross(a, b, axisa=-1, axisb=-1, axisc=-1) > Return the cross product of two (arrays of) vectors. > > The cross product is performed over the last axis of a and b by default, > and can handle axes with dimensions 2 and 3. For a dimension of 2, > the z-component of the equivalent three-dimensional cross product is > returned. > > It's the axisa, axisb, and axisc that you are interested in. > > The default is to assume you have Nx3 arrays and return an Nx3 array. > But you can change the axis used to find vectors. > > cross(A,B,axisa=0,axisb=0,axisc=0) > > will do what you want. I suppose, a single axis= argument might be > useful as well for the common situation of having all the other axis > arguments be the same. > > -Travis Travis, Thanks for your patience. This is what I was looking for. Ian From wbaxter at gmail.com Thu Feb 16 19:36:02 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Thu Feb 16 19:36:02 2006 Subject: [Numpy-discussion] Re: calculating on matrix indices In-Reply-To: <43F5388A.7010905@bryant.edu> References: <43F5388A.7010905@bryant.edu> Message-ID: Howdy, On 2/17/06, Brian Blais wrote: > > Colin J. Williams wrote: > > Brian Blais wrote: > >> In my attempt to learn python, migrating from matlab, I have the > >> following problem. Here is what I want to do, (with the wrong syntax): > >> > >> from numpy import * > >> > >> t=arange(0,20,.1) > >> x=zeros(len(t),'f') This was the line causing the type error. t is type double (float64). 'f' makes x be type float32. 
That causes the assignment below to fail. Replacing that line with x=zeros(len(t),'d') should work. Or the zeros_like() that Travis suggested. >> > >> idx=(t>5) # <---this produces a Boolean array, probably > not what you want. > >> tau=5 > >> x[idx]=exp(-t[idx]/tau) # <---this line is wrong (gives a TypeError) > >> You could also use idx=where(t>5) in place of idx=(t>5). Although in this case it probably doesn't make much difference, where(expr) is more directly equivalent to matlab's find(expr). See http://www.scipy.org/Wiki/NumPy_for_Matlab_Users for more Matlab equivalents. And consider contributing your own, if you have some good ones that aren't there already. --bb From robert.kern at gmail.com Thu Feb 16 21:33:02 2006 From: robert.kern at gmail.com (Robert Kern) Date: Thu Feb 16 21:33:02 2006 Subject: [Numpy-discussion] OT: Apologies for my (relative) silence Message-ID: <43F56002.6050408@gmail.com> I just realized today that for some reason I haven't actually been subscribed to this list since the end of September. Apparently, I've only been getting mails addressed to numpy-discussion if they were CC'ed to one of the scipy lists or to me personally. This was just enough traffic to fool me into thinking I was still subscribed. I wondered why you guys were so quiet. To the people whom I redirected here from comp.lang.python and then (seemingly) ignored, I'm sorry! -- Robert Kern robert.kern at gmail.com "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From charlesr.harris at gmail.com Thu Feb 16 21:41:05 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu Feb 16 21:41:05 2006 Subject: [Numpy-discussion] cross product of two 3xn arrays In-Reply-To: <43F48A2D.80705@ieee.org> References: <764834db0602151720r77370d7bx327e2a6dca954dbe@mail.gmail.com> <43F3EE4E.6030301@ieee.org> <43F43F02.1090001@bigpond.net.au> <43F48A2D.80705@ieee.org> Message-ID: Would anyone be interested in a quaternion version of this for nx4 arrays with nx3 as a special case where the scalar part == 0? Looking at the cross product implementation, it shouldn't be too hard to duplicate this for quaternions. What should such a product be called? Something like qprod? Chuck From charlesr.harris at gmail.com Thu Feb 16 21:45:04 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu Feb 16 21:45:04 2006 Subject: [Numpy-discussion] Creating arrays with fromfile Message-ID: Hi Travis, I notice that the fromfile function in NumPy no longer accepts the shape keyword that the numarray version has. The functionality can be duplicated by reshaping the array after creating it, but I think the shape keyword is a bit more convenient for that. Thoughts? Chuck From wbaxter at gmail.com Thu Feb 16 21:59:02 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Thu Feb 16 21:59:02 2006 Subject: [Numpy-discussion] cross product of two 3xn arrays In-Reply-To: References: <764834db0602151720r77370d7bx327e2a6dca954dbe@mail.gmail.com> <43F3EE4E.6030301@ieee.org> <43F43F02.1090001@bigpond.net.au> <43F48A2D.80705@ieee.org> Message-ID: Quaternions using which convention? [s,x,y,z] or [x,y,z,w]? The docstring should make it very clear. Perhaps support a flag for choosing which, unless there's some python-wide standard for quaternions that I'm not aware of.
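For concreteness, a rough sketch of what such a vectorized product could look like for Nx4 arrays, assuming the scalar-last [x,y,z,w] layout (qmult is a hypothetical name for illustration, not an existing NumPy function, and the layout is just one of the two conventions under discussion):

import numpy as N

def qmult(q, p):
    # Hamilton product of two Nx4 float arrays of quaternions, scalar last.
    # Hypothetical sketch only; q and p are assumed to be 2-D, shape (N, 4).
    x1, y1, z1, w1 = q[:, 0], q[:, 1], q[:, 2], q[:, 3]
    x2, y2, z2, w2 = p[:, 0], p[:, 1], p[:, 2], p[:, 3]
    out = N.empty(q.shape, q.dtype)
    out[:, 0] = w1*x2 + w2*x1 + y1*z2 - z1*y2   # vector part:
    out[:, 1] = w1*y2 + w2*y1 + z1*x2 - x1*z2   #   w1*v2 + w2*v1 + v1 x v2
    out[:, 2] = w1*z2 + w2*z1 + x1*y2 - y1*x2
    out[:, 3] = w1*w2 - x1*x2 - y1*y2 - z1*z2   # scalar part: w1*w2 - v1.v2
    return out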
--Bill On 2/17/06, Charles R Harris wrote: > > Would anyone be interested in a quaternion version of this for nx4 > arrays with nx3 as a special case where the scalar part == 0? Looking > at the the cross product implementation, it shouldn't be to hard to > duplicate this for quaternions. What should such a product be called? > Something like qprod? > > Chuck > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Feb 16 22:21:15 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu Feb 16 22:21:15 2006 Subject: [Numpy-discussion] cross product of two 3xn arrays In-Reply-To: References: <764834db0602151720r77370d7bx327e2a6dca954dbe@mail.gmail.com> <43F3EE4E.6030301@ieee.org> <43F43F02.1090001@bigpond.net.au> <43F48A2D.80705@ieee.org> Message-ID: Bill, On 2/16/06, Bill Baxter wrote: > Quaternions using which convention? [s,x,y,z] or [x,y,z,w]? > The docstring should make it very clear. Perhaps support a flag for > choosing which, unless there's some python-wide standard for quaternions > that I'm not aware of. > > --Bill > > > On 2/17/06, Charles R Harris wrote: > > Would anyone be interested in a quaternion version of this for nx4 > > arrays with nx3 as a special case where the scalar part == 0? Looking > > at the the cross product implementation, it shouldn't be to hard to > > duplicate this for quaternions. What should such a product be called? > > Something like qprod? > > > > Chuck > > > > I like to put the scalar last, but I am open to putting it first if anyone has strong feelings about it. As far as I know, there is no scipy convention on this. Hmm, maybe a flag would be useful just because folks are likely to have files sitting around full of quaternions using both conventions. Maybe one more scalar type to add to the NumPy mix? I must admit that dtype=Quaternion512 seems a bit much. Anyway, I am open to suggestions. From oliphant.travis at ieee.org Thu Feb 16 22:51:02 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Feb 16 22:51:02 2006 Subject: [Numpy-discussion] Creating arrays with fromfile In-Reply-To: References: Message-ID: <43F5721D.3030801@ieee.org> Charles R Harris wrote: >Hi Travis, > >I notice that the fromfile function in NumPy no longer accepts the >shape keyword that the numarray version has. The functionalitiy can be >duplicated by reshaping the array after creating it, but I think the >shape keyword is a bit more convenient for that. Thoughts? > > > It was just that much more effort to implement correctly in C and since it can be easily done using fromfile(....).reshape(dim1,dim2,dim3,...) I didn't think it critical. Perhaps numarray compatibility functions should be placed in a numcompat module. -Travis From cookedm at physics.mcmaster.ca Thu Feb 16 23:31:03 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Thu Feb 16 23:31:03 2006 Subject: [Numpy-discussion] optimizing power() for complex and real cases In-Reply-To: <43F36CB6.5050004@cox.net> (Tim Hochberg's message of "Wed, 15 Feb 2006 11:02:30 -0700") References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> <43F13B20.3000301@cox.net> <43F13DE0.8040309@bigpond.net.au> <43F167C0.7040806@cox.net> <43F20BFE.5030100@cox.net> <43F36CB6.5050004@cox.net> Message-ID: Tim Hochberg writes: > David M. Cooke wrote: > >>[1] which brings up another point. Would using the 3-multiplication >>version for complex multiplication be good? 
There might be some >>effects with cancellation errors due to the extra subtractions... >> >> > I'm inclined to leave this be for now. Both because I'm unsure of the > rounding issues and because I'm not sure it would actually be faster. > It has one less multiplication, but several more additions, so it > would depend on the relative speed of add/sub versus multiplication and how > things end up getting scheduled in the FP pipeline. At some point it's > probably worth trying; if it turns out to be significantly faster we > can think about rounding then. If it's not faster then no need to > think. I did some thinking, and looked up how to analyse it. 3M goes like this: xy = (a+bi)(c+di) = (ac-bd) + ((a+b)(c+d)-ac-bd)i Consider x = y = t + i/t, for which x**2 = (t**2-1/t**2) + 2i, then xy = x**2 = t*t - 1/t*1/t + ((t+1/t)(t+1/t) - t**2 - 1/t**2)i Consider when t is large enough that (t+1/t)**2 = t**2 in floating point; then Im fl(xy) will be -1/t**2, instead of 2. So...let's leave it as is. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From wbaxter at gmail.com Thu Feb 16 23:42:03 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Thu Feb 16 23:42:03 2006 Subject: [Numpy-discussion] cross product of two 3xn arrays In-Reply-To: References: <764834db0602151720r77370d7bx327e2a6dca954dbe@mail.gmail.com> <43F3EE4E.6030301@ieee.org> <43F43F02.1090001@bigpond.net.au> <43F48A2D.80705@ieee.org> Message-ID: For folks using quats to represent rotations (which is all I use them for, anyway), if you're batch transforming a bunch of vectors by one quaternion, it's a lot more efficient to convert the quat to a 3x3 matrix first and transform using matrix multiply (9 mults per transform that way vs 21 or so depending on the implementation of q*v*q^-1). Given that, I can't see many situations when I'd need a super speedy C version of quaternion multiply. --Bill
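As a concrete sketch of the conversion Bill describes (assuming a unit quaternion stored scalar-last as [x, y, z, w]; quat_to_matrix is a hypothetical helper, not a NumPy function):

import numpy as N

def quat_to_matrix(q):
    # 3x3 rotation matrix equivalent to the unit quaternion [x, y, z, w].
    x, y, z, w = q
    return N.array([[1 - 2*(y*y + z*z), 2*(x*y - z*w),     2*(x*z + y*w)],
                    [2*(x*y + z*w),     1 - 2*(x*x + z*z), 2*(y*z - x*w)],
                    [2*(x*z - y*w),     2*(y*z + x*w),     1 - 2*(x*x + y*y)]])

# Batch-rotating an Nx3 array of row vectors is then one matrix multiply:
# rotated = N.dot(vecs, N.transpose(quat_to_matrix(q)))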
URL: From wbaxter at gmail.com Fri Feb 17 00:16:01 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Fri Feb 17 00:16:01 2006 Subject: [Numpy-discussion] Numpy 0.9.5 + pylab 0.86.2 = python.exe crash Message-ID: After upgrading to numpy 0.9.5 (numpy-0.9.5.win32-py2.4.exe) running the following script causes a crash of python.exe: -------------test.py----------------- import pylab ---------------------------------------- In my .matplotlibrc file I have the following: ------------- backend : WXAgg numerix : numpy ------------- Reinstalling numpy 0.9.4 fixed the problem. Matplotlib version is 0.86.2-win32-py2.4 I also tried reinstalling matplotlib, but that didn't help. --Bill Baxter -------------- next part -------------- An HTML attachment was scrubbed... URL: From cookedm at physics.mcmaster.ca Fri Feb 17 00:18:19 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Fri Feb 17 00:18:19 2006 Subject: [Numpy-discussion] Numpy 0.9.5 + pylab 0.86.2 = python.exe crash In-Reply-To: (Bill Baxter's message of "Fri, 17 Feb 2006 17:15:45 +0900") References: Message-ID: Bill Baxter writes: > After upgrading to numpy 0.9.5 (numpy-0.9.5.win32-py2.4.exe) running the > following script causes a crash of python.exe: > > -------------test.py----------------- > import pylab > ---------------------------------------- > > > In my .matplotlibrc file I have the following: > ------------- > backend : WXAgg > numerix : numpy > ------------- > > > Reinstalling numpy 0.9.4 fixed the problem. > > Matplotlib version is 0.86.2-win32-py2.4 > I also tried reinstalling matplotlib, but that didn't help. You'll have to recompile matplotlib against the newer numpy. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From wbaxter at gmail.com Fri Feb 17 00:31:10 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Fri Feb 17 00:31:10 2006 Subject: [Numpy-discussion] Numpy 0.9.5 + pylab 0.86.2 = python.exe crash In-Reply-To: References: Message-ID: Ew. Ok. No thanks. :-) I'll just stick with numpy 0.9.4 for now. I appreciate the speedy response. --bb On 2/17/06, David M. Cooke wrote: > > Bill Baxter writes: > > > After upgrading to numpy 0.9.5 (numpy-0.9.5.win32-py2.4.exe) running the > > following script causes a crash of python.exe: > > > > -------------test.py----------------- > > import pylab > > ---------------------------------------- > > > > > > In my .matplotlibrc file I have the following: > > ------------- > > backend : WXAgg > > numerix : numpy > > ------------- > > > > > > Reinstalling numpy 0.9.4 fixed the problem. > > > > Matplotlib version is 0.86.2-win32-py2.4 > > I also tried reinstalling matplotlib, but that didn't help. > > You'll have to recompile matplotlib against the newer numpy. > > -- > |>|\/|< > > /--------------------------------------------------------------------------\ > |David M. Cooke > http://arbutus.physics.mcmaster.ca/dmc/ > |cookedm at physics.mcmaster.ca > -- William V. Baxter III OLM Digital Kono Dens Building Rm 302 1-8-8 Wakabayashi Setagaya-ku Tokyo, Japan 154-0023 +81 (3) 3422-3380 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From oliphant.travis at ieee.org Fri Feb 17 01:35:05 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Fri Feb 17 01:35:05 2006 Subject: [Numpy-discussion] Re: [matplotlib-devel] Numpy 0.9.5 + pylab 0.86.2 = python.exe crash In-Reply-To: References: Message-ID: <43F5988A.70001@ieee.org> Bill Baxter wrote: > After upgrading to numpy 0.9.5 (numpy-0.9.5.win32-py2.4.exe) running > the following script causes a crash of python.exe: > > -------------test.py----------------- > import pylab > ---------------------------------------- > > > In my .matplotlibrc file I have the following: > ------------- > backend : WXAgg > numerix : numpy > ------------- > > > Reinstalling numpy 0.9.4 fixed the problem. > > Matplotlib version is 0.86.2-win32-py2.4 > I also tried reinstalling matplotlib, but that didn't help. > You have to re-compile the matplotlib extension. There are warnings present now so that hopefully in the future such needs will be comunicated better. -Travis From josegomez at gmx.net Fri Feb 17 01:52:04 2006 From: josegomez at gmx.net (Jose Gomez-Dans) Date: Fri Feb 17 01:52:04 2006 Subject: [Numpy-discussion] Problems compiling on Cygwin Message-ID: Hi! Yesterday I posted on the scipy mailing list that I could not compile NumPy on Cygwin. I would like to provide some more information on what the problems are, as I would really like to be able to use it on Cygwin. I got the 0.9.5 tarball, and uncompress it, and type python setup.py build. The process starts, there is an indication that it finds BLAS and LAPACK (cygwin versions). It stops when linking the umath.dll, complaining about missing references. Here's an extract: "gcc options: '-fno-strict-aliasing -DNDEBUG -g -O3 -Wall -Wstrict-prototypes' compile options: '-Ibuild/src/numpy/core/src -Inumpy/core/include -Ibuild/src/numpy/core -Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.4 -c' gcc -shared -Wl,--enable-auto-image-base build/temp.cygwin-1.5.19-i686-2.4/build/src/numpy/core/src/umathmodule.o -L/usr/lib/python2.4/config -lpython2.4 -o build/lib.cygwin-1.5.19-i686-2.4/numpy/core/umath.dll build/temp.cygwin-1.5.19-i686-2.4/build/src/numpy/core/src/umathmodule.o: umathmodule.c:(.text+0x2f5e): referencia a `_feraiseexcept' sin definir build/temp.cygwin-1.5.19-i686-2.4/build/src/numpy/core/src/umathmodule.o: umathmodule.c:(.text+0x2fe3): referencia a `_feraiseexcept' sin definir build/temp.cygwin-1.5.19-i686-2.4/build/src/numpy/core/src/umathmodule.o: umathmodule.c:(.text+0x3081): referencia a `_feraiseexcept' sin definir build/temp.cygwin-1.5.19-i686-2.4/build/src/numpy/core/src/umathmodule.o: umathmodule.c:(.text+0x3139): referencia a `_feraiseexcept' sin definir build/temp.cygwin-1.5.19-i686-2.4/build/src/numpy/core/src/umathmodule.o: umathmodule.c:(.text+0x320c): referencia a `_feraiseexcept' sin definir build/temp.cygwin-1.5.19-i686-2.4/build/src/numpy/core/src/umathmodule.o: umathmodule.c:(.text+0x129ef): referencia a `_fetestexcept' sin definir build/temp.cygwin-1.5.19-i686-2.4/build/src/numpy/core/src/umathmodule.o: umathmodule.c:(.text+0x12a1d): referencia a `_feclearexcept' sin definir build/temp.cygwin-1.5.19-i686-2.4/build/src/numpy/core/src/umathmodule.o: umathmodule.c:(.text+0x12b1b): referencia a `_fetestexcept' sin definir build/temp.cygwin-1.5.19-i686-2.4/build/src/numpy/core/src/umathmodule.o: umathmodule.c:(.text+0x12b27): referencia a `_feclearexcept' sin definir collect2: ld devolvi'o el estado de salida 1 error: Command "gcc -shared -Wl,--enable-auto-image-base 
build/temp.cygwin-1.5.19-i686-2.4/build/src/numpy/core/src/umathmodule.o -L/usr/lib/python2.4/config -lpython2.4 -o build/lib.cygwin-1.5.19-i686-2.4/numpy/core/umath.dll" failed with exit status 1" (yes, I have the Spanish locale set :D). The functions needed are responsible for setting exceptions, and presumably, only need a simple addition to the library linking path. Is this correct? does anyone know how to deal with this? Many thanks! Jose From oliphant.travis at ieee.org Fri Feb 17 02:20:01 2006 From: oliphant.travis at ieee.org (Travis E. Oliphant) Date: Fri Feb 17 02:20:01 2006 Subject: [Numpy-discussion] Re: ANN: Release of NumPy 0.9.5 In-Reply-To: <45ljqdF7706gU1@individual.net> References: <45ljqdF7706gU1@individual.net> Message-ID: <43F5A317.8010008@ieee.org> Thomas Gellekum wrote: > "Travis E. Oliphant" writes: > > >> - Improvements to numpy.distutils > > > Stupid questions: is it really necessary to keep your own copy of > distutils and even install it? What's wrong with the version in the > Python distribution? How can I disable the colorized output to get > something more readable than yellow on white (well, seashell1)? > Yes --- distutils does not provide enough functionality. Besides, it's not a *copy* of distutils. It's enhancements to distutils. It builds on top of standard distutils. I don't know the answer to the colorized output question. Please post to numpy-discussions at lists.sourceforge.net to get more help and/or make suggestions. -Travis From oliphant.travis at ieee.org Fri Feb 17 03:22:04 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Fri Feb 17 03:22:04 2006 Subject: [Numpy-discussion] Problems compiling on Cygwin In-Reply-To: References: Message-ID: <43F5B1AE.9080208@ieee.org> Jose Gomez-Dans wrote: >Hi! >Yesterday I posted on the scipy mailing list that I could not compile NumPy on >Cygwin. I would like to provide some more information on what the problems are, >as I would really like to be able to use it on Cygwin. > > Thanks Jose. It looks like we are not doing the right thing in the platform-specific section of code here. But the right thing can potentially be done. Look here: ftp://sunsite.dk/projects/cygwinports/release/python/numpy/ It looks like somebody figured out how to make it work with cygwin (one option, of course is to just disable the IEEE error-setting modes for cygwin). -Travis From oliphant.travis at ieee.org Fri Feb 17 03:50:10 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Fri Feb 17 03:50:10 2006 Subject: [Numpy-discussion] Problems compiling on Cygwin In-Reply-To: References: Message-ID: <43F5B856.5060301@ieee.org> Jose Gomez-Dans wrote: >Hi! >Yesterday I posted on the scipy mailing list that I could not compile NumPy on >Cygwin. I would like to provide some more information on what the problems are, >as I would really like to be able to use it on Cygwin. > > I looked in to how people at cygwin ports got the IEEE math stuff done. They borrowed it from BSD basically. So, I've taken their patch and placed it in the main tree. Jose, could you check out the latest SVN version of numpy and try to build and install it on cygwin to see if I made the right changes? 
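For anyone trying this, a quick smoke test of the rebuilt package (a sketch assuming the numpy.test() entry point; the version string shown is only an example):

import numpy
print numpy.__version__   # an SVN build reports something like 0.9.6.xxxx
numpy.test()              # the umath tests exercise the fenv-based error handling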
-Travis From bblais at bryant.edu Fri Feb 17 05:33:15 2006 From: bblais at bryant.edu (Brian Blais) Date: Fri Feb 17 05:33:15 2006 Subject: [Numpy-discussion] Re: calculating on matrix indices In-Reply-To: References: <43F50D7F.8010108@bryant.edu> Message-ID: <43F5D003.2090705@bryant.edu> Robert Kern wrote: > The traceback tells you exactly what's wrong: > > In [7]: x[idx] = exp(-t[idx]/tau) > --------------------------------------------------------------------------- > exceptions.TypeError Traceback (most recent call > last) > yes, I saw that, but all of the types (i.e. type(x)) came out to be the same, so I figured the problem was with the indexing, and that was causing a typecast problem. I didn't know about dtype > In [13]: x = zeros(len(t), float) well that is confusing! zeros(5,'f') is single precision, zeros(5,'d') is double, and zeros(5,float) is double! that's where I got whacked, because I remembered that "float" was "double" in python...but I guess, not always. thanks for your help! bb -- ----------------- bblais at bryant.edu http://web.bryant.edu/~bblais From agn at noc.soton.ac.uk Fri Feb 17 06:37:02 2006 From: agn at noc.soton.ac.uk (George Nurser) Date: Fri Feb 17 06:37:02 2006 Subject: [Numpy-discussion] maxima of masked arrays Message-ID: <9348F161-A8D2-4755-9AF5-54968D134FDF@noc.soton.ac.uk> I am trying to get the n-1 dimensional array of maxima of an array taken over a given axis. with ordinary arrays this works fine. E.g. In [49]: a = arange(1,13).reshape(3,4) In [50]: a Out[50]: array([[ 1, 2, 3, 4], [ 5, 6, 7, 8], [ 9, 10, 11, 12]]) The maximum over all elements is: In [51]: print a.max() 12 & the array of maxima over the 0-axis is OK too: In [52]: print a.max(0) [ 9 10 11 12] But with a masked array there are problems. In [54]: amask = ma.masked_where(a < 5,a) In [55]: amask Out[55]: array(data = [[999999 999999 999999 999999] [ 5 6 7 8] [ 9 10 11 12]], mask = [[True True True True] [False False False False] [False False False False]], fill_value=999999) The maximum over all elements is fine: In [56]: amask.max() Out[56]: 12 but trying to get an array of maxima over the 0-axis fails: n [57]: amask.max(0) Out[57]: array(data = [[999999 999999 999999 999999] [ 5 6 7 8] [ 9 10 11 12]], mask = [[True True True True] ....... Are there any workarounds for this? -George. From schofield at ftw.at Fri Feb 17 07:23:13 2006 From: schofield at ftw.at (Ed Schofield) Date: Fri Feb 17 07:23:13 2006 Subject: [Numpy-discussion] Dot products and casting Message-ID: <20060217152142.GA9621@ftw.at> Hi all, I think there's a bug in dot() that prevents it from operating on two arrays, neither of which can be safely cast to the other. Here's an example: >>> from numpy import * >>> a = arange(10, dtype=float32) >>> b = arange(10, dtype=float64) >>> c = arange(10, dtype=int64) >>> d = arange(10, dtype=int32) >>> e = arange(10, dtype=int16) # Dot products between b and either c or d work fine: >>> dot(b,c) 285.0 >>> dot(b,d) 285.0 # Dot products with e also work fine: >>> dot(a,e) 285.0 >>> dot(b,e) 285.0 # But dot products between a and either c or d don't work: >>> dot(a,c) Traceback (most recent call last): File "", line 1, in ? TypeError: array cannot be safely cast to required type >>> dot(a,d) Traceback (most recent call last): File "", line 1, in ? 
TypeError: array cannot be safely cast to required type The problem seems to be with the PyArray_ObjectType() calls in dotblas_matrixproduct(), which are returning typenum=PyArray_FLOAT, but this isn't sufficiently large for a safe cast from the int32 and int64 arrays. It seems like PyArray_ObjectType() should be returning PyArray_DOUBLE here instead. Here's another example: >>> f = arange(10, dtype=complex64) >>> dot(b, f) Traceback (most recent call last): File "", line 1, in ? TypeError: array cannot be safely cast to required type So it seems like the problem isn't isolated to float32 arrays, but occurs elsewhere when we need to find a minimum data type of two arrays when *both* need to be upcasted. -- Ed From stefan at sun.ac.za Fri Feb 17 07:24:01 2006 From: stefan at sun.ac.za (Stefan van der Walt) Date: Fri Feb 17 07:24:01 2006 Subject: [Numpy-discussion] storage for records In-Reply-To: <43F4E947.9070409@ieee.org> References: <20060216192556.GA20396@alpha> <43F4E947.9070409@ieee.org> Message-ID: <20060217152207.GA971@sun.ac.za> I am probably trying to do something silly, but still: In [1]: import numpy as N In [2]: N.__version__ Out[2]: '0.9.6.2127' In [3]: P = N.array(N.zeros((2,2)), N.dtype((('f4',3), {'names': ['x','y','z'], 'formats': ['f4','f4','f4']}))) *** glibc detected *** malloc(): memory corruption: 0x0830bb48 *** Aborted Regards St?fan On Thu, Feb 16, 2006 at 02:06:15PM -0700, Travis Oliphant wrote: > Stefan van der Walt wrote: > > >Is there any way to control the underlying storage for a record? > > > >I am trying to use Travis' earlier example of an image with named fields: > > > >dt = N.dtype(' >img = N.array(N.empty((rows,columns)), dtype=dt) From arnd.baecker at web.de Fri Feb 17 08:11:04 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Fri Feb 17 08:11:04 2006 Subject: [Numpy-discussion] ANN: Python Enthought Edition Version 0.9.2 Released In-Reply-To: <43F4CFB4.1080305@enthought.com> References: <43F4CFB4.1080305@enthought.com> Message-ID: On Thu, 16 Feb 2006, Travis N. Vaught wrote: > Enthought is pleased to announce the release of Python Enthought Edition > Version 0.9.2 (http://code.enthought.com/enthon/) -- a python > distribution for Windows. This is a kitchen-sink-included Python > distribution including the following packages/tools out of the box: > > Numeric 24.2 > SciPy 0.3.3 > IPython 0.6.15 > Enthought Tool Suite 1.0.2 > wxPython 2.6.1.0 > PIL 1.1.4 > mingw 20030504-1 > f2py 2.45.241_1926 > MayaVi 1.5 > Scientific Python 2.4.5 > VTK 4.4 > and many more... Brilliant - many thanks for the effort! I was just about to ask for the plans about numpy/scipy, but the changelog at http://code.enthought.com/release/changelog-enthon0.9.2.shtml shows quite a bit of activity in this direction! Do you have an estimate about when a numpy/scipy version of the Enthought Edition might happen? 
Many thanks, Arnd From charlesr.harris at gmail.com Fri Feb 17 08:23:06 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri Feb 17 08:23:06 2006 Subject: [Numpy-discussion] cross product of two 3xn arrays In-Reply-To: References: <764834db0602151720r77370d7bx327e2a6dca954dbe@mail.gmail.com> <43F3EE4E.6030301@ieee.org> <43F43F02.1090001@bigpond.net.au> <43F48A2D.80705@ieee.org> Message-ID: On 2/17/06, Bill Baxter wrote: > For folks using quats to represent rotations (which is all I use them for, > anyway), if you're batch transforming a bunch of vectors by one quaternion, > it's a lot more efficient to convert the quat to a 3x3 matrix first and > transform using matrix multiply (9 mults per transform that way vs 21 or so > depending on the implementation of q*v*q^-1). Given that, I can't see many > situations when I'd need a super speedy C version of quaternion multiply. > > --Bill True. On the other hand, I have files containing 20,000 quaternions, each of which needs to be converted to a rotation matrix and applied to some 400 vectors, so a c version would have its place. I have a python quaternion class that I use for such things but profiling shows that it is one of the prime time bandits, so I am tempted to use c for that class anyway. Note that the current NumPy cross product is implemented in Python. On a related note, indexing in numarray is some 3x faster than in NumPy and I'm wondering what needs to be done to speed that up. Chuck From travis at enthought.com Fri Feb 17 08:46:06 2006 From: travis at enthought.com (Travis N. Vaught) Date: Fri Feb 17 08:46:06 2006 Subject: [Numpy-discussion] ANN: Python Enthought Edition Version 0.9.2 Released In-Reply-To: References: <43F4CFB4.1080305@enthought.com> Message-ID: <43F5FD9D.2090706@enthought.com> Arnd Baecker wrote: > On Thu, 16 Feb 2006, Travis N. Vaught wrote: > > >> Enthought is pleased to announce the release of Python Enthought Edition >> Version 0.9.2 (http://code.enthought.com/enthon/) -- a python >> distribution for Windows. This is a kitchen-sink-included Python >> distribution including the following packages/tools out of the box: >> >> Numeric 24.2 >> SciPy 0.3.3 >> IPython 0.6.15 >> Enthought Tool Suite 1.0.2 >> wxPython 2.6.1.0 >> PIL 1.1.4 >> mingw 20030504-1 >> f2py 2.45.241_1926 >> MayaVi 1.5 >> Scientific Python 2.4.5 >> VTK 4.4 >> and many more... >> > > Brilliant - many thanks for the effort! > > I was just about to ask for the plans about numpy/scipy, > but the changelog at > http://code.enthought.com/release/changelog-enthon0.9.2.shtml > shows quite a bit of activity in this direction! > > Do you have an estimate about when a numpy/scipy version > of the Enthought Edition might happen? > > Many thanks, > > Arnd > It's a bit difficult to say with much accuracy, so I'll be transparent but imprecise. Our release of Enthon versions typically tracks the state of the platform we are using for the custom software development we do to pay the bills. Thus, our current project code typically has to be ported to build and run on a cobbled-together build of the newer versions before we do a release. I realize this is a drag on the release schedule for Enthon, but it's how we allocate resources to the builds. Enough excuses, though--we are working on the migration of our project code now (Pearu Peterson) and I expect in weeks (rather than months) we'll have an Enthon release candidate with Python 2.4.2, and the latest SciPy and NumPy on Windows. 
Robert Kern is already working on a project that is based on this tool chain, so the wedge is in place. Thanks for the interest! (and sorry for the cross-post) Travis From faltet at carabos.com Fri Feb 17 10:21:26 2006 From: faltet at carabos.com (Francesc Altet) Date: Fri Feb 17 10:21:26 2006 Subject: [Numpy-discussion] PyTables with support for NumPy 0.9.5 Message-ID: <200602171920.35075.faltet@carabos.com> Hi, I've just uploaded a new version of PyTables with (almost) complete support for the recent NumPy 0.9.5. All the range of homogeneous and heterogeneous (including those with nested fields) arrays using any combination of data-types should be supported. The only exception is the lack of support of unicode types (I have to figure out yet which HDF5 datatype would be best to mapping them; suggestions are welcome!). You can fetch the tarball from: http://pytables.carabos.com/download/preliminary/pytables-1.3beta2.tar.gz Test it as much as you can, and if you find any strange quirk, do not hesitate to report it back. Regards, -- >0,0< Francesc Altet ? ? http://www.carabos.com/ V V C?rabos Coop. V. ??Enjoy Data "-" From tim.hochberg at cox.net Fri Feb 17 11:14:12 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Fri Feb 17 11:14:12 2006 Subject: [Numpy-discussion] optimizing power() for complex and real cases In-Reply-To: References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> <43F13B20.3000301@cox.net> <43F13DE0.8040309@bigpond.net.au> <43F167C0.7040806@cox.net> <43F20BFE.5030100@cox.net> <43F36CB6.5050004@cox.net> <43F3B8A9.3000507@bigpond.net.au> <43F3CA12.4000907@cox.net> Message-ID: <43F62069.80209@cox.net> Here's a little progress report: I now have A**2 running as fast as square(A). This is done by special casing stuff in array_power so that A**2 acutally calls square(A) instead of going through power(A,2). Things still need a bunch of cleaning up (in fact right now A**1 segfaults, but I know why and it should be an easy fix). However, I think I've discovered why you couldn't get your special cased power to run as fast for A**2 as square(A) or A*A. It appears that the overhead of creating a new array object from the integer 2 is the bottleneck. I was running into the same mysterious overhead, even when dispatching from array_power, until I special cased on PyInt to avoid the array creation in that case. -tim From oliphant.travis at ieee.org Fri Feb 17 11:23:02 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Fri Feb 17 11:23:02 2006 Subject: [Numpy-discussion] Dot products and casting In-Reply-To: <20060217152142.GA9621@ftw.at> References: <20060217152142.GA9621@ftw.at> Message-ID: <43F62258.3040602@ieee.org> Ed Schofield wrote: >Hi all, > > > >The problem seems to be with the PyArray_ObjectType() calls in >dotblas_matrixproduct(), which are returning typenum=PyArray_FLOAT, but >this isn't sufficiently large for a safe cast from the int32 and int64 >arrays. It seems like PyArray_ObjectType() should be returning >PyArray_DOUBLE here instead. > > This sounds like an accurate diagnosis. I'll have to look at the type-evaluation code a bit more to see why a suitable type is not being found --- unless someone else gets there first. I won't have time for awhile today. 
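In the meantime, a user-level workaround (just a sketch that sidesteps the
common-type search by upcasting explicitly; it is not a fix for
PyArray_ObjectType itself) would be:

from numpy import arange, dot, float32, float64

b = arange(10, dtype=float32)
i = arange(10)                    # int32 or int64, depending on platform
# dot(b, i) currently raises the TypeError above; promoting by hand works:
r = dot(b.astype(float64), i.astype(float64))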
-Travis From oliphant.travis at ieee.org Fri Feb 17 11:25:59 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Fri Feb 17 11:25:59 2006 Subject: [Numpy-discussion] cross product of two 3xn arrays In-Reply-To: References: <764834db0602151720r77370d7bx327e2a6dca954dbe@mail.gmail.com> <43F3EE4E.6030301@ieee.org> <43F43F02.1090001@bigpond.net.au> <43F48A2D.80705@ieee.org> Message-ID: <43F62313.9030607@ieee.org> Charles R Harris wrote: >On 2/17/06, Bill Baxter wrote: > > >>For folks using quats to represent rotations (which is all I use them for, >>anyway), if you're batch transforming a bunch of vectors by one quaternion, >>it's a lot more efficient to convert the quat to a 3x3 matrix first and >>transform using matrix multiply (9 mults per transform that way vs 21 or so >>depending on the implementation of q*v*q^-1). Given that, I can't see many >>situations when I'd need a super speedy C version of quaternion multiply. >> >>--Bill >> >> > >On a related note, indexing in numarray is some 3x faster than in >NumPy and I'm wondering what needs to be done to speed that up. > > Please explain with a benchmark. This is not true for all indexing operations. But, it is possible that certain use cases are faster. We can't do anything without knowing what you are talking about exactly. -Travis From cookedm at physics.mcmaster.ca Fri Feb 17 11:31:05 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Fri Feb 17 11:31:05 2006 Subject: [Numpy-discussion] optimizing power() for complex and real cases In-Reply-To: <43F62069.80209@cox.net> (Tim Hochberg's message of "Fri, 17 Feb 2006 12:13:45 -0700") References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> <43F13B20.3000301@cox.net> <43F13DE0.8040309@bigpond.net.au> <43F167C0.7040806@cox.net> <43F20BFE.5030100@cox.net> <43F36CB6.5050004@cox.net> <43F3B8A9.3000507@bigpond.net.au> <43F3CA12.4000907@cox.net> <43F62069.80209@cox.net> Message-ID: Tim Hochberg writes: > Here's a little progress report: I now have A**2 running as fast as > square(A). This is done by special casing stuff in array_power so that > A**2 acutally calls square(A) instead of going through power(A,2). > Things still need a bunch of cleaning up (in fact right now A**1 > segfaults, but I know why and it should be an easy fix). However, I > think I've discovered why you couldn't get your special cased power to > run as fast for A**2 as square(A) or A*A. It appears that the overhead > of creating a new array object from the integer 2 is the bottleneck. I > was running into the same mysterious overhead, even when dispatching > from array_power, until I special cased on PyInt to avoid the array > creation in that case. Hmm, if that's true about the overhead, that'll hit all computations of the type op(x, ). Something to look at. That ufunc code for casting the arguments is pretty big and hairy, so I'm not going to look at right now :-) -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From bryan at cole.uklinux.net Fri Feb 17 12:11:06 2006 From: bryan at cole.uklinux.net (Bryan Cole) Date: Fri Feb 17 12:11:06 2006 Subject: [Numpy-discussion] Re: number ranges (was Re: Matlab page on scipy wiki) In-Reply-To: References: Message-ID: <1139945386.3346.38.camel@pc1.cole.uklinux.net> > > > First, I think the range() function in python is ugly to begin with. 
> Why can't python just support range notation directly like 'for a in
> 0:10'. Or with 0..10 or 0...10 syntax. That seems to make a lot more
> sense to me than having to call a named function. Anyway, that's a
> python pet peeve, and python's probably not going to change something
> so fundamental...

There was a python PEP on this. It got rejected as having too many
'issues'. Pity, in my view.

see http://www.python.org/peps/pep-0204.html

BC

From Chris.Barker at noaa.gov Fri Feb 17 12:24:01 2006
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Fri Feb 17 12:24:01 2006
Subject: [Numpy-discussion] Behavior of array scalars
Message-ID: <43F630F9.9050208@noaa.gov>

Hi all,

It just dawned on me that the numpy array scalars might give something
I have wanted once in a while: mutable scalars. However, it seems that
we almost, but not quite, have them. A few questions:

>>> import numpy as N
>>> N.__version__
'0.9.2'
>>> N.array(5)
array(5)
>>>
>>> x = N.array(5)
>>> x.shape
()

So it looks like a scalar.

>>> y = x

Now I have two names bound to the same object.

>>> x += 5

I expect this to change the object in place.

>>> x
10

but what is this? is it no longer an array?

>>> y
array(10)

y changed, so it looks like the object has changed in place.

>>> type(x)
>>> type(y)

So why did x += 5 result in a different type of object?

Also: I can see that we could use += and friends to mutate an array
scalar, but what if I want to set its value, as a mutation, like:

>>> x = N.array((5,))
>>> x
array([5])
>>> x[0] = 10
>>> x
array([10])

but I can't do that with an array scalar:

>>> x = N.array(5)
>>> x
array(5)
>>> x[0] = 10
Traceback (most recent call last):
File "", line 1, in ?
IndexError: 0-d arrays can't be indexed.
>>> x[] = 10
File "", line 1
x[] = 10
^
SyntaxError: invalid syntax
>>> x[:] = 10
Traceback (most recent call last):
File "", line 1, in ?
ValueError: cannot slice a scalar

Is there a way to set the value in place, without resorting to:

>>> x *= 0
>>> x += 34

I think it would be really handy to have a full featured, mutable scalar.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

NOAA/OR&R/HAZMAT (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker at noaa.gov

From alexander.belopolsky at gmail.com Fri Feb 17 12:51:03 2006
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Fri Feb 17 12:51:03 2006
Subject: [Numpy-discussion] Behavior of array scalars
In-Reply-To: <43F630F9.9050208@noaa.gov>
References: <43F630F9.9050208@noaa.gov>
Message-ID:

On 2/17/06, Christopher Barker wrote:
> >>> x += 5
>
> I expect this to change the object in place.
>
> >>> x
> 10
>
> but what is this? is it no longer an array?

I would say it is a bug, but here is an easy work-around

>>> x = array(5)
>>> id(x)
6425088
>>> x[()]+=5
>>> id(x)
6425088
>>> x
array(10)

You can also use

>>> x[...]+=5
>>> x
array(15)

With an additional benefit that the same syntax works for any shape.

> Is there a way to set the value in place, without resorting to:

>>> x[...] = 10

or

>>> x[()] = 10

You can see more on this feature at
http://projects.scipy.org/scipy/numpy/wiki/ZeroRankArray
From tim.hochberg at cox.net Fri Feb 17 13:19:00 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Fri Feb 17 13:19:00 2006
Subject: [Numpy-discussion] optimizing power() for complex and real cases
In-Reply-To:
References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> <43F13B20.3000301@cox.net> <43F13DE0.8040309@bigpond.net.au> <43F167C0.7040806@cox.net> <43F20BFE.5030100@cox.net> <43F36CB6.5050004@cox.net> <43F3B8A9.3000507@bigpond.net.au> <43F3CA12.4000907@cox.net> <43F62069.80209@cox.net>
Message-ID: <43F63D82.7060702@cox.net>

David M. Cooke wrote:

>Tim Hochberg writes:
>
>>Here's a little progress report: I now have A**2 running as fast as
>>square(A). This is done by special casing stuff in array_power so that
>>A**2 acutally calls square(A) instead of going through power(A,2).
>>Things still need a bunch of cleaning up (in fact right now A**1
>>segfaults, but I know why and it should be an easy fix). However, I
>>think I've discovered why you couldn't get your special cased power to
>>run as fast for A**2 as square(A) or A*A. It appears that the overhead
>>of creating a new array object from the integer 2 is the bottleneck. I
>>was running into the same mysterious overhead, even when dispatching
>>from array_power, until I special cased on PyInt to avoid the array
>>creation in that case.
>
>Hmm, if that's true about the overhead, that'll hit all computations
>of the type op(x, ). Something to look at. That ufunc code
>for casting the arguments is pretty big and hairy, so I'm not going to
>look at right now :-)

Well, it's just a guess based on the fact that the extra time went away
when I stopped calling PyArray_EnsureArray(o2) for python ints. For what
it's worth, numpy scalars seem to have much less overhead. As indicated
below (note that numpy scalars are not currently special cased like
PyInts are). The overhead from PyInts was closer to 75% versus about 15%
for numpy scalars. Of course, the percentage of overhead is going to go
up for smaller arrays.

>>> Timer('a**2', 'from numpy import arange;a = arange(10000.); b = a[2]').timeit(10000)
0.28345055027943999
>>> Timer('a**b', 'from numpy import arange;a = arange(10000.); b = a[2]').timeit(10000)
0.32190487897204889
>>> Timer('a*a', 'from numpy import arange;a = arange(10000.); b = a[2]').timeit(10000)
0.27305732991204223
>>> Timer('square(a)', 'from numpy import arange, square;a = arange(10000.); b = a[2]').timeit(10000)
0.27989618792332749

-tim

From oliphant at ee.byu.edu Fri Feb 17 15:04:01 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Fri Feb 17 15:04:01 2006
Subject: [Numpy-discussion] Behavior of array scalars
In-Reply-To: <43F630F9.9050208@noaa.gov>
References: <43F630F9.9050208@noaa.gov>
Message-ID: <43F65627.5080901@ee.byu.edu>

Christopher Barker wrote:
> Hi all,
>
> It just dawned on my that the numpy array scalars might give something
> I have wanted once in a while: mutable scalars. However, it seems that
> we almost, but no quite, have them.
> A few questions:

NumPy (starting with Numeric) has always had this love-hate relationship
with zero-dimensional arrays. We use them internally to simplify the
code, but try not to expose them to the user. Ultimately, we couldn't
figure out how to do that cleanly and so we have the current compromise
situation where 0-d arrays are available but treated as second-class
citizens. Thus, we still get funny behavior in certain circumstances.
I think you found another such quirky area. I'm open to suggestions.

To analyze this particular case...

The a += 10 operation should be equivalent to add(a,10,a). Note that
explicitly writing add(a,10,a) returns a scalar (all ufuncs return
scalars if 0-d arrays are the result). But, a is modified in-place as
you wanted.

Perhaps what is going on is that

a += 10

is being translated to

a = a + 10

rather than

add(a,10,a)

I'll have to look deeper to see why.

-Travis

From ndarray at mac.com Fri Feb 17 15:44:04 2006
From: ndarray at mac.com (Sasha)
Date: Fri Feb 17 15:44:04 2006
Subject: [Numpy-discussion] Behavior of array scalars
In-Reply-To: <43F65627.5080901@ee.byu.edu>
References: <43F630F9.9050208@noaa.gov> <43F65627.5080901@ee.byu.edu>
Message-ID:

On 2/17/06, Travis Oliphant wrote:
> ...
> Perhaps what is going on is that
> a += 10
> is being translated to
> a = a + 10
> rather than
> add(a,10,a)
> I'll have to look deeper to see why.

It is actually being translated to "a = add(a,10,a)" by virtue of
array_inplace_add supplied in the inplace_add slot. Here is the proof:

>>> a = array(0)
>>> a = b = array(0)
>>> a += 10
>>> b
array(10)
>>> a
10

Another way to explain it is to note that a += 10 is equivalent to
"a = a.__iadd__(10)" and a.__iadd__(10) is equivalent to "add(a, 10, a)".

This is not easy to fix because the real culprit is

>>> a = array(0)
>>> type(a) is type(a+a)
False

Maybe we can change ufunc logic so that when the output argument is
supplied it is returned without scalar conversion.

From oliphant at ee.byu.edu Fri Feb 17 15:51:01 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Fri Feb 17 15:51:01 2006
Subject: [Numpy-discussion] Behavior of array scalars
In-Reply-To:
References: <43F630F9.9050208@noaa.gov> <43F65627.5080901@ee.byu.edu>
Message-ID: <43F66123.204@ee.byu.edu>

Sasha wrote:
>Maybe we can change ufunc logic so that when the output argument is
>supplied it is returned without scalar conversion.

That seems sensible. Any objections?

It is PyArray_Return that changes things from 0-d arrays to scalars.
It's all that function has ever really done....

Notice that this behavior was always in Numeric...

a = Numeric.array(5)
a += 10
type(a)

>>> a = Numeric.array(5)
>>> type(a) == type(a+a)
False

But...

>>> a = Numeric.array(5,'f')
>>> type(a) == type(a+a)
True

So, we've been dealing with these issues (poorly) for a long time....

-Travis

From oliphant at ee.byu.edu Fri Feb 17 16:05:01 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Fri Feb 17 16:05:01 2006
Subject: [Numpy-discussion] Behavior of array scalars
In-Reply-To:
References: <43F630F9.9050208@noaa.gov> <43F65627.5080901@ee.byu.edu>
Message-ID: <43F6647C.6060107@ee.byu.edu>

Sasha wrote:
>It is actually being translated to "a = add(a,10,a)" by virtue of
>array_inplace_add supplied in the inplace_add slot. Here is the
>proof:

I think we do need to fix something. Because the problem is even more
apparent when you in-place add an array to a matrix. Consider...

a = rand(5,5)
b = mat(a)

a += b

What do you think the type of a now is? What should it be?
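Spelling it out (a rough session, assuming rand and mat imported as above;
the exact repr of the type depends on the version you have):

>>> a = rand(5,5)
>>> b = mat(a)
>>> a += b
>>> type(a)
<class 'numpy.core.defmatrix.matrix'>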
Currently, this code would change a from an array to a matrix because
add(a,b,a) returns a matrix.

I'm thinking that we should establish the rule that if output arrays are
given, then what is returned should just be those output arrays... This
seems to make consistent sense and it will make the inplace operators
work as expected (not changing the type).

We are currently not letting in-place operators change the data-type.
Our argument against that behavior is weakened if we do let them change
the Python type....

-Travis

From cookedm at physics.mcmaster.ca Fri Feb 17 16:21:08 2006
From: cookedm at physics.mcmaster.ca (David M. Cooke)
Date: Fri Feb 17 16:21:08 2006
Subject: [Numpy-discussion] Behavior of array scalars
In-Reply-To: <43F6647C.6060107@ee.byu.edu> (Travis Oliphant's message of "Fri, 17 Feb 2006 17:04:12 -0700")
References: <43F630F9.9050208@noaa.gov> <43F65627.5080901@ee.byu.edu> <43F6647C.6060107@ee.byu.edu>
Message-ID:

Travis Oliphant writes:
> Consider...
>
> a = rand(5,5)
> b = mat(a)
>
> a += b
>
> What do you think the type of a now is? What should it be?
>
> Currently, this code would change a from an array to a matrix because
>
> add(a,b,a) returns a matrix.
>
> I'm thinking that we should establish the rule that if output arrays
> are given, then what is returned should just be those output arrays...

+1.

--
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/
|cookedm at physics.mcmaster.ca

From ndarray at mac.com Fri Feb 17 16:31:03 2006
From: ndarray at mac.com (Sasha)
Date: Fri Feb 17 16:31:03 2006
Subject: [Numpy-discussion] Behavior of array scalars
In-Reply-To: <43F65627.5080901@ee.byu.edu>
References: <43F630F9.9050208@noaa.gov> <43F65627.5080901@ee.byu.edu>
Message-ID:

Sorry for a truncated post. Here is what I intended.

On 2/17/06, Travis Oliphant wrote:
> NumPy (starting with Numeric) has always had this love-hate relationship
> with zero-dimensional arrays. We use them internally to simplify the
> code, but try not to expose them to the user. Ultimately, we couldn't
> figure out how to do that cleanly and so we have the current compromise
> situation where 0-d arrays are available but treated as second-class
> citizens. Thus, we still get funny behavior in certain circumstances.

It would be nice to collect the motivations behind the current state
of affairs with rank-0 arrays in one place. Due to the "hard-hat"
nature of the issue, I would suggest to do it at
http://projects.scipy.org/scipy/numpy/wiki/ZeroRankArray .

Travis' Numeric3 design document actually leaves the issue open

"""
What does single element indexing return? Scalars or rank-0 arrays?
Right now, a scalar is returned if there is a direct map to a Python
type, otherwise a rank-0 array (Numeric scalar) is returned. But, in
problems which reduce to an array of arbitrary size, this can lead to a
lot of code that basically just checks to see if the object is a
scalar. There are two ways I can see to solve this: 1) always return
rank-0 arrays (never convert to Python scalars) and 2) always use
special functions (like alen) that handle Python scalars correctly.
I'm open to both ideas, but probably prefer #1 (never convert to Python
scalars) unless requested.
"""

I can think of two compelling reasons in favor of scalar array types:

1. Rank-0 arrays cannot be used as indices to tuples.
2. Rank-0 arrays cannot be used as keys in dicts.

Neither of these reasons is future proof.
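Both are easy to demonstrate (a sketch; the exact error messages vary by
version):

from numpy import array

a0 = array(2)                  # rank-0 array
try:
    (10, 20, 30)[a0]           # reason 1: rejected as a tuple index
except TypeError, e:
    print 'as index:', e
try:
    {a0: 'x'}                  # reason 2: rejected as a dict key
except TypeError, e:
    print 'as key:', e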
It looks like python 2.5 will introduce __index__ slot that will fix #1 and #2 is probably better solved by introduction of "frozen" ndarray. In any case I will collect all these thoughts on the ZeroRankArray page unless I hear that this belongs to the main wiki. From oliphant at ee.byu.edu Fri Feb 17 16:54:05 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Fri Feb 17 16:54:05 2006 Subject: [Numpy-discussion] Behavior of array scalars In-Reply-To: References: <43F630F9.9050208@noaa.gov> <43F65627.5080901@ee.byu.edu> Message-ID: <43F67012.5090303@ee.byu.edu> Sasha wrote: >It would be nice to collect the motivations behind the current state >of affairs with rank-0 arrays in one place. Due to the "hard-hat" >nature of the issue, I would suggest to do it at >http://projects.scipy.org/scipy/numpy/wiki/ZeroRankArray . > >Travis' Numeric3 design document actually leaves the issue open > > > This document is old. Please don't refer to it too stringently. It reflected my thinking at the start of the project. There are mailing list discussions that have more relevance. The source reflects what was actually done. What was done is introduce scalar array types for every data-type and return those. I had originally thought that the pure Python user would *never* see rank-0 arrays. That's why PyArray_Return is called all over the place in the code. The concept that practicality beats purity won out and there are a few limited wasy you can get zero-dimensional arrays (i.e. using array(5) which used to return an array scalar). They just don't *stay* 0-d arrays and are converted to array scalars at almost every opportunity.... I have been relaxing this over time, however. I can't say I have some grand understanding that is guiding the relaxation of this rule, however, except that I still think array scalars are *better* to deal with (I think this will be especially obvious when we get scalar math implemented). So, I relunctantly give visibility to 0-d arrays when particular use-cases emerge. >In any case I will collect all these thoughts on the ZeroRankArray >page unless I hear that this belongs to the main wiki. > > It's a good start. This particular use case of course is actually showing us a deeper flaw in our use of output arguments in the ufunc which needs changing. -Travis From Chris.Barker at noaa.gov Fri Feb 17 17:19:03 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Fri Feb 17 17:19:03 2006 Subject: [Numpy-discussion] Behavior of array scalars In-Reply-To: <43F67012.5090303@ee.byu.edu> References: <43F630F9.9050208@noaa.gov> <43F65627.5080901@ee.byu.edu> <43F67012.5090303@ee.byu.edu> Message-ID: <43F67618.8060501@noaa.gov> Travis Oliphant wrote: > They just don't > *stay* 0-d arrays and are converted to array scalars at almost every > opportunity.... I'm still confused as to what the difference is. This (recent) convesation started with my desire for a mutable scalar. CAn array scalars fill this role? What I mean by that role is some way to do: x += 5 # (and friends) x[()] = 45 # or some other notation And have x be the same object throughout. Heck even something like: x.set(45) would work for me. Alexander Belopolsky wrote: >>>>x[...] = 10 > > or > >>>>x[()] = 10 I can't do that. HAs that been added since version: '0.9.2' ? -Chris -- Christopher Barker, Ph.D. 
Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From oliphant at ee.byu.edu Fri Feb 17 17:30:09 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Fri Feb 17 17:30:09 2006 Subject: [Numpy-discussion] Behavior of array scalars In-Reply-To: <43F67618.8060501@noaa.gov> References: <43F630F9.9050208@noaa.gov> <43F65627.5080901@ee.byu.edu> <43F67012.5090303@ee.byu.edu> <43F67618.8060501@noaa.gov> Message-ID: <43F67875.1090302@ee.byu.edu> Christopher Barker wrote: > Travis Oliphant wrote: > >> They just don't *stay* 0-d arrays and are converted to array scalars >> at almost every opportunity.... > > > I'm still confused as to what the difference is. This (recent) > convesation started with my desire for a mutable scalar. CAn array > scalars fill this role? No. array scalars are immutable (well except for the void array scalar...) > > What I mean by that role is some way to do: > > x += 5 # (and friends) This now works in SVN. In-place operations on 0-d arrays don't change on you. -Travis From ndarray at mac.com Fri Feb 17 17:32:04 2006 From: ndarray at mac.com (Sasha) Date: Fri Feb 17 17:32:04 2006 Subject: [Numpy-discussion] Behavior of array scalars In-Reply-To: <43F67618.8060501@noaa.gov> References: <43F630F9.9050208@noaa.gov> <43F65627.5080901@ee.byu.edu> <43F67012.5090303@ee.byu.edu> <43F67618.8060501@noaa.gov> Message-ID: On 2/17/06, Christopher Barker wrote: > >>>>x[()] = 10 > > I can't do that. HAs that been added since version: '0.9.2' ? Yes, you need 0.9.4 or later. From ndarray at mac.com Fri Feb 17 17:56:05 2006 From: ndarray at mac.com (Sasha) Date: Fri Feb 17 17:56:05 2006 Subject: [Numpy-discussion] Behavior of array scalars In-Reply-To: <43F66123.204@ee.byu.edu> References: <43F630F9.9050208@noaa.gov> <43F65627.5080901@ee.byu.edu> <43F66123.204@ee.byu.edu> Message-ID: On 2/17/06, Travis Oliphant wrote: > Sasha wrote: > > >Maybe we can change ufunc logic so that when the output argument is > >supplied it is returned without scalar conversion. > > > > > That seems sensible. Attached patch implements this idea. With the patch applied: >>> from numpy import * >>> x = array(5) >>> add(x,5,x) array(10) >>> x+=5 >>> x array(15) The patch passes all the tests, but I would like to hear from others before commit. Personally, I am unhappy that I had to change C-API function. 
-------------- next part -------------- Index: numpy/core/src/ufuncobject.c =================================================================== --- numpy/core/src/ufuncobject.c (revision 2128) +++ numpy/core/src/ufuncobject.c (working copy) @@ -846,7 +846,8 @@ #undef _GETATTR_ static int -construct_matrices(PyUFuncLoopObject *loop, PyObject *args, PyArrayObject **mps) +construct_matrices(PyUFuncLoopObject *loop, PyObject *args, + PyArrayObject **mps, Bool *supplied_output) { int nargs, i, maxsize; int arg_types[MAX_ARGS]; @@ -952,6 +953,7 @@ mps[i] = NULL; continue; } + supplied_output[i] = TRUE; Py_INCREF(mps[i]); if (!PyArray_Check((PyObject *)mps[i])) { PyObject *new; @@ -1000,6 +1002,7 @@ NULL, NULL, 0, 0, NULL); if (mps[i] == NULL) return -1; + supplied_output[i] = FALSE; } /* reset types for outputs that are equivalent @@ -1271,7 +1274,8 @@ } static PyUFuncLoopObject * -construct_loop(PyUFuncObject *self, PyObject *args, PyArrayObject **mps) +construct_loop(PyUFuncObject *self, PyObject *args, + PyArrayObject **mps, Bool* supplied_output) { PyUFuncLoopObject *loop; int i; @@ -1299,7 +1303,7 @@ &(loop->errobj)) < 0) goto fail; /* Setup the matrices */ - if (construct_matrices(loop, args, mps) < 0) goto fail; + if (construct_matrices(loop, args, mps, supplied_output) < 0) goto fail; PyUFunc_clearfperr(); @@ -1381,13 +1385,13 @@ /*UFUNC_API*/ static int PyUFunc_GenericFunction(PyUFuncObject *self, PyObject *args, - PyArrayObject **mps) + PyArrayObject **mps, Bool* supplied_output) { PyUFuncLoopObject *loop; int i; BEGIN_THREADS_DEF - if (!(loop = construct_loop(self, args, mps))) return -1; + if (!(loop = construct_loop(self, args, mps, supplied_output))) return -1; if (loop->notimplemented) {ufuncloop_dealloc(loop); return -2;} LOOP_BEGIN_THREADS @@ -2561,6 +2565,7 @@ PyTupleObject *ret; PyArrayObject *mps[MAX_ARGS]; PyObject *retobj[MAX_ARGS]; + Bool supplied_output[MAX_ARGS]; PyObject *res; PyObject *wrap; int errval; @@ -2569,7 +2574,7 @@ if something goes wrong. */ for(i=0; inargs; i++) mps[i] = NULL; - errval = PyUFunc_GenericFunction(self, args, mps); + errval = PyUFunc_GenericFunction(self, args, mps, supplied_output); if (errval < 0) { for(i=0; inargs; i++) Py_XDECREF(mps[i]); if (errval == -1) @@ -2619,7 +2624,9 @@ continue; } } - retobj[i] = PyArray_Return(mps[j]); + retobj[i] = supplied_output[j] + ? (PyObject *)mps[j] + : PyArray_Return(mps[j]); } if (self->nout == 1) { From oliphant.travis at ieee.org Fri Feb 17 20:11:17 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Fri Feb 17 20:11:17 2006 Subject: [Numpy-discussion] Behavior of array scalars In-Reply-To: References: <43F630F9.9050208@noaa.gov> <43F65627.5080901@ee.byu.edu> <43F66123.204@ee.byu.edu> Message-ID: <43F69E3C.30407@ieee.org> Sasha wrote: >On 2/17/06, Travis Oliphant wrote: > > >>>>from numpy >>>> > >The patch passes all the tests, but I would like to hear from others >before commit. Personally, I am unhappy that I had to change C-API >function. > > Sorry we worked on the same code. I already comitted a changed that solves the problem. It doesn't change the C-API function but instead changes the _find_wrap code which needed changing anyway so that other objects passed in as output arrays work cause the returned object to be obj.__array_wrap__() no matter what the array priority was. Now. a += b doesn't change type no matter what a is. 
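A quick check of the intended behavior (a sketch against fresh SVN):

>>> from numpy import array, add
>>> a = array(5)
>>> add(a, 10, a)        # output argument supplied: no scalar conversion
array(15)
>>> a += 10              # equivalent to add(a, 10, a)
>>> type(a)
<type 'numpy.ndarray'>
>>> a
array(25)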
-Travis From ndarray at mac.com Fri Feb 17 20:23:03 2006 From: ndarray at mac.com (Sasha) Date: Fri Feb 17 20:23:03 2006 Subject: [Numpy-discussion] maxima of masked arrays In-Reply-To: <9348F161-A8D2-4755-9AF5-54968D134FDF@noc.soton.ac.uk> References: <9348F161-A8D2-4755-9AF5-54968D134FDF@noc.soton.ac.uk> Message-ID: On 2/17/06, George Nurser wrote: > But with a masked array there are problems. What you see is a bug in ma. > Are there any workarounds for this? For now you can use >>> ma.maximum.reduce(amask, 0) array(data = [ 9 10 11 12], mask = [False False False False], fill_value=999999) From ndarray at mac.com Fri Feb 17 20:36:04 2006 From: ndarray at mac.com (Sasha) Date: Fri Feb 17 20:36:04 2006 Subject: [Numpy-discussion] Behavior of array scalars In-Reply-To: <43F69E3C.30407@ieee.org> References: <43F630F9.9050208@noaa.gov> <43F65627.5080901@ee.byu.edu> <43F66123.204@ee.byu.edu> <43F69E3C.30407@ieee.org> Message-ID: On 2/17/06, Travis Oliphant wrote: > Sorry we worked on the same code. Not a problem. My code was just a proof of concept anyway. > I already comitted a changed that solves the problem. I've seen your change on the timeline. Thanks for extensive comments. > It doesn't change the C-API function but instead > changes the _find_wrap code which needed changing anyway ... You are right, my patch could be fooled by an output arg with an __array_wrap__. However, I am not sure calling output argument's __array_wrap__ is a good idea: it looks like it may lead to "(a is add(a, 2, a)) == False)" in some circumstances. Another concern is that it looks like what output arguments are supplied is determined twice: in _find_wrap and in construct_matrices. From ndarray at mac.com Fri Feb 17 21:08:01 2006 From: ndarray at mac.com (Sasha) Date: Fri Feb 17 21:08:01 2006 Subject: [Numpy-discussion] maxima of masked arrays In-Reply-To: References: <9348F161-A8D2-4755-9AF5-54968D134FDF@noc.soton.ac.uk> Message-ID: On 2/17/06, Sasha wrote: > On 2/17/06, George Nurser wrote: > > But with a masked array there are problems. > > What you see is a bug in ma. Fixed in SVN. From ndarray at mac.com Sat Feb 18 11:50:08 2006 From: ndarray at mac.com (Sasha) Date: Sat Feb 18 11:50:08 2006 Subject: [Numpy-discussion] A case for rank-0 arrays Message-ID: I have reviewed mailing list discussions of rank-0 arrays vs. scalars and I concluded that the current implementation that contains both is (almost) correct. I will address the "almost" part with a concrete proposal at the end of this post (search for PROPOSALS if you are only interested in the practical part). The main criticism of supporting both scalars and rank-0 arrays is that it is "unpythonic" in the sense that it provides two almost equivalent ways to achieve the same result. However, I am now convinced that this is the case where practicality beats purity. If you take the "one way" rule to it's logical conclusion, you will find that once your language has functions, it does not need numbers or any other data type because they all can be represented by functions (see http://en.wikipedia.org/wiki/Church_numeral). Another example of core python violating the "one way rule" is the presence of scalars and length-1 tuples. In S+, for example, scalars are represented by single element lists. The situation with ndarrays is somewhat similar. A rank-N array is very similar to a function with N arguments, where each argument has a finite domain (i-th domain of a is range(a.shape[i])). 
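In code, the analogy is simply (a small sketch):

>>> from numpy import array
>>> a = array([[1, 2], [3, 4]])    # a "function" f of two arguments
>>> a[1, 0]                        # application: f(1, 0)
3
>>> a[1]                           # fixing the first argument leaves a rank-1 array
array([3, 4])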
A rank-0 array is just a function with no arguments and as such it is quite different from a scalar. Just as a function with no arguments cannot be replaced by a constant in the case when a value returned may change during the run of the program, rank-0 array cannot be replaced by an array scalar because it is mutable. (See http://projects.scipy.org/scipy/numpy/wiki/ZeroRankArray for use cases). Rather than trying to hide rank-0 arrays from the end-user and treat it as an implementation artifact, I believe numpy should emphasize the difference between rank-0 arrays and scalars and have clear rules on when to use what. PROPOSALS ========== Here are three suggestions: 1. Probably the most controversial question is what getitem should return. I believe that most of the confusion comes from the fact that the same syntax implements two different operations: indexing and projection (for the lack of better name). Using the analogy between ndarrays and functions, indexing is just the application of the function to its arguments and projection is the function projection ((f, x) -> lambda (*args): f(x, *args)). The problem is that the same syntax results in different operations depending on the rank of the array. Let >>> x = ones((2,2)) >>> y = ones(2) then x[1] is projection and type(x[1]) is ndarray, but y[1] is indexing and type(y[1]) is int32. Similarly, y[1,...] is indexing, while x[1,...] is projection. I propose to change numpy rules so that if ellipsis is present inside [], the operation is always projection and both y[1,...] and x[1,1,...] return zero-rank arrays. Note that I have previously rejected Francesc's idea that x[...] and x[()] should have different meaning for zero-rank arrays. I was wrong. 2. Another source of ambiguity is the various "reduce" operations such as sum or max. Using the previous example, type(x.sum(axis=0)) is ndarray, but type(y.sum(axis=0)) is int32. I propose two changes: a. Make x.sum(axis) return ndarray unless axis is None, making type(y.sum(axis=0)) is ndarray true in the example. b. Allow axis to be a sequence of ints and make x.sum(axis=range(rank(x))) return rank-0 array according to the rule 2.a above. c. Make x.sum() raise an error for rank-0 arrays and scalars, but allow x.sum(axis=()) to return x. This will make numpy sum consistent with the built-in sum that does not work on scalars. 3. This is a really small change currently >>> empty(()) array(0) but >>> ndarray(()) Traceback (most recent call last): File "", line 1, in ? ValueError: need to give a valid shape as the first argument I propose to make shape=() valid in ndarray constructor. 
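To make these concrete, here is how a session would look if all three
proposals were adopted (hypothetical behavior, not what numpy currently
does):

>>> x = ones((2, 2)); y = ones(2)
>>> y[1]                  # plain indexing, as today: a scalar
1.0
>>> y[1, ...]             # ellipsis present: projection, a rank-0 array
array(1.0)
>>> y.sum(axis=0)         # rule 2.a: a rank-0 array, not a scalar
array(2.0)
>>> x.sum(axis=(0, 1))    # rule 2.b: a sequence of axes, rank-0 result
array(4.0)
>>> ndarray(())           # rule 3: shape=() accepted (contents arbitrary)
array(0.0)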
From tim.hochberg at cox.net Sat Feb 18 17:19:03 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Sat Feb 18 17:19:03 2006
Subject: [Numpy-discussion] optimizing power() for complex and real cases
In-Reply-To:
References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> <43F13B20.3000301@cox.net> <43F13DE0.8040309@bigpond.net.au> <43F167C0.7040806@cox.net> <43F20BFE.5030100@cox.net> <43F36CB6.5050004@cox.net> <43F3B8A9.3000507@bigpond.net.au> <43F3CA12.4000907@cox.net> <43F62069.80209@cox.net>
Message-ID: <43F7C73B.2000806@cox.net>

OK, I now have a fairly clean implementation in C of:

def __pow__(self, p):
    if p is not a scalar:
        return power(self, p)
    elif p == 1:
        return self
    elif p == 2:
        return square(self)
#   elif p == 3:
#       return cube(self)
#   elif p == 4:
#       return power_4(self)
#   elif p == 0:
#       return ones(self.shape, dtype=self.dtype)
#   elif p == -1:
#       return 1.0/self
    elif p == 0.5:
        return sqrt(self)

First a couple of technical questions, then on to the philosophical
portion of this note.

1. Is there a nice fast way to get a matrix filled with ones from C?
I've been tempted to write a ufunc 'ones_like', but I'm afraid that
might be considered inappropriate.

2. Are people aware that array_power is sometimes passed non-arrays as
its first argument? Despite having the signature:

array_power(PyArrayObject *a1, PyObject *o2)

This caused me almost no end of headaches, not to mention crashes during
numpy.test().

I'll check this into the power_optimization branch RSN, hopefully with a
fix for the zero power case. Possibly also after extending it to inplace
power as well.

OK, now on to more important stuff. As I've been playing with this my
opinion has gone in circles a couple of times. I now think the issues of
optimizing integer powers of complex numbers and integer powers of
floats are almost completely different. Because complex powers are quite
slow and relatively inaccurate, it is appropriate to optimize them for
integer powers at the level of nc_pow. This should be just a matter of
liberal borrowing from complexobject.c, but I haven't tried it yet.

On the other hand, real powers are fast enough that doing anything at
the single element level is unlikely to help.
Once we get beyond this basic set, we would need to reach some sort of consensus on how much additional error we are willing to tolerate for optimizing these extra cases. You'll notice that I've changed my mind, yet again, over whether to optimize A**0.5. Since the set of additional ufuncs needed in this case is relatively small, just square and inverse (==1/x), this minimal set works well if optimizing in pow as I've done. That's the state of my thinking on this at this exact moment. I'd appreciate any comments and suggestions you might have. From oliphant.travis at ieee.org Sat Feb 18 17:21:10 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Sat Feb 18 17:21:10 2006 Subject: [Numpy-discussion] storage for records In-Reply-To: <20060217152207.GA971@sun.ac.za> References: <20060216192556.GA20396@alpha> <43F4E947.9070409@ieee.org> <20060217152207.GA971@sun.ac.za> Message-ID: <43F7C7F2.4050600@ieee.org> Stefan van der Walt wrote: >I am probably trying to do something silly, but still: > >In [1]: import numpy as N > >In [2]: N.__version__ >Out[2]: '0.9.6.2127' > >In [3]: P = N.array(N.zeros((2,2)), N.dtype((('f4',3), {'names': ['x','y','z'], 'formats': ['f4','f4','f4']}))) >*** glibc detected *** malloc(): memory corruption: 0x0830bb48 *** >Aborted > >Regards >St?fan > > This code found a bug that's been there for a while in the PyArray_CastTo code (only seen on multiple copies) which is being done here as the 2x2 array of zeros is being cast to a 2x2x3 array of floating-point zeros. The bug should be fixed in SVN, now. Despite the use of fields, the base-type is ('f4',3) which is equivalent to (tack on a 3 to the shape of the array of 'f4'). So, on array creation the fields will be lost and you will get a 2x2x3 array of float32. Types like ('f4', 3) are really only meant to be used in records. If they are used "by themselves" they simply create an array of larger dimension. By the way, the N.dtype in the array constructor is unnecessary as that is essentially what is done to the second argument anyway You can get two different views of the same data (which it seems you are after) like this: P = N.zeros((2,2), {'names': ['x','y','z'], 'formats': ['f4','f4','f4']}) Q = P.view(('f4',3)) Then Q[...,0] = 10 print P['x'] If you want the field to vary in the first dimension, then you really want a FORTRAN array. So, P = N.zeros((2,2), {'names': ['x','y','z'], 'formats': ['f4','f4','f4']},fortran=1) Q = P.view(('f4',3)) Then Q[0] = 20 print P['x'] Best, -Travis From cjw at sympatico.ca Sun Feb 19 12:44:08 2006 From: cjw at sympatico.ca (Colin J. Williams) Date: Sun Feb 19 12:44:08 2006 Subject: [Numpy-discussion] Re: number ranges (was Re: Matlab page on scipy wiki) In-Reply-To: <1139945386.3346.38.camel@pc1.cole.uklinux.net> References: <1139945386.3346.38.camel@pc1.cole.uklinux.net> Message-ID: <43F8D881.5030804@sympatico.ca> Bryan Cole wrote: >> >> >>First, I think the range() function in python is ugly to begin with. >>Why can't python just support range notation directly like 'for a in >>0:10'. Or with 0..10 or 0...10 syntax. That seems to make a lot more >>sense to me than having to call a named function. Anyway, that's a >>python pet peeve, and python's probably not going to change something >>so fundamental... >> >> > >There was a python PEP on this. It got rejected as having too many >'issues'. Pity, in my view. > >see http://www.python.org/peps/pep-0204.html > >BC > > This decision appears to have been made nearly six years ago. 
It would be a good idea to revisit the decision, particularly since the reasons for rejection are not clearly spelled out. The conditional expression (PEP 308) was rejected but is currently being implemented in Python version 2.5. It will have the syntax A if C else B. I have felt, as Gary Ruben says above, that the range structure is ugly. Two alternatives have been suggested: a) a:b:c b) a..b..c Do either of these create parsing problems? for i in a:b: Should this be treated as an error, with a missing c (the increment) print i for i in 2..5: print i We don't know whether the 2 is an integer until the second period is scanned. It would be good if the range and slice could be merged in some way, although the extended slice is rather complicated - I don't understand it. The semantics for an extended slicing are as follows. The primary must evaluate to a mapping object, and it is indexed with a key that is constructed from the slice list, as follows. If the slice list contains at least one comma, the key is a tuple containing the conversion of the slice items; otherwise, the conversion of the lone slice item is the key. The conversion of a slice item that is an expression is that expression. The conversion of an ellipsis slice item is the built-in |Ellipsis| object. The conversion of a proper slice is a slice object (see section section 4.2 The standard type hierarchy ) whose |start|, |stop| and |step| attributes are the values of the expressions given as lower bound, upper bound and stride, respectively, substituting |None| for missing expressions. [source: http://www.network-theory.co.uk/docs/pylang/ref_60.html] The seems to be a bit of a problem with slicing that needs sorting out. The syntax for a slice list appears to allow multiple slices in a list: extended_slicing ::= primary "[" slice_list "]" slice_list ::= slice_item ("," slice_item )* [","] but the current interpreter reports an error: >>> a= range(20) >>> a[slice(3, 9, 2)] [3, 5, 7] >>> a[slice(3, 9, 2), slice(5, 10)] Traceback (most recent call last): File "", line 1, in ? TypeError: list indices must be integers >>> I have taken the liberty of cross posting this to c.l.p. Colin W. From stefan at sun.ac.za Sun Feb 19 13:37:01 2006 From: stefan at sun.ac.za (Stefan van der Walt) Date: Sun Feb 19 13:37:01 2006 Subject: [Numpy-discussion] storage for records In-Reply-To: <43F7C7F2.4050600@ieee.org> References: <20060216192556.GA20396@alpha> <43F4E947.9070409@ieee.org> <20060217152207.GA971@sun.ac.za> <43F7C7F2.4050600@ieee.org> Message-ID: <20060219213542.GA15643@alpha> On Sat, Feb 18, 2006 at 06:20:50PM -0700, Travis Oliphant wrote: > Stefan van der Walt wrote: > > >I am probably trying to do something silly, but still: > > > >In [1]: import numpy as N > > > >In [2]: N.__version__ > >Out[2]: '0.9.6.2127' > > > >In [3]: P = N.array(N.zeros((2,2)), N.dtype((('f4',3), {'names': > >['x','y','z'], 'formats': ['f4','f4','f4']}))) > >*** glibc detected *** malloc(): memory corruption: 0x0830bb48 *** > >Aborted > > > >Regards > >St?fan > > > > > This code found a bug that's been there for a while in the > PyArray_CastTo code (only seen on multiple copies) which is being done > here as the 2x2 array of zeros is being cast to a 2x2x3 array of > floating-point zeros. > > The bug should be fixed in SVN, now. Thank you very much for fixing this! (It works now). > Despite the use of fields, the base-type is ('f4',3) which is equivalent > to (tack on a 3 to the shape of the array of 'f4'). 
So, on array > creation the fields will be lost and you will get a 2x2x3 array of > float32. Types like ('f4', 3) are really only meant to be used in > records. If they are used "by themselves" they simply create an array > of larger dimension. Exactly what I needed for my application! I'll write this up and put it on the wiki. Cheers St?fan From aisaac at american.edu Sun Feb 19 13:39:01 2006 From: aisaac at american.edu (Alan G Isaac) Date: Sun Feb 19 13:39:01 2006 Subject: [Numpy-discussion] Re: number ranges (was Re: Matlab page on scipy wiki) In-Reply-To: <43F8D881.5030804@sympatico.ca> References: <1139945386.3346.38.camel@pc1.cole.uklinux.net><43F8D881.5030804@sympatico.ca> Message-ID: On Sun, 19 Feb 2006, "Colin J. Williams" apparently wrote: > The conditional expression (PEP 308) was rejected but is currently being > implemented in Python version 2.5. It will have the syntax A if C else B. It's coming http://www.python.org/peps/pep-0308.html But in 2.5? http://www.python.org/dev/doc/devel/whatsnew/whatsnew25.html Thank you, Alan Isaac From tim.hochberg at cox.net Sun Feb 19 14:34:02 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Sun Feb 19 14:34:02 2006 Subject: [Numpy-discussion] complex division Message-ID: <43F8F202.80500@cox.net> While rummaging around Python's complexobject.c looking for code to steal for complex power, I came across the following comment relating to complex division: /****************************************************************** This was the original algorithm. It's grossly prone to spurious overflow and underflow errors. It also merrily divides by 0 despite checking for that(!). The code still serves a doc purpose here, as the algorithm following is a simple by-cases transformation of this one: Py_complex r; double d = b.real*b.real + b.imag*b.imag; if (d == 0.) errno = EDOM; r.real = (a.real*b.real + a.imag*b.imag)/d; r.imag = (a.imag*b.real - a.real*b.imag)/d; return r; ******************************************************************/ /* This algorithm is better, and is pretty obvious: first divide the * numerators and denominator by whichever of {b.real, b.imag} has * larger magnitude. The earliest reference I found was to CACM * Algorithm 116 (Complex Division, Robert L. Smith, Stanford * University). As usual, though, we're still ignoring all IEEE * endcases. */ The algorithm shown, and maligned, in this comment is pretty much exactly what is done in numpy at present. The function goes on to use the improved algorithm, which I will include at the bottom of the post. It seems nearly certain that using this algorithm will result in some speed hit, although I'm not certain how much. I will probably try this out at some point and see what the speed hit, but in case I drop the ball I thought I'd throw this out there as something we should at least look at. In most cases, I'll take accuracy over raw speed (within reason). -tim Py_complex r; /* the result */ const double abs_breal = b.real < 0 ? -b.real : b.real; const double abs_bimag = b.imag < 0 ? 
-b.imag : b.imag; if (abs_breal >= abs_bimag) { /* divide tops and bottom by b.real */ if (abs_breal == 0.0) { errno = EDOM; r.real = r.imag = 0.0; } else { const double ratio = b.imag / b.real; const double denom = b.real + b.imag * ratio; r.real = (a.real + a.imag * ratio) / denom; r.imag = (a.imag - a.real * ratio) / denom; } } else { /* divide tops and bottom by b.imag */ const double ratio = b.real / b.imag; const double denom = b.real * ratio + b.imag; assert(b.imag != 0.0); r.real = (a.real * ratio + a.imag) / denom; r.imag = (a.imag * ratio - a.real) / denom; } return r; From charlesr.harris at gmail.com Sun Feb 19 16:13:02 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun Feb 19 16:13:02 2006 Subject: [Numpy-discussion] complex division In-Reply-To: <43F8F202.80500@cox.net> References: <43F8F202.80500@cox.net> Message-ID: Hmm... The new algorithm does look better with respect to overflow and underflow, but I wonder if it is not a bit of overkill. It seems to me that the same underflow/overflow problems attend complex multiplication, which is pretty much all that goes on in the original algorithm. One thing I do know is that division is expensive. I wonder if one division and two multiplications might be cheaper than two divisions. I'll have to check that out. Chuck On 2/19/06, Tim Hochberg wrote: > > While rummaging around Python's complexobject.c looking for code to > steal for complex power, I came across the following comment relating to > complex division: > > /****************************************************************** > This was the original algorithm. It's grossly prone to spurious > overflow and underflow errors. It also merrily divides by 0 despite > checking for that(!). The code still serves a doc purpose here, as > the algorithm following is a simple by-cases transformation of this > one: > > Py_complex r; > double d = b.real*b.real + b.imag*b.imag; > if (d == 0.) > errno = EDOM; > r.real = (a.real*b.real + a.imag*b.imag)/d; > r.imag = (a.imag*b.real - a.real*b.imag)/d; > return r; > ******************************************************************/ > > /* This algorithm is better, and is pretty obvious: first > divide the > * numerators and denominator by whichever of {b.real, b.imag} has > * larger magnitude. The earliest reference I found was to CACM > * Algorithm 116 (Complex Division, Robert L. Smith, Stanford > * University). As usual, though, we're still ignoring all IEEE > * endcases. > */ > > The algorithm shown, and maligned, in this comment is pretty much > exactly what is done in numpy at present. The function goes on to use > the improved algorithm, which I will include at the bottom of the post. > It seems nearly certain that using this algorithm will result in some > speed hit, although I'm not certain how much. I will probably try this > out at some point and see what the speed hit, but in case I drop the > ball I thought I'd throw this out there as something we should at least > look at. In most cases, I'll take accuracy over raw speed (within reason). > > -tim > > > > > > > > Py_complex r; /* the result */ > const double abs_breal = b.real < 0 ? -b.real : b.real; > const double abs_bimag = b.imag < 0 ? 
-b.imag : b.imag;
>
> if (abs_breal >= abs_bimag) {
> /* divide tops and bottom by b.real */
> if (abs_breal == 0.0) {
> errno = EDOM;
> r.real = r.imag = 0.0;
> }
> else {
> const double ratio = b.imag / b.real;
> const double denom = b.real + b.imag * ratio;
> r.real = (a.real + a.imag * ratio) / denom;
> r.imag = (a.imag - a.real * ratio) / denom;
> }
> }
> else {
> /* divide tops and bottom by b.imag */
> const double ratio = b.real / b.imag;
> const double denom = b.real * ratio + b.imag;
> assert(b.imag != 0.0);
> r.real = (a.real * ratio + a.imag) / denom;
> r.imag = (a.imag * ratio - a.real) / denom;
> }
> return r;

From cookedm at physics.mcmaster.ca Sun Feb 19 16:25:02 2006
From: cookedm at physics.mcmaster.ca (David M. Cooke)
Date: Sun Feb 19 16:25:02 2006
Subject: [Numpy-discussion] complex division
In-Reply-To: <43F8F202.80500@cox.net>
References: <43F8F202.80500@cox.net>
Message-ID: <20060220002319.GA15783@arbutus.physics.mcmaster.ca>

On Sun, Feb 19, 2006 at 03:32:34PM -0700, Tim Hochberg wrote:
>
> While rummaging around Python's complexobject.c looking for code to
> steal for complex power, I came across the following comment relating to
> complex division:
>
> /******************************************************************
> This was the original algorithm. It's grossly prone to spurious
> overflow and underflow errors. It also merrily divides by 0 despite
> checking for that(!). The code still serves a doc purpose here, as
> the algorithm following is a simple by-cases transformation of this
> one:
>
> Py_complex r;
> double d = b.real*b.real + b.imag*b.imag;
> if (d == 0.)
> errno = EDOM;
> r.real = (a.real*b.real + a.imag*b.imag)/d;
> r.imag = (a.imag*b.real - a.real*b.imag)/d;
> return r;
> ******************************************************************/
>
> /* This algorithm is better, and is pretty obvious: first
> divide the
> * numerators and denominator by whichever of {b.real, b.imag} has
> * larger magnitude. The earliest reference I found was to CACM
> * Algorithm 116 (Complex Division, Robert L. Smith, Stanford
> * University). As usual, though, we're still ignoring all IEEE
> * endcases.
> */
>
> The algorithm shown, and maligned, in this comment is pretty much
> exactly what is done in numpy at present. The function goes on to use
> the improved algorithm, which I will include at the bottom of the post.
> It seems nearly certain that using this algorithm will result in some
> speed hit, although I'm not certain how much. I will probably try this
> out at some point and see what the speed hit, but in case I drop the
> ball I thought I'd throw this out there as something we should at least
> look at. In most cases, I'll take accuracy over raw speed (within reason).
>
> -tim

The condition for accuracy on this is

|Z - z| < epsilon |z|

where I'm using Z for the computed value of z=a/b, and epsilon is on the
order of machine accuracy. As pointed out by Stewart (ACM TOMS, v.
11, pg 238 (1985)), this doesn't mean that the real and imaginary components are accurate. The example he gives is a = 1e70 + 1e-70i and b=1e56+1e-56i, where z=a/b=1e14 + 1e-99i, which is susceptible to underflow for a machine with 10 decimal digits and a exponent range of +-99. Priest (ACM TOMS v30, pg 389 (2004)) gives an alternative, which I won't show here, b/c it does bit-twiddling with the double representation. But it does a better job of handling overflow and underflow in intermediate calculations, is competitive in terms of accuracy, and is faster (at least on a 750 MHz UltraSPARC-III ;) than the other algorithms except for the textbook version. One problem is the sample code is for double precision; for single or longdouble, we'd have to figure out some magic constants. Maybe I'll look into it later, but for now Smith's algorithm is better than the textbook one we were using :-) -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From cookedm at physics.mcmaster.ca Sun Feb 19 16:39:00 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Sun Feb 19 16:39:00 2006 Subject: [Numpy-discussion] optimizing power() for complex and real cases In-Reply-To: <43F7C73B.2000806@cox.net> References: <43F20BFE.5030100@cox.net> <43F36CB6.5050004@cox.net> <43F3B8A9.3000507@bigpond.net.au> <43F3CA12.4000907@cox.net> <43F62069.80209@cox.net> <43F7C73B.2000806@cox.net> Message-ID: <20060220003714.GB15783@arbutus.physics.mcmaster.ca> On Sat, Feb 18, 2006 at 06:17:47PM -0700, Tim Hochberg wrote: > > OK, I now have a faily clean implementation in C of: > > def __pow__(self, p): > if p is not a scalar: > return power(self, p) > elif p == 1: > return p > elif p == 2: > return square(self) > # elif p == 3: > # return cube(self) > # elif p == 4: > # return power_4(self) > # elif p == 0: > # return ones(self.shape, dtype=self.dtype) > # elif p == -1: > # return 1.0/self > elif p == 0.5: > return sqrt(self) > > > First a couple of technical questions, then on to the philosophical portion > of this note. > > 1. Is there a nice fast way to get a matrix filled with ones from C. I've > been tempted to write a ufunc 'ones_like', but I'm afraid that might be > considered inappropriate. > > 2. Are people aware that array_power is sometimes passed non arrays as its > first argument? Despite having the signature: > > array_power(PyArrayObject *a1, PyObject *o2) > > This caused me almost no end of headaches, not to mention crashes during > numpy.test(). Yes; because it's the implementation of __pow__, the second argument can be anything. > I'll check this into the power_optimization branch RSN, hopefully with a > fix for the zero power case. Possibly also after extending it to inplace > power as well. > > > OK, now on to more important stuff. As I've been playing with this my > opinion has gone in circles a couple of times. I now think the issue of > optimizing integer powers of complex numbers and integer powers of floats > are almost completely different. Because complex powers are quite slow and > relatively inaccurate, it is appropriate to optimize them for integer > powers at the level of nc_pow. This should be just a matter of liberal > borrowing from complexobject.c, but I haven't tried it yet. Ok. > On the other hand, real powers are fast enough that doing anything at the > single element level is unlikely to help. 
So in that case we're left with > either optimizing the cases where the dimension is zero as David has done, > or optimizing at the __pow__ (AKA array_power) level as I've done now based > on David's original suggestion. This second approach is faster because it > avoids the mysterious python scalar -> zero-D array conversion overhead. > However, it suffers if we want to optimize lots of different powers since > one needs a ufunc for each one. So the question becomes, which powers > should we optimize? Hmm, ufuncs are passed a void* argument for passing info to them. Now, what that argument is defined when the ufunc is created, but maybe there's a way to piggy-back on it. > My latest thinking on this is that we should optimize only those cases > where the optimized result is no less accurate than that produced by pow. > I'm going to assume that all C operations are equivalently accurate, so > pow(x,2) has roughly the same amount of error as x*x. (Something on the > order of 0.5 ULP I'd guess). In that case: > pow(x, -1) -> 1 / x > pow(x, 0) -> 1 > pow(x, 0.5) -> sqrt(x) > pow(x, 1) -> x > pow(x, 2) -> x*x > can all be implemented in terms of multiply or divide with the same > accuracy as the original power methods. Once we get beyond these, the error > will go up progressively. > > The minimal set described above seems like it should be relatively > uncontroversial and it's what I favor. Once we get beyond this basic set, > we would need to reach some sort of consensus on how much additional error > we are willing to tolerate for optimizing these extra cases. You'll notice > that I've changed my mind, yet again, over whether to optimize A**0.5. > Since the set of additional ufuncs needed in this case is relatively small, > just square and inverse (==1/x), this minimal set works well if optimizing > in pow as I've done. Ok. I'm still not happy with the speed of pow(), though. I'll have to sit and look at. We may be able to optimize integer powers better. And there's another area: integer powers of integers. Right now that uses pow(), whereas we might be able to do better. I'm looking into that. A framework for that could be helpful for the complex powers too. Too bad we couldn't make a function generator :-) [Well, we could using weave...] -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From cjw at sympatico.ca Sun Feb 19 17:46:03 2006 From: cjw at sympatico.ca (Colin J. Williams) Date: Sun Feb 19 17:46:03 2006 Subject: [Numpy-discussion] number ranges In-Reply-To: References: <1139945386.3346.38.camel@pc1.cole.uklinux.net><43F8D881.5030804@sympatico.ca> Message-ID: <43F91F2D.6070208@sympatico.ca> An HTML attachment was scrubbed... URL: From mahmood.tariq at gmail.com Sun Feb 19 17:58:03 2006 From: mahmood.tariq at gmail.com (Tariq Mahmood) Date: Sun Feb 19 17:58:03 2006 Subject: [Numpy-discussion] numpy 0.9.4 on cygwin Message-ID: Hi, Has anyone been successful at installing numpy 0.9.4 on cygwin? Details: os.name: posix uname -a: CYGWIN_NT-5.1 sys.platform: cygwin sys.version: 2.4.1 numpy.version: 0.9.4 Steps taken: 1. unpacked numpy-0.9.4.tar.gz 2. changed to numpy-0.9.4 directory 3. python setup.py install Major error messages while compiling C source: 1. undefined reference to `_feclearexcept' in umathmodule 2. 
Command "gcc -shared -Wl,--enable-auto-image-base build/temp.cygwin-1.5.18-i686-2.4/build/src/numpy/core/src/umathmodule.o -L/usr/lib/python2.4/config -lpython2.4 -o build/lib.cygwin-1.5.18-i686-2.4/numpy/core/umath.dll" failed with exit status 1 Other information that might be useful: 1. successfully installed both Numeric 24.2 and numarray 1.5.0 Any help would be appreciated. Tariq From rhl at astro.princeton.edu Sun Feb 19 18:14:01 2006 From: rhl at astro.princeton.edu (Robert Lupton) Date: Sun Feb 19 18:14:01 2006 Subject: [Numpy-discussion] Multiple inheritance from ndarray Message-ID: <5809AC56-B2DF-4403-B7BC-9AEEAAC78505@astro.princeton.edu> I have a swig extension that defines a class that inherits from both a personal C-coded image struct (actImage), and also from Numeric's UserArray. This works very nicely, but I thought that it was about time to upgrade to numpy. The code looks like: from UserArray import * class Image(UserArray, actImage): def __init__(self, *args): actImage.__init__(self, *args) UserArray.__init__(self, self.getArray(), 'd', copy=False, savespace=False) I can't figure out how to convert this to use ndarray, as ndarray doesn't seem to have an __init__ method, merely a __new__. So what's the approved numpy way to handle multiple inheritance? I've a nasty idea that this is a python question that I should know the answer to, but I'm afraid that I don't... R From tim.hochberg at cox.net Sun Feb 19 19:36:12 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Sun Feb 19 19:36:12 2006 Subject: [Numpy-discussion] optimizing power() for complex and real cases In-Reply-To: <20060220003714.GB15783@arbutus.physics.mcmaster.ca> References: <43F20BFE.5030100@cox.net> <43F36CB6.5050004@cox.net> <43F3B8A9.3000507@bigpond.net.au> <43F3CA12.4000907@cox.net> <43F62069.80209@cox.net> <43F7C73B.2000806@cox.net> <20060220003714.GB15783@arbutus.physics.mcmaster.ca> Message-ID: <43F938FA.80200@cox.net> David M. Cooke wrote: >On Sat, Feb 18, 2006 at 06:17:47PM -0700, Tim Hochberg wrote: > > >>OK, I now have a faily clean implementation in C of: >> >>def __pow__(self, p): >> if p is not a scalar: >> return power(self, p) >> elif p == 1: >> return p >> elif p == 2: >> return square(self) >># elif p == 3: >># return cube(self) >># elif p == 4: >># return power_4(self) >># elif p == 0: >># return ones(self.shape, dtype=self.dtype) >># elif p == -1: >># return 1.0/self >> elif p == 0.5: >> return sqrt(self) >> >> >>First a couple of technical questions, then on to the philosophical portion >>of this note. >> >>1. Is there a nice fast way to get a matrix filled with ones from C. I've >>been tempted to write a ufunc 'ones_like', but I'm afraid that might be >>considered inappropriate. >> >>2. Are people aware that array_power is sometimes passed non arrays as its >>first argument? Despite having the signature: >> >>array_power(PyArrayObject *a1, PyObject *o2) >> >>This caused me almost no end of headaches, not to mention crashes during >>numpy.test(). >> >> > >Yes; because it's the implementation of __pow__, the second argument can >be anything. > > No, you misunderstand.. What I was talking about was that the *first* argument can also be something that's not a PyArrayObject, despite the functions signature. > > >>I'll check this into the power_optimization branch RSN, hopefully with a >>fix for the zero power case. Possibly also after extending it to inplace >>power as well. >> >> >>OK, now on to more important stuff. 
As I've been playing with this my >>opinion has gone in circles a couple of times. I now think the issue of >>optimizing integer powers of complex numbers and integer powers of floats >>are almost completely different. Because complex powers are quite slow and >>relatively inaccurate, it is appropriate to optimize them for integer >>powers at the level of nc_pow. This should be just a matter of liberal >>borrowing from complexobject.c, but I haven't tried it yet. >> >> > >Ok. > > > >>On the other hand, real powers are fast enough that doing anything at the >>single element level is unlikely to help. So in that case we're left with >>either optimizing the cases where the dimension is zero as David has done, >>or optimizing at the __pow__ (AKA array_power) level as I've done now based >>on David's original suggestion. This second approach is faster because it >>avoids the mysterious python scalar -> zero-D array conversion overhead. >>However, it suffers if we want to optimize lots of different powers since >>one needs a ufunc for each one. So the question becomes, which powers >>should we optimize? >> >> > >Hmm, ufuncs are passed a void* argument for passing info to them. Now, >what that argument is defined when the ufunc is created, but maybe >there's a way to piggy-back on it. > > Yeah, I really felt like I was fighting the ufuncs when I was playing with this. On the one hand, you really want to use the ufunc machinery. On the other hand that forces you into using the same types for both arguments. That really wouldn't be a problem, since we could just define an integer_power that took doubles, but did integer powers, except for the conversion overhead of Python_Integers into arrays. It looks like you started down this road and I played with this as well. I can think a of at least one (horrible) way around the matrix overhead, but the real fix would be to dig into PyArray_EnsureArray and see why it's slow for Python_Ints. It is much faster for numarray scalars. Another approach is to actually compute (x*x)*(x*x) for pow(x,4) at the level of array_power. I think I could make this work. It would probably work well for medium size arrays, but might well make things worse for large arrays that are limited by memory bandwidth since it would need to move the array from memory into the cache multiple times. >>My latest thinking on this is that we should optimize only those cases >>where the optimized result is no less accurate than that produced by pow. >>I'm going to assume that all C operations are equivalently accurate, so >>pow(x,2) has roughly the same amount of error as x*x. (Something on the >>order of 0.5 ULP I'd guess). In that case: >> pow(x, -1) -> 1 / x >> pow(x, 0) -> 1 >> pow(x, 0.5) -> sqrt(x) >> pow(x, 1) -> x >> pow(x, 2) -> x*x >>can all be implemented in terms of multiply or divide with the same >>accuracy as the original power methods. Once we get beyond these, the error >>will go up progressively. >> >>The minimal set described above seems like it should be relatively >>uncontroversial and it's what I favor. Once we get beyond this basic set, >>we would need to reach some sort of consensus on how much additional error >>we are willing to tolerate for optimizing these extra cases. You'll notice >>that I've changed my mind, yet again, over whether to optimize A**0.5. >>Since the set of additional ufuncs needed in this case is relatively small, >>just square and inverse (==1/x), this minimal set works well if optimizing >>in pow as I've done. 
>> >> Just to add a little more confusion to the mix. I did a little testing to see how close pow(x,n) and x*x*... actually are. They are slightly less close for small values of N and slightly closer for large values of N than I would have expected. The upshot of this is that integer powers between -2 and +4 all seem to vary by the same amount when computed using pow(x,n) versus multiplies. I'm including the test code at the end. Assuming that this result is not a fluke that expands the noncontroversial set by at least 3 more values. That's starting to strain the ufunc aproach, so perhaps optimizing in @TYP at _power is the way to go after all. Or, more likely, adding @TYP at _int_power or maybe @TYP at _fast_power (so as to be able to include some half integer powers) and dispatching appropriately from array_power. The problem here, of course, is the overhead that PyArray_EnsureArray runs into. I'm not sure if the ufuncs actually call that, but I was using that to convert things to arrays at one point and I saw the slowdown, so I suspect that the slowdown is in something PyArray_EnsureArray calls if not in that routine itself. I'm afraid to dig into that stuff though.. On the other hand, it would probably speed up all kinds of stuff if that was sped up. > >Ok. I'm still not happy with the speed of pow(), though. I'll have to >sit and look at. We may be able to optimize integer powers better. >And there's another area: integer powers of integers. Right now that >uses pow(), whereas we might be able to do better. > So that's why that's so slow. I assumed it was doing some sort of sucessive multiplication. For this, the code that complexobject uses for integer powers might be helpful. >I'm looking into >that. A framework for that could be helpful for the complex powers too. > >Too bad we couldn't make a function generator :-) [Well, we could using >weave...]\ > > Yaigh! -tim def check(n=100000): import math sqrt = math.sqrt failures = {} for x in [math.pi, math.e, 1.1]+[1.0+1.0/y for y in range(1,1+n)]: for e, expr in [ (-5, "1/((x*x)*(x*x)*x)"), (-4,"1/((x*x)*(x*x))"), (3, "1/((x*x)*x)"), (1, "x"), (-2, "1/(x*x)"), (-1,"1/x"), (0, "1"), (1, "x"), (2, "x*x"), (3, "x*x*x"), (4, "(x*x)*(x*x)"), (4, "x*x*x*x"), (5, '(x*x)*(x*x)*x'), (6, '(x*x)*(x*x)*(x*x)'), (7, '(x*x)*(x*x)*(x*x)*x'), (8, '((x*x)*(x*x))*((x*x)*(x*x))'), (-1.5, "1/(sqrt(x)*x)"), (-0.5, "1/sqrt(x)"), (0.5, "sqrt(x)"), (1.5, "x*sqrt(x)")]: delta = abs(pow(x,e) - eval(expr, locals())) / pow(x,e) if delta: key = (e, expr) if key not in failures: failures[key] = [(delta, x)] failures[key].append((delta, x)) for key in sorted(failures.keys()): e, expr = key fstring = ', '.join(str(x) for x in list(reversed(sorted(failures[key])))[:1]) if len(failures[key]) > 1: fstring += ', ...' print "Failures for x**%s (%s): %s" % (e, expr, fstring) From ndarray at mac.com Sun Feb 19 19:38:02 2006 From: ndarray at mac.com (Sasha) Date: Sun Feb 19 19:38:02 2006 Subject: [Numpy-discussion] What is the status of the multidimensional arrays PEP? Message-ID: What is the status of the multidimensional arrays PEP? It seems to me that there is one part of the PEP that can be easily separated into a rather uncontroversial PEP. This is the part that defines array protocol. Python already has a (1-dimensional) array object in the standard library. Python array already supports buffer protocol and it looks like implementing full array protocol is straightforward. 
I believe that having an object that supports array protocol even without multiple dimensions will be immediately useful. From wbaxter at gmail.com Sun Feb 19 22:13:05 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Sun Feb 19 22:13:05 2006 Subject: [Numpy-discussion] Some missing linalg things (wanted: LU decomposition) Message-ID: This url http://www.rexx.com/~dkuhlman/scipy_course_01.html seems to keep turning up in my searches for numpy and scipy things, but many of the linalg operations it lists don't seem to exist in recent versions of numpy (or scipy). Some of them are: * norm * factorizations: lu, lu_factor, lu_solve, qr * iterative solvers: cg, cgs, gmres etc. Did these things used to exist in Numeric but they haven't been ported over? Will they be re-introduced sometime? In the short term, the one I'm after right now is LU decompose and solve functionality. Anyone have a numpy implementation? --Bill Baxter -------------- next part -------------- An HTML attachment was scrubbed... URL: From wbaxter at gmail.com Sun Feb 19 22:37:02 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Sun Feb 19 22:37:02 2006 Subject: [Numpy-discussion] Re: Some missing linalg things (wanted: LU decomposition) In-Reply-To: References: Message-ID: Upon further inspection I find that if I call 'from scipy import *' then linalg.lu etc are defined. But if I do anything else to import scipy like 'import scipy' or 'import scipy as S' or 'from scipy import linalg', then lu, cg etc are not defined. Why is that? I can get at them without importing * by doing 'from scipy.linalg import lu', but that's kind of odd to have to do that. --bb On 2/20/06, Bill Baxter wrote: > > This url http://www.rexx.com/~dkuhlman/scipy_course_01.htmlseems to keep turning up in my searches for numpy and scipy things, > but many of the linalg operations it lists don't seem to exist in recent > versions of numpy (or scipy). > > Some of them are: > > * norm > * factorizations: lu, lu_factor, lu_solve, qr > * iterative solvers: cg, cgs, gmres etc. > > Did these things used to exist in Numeric but they haven't been ported > over? Will they be re-introduced sometime? > > In the short term, the one I'm after right now is LU decompose and solve > functionality. Anyone have a numpy implementation? > > --Bill Baxter > -- William V. Baxter III OLM Digital Kono Dens Building Rm 302 1-8-8 Wakabayashi Setagaya-ku Tokyo, Japan 154-0023 +81 (3) 3422-3380 -------------- next part -------------- An HTML attachment was scrubbed... URL: From wbaxter at gmail.com Sun Feb 19 22:49:05 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Sun Feb 19 22:49:05 2006 Subject: [Numpy-discussion] Re: Some missing linalg things (wanted: LU decomposition) In-Reply-To: References: Message-ID: Ack. I may be able to get references to lu, lu_factor, et al, but they don't actually work with numpy arrays: from scipy.linalg import lu,lu_factor,lu_solve import scipy as S A = S.rand(2,2) lu(A) Traceback (most recent call last): File "", line 1, in ? File "C:\Python24\Lib\site-packages\scipy\linalg\decomp.py", line 249, in lu flu, = get_flinalg_funcs(('lu',),(a1,)) File "C:\Python24\Lib\site-packages\scipy\linalg\flinalg.py", line 30, in get_flinalg_funcs t = arrays[i].dtypechar AttributeError: 'numpy.ndarray' object has no attribute 'dtypechar' Ok, so, once again, does anyone have an lu_factor / lu_solve implementation in python that I could borrow? Apologies for the monologue. 
--bb On 2/20/06, Bill Baxter wrote: > > Upon further inspection I find that if I call 'from scipy import *' then > linalg.lu etc are defined. > But if I do anything else to import scipy like 'import scipy' or 'import > scipy as S' or 'from scipy import linalg', then lu, cg etc are not defined. > > > Why is that? > > I can get at them without importing * by doing 'from scipy.linalg import > lu', but that's kind of odd to have to do that. > > --bb > > On 2/20/06, Bill Baxter wrote: > > > > This url http://www.rexx.com/~dkuhlman/scipy_course_01.htmlseems to keep turning up in my searches for numpy and scipy things, > > but many of the linalg operations it lists don't seem to exist in recent > > versions of numpy (or scipy). > > > > Some of them are: > > > > * norm > > * factorizations: lu, lu_factor, lu_solve, qr > > * iterative solvers: cg, cgs, gmres etc. > > > > Did these things used to exist in Numeric but they haven't been ported > > over? Will they be re-introduced sometime? > > > > In the short term, the one I'm after right now is LU decompose and solve > > functionality. Anyone have a numpy implementation? > > > > --Bill Baxter > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wbaxter at gmail.com Sun Feb 19 23:38:08 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Sun Feb 19 23:38:08 2006 Subject: [Numpy-discussion] Re: Some missing linalg things (wanted: LU decomposition) In-Reply-To: <43F96736.3020003@mecha.uni-stuttgart.de> References: <43F96736.3020003@mecha.uni-stuttgart.de> Message-ID: Should have mentioned -- I was using numpy 0.9.4 / scipy 0.4.4. Looks like it works in numpy 0.9.5 / scipy 0.4.6 But matplotlib, which I also need, hasn't been updated for numpy 0.9.5 yet. :-( It's also still pretty weird to me that you have to do "from scipy.linalgimport lu" specifically. And then after doing that one import, then all the other scipy.linalg.* functions magically spring into existence too. Is that sort of hing expected behavior from Python imports? >>> import numpy as N >>> import scipy as S >>> S.linalg.lu Traceback (most recent call last): File "", line 1, in ? AttributeError: 'module' object has no attribute 'lu' >>> from scipy.linalg import lu >>> S.linalg.lu(N.rand(2,2)) (array([[ 0., 1.], [ 1., 0.]]), array([[ 1. , 0. ], [ 0.18553085, 1. ]]), array([[ 0.71732168, 0.48540043], [ 0. , 0.61379118]])) >>> (N.__version__, S.__version__) ('0.9.5', '0.4.6') --bb On 2/20/06, Nils Wagner wrote: > > Bill Baxter wrote: > > Ack. I may be able to get references to lu, lu_factor, et al, but > > they don't actually work with numpy arrays: > > > > from scipy.linalg import lu,lu_factor,lu_solve > > import scipy as S > > A = S.rand(2,2) > > lu(A) > > Traceback (most recent call last): > > File "", line 1, in ? > > File "C:\Python24\Lib\site-packages\scipy\linalg\decomp.py", line > > 249, in lu > > flu, = get_flinalg_funcs(('lu',),(a1,)) > > File "C:\Python24\Lib\site-packages\scipy\linalg\flinalg.py", line > > 30, in get_flinalg_funcs > > t = arrays[i].dtypechar > > AttributeError: 'numpy.ndarray' object has no attribute 'dtypechar' > > > > > > Ok, so, once again, does anyone have an lu_factor / lu_solve > > implementation in python that I could borrow? > > > > Apologies for the monologue. > > > > --bb > > > > > > On 2/20/06, *Bill Baxter* > > wrote: > > > > Upon further inspection I find that if I call 'from scipy import > > *' then linalg.lu etc are defined. 
> > But if I do anything else to import scipy like 'import scipy' or > > 'import scipy as S' or 'from scipy import linalg', then lu, cg etc > > are not defined. > > > > Why is that? > > > > I can get at them without importing * by doing 'from scipy.linalg > > import lu', but that's kind of odd to have to do that. > > > > --bb > > > > > > On 2/20/06, * Bill Baxter* > > wrote: > > > > This url http://www.rexx.com/~dkuhlman/scipy_course_01.html > > seems > > to keep turning up in my searches for numpy and scipy things, > > but many of the linalg operations it lists don't seem to exist > > in recent versions of numpy (or scipy). > > > > Some of them are: > > > > * norm > > * factorizations: lu, lu_factor, lu_solve, qr > > * iterative solvers: cg, cgs, gmres etc. > > > > Did these things used to exist in Numeric but they haven't > > been ported over? Will they be re-introduced sometime? > > > > In the short term, the one I'm after right now is LU decompose > > and solve functionality. Anyone have a numpy implementation? > > > > --Bill Baxter > > > No problem here. > > >>> from scipy.linalg import lu,lu_factor,lu_solve > >>> import scipy as S > >>> A = S.rand(2,2) > >>> lu(A) > (array([[ 0., 1.], > [ 1., 0.]]), array([[ 1. , 0. ], > [ 0.81367315, 1. ]]), array([[ 0.49886054, 0.57065709], > [ 0. , -0.30862809]])) > >>> S.__version__ > '0.4.7.1614' > > > Nils > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josegomez at gmx.net Mon Feb 20 00:19:01 2006 From: josegomez at gmx.net (Jose Luis Gomez Dans) Date: Mon Feb 20 00:19:01 2006 Subject: [Numpy-discussion] Problems compiling on Cygwin References: <43F5B856.5060301@ieee.org> Message-ID: <18838.1140423471@www073.gmx.net> Hi Travis, > I looked in to how people at cygwin ports got the IEEE math stuff done. > They borrowed it from BSD basically. So, I've taken their patch and > placed it in the main tree. > > Jose, could you check out the latest SVN version of numpy and try to > build and install it on cygwin to see if I made the right changes? I can?t access remote SVNs/CVSs servers from work, but I will download it at home and try it tonight. Many thanks! Jose -- Lust, ein paar Euro nebenbei zu verdienen? Ohne Kosten, ohne Risiko! Satte Provisionen f?r GMX Partner: http://www.gmx.net/de/go/partner From curzio.basso at gmail.com Mon Feb 20 08:49:23 2006 From: curzio.basso at gmail.com (Curzio Basso) Date: Mon Feb 20 08:49:23 2006 Subject: [Numpy-discussion] [nd_image] histograms of RGB images Message-ID: Hello everybody! I was wondering if someone already had the problem of computing histograms of RGB images for all channels simultaneously (that is getting a rank-3 array) rather than on the three channels separately. Just looking for a way to avoid writing the C function :-) cheers curzio From fonnesbeck at gmail.com Mon Feb 20 11:28:06 2006 From: fonnesbeck at gmail.com (Chris Fonnesbeck) Date: Mon Feb 20 11:28:06 2006 Subject: [Numpy-discussion] selecting random array element Message-ID: <723eb6930602201101v2b8ebf9axd5801fe8b60998b3@mail.gmail.com> What is the best way to select a random element from a numpy array? I know I could index by a random integer, but was wondering if there was a built-in method or function. Thanks, C. 
-- Chris Fonnesbeck + Atlanta, GA + http://trichech.us From aisaac at american.edu Mon Feb 20 11:44:02 2006 From: aisaac at american.edu (Alan G Isaac) Date: Mon Feb 20 11:44:02 2006 Subject: [Numpy-discussion] selecting random array element In-Reply-To: <723eb6930602201101v2b8ebf9axd5801fe8b60998b3@mail.gmail.com> References: <723eb6930602201101v2b8ebf9axd5801fe8b60998b3@mail.gmail.com> Message-ID: At http://www.american.edu/econ/pytrix/pytrix.py find def permute(x): '''Return a permutation of a sequence or array. :note: Also consider numpy.random.shuffle (to permute *inplace* 1-d arrays) ''' x = numpy.asarray(x) xshape = x.shape pidx = numpy.random.random(x.size).argsort() return x.flat[pidx].reshape(xshape) Note the note. ;-) Cheers, Alan Isaac From strawman at astraw.com Mon Feb 20 11:59:02 2006 From: strawman at astraw.com (Andrew Straw) Date: Mon Feb 20 11:59:02 2006 Subject: [Numpy-discussion] [nd_image] histograms of RGB images In-Reply-To: References: Message-ID: <43FA1F58.1030204@astraw.com> See this example. You don't really need pylab/matplotlib -- you could just use numpy.histogram. import pylab import matplotlib.numerix as nx import Image im = Image.open('data/lena.jpg') imbuf = im.tostring('raw','RGB',0,-1) imnx = nx.fromstring(imbuf,nx.UInt8) imnx.shape = im.size[1], im.size[0], 3 bins = nx.arange(0,256) pylab.hist( nx.ravel(imnx[:,:,0]), bins=bins, facecolor='r', edgecolor='r' ) pylab.hist( nx.ravel(imnx[:,:,1]), bins=bins, facecolor='g', edgecolor='g' ) pylab.hist( nx.ravel(imnx[:,:,2]), bins=bins, facecolor='b', edgecolor='b' ) pylab.show() Curzio Basso wrote: >Hello everybody! > >I was wondering if someone already had the problem of computing >histograms of RGB images for all channels simultaneously (that is >getting a rank-3 array) rather than on the three channels separately. >Just looking for a way to avoid writing the C function :-) > >cheers >curzio > > >------------------------------------------------------- >This SF.net email is sponsored by: Splunk Inc. Do you grep through log files >for problems? Stop! Download the new AJAX search engine that makes >searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! >http://sel.as-us.falkag.net/sel?cmd=k&kid3432&bid#0486&dat1642 >_______________________________________________ >Numpy-discussion mailing list >Numpy-discussion at lists.sourceforge.net >https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > From robert.kern at gmail.com Mon Feb 20 14:42:06 2006 From: robert.kern at gmail.com (Robert Kern) Date: Mon Feb 20 14:42:06 2006 Subject: [Numpy-discussion] Re: selecting random array element In-Reply-To: References: <723eb6930602201101v2b8ebf9axd5801fe8b60998b3@mail.gmail.com> Message-ID: Alan G Isaac wrote: > At http://www.american.edu/econ/pytrix/pytrix.py find > def permute(x): > '''Return a permutation of a sequence or array. > > :note: Also consider numpy.random.shuffle > (to permute *inplace* 1-d arrays) > ''' > x = numpy.asarray(x) > xshape = x.shape > pidx = numpy.random.random(x.size).argsort() > return x.flat[pidx].reshape(xshape) You may want to consider numpy.random.permutation() In [22]: numpy.random.permutation? Type: builtin_function_or_method Base Class: String Form: Namespace: Interactive Docstring: Given an integer, return a shuffled sequence of integers >= 0 and < x; given a sequence, return a shuffled array copy. permutation(x) -- Robert Kern robert.kern at gmail.com "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." 
-- Richard Harter From robert.kern at gmail.com Mon Feb 20 15:47:01 2006 From: robert.kern at gmail.com (Robert Kern) Date: Mon Feb 20 15:47:01 2006 Subject: [Numpy-discussion] Re: selecting random array element In-Reply-To: <723eb6930602201101v2b8ebf9axd5801fe8b60998b3@mail.gmail.com> References: <723eb6930602201101v2b8ebf9axd5801fe8b60998b3@mail.gmail.com> Message-ID: Chris Fonnesbeck wrote: > What is the best way to select a random element from a numpy array? I > know I could index by a random integer, but was wondering if there was > a built-in method or function. Generating a random index is what I do. I think there's certainly room for a RandomState.choice() method. I think something like this covers most of the use cases: import numpy from numpy import random def choice(x, axis=None): """Select an element or subarray uniformly randomly. If axis is None, then a single element is chosen from the entire array. Otherwise, a subarray is chosen from the given axis. """ x = numpy.asarray(x) if axis is None: length = numpy.multiply.reduce(x.shape) n = random.randint(length) return x.flat[n] else: n = random.randint(x.shape[axis]) # I'm sure there's a better way of doing this idx = map(slice, x.shape) idx[axis] = n return x[tuple(idx)] -- Robert Kern robert.kern at gmail.com "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From aisaac at american.edu Mon Feb 20 17:31:13 2006 From: aisaac at american.edu (Alan G Isaac) Date: Mon Feb 20 17:31:13 2006 Subject: [Numpy-discussion] Re: selecting random array element In-Reply-To: References: <723eb6930602201101v2b8ebf9axd5801fe8b60998b3@mail.gmail.com> Message-ID: On Mon, 20 Feb 2006, Robert Kern apparently wrote: > length = numpy.multiply.reduce(x.shape) Can this be different from x.size? Thanks, Alan Isaac From robert.kern at gmail.com Mon Feb 20 17:39:06 2006 From: robert.kern at gmail.com (Robert Kern) Date: Mon Feb 20 17:39:06 2006 Subject: [Numpy-discussion] Re: selecting random array element In-Reply-To: References: <723eb6930602201101v2b8ebf9axd5801fe8b60998b3@mail.gmail.com> Message-ID: Alan G Isaac wrote: > On Mon, 20 Feb 2006, Robert Kern apparently wrote: > >> length = numpy.multiply.reduce(x.shape) > > Can this be different from x.size? No, it's just that old habits die hard. I knew there was a clean way to get that information, I just didn't remember it or bother myself to look for it. -- Robert Kern robert.kern at gmail.com "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From wbaxter at gmail.com Mon Feb 20 17:49:02 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Mon Feb 20 17:49:02 2006 Subject: [Numpy-discussion] Re: Some missing linalg things (wanted: LU decomposition) In-Reply-To: <463e11f90602200851o6885dd64l404eaf028509f544@mail.gmail.com> References: <43F96736.3020003@mecha.uni-stuttgart.de> <463e11f90602200851o6885dd64l404eaf028509f544@mail.gmail.com> Message-ID: On 2/20/06, Bill Baxter wrote: > > Should have mentioned -- I was using numpy 0.9.4 / scipy 0.4.4. > > Looks like it works in numpy 0.9.5 / scipy 0.4.6 > > > > But matplotlib, which I also need, hasn't been updated for numpy 0.9.5yet. > > :-( > > > > On 2/21/06, Jonathan Taylor wrote: > > For matplotlib, I just use tolist() like > > a = array([1,3,2,3]) > > ... > > pylab.plot(a.tolist()) > > Maybe that will work for you until you can fix your problem. > J. Excellent idea! 
That does the trick for now (if I take the numerix: numpy line out of my .matplotlibrc to stop it from crashing on import). --bb -------------- next part -------------- An HTML attachment was scrubbed... URL: From aisaac at american.edu Mon Feb 20 17:58:23 2006 From: aisaac at american.edu (Alan G Isaac) Date: Mon Feb 20 17:58:23 2006 Subject: [Numpy-discussion] Re: selecting random array element In-Reply-To: References: <723eb6930602201101v2b8ebf9axd5801fe8b60998b3@mail.gmail.com> Message-ID: On Mon, 20 Feb 2006, Robert Kern apparently wrote: > You may want to consider numpy.random.permutation() Yes indeed. Thanks, Alan From oliphant.travis at ieee.org Mon Feb 20 19:45:04 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Mon Feb 20 19:45:04 2006 Subject: [Numpy-discussion] optimizing power() for complex and real cases In-Reply-To: <43F938FA.80200@cox.net> References: <43F20BFE.5030100@cox.net> <43F36CB6.5050004@cox.net> <43F3B8A9.3000507@bigpond.net.au> <43F3CA12.4000907@cox.net> <43F62069.80209@cox.net> <43F7C73B.2000806@cox.net> <20060220003714.GB15783@arbutus.physics.mcmaster.ca> <43F938FA.80200@cox.net> Message-ID: <43FA8C9C.2020002@ieee.org> Tim Hochberg wrote: >> Hmm, ufuncs are passed a void* argument for passing info to them. Now, >> what that argument is defined when the ufunc is created, but maybe >> there's a way to piggy-back on it. >> >> > Yeah, I really felt like I was fighting the ufuncs when I was playing > with this. On the one hand, you really want to use the ufunc > machinery. On the other hand that forces you into using the same types > for both arguments. This is not true. Ufuncs can have different types for their arguments. Perhaps you meant something else? > Just to add a little more confusion to the mix. I did a little testing > to see how close pow(x,n) and x*x*... actually are. They are slightly > less close for small values of N and slightly closer for large values > of N than I would have expected. The upshot of this is that integer > powers between -2 and +4 all seem to vary by the same amount when > computed using pow(x,n) versus multiplies. I'm including the test code > at the end. Assuming that this result is not a fluke that expands the > noncontroversial set by at least 3 more values. That's starting to > strain the ufunc aproach, so perhaps optimizing in @TYP at _power is the > way to go after all. Or, more likely, adding @TYP at _int_power or maybe > @TYP at _fast_power (so as to be able to include some half integer > powers) and dispatching appropriately from array_power. > > The problem here, of course, is the overhead that PyArray_EnsureArray > runs into. I'm not sure if the ufuncs actually call that, but I was > using that to convert things to arrays at one point and I saw the > slowdown, so I suspect that the slowdown is in something > PyArray_EnsureArray calls if not in that routine itself. I'm afraid to > dig into that stuff though.. On the other hand, it would probably > speed up all kinds of stuff if that was sped up. EnsureArray simply has some short cuts and then calls PyArray_FromAny. PyArray_FromAny is the big array conversion code. It converts anything (it can) to an array. >> Too bad we couldn't make a function generator :-) [Well, we could using >> weave...]\ >> >> > Yaigh! That's actually an interesting approach that could use some attention.. 
-Travis

From pearu at scipy.org Tue Feb 21 00:55:03 2006
From: pearu at scipy.org (Pearu Peterson)
Date: Tue Feb 21 00:55:03 2006
Subject: [Numpy-discussion] comparing container objects with arrays
Message-ID:

Hi,

Question: what is the recommended way to compare two array objects? And when they are contained in a tuple or list or dictionary or etc.?

I ask because I found that arr1.__eq__(arr2) can return either bool or an array of bools when shape(arr1)!=shape(arr2) or shape(arr1)==shape(arr2), respectively:

>>> array([1,2])==array([1,0,0])
False
>>> array([1,2])==array([1,0])
array([True, False], dtype=bool)

I wonder if numpy users are happy with that? Shouldn't arr1==arr2 return bool as well because current __eq__ behaviour is handled by equal() function when the shapes are equal?

Note that if __eq__ would always return bool then the following codes would work as I would expect:

>>> (1,array([1,2]))==(1,array([1,2]))
Traceback (most recent call last):
  File "", line 1, in ?
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
>>> # I would expect True, compare with
>>> (1,[1,2])==(1,[1,2])
True
>>> (1,array([1,2]))==(1,array([1,2,0]))
False

I started to write this message because object1 == object2 returns boolean for (all?) Python builtin objects but as soon as these objects contain arrays, the test will fail with an exception. Maybe numpy needs equalobjs(obj1,obj2) that always returns boolean and can handle comparing objects like {1:array([1,2])}, [3,[array([2,2])], etc.

Pearu
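One way to get the deep comparison Pearu asks for is a small recursive helper. This is a sketch only, written against present-day numpy names; deep_eq is a made-up name, not an existing numpy function:

import numpy

def deep_eq(a, b):
    # arrays (or array/sequence pairs): reduce elementwise == to one bool
    if isinstance(a, numpy.ndarray) or isinstance(b, numpy.ndarray):
        a, b = numpy.asarray(a), numpy.asarray(b)
        if a.shape != b.shape:
            return False
        return bool(numpy.alltrue(numpy.ravel(a == b)))
    # dicts: same keys, then compare values recursively
    if isinstance(a, dict) and isinstance(b, dict):
        if sorted(a.keys()) != sorted(b.keys()):
            return False
        for k in a:
            if not deep_eq(a[k], b[k]):
                return False
        return True
    # tuples/lists: same length, then compare items recursively
    if isinstance(a, (tuple, list)) and isinstance(b, (tuple, list)):
        if len(a) != len(b):
            return False
        for x, y in zip(a, b):
            if not deep_eq(x, y):
                return False
        return True
    return a == b

With this, deep_eq((1, numpy.array([1,2])), (1, numpy.array([1,2]))) is True, and deep_eq({1: numpy.array([1,2])}, {1: numpy.array([1,2,0])}) is False, which matches the behaviour Pearu expects from the plain tuple comparison.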
From bblais at bryant.edu Tue Feb 21 04:27:01 2006
From: bblais at bryant.edu (Brian Blais)
Date: Tue Feb 21 04:27:01 2006
Subject: [Numpy-discussion] algorithm, optimization, or other problem?
Message-ID: <43FB0661.1040202@bryant.edu>

Hello,

I am trying to translate some Matlab/mex code to Python, for doing neural simulations. This application is definitely computing-time limited, and I need to optimize at least one inner loop of the code, or perhaps even rethink the algorithm. The procedure is very simple, after initializing any variables:

1) select a random input vector, which I will call "x". right now I have it as an array, and I choose columns from that array randomly. in other cases, I may need to take an image, select a patch, and then make that a column vector.

2) calculate an output value, which is the dot product of the "x" and a weight vector, "w", so

   y=dot(x,w)

3) modify the weight vector based on a matrix equation, like:

   w=w+ eta * (y*x - y**2*w)
        ^
        |
        +---- learning rate constant

4) repeat steps 1-3 many times

I've organized it like:

for e in 100:        # outer loop
    for i in 1000:   # inner loop
        (steps 1-3)
    display things.

so that the bulk of the computation is in the inner loop, and is amenable to converting to a faster language. This is my issue:

straight python, in the example posted below for 250000 inner-loop steps, takes 20 seconds for each outer-loop step.
I tried Pyrex, which should work very fast on such a problem, takes about 8.5 seconds per outer-loop step. The same code as a C-mex file in matlab takes 1.5 seconds per outer-loop step.

Given the huge difference between the Pyrex and the Mex, I feel that there is something I am doing wrong, because the C-code for both should run comparably. Perhaps the approach is wrong? I'm willing to take any suggestions! I don't mind coding some in C, but the Python API seemed a bit challenging to me.

One note: I am using the Numeric package, not numpy, only because I want to be able to use the Enthought version for Windows. I develop on Linux, and haven't had a chance to see if I can compile numpy using the Enthought Python for Windows.

If there is anything else anyone needs to know, I'll post it. I put the main script, and a dohebb.pyx code below.

thanks!

Brian Blais

--
-----------------

bblais at bryant.edu
http://web.bryant.edu/~bblais


# Main script:

from dohebb import *
import pylab as p
from Numeric import *
from RandomArray import *
import time

x=random((100,1000))    # 1000 input vectors

numpats=x.shape[0]
w=random((numpats,1));

th=random((1,1))

params={}
params['eta']=0.001;
params['tau']=100.0;
old_mx=0;
for e in range(100):

    rnd=randint(0,numpats,250000)
    t1=time.time()
    if 0:  # straight python
        for i in range(len(rnd)):
            pat=rnd[i]
            xx=reshape(x[:,pat],(1,-1))
            y=matrixmultiply(xx,w)
            w=w+params['eta']*(y*transpose(xx)-y**2*w);
            th=th+(1.0/params['tau'])*(y**2-th);
    else:  # pyrex
        dohebb(params,w,th,x,rnd)
    print time.time()-t1


p.plot(w,'o-')
p.xlabel('weights')
p.show()


#=============================================

# dohebb.pyx

cdef extern from "Numeric/arrayobject.h":

    struct PyArray_Descr:
        int type_num, elsize
        char type

    ctypedef class Numeric.ArrayType [object PyArrayObject]:
        cdef char *data
        cdef int nd
        cdef int *dimensions, *strides
        cdef object base
        cdef PyArray_Descr *descr
        cdef int flags


def dohebb(params,ArrayType w,ArrayType th,ArrayType X,ArrayType rnd):

    cdef int num_iterations
    cdef int num_inputs
    cdef int offset
    cdef double *wp,*xp,*thp
    cdef int *rndp
    cdef double eta,tau

    eta=params['eta']  # learning rate
    tau=params['tau']  # used for variance estimate

    cdef double y
    num_iterations=rnd.dimensions[0]
    num_inputs=w.dimensions[0]

    # get the pointers
    wp=w.data
    xp=X.data
    rndp=rnd.data
    thp=th.data

    for it from 0 <= it < num_iterations:

        offset=rndp[it]*num_inputs

        # calculate the output
        y=0.0
        for i from 0 <= i < num_inputs:
            y=y+wp[i]*xp[i+offset]

        # change in the weights
        for i from 0 <= i < num_inputs:
            wp[i]=wp[i]+eta*(y*xp[i+offset] - y*y*wp[i])

        # estimate the variance
        thp[0]=thp[0]+(1.0/tau)*(y**2-thp[0])
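A note on the straight-Python branch of the script above: a fair amount of its cost is the per-iteration reshape/transpose and the 2-D matrixmultiply on what is really a scalar-output update. For comparison, here is the same sequential update written against 1-D slices, as a sketch only (present-day numpy names rather than Numeric, toy sizes, and variable names as in the script):

from numpy import dot
from numpy.random import rand, randint

x = rand(100, 1000)      # input patterns, one per column
w = rand(100)            # weight vector, kept 1-D
th = rand()              # running estimate of the output variance
eta, tau = 0.001, 100.0

for pat in randint(0, x.shape[1], 250000):
    xx = x[:, pat]                    # 1-D view: no reshape/transpose
    y = dot(xx, w)                    # scalar output
    w += eta * (y * xx - y * y * w)   # Hebbian (Oja-style) update
    th += (y * y - th) / tau          # variance estimate

The update remains strictly sequential (each y depends on the current w), so the loop as a whole cannot be collapsed into one matrix product; the Pyrex/C route above is the right one for the remaining per-iteration overhead, and this sketch only trims the Python-side temporaries.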
From stefan at sun.ac.za Tue Feb 21 04:29:02 2006
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Tue Feb 21 04:29:02 2006
Subject: [Numpy-discussion] wiki page for record arrays
Message-ID: <20060221122737.GA14470@alpha>

I wrote a short tutorial on using record arrays, which can be found at

http://www.scipy.org/ArrayRecords

The page is named ArrayRecords instead of RecordArrays, so I'd be glad if someone with privileges could rename it. Also, please fix any mistakes I might have made.

Regards
Stéfan

From svetosch at gmx.net Tue Feb 21 05:43:03 2006
From: svetosch at gmx.net (Sven Schreiber)
Date: Tue Feb 21 05:43:03 2006
Subject: [Numpy-discussion] mixing arrays and matrices: squeeze yes, flattened no?
Message-ID: <43FB0A96.10803@gmx.net>

Hi,

sometimes I'm still struggling with peculiarities of numpy-arrays vs. numpy-matrices; my latest story goes like this:

I first slice out a column of a 2d-numpy-array (a = somearray[:,1]). I can just manage to understand the resulting shape ( == (112,) ).

Then I slice a column from a numpy-matrix b = somematrix[:,1] and get the expected (112,1) shape. Then I do what I thought was the easiest thing in the world, I subtract the two vectors: c = a - b

I was very surprised by the bug that showed up due to the fact that c.shape == (112,112) !!

First conclusion: broadcasting is nice and everything, but here I somehow think that it shouldn't be like this. I like numpy, but this is frustrating.

Next, I try to work around this with b.squeeze(). That seems to work, but why is b.squeeze().shape == (1, 112) instead of (112,)?

Then I thought maybe b.flattened() does the job, but then I get an error (matrix has no attr flattened). Again, I'm baffled. Could someone please explain? I already own the numpy-book, otherwise I wouldn't even have thought of using those methods, but here it hasn't enlightened me.

Second (preliminary) conclusion: I will paranoically use even more asmatrix()-conversions in my code to avoid dealing with those array-beasts ;-) and get column vectors I can trust...

Is there any better general advice than to say: "numpy-matrices and numpy-arrays are best kept in separated worlds"?

Thanks for any insights,
Sven

From mpi at osc.kiku.dk Tue Feb 21 06:18:04 2006
From: mpi at osc.kiku.dk (Mads Ipsen)
Date: Tue Feb 21 06:18:04 2006
Subject: [Numpy-discussion] Rookie problems - Why is C-code much faster?
Message-ID:

Hey,

I am relatively new to python and Numeric, and am currently involved in a project of developing molecular dynamics code written in python and using Numeric for number crunching. During this, I've run into some problems that I hope I can get some assistance with here. My main issues are:

  1. around() appears to be slow
  2. C code appears to be much faster

1. One of the bottlenecks in MD is the calculation of the distances between all particle pairs. In MD simulations with periodic boundary conditions, you have to estimate the shortest distance between all the particle pairs in your system. E.g., on a line of length box = 10, the distance dx between two points x0 = 1 and x1 = 9 will be dx = -2 (and NOT dx = 8). One way to do this in numpy is

  dx = x1 - x0
  dx -= box*around(dx/box)

My first observation here is that around() seems to be very slow. So I looked in umathmodule.c and implemented rint() from the C math library and made my own custom Numeric module. This gives a speed-up by a factor of app. 4 compared to around(). I suggest that rint() is added as a ufunc, or are there any concerns here that I am not aware of?

2. Here is the main loop for finding all possible pair distances, which corresponds to a loop over the upper triangular part of a square matrix

  # Loop over all particles
  for i in range(n-1):
      dx = x[i+1:] - x[i]
      dy = y[i+1:] - y[i]

      dx -= box*rint(dx/box)
      dy -= box*rint(dy/box)

      r2 = dx**2 + dy**2   # square of dist. between points

where x and y contain the positions of the particles. A naive implementation in C is

  // loop over all particles
  for (int i=0; i<n-1; i++) {
      for (int j=i+1; j<n; j++) {
          dx = x[j] - x[i];
          dy = y[j] - y[i];

          dx -= box*rint(dx/box);
          dy -= box*rint(dy/box);

          r2 = dx*dx + dy*dy;
      }
  }

For n = 2500 particles, i.e. 3123750 particle pairs, the C loop is app. 10 times faster than the Python/Numeric counterpart. This is of course not satisfactory.

Are there any things I am doing completely wrong here, basic approaches completely misunderstood, misuses etc?

Any suggestions, guidelines, hints are most welcome.

Best regards,

Mads Ipsen

+---------------------------------+-------------------------+
| Mads Ipsen                      |                         |
| Dept. of Chemistry              | phone:  +45-35320220    |
| H.C. Ørsted Institute           | fax:    +45-35320322    |
| Universitetsparken 5            |                         |
| DK-2100 Copenhagen Ø, Denmark   | mpi at osc.kiku.dk        |
+---------------------------------+-------------------------+

From: Gary Ruben
Subject: Re: [Numpy-discussion] wiki page for record arrays
In-Reply-To: <20060221122737.GA14470@alpha>
References: <20060221122737.GA14470@alpha>
Message-ID: <43FB2298.2080003@bigpond.net.au>

Thanks for this Stéfan,

Can I make some observations? I don't want to just change your formatting. I think it would be good to have some discussion about the formatting used in tutorials like this, because all should probably follow a standard presentation style. I like the usage summary at the end.
1. I'd put 'assumes from numpy import *' in the preamble.

2. Is it possible to change the formatting to make it more obvious what is input and what is output? I think it is better to show the input and output with a standard Python prompt a'la idle or possibly ipython.

A couple of things specific to your examples:

3. I think it might be worth pointing out that

img = array([(0,0,0), (1,0,0), (0,1,0), (0,0,1)], [('r',Float32),('g',Float32),('b',Float32)])

is valid syntax that can be replaced by the 2-line version you present. Should the valid syntax for creating a record array be presented in EBNF format?

4. Can you explain dtype=(void,12)?

5. When the page's name is changed, a link should be put to it in the 'Getting Started and Tutorial' section of the Documentation page.

What do you and others think?

Gary R.

Stefan van der Walt wrote:
> I wrote a short tutorial on using record arrays, which can be found at
>
> http://www.scipy.org/ArrayRecords
>
> The page is named ArrayRecords instead of RecordArrays, so I'd be glad
> if someone with privileges could rename it. Also, please fix any
> mistakes I might have made.
>
> Regards
> Stéfan

From alexander.belopolsky at gmail.com Tue Feb 21 06:52:16 2006
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue Feb 21 06:52:16 2006
Subject: [Numpy-discussion] Rookie problems - Why is C-code much faster?
In-Reply-To:
References:
Message-ID:

On 2/21/06, Mads Ipsen wrote:
> I suggest that rint() is added as a ufunc, or are there any concerns
> here that I am not aware of?

You might want to use astype(int). On my system it is much faster than around:

> python -m timeit -s "from numpy import array, around; x = array([1.5]*1000)" "around(x)"
10000 loops, best of 3: 176 usec per loop
> python -m timeit -s "from numpy import array, around; x = array([1.5]*1000)" "x.astype(int)"
100000 loops, best of 3: 3.2 usec per loop

the difference is too big to be explained by the fact that around allocates twice as much memory for the result. In fact the following equivalent of rint is still very fast:

> python -m timeit -s "from numpy import array, around; x = array([1.5]*1000)" "x.astype(int).astype(float)"
100000 loops, best of 3: 6.48 usec per loop

From ndarray at mac.com Tue Feb 21 07:00:00 2006
From: ndarray at mac.com (Sasha)
Date: Tue Feb 21 07:00:00 2006
Subject: [Numpy-discussion] Rookie problems - Why is C-code much faster?
In-Reply-To:
References:
Message-ID:

On second thought, the difference between around and astype is not surprising because around operates in terms of decimals. Rather than adding rint, I would suggest to make a special case decimals=0 use C rint.

> On 2/21/06, Mads Ipsen wrote:
> > I suggest that rint() is added as a ufunc, or are there any concerns
> > here that I am not aware of?
>
> You might want to use astype(int). On my system it is much faster than around:
>
> > python -m timeit -s "from numpy import array, around; x = array([1.5]*1000)" "around(x)"
> 10000 loops, best of 3: 176 usec per loop
> > python -m timeit -s "from numpy import array, around; x = array([1.5]*1000)" "x.astype(int)"
> 100000 loops, best of 3: 3.2 usec per loop
>
> the difference is too big to be explained by the fact that around
> allocates twice as much memory for the result.
In fact the following > equivalent of rint is still very fast: > > > python -m timeit -s "from numpy import array, around; x = array([1.5]*1000)" "x.astype(int).astype(float)" > 100000 loops, best of 3: 6.48 usec per loop > From travis at enthought.com Tue Feb 21 07:06:02 2006 From: travis at enthought.com (Travis N. Vaught) Date: Tue Feb 21 07:06:02 2006 Subject: [Numpy-discussion] wiki page for record arrays In-Reply-To: <20060221122737.GA14470@alpha> References: <20060221122737.GA14470@alpha> Message-ID: <43FB2C19.2060001@enthought.com> Stefan van der Walt wrote: > I wrote a short tutorial on using record arrays, which can be found at > > http://www.scipy.org/ArrayRecords > > The page is named ArrayRecords instead of RecordArrays, so I'd be glad > if someone with priviledges could rename it. Also, please fix any > mistakes I might have made. > ... I've renamed it. Now the page is at: http://www.scipy.org/RecordArray Travis From travis at enthought.com Tue Feb 21 07:12:07 2006 From: travis at enthought.com (Travis N. Vaught) Date: Tue Feb 21 07:12:07 2006 Subject: [Numpy-discussion] wiki page for record arrays In-Reply-To: <43FB2C19.2060001@enthought.com> References: <20060221122737.GA14470@alpha> <43FB2C19.2060001@enthought.com> Message-ID: <43FB2DA5.1010306@enthought.com> Travis N. Vaught wrote: > > I've renamed it. Now the page is at: > > http://www.scipy.org/RecordArray > Doh! That should have been http://www.scipy.org/RecordArrays . From bsouthey at gmail.com Tue Feb 21 07:16:07 2006 From: bsouthey at gmail.com (Bruce Southey) Date: Tue Feb 21 07:16:07 2006 Subject: [Numpy-discussion] algorithm, optimization, or other problem? In-Reply-To: <43FB0661.1040202@bryant.edu> References: <43FB0661.1040202@bryant.edu> Message-ID: Hi, In the current version, note that Y is scalar so replace the squaring (Y**2) with Y*Y as you do in the dohebb function. On my system without blas etc removing the squaring removes a few seconds (16.28 to 12.4). It did not seem to help factorizing Y. Also, eta and tau are constants so define them only once as scalars outside the loops and do the division outside the loop. It only saves about 0.2 seconds but these add up. The inner loop probably can be vectorized because it is just vector operations on a matrix. You are just computing over the ith dimension of X. I think that you could be able to find the matrix version on the net. Regards Bruce On 2/21/06, Brian Blais wrote: > Hello, > > I am trying to translate some Matlab/mex code to Python, for doing neural > simulations. This application is definitely computing-time limited, and I need to > optimize at least one inner loop of the code, or perhaps even rethink the algorithm. > The procedure is very simple, after initializing any variables: > > 1) select a random input vector, which I will call "x". right now I have it as an > array, and I choose columns from that array randomly. in other cases, I may need to > take an image, select a patch, and then make that a column vector. > > 2) calculate an output value, which is the dot product of the "x" and a weight > vector, "w", so > > y=dot(x,w) > > 3) modify the weight vector based on a matrix equation, like: > > w=w+ eta * (y*x - y**2*w) > ^ > | > +---- learning rate constant > > 4) repeat steps 1-3 many times > > I've organized it like: > > for e in 100: # outer loop > for i in 1000: # inner loop > (steps 1-3) > > display things. > > so that the bulk of the computation is in the inner loop, and is amenable to > converting to a faster language. 
This is my issue: > > straight python, in the example posted below for 250000 inner-loop steps, takes 20 > seconds for each outer-loop step. I tried Pyrex, which should work very fast on such > a problem, takes about 8.5 seconds per outer-loop step. The same code as a C-mex > file in matlab takes 1.5 seconds per outer-loop step. > > Given the huge difference between the Pyrex and the Mex, I feel that there is > something I am doing wrong, because the C-code for both should run comparably. > Perhaps the approach is wrong? I'm willing to take any suggestions! I don't mind > coding some in C, but the Python API seemed a bit challenging to me. > > One note: I am using the Numeric package, not numpy, only because I want to be able > to use the Enthought version for Windows. I develop on Linux, and haven't had a > chance to see if I can compile numpy using the Enthought Python for Windows. > > If there is anything else anyone needs to know, I'll post it. I put the main script, > and a dohebb.pyx code below. > > > thanks! > > Brian Blais > > -- > ----------------- > > bblais at bryant.edu > http://web.bryant.edu/~bblais > > > > > # Main script: > > from dohebb import * > import pylab as p > from Numeric import * > from RandomArray import * > import time > > x=random((100,1000)) # 1000 input vectors > > numpats=x.shape[0] > w=random((numpats,1)); > > th=random((1,1)) > > params={} > params['eta']=0.001; > params['tau']=100.0; > old_mx=0; > for e in range(100): > > rnd=randint(0,numpats,250000) > t1=time.time() > if 0: # straight python > for i in range(len(rnd)): > pat=rnd[i] > xx=reshape(x[:,pat],(1,-1)) > y=matrixmultiply(xx,w) > w=w+params['eta']*(y*transpose(xx)-y**2*w); > th=th+(1.0/params['tau'])*(y**2-th); > else: # pyrex > dohebb(params,w,th,x,rnd) > print time.time()-t1 > > > p.plot(w,'o-') > p.xlabel('weights') > p.show() > > > #============================================= > > # dohebb.pyx > > cdef extern from "Numeric/arrayobject.h": > > struct PyArray_Descr: > int type_num, elsize > char type > > ctypedef class Numeric.ArrayType [object PyArrayObject]: > cdef char *data > cdef int nd > cdef int *dimensions, *strides > cdef object base > cdef PyArray_Descr *descr > cdef int flags > > > def dohebb(params,ArrayType w,ArrayType th,ArrayType X,ArrayType rnd): > > > cdef int num_iterations > cdef int num_inputs > cdef int offset > cdef double *wp,*xp,*thp > cdef int *rndp > cdef double eta,tau > > eta=params['eta'] # learning rate > tau=params['tau'] # used for variance estimate > > cdef double y > num_iterations=rnd.dimensions[0] > num_inputs=w.dimensions[0] > > # get the pointers > wp=w.data > xp=X.data > rndp=rnd.data > thp=th.data > > for it from 0 <= it < num_iterations: > > offset=rndp[it]*num_inputs > > # calculate the output > y=0.0 > for i from 0 <= i < num_inputs: > y=y+wp[i]*xp[i+offset] > > # change in the weights > for i from 0 <= i < num_inputs: > wp[i]=wp[i]+eta*(y*xp[i+offset] - y*y*wp[i]) > > # estimate the variance > thp[0]=thp[0]+(1.0/tau)*(y**2-thp[0]) > > > > > > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through log files > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! 
> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From mpi at osc.kiku.dk Tue Feb 21 07:24:08 2006 From: mpi at osc.kiku.dk (Mads Ipsen) Date: Tue Feb 21 07:24:08 2006 Subject: [Numpy-discussion] Rookie problems - Why is C-code much faster? In-Reply-To: References: Message-ID: On Tue, 21 Feb 2006, Alexander Belopolsky wrote: > On 2/21/06, Mads Ipsen wrote: > > I suggest that rint() is added as a ufunc or is there any concerns > > here that I am not aware of? > > You might want to use astype(int). On my system it is much faster than around: > > > python -m timeit -s "from numpy import array, around; x = array([1.5]*1000)" "around(x)" > 10000 loops, best of 3: 176 usec per loop > > python -m timeit -s "from numpy import array, around; x = array([1.5]*1000)" "x.astype(int)" > 100000 loops, best of 3: 3.2 usec per loop > > the difference is too big to be explained by the fact that around > allocates twice as much memory for the result. In fact the following > equivalent of rint is still very fast: > > > python -m timeit -s "from numpy import array, around; Maybe I am wrong here, but around() and rint() is supposed to round to the closest integer, i.e. for x = array([1.1, 1.8]) around(x) = [1.0, 2.0] whereas x.astype(int).astype(float) = [1.0, 1.0] This particular property of around() as well as rint() is crucial for my application. // Mads From ndarray at mac.com Tue Feb 21 07:32:09 2006 From: ndarray at mac.com (Sasha) Date: Tue Feb 21 07:32:09 2006 Subject: [Numpy-discussion] Rookie problems - Why is C-code much faster? In-Reply-To: References: Message-ID: On 2/21/06, Mads Ipsen wrote: > Maybe I am wrong here, but around() and rint() is supposed to round to > the closest integer, i.e. for x = array([1.1, 1.8]) You are right. In the follow-up I've suggested to speed-up the case decimals=0 in around in around instead of adding another function. I think that would be a more "pythonic" solution. From stefan at sun.ac.za Tue Feb 21 07:56:09 2006 From: stefan at sun.ac.za (Stefan van der Walt) Date: Tue Feb 21 07:56:09 2006 Subject: [Numpy-discussion] wiki page for record arrays In-Reply-To: <43FB2298.2080003@bigpond.net.au> References: <20060221122737.GA14470@alpha> <43FB2298.2080003@bigpond.net.au> Message-ID: <20060221155513.GC14470@alpha> Hi Gary Thanks for your suggestions. I incorporated them. St?fan On Wed, Feb 22, 2006 at 01:24:24AM +1100, Gary Ruben wrote: > 1. I'd put 'assumes from numpy import *' in the preamble. > 2. Is it possible to change the formatting to make it more obvious what > is input and what is output? I think it is better to show the input and > output with a standard Python prompt a'la idle or possibly ipython. > 3. I think it might be worth pointing out that > > img = array([(0,0,0), (1,0,0), (0,1,0), (0,0,1)], [('r',Float32),('g',F > loat32),('b',Float32)]) > > is valid syntax that can be replaced by the 2-line version you present. > 4. Can you explain dtype=(void,12)? > 5. When the page's name is changed, a link should be put to it in the > 'Getting Started and Tutorial' section of the Documentation page. From pau.gargallo at gmail.com Tue Feb 21 08:02:07 2006 From: pau.gargallo at gmail.com (Pau Gargallo) Date: Tue Feb 21 08:02:07 2006 Subject: [Numpy-discussion] Rookie problems - Why is C-code much faster? 
In-Reply-To: References: Message-ID: <6ef8f3380602210801j3e321795l5ade05c7c1539002@mail.gmail.com> > the closest integer, i.e. for x = array([1.1, 1.8]) > > around(x) = [1.0, 2.0] > > whereas > > x.astype(int).astype(float) = [1.0, 1.0] > (x+0.5).astype(int).astype(float) = [1.0, 2.0] i hope it helps, pau From zpincus at stanford.edu Tue Feb 21 09:19:02 2006 From: zpincus at stanford.edu (Zachary Pincus) Date: Tue Feb 21 09:19:02 2006 Subject: [Numpy-discussion] Rookie problems - Why is C-code much faster? In-Reply-To: References: Message-ID: <633BB073-0A46-416A-96DF-080CF4A6DBDF@stanford.edu> Mads, The game with numpy, just as it is with Matlab or any other interpreted numeric environment, is to try push as much of the looping down into the C code as you can. This is because, as you now know, compiled C can loop much faster than interpreted python. A simple example for averaging 1000 (x,y,z) points: print data.shape (1000, 3) # bad: explicit for loop in python avg = numpy.zeros(3, numpy.float_) for i in data: avg += i avg /= 1000.0 # good: implicit for loop in C avg = numpy.add.reduce(data, axis = 0) avg /= 1000.0 In your case, instead of explicitly looping through each point, why not do the calculations in parallel, operating on entire vectors of points at one time? Then the looping is "pushed down" into compiled C code. Or if you're really lucky, it's pushed all the way down to the vector math units on your cpu if you have a good BLAS or whatever installed. Zach Pincus Program in Biomedical Informatics and Department of Biochemistry Stanford University School of Medicine > 2. Here is the main loop for finding all possible pair distances, > which corresponds to a loop over the upper triangular part of a > square matrix > > # Loop over all particles > for i in range(n-1): > dx = x[i+1:] - x[i] > dy = y[i+1:] - y[i] > > dx -= box*rint(dx/box) > dy -= box*rint(dy/box) > > r2 = dx**2 + dy**2 # square of dist. between points > > where x and y contain the positions of the particles. A naive > implementation in C is > > > // loop over all particles > for (int i=0; i for (int j=i+1; j dx = x[j] - x[i]; > dy = y[j] - y[i]; > > dx -= box*rint(dx/box); > dy -= box*rint(dy/box); > > r2 = dx*dx + dy*dy; > } > } > > For n = 2500 particles, i.e. 3123750 particle pairs, the C loop is > app. 10 times faster than the Python/Numeric counterpart. This is of > course not satisfactory. > > Are there any things I am doing completely wrong here, basic > approaches completely misunderstood, misuses etc? > > Any suggestions, guidelines, hints are most welcome. > > Best regards, > > Mads Ipsen > > > +---------------------------------+-------------------------+ > | Mads Ipsen | | > | Dept. of Chemistry | phone: +45-35320220 | > | H.C.?rsted Institute | fax: +45-35320322 | > | Universitetsparken 5 | | > | DK-2100 Copenhagen ?, Denmark | mpi at osc.kiku.dk | > +---------------------------------+-------------------------+ > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through > log files > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD > SPLUNK! 
> http://sel.as-us.falkag.net/sel?cmd=lnk&kid3432&bid#0486&dat1642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion From mpi at osc.kiku.dk Tue Feb 21 10:25:04 2006 From: mpi at osc.kiku.dk (Mads Ipsen) Date: Tue Feb 21 10:25:04 2006 Subject: [Numpy-discussion] Rookie problems - Why is C-code much faster? In-Reply-To: <633BB073-0A46-416A-96DF-080CF4A6DBDF@stanford.edu> References: <633BB073-0A46-416A-96DF-080CF4A6DBDF@stanford.edu> Message-ID: On Tue, 21 Feb 2006, Zachary Pincus wrote: > Mads, > > The game with numpy, just as it is with Matlab or any other > interpreted numeric environment, is to try push as much of the > looping down into the C code as you can. This is because, as you now > know, compiled C can loop much faster than interpreted python. > > A simple example for averaging 1000 (x,y,z) points: > > print data.shape > (1000, 3) > # bad: explicit for loop in python > avg = numpy.zeros(3, numpy.float_) > for i in data: avg += i > avg /= 1000.0 > > # good: implicit for loop in C > avg = numpy.add.reduce(data, axis = 0) > avg /= 1000.0 > > In your case, instead of explicitly looping through each point, why > not do the calculations in parallel, operating on entire vectors of > points at one time? Then the looping is "pushed down" into compiled C > code. Or if you're really lucky, it's pushed all the way down to the > vector math units on your cpu if you have a good BLAS or whatever > installed. > > Zach Pincus > > Program in Biomedical Informatics and Department of Biochemistry > Stanford University School of Medicine > > > > > 2. Here is the main loop for finding all possible pair distances, > > which corresponds to a loop over the upper triangular part of a > > square matrix > > > > # Loop over all particles > > for i in range(n-1): > > dx = x[i+1:] - x[i] > > dy = y[i+1:] - y[i] > > > > dx -= box*rint(dx/box) > > dy -= box*rint(dy/box) > > > > r2 = dx**2 + dy**2 # square of dist. between points > > > > where x and y contain the positions of the particles. A naive > > implementation in C is > > > > > > // loop over all particles > > for (int i=0; i > for (int j=i+1; j > dx = x[j] - x[i]; > > dy = y[j] - y[i]; > > > > dx -= box*rint(dx/box); > > dy -= box*rint(dy/box); > > > > r2 = dx*dx + dy*dy; > > } > > } > > > > For n = 2500 particles, i.e. 3123750 particle pairs, the C loop is > > app. 10 times faster than the Python/Numeric counterpart. This is of > > course not satisfactory. > > > > Are there any things I am doing completely wrong here, basic > > approaches completely misunderstood, misuses etc? > > > > Any suggestions, guidelines, hints are most welcome. > > > > Best regards, > > > > Mads Ipsen > > > > > > +---------------------------------+-------------------------+ > > | Mads Ipsen | | > > | Dept. of Chemistry | phone: +45-35320220 | > > | H.C.?rsted Institute | fax: +45-35320322 | > > | Universitetsparken 5 | | > > | DK-2100 Copenhagen ?, Denmark | mpi at osc.kiku.dk | > > +---------------------------------+-------------------------+ > > > > > > ------------------------------------------------------- > > This SF.net email is sponsored by: Splunk Inc. Do you grep through > > log files > > for problems? Stop! Download the new AJAX search engine that makes > > searching your log files as easy as surfing the web. DOWNLOAD > > SPLUNK! 
> > http://sel.as-us.falkag.net/sel?cmd=lnk&kid3432&bid#0486&dat1642 > > _______________________________________________ > > Numpy-discussion mailing list > > Nump I agree completely with your comments. But, as you can, the innermost part of the loop has been removed in the code, and replaced with numpy slices. It's hard for me to see how to compress the outer loop as well, since it determines the ranges for the inner loop. Unless there is some fancy slice notation, that allows you to loop over a triangular part of a matrix, ie. x[i] = sum(A[i+1,i]) meaning x[i] = sum of elements in i'th row of A using only elements from position i+1 up to n. Of course, there is the possibility of hardcoding this in C and then make it available as a Python module. But I don't want to do this before I am sure there isn't a numpy way out this. Let me know, if you have any suggestions. // Mads From oliphant.travis at ieee.org Tue Feb 21 10:58:04 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue Feb 21 10:58:04 2006 Subject: [Numpy-discussion] optimizing power() for complex and real cases In-Reply-To: <43FA9B28.6070309@cox.net> References: <43F20BFE.5030100@cox.net> <43F36CB6.5050004@cox.net> <43F3B8A9.3000507@bigpond.net.au> <43F3CA12.4000907@cox.net> <43F62069.80209@cox.net> <43F7C73B.2000806@cox.net> <20060220003714.GB15783@arbutus.physics.mcmaster.ca> <43F938FA.80200@cox.net> <43FA8C9C.2020002@ieee.org> <43FA9B28.6070309@cox.net> Message-ID: <43FB628D.1050305@ieee.org> Tim Hochberg wrote: > Yeah, sort of. I meant that the little helper functions that ufuncs > call, such as DOUBLE_multiply, take the same types of arguments. > However, I just realized that I'm not certain that's true -- I just > assumed it because all the one's I've ever seen do. Also, this isn't > really a problem anyway -- the real problem is the slow conversion of > Python scalars to arrays in ufuncs. Yes, that is true. We have only defined multiplication for same-types. But, I just wanted to clarify that the ufunc machinery is more general than that, because others have been confused in the past. > I took a look at this earlier and it appears that the reason that > conversion of Python scalars are slow is that FromAny trys every other > conversion first. The check for Python scalars looks pretty cheap, so > it seems reasonable to check for them and do the appropriate > conversion early. Do the ufunc's call EnsureArray or FromAny? If the > former it would seem pretty straighforward to just stick another check > in there. Then David's original strategy of optimizing in DOUBLE_pow > should be close to as fast as what I'm doing. Yes, I suspect the biggest slow-downs are the two attribute lookups which allow anything with __array__ or the array interface defined to be used. I think we could special-case Python scalars in that code. -Travis From tim.hochberg at cox.net Tue Feb 21 11:12:04 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Tue Feb 21 11:12:04 2006 Subject: [Numpy-discussion] Rookie problems - Why is C-code much faster? In-Reply-To: References: <633BB073-0A46-416A-96DF-080CF4A6DBDF@stanford.edu> Message-ID: <43FB65D1.5080707@cox.net> Mads Ipsen wrote: >On Tue, 21 Feb 2006, Zachary Pincus wrote: > > > >>Mads, >> >>The game with numpy, just as it is with Matlab or any other >>interpreted numeric environment, is to try push as much of the >>looping down into the C code as you can. This is because, as you now >>know, compiled C can loop much faster than interpreted python. 
>> >>A simple example for averaging 1000 (x,y,z) points: >> >>print data.shape >>(1000, 3) >># bad: explicit for loop in python >>avg = numpy.zeros(3, numpy.float_) >>for i in data: avg += i >>avg /= 1000.0 >> >># good: implicit for loop in C >>avg = numpy.add.reduce(data, axis = 0) >>avg /= 1000.0 >> >>In your case, instead of explicitly looping through each point, why >>not do the calculations in parallel, operating on entire vectors of >>points at one time? Then the looping is "pushed down" into compiled C >>code. Or if you're really lucky, it's pushed all the way down to the >>vector math units on your cpu if you have a good BLAS or whatever >>installed. >> >>Zach Pincus >> >>Program in Biomedical Informatics and Department of Biochemistry >>Stanford University School of Medicine >> >> >> >> >> >>>2. Here is the main loop for finding all possible pair distances, >>> which corresponds to a loop over the upper triangular part of a >>> square matrix >>> >>> # Loop over all particles >>> for i in range(n-1): >>> dx = x[i+1:] - x[i] >>> dy = y[i+1:] - y[i] >>> >>> dx -= box*rint(dx/box) >>> dy -= box*rint(dy/box) >>> >>> r2 = dx**2 + dy**2 # square of dist. between points >>> >>>where x and y contain the positions of the particles. A naive >>>implementation in C is >>> >>> >>> // loop over all particles >>> for (int i=0; i>> for (int j=i+1; j>> dx = x[j] - x[i]; >>> dy = y[j] - y[i]; >>> >>> dx -= box*rint(dx/box); >>> dy -= box*rint(dy/box); >>> >>> r2 = dx*dx + dy*dy; >>> } >>> } >>> >>>For n = 2500 particles, i.e. 3123750 particle pairs, the C loop is >>>app. 10 times faster than the Python/Numeric counterpart. This is of >>>course not satisfactory. >>> >>>Are there any things I am doing completely wrong here, basic >>>approaches completely misunderstood, misuses etc? >>> >>>Any suggestions, guidelines, hints are most welcome. >>> >>>Best regards, >>> >>>Mads Ipsen >>> >>> >>>+---------------------------------+-------------------------+ >>>| Mads Ipsen | | >>>| Dept. of Chemistry | phone: +45-35320220 | >>>| H.C.?rsted Institute | fax: +45-35320322 | >>>| Universitetsparken 5 | | >>>| DK-2100 Copenhagen ?, Denmark | mpi at osc.kiku.dk | >>>+---------------------------------+-------------------------+ >>> >>> >>>------------------------------------------------------- >>>This SF.net email is sponsored by: Splunk Inc. Do you grep through >>>log files >>>for problems? Stop! Download the new AJAX search engine that makes >>>searching your log files as easy as surfing the web. DOWNLOAD >>>SPLUNK! >>>http://sel.as-us.falkag.net/sel?cmd=lnk&kid3432&bid#0486&dat1642 >>>_______________________________________________ >>>Numpy-discussion mailing list >>>Nump >>> >>> > >I agree completely with your comments. But, as you can, the innermost part >of the loop has been removed in the code, and replaced with numpy slices. >It's hard for me to see how to compress the outer loop as well, since it >determines the ranges for the inner loop. Unless there is some fancy slice >notation, that allows you to loop over a triangular part of a matrix, ie. > > x[i] = sum(A[i+1,i]) > >meaning x[i] = sum of elements in i'th row of A using only elements from >position i+1 up to n. > >Of course, there is the possibility of hardcoding this in C and then make >it available as a Python module. But I don't want to do this before I am >sure there isn't a numpy way out this. > >Let me know, if you have any suggestions. > Can you explain a little more about what you are trying to calculate? 
The bit about subtracting off box*rint(dx/box) is a little odd. It almost seems like you should be able to do something with fmod, but I admit that I'm not sure how. If I had to guess as to source of the relative slowness I'd say it's because you are creating a lot of temporary matrices. There are ways to avoid this, but when taken to the extreme, they make your code look ugly. You might try the following, untested, code or some variation and see if it speeds things up. This makes extensive use of the little known optional destination argument for ufuncs. I only tend to do this sort of stuff where it's very critical since, as you can see, it makes things quite ugly. dx_space = x.copy() dy_space = y.copy() scratch_space = x.copy() for i in range(n-1): dx = dx_space[i+1:] dy = dy_space[i+1:] scratch = scratch_space[i+1:] subtract(x[i+1:], x[i], dx) subtract(y[i+1:], y[i], dy) # dx -= box*rint(dx/box) divide(dx, box, scratch) rint(scratch, scratch) scratch *= box dx -= scratch # dy -= box*rint(dy/box) divide(dy, box, scratch) rint(scratch, scratch) scratch *= box dy -= scratch r2 = dx**2 + dy**2 # square of dist. between points Hope that helps: -tim From oliphant.travis at ieee.org Tue Feb 21 11:13:02 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue Feb 21 11:13:02 2006 Subject: [Numpy-discussion] mixing arrays and matrices: squeeze yes, flattened no? In-Reply-To: <43FB0A96.10803@gmx.net> References: <43FB0A96.10803@gmx.net> Message-ID: <43FB6604.6000406@ieee.org> Sven Schreiber wrote: >Hi, sometimes I'm still struggling with peculiarities of numpy-arrays >vs. numpy-matrices; my latest story goes like this: > >I first slice out a column of a 2d-numpy-array (a = somearray[:,1]). I >can just manage to understand the resulting shape ( == (112,) ). > >Then I slice a column from a numpy-matrix b = somematrix[:,1] and get >the expected (112,1) shape. > >Then I do what I thought was the easiest thing in the world, I subtract >the two vectors: c = a - b >I was very surprised by the bug that showed up due to the fact that >c.shape == (112,112) !! > > As you know this isn't a bug, but very expected behavior. I don't see this changing any time soon. Arrays are different than matrices. Matrices are always 2-d arrays while arrays can have any number of dimensions. The default relationship between arrays and matrices is that 1-d arrays get converted to row-matrices (1,N). Regardless of which convention is chosen somebody will be bitten by that conversion if they think in terms of the other default. I don't see a way around that except to be careful when you mix arrays and matrices. >Next, I try to workaround by b.squeeze(). That seems to work, but why is >b.squeeze().shape == (1, 112) instead of (112,)? > > Again the same reason as before. A matrix is returned from b.squeeze() and there are no 1-d matrices. Thus, you get a row-vector. Use .T if you want a column vector. >Then I thought maybe b.flattened() does the job, but then I get an error >(matrix has no attr flattened). Again, I'm baffled. > > The correct spelling is b.flatten() And again you are going to get a (1,N) matrix out because of how 1d arrays are interpreted as matrices. In short, there is no way to get a 1-d matrix because that doesn't make sense. You can get a 1-d array using b.A.squeeze() >Second (preliminary) conclusion: I will paranoically use even more >asmatrix()-conversions in my code to avoid dealing with those >array-beasts ;-) and get column vectors I can trust... 
> > > >Is there a better general advice than to say: "numpy-matrices and >numpy-arrays are best kept in separated worlds" ? > > You can mix arrays and matrices just fine if you remember that 1d arrays are equivalent to row-vectors. -Travis From skip at pobox.com Tue Feb 21 12:20:05 2006 From: skip at pobox.com (skip at pobox.com) Date: Tue Feb 21 12:20:05 2006 Subject: [Numpy-discussion] Problems building numpy w/ ATLAS on Solaris 8 Message-ID: <17403.30180.676655.892180@montanaro.dyndns.org> After a brief hiatus I'm back to trying to build numpy. Last time I checked in (on the scipy list) I had successfully built ATLAS and created this simple site.cfg file in .../numpy/distutils/site.cfg: [atlas] library_dirs = /home/titan/skipm/src/ATLAS/lib/SunOS_Babe include_dirs = /home/titan/skipm/src/ATLAS/include/SunOS_Babe # for overriding the names of the atlas libraries atlas_libs = lapack, f77blas, cblas, atlas I svn up'd (now at rev 2138), zapped my build directory, then executed "python setup.py build". Just in case it matters, I'm using Python 2.4.2 built with GCC 3.4.1 on Solaris 8. Here's the output of my build attempt: Running from numpy source directory. No module named __svn_version__ F2PY Version 2_2138 blas_opt_info: blas_mkl_info: /home/ink/skipm/src/numpy/numpy/distutils/system_info.py:531: UserWarning: Library error: libs=['mkl', 'vml', 'guide'] found_libs=[] warnings.warn("Library error: libs=%s found_libs=%s" % \ NOT AVAILABLE atlas_blas_threads_info: Setting PTATLAS=ATLAS /home/ink/skipm/src/numpy/numpy/distutils/system_info.py:531: UserWarning: Library error: libs=['lapack', 'f77blas', 'cblas', 'atlas'] found_libs=[] warnings.warn("Library error: libs=%s found_libs=%s" % \ Setting PTATLAS=ATLAS Setting PTATLAS=ATLAS FOUND: libraries = ['lapack', 'f77blas', 'cblas', 'atlas'] library_dirs = ['/home/titan/skipm/src/ATLAS/lib/SunOS_Babe'] language = c include_dirs = ['/opt/include'] ... See my site.cfg file? Why does it affect library_dirs but not include_dirs? running build_src building extension "atlas_version" sources adding 'build/src/atlas_version_0x33c6fa32.c' to sources. running build_ext customize UnixCCompiler customize UnixCCompiler using build_ext building 'atlas_version' extension compiling C sources gcc options: '-fno-strict-aliasing -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC' compile options: '-I/opt/include -Inumpy/core/include -I/opt/app/g++lib6/python-2.4/include/python2.4 -c' /opt/lang/gcc-3.4/bin/gcc -shared build/temp.solaris-2.8-i86pc-2.4/build/src/atlas_version_0x33c6fa32.o -L/home/titan/skipm/src/ATLAS/lib/SunOS_Babe -llapack -lf77blas -lcblas -latlas -o build/temp.solaris-2.8-i86pc-2.4/atlas_version.so Text relocation remains referenced against symbol offset in file 0x7 /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o) 0xc /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o) ... bunch of missing s elided ... printf 0x1b /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o) printf 0x2d /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o) printf 0x3f /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o) printf 0x51 /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o) ... what's this? can't find printf??? ... 
ld: fatal: relocations remain against allocatable but non-writable sections collect2: ld returned 1 exit status Text relocation remains referenced against symbol offset in file 0x7 /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o) ... more eliding ... printf 0x108 /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o) ld: fatal: relocations remain against allocatable but non-writable sections collect2: ld returned 1 exit status ##### msg: error: Command "/opt/lang/gcc-3.4/bin/gcc -shared build/temp.solaris-2.8-i86pc-2.4/build/src/atlas_version_0x33c6fa32.o -L/home/titan/skipm/src/ATLAS/lib/SunOS_Babe -llapack -lf77blas -lcblas -latlas -o build/temp.solaris-2.8-i86pc-2.4/atlas_version.so" failed with exit status 1 error: Command "/opt/lang/gcc-3.4/bin/gcc -shared build/temp.solaris-2.8-i86pc-2.4/build/src/atlas_version_0x33c6fa32.o -L/home/titan/skipm/src/ATLAS/lib/SunOS_Babe -llapack -lf77blas -lcblas -latlas -o build/temp.solaris-2.8-i86pc-2.4/atlas_version.so" failed with exit status 1 FOUND: libraries = ['lapack', 'f77blas', 'cblas', 'atlas'] library_dirs = ['/home/titan/skipm/src/ATLAS/lib/SunOS_Babe'] language = c define_macros = [('NO_ATLAS_INFO', 2)] include_dirs = ['/opt/include'] Warning: distutils distribution has been initialized, it may be too late to add an extension _dotblas ... How can I initialize things earlier? Does it matter? Traceback (most recent call last): File "setup.py", line 76, in ? setup_package() File "setup.py", line 63, in setup_package config.add_subpackage('numpy') File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 592, in add_subpackage config_list = self.get_subpackage(subpackage_name,subpackage_path) File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 582, in get_subpackage subpackage_path) File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 539, in _get_configuration_from_setup_py config = setup_module.configuration(*args) File "/home/ink/skipm/src/numpy/numpy/setup.py", line 10, in configuration config.add_subpackage('core') File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 592, in add_subpackage config_list = self.get_subpackage(subpackage_name,subpackage_path) File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 582, in get_subpackage subpackage_path) File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 539, in _get_configuration_from_setup_py config = setup_module.configuration(*args) File "numpy/core/setup.py", line 215, in configuration config.add_data_dir('tests') File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 636, in add_data_dir self.add_data_files((ds,filenames)) File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 702, in add_data_files dist.data_files.extend(data_dict.items()) AttributeError: 'NoneType' object has no attribute 'extend' And finally, a traceback. What's up with that? In parallel with trying to build with ATLAS I'm also trying Travis's suggestion of explicitly setting PTATLAS, ATLAS and BLAS to "None". Numpy builds when I do that. -- Skip Montanaro - skip at pobox.com "The values to which people cling most stubbornly under inappropriate conditions are those values that were previously the source of their greatest triumphs over adversity." 
-- Jared Diamond in "Collapse" From Chris.Barker at noaa.gov Tue Feb 21 12:49:02 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue Feb 21 12:49:02 2006 Subject: [Numpy-discussion] mixing arrays and matrices: squeeze yes, flattened no? In-Reply-To: <43FB6604.6000406@ieee.org> References: <43FB0A96.10803@gmx.net> <43FB6604.6000406@ieee.org> Message-ID: <43FB7C87.1010007@noaa.gov> Travis Oliphant wrote: > You can mix arrays and matrices just fine if you remember that 1d arrays > are equivalent to row-vectors. and you can easily get a column vector out of an array, if you remember that you want to keep it 2-d. i.e. use a slice rather than an index: >>> import numpy as N >>> a = N.ones((5,10)) >>> a[:,1].shape # an index: it reduces the rank (5,) >>> a[:,1:2].shape # a slice: it keeps the rank (5, 1) -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From mpi at osc.kiku.dk Tue Feb 21 13:05:05 2006 From: mpi at osc.kiku.dk (Mads Ipsen) Date: Tue Feb 21 13:05:05 2006 Subject: [Numpy-discussion] Rookie problems - Why is C-code much faster? In-Reply-To: <43FB65D1.5080707@cox.net> References: <633BB073-0A46-416A-96DF-080CF4A6DBDF@stanford.edu> <43FB65D1.5080707@cox.net> Message-ID: On Tue, 21 Feb 2006, Tim Hochberg wrote: > Can you explain a little more about what you are trying to calculate? > The bit about subtracting off box*rint(dx/box) is a little odd. It > almost seems like you should be able to do something with fmod, but I > admit that I'm not sure how. > > If I had to guess as to source of the relative slowness I'd say it's > because you are creating a lot of temporary matrices. There are ways to > avoid this, but when taken to the extreme, they make your code look > ugly. You might try the following, untested, code or some variation and > see if it speeds things up. This makes extensive use of the little known > optional destination argument for ufuncs. I only tend to do this sort of > stuff where it's very critical since, as you can see, it makes things > quite ugly. > > dx_space = x.copy() > dy_space = y.copy() > scratch_space = x.copy() > for i in range(n-1): > dx = dx_space[i+1:] > dy = dy_space[i+1:] > scratch = scratch_space[i+1:] > subtract(x[i+1:], x[i], dx) > subtract(y[i+1:], y[i], dy) > # dx -= box*rint(dx/box) > divide(dx, box, scratch) > rint(scratch, scratch) > scratch *= box > dx -= scratch > # dy -= box*rint(dy/box) > divide(dy, box, scratch) > rint(scratch, scratch) > scratch *= box > dy -= scratch > r2 = dx**2 + dy**2 # square of dist. between points > > > > Hope that helps: > > -tim Here's what I am trying to do: My system consists of N particles, whose coordinates in the xy-plane is given by the two vectors x and y. I need to calculate the distance between all particle pairs, which goes like this: I pick particle 1 and calculate its distance to the N-1 other points. Then I pick particle 2. Since its distance to particle 1 was found in the previuos step, I only have to find its distance to the N-2 remaining points. In the i'th step, I therefore only have to consider particle i+1 up to particle N. That explains the loop structure, where dx = x[i+1:] - x[i] dy = y[i+1:] - y[i] the resulting vectors dx and dy will contain the x-distances from x[i] to the proceeding points from i+1 up to N. 
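In code, one sweep of that loop looks roughly like this (a sketch; n and
box are made-up values here, and rint is the nearest-integer rounding
discussed earlier in the thread):

    import numpy

    n, box = 2500, 10.0
    x = box * numpy.random.rand(n)    # particle x-coordinates
    y = box * numpy.random.rand(n)    # particle y-coordinates

    for i in range(n - 1):
        dx = x[i+1:] - x[i]           # separations to particles i+1 .. n-1
        dy = y[i+1:] - y[i]
        # minimum image convention: wrap each separation into [-box/2, box/2]
        dx -= box * numpy.rint(dx / box)
        dy -= box * numpy.rint(dy / box)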
The square of the distance r2 is the given by r2 = dx**2 + dy**2 Another approach would be to use dx = subtract.outer(x,x) dy = subtract.outer(y,y) but that will be overkill, since all distances are counted twice, and also, the storage requirements grow rapidly if you have more than 1000 particles (app. 10^6 particle pairs). Thanks for your code feedback, which I'll have a closer look at. But I try to believe, that numpy/Numeric/Python was invented with the one purpose of avoiding coding like this - I think this is also a point you already made. But thanks again. // Mads From aisaac at american.edu Tue Feb 21 13:43:04 2006 From: aisaac at american.edu (Alan G Isaac) Date: Tue Feb 21 13:43:04 2006 Subject: [Numpy-discussion] Rookie problems - Why is C-code much faster? In-Reply-To: References: <633BB073-0A46-416A-96DF-080CF4A6DBDF@stanford.edu> <43FB65D1.5080707@cox.net> Message-ID: On Tue, 21 Feb 2006, (CET) Mads Ipsen apparently wrote: > My system consists of N particles, whose coordinates in > the xy-plane is given by the two vectors x and y. I need > to calculate the distance between all particle pairs Of possible interest? http://www.cs.umd.edu/~mount/ANN/ Cheers, Alan Isaac From robert.kern at gmail.com Tue Feb 21 14:10:12 2006 From: robert.kern at gmail.com (Robert Kern) Date: Tue Feb 21 14:10:12 2006 Subject: [Numpy-discussion] Re: Problems building numpy w/ ATLAS on Solaris 8 In-Reply-To: <17403.30180.676655.892180@montanaro.dyndns.org> References: <17403.30180.676655.892180@montanaro.dyndns.org> Message-ID: skip at pobox.com wrote: > After a brief hiatus I'm back to trying to build numpy. Last time I checked > in (on the scipy list) I had successfully built ATLAS and created this > simple site.cfg file in .../numpy/distutils/site.cfg: > > [atlas] > library_dirs = /home/titan/skipm/src/ATLAS/lib/SunOS_Babe > include_dirs = /home/titan/skipm/src/ATLAS/include/SunOS_Babe > # for overriding the names of the atlas libraries > atlas_libs = lapack, f77blas, cblas, atlas > > I svn up'd (now at rev 2138), zapped my build directory, then executed > "python setup.py build". Just in case it matters, I'm using Python 2.4.2 > built with GCC 3.4.1 on Solaris 8. Here's the output of my build attempt: > > Running from numpy source directory. > No module named __svn_version__ > F2PY Version 2_2138 > blas_opt_info: > blas_mkl_info: > /home/ink/skipm/src/numpy/numpy/distutils/system_info.py:531: UserWarning: Library error: libs=['mkl', 'vml', 'guide'] found_libs=[] > warnings.warn("Library error: libs=%s found_libs=%s" % \ > NOT AVAILABLE > > atlas_blas_threads_info: > Setting PTATLAS=ATLAS > /home/ink/skipm/src/numpy/numpy/distutils/system_info.py:531: UserWarning: Library error: libs=['lapack', 'f77blas', 'cblas', 'atlas'] found_libs=[] > warnings.warn("Library error: libs=%s found_libs=%s" % \ > Setting PTATLAS=ATLAS > Setting PTATLAS=ATLAS > FOUND: > libraries = ['lapack', 'f77blas', 'cblas', 'atlas'] > library_dirs = ['/home/titan/skipm/src/ATLAS/lib/SunOS_Babe'] > language = c > include_dirs = ['/opt/include'] > ... > > See my site.cfg file? Why does it affect library_dirs but not include_dirs? Probably a bug, but I don't see exactly where at the moment. It shouldn't really affect anything, I don't think. The only header file that comes with ATLAS is cblas.h, and I'm pretty sure that numpy itself doesn't need it. In fact, we provide our own copy in numpy/core/blasdot/ for the parts that can use it. 
> running build_src > building extension "atlas_version" sources > adding 'build/src/atlas_version_0x33c6fa32.c' to sources. > running build_ext > customize UnixCCompiler > customize UnixCCompiler using build_ext > building 'atlas_version' extension > compiling C sources > gcc options: '-fno-strict-aliasing -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC' > compile options: '-I/opt/include -Inumpy/core/include -I/opt/app/g++lib6/python-2.4/include/python2.4 -c' > /opt/lang/gcc-3.4/bin/gcc -shared build/temp.solaris-2.8-i86pc-2.4/build/src/atlas_version_0x33c6fa32.o -L/home/titan/skipm/src/ATLAS/lib/SunOS_Babe -llapack -lf77blas -lcblas -latlas -o build/temp.solaris-2.8-i86pc-2.4/atlas_version.so > Text relocation remains referenced > against symbol offset in file > 0x7 /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o) > 0xc /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o) > ... bunch of missing s elided ... > printf 0x1b /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o) > printf 0x2d /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o) > printf 0x3f /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o) > printf 0x51 /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o) > ... what's this? can't find printf??? ... > ld: fatal: relocations remain against allocatable but non-writable sections > collect2: ld returned 1 exit status Hmm. Was ATLAS compiled -fPIC? I'm afraid I'm a little out of my depth when it comes to linking shared objects on Solaris. > Text relocation remains referenced > against symbol offset in file > 0x7 /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o) > ... more eliding ... > printf 0x108 /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o) > ld: fatal: relocations remain against allocatable but non-writable sections > collect2: ld returned 1 exit status > ##### msg: error: Command "/opt/lang/gcc-3.4/bin/gcc -shared build/temp.solaris-2.8-i86pc-2.4/build/src/atlas_version_0x33c6fa32.o -L/home/titan/skipm/src/ATLAS/lib/SunOS_Babe -llapack -lf77blas -lcblas -latlas -o build/temp.solaris-2.8-i86pc-2.4/atlas_version.so" failed with exit status 1 > error: Command "/opt/lang/gcc-3.4/bin/gcc -shared build/temp.solaris-2.8-i86pc-2.4/build/src/atlas_version_0x33c6fa32.o -L/home/titan/skipm/src/ATLAS/lib/SunOS_Babe -llapack -lf77blas -lcblas -latlas -o build/temp.solaris-2.8-i86pc-2.4/atlas_version.so" failed with exit status 1 > FOUND: > libraries = ['lapack', 'f77blas', 'cblas', 'atlas'] > library_dirs = ['/home/titan/skipm/src/ATLAS/lib/SunOS_Babe'] > language = c > define_macros = [('NO_ATLAS_INFO', 2)] > include_dirs = ['/opt/include'] > > Warning: distutils distribution has been initialized, it may be too late to add an extension _dotblas > ... > > How can I initialize things earlier? Does it matter? You get messages like this when something previous goes wrong. There's nothing you can do to initialize things earlier except to make sure that the previous steps don't fail. It's not the most informative error message, I know. > Traceback (most recent call last): > File "setup.py", line 76, in ? 
> setup_package() > File "setup.py", line 63, in setup_package > config.add_subpackage('numpy') > File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 592, in add_subpackage > config_list = self.get_subpackage(subpackage_name,subpackage_path) > File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 582, in get_subpackage > subpackage_path) > File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 539, in _get_configuration_from_setup_py > config = setup_module.configuration(*args) > File "/home/ink/skipm/src/numpy/numpy/setup.py", line 10, in configuration > config.add_subpackage('core') > File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 592, in add_subpackage > config_list = self.get_subpackage(subpackage_name,subpackage_path) > File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 582, in get_subpackage > subpackage_path) > File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 539, in _get_configuration_from_setup_py > config = setup_module.configuration(*args) > File "numpy/core/setup.py", line 215, in configuration > config.add_data_dir('tests') > File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 636, in add_data_dir > self.add_data_files((ds,filenames)) > File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 702, in add_data_files > dist.data_files.extend(data_dict.items()) > AttributeError: 'NoneType' object has no attribute 'extend' > > And finally, a traceback. What's up with that? Essentially, the same issue here. Since an earlier step failed, dist.data_files is still None. -- Robert Kern robert.kern at gmail.com "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From alexander.belopolsky at gmail.com Tue Feb 21 14:24:05 2006 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue Feb 21 14:24:05 2006 Subject: [Numpy-discussion] Rookie problems - Why is C-code much faster? In-Reply-To: References: Message-ID: It turns out that around (or round_) is implemented in python: def round_(a, decimals=0): """Round 'a' to the given number of decimal places. Rounding behaviour is equivalent to Python. Return 'a' if the array is not floating point. Round both the real and imaginary parts separately if the array is complex. """ a = asarray(a) if not issubclass(a.dtype.type, _nx.inexact): return a if issubclass(a.dtype.type, _nx.complexfloating): return round_(a.real, decimals) + 1j*round_(a.imag, decimals) if decimals is not 0: decimals = asarray(decimals) s = sign(a) if decimals is not 0: a = absolute(multiply(a, 10.**decimals)) else: a = absolute(a) rem = a-asarray(a).astype(_nx.intp) a = _nx.where(_nx.less(rem, 0.5), _nx.floor(a), _nx.ceil(a)) # convert back if decimals is not 0: return multiply(a, s/(10.**decimals)) else: return multiply(a, s) I see many ways to improve the performance here. First, there is no need to check for "decimals is not 0" three times. This can be done once, maybe at the expense of some code duplication. Second, _nx.where(_nx.less(rem, 0.5), _nx.floor(a), _nx.ceil(a)) seems to be equivalent to _nx.floor(a+0.5). Finally, if rint is implemented as a ufunc as Mads originally suggested, "decimals is 0" branch can just call that. It is tempting to rewrite the whole thing in C, but before I do that I have a few questions about current implementation. 1. It is implemented in oldnumeric.py . Does this mean it is deprecated. If so, what is the recommended replacement? 2. 
Was it intended to support array and fractional values for decimals or is it an implementation artifact. Currently: >>> around(array([1.2345]*5),[1,2,3,4,5]) array([ 1.2 , 1.23 , 1.235 , 1.2345, 1.2345]) >>> around(1.2345,2.5) array(1.2332882874656679) 3. It does nothing to exact types, even if decimals<0 >>> around(1234, -2) array(1234) Is this a bug? Consider that >>> round(1234, -2) 1200.0 and >>> around(1234., -2) array(1200.0) Docstring is self-contradictory: "Rounding behaviour is equivalent to Python" is not consistent with "Return 'a' if the array is not floating point." I propose to deprecate around and implement a new "round" member function in C that will only accept scalar "decimals" and will behave like a properly vectorized builtin round. I will do the coding if there is interest. In any case, something has to be done here. I don't think the following timings are acceptable: > python -m timeit -s "from numpy import array; x = array([1.5]*1000)" "(x+0.5).astype(int).astype(float)" 100000 loops, best of 3: 18.8 usec per loop > python -m timeit -s "from numpy import array, around; x = array([1.5]*1000)" "around(x)" 10000 loops, best of 3: 155 usec per loop On 2/21/06, Sasha wrote: > On the second thought, the difference between around and astype is not > surprising because around operates in terms of decimals. Rather than > adding rint, I would suggest to make a special case decimals=0 use C > rint. > > > On 2/21/06, Mads Ipsen wrote: > > > I suggest that rint() is added as a ufunc or is there any concerns > > > here that I am not aware of? > > > > You might want to use astype(int). On my system it is much faster than around: > > > > > python -m timeit -s "from numpy import array, around; x = array([1.5]*1000)" "around(x)" > > 10000 loops, best of 3: 176 usec per loop > > > python -m timeit -s "from numpy import array, around; x = array([1.5]*1000)" "x.astype(int)" > > 100000 loops, best of 3: 3.2 usec per loop > > > > the difference is too big to be explained by the fact that around > > allocates twice as much memory for the result. In fact the following > > equivalent of rint is still very fast: > > > > > python -m timeit -s "from numpy import array, around; x = array([1.5]*1000)" "x.astype(int).astype(float)" > > 100000 loops, best of 3: 6.48 usec per loop > > > From tim.hochberg at cox.net Tue Feb 21 14:24:07 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Tue Feb 21 14:24:07 2006 Subject: [Numpy-discussion] Rookie problems - Why is C-code much faster? In-Reply-To: References: <633BB073-0A46-416A-96DF-080CF4A6DBDF@stanford.edu> <43FB65D1.5080707@cox.net> <43FB8933.1080400@cox.net> Message-ID: <43FB92D1.4020802@cox.net> Mads Ipsen wrote: >On Tue, 21 Feb 2006, Tim Hochberg wrote: > > > >>This all makes perfect sense, but what happended to box? In your >>original code there was a step where you did some mumbo jumbo and box >>and rint. Namely: >> >> > >It's a minor detail, but the reason for this is the following > >Suppose you have a line with length of box = 10 with periodic boundary >conditions (basically this is a circle). Now consider two points x0 = 1 >and x1 = 9 on this line. The shortest distance dx between the points x0 >and x1 is dx = -2 and not 8. The calculation > > dx = x1 - x0 ( = +8) > dx -= box*rint(dx/box) ( = -2) > >will give you the desired result, namely dx = -2. Hope this makes better >sense. Note that fmod() won't work since > > fmod(dx,box) = 8 > > I think you could use some variation like "fmod(dx+box/2, box) - box/2" but rint seems better. 
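Something along these lines, although the variant needs a floored modulo
to handle negative separations, so numpy's mod is the right tool rather
than C-style fmod (a quick sketch with a made-up box size):

    import numpy

    box = 10.0
    dx = numpy.array([8.0, -8.0, 3.0])

    # the rint form from the thread: maps each value into [-box/2, box/2]
    wrap_rint = dx - box * numpy.rint(dx / box)             # [-2.,  2.,  3.]

    # the modulo form; numpy.mod is floored (the result takes the sign
    # of box), which is what makes this work -- C-style fmod keeps the
    # sign of dx and would give the wrong answer for dx < 0
    wrap_mod = numpy.mod(dx + 0.5 * box, box) - 0.5 * box   # [-2.,  2.,  3.]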
>Part of my original post was concerned with the fact, that I initially was >using around() from numpy for this step. This was terribly slow, so I made >some custom changes and added rint() from the C-math library to the numpy >module, giving a speedup factor about 4 for this particular line in the >code. > >Best regards // Mads > > > OK, that all makes sense. You might want to try the following, which factors out all the divisions and half the multiplies by box and produces several fewer temporaries. Note I replaced x**2 with x*x, which for the moment is much faster (I don't know if you've been following the endless yacking about optimizing x**n, but x**2 will get fast eventually). Depending on what you're doing with r2, you may be able to avoid the last multiple by box as well. # Loop over all particles xbox = x/box ybox = y/box for i in range(n-1): dx = xbox[i+1:] - xbox[i] dy = ybox[i+1:] - ybox[i] dx -= rint(dx) dy -= rint(dy) r2 = (dx*dx + dy*dy) r2 *= box Regards, -tim From gruben at bigpond.net.au Tue Feb 21 15:18:02 2006 From: gruben at bigpond.net.au (Gary Ruben) Date: Tue Feb 21 15:18:02 2006 Subject: [Numpy-discussion] wiki page for record arrays In-Reply-To: <20060221155513.GC14470@alpha> References: <20060221122737.GA14470@alpha> <43FB2298.2080003@bigpond.net.au> <20060221155513.GC14470@alpha> Message-ID: <43FB9F90.90606@bigpond.net.au> Thanks St?fan, I find this much better now. However, I'd like to hear suggestions from others if they can think of ways of further improving the style since I see this as a template for future tutorials. I'll just note that ipython on my windows systems doesn't do the syntax colouring the same, so if I was to make a similarly styled tutorial, there would be some variation in colouring. I also think others would be likely to use the default >>> Python prompt. I don't think this minor variation in styles would detract from getting the information across, so I wouldn't advocate trying to lock authors into any particular style. Good work, Gary Stefan van der Walt wrote: > Hi Gary > > Thanks for your suggestions. I incorporated them. > > St?fan From gruben at bigpond.net.au Tue Feb 21 15:34:01 2006 From: gruben at bigpond.net.au (Gary Ruben) Date: Tue Feb 21 15:34:01 2006 Subject: [Numpy-discussion] Rookie problems - Why is C-code much faster? In-Reply-To: References: <633BB073-0A46-416A-96DF-080CF4A6DBDF@stanford.edu> <43FB65D1.5080707@cox.net> Message-ID: <43FBA352.3030400@bigpond.net.au> Something like this would be great to see in scipy. Pity about the licence. Gary R. Alan G Isaac wrote: > On Tue, 21 Feb 2006, (CET) Mads Ipsen apparently wrote: >> My system consists of N particles, whose coordinates in >> the xy-plane is given by the two vectors x and y. I need >> to calculate the distance between all particle pairs > > Of possible interest? > http://www.cs.umd.edu/~mount/ANN/ > > Cheers, > Alan Isaac From cookedm at physics.mcmaster.ca Tue Feb 21 15:43:04 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Tue Feb 21 15:43:04 2006 Subject: [Numpy-discussion] optimizing power() for complex and real cases In-Reply-To: <43F938FA.80200@cox.net> (Tim Hochberg's message of "Sun, 19 Feb 2006 20:35:22 -0700") References: <43F20BFE.5030100@cox.net> <43F36CB6.5050004@cox.net> <43F3B8A9.3000507@bigpond.net.au> <43F3CA12.4000907@cox.net> <43F62069.80209@cox.net> <43F7C73B.2000806@cox.net> <20060220003714.GB15783@arbutus.physics.mcmaster.ca> <43F938FA.80200@cox.net> Message-ID: Tim Hochberg writes: > David M. 
Cooke wrote: > >>On Sat, Feb 18, 2006 at 06:17:47PM -0700, Tim Hochberg wrote: >> >> >>>OK, I now have a faily clean implementation in C of: >>> >>>def __pow__(self, p): >>> if p is not a scalar: >>> return power(self, p) >>> elif p == 1: >>> return p >>> elif p == 2: >>> return square(self) >>># elif p == 3: >>># return cube(self) >>># elif p == 4: >>># return power_4(self) >>># elif p == 0: >>># return ones(self.shape, dtype=self.dtype) >>># elif p == -1: >>># return 1.0/self >>> elif p == 0.5: >>> return sqrt(self) I've gone through your code you checked in, and fixed it up. Looks good. One side effect is that def zl(x): a = ones_like(x) a[:] = 0 return a is now faster than zeros_like(x) :-) One problem I had is that in PyArray_SetNumericOps, the "copy" method wasn't picked up on. It may be due to the order of initialization of the ndarray type, or something (since "copy" isn't a ufunc, it's initialized in a different place). I couldn't figure out how to fiddle that, so I replaced the x.copy() call with a call to PyArray_Copy(). >>Yes; because it's the implementation of __pow__, the second argument can >>be anything. >> >> > No, you misunderstand.. What I was talking about was that the *first* > argument can also be something that's not a PyArrayObject, despite the > functions signature. Ah, I suppose that's because the power slot in the number protocol also handles __rpow__. >>> On the other hand, real powers are fast enough that doing anything >>> at the single element level is unlikely to help. So in that case >>> we're left with either optimizing the cases where the dimension is >>> zero as David has done, or optimizing at the __pow__ (AKA >>> array_power) level as I've done now based on David's original >>> suggestion. This second approach is faster because it avoids the >>> mysterious python scalar -> zero-D array conversion overhead. >>> However, it suffers if we want to optimize lots of different powers >>> since one needs a ufunc for each one. So the question becomes, >>> which powers should we optimize? >> >>Hmm, ufuncs are passed a void* argument for passing info to them. Now, >>what that argument is defined when the ufunc is created, but maybe >>there's a way to piggy-back on it. >> >> > Yeah, I really felt like I was fighting the ufuncs when I was playing > with this. On the one hand, you really want to use the ufunc > machinery. On the other hand that forces you into using the same types > for both arguments. That really wouldn't be a problem, since we could > just define an integer_power that took doubles, but did integer > powers, except for the conversion overhead of Python_Integers into > arrays. It looks like you started down this road and I played with > this as well. I can think a of at least one (horrible) way around > the matrix overhead, but the real fix would be to dig into > PyArray_EnsureArray and see why it's slow for Python_Ints. It is much > faster for numarray scalars. Right; that needs to be looked at. > Another approach is to actually compute (x*x)*(x*x) for pow(x,4) at > the level of array_power. I think I could make this work. It would > probably work well for medium size arrays, but might well make things > worse for large arrays that are limited by memory bandwidth since it > would need to move the array from memory into the cache multiple times. I don't like that; I think it would be better memory-wise to do it elementwise. Not just speed, but size of intermediate arrays. 
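For reference, elementwise repeated squaring for a scalar integer
exponent looks like this when sketched in Python (int_power is an
illustrative name; the real version would live in the C loop):

    import numpy

    def int_power(x, n):
        # elementwise x**n for integer n >= 0 via binary square-and-multiply:
        # O(log n) multiplies instead of n-1, one scratch array for the base
        result = numpy.ones_like(x)
        base = x.copy()
        while n > 0:
            if n & 1:
                result *= base
            base *= base      # final squaring is unused but harmless
            n >>= 1
        return result

Negative exponents would go through the same routine followed by a
reciprocal.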
>>> My latest thinking on this is that we should optimize only those >>> cases where the optimized result is no less accurate than that >>> produced by pow. I'm going to assume that all C operations are >>> equivalently accurate, so pow(x,2) has roughly the same amount of >>> error as x*x. (Something on the order of 0.5 ULP I'd guess). In >>> that case: >>> pow(x, -1) -> 1 / x >>> pow(x, 0) -> 1 >>> pow(x, 0.5) -> sqrt(x) >>> pow(x, 1) -> x >>> pow(x, 2) -> x*x >>> can all be implemented in terms of multiply or divide with the same >>> accuracy as the original power methods. Once we get beyond these, >>> the error will go up progressively. >>> >>> The minimal set described above seems like it should be relatively >>> uncontroversial and it's what I favor. Once we get beyond this >>> basic set, we would need to reach some sort of consensus on how >>> much additional error we are willing to tolerate for optimizing >>> these extra cases. You'll notice that I've changed my mind, yet >>> again, over whether to optimize A**0.5. Since the set of additional >>> ufuncs needed in this case is relatively small, just square and >>> inverse (==1/x), this minimal set works well if optimizing in pow >>> as I've done. >>> >>> > > Just to add a little more confusion to the mix. I did a little testing > to see how close pow(x,n) and x*x*... actually are. They are slightly > less close for small values of N and slightly closer for large values > of N than I would have expected. The upshot of this is that integer > powers between -2 and +4 all seem to vary by the same amount when > computed using pow(x,n) versus multiplies. I'm including the test code > at the end. Assuming that this result is not a fluke that expands the > noncontroversial set by at least 3 more values. That's starting to > strain the ufunc aproach, so perhaps optimizing in @TYP at _power is the > way to go after all. Or, more likely, adding @TYP at _int_power or maybe > @TYP at _fast_power (so as to be able to include some half integer > powers) and dispatching appropriately from array_power. 'int_power' we could do; that would the next step I think. The half integer powers we could maybe leave; if you want x**(-3/2), for instance, you could do y = x**(-1)*sqrt(x) (or do y = x**(-1); sqrt(y,y) if you're worried about temporaries). Or, 'fast_power' could be documented as doing the optimizations for integer and half-integer _scalar_ exponents, up to a certain size, like 100), and falling back on pow() if necessary. I think we could do a precomputation step to split the exponent into appropiate squarings and such that'll make the elementwise loop faster. Half-integer exponents are exactly representable as doubles (up to some number of course), so there's no chance of decimal-to-binary conversions making things look different. That might work out ok. Although, at that point I'd suggest we make it 'power', and have 'rawpower' (or ????) as the version that just uses pow(). Another point is to look at __div__, and use reciprocal if the dividend is 1. > The problem here, of course, is the overhead that PyArray_EnsureArray > runs into. I'm not sure if the ufuncs actually call that, but I was > using that to convert things to arrays at one point and I saw the > slowdown, so I suspect that the slowdown is in something > PyArray_EnsureArray calls if not in that routine itself. I'm afraid to > dig into that stuff though.. On the other hand, it would probably > speed up all kinds of stuff if that was sped up. 
I've added a page to the developer's wiki at http://projects.scipy.org/scipy/numpy/wiki/PossibleOptimizationAreas to keep a list of areas like that to look into if someone has time :-) -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From svetosch at gmx.net Tue Feb 21 15:49:02 2006 From: svetosch at gmx.net (Sven Schreiber) Date: Tue Feb 21 15:49:02 2006 Subject: [Numpy-discussion] mixing arrays and matrices: squeeze yes, flattened no? In-Reply-To: <43FB6604.6000406@ieee.org> References: <43FB0A96.10803@gmx.net> <43FB6604.6000406@ieee.org> Message-ID: <43FBA6C6.6040507@gmx.net> Travis Oliphant schrieb: > Sven Schreiber wrote: > >> Next, I try to workaround by b.squeeze(). That seems to work, but why is >> b.squeeze().shape == (1, 112) instead of (112,)? >> >> > Again the same reason as before. A matrix is returned from b.squeeze() > and there are no 1-d matrices. Thus, you get a row-vector. Use .T if > you want a column vector. Well if squeeze can't really squeeze matrix-vectors (doing a de-facto transpose instead), wouldn't it make more sense to disable the squeeze method for matrices altogether? > >> Then I thought maybe b.flattened() does the job, but then I get an error >> (matrix has no attr flattened). Again, I'm baffled. >> >> > The correct spelling is b.flatten() Ok, but I copied .flattened() from p. 48 of your book, must be a typo then. > You can mix arrays and matrices just fine if you remember that 1d arrays > are equivalent to row-vectors. > -Travis > Ok, thanks. Btw, did the recent numpy release change anything in terms of preserving matrix types when passing to decompositions etc? I checked the release notes but maybe they're just not verbose enough. -Sven From svetosch at gmx.net Tue Feb 21 15:57:00 2006 From: svetosch at gmx.net (Sven Schreiber) Date: Tue Feb 21 15:57:00 2006 Subject: [Numpy-discussion] mixing arrays and matrices: squeeze yes, flattened no? In-Reply-To: <43FB7C87.1010007@noaa.gov> References: <43FB0A96.10803@gmx.net> <43FB6604.6000406@ieee.org> <43FB7C87.1010007@noaa.gov> Message-ID: <43FBA8B2.8010708@gmx.net> Christopher Barker schrieb: > > and you can easily get a column vector out of an array, if you remember > that you want to keep it 2-d. i.e. use a slice rather than an index: > >>>> import numpy as N >>>> a = N.ones((5,10)) >>>> a[:,1].shape # an index: it reduces the rank > (5,) >>>> a[:,1:2].shape # a slice: it keeps the rank > (5, 1) > That's very interesting, thanks. But I find it a little unintuitive/surprising, so I'm not sure if I will use it. I fear that I wouldn't understand my own code after a while of not working on it. I guess I'd rather follow the advice and just remember to treat 1d as a row. But thanks alot, sven From Chris.Barker at noaa.gov Tue Feb 21 16:47:01 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue Feb 21 16:47:01 2006 Subject: [Numpy-discussion] Rookie problems - Why is C-code much faster? In-Reply-To: <43FB92D1.4020802@cox.net> References: <633BB073-0A46-416A-96DF-080CF4A6DBDF@stanford.edu> <43FB65D1.5080707@cox.net> <43FB8933.1080400@cox.net> <43FB92D1.4020802@cox.net> Message-ID: <43FBB444.1040504@noaa.gov> > r2 = (dx*dx + dy*dy) Might numpy.hypot() help here? -Chris -- Christopher Barker, Ph.D. 
Oceanographer
NOAA/OR&R/HAZMAT (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker at noaa.gov

From Chris.Barker at noaa.gov Tue Feb 21 16:56:01 2006
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Tue Feb 21 16:56:01 2006
Subject: [Numpy-discussion] mixing arrays and matrices: squeeze yes, flattened no?
In-Reply-To: <43FBA8B2.8010708@gmx.net>
References: <43FB0A96.10803@gmx.net> <43FB6604.6000406@ieee.org> <43FB7C87.1010007@noaa.gov> <43FBA8B2.8010708@gmx.net>
Message-ID: <43FBB65C.4040001@noaa.gov>

Sven Schreiber wrote:
>>>>> a = N.ones((5,10))
>>>>> a[:,1].shape # an index: it reduces the rank
>> (5,)
>>>>> a[:,1:2].shape # a slice: it keeps the rank
>> (5, 1)
>>
> That's very interesting, thanks. But I find it a little
> unintuitive/surprising, so I'm not sure if I will use it. I fear that I
> wouldn't understand my own code after a while of not working on it.

Well, what's surprising to different people is different. However....

> I guess I'd rather follow the advice and just remember to treat 1d as a row.

Except that it's not, universally. For instance, it won't transpose:

>>> a = N.ones((5,))
>>> a.transpose()
array([1, 1, 1, 1, 1])
>>> a.shape = (1,-1)
>>> a
array([[1, 1, 1, 1, 1]])
>>> a.transpose()
array([[1],
       [1],
       [1],
       [1],
       [1]])

so while a rank-1 array is often treated like a row vector, it really isn't the same. The concept of a row vs a column vector is a rank-2 array concept -- so keep your arrays rank-2. It's very helpful to remember that indexing reduces rank, and slicing keeps the rank the same. It will serve you well to use that in the future anyway.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer
NOAA/OR&R/HAZMAT (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker at noaa.gov

From ndarray at mac.com Tue Feb 21 17:06:08 2006
From: ndarray at mac.com (Sasha)
Date: Tue Feb 21 17:06:08 2006
Subject: [Numpy-discussion] A proposal to implement round in C Was: Rookie problems - Why is C-code much faster?
Message-ID:

[I am reposting this under a different subject because my original post got buried in a long thread that went on to discussing unrelated topics. Sorry if you had to read this post twice.]

It turns out that around (or round_) is implemented in python:

def round_(a, decimals=0):
    """Round 'a' to the given number of decimal places. Rounding
    behaviour is equivalent to Python.

    Return 'a' if the array is not floating point. Round both the real
    and imaginary parts separately if the array is complex.
    """
    a = asarray(a)
    if not issubclass(a.dtype.type, _nx.inexact):
        return a
    if issubclass(a.dtype.type, _nx.complexfloating):
        return round_(a.real, decimals) + 1j*round_(a.imag, decimals)
    if decimals is not 0:
        decimals = asarray(decimals)
    s = sign(a)
    if decimals is not 0:
        a = absolute(multiply(a, 10.**decimals))
    else:
        a = absolute(a)
    rem = a-asarray(a).astype(_nx.intp)
    a = _nx.where(_nx.less(rem, 0.5), _nx.floor(a), _nx.ceil(a))
    # convert back
    if decimals is not 0:
        return multiply(a, s/(10.**decimals))
    else:
        return multiply(a, s)

I see many ways to improve the performance here. First, there is no need to check for "decimals is not 0" three times. This can be done once, maybe at the expense of some code duplication. Second, _nx.where(_nx.less(rem, 0.5), _nx.floor(a), _nx.ceil(a)) seems to be equivalent to _nx.floor(a+0.5).
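(A quick check of that equivalence claim. By this point in round_ the values have already been passed through absolute(), so only nonnegative inputs need to agree; the test values below are arbitrary.)

import numpy as _nx

a = _nx.array([0.0, 0.25, 0.5, 0.75, 1.0, 1.5, 2.49, 2.5, 3.999])
rem = a - a.astype(_nx.intp).astype(a.dtype)
old = _nx.where(_nx.less(rem, 0.5), _nx.floor(a), _nx.ceil(a))
new = _nx.floor(a + 0.5)
assert (old == new).all()   # identical for a >= 0: halves round up either way

(For negative values the two expressions would differ -- floor(a+0.5) rounds halves toward +infinity -- but round_ never reaches this line with negative a.)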
Finally, if rint is implemented as a ufunc as Mads originally suggested, the "decimals is 0" branch can just call that.

It is tempting to rewrite the whole thing in C, but before I do that I have a few questions about the current implementation.

1. It is implemented in oldnumeric.py. Does this mean it is deprecated? If so, what is the recommended replacement?

2. Was it intended to support array and fractional values for decimals, or is that an implementation artifact? Currently:

>>> around(array([1.2345]*5),[1,2,3,4,5])
array([ 1.2 , 1.23 , 1.235 , 1.2345, 1.2345])
>>> around(1.2345,2.5)
array(1.2332882874656679)

3. It does nothing to exact types, even if decimals < 0:

>>> around(1234, -2)
array(1234)

Is this a bug? Consider that

>>> round(1234, -2)
1200.0

and

>>> around(1234., -2)
array(1200.0)

The docstring is self-contradictory: "Rounding behaviour is equivalent to Python" is not consistent with "Return 'a' if the array is not floating point."

I propose to deprecate around and implement a new "round" member function in C that will only accept scalar "decimals" and will behave like a properly vectorized builtin round. I will do the coding if there is interest.

In any case, something has to be done here. I don't think the following timings are acceptable:

> python -m timeit -s "from numpy import array; x = array([1.5]*1000)" "(x+0.5).astype(int).astype(float)"
100000 loops, best of 3: 18.8 usec per loop
> python -m timeit -s "from numpy import array, around; x = array([1.5]*1000)" "around(x)"
10000 loops, best of 3: 155 usec per loop

From skip at pobox.com Tue Feb 21 18:19:01 2006
From: skip at pobox.com (skip at pobox.com)
Date: Tue Feb 21 18:19:01 2006
Subject: [Numpy-discussion] Re: Problems building numpy w/ ATLAS on Solaris 8
In-Reply-To: References: <17403.30180.676655.892180@montanaro.dyndns.org>
Message-ID: <17403.51674.348469.279480@montanaro.dyndns.org>

Robert> Hmm. Was ATLAS compiled -fPIC?

I'm not certain, but I doubt it should matter since only .a files were generated. There's nothing to relocate:

$ ls -ltr
total 9190
lrwxrwxrwx 1 skipm develop 41 Feb 9 14:51 Make.inc -> /home/ink/skipm/src/ATLAS/Make.SunOS_Babe
-rw-r--r-- 1 skipm develop 1529 Feb 9 14:51 Makefile
-rw-r--r-- 1 skipm develop 236004 Feb 9 14:57 libtstatlas.a
-rw-r--r-- 1 skipm develop 241352 Feb 9 16:28 libcblas.a
-rw-r--r-- 1 skipm develop 280464 Feb 9 16:33 libf77blas.a
-rw-r--r-- 1 skipm develop 278616 Feb 9 16:34 liblapack.a
-rw-r--r-- 1 skipm develop 3603644 Feb 9 16:36 libatlas.a

Robert> You get messages like this when something previous goes
Robert> wrong.

Thanks. Now I know to focus on only the first problem...

Skip

From zpincus at stanford.edu Tue Feb 21 19:15:02 2006
From: zpincus at stanford.edu (Zachary Pincus)
Date: Tue Feb 21 19:15:02 2006
Subject: [Numpy-discussion] Shouldn't singular_value_decomposition respect full_matrices?
Message-ID: <29AE3219-9E29-45EC-BA94-E9487E983A2D@stanford.edu>

numpy.linalg.singular_value_decomposition is defined as follows:

def singular_value_decomposition(A, full_matrices=0):
    return svd(A, 0)

Shouldn't that last line be

    return svd(A, full_matrices)

Zach

From robert.kern at gmail.com Tue Feb 21 19:19:01 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Tue Feb 21 19:19:01 2006
Subject: [Numpy-discussion] Re: Problems building numpy w/ ATLAS on Solaris 8
In-Reply-To: <17403.51674.348469.279480@montanaro.dyndns.org>
References: <17403.30180.676655.892180@montanaro.dyndns.org> <17403.51674.348469.279480@montanaro.dyndns.org>
Message-ID:

skip at pobox.com wrote:
> Robert> Hmm.
Was ATLAS compiled -fPIC? > > I'm not certain, but I doubt it should matter since only .a files were > generated. There's nothing to relocate: > > $ ls -ltr > total 9190 > lrwxrwxrwx 1 skipm develop 41 Feb 9 14:51 Make.inc -> /home/ink/skipm/src/ATLAS/Make.SunOS_Babe > -rw-r--r-- 1 skipm develop 1529 Feb 9 14:51 Makefile > -rw-r--r-- 1 skipm develop 236004 Feb 9 14:57 libtstatlas.a > -rw-r--r-- 1 skipm develop 241352 Feb 9 16:28 libcblas.a > -rw-r--r-- 1 skipm develop 280464 Feb 9 16:33 libf77blas.a > -rw-r--r-- 1 skipm develop 278616 Feb 9 16:34 liblapack.a > -rw-r--r-- 1 skipm develop 3603644 Feb 9 16:36 libatlas.a Google suggests that it does matter. E.g. http://mail.python.org/pipermail/python-dev/2001-March/013510.html http://bugs.mysql.com/bug.php?id=14202 http://mail.python.org/pipermail/image-sig/2002-June/001884.html -- Robert Kern robert.kern at gmail.com "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From skip at pobox.com Tue Feb 21 19:59:02 2006 From: skip at pobox.com (skip at pobox.com) Date: Tue Feb 21 19:59:02 2006 Subject: [Numpy-discussion] Re: Problems building numpy w/ ATLAS on Solaris 8 In-Reply-To: References: <17403.30180.676655.892180@montanaro.dyndns.org> <17403.51674.348469.279480@montanaro.dyndns.org> Message-ID: <17403.57676.925970.142021@montanaro.dyndns.org> Robert> Google suggests that it does matter. E.g. Robert> http://mail.python.org/pipermail/python-dev/2001-March/013510.html Robert> http://bugs.mysql.com/bug.php?id=14202 Robert> http://mail.python.org/pipermail/image-sig/2002-June/001884.html *sigh* Thanks. You'd think that Solaris was a common enough platform that the ATLAS folks would get this right... Skip From nadavh at visionsense.com Tue Feb 21 22:59:02 2006 From: nadavh at visionsense.com (Nadav Horesh) Date: Tue Feb 21 22:59:02 2006 Subject: [Numpy-discussion] algorithm, optimization, or other problem? Message-ID: <07C6A61102C94148B8104D42DE95F7E8C8EF39@exchange2k.envision.co.il> You may get a significant boost by replacing the line: w=w+ eta * (y*x - y**2*w) with w *= 1.0 - eta*y*y w += eta*y*x I ran a test on a similar expression and got 5 fold speed increase. The dot() function runs faster if you compile with dotblas. Nadav. -----Original Message----- From: numpy-discussion-admin at lists.sourceforge.net on behalf of Bruce Southey Sent: Tue 21-Feb-06 17:15 To: Brian Blais Cc: python-list at python.org; numpy-discussion at lists.sourceforge.net; scipy-user at scipy.net Subject: Re: [Numpy-discussion] algorithm, optimization, or other problem? Hi, In the current version, note that Y is scalar so replace the squaring (Y**2) with Y*Y as you do in the dohebb function. On my system without blas etc removing the squaring removes a few seconds (16.28 to 12.4). It did not seem to help factorizing Y. Also, eta and tau are constants so define them only once as scalars outside the loops and do the division outside the loop. It only saves about 0.2 seconds but these add up. The inner loop probably can be vectorized because it is just vector operations on a matrix. You are just computing over the ith dimension of X. I think that you could be able to find the matrix version on the net. Regards Bruce On 2/21/06, Brian Blais wrote: > Hello, > > I am trying to translate some Matlab/mex code to Python, for doing neural > simulations. This application is definitely computing-time limited, and I need to > optimize at least one inner loop of the code, or perhaps even rethink the algorithm. 
> The procedure is very simple, after initializing any variables: > > 1) select a random input vector, which I will call "x". right now I have it as an > array, and I choose columns from that array randomly. in other cases, I may need to > take an image, select a patch, and then make that a column vector. > > 2) calculate an output value, which is the dot product of the "x" and a weight > vector, "w", so > > y=dot(x,w) > > 3) modify the weight vector based on a matrix equation, like: > > w=w+ eta * (y*x - y**2*w) > ^ > | > +---- learning rate constant > > 4) repeat steps 1-3 many times > > I've organized it like: > > for e in 100: # outer loop > for i in 1000: # inner loop > (steps 1-3) > > display things. > > so that the bulk of the computation is in the inner loop, and is amenable to > converting to a faster language. This is my issue: > > straight python, in the example posted below for 250000 inner-loop steps, takes 20 > seconds for each outer-loop step. I tried Pyrex, which should work very fast on such > a problem, takes about 8.5 seconds per outer-loop step. The same code as a C-mex > file in matlab takes 1.5 seconds per outer-loop step. > > Given the huge difference between the Pyrex and the Mex, I feel that there is > something I am doing wrong, because the C-code for both should run comparably. > Perhaps the approach is wrong? I'm willing to take any suggestions! I don't mind > coding some in C, but the Python API seemed a bit challenging to me. > > One note: I am using the Numeric package, not numpy, only because I want to be able > to use the Enthought version for Windows. I develop on Linux, and haven't had a > chance to see if I can compile numpy using the Enthought Python for Windows. > > If there is anything else anyone needs to know, I'll post it. I put the main script, > and a dohebb.pyx code below. > > > thanks! 
> > Brian Blais > > -- > ----------------- > > bblais at bryant.edu > http://web.bryant.edu/~bblais > > > > > # Main script: > > from dohebb import * > import pylab as p > from Numeric import * > from RandomArray import * > import time > > x=random((100,1000)) # 1000 input vectors > > numpats=x.shape[0] > w=random((numpats,1)); > > th=random((1,1)) > > params={} > params['eta']=0.001; > params['tau']=100.0; > old_mx=0; > for e in range(100): > > rnd=randint(0,numpats,250000) > t1=time.time() > if 0: # straight python > for i in range(len(rnd)): > pat=rnd[i] > xx=reshape(x[:,pat],(1,-1)) > y=matrixmultiply(xx,w) > w=w+params['eta']*(y*transpose(xx)-y**2*w); > th=th+(1.0/params['tau'])*(y**2-th); > else: # pyrex > dohebb(params,w,th,x,rnd) > print time.time()-t1 > > > p.plot(w,'o-') > p.xlabel('weights') > p.show() > > > #============================================= > > # dohebb.pyx > > cdef extern from "Numeric/arrayobject.h": > > struct PyArray_Descr: > int type_num, elsize > char type > > ctypedef class Numeric.ArrayType [object PyArrayObject]: > cdef char *data > cdef int nd > cdef int *dimensions, *strides > cdef object base > cdef PyArray_Descr *descr > cdef int flags > > > def dohebb(params,ArrayType w,ArrayType th,ArrayType X,ArrayType rnd): > > > cdef int num_iterations > cdef int num_inputs > cdef int offset > cdef double *wp,*xp,*thp > cdef int *rndp > cdef double eta,tau > > eta=params['eta'] # learning rate > tau=params['tau'] # used for variance estimate > > cdef double y > num_iterations=rnd.dimensions[0] > num_inputs=w.dimensions[0] > > # get the pointers > wp=w.data > xp=X.data > rndp=rnd.data > thp=th.data > > for it from 0 <= it < num_iterations: > > offset=rndp[it]*num_inputs > > # calculate the output > y=0.0 > for i from 0 <= i < num_inputs: > y=y+wp[i]*xp[i+offset] > > # change in the weights > for i from 0 <= i < num_inputs: > wp[i]=wp[i]+eta*(y*xp[i+offset] - y*y*wp[i]) > > # estimate the variance > thp[0]=thp[0]+(1.0/tau)*(y**2-thp[0]) > > > > > > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through log files > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > ------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Do you grep through log files for problems? Stop! Download the new AJAX search engine that makes searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! http://sel.as-us.falkag.net/sel?cmd=k&kid3432&bid#0486&dat1642 _______________________________________________ Numpy-discussion mailing list Numpy-discussion at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/numpy-discussion From zpincus at stanford.edu Wed Feb 22 00:50:04 2006 From: zpincus at stanford.edu (Zachary Pincus) Date: Wed Feb 22 00:50:04 2006 Subject: [Numpy-discussion] Method to shift elements in an array? Message-ID: <1AC06527-8B7F-4FC4-B83E-462F31D3A431@stanford.edu> Hello folks, Does numpy have an built-in mechanism to shift elements along some axis in an array? (e.g. 
to "roll" [0,1,2,3] by some offset, here 2, to make [2,3,0,1]) If not, what would be the fastest way to implement this in python? Using take? Using slicing and concatenation? Zach From hot_night at obqv.com Wed Feb 22 02:20:04 2006 From: hot_night at obqv.com (=?ISO-2022-JP?B?GyRCJWklViU5JUYhPCU3JWclcxsoQg==?=) Date: Wed Feb 22 02:20:04 2006 Subject: [Numpy-discussion] $B!!$*BT$A$7$F$*$j$^$9(B Message-ID: <20060222043755.30160.qmail@mail.obqv.com> ?????????????????????? ????????????????????????? ??????? ?????????????????????????????? ?????????????????????? ????????????????OL?????????? ?????????????????????????????? ???????? http://www.covcov.net?num=112 ???????????????? ????????????????? ??????BOX?????? ???refuse at www.covcov.net From mpi at osc.kiku.dk Wed Feb 22 02:27:02 2006 From: mpi at osc.kiku.dk (Mads Ipsen) Date: Wed Feb 22 02:27:02 2006 Subject: [Numpy-discussion] A proposal to implement round in C Was: Rookie problems - Why is C-code much faster? In-Reply-To: References: Message-ID: On Tue, 21 Feb 2006, Sasha wrote: > > python -m timeit -s "from numpy import array; x = array([1.5]*1000)" "(x+0.5).astype(int).astype(float)" > 100000 loops, best of 3: 18.8 usec per loop > > python -m timeit -s just want to point out that the function foo(x) = (x+0.5).astype(int).astype(float) is different from around. For x = array([1.2, 1.8]) it works but for x = array([-1.2, -1.8]) you get around(x) = array([-1., -2.]) whereas foo(x) gives foo(x) = array([0., -1.]) Using foo(x) = where(greater(x,0),x+0.5,x-0.5).astype(int).astype(float) will work. // Mads From schofield at ftw.at Wed Feb 22 02:48:06 2006 From: schofield at ftw.at (Ed Schofield) Date: Wed Feb 22 02:48:06 2006 Subject: [Numpy-discussion] A proposal to implement round in C In-Reply-To: References: Message-ID: <43FC412D.2050402@ftw.at> Sasha wrote: >I propose to deprecate around and implement a new "round" member >function in C that will only accept scalar "decimals" and will behave >like a properly vectorized builtin round. I will do the coding if >there is interest. > >In any case, something has to be done here. I don't think the >following timings are acceptable: > > This sounds great to me :) -- Ed From mpi at osc.kiku.dk Wed Feb 22 03:56:04 2006 From: mpi at osc.kiku.dk (Mads Ipsen) Date: Wed Feb 22 03:56:04 2006 Subject: [Numpy-discussion] Rookie problems - Why is C-code much faster? In-Reply-To: <43FB92D1.4020802@cox.net> References: <633BB073-0A46-416A-96DF-080CF4A6DBDF@stanford.edu> <43FB65D1.5080707@cox.net> <43FB8933.1080400@cox.net> <43FB92D1.4020802@cox.net> Message-ID: On Tue, 21 Feb 2006, Tim Hochberg wrote: > Mads Ipsen wrote: > > >On Tue, 21 Feb 2006, Tim Hochberg wrote: > > > > > > > >>This all makes perfect sense, but what happended to box? In your > >>original code there was a step where you did some mumbo jumbo and box > >>and rint. Namely: > >> > >> > > > >It's a minor detail, but the reason for this is the following > > > >Suppose you have a line with length of box = 10 with periodic boundary > >conditions (basically this is a circle). Now consider two points x0 = 1 > >and x1 = 9 on this line. The shortest distance dx between the points x0 > >and x1 is dx = -2 and not 8. The calculation > > > > dx = x1 - x0 ( = +8) > > dx -= box*rint(dx/box) ( = -2) > > > >will give you the desired result, namely dx = -2. Hope this makes better > >sense. Note that fmod() won't work since > > > > fmod(dx,box) = 8 > > > > > I think you could use some variation like "fmod(dx+box/2, box) - box/2" > but rint seems better. 
> > >Part of my original post was concerned with the fact, that I initially was > >using around() from numpy for this step. This was terribly slow, so I made > >some custom changes and added rint() from the C-math library to the numpy > >module, giving a speedup factor about 4 for this particular line in the > >code. > > > >Best regards // Mads > > > > > > > OK, that all makes sense. You might want to try the following, which > factors out all the divisions and half the multiplies by box and > produces several fewer temporaries. Note I replaced x**2 with x*x, > which for the moment is much faster (I don't know if you've been > following the endless yacking about optimizing x**n, but x**2 will get > fast eventually). Depending on what you're doing with r2, you may be > able to avoid the last multiple by box as well. > > > # Loop over all particles > xbox = x/box > ybox = y/box > for i in range(n-1): > dx = xbox[i+1:] - xbox[i] > dy = ybox[i+1:] - ybox[i] > dx -= rint(dx) > dy -= rint(dy) > r2 = (dx*dx + dy*dy) > r2 *= box > > > > Regards, > > -tim > Thanks Tim, I am only a factor 2.5 slower than the C loop now, thanks to your suggestions. // Mads From mfmorss at aep.com Wed Feb 22 06:07:32 2006 From: mfmorss at aep.com (mfmorss at aep.com) Date: Wed Feb 22 06:07:32 2006 Subject: [Numpy-discussion] Trouble installing Numpy on AIX 5.2. Message-ID: I built Python successfully on our AIX 5.2 server using "./configure --without-cxx --disable-ipv6". (This uses the native IBM C compiler, invoking it as "cc_r". We have no C++ compiler.) But I have been unable to install Numpy-0.9.5 using the same compiler. After "python setup.py install," the relevant section of the output was: compile options: '-Ibuild/src/numpy/core/src -Inumpy/core/include -Ibuild/src/numpy/core -Inumpy/core/src -Inumpy/core/include -I/pydirectory/include/python2.4 -c' cc_r: build/src/numpy/core/src/umathmodule.c "build/src/numpy/core/src/umathmodule.c", line 2566.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2584.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2602.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2620.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2638.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2654.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2674.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2694.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2714.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2734.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 9307.32: 1506-280 (W) Function argument assignment between types "long double*" and "double*" is not allowed. "build/src/numpy/core/src/umathmodule.c", line 2566.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2584.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2602.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2620.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. 
"build/src/numpy/core/src/umathmodule.c", line 2638.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2654.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2674.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2694.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2714.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2734.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 9307.32: 1506-280 (W) Function argument assignment between types "long double*" and "double*" is not allowed. error: Command "cc_r -DNDEBUG -O -Ibuild/src/numpy/core/src -Inumpy/core/include -Ibuild/src/numpy/core -Inumpy/core/src -Inumpy/core/include -I/app/sandbox/s625662/installed/include/python2.4 -c build/src/numpy/core/src/umathmodule.c -o build/temp.aix-5.2-2.4 /build/src/numpy/core/src/umathmodule.o" failed with exit status 1 A closely related question is, how can I modify the Numpy setup.py and/or distutils files to enable me to control the options with which cc_r is invoked? I inspected these files, but not being very expert in Python, I could not figure this out. Mark F. Morss Principal Analyst, Market Risk American Electric Power From bsouthey at gmail.com Wed Feb 22 06:25:05 2006 From: bsouthey at gmail.com (Bruce Southey) Date: Wed Feb 22 06:25:05 2006 Subject: [Numpy-discussion] algorithm, optimization, or other problem? In-Reply-To: <07C6A61102C94148B8104D42DE95F7E8C8EF39@exchange2k.envision.co.il> References: <07C6A61102C94148B8104D42DE95F7E8C8EF39@exchange2k.envision.co.il> Message-ID: Hi, Actually it makes it slightly worse - given the responses on another thread it is probably due to not pushing enough into C code. Obviously use of blas etc will be faster but it doesn't change the fact that removing the inner loop would be faster still. Bruce On 2/22/06, Nadav Horesh wrote: > You may get a significant boost by replacing the line: > w=w+ eta * (y*x - y**2*w) > with > w *= 1.0 - eta*y*y > w += eta*y*x > > I ran a test on a similar expression and got 5 fold speed increase. > The dot() function runs faster if you compile with dotblas. > > Nadav. > > > -----Original Message----- > From: numpy-discussion-admin at lists.sourceforge.net on behalf of Bruce Southey > Sent: Tue 21-Feb-06 17:15 > To: Brian Blais > Cc: python-list at python.org; numpy-discussion at lists.sourceforge.net; scipy-user at scipy.net > Subject: Re: [Numpy-discussion] algorithm, optimization, or other problem? > Hi, > In the current version, note that Y is scalar so replace the squaring > (Y**2) with Y*Y as you do in the dohebb function. On my system > without blas etc removing the squaring removes a few seconds (16.28 to > 12.4). It did not seem to help factorizing Y. > > Also, eta and tau are constants so define them only once as scalars > outside the loops and do the division outside the loop. It only saves > about 0.2 seconds but these add up. > > The inner loop probably can be vectorized because it is just vector > operations on a matrix. You are just computing over the ith dimension > of X. I think that you could be able to find the matrix version on > the net. > > Regards > Bruce > > > > On 2/21/06, Brian Blais wrote: > > Hello, > > > > I am trying to translate some Matlab/mex code to Python, for doing neural > > simulations. 
This application is definitely computing-time limited, and I need to > > optimize at least one inner loop of the code, or perhaps even rethink the algorithm. > > The procedure is very simple, after initializing any variables: > > > > 1) select a random input vector, which I will call "x". right now I have it as an > > array, and I choose columns from that array randomly. in other cases, I may need to > > take an image, select a patch, and then make that a column vector. > > > > 2) calculate an output value, which is the dot product of the "x" and a weight > > vector, "w", so > > > > y=dot(x,w) > > > > 3) modify the weight vector based on a matrix equation, like: > > > > w=w+ eta * (y*x - y**2*w) > > ^ > > | > > +---- learning rate constant > > > > 4) repeat steps 1-3 many times > > > > I've organized it like: > > > > for e in 100: # outer loop > > for i in 1000: # inner loop > > (steps 1-3) > > > > display things. > > > > so that the bulk of the computation is in the inner loop, and is amenable to > > converting to a faster language. This is my issue: > > > > straight python, in the example posted below for 250000 inner-loop steps, takes 20 > > seconds for each outer-loop step. I tried Pyrex, which should work very fast on such > > a problem, takes about 8.5 seconds per outer-loop step. The same code as a C-mex > > file in matlab takes 1.5 seconds per outer-loop step. > > > > Given the huge difference between the Pyrex and the Mex, I feel that there is > > something I am doing wrong, because the C-code for both should run comparably. > > Perhaps the approach is wrong? I'm willing to take any suggestions! I don't mind > > coding some in C, but the Python API seemed a bit challenging to me. > > > > One note: I am using the Numeric package, not numpy, only because I want to be able > > to use the Enthought version for Windows. I develop on Linux, and haven't had a > > chance to see if I can compile numpy using the Enthought Python for Windows. > > > > If there is anything else anyone needs to know, I'll post it. I put the main script, > > and a dohebb.pyx code below. > > > > > > thanks! 
> > > > Brian Blais > > > > -- > > ----------------- > > > > bblais at bryant.edu > > http://web.bryant.edu/~bblais > > > > > > > > > > # Main script: > > > > from dohebb import * > > import pylab as p > > from Numeric import * > > from RandomArray import * > > import time > > > > x=random((100,1000)) # 1000 input vectors > > > > numpats=x.shape[0] > > w=random((numpats,1)); > > > > th=random((1,1)) > > > > params={} > > params['eta']=0.001; > > params['tau']=100.0; > > old_mx=0; > > for e in range(100): > > > > rnd=randint(0,numpats,250000) > > t1=time.time() > > if 0: # straight python > > for i in range(len(rnd)): > > pat=rnd[i] > > xx=reshape(x[:,pat],(1,-1)) > > y=matrixmultiply(xx,w) > > w=w+params['eta']*(y*transpose(xx)-y**2*w); > > th=th+(1.0/params['tau'])*(y**2-th); > > else: # pyrex > > dohebb(params,w,th,x,rnd) > > print time.time()-t1 > > > > > > p.plot(w,'o-') > > p.xlabel('weights') > > p.show() > > > > > > #============================================= > > > > # dohebb.pyx > > > > cdef extern from "Numeric/arrayobject.h": > > > > struct PyArray_Descr: > > int type_num, elsize > > char type > > > > ctypedef class Numeric.ArrayType [object PyArrayObject]: > > cdef char *data > > cdef int nd > > cdef int *dimensions, *strides > > cdef object base > > cdef PyArray_Descr *descr > > cdef int flags > > > > > > def dohebb(params,ArrayType w,ArrayType th,ArrayType X,ArrayType rnd): > > > > > > cdef int num_iterations > > cdef int num_inputs > > cdef int offset > > cdef double *wp,*xp,*thp > > cdef int *rndp > > cdef double eta,tau > > > > eta=params['eta'] # learning rate > > tau=params['tau'] # used for variance estimate > > > > cdef double y > > num_iterations=rnd.dimensions[0] > > num_inputs=w.dimensions[0] > > > > # get the pointers > > wp=w.data > > xp=X.data > > rndp=rnd.data > > thp=th.data > > > > for it from 0 <= it < num_iterations: > > > > offset=rndp[it]*num_inputs > > > > # calculate the output > > y=0.0 > > for i from 0 <= i < num_inputs: > > y=y+wp[i]*xp[i+offset] > > > > # change in the weights > > for i from 0 <= i < num_inputs: > > wp[i]=wp[i]+eta*(y*xp[i+offset] - y*y*wp[i]) > > > > # estimate the variance > > thp[0]=thp[0]+(1.0/tau)*(y**2-thp[0]) > > > > > > > > > > > > > > > > ------------------------------------------------------- > > This SF.net email is sponsored by: Splunk Inc. Do you grep through log files > > for problems? Stop! Download the new AJAX search engine that makes > > searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! > > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642 > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at lists.sourceforge.net > > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through log files > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! 
> http://sel.as-us.falkag.net/sel?cmd=k&kid3432&bid#0486&dat1642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > > > From svetosch at gmx.net Wed Feb 22 06:48:09 2006 From: svetosch at gmx.net (Sven Schreiber) Date: Wed Feb 22 06:48:09 2006 Subject: [Numpy-discussion] mixing arrays and matrices: squeeze yes, flattened no? In-Reply-To: <43FBB65C.4040001@noaa.gov> References: <43FB0A96.10803@gmx.net> <43FB6604.6000406@ieee.org> <43FB7C87.1010007@noaa.gov> <43FBA8B2.8010708@gmx.net> <43FBB65C.4040001@noaa.gov> Message-ID: <43FC799B.6040505@gmx.net> Christopher Barker schrieb: > Sven Schreiber wrote: >> I guess I'd rather follow the advice and just remember to treat 1d as >> a row. > > Except that it's not, universally. For instance, it won't transpose: > > > It's very helpful to remember that indexing reduces rank, and slicing > keeps the rank the same. It will serve you well to use that in the > future anyway. > Anyway, the problem is really about interaction with pylab/matplotlib (so slightly OT here, sorry); when getting data from a text file with pylab.load you can't be sure if the result is 1d or 2d. This means that: - If I have >1 variable then everything is fine (provided I use your advice of slicing instead of indexing afterwards) and the variables are in the _columns_ of the 2d-array. - But if there's just one data _column_ in the file, then pylab/numpy gives me a 1d-array that sometimes works as a _row_ (and as you noted, sometimes not), but never works as a column. Imho that's bad, because as a consequence I must use overhead code to distinguish between these cases. To me it seems more like pylab's bug instead of numpy's, so please excuse this OT twist, but since there seems to be overlap between the pylab/matplotlib and numpy folks, maybe it's not so bad. Thanks for your patience and helpful input, Sven From cjw at sympatico.ca Wed Feb 22 07:29:05 2006 From: cjw at sympatico.ca (Colin J. Williams) Date: Wed Feb 22 07:29:05 2006 Subject: [Numpy-discussion] dtype Message-ID: <43FC8312.3040402@sympatico.ca> I've been trying to gain some understanding of dtype from the builtin documentation and would appreciate advice. I don't find anything in http://projects.scipy.org/scipy/numpy or http://wiki.python.org/moin/NumPy Chapter 2.1 of the book has a good overview, but little reference material. In the following, dt= numpy.dtype Some specific problems are flagged ** below. Colin W. [Dbg]>>> h(dt) Help on class dtype in module numpy: class dtype(__builtin__.object) | Methods defined here: | | __cmp__(...) | x.__cmp__(y) <==> cmp(x,y) | | __getitem__(...) | x.__getitem__(y) <==> x[y] | | __len__(...) | x.__len__() <==> len(x) | | __reduce__(...) | self.__reduce__() for pickling. | | __repr__(...) | x.__repr__() <==> repr(x) | | __setstate__(...) | self.__setstate__() for pickling. | | __str__(...) | x.__str__() <==> str(x) | | newbyteorder(...) | self.newbyteorder() returns a copy of the dtype object | with altered byteorders. If is not given all byteorders | are swapped. Otherwise endian can be '>', '<', or '=' to force | a byteorder. Descriptors in all fields are also updated in the | new dtype object. | | ---------------------------------------------------------------------- | Data and other attributes defined here: | | __new__ = | T.__new__(S, ...) -> a new object with type S, a subtype of T ** What are the parameters? 
In other words, | what does ... stand for? ** | | alignment = | | | base = | The base data-type or self if no subdtype | | byteorder = | | | char = | | | descr = | The array_protocol type descriptor. | | fields = | | | hasobject = | | | isbuiltin = | Is this a buillt-in data-type descriptor? | | isnative = | Is the byte-order of this descriptor native? | | itemsize = | | | kind = | | | name = | The name of the true data-type | | num = | | | shape = | The shape of the subdtype or (1,) | | str = | The array_protocol typestring. | | subdtype = | A tuple of (descr, shape) or None. | | type = [Dbg]>>> dt.num.__doc__ ** no doc string ** [Dbg]>>> help(dt.num) Help on member_descriptor object: num = class member_descriptor(object) | Methods defined here: | | __delete__(...) | descr.__delete__(obj) | | __get__(...) | descr.__get__(obj[, type]) -> value | | __getattribute__(...) | x.__getattribute__('name') <==> x.name | | __repr__(...) | x.__repr__() <==> repr(x) | | __set__(...) | descr.__set__(obj, value) | | ---------------------------------------------------------------------- | Data and other attributes defined here: | | __objclass__ = [Dbg]>>> help(dt.num) Help on member_descriptor object: num = class member_descriptor(object) | Methods defined here: | | __delete__(...) | descr.__delete__(obj) | | __get__(...) | descr.__get__(obj[, type]) -> value | | __getattribute__(...) | x.__getattribute__('name') <==> x.name | | __repr__(...) | x.__repr__() <==> repr(x) | | __set__(...) | descr.__set__(obj, value) | | ---------------------------------------------------------------------- | Data and other attributes defined here: | | __objclass__ = [Dbg]>>> help(dt.num.__objclass__) Help on class dtype in module numpy: class dtype(__builtin__.object) | Methods defined here: | | __cmp__(...) | x.__cmp__(y) <==> cmp(x,y) | | __getitem__(...) | x.__getitem__(y) <==> x[y] | | __len__(...) | x.__len__() <==> len(x) | | __reduce__(...) | self.__reduce__() for pickling. | | __repr__(...) | x.__repr__() <==> repr(x) | | __setstate__(...) | self.__setstate__() for pickling. | | __str__(...) | x.__str__() <==> str(x) | | newbyteorder(...) | self.newbyteorder() returns a copy of the dtype object | with altered byteorders. If is not given all byteorders | are swapped. Otherwise endian can be '>', '<', or '=' to force | a byteorder. Descriptors in all fields are also updated in the | new dtype object. | | ---------------------------------------------------------------------- | Data and other attributes defined here: | | __new__ = | T.__new__(S, ...) -> a new object with type S, a subtype of T | | alignment = | | | base = | The base data-type or self if no subdtype | | byteorder = | | | char = | | | descr = | The array_protocol type descriptor. | | fields = | | | hasobject = | | | isbuiltin = | Is this a buillt-in data-type descriptor? | | isnative = | Is the byte-order of this descriptor native? | | itemsize = | | | kind = | | | name = | The name of the true data-type ** How does this differ from what, in common | Python usage, is a class.__name__? ** | | num = ** What does this mean? ** | | | shape = | The shape of the subdtype or (1,) | | str = | The array_protocol typestring. | | subdtype = | A tuple of (descr, shape) or None. | | type = [Dbg]>>> ** There is no __module__ attribute. How does one identify the modules holding the code? 
** From mfmorss at aep.com Wed Feb 22 08:16:15 2006 From: mfmorss at aep.com (mfmorss at aep.com) Date: Wed Feb 22 08:16:15 2006 Subject: [Numpy-discussion] Trouble installing Numpy on AIX 5.2. In-Reply-To: Message-ID: This problem was solved by adding "#include " to ...numpy-0.9.5 /numpy/core/src/umathmodule.c.src Mark F. Morss Principal Analyst, Market Risk American Electric Power mfmorss at aep.com Sent by: numpy-discussion- To admin at lists.sourc numpy-discussion eforge.net cc 02/22/2006 09:06 AM Subject [Numpy-discussion] Trouble installing Numpy on AIX 5.2. I built Python successfully on our AIX 5.2 server using "./configure --without-cxx --disable-ipv6". (This uses the native IBM C compiler, invoking it as "cc_r". We have no C++ compiler.) But I have been unable to install Numpy-0.9.5 using the same compiler. After "python setup.py install," the relevant section of the output was: compile options: '-Ibuild/src/numpy/core/src -Inumpy/core/include -Ibuild/src/numpy/core -Inumpy/core/src -Inumpy/core/include -I/pydirectory/include/python2.4 -c' cc_r: build/src/numpy/core/src/umathmodule.c "build/src/numpy/core/src/umathmodule.c", line 2566.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2584.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2602.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2620.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2638.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2654.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2674.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2694.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2714.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2734.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 9307.32: 1506-280 (W) Function argument assignment between types "long double*" and "double*" is not allowed. "build/src/numpy/core/src/umathmodule.c", line 2566.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2584.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2602.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2620.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2638.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2654.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2674.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2694.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2714.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2734.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 9307.32: 1506-280 (W) Function argument assignment between types "long double*" and "double*" is not allowed. 
error: Command "cc_r -DNDEBUG -O -Ibuild/src/numpy/core/src -Inumpy/core/include -Ibuild/src/numpy/core -Inumpy/core/src -Inumpy/core/include -I/app/sandbox/s625662/installed/include/python2.4 -c build/src/numpy/core/src/umathmodule.c -o build/temp.aix-5.2-2.4 /build/src/numpy/core/src/umathmodule.o" failed with exit status 1 A closely related question is, how can I modify the Numpy setup.py and/or distutils files to enable me to control the options with which cc_r is invoked? I inspected these files, but not being very expert in Python, I could not figure this out. Mark F. Morss Principal Analyst, Market Risk American Electric Power ------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Do you grep through log files for problems? Stop! Download the new AJAX search engine that makes searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642 _______________________________________________ Numpy-discussion mailing list Numpy-discussion at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/numpy-discussion From oliphant.travis at ieee.org Wed Feb 22 08:20:09 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Wed Feb 22 08:20:09 2006 Subject: [Numpy-discussion] Trouble installing Numpy on AIX 5.2. In-Reply-To: References: Message-ID: <43FC8F0D.5040409@ieee.org> mfmorss at aep.com wrote: >I built Python successfully on our AIX 5.2 server using "./configure >--without-cxx --disable-ipv6". (This uses the native IBM C compiler, >invoking it as "cc_r". We have no C++ compiler.) > >But I have been unable to install Numpy-0.9.5 using the same compiler. >After "python setup.py install," the relevant section of the output was: > >compile options: '-Ibuild/src/numpy/core/src -Inumpy/core/include >-Ibuild/src/numpy/core -Inumpy/core/src -Inumpy/core/include >-I/pydirectory/include/python2.4 -c' >cc_r: build/src/numpy/core/src/umathmodule.c >"build/src/numpy/core/src/umathmodule.c", line 2734.25: 1506-045 (S) >Undeclared identifier FE_OVERFLOW. > > Thanks for this check. This is an error in the _AIX section of the header. Change line 304 in ufuncobject.h from FE_OVERFLOW to FP_OVERFLOW. >"build/src/numpy/core/src/umathmodule.c", line 9307.32: 1506-280 (W) >Function argument assignment between types "long double*" and "double*" is >not allowed. > > I'm not sure where this error comes from. It seems to appear when modfl is used. What is the content of config.h (in your /numpy/core/include/numpy directory)? Can you find out if modfl is defined on your platform already? >A closely related question is, how can I modify the Numpy setup.py and/or >distutils files to enable me to control the options with which cc_r is >invoked? I inspected these files, but not being very expert in Python, I >could not figure this out. > > The default CFLAGS are those you used to build Python with. I think you can set the CFLAGS environment variable in order to change this. Thank you for your test. I don't have access to _AIX platform and so I appreciate your feedback. -Travis From oliphant.travis at ieee.org Wed Feb 22 08:31:01 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Wed Feb 22 08:31:01 2006 Subject: [Numpy-discussion] Trouble installing Numpy on AIX 5.2. 
In-Reply-To: References: Message-ID: <43FC917B.3040900@ieee.org> mfmorss at aep.com wrote: >This problem was solved by adding "#include " to ...numpy-0.9.5 >/numpy/core/src/umathmodule.c.src > > I suspect this allowed compilation, but I'm not sure if it "solved the problem." It depends on whether or not the FE_OVERFLOW defined in fenv.h is the same as FP_OVERFLOW on the _AIX (it might be...). The better solution is to change the constant to what it should be... Did the long double *, double * problem also resolve itself? This seems to an error with the modfl function you are picking up since the AIX docs say that modfl should take and receive long double arguments. Best, -Travis From mfmorss at aep.com Wed Feb 22 08:34:03 2006 From: mfmorss at aep.com (mfmorss at aep.com) Date: Wed Feb 22 08:34:03 2006 Subject: [Numpy-discussion] Trouble installing Numpy on AIX 5.2. In-Reply-To: <43FC917B.3040900@ieee.org> Message-ID: Thanks for this observation. I will modify ufuncobject.h as you suggested, instead. The other problem still results in a complaint, but not an error; it does not prevent compilation. I have another little problem but I expect to be able to solve it. I will report when and if I have Numpy installed. Mark F. Morss Principal Analyst, Market Risk American Electric Power Travis Oliphant To mfmorss at aep.com 02/22/2006 11:29 cc AM numpy-discussion Subject Re: [Numpy-discussion] Trouble installing Numpy on AIX 5.2. mfmorss at aep.com wrote: >This problem was solved by adding "#include " to ...numpy-0.9.5 >/numpy/core/src/umathmodule.c.src > > I suspect this allowed compilation, but I'm not sure if it "solved the problem." It depends on whether or not the FE_OVERFLOW defined in fenv.h is the same as FP_OVERFLOW on the _AIX (it might be...). The better solution is to change the constant to what it should be... Did the long double *, double * problem also resolve itself? This seems to an error with the modfl function you are picking up since the AIX docs say that modfl should take and receive long double arguments. Best, -Travis From robert.kern at gmail.com Wed Feb 22 09:59:12 2006 From: robert.kern at gmail.com (Robert Kern) Date: Wed Feb 22 09:59:12 2006 Subject: [Numpy-discussion] Re: dtype In-Reply-To: <43FC8312.3040402@sympatico.ca> References: <43FC8312.3040402@sympatico.ca> Message-ID: Colin J. Williams wrote: > I've been trying to gain some understanding of dtype from the builtin > documentation and would appreciate advice. > > I don't find anything in http://projects.scipy.org/scipy/numpy or > http://wiki.python.org/moin/NumPy > > Chapter 2.1 of the book has a good overview, but little reference material. > > In the following, dt= numpy.dtype > > Some specific problems are flagged ** below. > > Colin W. > > [Dbg]>>> h(dt) > Help on class dtype in module numpy: > > class dtype(__builtin__.object) > | Methods defined here: > | | __cmp__(...) > | x.__cmp__(y) <==> cmp(x,y) > | | __getitem__(...) > | x.__getitem__(y) <==> x[y] > | | __len__(...) > | x.__len__() <==> len(x) > | | __reduce__(...) > | self.__reduce__() for pickling. > | | __repr__(...) > | x.__repr__() <==> repr(x) > | | __setstate__(...) > | self.__setstate__() for pickling. > | | __str__(...) > | x.__str__() <==> str(x) > | | newbyteorder(...) > | self.newbyteorder() returns a copy of the dtype object > | with altered byteorders. If is not given all byteorders > | are swapped. Otherwise endian can be '>', '<', or '=' to force > | a byteorder. Descriptors in all fields are also updated in the > | new dtype object. 
> | | ---------------------------------------------------------------------- > | Data and other attributes defined here: > | | __new__ = | > T.__new__(S, ...) -> a new object with type S, a subtype of > T ** What are the parameters? In other words, > | > what does ... stand for? ** http://www.python.org/2.2.3/descrintro.html#__new__ """Recall that you create class instances by calling the class. When the class is a new-style class, the following happens when it is called. First, the class's __new__ method is called, passing the class itself as first argument, followed by any (positional as well as keyword) arguments received by the original call. This returns a new instance. Then that instance's __init__ method is called to further initialize it. (This is all controlled by the __call__ method of the metaclass, by the way.) """ > ** There is no __module__ attribute. How does one identify the modules > holding the code? ** It's an extension type PyArray_Descr* in numpy/core/src/arrayobject.c . -- Robert Kern robert.kern at gmail.com "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From aisaac at american.edu Wed Feb 22 11:09:04 2006 From: aisaac at american.edu (Alan G Isaac) Date: Wed Feb 22 11:09:04 2006 Subject: [Numpy-discussion] Method to shift elements in an array? In-Reply-To: <1AC06527-8B7F-4FC4-B83E-462F31D3A431@stanford.edu> References: <1AC06527-8B7F-4FC4-B83E-462F31D3A431@stanford.edu> Message-ID: On Wed, 22 Feb 2006, Zachary Pincus apparently wrote: > Does numpy have an built-in mechanism to shift elements along some > axis in an array? (e.g. to "roll" [0,1,2,3] by some offset, here 2, > to make [2,3,0,1]) This sounds like the rotater command in GAUSS. As far as I know there is no equivalent in numpy. Please post your ultimate solution. Cheers, Alan Isaac From tim.hochberg at cox.net Wed Feb 22 11:30:17 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Wed Feb 22 11:30:17 2006 Subject: [Numpy-discussion] Method to shift elements in an array? In-Reply-To: References: <1AC06527-8B7F-4FC4-B83E-462F31D3A431@stanford.edu> Message-ID: <43FCBB6F.50401@cox.net> Alan G Isaac wrote: >On Wed, 22 Feb 2006, Zachary Pincus apparently wrote: > > >>Does numpy have an built-in mechanism to shift elements along some >>axis in an array? (e.g. to "roll" [0,1,2,3] by some offset, here 2, >>to make [2,3,0,1]) >> >> > >This sounds like the rotater command in GAUSS. >As far as I know there is no equivalent in numpy. >Please post your ultimate solution. > > If you need to roll just a few elements the following should work fairly efficiently. If you don't want to roll in place, you could instead copy A on the way in and return the modified copy. However, in that case, concatenating slices might be better. -------------------------------------------------------------------- import numpy def roll(A, n): "Roll the array A in place. Positive n -> roll right, negative n -> roll left" if n > 0: n = abs(n) temp = A[-n:] A[n:] = A[:-n] A[:n] = temp elif n < 0: n = abs(n) temp = A[:n] A[:-n] = A[n:] A[-n:] = temp else: pass A = numpy.arange(10) print A roll(A, 3) print A roll(A, -3) print A From mpi at osc.kiku.dk Wed Feb 22 11:41:18 2006 From: mpi at osc.kiku.dk (Mads Ipsen) Date: Wed Feb 22 11:41:18 2006 Subject: [Numpy-discussion] Method to shift elements in an array? 
In-Reply-To: References: <1AC06527-8B7F-4FC4-B83E-462F31D3A431@stanford.edu>
Message-ID:

On Wed, 22 Feb 2006, Alan G Isaac wrote:
> On Wed, 22 Feb 2006, Zachary Pincus apparently wrote:
> > Does numpy have an built-in mechanism to shift elements along some
> > axis in an array? (e.g. to "roll" [0,1,2,3] by some offset, here 2,
> > to make [2,3,0,1])
>
> This sounds like the rotater command in GAUSS.
> As far as I know there is no equivalent in numpy.
> Please post your ultimate solution.
>
> Cheers,
> Alan Isaac

Similar to cshift() (cyclic shift) in F90. Very nice for calculating finite differences, such as

x' = ( cshift(x,+1) - cshift(x,-1) ) / dx

This would be a very handy feature indeed.

// Mads

From cwmoad at gmail.com Wed Feb 22 12:02:04 2006
From: cwmoad at gmail.com (Charlie Moad)
Date: Wed Feb 22 12:02:04 2006
Subject: [Numpy-discussion] Multiple inheritance from ndarray
In-Reply-To: <5809AC56-B2DF-4403-B7BC-9AEEAAC78505@astro.princeton.edu>
References: <5809AC56-B2DF-4403-B7BC-9AEEAAC78505@astro.princeton.edu>
Message-ID: <6382066a0602221201l37495d0fxb5a3fb78e1b28b8e@mail.gmail.com>

Since no one has answered this, I am going to take a whack at it. Experts feel free to shoot me down. Here is a sample showing multiple inheritance with a mix of old style and new style classes. I don't claim there is any logic to the code, but it is just for demo purposes.

--------------------------------------
from numpy import *

class actImage:
    def __init__(self, colorOrder='RGBA'):
        self.colorOrder = colorOrder

class Image(actImage, ndarray):
    def __new__(cls, shape=(1024,768), dtype=float32):
        return ndarray.__new__(cls, shape=shape, dtype=dtype)

x = Image()
assert isinstance(x[0,1], float32)
assert x.colorOrder == 'RGBA'
--------------------------------------

Running "help(ndarray)" has some useful info as well.

- Charlie

On 2/19/06, Robert Lupton wrote:
> I have a swig extension that defines a class that inherits from
> both a personal C-coded image struct (actImage), and also from
> Numeric's UserArray. This works very nicely, but I thought that
> it was about time to upgrade to numpy.
>
> The code looks like:
>
> from UserArray import *
>
> class Image(UserArray, actImage):
>     def __init__(self, *args):
>         actImage.__init__(self, *args)
>         UserArray.__init__(self, self.getArray(), 'd', copy=False,
>                            savespace=False)
>
> I can't figure out how to convert this to use ndarray, as ndarray doesn't
> seem to have an __init__ method, merely a __new__.
>
> So what's the approved numpy way to handle multiple inheritance? I've a nasty
> idea that this is a python question that I should know the answer to, but I'm
> afraid that I don't...
>
> R

From zpincus at stanford.edu Wed Feb 22 12:26:04 2006
From: zpincus at stanford.edu (Zachary Pincus)
Date: Wed Feb 22 12:26:04 2006
Subject: [Numpy-discussion] Method to shift elements in an array?
In-Reply-To: References: <1AC06527-8B7F-4FC4-B83E-462F31D3A431@stanford.edu>
Message-ID:

Here is my eventual solution.
I'm not sure it's speed-optimal for even a python implementation, but it is terse. I agree that it might be nice to have this fast, and/or in C (I'm using it for finite difference and related things). def cshift(l, offset): offset %= len(l) return numpy.concatenate((l[-offset:], l[:-offset])) Zach On Feb 22, 2006, at 11:40 AM, Mads Ipsen wrote: > On Wed, 22 Feb 2006, Alan G Isaac wrote: > >> On Wed, 22 Feb 2006, Zachary Pincus apparently wrote: >>> Does numpy have an built-in mechanism to shift elements along some >>> axis in an array? (e.g. to "roll" [0,1,2,3] by some offset, here 2, >>> to make [2,3,0,1]) >> >> This sounds like the rotater command in GAUSS. >> As far as I know there is no equivalent in numpy. >> Please post your ultimate solution. >> >> Cheers, >> Alan Isaac >> > > Similar to cshift() (cyclic shift) in F90. Very nice for calculating > finite differences, such as > > x' = ( cshift(x,+1) - cshift(x-1) ) / dx > > This would be a very handy feature indeed. > > // Mads > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through > log files > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD > SPLUNK! > http://sel.as-us.falkag.net/sel? > cmd=lnk&kid=103432&bid=230486&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion From Chris.Barker at noaa.gov Wed Feb 22 12:27:11 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed Feb 22 12:27:11 2006 Subject: [Numpy-discussion] mixing arrays and matrices: squeeze yes, flattened no? In-Reply-To: <43FC799B.6040505@gmx.net> References: <43FB0A96.10803@gmx.net> <43FB6604.6000406@ieee.org> <43FB7C87.1010007@noaa.gov> <43FBA8B2.8010708@gmx.net> <43FBB65C.4040001@noaa.gov> <43FC799B.6040505@gmx.net> Message-ID: <43FCC8D9.6020508@noaa.gov> Sven Schreiber wrote: > - If I have >1 variable then everything is fine (provided I use your > advice of slicing instead of indexing afterwards) and the variables are > in the _columns_ of the 2d-array. > - But if there's just one data _column_ in the file, then pylab/numpy > gives me a 1d-array that sometimes works as a _row_ (and as you noted, > sometimes not), but never works as a column. > > Imho that's bad, because as a consequence I must use overhead code to > distinguish between these cases. I'd do that on load. You must have a way of knowing how many variables you're loading, so when it is one you can add this line: a.shape = (1,-1) and then proceed the same way after that. -chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From cjw at sympatico.ca Wed Feb 22 13:19:05 2006 From: cjw at sympatico.ca (Colin J. Williams) Date: Wed Feb 22 13:19:05 2006 Subject: [Numpy-discussion] Constructor parameters - was Re: dtype In-Reply-To: References: <43FC8312.3040402@sympatico.ca> Message-ID: <43FCD521.2020007@sympatico.ca> Robert Kern wrote: >Colin J. Williams wrote: > > >>I've been trying to gain some understanding of dtype from the builtin >>documentation and would appreciate advice. 
>> >>I don't find anything in http://projects.scipy.org/scipy/numpy or >>http://wiki.python.org/moin/NumPy >> >>Chapter 2.1 of the book has a good overview, but little reference material. >> >>In the following, dt= numpy.dtype >> >>Some specific problems are flagged ** below. >> >>Colin W. >>[snip] >> >> >>| | ---------------------------------------------------------------------- >>| Data and other attributes defined here: >>| | __new__ = | >T.__new__(S, ...) -> a new object with type S, a subtype of >T ** What are the parameters? In other words, >>| >what does ... stand for? ** >> >> > >http://www.python.org/2.2.3/descrintro.html#__new__ > >"""Recall that you create class instances by calling the class. When the class >is a new-style class, the following happens when it is called. First, the >class's __new__ method is called, passing the class itself as first argument, >followed by any (positional as well as keyword) arguments received by the >original call. This returns a new instance. Then that instance's __init__ method >is called to further initialize it. (This is all controlled by the __call__ >method of the metaclass, by the way.) >""" > > > >>** There is no __module__ attribute. How does one identify the modules >>holding the code? ** >> >> > >It's an extension type PyArray_Descr* in numpy/core/src/arrayobject.c . > > > Robert, Many thanks for this. You have described the standard Python approach to constructing an instance. As I understand it, numpy uses the __new__ method, but not __init__, in most cases. My interest is in " any (positional as well as keyword) arguments". What should the user feed the constructor? This isn't clear from the online documentation. From a Python user's point of view, the module holding the dtype class appears to be multiarray. The standard Python approach is to put the information in a __module__ attribute so that one doesn't have to go hunting around. Please see below. While on the subject of the standard Python approach, class names usually start with an upper case letter and the builtins have their own style, ListType etc. numpy equates ArrayType to ndarray but ArrayType is deprecated. Colin W. C:\>python Python 2.4.2 (#67, Sep 28 2005, 12:41:11) [MSC v.1310 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy.core.multiarray as mu >>> dir(mu) ['_ARRAY_API', '__doc__', '__file__', '__name__', '__version__', '_fastCopyAndTranspose', '_flagdict', '_get_ndarray_c_version', 'arange', 'array', 'bigndarray', 'broadcast', 'can_cast', 'concatenate', 'correlate', 'dot', 'dtype', 'empty', 'error', 'flatiter', 'frombuffer', 'fromfile', 'fromstring', 'getbuffer', 'inner', 'lexsort', 'ndarray', 'newbuffer', 'register_dtype', 'scalar', 'set_numeric_ops', 'set_string_function', 'set_typeDict', 'typeinfo', 'where', 'zeros'] >>> From robert.kern at gmail.com Wed Feb 22 14:11:05 2006 From: robert.kern at gmail.com (Robert Kern) Date: Wed Feb 22 14:11:05 2006 Subject: [Numpy-discussion] Re: Constructor parameters - was Re: dtype In-Reply-To: <43FCD521.2020007@sympatico.ca> References: <43FC8312.3040402@sympatico.ca> <43FCD521.2020007@sympatico.ca> Message-ID: Colin J. Williams wrote: > Robert, > > Many thanks for this. You have described the standard Python approach to > constructing an instance. As I understand it, > numpy uses the __new__ method, but not __init__, in most cases. > > My interest is in " any (positional as well as keyword) arguments". > What should the user feed the constructor?
This isn't clear from the > online documentation. Look in the code. The PyArrayDescr_Type method table gives arraydescr_new() as the implementation of the tp_new slot (the C name for __new__). You can read the implementation for information. Patches for documentation will be gratefully accepted. That said: In [16]: a = arange(10) In [17]: a.dtype Out[17]: dtype('>i4') In [18]: dtype('>i4') Out[18]: dtype('>i4') If you want complete documentation on data-type descriptors, it's in Chapter 7 of Travis's book. > From a Python user's point of view, the module holding the dtype class > appears to be multiarray. > > The standard Python approach is to put the information in a __module__ > attribute so that one doesn't have to go hunting around. Please see below. dtype.__module__ (== 'numpy') tells you the canonical place to access it from Python code. It will never be able to tell you what C source file to look in. You'll have to break out grep no matter what. > While on the subject of the Standand Python aproach, class names usually > start with an upper case letter and the builtins have their own style, > ListType etc. numpy equates ArrayType to ndarray but ArrayType is > deprecated. ListType, TupleType et al. are also deprecated in favor of list and tuple, etc. But yes, we do use all lower-case names for classes. This is a conscious decision. It's just a style convention, just like PEP-8 is just a style convention for the standard library. -- Robert Kern robert.kern at gmail.com "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From zpincus at stanford.edu Wed Feb 22 19:00:06 2006 From: zpincus at stanford.edu (Zachary Pincus) Date: Wed Feb 22 19:00:06 2006 Subject: [Numpy-discussion] simple subclassing of ndarray Message-ID: Hello folks, I'm interested in creating a simple subclass of ndarray that just has a few additional methods. I've stared at defmatrix.py, but I'm not sure what is necessary to do. Specifically, I'm not sure how to get new instances of my subclass created properly. e.g.: numpy.matrix([1,2,3]) Out: matrix([[1, 2, 3]]) class m(numpy.ndarray): pass m([1,2,3]) Out: m([[[ 13691, 0, 0], [ 196608, 296292267, 296303312]]]) So clearly I need something else. Looking at the matrix class, it looks like I need a custom __new__ operator. However, looking at matrix's __new__ operator, I see a lot of complexity that I just don't understand. What's the minimum set of things I need in __new__ to get a proper constructor? Or perhaps there's a different and better way to construct instances of my subclass? Something akin to the 'array' function would be perfect. Now, how do I go about creating such a function (or getting 'array' to do it)? Can anyone give me any pointers here? Thanks, Zach Pincus Program in Biomedical Informatics and Department of Biochemistry Stanford University School of Medicine From oliphant.travis at ieee.org Wed Feb 22 19:28:02 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Wed Feb 22 19:28:02 2006 Subject: [Numpy-discussion] simple subclassing of ndarray In-Reply-To: References: Message-ID: <43FD2B85.8080407@ieee.org> Zachary Pincus wrote: > Hello folks, > > I'm interested in creating a simple subclass of ndarray that just has > a few additional methods. I've stared at defmatrix.py, but I'm not > sure what is necessary to do. > > Specifically, I'm not sure how to get new instances of my subclass > created properly. 
> > e.g.: > numpy.matrix([1,2,3]) > Out: matrix([[1, 2, 3]]) > > class m(numpy.ndarray): > pass This is enough to define your own sub-class. Now, you need to determine what you want to do. You need to understand that m() is now analogous to numpy.ndarray() so you should look at the numpy.ndarray() docstring for the default arguments. The array() constructor is not the same thing as ndarray.__new__. Look at the ndarray docstring. help(ndarray) You need to define the __new__ method *not* the __init__ method. You could, of course, define an __init__ method if you want to, it's just not necessary. > Out: m([[[ 13691, 0, 0], > [ 196608, 296292267, 296303312]]]) You just created an empty array of shape (1,2,3). The first argument to the default constructor is the shape. > So clearly I need something else. Looking at the matrix class, it > looks like I need a custom __new__ operator. Yes, that is exactly right. > Or perhaps there's a different and better way to construct instances > of my subclass? Something akin to the 'array' function would be > perfect. Now, how do I go about creating such a function (or getting > 'array' to do it)? You could do array(obj).view(m) to get instances of your subclass. This will not call __new__ or __init__, but it will call __array_finalize__(self, obj) where obj is the ndarray constructed from [1,2,3]. Actually __array_finalize__ is called every time a sub-class is constructed and so it could be used to pass along meta-data (or enforce rank-2 as it does in the matrix class). -Travis From ndarray at mac.com Wed Feb 22 19:52:03 2006 From: ndarray at mac.com (Sasha) Date: Wed Feb 22 19:52:03 2006 Subject: [Numpy-discussion] Why floor and ceil change the type of the array? Message-ID: I was looking for a model to implement "round" in C and discovered that floor and ceil functions change the type of their arguments: >>> floor(array([1,2,3],dtype='i2')).dtype dtype('<f4') >>> floor(array([1,2,3],dtype='i4')).dtype dtype('<f8') I know that this is the same behavior as in Numeric, but wouldn't it be more natural if floor and ceil return the argument unchanged (maybe a copy) if it is already integer? From tim.hochberg at cox.net (Tim Hochberg) Subject: [Numpy-discussion] optimizing power() for complex and real cases In-Reply-To: References: <43F20BFE.5030100@cox.net> <43F36CB6.5050004@cox.net> <43F3B8A9.3000507@bigpond.net.au> <43F3CA12.4000907@cox.net> <43F62069.80209@cox.net> <43F7C73B.2000806@cox.net> <20060220003714.GB15783@arbutus.physics.mcmaster.ca> <43F938FA.80200@cox.net> Message-ID: <43FD31FA.6030802@cox.net> David M. Cooke wrote: >Tim Hochberg writes: > > > >>David M. Cooke wrote: >> >> [SNIP] > >I've gone through your code you checked in, and fixed it up. Looks >good. One side effect is that > >def zl(x): > a = ones_like(x) > a[:] = 0 > return a > >is now faster than zeros_like(x) :-) > > I noticed that ones_like was faster than zeros_like, but I didn't think to try that. That's pretty impressive considering how ridiculously easy it was to write. >One problem I had is that in PyArray_SetNumericOps, the "copy" method >wasn't picked up on. It may be due to the order of initialization of >the ndarray type, or something (since "copy" isn't a ufunc, it's >initialized in a different place). I couldn't figure out how to fiddle >that, so I replaced the x.copy() call with a call to PyArray_Copy(). > > Interesting. It worked fine here. > > >>>Yes; because it's the implementation of __pow__, the second argument can >>>be anything. >>> >>> >>> >>No, you misunderstand.. What I was talking about was that the *first* >>argument can also be something that's not a PyArrayObject, despite the >>function's signature. >> >> > >Ah, I suppose that's because the power slot in the number protocol >also handles __rpow__. > > That makes sense. It was giving me fits whatever the cause.
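The sharing is easy to see from pure Python with a toy class, by the way -- nothing numpy-specific, just the dispatch (untested):

class Toy(object):
    # __pow__ and __rpow__ are the two Python-level faces of the single
    # nb_power slot in a C implementation; 2 ** t arrives via the
    # reflected path
    def __pow__(self, other):
        return 'pow(%r)' % (other,)
    def __rpow__(self, other):
        return 'rpow(%r)' % (other,)

t = Toy()
print t ** 2    # pow(2)
print 2 ** t    # rpow(2)

So a C function sitting behind that one slot can be handed a first argument that isn't an array at all.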
[SNIP] > >>but the real fix would be to dig into >>PyArray_EnsureArray and see why it's slow for Python_Ints. It is much >>faster for numarray scalars. >> >> > >Right; that needs to be looked at. > > It doesn't look too bad. But I haven't had a chance to try to do anything about it yet. >>Another approach is to actually compute (x*x)*(x*x) for pow(x,4) at >>the level of array_power. I think I could make this work. It would >>probably work well for medium size arrays, but might well make things >>worse for large arrays that are limited by memory bandwidth since it >>would need to move the array from memory into the cache multiple times. >> >> > >I don't like that; I think it would be better memory-wise to do it >elementwise. Not just speed, but size of intermediate arrays. > > Yeah, for a while I was real hot on the idea since I could do everything without messing with ufuncs. But then I decided not to pursue it because I thought it would be slow because of memory usage -- it would be pulling data into the cache over and over again and I think that would slow things down a lot. [SNIP] >'int_power' we could do; that would be the next step I think. The half >integer powers we could maybe leave; if you want x**(-3/2), for >instance, you could do y = x**(-1)*sqrt(x) (or do y = x**(-1); >sqrt(y,y) if you're worried about temporaries). > >Or, 'fast_power' could be documented as doing the optimizations for >integer and half-integer _scalar_ exponents, up to a certain size (like 100), and falling back on pow() if necessary. I think we could do >a precomputation step to split the exponent into appropriate squarings >and such that'll make the elementwise loop faster. > There's a clever implementation of this in complexobject.c. Speaking of complexobject.c, I did implement fast integer powers for complex objects at the nc_pow level. For small powers at least, it's over 10 times as fast. And, since it's at the nc_pow level it works for matrix-matrix powers as well. My implementation is arguably slightly faster than what's in complexobject, but I won't have a chance to check it in till next week -- I'm off for some snowboarding tomorrow. I kind of like power and scalar_power. Then ** could be advertised as calling scalar_power for scalars and power for arrays. Scalar power would do optimizations on integer and half-integer powers. Of course there's no real way to enforce that scalar power is passed scalars, since presumably it would be a ufunc, short of making _scalar_power a ufunc instead and doing something like:

def scalar_power(x, y):
    "compute x**y, where y is a scalar, optimizing integer and half-integer powers, possibly at some minor loss of accuracy"
    if not is_scalar(y):
        raise ValueError("Naughty!!")
    return _scalar_power(x, y)

> Half-integer >exponents are exactly representable as doubles (up to some number of >course), so there's no chance of decimal-to-binary conversions making >things look different. That might work out ok. Although, at that point >I'd suggest we make it 'power', and have 'rawpower' (or ????) as the >version that just uses pow(). > >Another point is to look at __div__, and use reciprocal if the >dividend is 1. > > That would be easy, but wouldn't it be just as easy to optimize __div__ for scalar divisions? Should probably check that this isn't just as fast since it would be a lot more general.
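For what it's worth, the repeated-squaring idea behind 'int_power' is tiny in pure Python -- an untested sketch, ignoring negative exponents, just to pin down the algorithm:

def int_power(x, n):
    # x**n in O(log n) multiplies instead of n-1; works elementwise
    # if x is an array, since it only ever uses *
    result = 1
    while n > 0:
        if n & 1:
            result = result * x
        x = x * x
        n >>= 1
    return result

For n = 4 this performs exactly the (x*x)*(x*x) dance from above; the open question is only whether the extra temporaries beat pow() at a given array size.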
> >I've added a page to the developer's wiki at >http://projects.scipy.org/scipy/numpy/wiki/PossibleOptimizationAreas >to keep a list of areas like that to look into if someone has time :-) > > Ah, good plan. -tim From oliphant.travis at ieee.org Wed Feb 22 19:59:04 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Wed Feb 22 19:59:04 2006 Subject: [Numpy-discussion] Multiple inheritance from ndarray In-Reply-To: <5809AC56-B2DF-4403-B7BC-9AEEAAC78505@astro.princeton.edu> References: <5809AC56-B2DF-4403-B7BC-9AEEAAC78505@astro.princeton.edu> Message-ID: <43FD32E4.10600@ieee.org> Robert Lupton wrote: > I have a swig extension that defines a class that inherits from > both a personal C-coded image struct (actImage), and also from > Numeric's UserArray. This works very nicely, but I thought that > it was about time to upgrade to numpy. > > The code looks like: > > from UserArray import * > > class Image(UserArray, actImage): > def __init__(self, *args): > actImage.__init__(self, *args) > UserArray.__init__(self, self.getArray(), 'd', copy=False, > savespace=False) > > I can't figure out how to convert this to use ndarray, as ndarray > doesn't > seem to have an __init__ method, merely a __new__. Yes, the ndarray class doesn't have an __init__ method (so you don't have to call it). What you need to do is write a __new__ method for your class. However, with multiple-inheritance the details matter. You may actually want to have your C-coded actImage class inherit (in C) from the ndarray. If you would like help on that approach let me know (I'll need to understand your actImage a bit better). But this can all be done in Python, too; it is just a bit of effort to make sure things get created correctly. Perhaps it might make sense to actually include a slightly modified form of the UserArray in NumPy as a standard "container-class" (instead of a sub-class) of the ndarray. In reality, a container class like UserArray and a sub-class are different things. Here's an outline of what you need to do. This is, of course, untested.... For example, I don't really know what actImage is.

from numpy import ndarray, array

class Image(ndarray, actImage):
    def __new__(subtype, *args):
        act1 = actImage.__new__(actImage, *args)
        actImage.__init__(act1, *args)
        arr = array(act1.getArray(), 'd', copy=False)
        self = arr.view(subtype)
        # you might need to copy attributes from act1 over to self here...
        return self

The problem here is that apparently you are creating the array first in actImage.__init__ and then passing it to UserArray. The ndarray constructor wants to either create the array itself or use a buffer-exposing object to use as the memory. Keep us posted as your example is a good one that can help us all learn. -Travis From robert.kern at gmail.com Wed Feb 22 20:08:04 2006 From: robert.kern at gmail.com (Robert Kern) Date: Wed Feb 22 20:08:04 2006 Subject: [Numpy-discussion] Re: Why floor and ceil change the type of the array? In-Reply-To: References: Message-ID: Sasha wrote: > I was looking for a model to implement "round" in C and discovered > that floor and ceil functions change the type of their arguments: > >>>>floor(array([1,2,3],dtype='i2')).dtype > > dtype('<f4') > >>>>floor(array([1,2,3],dtype='i4')).dtype > > dtype('<f8') > > I know that this is the same behavior as in Numeric, but wouldn't it > be more natural if floor and ceil return the argument unchanged (maybe > a copy) if it is already integer? Only if floor() and ceil() returned integer arrays when given floats as input.
I presume there are good reasons for this, since it's the same behavior as the standard C functions. -- Robert Kern robert.kern at gmail.com "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From ndarray at mac.com Wed Feb 22 20:49:08 2006 From: ndarray at mac.com (Sasha) Date: Wed Feb 22 20:49:08 2006 Subject: [Numpy-discussion] Re: Why floor and ceil change the type of the array? In-Reply-To: References: Message-ID: On 2/22/06, Robert Kern wrote: > Sasha wrote: > > ... wouldn't it > > be more natural if floor and ceil return the argument unchanged (maybe > > a copy) if it is already integer? > > Only if floor() and ceil() returned integer arrays when given floats as input. I > presume there are good reasons for this, since it's the same behavior as the > standard C functions. C does not have ceil(int). It has double ceil(double x); float ceilf(float x); long double ceill(long double x); and neither of these functions change the type of the argument. Numpy's "around" is a noop on integers (even for decimals<0, but that's a different story). I cannot really think of any reason for the current numpy behaviour other than the consistency with transcendental functions. Speaking of which, can someone explain this: >>> sin(array(1,'h')).dtype dtype('<f4') >>> sin(array(1,'i')).dtype dtype('<f8') From robert.kern at gmail.com (Robert Kern) Subject: [Numpy-discussion] Re: Why floor and ceil change the type of the array? In-Reply-To: References: Message-ID: Sasha wrote: > On 2/22/06, Robert Kern wrote: > >>Sasha wrote: >> >>>... wouldn't it >>>be more natural if floor and ceil return the argument unchanged (maybe >>>a copy) if it is already integer? >> >>Only if floor() and ceil() returned integer arrays when given floats as input. I >>presume there are good reasons for this, since it's the same behavior as the >>standard C functions. > > C does not have ceil(int). It has > > double ceil(double x); > float ceilf(float x); > long double ceill(long double x); > > and neither of these functions change the type of the argument. That's exactly what I meant. These functions only apply to integers through casting to the appropriate float type. That's precisely what numpy.floor() and numpy.ceil() do. Actually, I think the reasoning for having the float versions return floats instead of integers is that an integer-valued double is possibly out of range for an int or long on some platforms, so it's kept as a float. Since this obviously isn't a problem if the input is already an integer type, I don't have any particular objection to making floor() and ceil() return integers if their inputs are integers. > Numpy's "around" is a noop on integers (even for decimals<0, but > that's a different story). It's also a function and not a ufunc. > I cannot really think of any reason for the current numpy behaviour > other than the consistency with transcendental functions. It's simply the easiest thing to do with the ufunc machinery. > Speaking of > which, can someone explain this: > >>>>sin(array(1,'h')).dtype > > dtype('<f4') > >>>>sin(array(1,'i')).dtype > > dtype('<f8') AFAICT, the story goes like this: sin() has two implementations, one for single-precision floats and one for doubles. The ufunc machinery sees the int16 and picks single-precision as the smallest type of the two that can fit an int16 without losing precision. Naturally, you probably want the function to operate in higher precision, but that's not really information that the ufunc machinery knows about. From ndarray at mac.com (Sasha) Subject: [Numpy-discussion] Timings for various round functions Message-ID: C99 defines three functions round, rint and nearbyint that are nearly identical. The only difference is in setting the inexact flag and respecting the rounding mode. Nevertheless, these functions differ significantly in their performance.
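(To pin down the semantics first: round() rounds halfway cases away from zero, while rint() and nearbyint() follow the current rounding mode, which defaults to round-half-to-even. Roughly, in Python and ignoring floating-point edge cases:

import math

def round_half_away(x):
    # C99 round(): 2.5 -> 3.0, -2.5 -> -3.0
    if x >= 0:
        return math.floor(x + 0.5)
    return math.ceil(x - 0.5)

def round_half_even(x):
    # default-mode rint()/nearbyint(): 2.5 -> 2.0, 3.5 -> 4.0
    f = math.floor(x)
    if x - f != 0.5:
        return round_half_away(x)
    if math.fmod(f, 2.0) == 0.0:
        return f
    return f + 1.0

This is only a sketch of the semantics, not how the ufuncs are implemented.)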
I've wrapped these functions into ufuncs and got the following timings: > python -m timeit -s "from numpy import array, round, rint, nearbyint; x = array([1.5]*10000)" "round(x)" 1000 loops, best of 3: 257 usec per loop > python -m timeit -s "from numpy import array, round, rint, nearbyint; x = array([1.5]*10000)" "nearbyint(x)" 1000 loops, best of 3: 654 usec per loop > python -m timeit -s "from numpy import array, round, rint, nearbyint; x = array([1.5]*10000)" "rint(x)" 10000 loops, best of 3: 103 usec per loop Similarly for single precision: > python -m timeit -s "from numpy import array, round, rint, nearbyint; x = array([1.5]*10000,dtype='f')" "round(x)" 10000 loops, best of 3: 182 usec per loop > python -m timeit -s "from numpy import array, round, rint, nearbyint; x = array([1.5]*10000,dtype='f')" "nearbyint(x)" 1000 loops, best of 3: 606 usec per loop > python -m timeit -s "from numpy import array, round, rint, nearbyint; x = array([1.5]*10000,dtype='f')" "rint(x)" 10000 loops, best of 3: 85.5 usec per loop Obviously, I will use rint in my ndarray.round implementation; however, it may be useful to provide all three as ufuncs. The only question is what name to use for round? 1) round (may lead to confusion with ndarray.round or built-in round) 2) roundint (too similar to rint) 3) round0 (ugly) Any suggestions? Another C99 function that may be worth including is "trunc". Any objections to adding it as a ufunc? From oliphant.travis at ieee.org Wed Feb 22 22:11:01 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Wed Feb 22 22:11:01 2006 Subject: [Numpy-discussion] Timings for various round functions In-Reply-To: References: Message-ID: <43FD51DE.4000400@ieee.org> Sasha wrote: >C99 defines three functions round, rint and nearbyint that are nearly >identical. The only difference is in setting the inexact flag and >respecting the rounding mode. Nevertheless, these functions differ >significantly in their performance. I've wrapped these functions into >ufuncs and got the following timings: > > >Obviously, I will use rint in my ndarray.round implementation; >however, it may be useful to provide all three as ufuncs. > >The only question is what name to use for round? > >1) round (may lead to confusion with ndarray.round or built-in round) >2) roundint (too similar to rint) >3) round0 (ugly) > >Any suggestions? > >Another C99 function that may be worth including is "trunc". Any >objections to adding it as a ufunc? > > I think we have agreed that C99 functions are good candidates to become ufuncs. The only problem is figuring out what to do on platforms that don't define them. For example, we could define a separate module of C99 functions that is only available on certain platforms. -Travis From oliphant.travis at ieee.org Wed Feb 22 22:42:06 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Wed Feb 22 22:42:06 2006 Subject: [Numpy-discussion] Thoughts on an ndarray super-class Message-ID: <43FD5914.4060506@ieee.org> The bigndarray class is going to disappear (probably in the next release of NumPy). It was a stop-gap measure as the future of 64-bit fixes in Python was unclear. Python 2.5 will have removed the 64-bit limitations that led to the bigndarray and so it will be removed. I have been thinking, however, of replacing it with a super-class that does not define the dimensions or strides. In other words, the default array would be just a block of memory. The standard array would inherit from the default and add dimension and strides pointers.
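To make the idea concrete in Python terms (just a sketch of the proposal, not working code -- the real thing would be C structs):

class basearray(object):
    # owns (or references) a flat block of memory and nothing else
    def __init__(self, data):
        self.data = data

class ndarray(basearray):
    # adds the n-dimensional indexing information on top of the memory
    def __init__(self, data, shape, strides):
        basearray.__init__(self, data)
        self.shape = shape
        self.strides = strides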
I was thinking that this might make it easier for sub-classes using fixed-sized dimensions and strides. I'm not sure if that would actually be useful, but since I was thinking about the disappearance of the bigndarray, I thought I would ask for comments. -Travis From ndarray at mac.com Wed Feb 22 22:47:04 2006 From: ndarray at mac.com (Sasha) Date: Wed Feb 22 22:47:04 2006 Subject: [Numpy-discussion] Re: Why floor and ceil change the type of the array? In-Reply-To: References: Message-ID: On 2/23/06, Robert Kern wrote: > > I cannot really think of any reason for the current numpy behaviour > > other than the consistency with transcendental functions. > > It's simply the easiest thing to do with the ufunc machinery. > That's what I had in mind: with the current rule the same code can be used for ceil as for sin. However, easiest to implement is not necessarily right. > > Speaking of > > which, can someone explain this: > > > >>>>sin(array(1,'h')).dtype > > > > dtype('<f4') > > > >>>>sin(array(1,'i')).dtype > > > > dtype('<f8') > > AFAICT, the story goes like this: sin() has two implementations, one for > single-precision floats and one for doubles. The ufunc machinery sees the int16 > and picks single-precision as the smallest type of the two that can fit an int16 > without losing precision. Naturally, you probably want the function to operate > in higher precision, but that's not really information that the ufunc machinery > knows about. According to your theory long (i8) integers should cast to long doubles, but >>> sin(array(0,'i8')).dtype dtype('<f8') Given that python's floating point object is a double, I think it would be natural to cast integer arguments to double for all sizes. I would also think that in choosing the precision for a function it is also important that the output fits into the data type. I find the following unfortunate: >>> exp(400) 5.2214696897641443e+173 >>> exp(array(400,'h')) inf From robert.kern at gmail.com Wed Feb 22 23:06:14 2006 From: robert.kern at gmail.com (Robert Kern) Date: Wed Feb 22 23:06:14 2006 Subject: [Numpy-discussion] Re: Why floor and ceil change the type of the array? In-Reply-To: References: Message-ID: Sasha wrote: > On 2/23/06, Robert Kern wrote: >>AFAICT, the story goes like this: sin() has two implementations, one for >>single-precision floats and one for doubles. The ufunc machinery sees the int16 >>and picks single-precision as the smallest type of the two that can fit an int16 >>without losing precision. Naturally, you probably want the function to operate >>in higher precision, but that's not really information that the ufunc machinery >>knows about. > > According to your theory long (i8) integers should cast to long doubles, but > >>>>sin(array(0,'i8')).dtype > > dtype('<f8') > Given that python's floating point object is a double, I think it > would be natural to cast integer arguments to double for all sizes. I > would also think that in choosing the precision for a function it is > also important that the output fits into the data type. I find the > following unfortunate: > >>>>exp(400) > > 5.2214696897641443e+173 > >>>>exp(array(400,'h')) > > inf I prefer consistent, predictable rules that are dependent on the input, not the output. If I want my outputs to be double precision, I will cast appropriately. -- Robert Kern robert.kern at gmail.com "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From ndarray at mac.com Wed Feb 22 23:21:03 2006 From: ndarray at mac.com (Sasha) Date: Wed Feb 22 23:21:03 2006 Subject: [Numpy-discussion] Thoughts on an ndarray super-class In-Reply-To: <43FD5914.4060506@ieee.org> References: <43FD5914.4060506@ieee.org> Message-ID: On 2/23/06, Travis Oliphant wrote: > ...
> I have been thinking, however, of replacing it with a super-class that > does not define the dimensions or strides. > Having a simple 1-d array in numpy would be great. In an ideal world I would rather see a 1-d array implemented in C together with a set of array operations that is rich enough to allow trivial implementation of ndarray in pure python. When you say "does not define the dimensions or strides" do you refer to the python interface or to the C struct? I thought python did not allow adding data members to object structs in subclasses. > In other words, the default array would be just a block of memory. The > standard array would inherit from the default and add dimension and > strides pointers. > If python lets you do it, how will that block of memory know its size? From oliphant.travis at ieee.org Wed Feb 22 23:23:06 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Wed Feb 22 23:23:06 2006 Subject: [Numpy-discussion] Re: Why floor and ceil change the type of the array? In-Reply-To: References: Message-ID: <43FD62C2.7070101@ieee.org> Sasha wrote: >On 2/23/06, Robert Kern wrote: > > >>>I cannot really think of any reason for the current numpy behaviour >>>other than the consistency with transcendental functions. >>> >>> >>It's simply the easiest thing to do with the ufunc machinery. >> >> >>AFAICT, the story goes like this: sin() has two implementations, one for >>single-precision floats and one for doubles. The ufunc machinery sees the int16 >>and picks single-precision as the smallest type of the two that can fit an int16 >>without losing precision. Naturally, you probably want the function to operate >>in higher precision, but that's not really information that the ufunc machinery >>knows about. >> >According to your theory long (i8) integers should cast to long doubles, but > Robert is basically right, except there is a special case for long integers because long doubles are not cross platform. The relevant code is PyArray_CanCastSafely. This is basically the coercion rule table. You will notice the special checks for long double placed there after it was noticed that on 64-bit platforms long doubles were cropping up an awful lot and it was decided that because long doubles are not very ubiquitous (for example many platforms don't distinguish between long double and double), we should special-case the 64-bit integer rule. You can read about it in the archives if you want. >dtype('<f8') > >Given that python's floating point object is a double, I think it >would be natural to cast integer arguments to double for all sizes. > Perhaps, but that is not what is done. I don't think it's that big a deal because to get "different size" integers you have to ask for them and then you should know that conversion to floating point is not necessarily a double. I think the only acceptable direction to pursue is to raise an error and not do automatic upcasting if a ufunc does not have a definition for any of the given types. But, this is an old behavior from Numeric, and I would think such changes now would rightly be considered as gratuitous breakage. > I >would also think that in choosing the precision for a function it is >also important that the output fits into the data type. > How do you propose to determine if the output fits into the data-type? Are you proposing to have different output rules for different functions? Sheer madness... The rules now are (relatively) simple and easy to program to.
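In the meantime, if you know you want double precision it is cheap to be explicit about it up front, e.g.:

>>> from numpy import array, sin
>>> sin(array(1, 'h').astype('f8')).dtype
dtype('<f8')

That puts the decision with the user instead of in per-function output rules.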
>I find the >following unfortunate: > > >>>>exp(400) > > 5.2214696897641443e+173 > >>>>exp(array(400,'h')) > > inf Hardly a good example. Are you also concerned about the following? >>> exp(1000) inf >>> exp(array(1000,'g')) 1.97007111401704699387e+434 From ndarray at mac.com Wed Feb 22 23:39:03 2006 From: ndarray at mac.com (Sasha) Date: Wed Feb 22 23:39:03 2006 Subject: [Numpy-discussion] Timings for various round functions In-Reply-To: <43FD51DE.4000400@ieee.org> References: <43FD51DE.4000400@ieee.org> Message-ID: On 2/23/06, Travis Oliphant wrote: > ... > I think we have agreed that C99 functions are good candidates to become > ufuncs. The only problem is figuring out what to do on platforms that > don't define them. > I was going to ask this question myself, but then realized that the answer is in the source code: for functions missing on a platform numpy provides its own implementations. (See for example a comment in umathmodule "if C99 extensions not available then define dummy functions...") I was going to just use rint instead of round and nearbyint on platforms that don't have them. > For example, we could define a separate module of C99 functions that is > only available on certain platforms. This is certainly the easiest to implement option, but we don't want to make numpy users worry about portability of their code. From oliphant.travis at ieee.org Wed Feb 22 23:46:04 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Wed Feb 22 23:46:04 2006 Subject: [Numpy-discussion] Thoughts on an ndarray super-class In-Reply-To: References: <43FD5914.4060506@ieee.org> Message-ID: <43FD680F.8040708@ieee.org> Sasha wrote: >On 2/23/06, Travis Oliphant wrote: > > >>... >>I have been thinking, however, of replacing it with a super-class that >>does not define the dimensions or strides. >> >> >> >Having a simple 1-d array in numpy would be great. In an ideal world >I would rather see a 1-d array implemented in C together with a set of >array operations that is rich enough to allow trivial implementation >of ndarray in pure python. > > You do realize that this is essentially numarray, right? And your dream of *rich enough* 1-d operations to allow *trivial* implementation may be a bit far-fetched, but I'm all for dreaming. >When you say "does not define the dimensions or strides" do you refer >to the python interface or to the C struct? I thought python did not allow >adding data members to object structs in subclasses. > > The C-struct. Yes, you can add data-members to object structs in sub-classes. Every single Python Object does it. The standard Python Object just defines PyObject_HEAD or PyObject_VAR_HEAD. This is actually the essence of inheritance in C and it is why subclasses written in C must have compatible memory layouts. The first part of the C-structure must be identical, but you can add to it all you want. It all comes down to: Can I cast to the base-type C-struct and have everything still work out when I dereference a particular field? This will be true if PyArrayObject is

struct {
    PyBaseArrayObject
    int nd
    intp *dimensions
    intp *strides
}

I suppose we could change the int nd to intp nd and place it in the PyBaseArrayObject where it would be used as a length. But I don't really like that... >>In other words, the default array would be just a block of memory. The >>standard array would inherit from the default and add dimension and >>strides pointers. >> >> >> >If python lets you do it, how will that block of memory know its size?
> > > It won't of course by itself unless you add an additional size field. Thus, I'm not really sure whether it's a good idea or not. I don't like the idea of adding more and more fields to the basic C-struct that has been around for 10 years unless we have a good reason. The other issue is that the data-pointer doesn't always refer to memory that the ndarray has allocated, so it's actually incorrect to think of the ndarray as both the block of memory and the dimensioned indexing. The memory pointer is just that (a memory pointer). We are currently allowing ndarray's to create their own memory but that could easily change so that they always use some other object to allocate memory. In short, I don't see how to really do it so that the base object is actually useable. From ndarray at mac.com Thu Feb 23 00:42:05 2006 From: ndarray at mac.com (Sasha) Date: Thu Feb 23 00:42:05 2006 Subject: [Numpy-discussion] Re: Why floor and ceil change the type of the array? In-Reply-To: <43FD62C2.7070101@ieee.org> References: <43FD62C2.7070101@ieee.org> Message-ID: On 2/23/06, Travis Oliphant wrote: > How do you propose to determine if the output fits into the data-type? > Are you proposing to have different output rules for different > functions? Sheer madness... The rules now are (relatively) simple and > easy to program to. > I did not propose that. I just mentioned the output to argue that the rule to use the minimal floating point type that can represent the input is an arbitrary one and no better than casting all integers to doubles. "Sheer madness...", however, is too strong a characterization. Note that python (so far) changes the type of the result depending on its value in some cases: >>> type(2**30) <type 'int'> >>> type(2**32) <type 'long'> This is probably unacceptable to numpy for performance reasons, but it is not madness. Try to explain the following to someone who is used to python arithmetic: >>> 2*array(2**62)+array(2*2**62) 0L > Hardly a good example. Are you also concerned about the following? > > >>> exp(1000) > inf > > >>> exp(array(1000,'g')) > 1.97007111401704699387e+434 No, but I think it is because I am conditioned by C. To me exp() is a double-valued function that happened to work for ints with the help of an implicit cast. You may object that this is so because C does not allow function overloading, but C++ does overload exp so that exp((float)1) is float, and exp((long double)1) is long double, but exp((short)1), exp((char)1) and exp((long long)1) are all double. Both numpy and C++ made an arbitrary design choice. I find the C++ choice simpler and more natural, but I can live with the numpy choice once I've learned what it is. From josegomez at gmx.net Thu Feb 23 01:05:08 2006 From: josegomez at gmx.net (Jose Gomez-Dans) Date: Thu Feb 23 01:05:08 2006 Subject: [Numpy-discussion] Success with SVN on Cygwin Message-ID: Hi! A few days ago, I asked about compiling NumPy on Cygwin. Travis carried out some modifications, and with last night's SVN, I can happily report that it now compiles and works. The tests produced no errors, so it's all good :) Many thanks to all, and to Travis especially, for his really fast response. Many thanks!
Jose From stefan at sun.ac.za Thu Feb 23 02:47:04 2006 From: stefan at sun.ac.za (Stefan van der Walt) Date: Thu Feb 23 02:47:04 2006 Subject: [Numpy-discussion] Multiple inheritance from ndarray In-Reply-To: <43FD32E4.10600@ieee.org> References: <5809AC56-B2DF-4403-B7BC-9AEEAAC78505@astro.princeton.edu> <43FD32E4.10600@ieee.org> Message-ID: <20060223104601.GB26706@alpha> On Wed, Feb 22, 2006 at 08:58:28PM -0700, Travis Oliphant wrote: > Here's an outline of what you need to do. This is, of course, > untested.... For example, I don't really know what actImage is. > > from numpy import ndarray, array > > class Image(ndarray, actImage): > def __new__(subtype, *args) > act1 = actImage.__new__(actImage, *args) > actImage.__init__(act1, *args) > arr = array(act1.getArray(), 'd', copy=False) > self = arr.view(subtype) > # you might need to copy attributes from act1 over to self here... > return self This is probably the right place to use super, i.e.: def __new__(subtype, *args): act1 = super(Image, subtype).__new__(subtype, *args) ... def __init__(self, *args): super(Image, self).__init__(*args) The attached script shows how multiple inheritance runs through different classes. St?fan -------------- next part -------------- A non-text attachment was scrubbed... Name: inh.py Type: text/x-python Size: 855 bytes Desc: not available URL: From fullung at gmail.com Thu Feb 23 03:32:03 2006 From: fullung at gmail.com (Albert Strasheim) Date: Thu Feb 23 03:32:03 2006 Subject: [Numpy-discussion] repmat equivalent? Message-ID: <5eec5f300602230331sefc7640r8298011705c424f2@mail.gmail.com> Hello all I recently started using NumPy and one function that I am really missing from MATLAB/Octave is repmat. This function is very useful for implementing algorithms as matrix multiplications instead of for loops. Here's my first attempt at repmat for 1d and 2d (with some optimization by Stefan van der Walt): def repmat(a, m, n): if a.ndim == 1: a = array([a]) (origrows, origcols) = a.shape rows = origrows * m cols = origcols * n b = a.reshape(1,a.size).repeat(m, 0).reshape(rows, origcols).repeat(n, 0) return b.reshape(rows, cols) print repmat(array([[1,2],[3,4]]), 2, 3) produces: [[1 2 1 2 1 2] [3 4 3 4 3 4] [1 2 1 2 1 2] [3 4 3 4 3 4]] which is the same as in MATLAB. There are various issues with my function that I don't quite know how to solve: - How to handle scalar inputs (probably need asarray here) - How to handle more than 2 dimensions More than 2 dimensions is tricky, since NumPy and MATLAB don't seem to agree on how more-dimensional data is organised? As such, I don't know what a NumPy user would expect repmat to do with more than 2 dimensions. Here are some test cases that the current repmat should pass, but doesn't: a = repmat(1, 1, 1) assert_equal(a, 1) a = repmat(array([1]), 1, 1) assert_array_equal(a, array([1])) a = repmat(array([1,2]), 2, 3) assert_array_equal(a, array([[1,2,1,2,1,2], [1,2,1,2,1,2]])) a = repmat(array([[1,2],[3,4]]), 2, 3) assert_array_equal(a, array([[1,2,1,2,1,2], [3,4,3,4,3,4], [1,2,1,2,1,2], [3,4,3,4,3,4]])) Any suggestions on how do repmat in NumPy would be appreciated. Regards Albert From cimrman3 at ntc.zcu.cz Thu Feb 23 03:47:04 2006 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Thu Feb 23 03:47:04 2006 Subject: [Numpy-discussion] system_info problem Message-ID: <43FDA0A8.6020105@ntc.zcu.cz> Hi, I am trying to (finally) move UMFPACK out of the sandbox to the scipy proper and so I need checking for its libraries via system_info.py of numpy distutils. 
I have added a new section to site.cfg: [umfpack] library_dirs = /home/share/software/packages/UMFPACK/UMFPACK/Lib:/home/share/software/packages/UMFPACK/AMD/Lib include_dirs = /home/share/software/packages/UMFPACK/UMFPACK/Include umfpack_libs = umfpack, amd the names of the libraries are libumfpack.a, libamd.a - they are correctly found in 'system_info._check_libs()' by 'self._lib_list(lib_dir, libs, exts)', but then the function fails, since 'len(found_libs) == len(libs)' which is wrong. Can some numpy.distutils expert help me? Below is the new umfpack_info class I have written using the blas_info class as template. yours clueless, r. PS: I do this because I prefer having the umfpack installed separately. It will be used, if present, to replace the default superLU-based sparse solver. Moving its sources under scipy/Lib/sparse would solve this issue, but Tim Davis recently changed the license of UMFPACK to GPL, and so the last version available for the direct inclusion is 4.4. (4.6 is the current one). Opinions are of course welcome. -- class umfpack_info(system_info): section = 'umfpack' dir_env_var = 'UMFPACK' _lib_names = ['umfpack', 'amd'] includes = 'umfpack.h' notfounderror = UmfpackNotFoundError def calc_info(self): info = {} lib_dirs = self.get_lib_dirs() print lib_dirs umfpack_libs = self.get_libs('umfpack_libs', self._lib_names) print umfpack_libs for d in lib_dirs: libs = self.check_libs(d,umfpack_libs,[]) print d, libs if libs is not None: dict_append(info,libraries=libs) break else: return include_dirs = self.get_include_dirs() print include_dirs h = (self.combine_paths(lib_dirs+include_dirs,includes) or [None])[0] if h: h = os.path.dirname(h) dict_append(info,include_dirs=[h]) print info self.set_info(**info) From stefan at sun.ac.za Thu Feb 23 03:58:03 2006 From: stefan at sun.ac.za (Stefan van der Walt) Date: Thu Feb 23 03:58:03 2006 Subject: [Numpy-discussion] repmat equivalent? In-Reply-To: <5eec5f300602230331sefc7640r8298011705c424f2@mail.gmail.com> References: <5eec5f300602230331sefc7640r8298011705c424f2@mail.gmail.com> Message-ID: <20060223115630.GB28483@alpha> On Thu, Feb 23, 2006 at 01:31:47PM +0200, Albert Strasheim wrote: > More than 2 dimensions is tricky, since NumPy and MATLAB don't seem to > agree on how more-dimensional data is organised? As such, I don't know > what a NumPy user would expect repmat to do with more than 2 > dimensions. To expand on this, here is what I see when I create (M,N,3) matrices in both octave and numpy. I expect to see an MxN matrix stacked 3 high: octave ------ octave:1> zeros(2,2,3)) ans = ans(:,:,1) = 0 0 0 0 ans(:,:,2) = 0 0 0 0 ans(:,:,3) = 0 0 0 0 numpy ----- In [19]: zeros((2,3,3)) Out[19]: array([[[0, 0, 0], [0, 0, 0], [0, 0, 0]], [[0, 0, 0], [0, 0, 0], [0, 0, 0]]]) There is nothing wrong with numpy's array -- but the output generated seems counter-intuitive. St?fan From nadavh at visionsense.com Thu Feb 23 04:12:06 2006 From: nadavh at visionsense.com (Nadav Horesh) Date: Thu Feb 23 04:12:06 2006 Subject: [Numpy-discussion] repmat equivalent? Message-ID: <07C6A61102C94148B8104D42DE95F7E8C8EF3F@exchange2k.envision.co.il> You should really use the "repeat" function. Nadav. -----Original Message----- From: numpy-discussion-admin at lists.sourceforge.net on behalf of Albert Strasheim Sent: Thu 23-Feb-06 13:31 To: numpy-discussion at lists.sourceforge.net Cc: Subject: [Numpy-discussion] repmat equivalent? Hello all I recently started using NumPy and one function that I am really missing from MATLAB/Octave is repmat. 
This function is very useful for implementing algorithms as matrix multiplications instead of for loops. Here's my first attempt at repmat for 1d and 2d (with some optimization by Stefan van der Walt): def repmat(a, m, n): if a.ndim == 1: a = array([a]) (origrows, origcols) = a.shape rows = origrows * m cols = origcols * n b = a.reshape(1,a.size).repeat(m, 0).reshape(rows, origcols).repeat(n, 0) return b.reshape(rows, cols) print repmat(array([[1,2],[3,4]]), 2, 3) produces: [[1 2 1 2 1 2] [3 4 3 4 3 4] [1 2 1 2 1 2] [3 4 3 4 3 4]] which is the same as in MATLAB. There are various issues with my function that I don't quite know how to solve: - How to handle scalar inputs (probably need asarray here) - How to handle more than 2 dimensions More than 2 dimensions is tricky, since NumPy and MATLAB don't seem to agree on how more-dimensional data is organised? As such, I don't know what a NumPy user would expect repmat to do with more than 2 dimensions. Here are some test cases that the current repmat should pass, but doesn't: a = repmat(1, 1, 1) assert_equal(a, 1) a = repmat(array([1]), 1, 1) assert_array_equal(a, array([1])) a = repmat(array([1,2]), 2, 3) assert_array_equal(a, array([[1,2,1,2,1,2], [1,2,1,2,1,2]])) a = repmat(array([[1,2],[3,4]]), 2, 3) assert_array_equal(a, array([[1,2,1,2,1,2], [3,4,3,4,3,4], [1,2,1,2,1,2], [3,4,3,4,3,4]])) Any suggestions on how do repmat in NumPy would be appreciated. Regards Albert ------------------------------------------------------- This SF.Net email is sponsored by xPML, a groundbreaking scripting language that extends applications into web and mobile media. Attend the live webcast and join the prime developer group breaking into this new coding territory! http://sel.as-us.falkag.net/sel?cmd=k&kid0944&bid$1720&dat1642 _______________________________________________ Numpy-discussion mailing list Numpy-discussion at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/numpy-discussion From fullung at gmail.com Thu Feb 23 04:37:04 2006 From: fullung at gmail.com (Albert Strasheim) Date: Thu Feb 23 04:37:04 2006 Subject: [Numpy-discussion] repmat equivalent? In-Reply-To: <07C6A61102C94148B8104D42DE95F7E8C8EF3F@exchange2k.envision.co.il> References: <07C6A61102C94148B8104D42DE95F7E8C8EF3F@exchange2k.envision.co.il> Message-ID: <5eec5f300602230436h7655781dq8b34fa926e7e613f@mail.gmail.com> Hello The problem is that repeat is not the same as repmat. For example: >> repmat([1 2; 3 4], 2, 1) ans = 1 2 3 4 1 2 3 4 In [12]: repeat(array([[1, 2],[3,4]]), 2) Out[12]: array([[1, 2], [1, 2], [3, 4], [3, 4]]) How can I use the repeat function as is to accomplish this? Thanks. Regards Albert On 2/23/06, Nadav Horesh wrote: > You should really use the "repeat" function. > > Nadav. From fullung at gmail.com Thu Feb 23 04:41:13 2006 From: fullung at gmail.com (Albert Strasheim) Date: Thu Feb 23 04:41:13 2006 Subject: [Numpy-discussion] Re: repmat equivalent? 
In-Reply-To: <5eec5f300602230331sefc7640r8298011705c424f2@mail.gmail.com> References: <5eec5f300602230331sefc7640r8298011705c424f2@mail.gmail.com> Message-ID: <5eec5f300602230440p75c9e542p4fc54864bd78d408@mail.gmail.com> Hello all Just to clear up any confusion: On 2/23/06, Albert Strasheim wrote: > Here are some test cases that the current repmat should pass, but doesn't: > > a = repmat(1, 1, 1) > assert_equal(a, 1) > a = repmat(array([1]), 1, 1) > assert_array_equal(a, array([1])) > a = repmat(array([1,2]), 2, 3) > assert_array_equal(a, array([[1,2,1,2,1,2], [1,2,1,2,1,2]])) > a = repmat(array([[1,2],[3,4]]), 2, 3) > assert_array_equal(a, array([[1,2,1,2,1,2], [3,4,3,4,3,4], > [1,2,1,2,1,2], [3,4,3,4,3,4]])) Only the first two tests fail. The other two pass. Presumably any test that uses a matrix with more than 2 dimensions will also fail. Regards Albert From fullung at gmail.com Thu Feb 23 04:46:01 2006 From: fullung at gmail.com (Albert Strasheim) Date: Thu Feb 23 04:46:01 2006 Subject: [Numpy-discussion] repmat equivalent? In-Reply-To: <20060223115630.GB28483@alpha> References: <5eec5f300602230331sefc7640r8298011705c424f2@mail.gmail.com> <20060223115630.GB28483@alpha> Message-ID: <5eec5f300602230445m4acae431u6ec3cc19ed8d20b0@mail.gmail.com> Hello all On 2/23/06, Stefan van der Walt wrote: > On Thu, Feb 23, 2006 at 01:31:47PM +0200, Albert Strasheim wrote: > > More than 2 dimensions is tricky, since NumPy and MATLAB don't seem to > > agree on how more-dimensional data is organised? As such, I don't know > > what a NumPy user would expect repmat to do with more than 2 > > dimensions. > > To expand on this, here is what I see when I create (M,N,3) matrices > in both octave and numpy. I expect to see an MxN matrix stacked 3 > high: There are other (unexpected, for me at least) differences between MATLAB/Octave and NumPy too. For a 3D array in MATLAB, only indexing on the last dimension yields a 2D array, where NumPy always returns a 2D array. I put some examples for the 3D case at: http://students.ee.sun.ac.za/~albert/numscipy.html Regards Albert From wbaxter at gmail.com Thu Feb 23 05:07:03 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Thu Feb 23 05:07:03 2006 Subject: [Numpy-discussion] Timings for various round functions In-Reply-To: <43FD51DE.4000400@ieee.org> References: <43FD51DE.4000400@ieee.org> Message-ID: On 2/23/06, Travis Oliphant wrote: >Sasha wrote: >I think we have agreed that C99 functions are good candidates to become >ufuncs. The only problem is figuring out what to do on platforms that >don't define them. >For example, we could define a separate module of C99 functions that is >only available on certain platforms. Might this be some help? http://mega-nerd.com/FPcast/ --Bill -------------- next part -------------- An HTML attachment was scrubbed... URL: From NadavH at VisionSense.com Thu Feb 23 08:15:18 2006 From: NadavH at VisionSense.com (Nadav Horesh) Date: Thu Feb 23 08:15:18 2006 Subject: [Numpy-discussion] algorithm, optimization, or other problem? In-Reply-To: References: <07C6A61102C94148B8104D42DE95F7E8C8EF39@exchange2k.envision.co.il> Message-ID: <43FDFB80.9090106@VisionSense.com> An HTML attachment was scrubbed... 
URL: From tim.hochberg at cox.net Thu Feb 23 08:29:06 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Thu Feb 23 08:29:06 2006 Subject: [Numpy-discussion] optimizing power() for complex and real cases In-Reply-To: <43FD31FA.6030802@cox.net> References: <43F20BFE.5030100@cox.net> <43F36CB6.5050004@cox.net> <43F3B8A9.3000507@bigpond.net.au> <43F3CA12.4000907@cox.net> <43F62069.80209@cox.net> <43F7C73B.2000806@cox.net> <20060220003714.GB15783@arbutus.physics.mcmaster.ca> <43F938FA.80200@cox.net> <43FD31FA.6030802@cox.net> Message-ID: <43FDE27C.9040606@cox.net> I had some free time some morning, so I merged the nc_pow optimizations for integral powers in and committed them to the power_optimization branch. They could probably use more testing, but I thought someone might like to take a look while I'm out of town. Also, if your looking for a way to do powers as succusive multiplies for fast_power or scalar_power or whatnot, starting with the algorithm for nc_pow would probably be a good place to start. Hmm. Looking at this now, I realize I'm shadowing the input complex number 'a', with a local 'a'. Too much mindless copying from complexobject. That should be fixed, but I can't do it right now. Enjoy, -tim From faltet at carabos.com Thu Feb 23 09:04:03 2006 From: faltet at carabos.com (Francesc Altet) Date: Thu Feb 23 09:04:03 2006 Subject: [Numpy-discussion] A case for rank-0 arrays In-Reply-To: References: Message-ID: <200602231803.27417.faltet@carabos.com> Hi Sasha, A Dissabte 18 Febrer 2006 20:49, Sasha va escriure: > I have reviewed mailing list discussions of rank-0 arrays vs. scalars > and I concluded that the current implementation that contains both is > (almost) correct. I will address the "almost" part with a concrete > proposal at the end of this post (search for PROPOSALS if you are only > interested in the practical part). > > The main criticism of supporting both scalars and rank-0 arrays is > that it is "unpythonic" in the sense that it provides two almost > equivalent ways to achieve the same result. However, I am now > convinced that this is the case where practicality beats purity. It's a bit late, but I want to support your proposal (most of it). I've also come to the conclusion that scalars and rank-0 arrays should coexist. This is something that appears as a natural fact when you have to deal regularly with general algorithms for treat objects with different shapes. And I think you have put this very well. Thanks, -- >0,0< Francesc Altet ? ? http://www.carabos.com/ V V C?rabos Coop. V. ??Enjoy Data "-" From Chris.Barker at noaa.gov Thu Feb 23 09:20:05 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu Feb 23 09:20:05 2006 Subject: [Numpy-discussion] repmat equivalent? In-Reply-To: <5eec5f300602230445m4acae431u6ec3cc19ed8d20b0@mail.gmail.com> References: <5eec5f300602230331sefc7640r8298011705c424f2@mail.gmail.com> <20060223115630.GB28483@alpha> <5eec5f300602230445m4acae431u6ec3cc19ed8d20b0@mail.gmail.com> Message-ID: <43FDEEA8.9060901@noaa.gov> Albert Strasheim wrote: > There are other (unexpected, for me at least) differences between > MATLAB/Octave and NumPy too. First: numpy is not, and was never intended to be, a MATLAB clone, work-alike, whatever. You should *expect* there to be differences. > For a 3D array in MATLAB, only indexing > on the last dimension yields a 2D array, where NumPy always returns a > 2D array. I think the key here is that MATLAB's core data type is a matrix, which is 2-d. 
The ability to do 3-d arrays was added later, and it looks like they are still preserving the core matrix concept, so that a 3-d array is not really a 3-d array; it is, as someone on this thread mentioned, a "stack" of matrices. In numpy, the core data type is an n-d array. That means that there is nothing special about 2-d vs 4-d vs whatever, except 0-d (scalars). So a 3-d array is a cube shape, that you might want to pull a 2-d array out of it in any orientation. There's nothing special about which axis you're indexing. For that reason, it's very important that indexing any axis will give you the same rank array. Here's the rule: -- indexing reduces the rank by 1, regardless of which axis is being indexed. >>> import numpy as N >>> a = N.zeros((2,3,4)) >>> a array([[[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]], [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]]) >>> a[0,:,:] array([[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]) >>> a[:,0,:] array([[0, 0, 0, 0], [0, 0, 0, 0]]) >>> a[:,:,0] array([[0, 0, 0], [0, 0, 0]]) -- slicing does not reduce the rank: >>> a[:,0:1,:] array([[[0, 0, 0, 0]], [[0, 0, 0, 0]]]) >>> a[:,0:1,:].shape (2, 1, 4) It's actually very clean, logical, and useful. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From bblais at bryant.edu Thu Feb 23 09:35:11 2006 From: bblais at bryant.edu (Brian Blais) Date: Thu Feb 23 09:35:11 2006 Subject: [Numpy-discussion] algorithm, optimization, or other problem? In-Reply-To: <43FDFB80.9090106@VisionSense.com> References: <07C6A61102C94148B8104D42DE95F7E8C8EF39@exchange2k.envision.co.il> <43FDFB80.9090106@VisionSense.com> Message-ID: <43FDF1B8.3070608@bryant.edu> Nadav Horesh wrote: > It is slower. > > I did a little study on this issue since I got into the issue of > algorithms that can not be easily vectorized (like this one). > On my PC an outer loop step took initially 17.3 seconds, and some > optimization brought it down to ~11 seconds. The dot product consumed > about 1/3 of the time. I estimate that objects creation/destruction > consumes most of the cpu time. It seems that this way comes nowhere near > cmex speed. I suspect that maybe blitz/boost may bridge the gap. > yeah, I realized that pure python would be too slow, because I ran into the exact same problem with matlab scripting. these time-dependent loops are really a mess when it comes to speed optimization. After posting to the Pyrex list, someone pointed out that my loop variables had not been declared as c datatypes. so, I had loops like: for it from 0 <= it <= 1000: for i from 0 <= i <= 100: (stuff) and the "it" and "i" were being treated, due to my oversight, as python variables. for speed, you need to have all the variables in the loop as c datatypes. just putting a line in like: cdef int it,i increases the speed from 8 seconds per block to 0.2 seconds per block, which is comparable to the mex. I learned that I have to be a bit more careful! :) thanks, Brian Blais -- ----------------- bblais at bryant.edu http://web.bryant.edu/~bblais From oliphant.travis at ieee.org Thu Feb 23 10:27:02 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Feb 23 10:27:02 2006 Subject: [Numpy-discussion] Success with SVN on Cygwin In-Reply-To: References: Message-ID: <43FDFE3C.8070700@ieee.org> Jose Gomez-Dans wrote: >Hi! >A few days ago, I asked about compiling NumPy on Cygwin. 
Travis carried out some
>modifications, and with last night's SVN, I can happily report that it now
>compiles and works. The tests produced no errors, so it's all good :)
>
That's good news. I wish our unit test coverage was wide enough that
this actually meant that all is good :-) But, it's a good start.

Thanks are really due to the Cygwin ports people, who already had a
patch (though they didn't let us know about it --- I found it using
Google). I just incorporated their patch into the build tree.

I'm glad to know that it works.

-Travis

From cjw at sympatico.ca Thu Feb 23 11:20:04 2006
From: cjw at sympatico.ca (Colin J. Williams)
Date: Thu Feb 23 11:20:04 2006
Subject: [Numpy-discussion] A question
Message-ID: <43FE0A87.8050907@sympatico.ca>

Bar is a sub-class of ArrayType (ndarray) and bar is an instance of Bar

[Dbg]>>> self.bar
Bar([ 5,  1,  1, 11,  3,  7, 14,  0,  5,  2,  4, 12,  9, 10,  4, 12],
dtype=uint16)
[Dbg]>>> z= self.bar
[Dbg]>>> z
1                  << Is this expected?
[Dbg]>>> type(self.bar)

[Dbg]>>> self.bar.__class__.base

[Dbg]>>>

Colin W.

From vidar+list at 37mm.no Thu Feb 23 12:02:02 2006
From: vidar+list at 37mm.no (Vidar Gundersen)
Date: Thu Feb 23 12:02:02 2006
Subject: [Numpy-discussion] inconsistent use of axis= keyword argument?
Message-ID: 

(i've been updating the cross reference of MATLAB synonymous
commands in Numeric Python to NumPy. I've kept Numeric/numarray
alternatives in the source XML, but omitted it in the PDF outputs.
see http://37mm.no/download/matlab-python-xref.pdf.
feedback is highly appreciated.)

as i was working on this, i started wondering why

a.max(0), a.min(0), a.ptp(0), a.flatten(0), ...

does not allow the axis=0 keyword argument used with
the exact same meaning for:

m.mean(axis=0), m.sum(axis=0), ...

and i also wonder why concatenate can't be used to stack 1-d
arrays on top of each other, returning a 2-d array?
axis relates to the number of axes in the original array(s)?

In [3]: v = arange(9)

In [7]: concatenate((v,v))
Out[7]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 0, 1, 2, 3, 4, 5, 6, 7, 8])

In [8]: concatenate((v,v),axis=0)
Out[8]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 0, 1, 2, 3, 4, 5, 6, 7, 8])

In [15]: concatenate((v,v)).reshape(2,-1)
Out[15]: array([[0, 1, 2, 3, 4, 5, 6, 7, 8],
       [0, 1, 2, 3, 4, 5, 6, 7, 8]])

In [5]: m = v.reshape(3,-1)

In [10]: concatenate((m,m))
Out[10]: array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8],
       [0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

In [11]: concatenate((m,m), axis=0)
Out[11]: array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8],
       [0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

In [12]: concatenate((m,m), axis=1)
Out[12]: array([[0, 1, 2, 0, 1, 2],
       [3, 4, 5, 3, 4, 5],
       [6, 7, 8, 6, 7, 8]])

From fullung at gmail.com Thu Feb 23 12:12:06 2006
From: fullung at gmail.com (Albert Strasheim)
Date: Thu Feb 23 12:12:06 2006
Subject: [Numpy-discussion] repmat equivalent?
In-Reply-To: <43FDEEA8.9060901@noaa.gov>
References: <5eec5f300602230331sefc7640r8298011705c424f2@mail.gmail.com>
	<20060223115630.GB28483@alpha>
	<5eec5f300602230445m4acae431u6ec3cc19ed8d20b0@mail.gmail.com>
	<43FDEEA8.9060901@noaa.gov>
Message-ID: <5eec5f300602231211i23f1f199of0dd76105f6a2666@mail.gmail.com>

Hello all

On 2/23/06, Christopher Barker wrote:
> Albert Strasheim wrote:
> > There are other (unexpected, for me at least) differences between
> > MATLAB/Octave and NumPy too.
>
> First: numpy is not, and was never intended to be, a MATLAB clone,
> work-alike, whatever. You should *expect* there to be differences.

I understand this. As a new user, I'm trying to understand these
differences.
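For example, a short session (numpy of this era; the 2x3x4 shape is
arbitrary) makes the indexing difference discussed below concrete, with
MATLAB's documented behaviour noted for comparison:

>>> import numpy as N
>>> a = N.zeros((2,3,4))
>>> a[0,:,:].shape     # NumPy: indexing the first axis drops it
(3, 4)
>>> a[:,:,0].shape     # ... and indexing the last axis drops it too
(2, 3)

(In MATLAB, size(a(1,:,:)) is [1 3 4] -- the leading singleton stays
unless you squeeze() it -- while size(a(:,:,1)) is [2 3], since only
trailing singleton dimensions are dropped.)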
> > For a 3D array in MATLAB, only indexing > > on the last dimension yields a 2D array, where NumPy always returns a > > 2D array. > > I think the key here is that MATLAB's core data type is a matrix, which > is 2-d. The ability to do 3-d arrays was added later, and it looks like > they are still preserving the core matrix concept, so that a 3-d array > is not really a 3-d array; it is, as someone on this thread mentioned, a > "stack" of matrices. > > In numpy, the core data type is an n-d array. That means that there is > nothing special about 2-d vs 4-d vs whatever, except 0-d (scalars). So a > 3-d array is a cube shape, that you might want to pull a 2-d array out > of it in any orientation. There's nothing special about which axis > you're indexing. For that reason, it's very important that indexing any > axis will give you the same rank array. > > Here's the rule: > > -- indexing reduces the rank by 1, regardless of which axis is being > indexed. Thanks for your comments. These cleared up a few questions I had about NumPy's design. However, I'm still wondering how the average NumPy user would expect repmat implemented for NumPy to behave with arrays with more than 2 dimensions. I would like to clear this up, since I think that a good repmat function is an essential tool for implementing algorithms that use matrix multiplication instead of for loops to perform operations (hopefully with a significant speed increase). If there is another way of accomplishing this, I would love to know. Regards Albert From oliphant.travis at ieee.org Thu Feb 23 12:26:06 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Feb 23 12:26:06 2006 Subject: [Numpy-discussion] inconsistent use of axis= keyword argument? In-Reply-To: References: Message-ID: <43FE1A38.8060101@ieee.org> Vidar Gundersen wrote: >(i've been updating the cross reference of MATLAB synonymous >commands in Numeric Python to NumPy. I've kept Numeric/numarray >alternatives in the source XML, but omitted it in the PDF outputs. >see, http://37mm.no/download/matlab-python-xref.pdf. >feedback is highly appreciated.) > > >as i was working on this, i started wondering why > >a.max(0), a.min(0), a.ptp(0), a.flatten(0), ... > >does not allow the axis=0 keyword argument used with >the exact same meaning for: > > It's actually consistent. These only have a single argument and so don't use keywords. But, I can see now that it might be nice to have keywords even if there is only one argument. Feel free to submit a patch. >m.mean(axis=0), m.sum(axis=0), ... > > These have multiple arguments, so the keywords are important. > >and i also wonder why concatenate can't be used to stack 1-d >arrays on top of each other, returning a 2-d array? >axis relates to the number of axes in the original array(s)? > > Because it's ambiguous what you mean to do. 1-d arrays only have a single axis. How do you propose to tell concatenate to alter the shape of the output array and in what direction? We've left it to the user to do that, like you do in the second example. When you have more than one dimension on input, then it is clear what you mean by "stack" along an axis. With only one-dimension, it isn't clear what is meant. From oliphant.travis at ieee.org Thu Feb 23 12:34:02 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Feb 23 12:34:02 2006 Subject: [Numpy-discussion] repmat equivalent? 
In-Reply-To: <5eec5f300602230331sefc7640r8298011705c424f2@mail.gmail.com> References: <5eec5f300602230331sefc7640r8298011705c424f2@mail.gmail.com> Message-ID: <43FE1BFA.9070709@ieee.org> Albert Strasheim wrote: >Hello all > >I recently started using NumPy and one function that I am really >missing from MATLAB/Octave is repmat. This function is very useful for >implementing algorithms as matrix multiplications instead of for >loops. > > There is a function in scipy.linalg called kron that could be brought over which can do a repmat. In file: /usr/lib/python2.4/site-packages/scipy/linalg/basic.py def kron(a,b): """kronecker product of a and b Kronecker product of two matrices is block matrix [[ a[ 0 ,0]*b, a[ 0 ,1]*b, ... , a[ 0 ,n-1]*b ], [ ... ... ], [ a[m-1,0]*b, a[m-1,1]*b, ... , a[m-1,n-1]*b ]] """ if not a.flags['CONTIGUOUS']: a = reshape(a, a.shape) if not b.flags['CONTIGUOUS']: b = reshape(b, b.shape) o = outerproduct(a,b) o=o.reshape(a.shape + b.shape) return concatenate(concatenate(o, axis=1), axis=1) Thus, kron(ones((2,3)), arr) >>> sl.kron(ones((2,3)),arr) array([[1, 2, 1, 2, 1, 2], [3, 4, 3, 4, 3, 4], [1, 2, 1, 2, 1, 2], [3, 4, 3, 4, 3, 4]]) gives you the equivalent of repmat(arr, 2,3) We could bring this over from scipy into numpy as it is simple enough. It has a multidimensional extension (i.e. you can pass in a and b as higher dimensional arrays), But, don't ask me to explain it to you because I can't without further study.... -Travis From robert.kern at gmail.com Thu Feb 23 12:44:04 2006 From: robert.kern at gmail.com (Robert Kern) Date: Thu Feb 23 12:44:04 2006 Subject: [Numpy-discussion] Re: inconsistent use of axis= keyword argument? In-Reply-To: References: Message-ID: Vidar Gundersen wrote: > and i also wonder why concatenate can't be used to stack 1-d > arrays on top of each other, returning a 2-d array? Use vstack() for that. Also note its companions, hstack() and column_stack(). -- Robert Kern robert.kern at gmail.com "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From Norbert.Nemec.list at gmx.de Thu Feb 23 12:44:05 2006 From: Norbert.Nemec.list at gmx.de (Norbert Nemec) Date: Thu Feb 23 12:44:05 2006 Subject: [Numpy-discussion] Up-to-date bugtracker for NumPy? Message-ID: <43FE1E73.8020103@gmx.de> Hi there, is there any bugtracker for NumPy actually in active use? Sourceforge has one for numarray and one for "numpy", but the latter one contains only old bugs (probably for Numeric?) Greetings, Norbert From Norbert.Nemec.list at gmx.de Thu Feb 23 12:54:03 2006 From: Norbert.Nemec.list at gmx.de (Norbert Nemec) Date: Thu Feb 23 12:54:03 2006 Subject: [Numpy-discussion] Up-to-date bugtracker for NumPy? In-Reply-To: <43FE1E73.8020103@gmx.de> References: <43FE1E73.8020103@gmx.de> Message-ID: <43FE20C7.90001@gmx.de> Guess the question was answered before I even asked it: The bug that I reported two hours ago has already been fixed and closed by Travis. Amazing reaction time! Norbert Nemec wrote: >Hi there, > >is there any bugtracker for NumPy actually in active use? Sourceforge >has one for numarray and one for "numpy", but the latter one contains >only old bugs (probably for Numeric?) > >Greetings, >Norbert > > >------------------------------------------------------- >This SF.Net email is sponsored by xPML, a groundbreaking scripting language >that extends applications into web and mobile media. Attend the live webcast >and join the prime developer group breaking into this new coding territory! 
>http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 >_______________________________________________ >Numpy-discussion mailing list >Numpy-discussion at lists.sourceforge.net >https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > > > From robert.kern at gmail.com Thu Feb 23 13:17:06 2006 From: robert.kern at gmail.com (Robert Kern) Date: Thu Feb 23 13:17:06 2006 Subject: [Numpy-discussion] Re: Up-to-date bugtracker for NumPy? In-Reply-To: <43FE1E73.8020103@gmx.de> References: <43FE1E73.8020103@gmx.de> Message-ID: Norbert Nemec wrote: > Hi there, > > is there any bugtracker for NumPy actually in active use? Sourceforge > has one for numarray and one for "numpy", but the latter one contains > only old bugs (probably for Numeric?) http://projects.scipy.org/scipy/numpy Click "New Ticket" up at the top to enter a new bug. -- Robert Kern robert.kern at gmail.com "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From fullung at gmail.com Thu Feb 23 13:22:01 2006 From: fullung at gmail.com (Albert Strasheim) Date: Thu Feb 23 13:22:01 2006 Subject: [Numpy-discussion] repmat equivalent? In-Reply-To: <43FE1BFA.9070709@ieee.org> References: <5eec5f300602230331sefc7640r8298011705c424f2@mail.gmail.com> <43FE1BFA.9070709@ieee.org> Message-ID: <5eec5f300602231321x12da7721m8d7fa2aa7fc6edf0@mail.gmail.com> Hello all On 2/23/06, Travis Oliphant wrote: > Albert Strasheim wrote: > > >Hello all > > > >I recently started using NumPy and one function that I am really > >missing from MATLAB/Octave is repmat. This function is very useful for > >implementing algorithms as matrix multiplications instead of for > >loops. > > > > > There is a function in scipy.linalg called kron that could be brought > over which can do a repmat. > Thus, > > kron(ones((2,3)), arr) > > >>> sl.kron(ones((2,3)),arr) > array([[1, 2, 1, 2, 1, 2], > [3, 4, 3, 4, 3, 4], > [1, 2, 1, 2, 1, 2], > [3, 4, 3, 4, 3, 4]]) > > gives you the equivalent of > > repmat(arr, 2,3) Thanks! Merging this into numpy would be much appreciated. Stefan van der Walt did some benchmarks and this approach seems faster than anything we managed for 2D arrays. However, I'm a bit concerned about the ones(n,m) that is needed by this implementation. It seems to me that this would double the memory requirements of this repmat function, which is fine when working with small matrices, but could be a problem with larger ones. Any thoughts? Regards Albert From fullung at gmail.com Thu Feb 23 13:23:01 2006 From: fullung at gmail.com (Albert Strasheim) Date: Thu Feb 23 13:23:01 2006 Subject: [Numpy-discussion] repmat equivalent? In-Reply-To: <43FE1BFA.9070709@ieee.org> References: <5eec5f300602230331sefc7640r8298011705c424f2@mail.gmail.com> <43FE1BFA.9070709@ieee.org> Message-ID: <5eec5f300602231322r2793cdd9r9d3d8310a8fec559@mail.gmail.com> Hello On 2/23/06, Travis Oliphant wrote: > Albert Strasheim wrote: > > >Hello all > > > >I recently started using NumPy and one function that I am really > >missing from MATLAB/Octave is repmat. This function is very useful for > >implementing algorithms as matrix multiplications instead of for > >loops. > > > > > There is a function in scipy.linalg called kron that could be brought > over which can do a repmat. 
I quickly tried a few of my test cases with the following implementation
of repmat:

from numpy import asarray, ones
from scipy.linalg import kron

def repmat(a, m, n):
    # tile a in an m-by-n grid of copies
    a = asarray(a)
    return kron(ones((m, n)), a)

This test:

a = repmat(1, 1, 1)
assert_equal(a, 1)

fails with:

ValueError: 0-d arrays can't be concatenated

and this test:

a = repmat(array([1,2]), 2, 3)
assert_array_equal(a, array([[1,2,1,2,1,2],
                             [1,2,1,2,1,2]]))

fails with:

AssertionError: Arrays are not equal (shapes (12,), (2, 6) mismatch)

Regards

Albert

From alex.liberzon at gmail.com Thu Feb 23 15:46:08 2006
From: alex.liberzon at gmail.com (Alex Liberzon)
Date: Thu Feb 23 15:46:08 2006
Subject: [Numpy-discussion] repmat equivalent
Message-ID: <775f17a80602231545i21053caat78d9d87189d8f1c4@mail.gmail.com>

I am also mostly a Matlab user and I like repmat() a lot. I just
realized that in SciPy (I am confused about NumPy/SciPy, but it is
possible in both :-)) it is much, much easier. Just use r_[a,a] and
c_[a,a] and you get a concatenation like repmat() does. If you need 'm'
row-wise concatenations of a matrix, you can use (sorry for the ugly
way, Pythoners):

eval('r_['+m*'a,'+']')

then, the repmat is just (cute, isn't it):

def repmat(a,m,n):
    from scipy import r_, c_
    a = eval('r_['+m*'a,'+']')
    return eval('c_['+n*'a,'+']')

the test is:

>>> from scipy import *
numerix Numeric 24.2
>>> a = array([[0,1],[2,3]])
>>> a
array([[0, 1],
       [2, 3]])
>>> repmat(a,2,3)
array([[0, 1, 0, 1, 0, 1],
       [2, 3, 2, 3, 2, 3],
       [0, 1, 0, 1, 0, 1],
       [2, 3, 2, 3, 2, 3]])

Best,
Alex Liberzon

From gruben at bigpond.net.au Thu Feb 23 15:54:02 2006
From: gruben at bigpond.net.au (Gary Ruben)
Date: Thu Feb 23 15:54:02 2006
Subject: [Numpy-discussion] inconsistent use of axis= keyword argument?
In-Reply-To: 
References: 
Message-ID: <43FE4B00.7070306@bigpond.net.au>

Hi Vidar,
the pdf file appears to be broken. I get an error when I try to open it.
Have you thought of a nice way to generate html from the xml source to
incorporate this into the scipy website? I don't think it should be part
of the wiki. We'd need a way of making the xml editable via a wiki
interface and automatically generating multiple views or something.

regards,
Gary

Vidar Gundersen wrote:
> (i've been updating the cross reference of MATLAB synonymous
> commands in Numeric Python to NumPy. I've kept Numeric/numarray
> alternatives in the source XML, but omitted it in the PDF outputs.
> see http://37mm.no/download/matlab-python-xref.pdf.
> feedback is highly appreciated.)

From cookedm at physics.mcmaster.ca Thu Feb 23 16:15:01 2006
From: cookedm at physics.mcmaster.ca (David M. Cooke)
Date: Thu Feb 23 16:15:01 2006
Subject: [Numpy-discussion] optimizing power() for complex and real cases
In-Reply-To: <43FD31FA.6030802@cox.net> (Tim Hochberg's message of "Wed, 22
	Feb 2006 20:54:34 -0700")
References: <43F20BFE.5030100@cox.net> <43F36CB6.5050004@cox.net>
	<43F3B8A9.3000507@bigpond.net.au> <43F3CA12.4000907@cox.net>
	<43F62069.80209@cox.net> <43F7C73B.2000806@cox.net>
	<20060220003714.GB15783@arbutus.physics.mcmaster.ca>
	<43F938FA.80200@cox.net> <43FD31FA.6030802@cox.net>
Message-ID: 

Tim Hochberg writes:

> David M. Cooke wrote:
>
>>Tim Hochberg writes:
>>
>>>David M. Cooke wrote:
>>>
[SNIP]
>
>>I've gone through the code you checked in, and fixed it up. Looks
>>good. One side effect is that
>>
>>def zl(x):
>>    a = ones_like(x)
>>    a[:] = 0
>>    return a
>>
>>is now faster than zeros_like(x) :-)
>>
> I noticed that ones_like was faster than zeros_like, but I didn't
> think to try that.
That's pretty impressive considering how > ridicuously easy it was to write. Might be useful to move zeros_like and empty_like to ufuncs :-) (Well, they'd be better written as regular C functions, though.) >>One problem I had is that in PyArray_SetNumericOps, the "copy" method >>wasn't picked up on. It may be due to the order of initialization of >>the ndarray type, or something (since "copy" isn't a ufunc, it's >>initialized in a different place). I couldn't figure out how to fiddle >>that, so I replaced the x.copy() call with a call to PyArray_Copy(). >> >> > Interesting. It worked fine here. Actually, it works fine in the sense that it works. However, if you time it, it was obvious that it wasn't using an optimized version (x**1 was as slow as x**1.1). > I kind of like power and scalar_power. Then ** could be advertised as > calling scalar_power for scalars and power for arrays. Scalar power > would do optimizations on integer and half_integer powers. Of course > there's no real way to enforce that scalar power is passed scalars, > since presumably it would be a ufunc, short of making _scalar_power a > ufunc instead and doing something like: > > def scalar_power(x, y): > "compute x**y, where y is a scalar optimizing integer and half > integer powers possibly at some minor loss of accuracy" > if not is_scalar(y): raise ValuerError("Naughty!!") > return _scalar_power(x,y) I'm tempted to make it have the same signature as power, but call power if passed an array (or, at the ufunc level, if the stride for the second argument is non-zero). >>Another point is to look at __div__, and use reciprocal if the >>dividend is 1. >> >> > That would be easy, but wouldn't it be just as easy to optimize > __div__ for scalar divisions. Should probably check that this isn't > just as fast since it would be a lot more general. Hmm, scalar division and multiplication could both be speed up: In [36]: a = arange(10000, dtype=float) In [37]: %time for i in xrange(100000): a * 1.0 CPU times: user 3.30 s, sys: 0.00 s, total: 3.30 s Wall time: 3.39 In [38]: b = array([1.]) In [39]: %time for i in xrange(100000): a * b CPU times: user 2.63 s, sys: 0.00 s, total: 2.63 s Wall time: 2.71 The differences in times are probably due to creating an array to hold 1.0. When I have time, I'll look at the ufunc machinery. Since ufuncs are just passed pointers to data and strides, there's no reason (besides increasing complexity ;-) to build an ndarray object for scalars. Alternatively, allow passing scalars to ufuncs: you could define a ufunc (like our scalar_power) to take an array argument and a scalar argument. Or, power could be defined to take (array, array) or (array, scalar), and the ufunc machinery would choose the appropiate one. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From vidar+list at 37mm.no Thu Feb 23 16:18:00 2006 From: vidar+list at 37mm.no (Vidar Gundersen) Date: Thu Feb 23 16:18:00 2006 Subject: [Numpy-discussion] inconsistent use of axis= keyword argument? In-Reply-To: <43FE4B00.7070306@bigpond.net.au> (Gary Ruben's message of "Fri, 24 Feb 2006 10:53:36 +1100") References: <43FE4B00.7070306@bigpond.net.au> Message-ID: ===== Original message from Gary Ruben | 24 Feb 2006: > the pdf file appears to be broken. I get an error when I try to open it. sorry, i've uploaded the files again, and tested it, so hopefully it will work for you now. 
> Have you thought of a nice way to generate html from the xml source to
> incorporate this into the scipy website?

that's easy, i'll do it tomorrow.

> We'd need a way of making the xml editable via a wiki interface
> and automatically generating multiple views or something.

hmmm... :)

From ndarray at mac.com Thu Feb 23 16:45:04 2006
From: ndarray at mac.com (Sasha)
Date: Thu Feb 23 16:45:04 2006
Subject: [Numpy-discussion] A proposal to implement round in C Was: Rookie
	problems - Why is C-code much faster?
In-Reply-To: 
References: 
Message-ID: 

The rint ufunc and the ndarray method "round" are in svn. x.round(...)
is about 20x faster than around(x) for decimals=0 and about 6x faster
for decimals>0. The case decimals<0 is slower on integers, but it
actually does something :-)

From alex.liberzon at gmail.com Thu Feb 23 16:48:03 2006
From: alex.liberzon at gmail.com (Alex Liberzon)
Date: Thu Feb 23 16:48:03 2006
Subject: [Numpy-discussion] RE: repmat equivalent
Message-ID: <775f17a80602231647g1b26ac7vef5a9ec22b99d86d@mail.gmail.com>

Another thought:

def repmat(a,m,n):
    from scipy import hstack, vstack
    a = eval('hstack(('+n*'a,'+'))')
    return eval('vstack(('+m*'a,'+'))')

might hstack() and vstack() be better for 1-d arrays?

>>> from scipy import *
numerix Numeric 24.2
>>> a = array([1,2])
>>> repmat(a,2,3)
array([[1, 2, 1, 2, 1, 2],
       [1, 2, 1, 2, 1, 2]])
>>> equal(repmat(1,1,1),1)
array([ [1]],'b')

of course, scipy.linalg.kron(ones((m,n,p,...)),a) is more robust and
works for higher dimensions. probably it's the best.

From gruben at bigpond.net.au Thu Feb 23 17:00:04 2006
From: gruben at bigpond.net.au (Gary Ruben)
Date: Thu Feb 23 17:00:04 2006
Subject: [Numpy-discussion] inconsistent use of axis= keyword argument?
In-Reply-To: 
References: <43FE4B00.7070306@bigpond.net.au>
Message-ID: <43FE5A61.2080004@bigpond.net.au>

>> Have you thought of a nice way to generate html from the xml source to
>> incorporate this into the scipy website?
>
> that's easy, i'll do it tomorrow.

Hi Vidar,

I can open it now. Excellent. I can still see a couple of references to
numarray and Numeric in there which should be removed if possible,
including in the title.

Regarding an html version: It would be nice to generate as much
cross-reference material as possible out of the XML source. I don't
think there will be space for all the alternatives on a single html
page, so maybe you could generate separate html files for numpy versus
Matlab/Octave, numpy versus IDL, numpy versus R. Have you kept the
Numeric version in the source? If so, maybe you could also generate
numpy versus Numeric.

Another idea is to put the numpy column in one frame and all the others
together in a frame next to it which can be scrolled sideways to reveal
the other environment of choice. Another more difficult idea is to put
some javascript in it to allow selection, but this is probably not worth
the effort.

>> We'd need a way of making the xml editable via a wiki interface
>> and automatically generating multiple views or something.
>
> hmmm... :)

Yes; hmmm is right. Actually, I'm sure that's a bad idea. It's better if
you maintain control of the original, otherwise it will lose utility for
you as a general purpose cross reference which we, the lucky numpy
users, get a side benefit from.

Gary R.

From oliphant.travis at ieee.org Thu Feb 23 17:28:06 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Thu Feb 23 17:28:06 2006
Subject: [Numpy-discussion] A proposal to implement round in C Was: Rookie
	problems - Why is C-code much faster?
In-Reply-To: References: Message-ID: <43FE60E1.5040701@ieee.org> Sasha wrote: >Rint ufunc and ndarray metod "round" are in svn. x.round(...) is >about 20x faster than around(x) for decimals=0 and about 6x faster for >decimals>0. The case decimals<0 is slower on integers, but it >actually does something :-) > > Great job. Thanks for adding this, Sasha... I think many will enjoy using it. Regarding portability: On my system rint says it conforms to BSD 4.3. How portable is that? Can anyone try it out on say the MSVC compiler for windows? -Travis From wbaxter at gmail.com Thu Feb 23 17:30:19 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Thu Feb 23 17:30:19 2006 Subject: [Numpy-discussion] help(xxx) vs print(xxx.__doc__) Message-ID: Can someone explain why help(numpy.r_) doesn't contain all the information in print(numpy.r_.__doc__)? Namely you don't get the helpful example showing usage with 'help' that you get with '.__doc__'. I'd rather be able to use 'help' as the one-stop shop for built-in documentation. It's less typing and just looks nicer. Thanks, --Bill -------------- next part -------------- An HTML attachment was scrubbed... URL: From wbaxter at gmail.com Thu Feb 23 17:36:09 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Thu Feb 23 17:36:09 2006 Subject: [Numpy-discussion] A proposal to implement round in C Was: Rookie problems - Why is C-code much faster? In-Reply-To: <43FE60E1.5040701@ieee.org> References: <43FE60E1.5040701@ieee.org> Message-ID: There doesn't seem to be any rint() round() or nearestint() defined in MSVC 7.1. Can't find it in an MSDN search either. I think that's why a lot of people in the game biz, at least, use that lrintf function written using intrinsics that I posted a link to earlier. I first heard about that on the gd-algorithms mailing list. --Bill On 2/24/06, Travis Oliphant wrote: > > Sasha wrote: > > >Rint ufunc and ndarray metod "round" are in svn. x.round(...) is > >about 20x faster than around(x) for decimals=0 and about 6x faster for > >decimals>0. The case decimals<0 is slower on integers, but it > >actually does something :-) > > > > > Great job. Thanks for adding this, Sasha... > > I think many will enjoy using it. > > Regarding portability: On my system rint says it conforms to BSD 4.3. > How portable is that? > > Can anyone try it out on say the MSVC compiler for windows? > > -Travis > > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting > language > that extends applications into web and mobile media. Attend the live > webcast > and join the prime developer group breaking into this new coding > territory! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > -- William V. Baxter III OLM Digital Kono Dens Building Rm 302 1-8-8 Wakabayashi Setagaya-ku Tokyo, Japan 154-0023 +81 (3) 3422-3380 -------------- next part -------------- An HTML attachment was scrubbed... URL: From wbaxter at gmail.com Thu Feb 23 17:47:15 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Thu Feb 23 17:47:15 2006 Subject: [Numpy-discussion] A proposal to implement round in C Was: Rookie problems - Why is C-code much faster? 
In-Reply-To: References: <43FE60E1.5040701@ieee.org> Message-ID: Generally, C99 support in MSVC.NET is pretty much nil, except for maybe support for "inline" (which MS had already added prior to the C99 standard). This wikipedia article links to a quote from the Visual C++ program manager at Microsoft saying "In general we don't see a lot of demand for C99 features": http://en.wikipedia.org/wiki/C_programming_language#C99 So it's not clear the situtation will change any time soon. I don't know if VC8 is any better in its C99 support. I doubt it. Wikipedia says Borland is dragging their feet too. --Bill On 2/24/06, Bill Baxter wrote: > > There doesn't seem to be any rint() round() or nearestint() defined in > MSVC 7.1. Can't find it in an MSDN search either. I think that's why a > lot of people in the game biz, at least, use that lrintf function written > using intrinsics that I posted a link to earlier. I first heard about that > on the gd-algorithms mailing list. > > --Bill > > On 2/24/06, Travis Oliphant wrote: > > > > Sasha wrote: > > > > >Rint ufunc and ndarray metod "round" are in svn. x.round(...) is > > >about 20x faster than around(x) for decimals=0 and about 6x faster for > > >decimals>0. The case decimals<0 is slower on integers, but it > > >actually does something :-) > > > > > > > > Great job. Thanks for adding this, Sasha... > > > > I think many will enjoy using it. > > > > Regarding portability: On my system rint says it conforms to BSD 4.3. > > How portable is that? > > > > Can anyone try it out on say the MSVC compiler for windows? > > > > -Travis > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ryanlists at gmail.com Thu Feb 23 18:09:00 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Thu Feb 23 18:09:00 2006 Subject: [Numpy-discussion] array.argmax() question Message-ID: Is this the expected behavior for array.argmax(): ipdb> abs(real(disps)).max() Out[38]: 1.7373584411866401e-05 ipdb> abs(real(disps)).argmax() Out[38]: 32 ipdb> shape(disps) Out[38]: (11, 3) ipdb> disps[11,1] *** IndexError: invalid index ipdb> disps[10,1] Out[38]: 0j ipdb> disps[10,2] Out[38]: (-1.7373584411866401e-05+5.2046737124258386e-21j) Basically, I want to find the element with the largest absolute value in a matrix and use it to scale by. But I need to correct for the possibility that the largest abs value may be from a negative number. So, I need to get the corresponding element itself. My array is shape (11,3) and argmax without an axis argument returns 32, which would be the index if the matrix was reshaped into a (33,) vector. Is there a clean way to extract the element based on the output of argmax? (and in my case it is actually using the output of argmax to extract the element from the matrix without the abs). Or do I need to reshape the matrix into a vector first? Thanks, Ryan From ryanlists at gmail.com Thu Feb 23 18:11:06 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Thu Feb 23 18:11:06 2006 Subject: [Numpy-discussion] Re: array.argmax() question In-Reply-To: References: Message-ID: Is the right answer to just use flatten()? i.e. 
ind=abs(mymat).argmax() maxelem=mymat.flatten()[ind] On 2/23/06, Ryan Krauss wrote: > Is this the expected behavior for array.argmax(): > > ipdb> abs(real(disps)).max() > Out[38]: 1.7373584411866401e-05 > ipdb> abs(real(disps)).argmax() > Out[38]: 32 > ipdb> shape(disps) > Out[38]: (11, 3) > ipdb> disps[11,1] > *** IndexError: invalid index > ipdb> disps[10,1] > Out[38]: 0j > ipdb> disps[10,2] > Out[38]: (-1.7373584411866401e-05+5.2046737124258386e-21j) > > Basically, I want to find the element with the largest absolute value > in a matrix and use it to scale by. But I need to correct for the > possibility that the largest abs value may be from a negative number. > So, I need to get the corresponding element itself. > > My array is shape (11,3) and argmax without an axis argument returns > 32, which would be the index if the matrix was reshaped into a (33,) > vector. Is there a clean way to extract the element based on the > output of argmax? (and in my case it is actually using the output of > argmax to extract the element from the matrix without the abs). Or do > I need to reshape the matrix into a vector first? > > Thanks, > > Ryan > From cookedm at physics.mcmaster.ca Thu Feb 23 18:28:00 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Thu Feb 23 18:28:00 2006 Subject: [Numpy-discussion] help(xxx) vs print(xxx.__doc__) In-Reply-To: (Bill Baxter's message of "Fri, 24 Feb 2006 10:29:20 +0900") References: Message-ID: "Bill Baxter" writes: > Can someone explain why help(numpy.r_) doesn't contain all the information in > print(numpy.r_.__doc__)? > > Namely you don't get the helpful example showing usage with 'help' that you get > with '.__doc__'. > > I'd rather be able to use 'help' as the one-stop shop for built-in > documentation. It's less typing and just looks nicer. Huh, odd. Note that in IPython, numpy.r_? and numpy.r_.__doc__ give the same results. And I thought I was being clever when I rewrote numpy.r_ :-) Looks like help() looks at the class __doc__ first, while IPython looks at the object's __doc__ first. I've fixed this in svn. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From steve at shrogers.com Thu Feb 23 18:34:04 2006 From: steve at shrogers.com (Steven H. Rogers) Date: Thu Feb 23 18:34:04 2006 Subject: [Numpy-discussion] Thoughts on an ndarray super-class In-Reply-To: <43FD5914.4060506@ieee.org> References: <43FD5914.4060506@ieee.org> Message-ID: <43FE709C.8040701@shrogers.com> I don't have an immediate use for this, but if available, I expect that it would be used. Steve //////////////////////// Travis Oliphant wrote: > > The bigndarray class is going to disappear (probably in the next release > of NumPy). It was a stop-gap measure as the future of 64-bit fixes in > Python was unclear. Python 2.5 will have removed the 64-bit limitations > that led to the bigndarray and so it will be removed. > I have been thinking, however, of replacing it with a super-class that > does not define the dimensions or strides. > In other words, the default array would be just a block of memory. The > standard array would inherit from the default and add dimension and > strides pointers. > > I was thinking that this might make it easier for sub-classes using > fixed-sized dimensions and strides. I'm not sure if that would actually > be useful, but since I was thinking about the disappearance of the > bigndarray, I thought I would ask for comments. 
> > -Travis > > > > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting language > that extends applications into web and mobile media. Attend the live > webcast > and join the prime developer group breaking into this new coding territory! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > -- Steven H. Rogers, Ph.D., steve at shrogers.com Weblog: http://shrogers.com/weblog "He who refuses to do arithmetic is doomed to talk nonsense." -- John McCarthy From ndarray at mac.com Thu Feb 23 19:28:03 2006 From: ndarray at mac.com (Sasha) Date: Thu Feb 23 19:28:03 2006 Subject: [Numpy-discussion] A case for rank-0 arrays In-Reply-To: <200602231803.27417.faltet@carabos.com> References: <200602231803.27417.faltet@carabos.com> Message-ID: On 2/23/06, Francesc Altet wrote: > It's a bit late, but I want to support your proposal (most of it). You are not late -- you are the first to reply! When you say "most of it," is there anything in particular that you don't like? > I've also come to the conclusion that scalars and rank-0 arrays should > coexist. This is something that appears as a natural fact when you > have to deal regularly with general algorithms for treat objects with > different shapes. And I think you have put this very well. Thanks for your kind words. If we agree to legitimize rank-0 arrays, maybe we should start by removing conversion to scalars from ufuncs. Currently: >>> type(array(2)*2) I believe it should result in a rank-0 array instead. I've recently wrote ndarray round function and that code illustrates the problem of implicite scalar conversion: ret = PyNumber_Multiply((PyObject *)a, f); if (ret==NULL) {Py_DECREF(f); return NULL;} if (PyArray_IsScalar(ret, Generic)) { /* array scalars cannot be modified inplace */ PyObject *tmp; tmp = PyObject_CallFunction(n_ops.rint, "O", ret); Py_DECREF(ret); ret = PyObject_CallFunction(n_ops.divide, "OO", tmp, f); Py_DECREF(tmp); } else { PyObject_CallFunction(n_ops.rint, "OO", ret, ret); PyObject_CallFunction(n_ops.divide, "OOO", ret, f, ret); } From oliphant.travis at ieee.org Thu Feb 23 21:08:01 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Feb 23 21:08:01 2006 Subject: [Numpy-discussion] A case for rank-0 arrays In-Reply-To: References: <200602231803.27417.faltet@carabos.com> Message-ID: <43FE9478.9040309@ieee.org> Sasha wrote: >On 2/23/06, Francesc Altet wrote: > > >>It's a bit late, but I want to support your proposal (most of it). >> >> > >You are not late -- you are the first to reply! When you say "most of >it," is there anything in particular that you don't like? > > Usually nobody has a strong opinion on these issues until they encounter something they don't like. I think many are still trying to understand what a rank-0 array is. >>>>type(array(2)*2) >>>> >>>> > > >I believe it should result in a rank-0 array instead. > > Can you be more precise about when rank-0 array should be returned and when scalars should be? >I've recently wrote ndarray round function and that code illustrates >the problem of implicite scalar conversion: > > I think we will have issues no matter what because rank-0 arrays and scalars have always been with us. We just need to nail down some rules for when they will show up and live by them. 
Right now the rule is basically: rank-0 arrays become array-scalars all the time. The exceptions are rank0.copy() rank0.view() array(5) scalar.__array__() one_el.shape=() If you can come up with a clear set of rules for when rank-0 arrays should show up and when scalars should show up, then we will understand better what you want to do. -Travis From oliphant.travis at ieee.org Thu Feb 23 21:34:01 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Feb 23 21:34:01 2006 Subject: [Numpy-discussion] A case for rank-0 arrays In-Reply-To: References: Message-ID: <43FE9A87.3050104@ieee.org> Sasha wrote: >The main criticism of supporting both scalars and rank-0 arrays is >that it is "unpythonic" in the sense that it provides two almost >equivalent ways to achieve the same result. However, I am now >convinced that this is the case where practicality beats purity. > > I think most of us agree that both will be with us for the indefinite future. >The situation with ndarrays is somewhat similar. A rank-N array is >very similar to a function with N arguments, where each argument has a >finite domain (i-th domain of a is range(a.shape[i])). A rank-0 array >is just a function with no arguments and as such it is quite different >from a scalar. > I can buy this view. Nicely done. >Just as a function with no arguments cannot be >replaced by a constant in the case when a value returned may change >during the run of the program, rank-0 array cannot be replaced by an >array scalar because it is mutable. (See >http://projects.scipy.org/scipy/numpy/wiki/ZeroRankArray for use >cases). > >Rather than trying to hide rank-0 arrays from the end-user and treat >it as an implementation artifact, I believe numpy should emphasize the >difference between rank-0 arrays and scalars and have clear rules on >when to use what. > > I agree. The problem is what should the rules be. Right now, there are no clear rules other than rank-0 arrays --- DONT. You make a case that we should not be so hard on rank-0 arrays. >PROPOSALS >========== > >Here are three suggestions: > >1. Probably the most controversial question is what getitem should >return. I believe that most of the confusion comes from the fact that >the same syntax implements two different operations: indexing and >projection (for the lack of better name). Using the analogy between >ndarrays and functions, indexing is just the application of the >function to its arguments and projection is the function projection >((f, x) -> lambda (*args): f(x, *args)). > >The problem is that the same syntax results in different operations >depending on the rank of the array. > >Let > > >>>>x = ones((2,2)) >>>>y = ones(2) >>>> >>>> > >then x[1] is projection and type(x[1]) is ndarray, but y[1] is >indexing and type(y[1]) is int32. Similarly, y[1,...] is indexing, >while x[1,...] is projection. > >I propose to change numpy rules so that if ellipsis is present inside >[], the operation is always projection and both y[1,...] and >x[1,1,...] return zero-rank arrays. Note that I have previously >rejected Francesc's idea that x[...] and x[()] should have different >meaning for zero-rank arrays. I was wrong. > > I think this is a good and clear rule. And it seems like we may be "almost" there. Anybody want to implement it? >2. Another source of ambiguity is the various "reduce" operations such >as sum or max. Using the previous example, type(x.sum(axis=0)) is >ndarray, but type(y.sum(axis=0)) is int32. I propose two changes: > > a. 
Make x.sum(axis) return ndarray unless axis is None, making >type(y.sum(axis=0)) is ndarray true in the example. > > > Hmm... I'm not sure. y.sum(axis=0) is the default spelling of sum(y). Thus, this would cause all old code to return a rank-0 array. Most people who write sum(y) want a scalar, not a "function with 0 arguments" > b. Allow axis to be a sequence of ints and make >x.sum(axis=range(rank(x))) return rank-0 array according to the rule >2.a above. > > So, this would sum over multiple axes? I guess I'm not opposed to something like that, but I'm not really excited about it either. Would that make sense for all methods that take the axis= argument? > c. Make x.sum() raise an error for rank-0 arrays and scalars, but >allow x.sum(axis=()) to return x. This will make numpy sum consistent >with the built-in sum that does not work on scalars. > > > I don't think I like this at all. This proposal has more far-reaching implications (and would require more code changes --- though the axis= arguments do have a converter function and so would not be as painful as one might imagine). In short, I don't feel as enthused about portion 2 of your proposal. >3. This is a really small change currently > > >>>>empty(()) >>>> >>>> >array(0) > >but > > > > >I propose to make shape=() valid in ndarray constructor. > > +1 I think we need more thinking about rank-0 arrays before doing something like proposal 2. However, 1 and 3 seem simple enough to move forward with... -Travis From cookedm at physics.mcmaster.ca Thu Feb 23 22:03:02 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Thu Feb 23 22:03:02 2006 Subject: [Numpy-discussion] optimizing power() for complex and real cases In-Reply-To: (David M. Cooke's message of "Thu, 23 Feb 2006 19:14:11 -0500") References: <43F20BFE.5030100@cox.net> <43F36CB6.5050004@cox.net> <43F3B8A9.3000507@bigpond.net.au> <43F3CA12.4000907@cox.net> <43F62069.80209@cox.net> <43F7C73B.2000806@cox.net> <20060220003714.GB15783@arbutus.physics.mcmaster.ca> <43F938FA.80200@cox.net> <43FD31FA.6030802@cox.net> Message-ID: cookedm at physics.mcmaster.ca (David M. Cooke) writes: > Hmm, scalar division and multiplication could both be speed up: > > In [36]: a = arange(10000, dtype=float) > In [37]: %time for i in xrange(100000): a * 1.0 > CPU times: user 3.30 s, sys: 0.00 s, total: 3.30 s > Wall time: 3.39 > In [38]: b = array([1.]) > In [39]: %time for i in xrange(100000): a * b > CPU times: user 2.63 s, sys: 0.00 s, total: 2.63 s > Wall time: 2.71 > > The differences in times are probably due to creating an array to hold > 1.0. > > When I have time, I'll look at the ufunc machinery. Since ufuncs are > just passed pointers to data and strides, there's no reason (besides > increasing complexity ;-) to build an ndarray object for scalars. I've had a look: basically, if you pass 1.0, say, to a ufunc, it ends up going through PyArray_FromAny. This did checks for the array interface first (looking for attributes __array_shape__, __array_typestr__, __array_struct__, __array__, etc.). These would always fail for Python scalar types. I special-cased Python scalar types (bool, int, long, float, complex) in PyArray_FromAny so they are checked for first. This *does* have the side effect that if you have a subclass of one of these that does define the array interface, that interface is not used. If anybody's worried about that, well...tough :-) Give me a reasonable test case for subclassing a Python scalar and adding the array interface. 
This gives me the times In [1]: a = arange(10000, dtype=float) In [2]: %time for i in xrange(100000): a * 1.0 CPU times: user 2.76 s, sys: 0.00 s, total: 2.76 s Wall time: 2.85 In [3]: b = array([1.]) In [4]: %time for i in xrange(100000): a * b CPU times: user 2.69 s, sys: 0.00 s, total: 2.69 s Wall time: 2.76 The overhead of a * 1.0 is 3% compared to a * b here, as opposed to 25% in my last set of numbers. [for those jumping in, this is all still in the power_optimization branch] -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From stefan at sun.ac.za Fri Feb 24 00:13:01 2006 From: stefan at sun.ac.za (Stefan van der Walt) Date: Fri Feb 24 00:13:01 2006 Subject: [Numpy-discussion] repmat equivalent? In-Reply-To: <5eec5f300602231321x12da7721m8d7fa2aa7fc6edf0@mail.gmail.com> References: <5eec5f300602230331sefc7640r8298011705c424f2@mail.gmail.com> <43FE1BFA.9070709@ieee.org> <5eec5f300602231321x12da7721m8d7fa2aa7fc6edf0@mail.gmail.com> Message-ID: <20060224081132.GB13312@alpha> On Thu, Feb 23, 2006 at 11:21:39PM +0200, Albert Strasheim wrote: > > Thus, > > > > kron(ones((2,3)), arr) > > > > >>> sl.kron(ones((2,3)),arr) > > array([[1, 2, 1, 2, 1, 2], > > [3, 4, 3, 4, 3, 4], > > [1, 2, 1, 2, 1, 2], > > [3, 4, 3, 4, 3, 4]]) > > > > gives you the equivalent of > > > > repmat(arr, 2,3) > > Thanks! Merging this into numpy would be much appreciated. Stefan van > der Walt did some benchmarks and this approach seems faster than > anything we managed for 2D arrays. My benchmark was wrong -- this function is not as fast as the version Albert previously proposed. Below follows the benchmark of seven possible repmat functions: --------------------------------------------------------------------------- 0 : 1.09316706657 (Albert) 1 : 6.15612506866 (Stefan) 2 : 5.21671295166 (Stefan) 3 : 2.78160500526 (Stefan) 4 : 1.20426011086 (Albert Optimised) 5 : 11.0923781395 (Travis) 6 : 3.47499799728 (Alex) --------------------------------------------------------------------------- 0 : 1.17543005943 1 : 6.03165698051 2 : 5.7597899437 3 : 2.40381717682 4 : 1.09497308731 5 : 11.6657807827 6 : 7.11567497253 --------------------------------------------------------------------------- 0 : 2.03999996185 1 : 9.87535595894 2 : 8.86893296242 3 : 4.56993699074 4 : 2.02298903465 5 : 22.8858327866 6 : 10.7882151604 --------------------------------------------------------------------------- I attach the code. St?fan -------------- next part -------------- A non-text attachment was scrubbed... Name: repmat.py Type: text/x-python Size: 1437 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: repmat_bench.py Type: text/x-python Size: 682 bytes Desc: not available URL: From faltet at carabos.com Fri Feb 24 00:53:02 2006 From: faltet at carabos.com (Francesc Altet) Date: Fri Feb 24 00:53:02 2006 Subject: [Numpy-discussion] A case for rank-0 arrays In-Reply-To: <43FE9478.9040309@ieee.org> References: <43FE9478.9040309@ieee.org> Message-ID: <200602240952.42091.faltet@carabos.com> A Divendres 24 Febrer 2006 06:07, Travis Oliphant va escriure: > Sasha wrote: > >>>>type(array(2)*2) > > > > > > > >I believe it should result in a rank-0 array instead. Yes, this sounds reasonable, IMHO. > Right now the rule is basically: > > rank-0 arrays become array-scalars all the time. 
> > The exceptions are > > rank0.copy() > rank0.view() > array(5) > scalar.__array__() > one_el.shape=() > > If you can come up with a clear set of rules for when rank-0 arrays > should show up and when scalars should show up, then we will understand > better what you want to do. Yeah. I think Travis is right. A set of rules clearly stating the situations where rank-0 arrays have to become scalars maybe worth the effort. Travis already has the above list and now is just a matter of discovering other exceptions that should go there. So, please, if anybody comes with more exceptions, go ahead and propose them. Cheers, -- >0,0< Francesc Altet ? ? http://www.carabos.com/ V V C?rabos Coop. V. ??Enjoy Data "-" From faltet at carabos.com Fri Feb 24 01:01:02 2006 From: faltet at carabos.com (Francesc Altet) Date: Fri Feb 24 01:01:02 2006 Subject: [Numpy-discussion] A case for rank-0 arrays In-Reply-To: References: Message-ID: <200602241000.25090.faltet@carabos.com> A Dissabte 18 Febrer 2006 20:49, Sasha va escriure: > PROPOSALS > ========== > > Here are three suggestions: > > 1. Probably the most controversial question is what getitem should > return. I believe that most of the confusion comes from the fact that > the same syntax implements two different operations: indexing and > projection (for the lack of better name). Using the analogy between > ndarrays and functions, indexing is just the application of the > function to its arguments and projection is the function projection > ((f, x) -> lambda (*args): f(x, *args)). > > The problem is that the same syntax results in different operations > depending on the rank of the array. > > Let > > >>> x = ones((2,2)) > >>> y = ones(2) > > then x[1] is projection and type(x[1]) is ndarray, but y[1] is > indexing and type(y[1]) is int32. Similarly, y[1,...] is indexing, > while x[1,...] is projection. > > I propose to change numpy rules so that if ellipsis is present inside > [], the operation is always projection and both y[1,...] and > x[1,1,...] return zero-rank arrays. Note that I have previously > rejected Francesc's idea that x[...] and x[()] should have different > meaning for zero-rank arrays. I was wrong. +1 (if I want to be consequent ;-) I guess that this would imply that: In [19]: z=numpy.array(1) In [20]: type(z[()]) Out[20]: In [21]: type(z[...]) Out[21]: isn't it? > 2. Another source of ambiguity is the various "reduce" operations such > as sum or max. Using the previous example, type(x.sum(axis=0)) is > ndarray, but type(y.sum(axis=0)) is int32. I propose two changes: > > a. Make x.sum(axis) return ndarray unless axis is None, making > type(y.sum(axis=0)) is ndarray true in the example. > > b. Allow axis to be a sequence of ints and make > x.sum(axis=range(rank(x))) return rank-0 array according to the rule > 2.a above. > > c. Make x.sum() raise an error for rank-0 arrays and scalars, but > allow x.sum(axis=()) to return x. This will make numpy sum consistent > with the built-in sum that does not work on scalars. Well, to say the truth, I've not a strong opinion on this one (this is why I "mostly" supported your proposal ;-), but I think that if Travis has reasons to oppose to it, we should listen to him. > 3. This is a really small change currently > I propose to make shape=() valid in ndarray constructor. +1 Cheers, -- >0,0< Francesc Altet ? ? http://www.carabos.com/ V V C?rabos Coop. V. 
 "-"

From wbaxter at gmail.com Fri Feb 24 01:04:03 2006
From: wbaxter at gmail.com (Bill Baxter)
Date: Fri Feb 24 01:04:03 2006
Subject: [Numpy-discussion] Matrix times scalar is wacky
Message-ID: 

Multiplying a matrix times a scalar seems to return junk for some reason:

>>> A = numpy.asmatrix(numpy.rand(1,2))
>>> A
matrix([[ 0.30604211,  0.98475225]])
>>> A * 0.2
matrix([[  6.12084210e-002,  7.18482614e-290]])
>>> 0.2 * A
matrix([[  6.12084210e-002,  7.18482614e-290]])
>>> numpy.__version__
'0.9.5'

--billyb
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From faltet at carabos.com Fri Feb 24 01:22:07 2006
From: faltet at carabos.com (Francesc Altet)
Date: Fri Feb 24 01:22:07 2006
Subject: [Numpy-discussion] A case for rank-0 arrays
In-Reply-To: <43FE6595.7050309@sympatico.ca>
References: <200602231803.27417.faltet@carabos.com>
	<43FE6595.7050309@sympatico.ca>
Message-ID: <200602241021.27150.faltet@carabos.com>

On Friday 24 February 2006 02:47, you wrote:
>>Could these be considered as dimensionless, to avoid having to explain
>>to people that the word rank doesn't have the same meaning as the matrix
>>rank?

Colin Williams was proposing calling arrays coming from array(5)
'dimensionless'. So, for the moment, we have three ways to name such
beasts:

- 'rank-0'
- '0-dimensional' or '0-dim' for short
- 'dimensionless'

Perhaps the time has come to choose a name for them (if we have to live
with such arrays for a long time, as it seems to be the case).

However, the more I think about this, the more I'm convinced that,
similarly to their higher dimension counterparts, we will not arrive at
any definite agreement, and people will continue to call them whatever
they are most comfortable with. IMO, this should not be a problem at
all, because all three words express a 'lack of dimensionality'.

Cheers,

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"

From oliphant.travis at ieee.org Fri Feb 24 01:59:02 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Fri Feb 24 01:59:02 2006
Subject: [Numpy-discussion] Matrix times scalar is wacky
In-Reply-To: 
References: 
Message-ID: <43FED896.3060301@ieee.org>

Bill Baxter wrote:
> Multiplying a matrix times a scalar seems to return junk for some reason:
>
> >>> A = numpy.asmatrix(numpy.rand(1,2))
> >>> A
> matrix([[ 0.30604211,  0.98475225]])
> >>> A * 0.2
> matrix([[  6.12084210e-002,  7.18482614e-290]])
> >>> 0.2 * A
> matrix([[  6.12084210e-002,  7.18482614e-290]])
> >>> numpy.__version__

Unfortunately, there are still some bugs in the scalar multiplication
section of _dotblas.c stemming from a re-write that allows discontiguous
matrices. We are still tracking down the problems. Hopefully this should
be fixed soon.

-Travis

From svetosch at gmx.net Fri Feb 24 02:25:04 2006
From: svetosch at gmx.net (Sven Schreiber)
Date: Fri Feb 24 02:25:04 2006
Subject: [Numpy-discussion] repmat equivalent?
In-Reply-To: <43FE1BFA.9070709@ieee.org>
References: <5eec5f300602230331sefc7640r8298011705c424f2@mail.gmail.com>
	<43FE1BFA.9070709@ieee.org>
Message-ID: <43FEDEC7.5080901@gmx.net>

Travis Oliphant wrote:
> There is a function in scipy.linalg called kron that could be brought
> over which can do a repmat.
>
> def kron(a,b):
>     """kronecker product of a and b

That would be great, I have been missing it in numpy already! (Because
scipy is rather big, I'd like to avoid depending on it for such things.)
So please do bring it over.
Thanks,
Sven

From oliphant.travis at ieee.org Fri Feb 24 02:53:01 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Fri Feb 24 02:53:01 2006
Subject: [Numpy-discussion] Matrix times scalar is wacky
In-Reply-To: 
References: 
Message-ID: <43FEE539.9010606@ieee.org>

Bill Baxter wrote:
> Multiplying a matrix times a scalar seems to return junk for some reason:
>
> >>> A = numpy.asmatrix(numpy.rand(1,2))
> >>> A
> matrix([[ 0.30604211,  0.98475225]])
> >>> A * 0.2
> matrix([[  6.12084210e-002,  7.18482614e-290]])
> >>> 0.2 * A
> matrix([[  6.12084210e-002,  7.18482614e-290]])
> >>> numpy.__version__
> '0.9.5'
>
This should be fixed in SVN.

-Travis

From cjw at sympatico.ca Fri Feb 24 04:18:09 2006
From: cjw at sympatico.ca (Colin J. Williams)
Date: Fri Feb 24 04:18:09 2006
Subject: [Numpy-discussion] A case for rank-0 arrays
In-Reply-To: <200602241021.27150.faltet@carabos.com>
References: <200602231803.27417.faltet@carabos.com>
	<43FE6595.7050309@sympatico.ca>
	<200602241021.27150.faltet@carabos.com>
Message-ID: <43FEF953.1080906@sympatico.ca>

Francesc Altet wrote:

>On Friday 24 February 2006 02:47, you wrote:
>
>>Could these be considered as dimensionless, to avoid having to explain
>>to people that the word rank doesn't have the same meaning as the matrix
>>rank?
>
>Colin Williams was proposing calling arrays coming from array(5)
>'dimensionless'. So, for the moment, we have three ways to name such
>beasts:
>
>- 'rank-0'
>- '0-dimensional' or '0-dim' for short
>- 'dimensionless'
>

My suggestion was based on the usage of rank, with a different meaning,
in matrices. n-dim, with n = 0 .. ?, seems a neat way around.

Colin W.

>Perhaps the time has come to choose a name for them (if we have to live
>with such arrays for a long time, as it seems to be the case).
>
>However, the more I think about this, the more I'm convinced that,
>similarly to their higher dimension counterparts, we will not arrive at
>any definite agreement, and people will continue to call them whatever
>they are most comfortable with. IMO, this should not be a problem at
>all, because all three words express a 'lack of dimensionality'.
>
>Cheers,
>

From magnus at hetland.org Fri Feb 24 04:27:05 2006
From: magnus at hetland.org (Magnus Lie Hetland)
Date: Fri Feb 24 04:27:05 2006
Subject: [Numpy-discussion] Simple NumPy-compatible vector w/C++ & SWIG?
Message-ID: <9C3EBDD9-2C89-4587-9795-29906E150120@hetland.org>

Hi!

I'm working on a data structure library where one of the element types
most likely will be a vector type (i.e., points in a multidimensional
space, with the dimensionality set by the user). In the data structure
(which is disk-based) I have to work with raw bytes that I'd like to
copy around as little as possible.

The library itself is (being) written in C++, but I'm wrapping it with
SWIG so I can drive and test it with Python. It seems to me that
something NumPy-compatible might be the best choice for the vector
type, but I'm not sure how I should do that.

I've been thinking about simply implementing a minimal compatibility
layer for the NumPy Array Interface; is it then possible to construct
a NumPy array using this custom array, and get full support for the
various array operations without actually copying the data?

And: Any ideas on what to do on the C++ side? Is there any code/library
out there for a vector-thing that works well in C++ *and* that has
wrapping code for NumPy?
(I know the STL vector is wrapped to a Python list by default -- I'm just thinking that including those things in the equation would lead to lots of copied data...) -- Magnus Lie Hetland http://hetland.org From oliphant.travis at ieee.org Fri Feb 24 04:39:04 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Fri Feb 24 04:39:04 2006 Subject: [Numpy-discussion] Matrix times scalar is wacky In-Reply-To: References: <43FEE539.9010606@ieee.org> Message-ID: <43FEFE3C.9080309@ieee.org> Bill Baxter wrote: > Excellent. > Is working on numpy your full time job? You certainly seem to be > putting a full-time effort into it at any rate. It is appreciated. No :-) But, success of NumPy is critical for me. This rate of effort will have to significantly decrease very soon. I'm very anxious to get NumPy stable, though, and so I try to respond quickly to serious errors when I can --- I can be somewhat of a perfectionist so that problems eat at me until they are fixed --- perhaps there is medicine I can take ;-) Fortunately, several people are becoming familar with the internals of the code which is essential so that it can carry forward when my time to spend on it wanes. Thanks for the appreciation. -Travis From fullung at gmail.com Fri Feb 24 05:43:01 2006 From: fullung at gmail.com (Albert Strasheim) Date: Fri Feb 24 05:43:01 2006 Subject: [Numpy-discussion] Visual Studio build broken due to array types changes Message-ID: <004701c63948$17d484f0$6363630a@dsp.sun.ac.za> Hello all Recent changes to multiarraymodule.c has broken the build with Visual Studio for revision 2164 of numpy. I'd fix the problem, but I'm still a bit new to the NumPy sources. I've attached the build log in case someone can come up with a quick fix. Meanwhile, the SciPy build is also broken. Is Visual Studio considered to be a supported compiler, or is Mingw's GCC the only supported compiler for Windows builds? Regards Albert -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: numpy-msvc-rev2164.txt URL: From aisaac at american.edu Fri Feb 24 06:07:01 2006 From: aisaac at american.edu (Alan G Isaac) Date: Fri Feb 24 06:07:01 2006 Subject: [Numpy-discussion] repmat equivalent? In-Reply-To: <20060224081132.GB13312@alpha> References: <5eec5f300602230331sefc7640r8298011705c424f2@mail.gmail.com> <43FE1BFA.9070709@ieee.org> <5eec5f300602231321x12da7721m8d7fa2aa7fc6edf0@mail.gmail.com><20060224081132.GB13312@alpha> Message-ID: On Fri, 24 Feb 2006, Stefan van der Walt apparently wrote: > Below follows the benchmark of seven possible repmat > functions: Probably I a misunderstanding something here. But I thought the idea of repmat was that it used a single copy of the data to represent multiple copies in a matrix, and all these functions seem ultimately to use multiple copies of the data. If that is right, then a repmat should be a subclass of matrix. I think ... Cheers, Alan Isaac From dalcinl at gmail.com Fri Feb 24 06:14:08 2006 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri Feb 24 06:14:08 2006 Subject: [Numpy-discussion] Simple NumPy-compatible vector w/C++ & SWIG? In-Reply-To: <9C3EBDD9-2C89-4587-9795-29906E150120@hetland.org> References: <9C3EBDD9-2C89-4587-9795-29906E150120@hetland.org> Message-ID: Magnus, I send you attached a SWIG file I use to interface PETSc libraries and NumPy. It is a series of macros (perhaps a bit nested), but I think this can help you to quickly define IN/OUT/INOUT typemaps for arguments like this: (int size, double* data). 
This approach was always enough for me. If your use case is more elaborated, please feel free to ask for other alternatives. On 2/24/06, Magnus Lie Hetland wrote: > Hi! > > I'm working on a data structure library where one of the element > types most likely will be a vector type (i.e., points in a > multidimensional space, with the dimensionality set by the user). In > the data structure (which is disk-based) I have work with raw bytes > that I'd like to copy around as little as possible. > > The library itself is (being) written in C++, but I'm wrapping it > with SWIG so I can drive and test it with Python. It seems to me that > something NumPy-compatible might be the best choice for the vector > type, but I'm not sure how I should do that. > > I've been thinking about simply implementing a minimal compatibility > layer for the NumPy Array Interface; is it then possible to construct > a NumPy array using this custom array, and get full support for the > various array operations without actually copying the data? > > And: Any ideas on what to do on the C++ side? Is there any code/ > library out there for a vector-thing that works well in C++ *and* > that has wrapping code for NumPy? (I know the STL vector is wrapped > to a Python list by default -- I'm just thinking that including those > things in the equation would lead to lots of copied data...) > > -- > Magnus Lie Hetland > http://hetland.org > > > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting language > that extends applications into web and mobile media. Attend the live webcast > and join the prime developer group breaking into this new coding territory! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 -------------- next part -------------- A non-text attachment was scrubbed... Name: numpy.i Type: application/octet-stream Size: 21182 bytes Desc: not available URL: From aisaac at american.edu Fri Feb 24 06:15:12 2006 From: aisaac at american.edu (Alan G Isaac) Date: Fri Feb 24 06:15:12 2006 Subject: [Numpy-discussion] Matrix times scalar is wacky In-Reply-To: References: Message-ID: On Fri, 24 Feb 2006, Bill Baxter apparently wrote: > Multiplying a matrix times a scalar seems to return junk for some reason: Confirmed. Alan Isaac Python 2.4.1 (#65, Mar 30 2005, 09:13:57) [MSC v.1310 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. 
>>> import numpy as N
>>> N.__version__
'0.9.5'
>>> a=N.array([[1,2],[3,4]],'f')
>>> b=N.mat(a)
>>> a*0.2
array([[ 0.2       ,  0.40000001],
       [ 0.60000002,  0.80000001]], dtype=float32)
#this works ok
>>> b*0.2
matrix([[ 0.2,  0.4],
        [ 0.6,  0.8]])
>>> a = N.rand(1,2)
>>> b = N.mat(a)
>>> a*0.2
array([[ 0.01992175,  0.09690914]])
# this does not work ok
>>> b*0.2
matrix([[  1.99217540e-002,  2.22617305e-309]])
>>>

From ndarray at mac.com Fri Feb 24 06:25:03 2006
From: ndarray at mac.com (Sasha)
Date: Fri Feb 24 06:25:03 2006
Subject: [Numpy-discussion] A case for rank-0 arrays
In-Reply-To: <43FE9478.9040309@ieee.org>
References: <200602231803.27417.faltet@carabos.com> <43FE9478.9040309@ieee.org>
Message-ID:

On 2/24/06, Travis Oliphant wrote:
> Sasha wrote:
> ...
> >>>>type(array(2)*2)
> >
> >I believe it should result in a rank-0 array instead.
>
> Can you be more precise about when a rank-0 array should be returned and
> when scalars should be?

A simple rule could be that unary functions that don't change the rank
when operating on higher-dimensional arrays should not change the type
(scalar vs. array) of the dimensionless objects.

For binary operations such as in the example above, the situation is
less clear, but in this case an analogy with functions helps.
Multiplication between a function f(...) and a scalar 2 is naturally
defined as a function (2*f)(...) = 2*f(...), where ... stands for any
number of arguments, including zero.

The scalars should be returned when an operation involves extracting an
element, or evaluation of a function. This includes indexing with a
complete set of indices (and no ellipsis) and reduce operations over
all elements (more on that later.)

From fullung at gmail.com Fri Feb 24 06:30:02 2006
From: fullung at gmail.com (Albert Strasheim)
Date: Fri Feb 24 06:30:02 2006
Subject: [Numpy-discussion] Visual Studio build broken due to array types changes
In-Reply-To: <004701c63948$17d484f0$6363630a@dsp.sun.ac.za>
Message-ID: <006a01c6394e$a98e85c0$6363630a@dsp.sun.ac.za>

Hello all

Seems the build is also broken with MinGW GCC 3.4.2. I've attached both
build logs to this ticket:

http://projects.scipy.org/scipy/numpy/ticket/13

Regards

Albert

From cjw at sympatico.ca Fri Feb 24 07:56:12 2006
From: cjw at sympatico.ca (Colin J. Williams)
Date: Fri Feb 24 07:56:12 2006
Subject: [Numpy-discussion] subclassing ndarray
Message-ID: <43FF2C92.3060304@sympatico.ca>

I have a subclass Bar, a 1-dim array which has some methods and some
attributes. One of the attributes is a view of the Bar to permit
different shaping.

Suppose that 'a' is an instance of 'Bar', which has a method 'show' and
a view attribute 'v'.

a ^ 15 returns a Bar instance, with its methods but without the
attributes.

I am attempting to change this; Bar has a method __xor__, see below:

    def __xor__(self, other):
        ''' Exclusive or: __xor__(x, y) => x ^ y . '''
        z= 1  << this loops to the recursion limit
        result= ArrayType.__xor__(self, other)
        n= self.n
        result.n= n
        result.rowSize= self.rowSize
        result.show= self.show
        result.v= _n.reshape(result.view(), (n*n, n*n))
        return result

Could anyone suggest a workaround please?

Colin W.
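The behaviour reduces to a few lines; a sketch of what seems to be
happening (numpy 0.9.x era, Python 2 syntax):

import numpy as N

class Bar(N.ndarray):
    pass

b = Bar(3)               # a length-3 subclass instance
b.n = 4                  # attach an instance attribute
c = b + 1                # result is constructed in C, bypassing __init__
print type(c)            # the subclass itself survives ...
print hasattr(c, 'n')    # ... but instance attributes do not: False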
From fullung at gmail.com Fri Feb 24 08:19:01 2006 From: fullung at gmail.com (Albert Strasheim) Date: Fri Feb 24 08:19:01 2006 Subject: [Numpy-discussion] Shapes and sizes Message-ID: <031a01c6395d$e69b3170$6363630a@dsp.sun.ac.za> Hello all I'm trying to write a function that takes a scalar, 0d-, 1d- or 2d array and returns a scalar or an array obtained by performing some computations on this input argument. When the output of such a function is the same size as input, one can do the following to preallocate the output array: def f(a): arra = asarray(a) if a.ndim > 2: raise ValueError('invalid dimensions') b = empty_like(arra) # do some operations on arra and put the values in b return b However, when the output array only depends on the size of a, but isn't exactly the same, things seem to get more complicated. Consider a function that operates on the rows of a (or the "row" if a is 1d or 0d or a scalar). For every row of length n, the function might return a row of length (n/2 + 1) if n is even or a row of length (n + 1)/2 if n is odd. Thus, depending on the shape of a, empty must be called with one of three different shapes. def outsize(n): if n % 2 == 0: return n/2+1 return (n+1)/2 if arra.ndim == 0: b = empty(()) elif arra.ndim == 1: b = empty((outsize(arra.shape[0]),)) else: b = empty((arra.shape[0],outsize(arra.shape[1])) To me this seems like a lot of code that can could be simpler if there was a function to get the size of an array that returns a useful value even if a particular dimension doesn't exist, much like MATLAB's size, where one can write: b = zeros(size(a,1),outsize(size(a,2))); function [m]=outsize(n) if mod(n, 2) == 0; m = n/2+1; else m =(n+1)/2; end and still have it work with scalars, 1d arrays and 2d arrays. Even if there were such a function for NumPy, this still leaves the problem that the output is going to have the wrong shape for 0d and 1d arrays, specifically (1,1) and (outsize(n),1) instead of () and (outsize(n),). This problem is solved in MATLAB where there is no distinction between 1 and [1]. How do you guys deal with this problem in your functions? Regars Albert From ndarray at mac.com Fri Feb 24 09:16:04 2006 From: ndarray at mac.com (Sasha) Date: Fri Feb 24 09:16:04 2006 Subject: [Numpy-discussion] A case for rank-0 arrays In-Reply-To: <43FE9A87.3050104@ieee.org> References: <43FE9A87.3050104@ieee.org> Message-ID: On 2/24/06, Travis Oliphant wrote: > Sasha wrote: >... > >I propose to change numpy rules so that if ellipsis is present inside > >[], the operation is always projection and both y[1,...] and > >x[1,1,...] return zero-rank arrays. Note that I have previously > >rejected Francesc's idea that x[...] and x[()] should have different > >meaning for zero-rank arrays. I was wrong. > > > > > I think this is a good and clear rule. And it seems like we may be > "almost" there. > Anybody want to implement it? > I'll implement it. I think I am well prepared to handle this after I implemented [] for rank-0 case. > >2. Another source of ambiguity is the various "reduce" operations such > >as sum or max. Using the previous example, type(x.sum(axis=0)) is > >ndarray, but type(y.sum(axis=0)) is int32. I propose two changes: > > > > a. Make x.sum(axis) return ndarray unless axis is None, making > >type(y.sum(axis=0)) is ndarray true in the example. > > > Hmm... I'm not sure. y.sum(axis=0) is the default spelling of sum(y). > Thus, this would cause all old code to return a rank-0 array. 
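(For concreteness, a sketch of what the multi-axis spelling proposed in
2.b might look like; sum_over is a hypothetical helper, not part of
numpy:)

import numpy

def sum_over(x, axes):
    # reduce along each requested axis, highest axis first so that the
    # remaining axis numbers keep their meaning
    for ax in sorted(axes, reverse=True):
        x = x.sum(axis=ax)
    return x

print sum_over(numpy.ones((2, 3, 4)), (0, 2)).shape   # -> (3,)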
> > Most people who write sum(y) want a scalar, not a "function with 0
> arguments"
>

That's a valid concern. Maybe we can first agree that it will be
helpful to have some way of implementing the sum operation that always
returns an ndarray, even in the dimensionless case. Once we agree on
this goal, we can choose a spelling for such an operation.

One possibility, if we implement (b), is to keep the old behavior for
y.sum(axis=0), but make y.sum(axis=(0,)) return an ndarray in all
cases. The ugliness of that spelling may be an advantage, because it
conveys a "you know what you are doing" message.

> > b. Allow axis to be a sequence of ints and make
> >x.sum(axis=range(rank(x))) return rank-0 array according to the rule
> >2.a above.
>
> So, this would sum over multiple axes?  I guess I'm not opposed to
> something like that, but I'm not really excited about it either.

It looks like this is the kind of proposal that has a better chance of
being adopted once someone implements it. I will definitely implement
it if it becomes a requirement for (a), because I do need some way to
spell sum that does not change the type in the dimensionless case.

> Would that make sense for all methods that take the axis= argument?
>

I think so, but I did not review all the cases.

> > c. Make x.sum() raise an error for rank-0 arrays and scalars, but
> >allow x.sum(axis=()) to return x. This will make numpy sum consistent
> >with the built-in sum that does not work on scalars.
>
> I don't think I like this at all.
>

Can you be more specific about what you don't like? Why should numpy's
sum be different from the built-in sum? Numpy made dimensionless arrays
non-iterable; isn't it logical to make them non-summable as well?

Note that in the dimensionful case, providing a non-existing axis is an
error:

>>> array([1]).sum(1)
Traceback (most recent call last):
  File "", line 1, in ?
ValueError: axis(=1) out of bounds

Why should this not be an error in the dimensionless case? The current
behavior is rather odd:

>>> array(1).sum(axis=0)
1
>>> array(1).sum(axis=1)
1

> >I propose to make shape=() valid in ndarray constructor.
>
> +1

Will do.

> I think we need more thinking about rank-0 arrays before doing something
> like proposal 2.  However, 1 and 3 seem simple enough to move forward
> with...

Sounds like a plan!

From ndarray at mac.com Fri Feb 24 10:45:01 2006
From: ndarray at mac.com (Sasha)
Date: Fri Feb 24 10:45:01 2006
Subject: [Numpy-discussion] Shapes and sizes
In-Reply-To: <031a01c6395d$e69b3170$6363630a@dsp.sun.ac.za>
References: <031a01c6395d$e69b3170$6363630a@dsp.sun.ac.za>
Message-ID:

On 2/24/06, Albert Strasheim wrote:
> def outsize(n):
>     if n % 2 == 0: return n/2+1
>     return (n+1)/2
> if arra.ndim == 0:
>     b = empty(())
> elif arra.ndim == 1:
>     b = empty((outsize(arra.shape[0]),))
> else:
>     b = empty((arra.shape[0],outsize(arra.shape[1]))

Try

>>> empty(map(outsize, arra.shape))

> This problem is solved in MATLAB where there is no distinction between 1 and [1].
>
> How do you guys deal with this problem in your functions?

You might want to take a look at the parallel thread under "A case for
rank-0 arrays."

From ndarray at mac.com Fri Feb 24 10:54:03 2006
From: ndarray at mac.com (Sasha)
Date: Fri Feb 24 10:54:03 2006
Subject: [Numpy-discussion] Shapes and sizes
In-Reply-To:
References: <031a01c6395d$e69b3170$6363630a@dsp.sun.ac.za>
Message-ID:

On 2/24/06, Sasha wrote:
> Try
> >>> empty(map(outsize, arra.shape))

Oops. I did not realize that you want to apply outsize to the last
dimension only.
For ndim>1, you can do

>>> empty(arra.shape[:-1]+(outsize(arra.shape[-1]),))

That will not work for scalars, though, but you might want to rethink
whether your function makes sense for scalars. Remember, 1 is not the
same as [1] in Python; you may be trying to copy the MATLAB design too
literally.

From fullung at gmail.com Fri Feb 24 11:12:13 2006
From: fullung at gmail.com (Albert Strasheim)
Date: Fri Feb 24 11:12:13 2006
Subject: [Numpy-discussion] Visual Studio build broken due to array types changes
In-Reply-To: <006a01c6394e$a98e85c0$6363630a@dsp.sun.ac.za>
Message-ID: <036201c63976$29ae22c0$6363630a@dsp.sun.ac.za>

Hello all

The build-breaking changes (when building with Visual Studio .NET 2003)
were introduced in revision 2150 with log message "Make an enumerated
type out of the scalar defines.". The following files were changed:

numpy\core\include\numpy\arrayobject.h
numpy\core\src\multiarraymodule.c
numpy\core\src\scalartypes.inc.src
numpy\core\src\ufuncobject.c

The build with MinGW GCC 3.4.2 seems to have been broken at least since
revision 2146.

I would be willing to set up a machine to perform Windows builds using
Visual Studio .NET 2003, Visual C++ 2005 Express Edition and MinGW GCC
so that we can avoid these problems in the future. Anybody interested
in this?

Regards

Albert

From stefan at sun.ac.za Fri Feb 24 11:16:03 2006
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Fri Feb 24 11:16:03 2006
Subject: [Numpy-discussion] subclassing ndarray
In-Reply-To: <43FF2C92.3060304@sympatico.ca>
References: <43FF2C92.3060304@sympatico.ca>
Message-ID: <20060224191426.GD21117@alpha>

I see the same strange result. Here is a minimal code example to
demonstrate:

import numpy as N

class Bar(N.ndarray):
    v = 0.

    def __new__(cls, *args, **kwargs):
        print "running new"
        return super(Bar, cls).__new__(cls, *args)

    def __init__(self, *args, **kwargs):
        print "running init"
        self[:] = 0
        self.v = 3

In [88]: b = Bar(3)
running new
running init

In [89]: b
Out[89]: Bar([0, 0, 0])

In [90]: b.v
Out[90]: 3

In [91]: c = b+1

In [92]: c.v
Out[92]: 0.0

However, if I do b[:] = 1, everything works fine.

Stéfan

On Fri, Feb 24, 2006 at 10:56:02AM -0500, Colin J. Williams wrote:
> I have a subclass Bar, a 1-dim array which has some methods and some
> attributes.  One of the attributes is a view of the Bar to permit
> different shaping.
>
> Suppose that 'a' is an instance of 'Bar', which has a method 'show' and
> a view attribute 'v'.
>
> a ^ 15 returns a Bar instance, with its methods but without the attributes.
>
> I am attempting to change this; Bar has a method __xor__, see below:
>
>     def __xor__(self, other):
>         ''' Exclusive or: __xor__(x, y) => x ^ y . '''
>         z= 1  << this loops to the recursion limit
>         result= ArrayType.__xor__(self, other)
>         n= self.n
>         result.n= n
>         result.rowSize= self.rowSize
>         result.show= self.show
>         result.v= _n.reshape(result.view(), (n*n, n*n))
>         return result
>
> Could anyone suggest a workaround please?
>
> Colin W.

From strawman at astraw.com Fri Feb 24 11:21:09 2006
From: strawman at astraw.com (Andrew Straw)
Date: Fri Feb 24 11:21:09 2006
Subject: [Numpy-discussion] algorithm, optimization, or other problem?
In-Reply-To: <43FDF1B8.3070608@bryant.edu>
References: <07C6A61102C94148B8104D42DE95F7E8C8EF39@exchange2k.envision.co.il> <43FDFB80.9090106@VisionSense.com> <43FDF1B8.3070608@bryant.edu>
Message-ID: <43FF5C3D.3070006@astraw.com>

> > cdef int it,i
> >
> > increases the speed from 8 seconds per block to 0.2 seconds per block,
> > which is comparable to the mex.
> > I learned that I have to be a bit more careful! :) Yes, it's always good to double-check the autogenerated C code that Pyrex makes. (This becomes especially important if you release the GIL from your Pyrex code -- I once spent days tracking a weird race condition in threaded code due to this simple oversight.) I'm glad Pyrex is working to get comparable speeds to pure C now. Cheers! Andrew From fullung at gmail.com Fri Feb 24 11:44:05 2006 From: fullung at gmail.com (Albert Strasheim) Date: Fri Feb 24 11:44:05 2006 Subject: [Numpy-discussion] Shapes and sizes In-Reply-To: References: <031a01c6395d$e69b3170$6363630a@dsp.sun.ac.za> Message-ID: <20060224194323.GA12443@dogbert.sdsl.sun.ac.za> Hello On Fri, 24 Feb 2006, Sasha wrote: > On 2/24/06, Sasha wrote: > > Try > > >>> empty(map(outsize, arra.shape)) > Oops. I did not realize that you want to apply outsize to the last > dimension only. For ndim>1, you can do > > >>> empty(arra.shape[:-1]+(outsize(arra.shape[-1]),)) Thanks, this works nicely. My code: outsize = lambda n: (n/2+1, (n+1)/2)[(n%2)%2] b = empty(a.shape[:-1]+(outsize(a.shape[-1]),)) (One line less than MATLAB ;-)) > That will not work work for scalars though, but you might want to > rethink whether your function makes sense for scalars. Remember, 1 is > not the same as [1] in python, you maybe trying to copy MATLAP design > too literally. Personally, I think asarray(scalar) should return something that can actually be used as an array (i.e. has a proper shape and ndim), but if all NumPy functions operate only on arrays to begin with, I could live with that too. Regards Albert From cookedm at physics.mcmaster.ca Fri Feb 24 11:51:04 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Fri Feb 24 11:51:04 2006 Subject: [Numpy-discussion] Simple NumPy-compatible vector w/C++ & SWIG? In-Reply-To: <9C3EBDD9-2C89-4587-9795-29906E150120@hetland.org> References: <9C3EBDD9-2C89-4587-9795-29906E150120@hetland.org> Message-ID: <20060224194944.GA17207@arbutus.physics.mcmaster.ca> On Fri, Feb 24, 2006 at 01:26:16PM +0100, Magnus Lie Hetland wrote: > Hi! > > I'm working on a data structure library where one of the element > types most likely will be a vector type (i.e., points in a > multidimensional space, with the dimensionality set by the user). In > the data structure (which is disk-based) I have work with raw bytes > that I'd like to copy around as little as possible. > > The library itself is (being) written in C++, but I'm wrapping it > with SWIG so I can drive and test it with Python. It seems to me that > something NumPy-compatible might be the best choice for the vector > type, but I'm not sure how I should do that. > > I've been thinking about simply implementing a minimal compatibility > layer for the NumPy Array Interface; is it then possible to construct > a NumPy array using this custom array, and get full support for the > various array operations without actually copying the data? I assume you've looked at the array interface at http://numeric.scipy.org/array_interface.html ? If you implement that (if you're working with C or C++, adding just __array_struct__ is probably the easiest), then numpy can use your vectors without copying data. Call numpy.asarray(v), and you have a numpy array with all the numpy methods. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. 
Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From fullung at gmail.com Fri Feb 24 12:21:05 2006 From: fullung at gmail.com (Albert Strasheim) Date: Fri Feb 24 12:21:05 2006 Subject: [Numpy-discussion] Visual Studio build broken due to array types changes In-Reply-To: <43FF6114.9060008@ieee.org> References: <036201c63976$29ae22c0$6363630a@dsp.sun.ac.za> <43FF6114.9060008@ieee.org> Message-ID: <5eec5f300602241220m292c2a10wd704d6bd699f782f@mail.gmail.com> Hello Thanks! Looks like this solves the problems with Visual Studio. I did a python setup.py clean between builds. Shouldn't this clean up the source tree sufficiently? Regards Albert On 2/24/06, Travis Oliphant wrote: > Albert Strasheim wrote: > > >Hello all > > > > > Please remove the build directory and try with a fresh build. > > It seems many of your errors come from a new version of the C-API. > > Thanks, > > -Travis From ndarray at mac.com Fri Feb 24 12:57:04 2006 From: ndarray at mac.com (Sasha) Date: Fri Feb 24 12:57:04 2006 Subject: [Numpy-discussion] Tests fail in SVN r2165 Message-ID: Usin python 2.4.2 and latest SVN version I get the following: > python Python 2.4.2 (#3, Jan 13 2006, 13:52:39) [GCC 3.4.4 20050721 (Red Hat 3.4.4-2)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy >>> numpy.test() Found 3 tests for numpy.lib.getlimits Found 30 tests for numpy.core.numerictypes Found 11 tests for numpy.core.umath Found 8 tests for numpy.lib.arraysetops Found 42 tests for numpy.lib.type_check Found 83 tests for numpy.core.multiarray Found 3 tests for numpy.dft.helper Found 27 tests for numpy.core.ma Found 1 tests for numpy.core.oldnumeric Traceback (most recent call last): File "", line 1, in ? File "/.../lib/python2.4/site-packages/numpy/__init__.py", line 46, in test return NumpyTest('numpy').test(level, verbosity) File "/.../lib/python2.4/site-packages/numpy/testing/numpytest.py", line 422, in test suites.extend(self._get_module_tests(module, abs(level), verbosity)) File "/.../lib/python2.4/site-packages/numpy/testing/numpytest.py", line 355, in _get_module_tests self.warn('FAILURE importing tests for %s' % (mstr(module))) File "/.../lib/python2.4/site-packages/numpy/testing/numpytest.py", line 469, in warn from numpy.distutils.misc_util import yellow_text File "/.../lib/python2.4/site-packages/numpy/distutils/__init__.py", line 5, in ? import ccompiler File "/.../lib/python2.4/site-packages/numpy/distutils/ccompiler.py", line 12, in ? from exec_command import exec_command File "/.../lib/python2.4/site-packages/numpy/distutils/exec_command.py", line 54, in ? import tempfile File "/.../lib/python2.4/tempfile.py", line 33, in ? from random import Random as _Random ImportError: cannot import name Random Does anyone see the same? It looks like it fails while loading test_misc_util, but running that alone works: > python lib/python2.4/site-packages/numpy/distutils/tests/test_misc_util.py Found 4 tests for __main__ .... ---------------------------------------------------------------------- Ran 4 tests in 0.001s From ndarray at mac.com Fri Feb 24 13:06:05 2006 From: ndarray at mac.com (Sasha) Date: Fri Feb 24 13:06:05 2006 Subject: [Numpy-discussion] Who is maintaining the SVN repository? Message-ID: Where should I report problems with the SVN repository? 
I've tried to edit the log entry of my commit and got this: > svn propedit -r 2166 --revprop svn:log svn: DAV request failed; it's possible that the repository's pre-revprop-change hook either failed or is non-existent svn: At least one property change failed; repository is unchanged Is that right? From ndarray at mac.com Fri Feb 24 13:30:03 2006 From: ndarray at mac.com (Sasha) Date: Fri Feb 24 13:30:03 2006 Subject: [Numpy-discussion] Why multiple ellipses? Message-ID: Numpy allows multiple ellipses in indexing expressions, but I am not sure if that is useful. AFAIK, ellipsis stands for "as many :'s as needed", but if there is more than one, how do I know how many :'s each of them represents: >>> x = arange(8) >>> x.shape=(2,2,2) >>> x[0,...,0,...] array([0, 1]) >>> x[0,0,:] array([0, 1]) >>> x[0,:,0] array([0, 2]) In the example above, the first ellipsis represents no :'s and the last one represents one. Is that the current rule that the last ellipsis represents all the needed :'s? What is the possible use for that? From ndarray at mac.com Fri Feb 24 14:43:08 2006 From: ndarray at mac.com (Sasha) Date: Fri Feb 24 14:43:08 2006 Subject: [Numpy-discussion] Tests fail in SVN r2165 In-Reply-To: References: <038b01c6398a$795b3b50$6363630a@dsp.sun.ac.za> Message-ID: I've located the culprit. Once I revert numpy/core/__init__.py to 2164, tests pass. The change appears to postpone the import of NumpyTest: > svn diff -r 2164:2165 numpy/core/__init__.py Index: numpy/core/__init__.py =================================================================== --- numpy/core/__init__.py (revision 2164) +++ numpy/core/__init__.py (revision 2165) @@ -25,5 +25,6 @@ __all__ += rec.__all__ __all__ += char.__all__ -from numpy.testing import ScipyTest -test = ScipyTest().test +def test(level=1, verbosity=1): + from numpy.testing import NumpyTest + return NumpyTest().test(level, verbosity) I don't understand why this causes problems on my system, but it does. Pearu? On 2/24/06, Sasha wrote: > I've reverted to r2164 and everything is back to normal. This > suggests that r2165 changes are at fault and not my setup. That > change deals with imports and the failure that I see happens during > import. > > I've tried NUMPY_IMPORT_DEBUG=1 and got this: > > Executing 'import testing' ... ok > Executing 'from testing import ScipyTest' ... ok > Executing 'from testing import NumpyTest' ... ok > Executing 'import core' ... ok > Executing 'from core import *' ... ok > Executing 'import random' ... ok > Executing 'from random import rand' ... ok > Executing 'from random import randn' ... ok > Executing 'import lib' ... ok > Executing 'from lib import *' ... ok > Executing 'import linalg' ... ok > Executing 'import dft' ... ok > Executing 'from dft import fft' ... ok > Executing 'from dft import ifft' ... ok > Found 4 tests for numpy.lib.getlimits > Found 30 tests for numpy.core.numerictypes > Found 11 tests for numpy.core.umath > Found 8 tests for numpy.lib.arraysetops > Found 42 tests for numpy.lib.type_check > Found 83 tests for numpy.core.multiarray > Found 3 tests for numpy.dft.helper > Found 27 tests for numpy.core.ma > Traceback (most recent call last): > ... > > Any suggestions for further diagnostic? > > > On 2/24/06, Albert Strasheim wrote: > > Works for me. Just built revision 2165 on Windows with Visual Studio. 
> > > > Python 2.4.2 (#67, Sep 28 2005, 12:41:11) [MSC v.1310 32 bit (Intel)] > > > > In [1]: import numpy > > > > In [2]: numpy.test() > > Found 83 tests for numpy.core.multiarray > > Found 3 tests for numpy.lib.getlimits > > Found 11 tests for numpy.core.umath > > Found 8 tests for numpy.lib.arraysetops > > Found 42 tests for numpy.lib.type_check > > Found 4 tests for numpy.lib.index_tricks > > Found 30 tests for numpy.core.numerictypes > > Found 27 tests for numpy.core.ma > > Found 1 tests for numpy.core.oldnumeric > > Found 9 tests for numpy.lib.twodim_base > > Found 8 tests for numpy.core.defmatrix > > Found 1 tests for numpy.lib.ufunclike > > Found 33 tests for numpy.lib.function_base > > Found 3 tests for numpy.dft.helper > > Found 1 tests for numpy.lib.polynomial > > Found 6 tests for numpy.core.records > > Found 14 tests for numpy.core.numeric > > Found 44 tests for numpy.lib.shape_base > > Found 0 tests for __main__ > > ............................................................................ > > ........................ > > ............................................................................ > > ........................ > > ............................................................................ > > ........................ > > ............................ > > ---------------------------------------------------------------------- > > Ran 328 tests in 0.781s > > > > OK > > Out[2]: > > > > In [3]: numpy.__version__ > > Out[3]: '0.9.6.2165' > > > > Cheers > > > > Albert > > > > > From Chris.Barker at noaa.gov Fri Feb 24 15:19:05 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Fri Feb 24 15:19:05 2006 Subject: [Numpy-discussion] Shapes and sizes In-Reply-To: <20060224194323.GA12443@dogbert.sdsl.sun.ac.za> References: <031a01c6395d$e69b3170$6363630a@dsp.sun.ac.za> <20060224194323.GA12443@dogbert.sdsl.sun.ac.za> Message-ID: <43FF943D.5080100@noaa.gov> Albert Strasheim wrote: > Personally, I think asarray(scalar) should return something that can > actually be used as an array (i.e. has a proper shape and ndim), What is the proper shape and numdim? Only your app knows. What I generally do with a function i want to take an array or "something that can be turned into an array" is to use asarray, then set the shape to what I am expecting: >>> import numarray as N >>> >>> def rank1(input): ... A = N.asarray(input) ... A.shape = (-1) ... print repr(A) ... >>> rank1((5,6,7,8)) array([5, 6, 7, 8]) >>> rank1(5) array([5]) >>> >>> def rank2(input): ... A = N.asarray(input) ... A.shape = (-1, 2) ... print repr(A) ... >>> rank2((2,3)) array([[2, 3]]) >>> rank2(((2,3), (4,5), (6,7))) array([[2, 3], [4, 5], [6, 7]]) -Chris -- Christopher Barker, Ph.D. 
Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From borreguero at gmail.com Fri Feb 24 15:23:02 2006 From: borreguero at gmail.com (Jose Borreguero) Date: Fri Feb 24 15:23:02 2006 Subject: [Numpy-discussion] unsuccessful install on 64bit machine Message-ID: <7cced4ed0602241522x6bc5d501g6d21f0aea7a141d6@mail.gmail.com> While installing numpy under a non-conventional directory: *python setup.py install --prefix==/gpfs1/active/jose/code/python *I get two classes of warnings: (1) blas_mkl_info: /tmp/numpy-0.9.5/numpy/distutils/system_info.py:531: UserWarning: Library error: libs=['mkl', 'vml', 'guide'] found_libs=[] warnings.warn("Library error: libs=%s found_libs=%s" % \ NOT AVAILABLE (and others warnings similar to this) (2)numpy.core - nothing done with h_files= ['build/src/numpy/core/src/scalartypes.inc', 'build/src/numpy/core/src/arraytypes.inc', 'build/src/numpy/core/config.h', 'build/src/numpy/core/__multiarray_api.h'] (and others warnings similar to this) I created a very simple site.cfg file under numpy/distutils, which goes like this [blas] library_dirs = /usr/lib64 [lapack] library_dirs = /usr/lib64 I have no Atlas library installed in the system. My compiler: gcc version 3.4.4 20050721 (Red Hat 3.4.4-2) (I also have pg compilers) After setup.py has run, I have no */gpfs1/active/jose/code/python/lib64/python2.3/site-packages/numpy *created. So numpy is not installed. Any ideas, please? -- Jose M. Borreguero jmborr at gatech.edu, www.borreguero.com phone: 404 407 8980 Fax: 404 385 7478 Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, 250 14St NW, Atlanta GA 30318 -------------- next part -------------- An HTML attachment was scrubbed... URL: From pearu at scipy.org Fri Feb 24 15:23:03 2006 From: pearu at scipy.org (Pearu Peterson) Date: Fri Feb 24 15:23:03 2006 Subject: [Numpy-discussion] Tests fail in SVN r2165 In-Reply-To: References: <038b01c6398a$795b3b50$6363630a@dsp.sun.ac.za> Message-ID: On Fri, 24 Feb 2006, Sasha wrote: > I've located the culprit. Once I revert numpy/core/__init__.py to > 2164, tests pass. > > The change appears to postpone the import of NumpyTest: >> svn diff -r 2164:2165 numpy/core/__init__.py > Index: numpy/core/__init__.py > =================================================================== > --- numpy/core/__init__.py (revision 2164) > +++ numpy/core/__init__.py (revision 2165) > @@ -25,5 +25,6 @@ > __all__ += rec.__all__ > __all__ += char.__all__ > > -from numpy.testing import ScipyTest > -test = ScipyTest().test > +def test(level=1, verbosity=1): > + from numpy.testing import NumpyTest > + return NumpyTest().test(level, verbosity) > > > > I don't understand why this causes problems on my system, but it does. Pearu? Could you try svn update and reinstall of numpy now? This error could be due to the fact that numpy.distutils.__init__ imported numpy.testing while importing numpy.__init__ was not finished yet. Pearu From fullung at gmail.com Fri Feb 24 15:36:07 2006 From: fullung at gmail.com (Albert Strasheim) Date: Fri Feb 24 15:36:07 2006 Subject: [Numpy-discussion] Build on Windows: system_info and site.cfg Message-ID: <001501c6399b$0db5cc10$6363630a@dsp.sun.ac.za> Hello all I'm trying to build NumPy on Windows using optimized ATLAS and CLAPACK libraries. 
The system_info functions are currently doing things like: p = self.combine_paths(lib_dir, 'lib'+l+ext) and if self.search_static_first: exts = ['.a',so_ext] else: exts = [so_ext,'.a'] if sys.platform=='cygwin': exts.append('.dll.a') which generally isn't going work on Windows, where library names don't always start with 'lib' and always end in '.lib', for static libraries and DLL import libraries. It might be worth it to have users explicitly specify their build settings (compiler, flags, BLAS libraries, LAPACK libraries, etc.) in a site.cfg instead of trying to detect all the possible combinations of C and/or FORTRAN BLAS, LAPACK, FFTW and whatnot. A few default configurations could be provided for common configurations. Any thoughts? Is anyone interested in fixing the build on Windows? Thanks! Albert From pearu at scipy.org Fri Feb 24 15:40:02 2006 From: pearu at scipy.org (Pearu Peterson) Date: Fri Feb 24 15:40:02 2006 Subject: [Numpy-discussion] unsuccessful install on 64bit machine In-Reply-To: <7cced4ed0602241522x6bc5d501g6d21f0aea7a141d6@mail.gmail.com> References: <7cced4ed0602241522x6bc5d501g6d21f0aea7a141d6@mail.gmail.com> Message-ID: On Fri, 24 Feb 2006, Jose Borreguero wrote: > While installing numpy under a non-conventional directory: > *python setup.py install --prefix==/gpfs1/active/jose/code/python > *I get two classes of warnings: > > (1) blas_mkl_info: > /tmp/numpy-0.9.5/numpy/distutils/system_info.py:531: UserWarning: Library > error: libs=['mkl', 'vml', 'guide'] found_libs=[] > warnings.warn("Library error: libs=%s found_libs=%s" % \ > NOT AVAILABLE > (and others warnings similar to this) > > (2)numpy.core - nothing done with h_files= > ['build/src/numpy/core/src/scalartypes.inc', > 'build/src/numpy/core/src/arraytypes.inc', 'build/src/numpy/core/config.h', > 'build/src/numpy/core/__multiarray_api.h'] > (and others warnings similar to this) This warnings can be ignored. > I created a very simple site.cfg file under numpy/distutils, which goes like > this > [blas] > library_dirs = /usr/lib64 > [lapack] > library_dirs = /usr/lib64 To avoid such hooks and all the troubles on 64-bit platforms, numpy/distutils/system_info.py needs a 64-bit support in setting up default_* directory lists. I can write the patch if someone could provide information how to determine if one is running 64-bit or 32-bit applications. numpy.distutils.cpuinfo can give is_64bit()->True but that does not guarantee that used software is 64-bit. What are the values of sys.platform os.name in your system? > I have no Atlas library installed in the system. > My compiler: gcc version 3.4.4 20050721 (Red Hat 3.4.4-2) (I also have pg > compilers) > > After setup.py has run, I have no > */gpfs1/active/jose/code/python/lib64/python2.3/site-packages/numpy > *created. So numpy is not installed. What was the output when you run setup.py? Pearu From oliphant at ee.byu.edu Fri Feb 24 16:09:01 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Fri Feb 24 16:09:01 2006 Subject: [Numpy-discussion] Why multiple ellipses? In-Reply-To: References: Message-ID: <43FF9FD8.8090704@ee.byu.edu> Sasha wrote: >Numpy allows multiple ellipses in indexing expressions, but I am not >sure if that is useful. AFAIK, ellipsis stands for "as many :'s as >needed", but if there is more than one, how do I know how many :'s >each of them represents: > > It should be that the first ellipsis is interpreted as an ellipsis. Any others are silently converted to ':' characters. > > >>>>x = arange(8) >>>>x.shape=(2,2,2) >>>>x[0,...,0,...] 
>array([0, 1])
>

This is equivalent to

x[0,...,0,:]

which is equivalent to

x[0,0,:]  (because the ellipsis is interpreted as nothing).

>>>>x[0,0,:]
>
>array([0, 1])
>
>>>>x[0,:,0]
>
>array([0, 2])
>
>In the example above, the first ellipsis represents no :'s and the
>last one represents one.  Is that the current rule that the last
>ellipsis represents all the needed :'s?  What is the possible use for
>that?
>

The rule is that only the first ellipsis (from left to right) is used,
and any others are just another spelling of ':'.

This is the rule that Numeric implemented, and so it's what we've kept.
I have no idea what the use might be, but I saw changing the rule as
gratuitous breakage.

Thus, only one ellipsis is actually treated like an ellipsis.
Everything else is treated as ':'

-Travis

From oliphant.travis at ieee.org Fri Feb 24 17:41:14 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Fri Feb 24 17:41:14 2006
Subject: [Numpy-discussion] subclassing ndarray
In-Reply-To: <20060224191426.GD21117@alpha>
References: <43FF2C92.3060304@sympatico.ca> <20060224191426.GD21117@alpha>
Message-ID: <43FFB580.80307@ieee.org>

Stefan van der Walt wrote:

>I see the same strange result.  Here is a minimal code example to
>demonstrate:
>
>import numpy as N
>
>class Bar(N.ndarray):
>    v = 0.
>
>    def __new__(cls, *args, **kwargs):
>        print "running new"
>        return super(Bar, cls).__new__(cls, *args)
>
>    def __init__(self, *args, **kwargs):
>        print "running init"
>        self[:] = 0
>        self.v = 3
>

It's only strange if you have assumptions you're not revealing. Here's
the deal.

Neither the __init__ method nor the __new__ method is called for
c = b+1.

So, you're wondering how the Bar object got created, then, right? Well,
it got created as a subclass of ndarray in PyArray_NewFromDescr.

The __init__ and __new__ methods are not called because they may have
arbitrary signatures. Instead, the __array_finalize__ method is always
called. So, you should use that instead of __init__.

The __array_finalize__ method always receives the argument of the
"parent" object.

Thus, in your case,

def __array_finalize__(self, parent):
    self.v = 3

would do what you want.

-Travis

>In [88]: b = Bar(3)
>running new
>running init
>
>In [89]: b
>Out[89]: Bar([0, 0, 0])
>
>In [90]: b.v
>Out[90]: 3
>
>In [91]: c = b+1
>
>In [92]: c.v
>Out[92]: 0.0
>
>However, if I do b[:] = 1, everything works fine.
>
>Stéfan
>
>On Fri, Feb 24, 2006 at 10:56:02AM -0500, Colin J. Williams wrote:
>
>>I have a subclass Bar, a 1-dim array which has some methods and some
>>attributes.  One of the attributes is a view of the Bar to permit
>>different shaping.
>>
>>Suppose that 'a' is an instance of 'Bar', which has a method 'show' and
>>a view attribute 'v'.
>>
>>a ^ 15 returns a Bar instance, with its methods but without the attributes.
>>
>>I am attempting to change this; Bar has a method __xor__, see below:
>>
>>    def __xor__(self, other):
>>        ''' Exclusive or: __xor__(x, y) => x ^ y . '''
>>        z= 1  << this loops to the recursion limit
>>        result= ArrayType.__xor__(self, other)
>>        n= self.n
>>        result.n= n
>>        result.rowSize= self.rowSize
>>        result.show= self.show
>>        result.v= _n.reshape(result.view(), (n*n, n*n))
>>        return result
>>
>>Could anyone suggest a workaround please?
>>
>>Colin W.
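A self-contained sketch of one possible pattern, combining both hooks:
__init__ covers direct construction, while __array_finalize__ covers
instances that numpy creates internally from a parent (treat this as an
illustration, not the definitive recipe):

import numpy as N

class Bar(N.ndarray):
    def __init__(self, *args, **kwargs):
        self.v = 3                       # Bar(...) construction path

    def __array_finalize__(self, parent):
        # called for arrays numpy derives from `parent` (e.g. b + 1);
        # inherit the attribute, with a default for safety
        self.v = getattr(parent, 'v', 3)

b = Bar(3)
c = b + 1
print b.v, c.v    # -> 3 3 on both creation paths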
From oliphant.travis at ieee.org Fri Feb 24 17:59:03 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Fri Feb 24 17:59:03 2006
Subject: [Numpy-discussion] subclassing ndarray
In-Reply-To: <43FF2C92.3060304@sympatico.ca>
References: <43FF2C92.3060304@sympatico.ca>
Message-ID: <43FFB9C2.7080905@ieee.org>

Colin J. Williams wrote:

> I have a subclass Bar, a 1-dim array which has some methods and some
> attributes.  One of the attributes is a view of the Bar to permit
> different shaping.

The ndarray handles sub-classing a little bit differently. All array
constructors go through the same C-code call, which can create
sub-classes as well (it's what gets called by ndarray.__new__).

If there is a parent object, then additionally, the

__array_finalize__(self, parent)

method is called right after creation of the sub-class. This is where
attributes should be finalized. But care must be taken in this code so
that a recursion is not set up.

If this mechanism is not sufficient for you, then you need to use a
container class (for this reason UserArray has been resurrected to
serve as a default container class model---it needs more testing,
however).

The problem __array_finalize__ helps fix is how to get subclasses to
work well without having to re-define every single special method like
UserArray does.

For the most part it seems to work, but I suppose it creates a few
surprises if you are not aware of what is going on.

The most important thing to remember is that attributes are not
automatically carried over to new instances, because new instances can
be created without ever calling __new__ or __init__.

I'm sure this mechanism can be improved upon and I welcome suggestions.

> Suppose that 'a' is an instance of 'Bar', which has a method 'show'
> and a view attribute 'v'.
> a ^ 15 returns a Bar instance, with its methods but without the
> attributes.
>
> I am attempting to change this; Bar has a method __xor__, see below:
>
>     def __xor__(self, other):
>         ''' Exclusive or: __xor__(x, y) => x ^ y . '''
>         z= 1  << this loops to the recursion limit
>         result= ArrayType.__xor__(self, other)
>         n= self.n
>         result.n= n
>         result.rowSize= self.rowSize
>         result.show= self.show
>         result.v= _n.reshape(result.view(), (n*n, n*n))
>         return result

Look at the __array_finalize__ method in defmatrix.py for ideas about
how it can be used.

-Travis

From aisaac at american.edu Fri Feb 24 21:44:02 2006
From: aisaac at american.edu (Alan G Isaac)
Date: Fri Feb 24 21:44:02 2006
Subject: [Numpy-discussion] Re: Method to shift elements in an array?
Message-ID:

Tim wrote:
> import numpy
> def roll(A, n):
>     "Roll the array A in place. Positive n -> roll right, negative n ->
>     roll left"
>     if n > 0:
>         n = abs(n)
>         temp = A[-n:]
>         A[n:] = A[:-n]
>         A[:n] = temp
>     elif n < 0:
>         n = abs(n)
>         temp = A[:n]
>         A[:-n] = A[n:]
>         A[-n:] = temp
>     else:
>         pass

This probably counts as a gotcha:

>>> a=N.arange(10)
>>> temp=a[-6:]
>>> a[6:]=a[:-6]
>>> a[:6]=temp
>>> a
array([4, 5, 0, 1, 2, 3, 0, 1, 2, 3])

Cheers,
Alan Isaac
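One way around the gotcha is to copy before shifting; a sketch (it
trades a temporary full copy for safety, since temp = A[-n:] above is
only a view into A, and the modulo also normalises negative n):

import numpy

def roll(A, n):
    # in-place roll of a 1-d array; copy first so no view is clobbered
    n = n % len(A)          # e.g. -2 % 10 == 8: left-roll as right-roll
    if n == 0:
        return
    src = A.copy()
    A[n:] = src[:-n]
    A[:n] = src[-n:]

a = numpy.arange(10)
roll(a, 6)
print a    # -> [4 5 6 7 8 9 0 1 2 3]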
PS Here's something close to the rotater functionality.

#rotater: rotate row elements
# Format:    y = rotater(x, rotateby, inplace)
# Input:     x          RxC array
#            rotateby   size R integer array, or integer (rotation amounts)
#            inplace    boolean (default is False -> copies data)
# Output:    y          RxC array:
#                       rows rotated by rotateby
#                       or None (if inplace=True)
# Remarks:   Intended for use with 2D arrays.
#            rotateby values are positive for rightward rotation,
#            negative for leftward rotation
# :author: Alan G Isaac (aisaac AT american DOT edu)
# :date: 24 Feb 2006
def rotater(x, rotateby, inplace=False):
    assert(len(x.shape)==2), "For 2-d arrays only."
    xrotate = numpy.array(x, copy=(not inplace))
    xrows = xrotate.shape[0]
    #make an iterator of row shifts
    if isinstance(rotateby, int):
        from itertools import repeat
        rowshifts = repeat(rotateby, xrows)
    else:
        rowshifts = numpy.asarray(rotateby)
        assert(rowshifts.size==xrows)
        rowshifts = rowshifts.flat
    #perform rotation on each row
    for row in xrange(xrows):
        rs = rowshifts.next()
        #do nothing if rs==0
        if rs > 0:
            xrotate[row] = numpy.concatenate([xrotate[row][-rs:], xrotate[row][:-rs]])
        elif rs < 0:
            # note: the slices must come in this order for a *left*
            # rotation; [:-rs] followed by [-rs:] would rebuild the
            # row unchanged
            xrotate[row] = numpy.concatenate([xrotate[row][-rs:], xrotate[row][:-rs]])
    if inplace:
        return None
    else:
        return xrotate

From ndarray at mac.com Fri Feb 24 22:32:04 2006
From: ndarray at mac.com (Sasha)
Date: Fri Feb 24 22:32:04 2006
Subject: [Numpy-discussion] Tests fail in SVN r2165
In-Reply-To:
References: <038b01c6398a$795b3b50$6363630a@dsp.sun.ac.za>
Message-ID:

On 2/24/06, Pearu Peterson wrote:
> Could you try svn update and reinstall of numpy now? This error could be
> due to the fact that numpy.distutils.__init__ imported numpy.testing while
> importing numpy.__init__ was not finished yet.

r2168 works fine. Thanks a lot for a quick fix.

From nwagner at mecha.uni-stuttgart.de Fri Feb 24 23:49:01 2006
From: nwagner at mecha.uni-stuttgart.de (Nils Wagner)
Date: Fri Feb 24 23:49:01 2006
Subject: Fwd: [Numpy-discussion] Re: inconsistent use of axis= keyword argument?
Message-ID:

--- the forwarded message follows ---

From robert.kern at gmail.com Fri Feb 24 18:33:41 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 24 Feb 2006 17:33:41 -0600
Subject: [Numpy-discussion] Re: inconsistent use of axis= keyword argument?
In-Reply-To: <43FECB4B.8080900@mecha.uni-stuttgart.de>
References: <43FECB4B.8080900@mecha.uni-stuttgart.de>
Message-ID: <43FF97D5.7050300@gmail.com>

Nils Wagner wrote:
> Robert Kern wrote:
>
>>Vidar Gundersen wrote:
>>
>>>and i also wonder why concatenate can't be used to stack 1-d
>>>arrays on top of each other, returning a 2-d array?
>>
>>Use vstack() for that. Also note its companions, hstack() and column_stack().
>
> Hi Robert,
>
> Is it possible to use aliases for these operators ?
>
> In textbooks (Matrix Algebra by Abadir and Magnus, Cambridge University
> Press (2005)) you will find
>
> The vec-operator transforms a matrix into a vector by stacking its
> columns one underneath the other.

Ask on the list, not private email.

--
Robert Kern
robert.kern at gmail.com

"In the fields of hell where the grass grows high
 Are the graves of dreams allowed to die."
-- Richard Harter

From stefan at sun.ac.za Sat Feb 25 00:41:01 2006
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Sat Feb 25 00:41:01 2006
Subject: [Numpy-discussion] subclassing ndarray
In-Reply-To: <43FFB580.80307@ieee.org>
References: <43FF2C92.3060304@sympatico.ca> <20060224191426.GD21117@alpha> <43FFB580.80307@ieee.org>
Message-ID: <20060225083924.GF21117@alpha>

On Fri, Feb 24, 2006 at 06:40:16PM -0700, Travis Oliphant wrote:
> Stefan van der Walt wrote:
>
> >I see the same strange result.  Here is a minimal code example to
> >demonstrate:
> >
> >import numpy as N
> >
> >class Bar(N.ndarray):
> >    v = 0.
> >
> >    def __new__(cls, *args, **kwargs):
> >        print "running new"
> >        return super(Bar, cls).__new__(cls, *args)
> >
> >    def __init__(self, *args, **kwargs):
> >        print "running init"
> >        self[:] = 0
> >        self.v = 3
> >
>
> It's only strange if you have assumptions you're not revealing.  Here's
> the deal.
>
> Neither the __init__ method nor the __new__ method is called for c = b+1.
>
> So, you're wondering how the Bar object got created, then, right?  Well,
> it got created as a subclass of ndarray in PyArray_NewFromDescr.
>
> The __init__ and __new__ methods are not called because they may have
> arbitrary signatures.  Instead, the __array_finalize__ method is always
> called.  So, you should use that instead of __init__.
>
> The __array_finalize__ method always receives the argument of the
> "parent" object.
>
> Thus, in your case,
>
> def __array_finalize__(self, parent):
>     self.v = 3
>
> would do what you want.

That doesn't seem to work. __array_finalize__ isn't called when the
object is initially constructed:

In [14]: b = Bar(2)
running new

In [15]: b.v
Out[15]: 0.0

In [16]: b=b+1

In [17]: b.v
Out[17]: 3

Should a person then call __array_finalize__ from __init__?

Stéfan

From cjw at sympatico.ca Sat Feb 25 05:18:07 2006
From: cjw at sympatico.ca (Colin J. Williams)
Date: Sat Feb 25 05:18:07 2006
Subject: [Numpy-discussion] subclassing ndarray
In-Reply-To: <43FFB9C2.7080905@ieee.org>
References: <43FF2C92.3060304@sympatico.ca> <43FFB9C2.7080905@ieee.org>
Message-ID: <440058E0.1040004@sympatico.ca>

Travis Oliphant wrote:
> Colin J. Williams wrote:
>
>> I have a subclass Bar, a 1-dim array which has some methods and some
>> attributes.  One of the attributes is a view of the Bar to permit
>> different shaping.
>
> The ndarray handles sub-classing a little bit differently.  All array
> constructors go through the same C-code call, which can create
> sub-classes as well (it's what gets called by ndarray.__new__).
>
> If there is a parent object, then additionally, the
>
> __array_finalize__(self, parent)
>
> method is called right after creation of the sub-class.  This is
> where attributes should be finalized.  But care must be taken in this
> code so that a recursion is not set up.
>
> If this mechanism is not sufficient for you, then you need to use a
> container class (for this reason UserArray has been resurrected to
> serve as a default container class model---it needs more testing,
> however).
>
> The problem __array_finalize__ helps fix is how to get subclasses to
> work well without having to re-define every single special method like
> UserArray does.
>
> For the most part it seems to work, but I suppose it creates a few
> surprises if you are not aware of what is going on.
> > The most important thing to remember is that attributes are not
> > automatically carried over to new instances, because new instances can
> > be created without ever calling __new__ or __init__.
> >
> > I'm sure this mechanism can be improved upon and I welcome suggestions.

Thanks for this. Does this mean that whenever we subclass ndarray, we
should use __array_finalize__ (with its additional 'parent' parameter)
instead of Python's usual __init__?

It would help if you could clarify the role of 'parent'.

[Dbg]>>> h(self.__array_finalize__)
Help on built-in function __array_finalize__:
__array_finalize__(...)

Is parent the next type up in the type hierarchy? If so, can this not
be determined from self.__class__?

I've tried a similar operation with the Python library's sets.Set.
There, __init__ is called, ensuring that the expression is of the
appropriate sub-type.

> >> Suppose that 'a' is an instance of 'Bar', which has a method 'show'
> >> and a view attribute 'v'.
> >> a ^ 15 returns a Bar instance, with its methods but without the
> >> attributes.
> >>
> >> I am attempting to change this; Bar has a method __xor__, see below:
> >>
> >>     def __xor__(self, other):
> >>         ''' Exclusive or: __xor__(x, y) => x ^ y . '''
> >>         z= 1  << this loops to the recursion limit
> >>         result= ArrayType.__xor__(self, other)
> >>         n= self.n
> >>         result.n= n
> >>         result.rowSize= self.rowSize
> >>         result.show= self.show
> >>         result.v= _n.reshape(result.view(), (n*n, n*n))
> >>         return result
>
> Look at the __array_finalize__ method in defmatrix.py for ideas about
> how it can be used.

def __array_finalize__(self, obj):
    ndim = self.ndim
    if ndim == 0:
        self.shape = (1,1)
    elif ndim == 1:
        self.shape = (1,self.shape[0])
    return

These are functions for which one would use __init__ in numarray. This
doesn't describe or illustrate the role or purpose of the parent
object.

Colin W.

From cjw at sympatico.ca Sat Feb 25 05:31:08 2006
From: cjw at sympatico.ca (Colin J. Williams)
Date: Sat Feb 25 05:31:08 2006
Subject: [Numpy-discussion] ndarray - reshaping a sub-class
Message-ID: <44005BF3.7080906@sympatico.ca>

The function reshape, when applied to an instance of a sub-class,
returns an array instance. The method reshape returns an instance of
the sub-class. It seems desirable that both be treated in the same way.

Colin W.

From fullung at gmail.com Sat Feb 25 07:03:01 2006
From: fullung at gmail.com (Albert Strasheim)
Date: Sat Feb 25 07:03:01 2006
Subject: [Numpy-discussion] Windows Build with optimized libraries
Message-ID: <002301c63a1c$80b0e610$6363630a@dsp.sun.ac.za>

Hello all

I've got the latest NumPy from SVN building with optimized ATLAS 3.7.11
and CLAPACK on Windows. I've also replaced the CLAPACK functions that
are provided by ATLAS with the ATLAS ones.

The current page on building with Windows:

http://www.scipy.org/Installing_SciPy/Windows

doesn't have instructions to do this, so I'd like to add info on
building with MinGW, Visual Studio .NET 2003, Visual C++ Toolkit 2003
and Visual C++ 2005 Express Edition (if I can figure out how to make
distutils detect Visual C++ 2005).

I had to change the build scripts in a few places to get things to
work. I've attached the patch if someone is interested in committing it
to SVN.

Briefly, I did the following:

1. Built ATLAS 3.7.11 with Cygwin. I copied libatlas.a as atlas.lib and
libcblas.a as cblas.lib to some directory, say c:\tmp\numpylibs.

2. Built CLAPACK 3.0 for Windows with Visual Studio .NET 2003.
I added cblaswr.c to clapack.lib and disabled building of the other projects, except for libI77 and libF77. I also changed the project properties of these three projects to use SSE2 instructions (under C/C++ | Code Generation | Enable Enhanced Instruction Set). I don't know if this makes much difference though (anybody have some benchmarks?). 3. I then took release builds of clapack.lib, libF77.lib and libI77.lib and rolled them together with ATLAS's liblapack.a: cp clapack.lib lapack.lib ar x liblapack.a mkdir Release ar x libI77.lib ar x libF77.lib ar r lapack.lib Release/*.obj *.o This adds the symbols from libI77 and libF77 to the library and replaces any existing symbols with the symbols from the ATLAS LAPACK library. I copied this lapack.lib to c:\tmp\numpylibs. 4. I created the file numpy\numpy\distutils\site.cfg with contents: [atlas] library_dirs = c:\tmp\numpylibs atlas_libs = cblas,atlas [lapack] library_dirs = c:\tmp\numpylibs lapack_libs = lapack 5.1. Visual Studio: python setup.py bdist_wininst 5.2. MinGW: python setup.py config --compiler=mingw32 build --compiler=mingw32 bdist_wininst That's it. The build generated a shiny numpy-0.9.6.2168.win32-py2.4.exe. A quick question: it seems that NumPy can also use FFTW 2.1.5 to speed up its FFT functions. Is this the case? If so, I'll take a look at building FFTW 2.1.5 on Windows too. fftw.org's links to solution files for 2.1.3 are broken, so I'll probably have to make new ones. Hope this helps. Regards Albert -------------- next part -------------- A non-text attachment was scrubbed... Name: numpy-windowsbuild.diff Type: application/octet-stream Size: 2117 bytes Desc: not available URL: From ndarray at mac.com Sat Feb 25 08:17:01 2006 From: ndarray at mac.com (Sasha) Date: Sat Feb 25 08:17:01 2006 Subject: [Numpy-discussion] Re: inconsistent use of axis= keyword argument? In-Reply-To: References: Message-ID: > >>Vidar Gundersen wrote: > > The vec-operator transforms a matrix into a vector by stacking its > > columns one underneath the other. >>> x matrix([[1, 2], [3, 4]]) >>> matrix([[1,2],[3,4]]).T.ravel() matrix([[1, 3, 2, 4]]) From robert.kern at gmail.com Sat Feb 25 09:52:04 2006 From: robert.kern at gmail.com (Robert Kern) Date: Sat Feb 25 09:52:04 2006 Subject: [Numpy-discussion] Re: Fwd: Re: inconsistent use of axis= keyword argument? In-Reply-To: References: Message-ID: Nils Wagner wrote: >> Hi Robert, >> >>Is it possible to use aliases for these operators ? >> >>In textbooks (Matrix Algebra by Abadir and Magnus, Cambridge University >>Press (2005)) you will find >> >>The vec-operator transforms a matrix into a vector by stacking its >>columns one underneath the other. It's possible to add some aliases, sure. No, we're not going to do it. There are already too many different names for the same thing in numpy because of backwards compatibility. We should not exacerbate the problem. -- Robert Kern robert.kern at gmail.com "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From pebarrett at gmail.com Sat Feb 25 09:57:02 2006 From: pebarrett at gmail.com (Paul Barrett) Date: Sat Feb 25 09:57:02 2006 Subject: [Numpy-discussion] Why multiple ellipses? In-Reply-To: <43FF9FD8.8090704@ee.byu.edu> References: <43FF9FD8.8090704@ee.byu.edu> Message-ID: <40e64fa20602250956r74e69ac4q33548b8a6ef763c0@mail.gmail.com> On 2/24/06, Travis Oliphant wrote: > > Sasha wrote: > > >Numpy allows multiple ellipses in indexing expressions, but I am not > >sure if that is useful. 
AFAIK, ellipsis stands for "as many :'s as > >needed", but if there is more than one, how do I know how many :'s > >each of them represents: > > > > > It should be that the first ellipsis is interpreted as an ellipsis. Any > others are silently converted to ':' characters. > > > > > > >>>>x = arange(8) > >>>>x.shape=(2,2,2) > >>>>x[0,...,0,...] > >>>> > >>>> > >array([0, 1]) > > > > > This is equivalent to > > x[0,...,0,:] > > which is equivalent to > > x[0,0,:] (because the ellipsis is interpreted as nothing). > > >>>>x[0,0,:] > >>>> > >>>> > >array([0, 1]) > > > > > >>>>x[0,:,0] > >>>> > >>>> > >array([0, 2]) > > > >In the example above, the first ellipsis represents no :'s and the > >last one represents one. Is that the current rule that the last > >ellipsis represents all the needed :'s? What is the possible use for > >that? > > > > > > > The rule is that only the first ellipsis (from left to right) is used > and any others are just another spelling of ':'. > > This is the rule that Numeric implemented and so it's what we've kept. > I have no idea what the use might be, but I saw changing the rule as > gratuitous breakage. This might be a good time to change this behavior, since I've yet to find a good reason for keeping it. Maybe we can depricate it until version 1.0. -- Paul Thus, only one ellipsis is actually treated like an ellipse. Everything > else is treated as ':' > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndarray at mac.com Sat Feb 25 10:14:05 2006 From: ndarray at mac.com (Sasha) Date: Sat Feb 25 10:14:05 2006 Subject: [Numpy-discussion] Why multiple ellipses? In-Reply-To: <40e64fa20602250956r74e69ac4q33548b8a6ef763c0@mail.gmail.com> References: <43FF9FD8.8090704@ee.byu.edu> <40e64fa20602250956r74e69ac4q33548b8a6ef763c0@mail.gmail.com> Message-ID: On 2/25/06, Paul Barrett wrote: > ... > On 2/24/06, Travis Oliphant wrote: > > ... > > The rule is that only the first ellipsis (from left to right) is used > > and any others are just another spelling of ':'. > > ... > > This might be a good time to change this behavior, since I've yet to find a > good reason for keeping it. Maybe we can depricate it until version 1.0. > I am very much supporting deprecation. The distinction between '...' and ':' is hard enough to explain without '...' treated as ':' in some cases. I would suggest to allow it in 1.0, but issue python deprecation warning with a text message "repeated ellipses replaced by :'s". From oliphant.travis at ieee.org Sat Feb 25 11:55:10 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Sat Feb 25 11:55:10 2006 Subject: [Numpy-discussion] Why multiple ellipses? In-Reply-To: References: <43FF9FD8.8090704@ee.byu.edu> <40e64fa20602250956r74e69ac4q33548b8a6ef763c0@mail.gmail.com> Message-ID: <4400B5EB.10808@ieee.org> Sasha wrote: >I am very much supporting deprecation. The distinction between '...' >and ':' is hard enough to explain without '...' treated as ':' in some >cases. I would suggest to allow it in 1.0, but issue python >deprecation warning with a text message "repeated ellipses replaced by >:'s". > > I'm fine with that. -Travis From cjw at sympatico.ca Sat Feb 25 12:00:02 2006 From: cjw at sympatico.ca (Colin J. Williams) Date: Sat Feb 25 12:00:02 2006 Subject: [Numpy-discussion] Re: Fwd: Re: inconsistent use of axis= keyword argument? 
In-Reply-To: 
References: 
Message-ID: <4400B722.6080109@sympatico.ca>

Robert Kern wrote:

>Nils Wagner wrote:
>
>>>Hi Robert,
>>>
>>>Is it possible to use aliases for these operators ?
>>>
>>>In textbooks (Matrix Algebra by Abadir and Magnus, Cambridge University
>>>Press (2005)) you will find
>>>
>>>The vec-operator transforms a matrix into a vector by stacking its
>>>columns one underneath the other.
>>>
>
>It's possible to add some aliases, sure. No, we're not going to do it. There are
>already too many different names for the same thing in numpy because of
>backwards compatibility. We should not exacerbate the problem.
>

Could these be put into a separate module and only included when
Numeric compatibility is desired?  This would help to reduce the
clutter.

Colin W.

From aisaac at american.edu Sat Feb 25 12:07:29 2006
From: aisaac at american.edu (Alan G Isaac)
Date: Sat Feb 25 12:07:29 2006
Subject: [Numpy-discussion] Re: Fwd: Re: inconsistent use of axis= keyword argument?
In-Reply-To: <4400B722.6080109@sympatico.ca>
References: <4400B722.6080109@sympatico.ca>
Message-ID: 

> Robert Kern wrote:
>> It's possible to add some aliases, sure. No, we're not going to do it. There are
>> already too many different names for the same thing in numpy because of
>> backwards compatibility. We should not exacerbate the problem.

On Sat, 25 Feb 2006, "Colin J. Williams" apparently wrote:
> Could these be put into a separate module and only
> included when Numeric compatibility is desired?
> This would help to reduce the clutter.

Beyond this?

Cheers,
Alan Isaac

>>> help(numpy.core.oldnumeric)
Help on module numpy.core.oldnumeric in numpy.core:

NAME
    numpy.core.oldnumeric - # Compatibility module containing deprecated names

FILE
    c:\python24\lib\site-packages\numpy\core\oldnumeric.py

From robert.kern at gmail.com Sat Feb 25 12:08:05 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Sat Feb 25 12:08:05 2006
Subject: [Numpy-discussion] Re: Fwd: Re: inconsistent use of axis= keyword argument?
In-Reply-To: <4400B722.6080109@sympatico.ca>
References: <4400B722.6080109@sympatico.ca>
Message-ID: 

Colin J. Williams wrote:
> Robert Kern wrote:

>> It's possible to add some aliases, sure. No, we're not going to do it.
>> There are
>> already too many different names for the same thing in numpy because of
>> backwards compatibility. We should not exacerbate the problem.
>>
> Could these be put into a separate module and only included when
> Numeric compatibility is desired?

Most, if not all, of the core aliases are already isolated in
numpy.core.oldnumeric.  Some of the other packages also have some
aliases _in situ_, too.  I would personally like it if the core aliases
weren't imported by default, but I think that's a decision that should
have been made (one way or the other) some months ago when the first
wave of code conversion was going on.  I don't want to trigger a second
wave of code conversion just for aesthetics.

-- 
Robert Kern
robert.kern at gmail.com

"In the fields of hell where the grass grows high
 Are the graves of dreams allowed to die."
  -- Richard Harter

From aisaac at american.edu Sat Feb 25 13:25:06 2006
From: aisaac at american.edu (Alan G Isaac)
Date: Sat Feb 25 13:25:06 2006
Subject: Fwd: [Numpy-discussion] Re: inconsistent use of axis= keyword argument?
In-Reply-To: 
References: 
Message-ID: 

On Sat, 25 Feb 2006, Nils Wagner apparently wrote:
> Is it possible to use aliases for these operators ?
> In textbooks (Matrix Algebra by Abadir and Magnus, > Cambridge University Press (2005)) you will find > The vec-operator transforms a matrix into a vector by > stacking its columns one underneath the other. At http://www.american.edu/econ/pytrix/pyGAUSS.py you will find these vec operations (as GAUSS "look alikes"). hth, Alan Isaac #vec: vectorize columns of 2-D array # Format: y = vec(x) # Input: x RxK 2-D array (or matrix) # Output: y (RK)x1 2-D array: # stacked columns of x # Remarks: ravel OK for non-contiguous arrays # Author: Alan G Isaac (aisaac AT american DOT edu) # Date: 20050420 def vec(x): assert(len(x.shape)==2), "For 2-d arrays only." return x.transpose().ravel().reshape((x.size,-1)) #vecr: vectorize rows of 2-D array # Format: y = vecr(x) # Input: x RxK 2-D array (or matrix) # Output: y (RK)x1 2-D array: # stacked rows of x # Remarks: ravel OK for non-contiguous arrays # Author: Alan G Isaac (aisaac AT american DOT edu) # Date: 20050420 def vecr(x): assert(len(x.shape)==2), "For 2-d arrays only." return x.ravel().reshape((x.size,-1)) From fullung at gmail.com Sun Feb 26 04:52:18 2006 From: fullung at gmail.com (Albert Strasheim) Date: Sun Feb 26 04:52:18 2006 Subject: [Numpy-discussion] Triangular window function Message-ID: <003d01c63ad3$5d4b6350$6363630a@dsp.sun.ac.za> Hello all NumPy already has Hamming, Hanning and some other window functions. It would be useful if the triangular window in SciPy could also be brought over. Any thoughts? Regards Albert From ndarray at mac.com Sun Feb 26 11:31:57 2006 From: ndarray at mac.com (Sasha) Date: Sun Feb 26 11:31:57 2006 Subject: [Numpy-discussion] [SciPy-user] Messing with missing values Message-ID: I am replying on "numpy-discussion" because this is really a numpy rather than scipy topic. > Unfortunately, most of the numpy/scipy functions don't handle missing values > nicely. Can you specify which *numpy* functions are giving you trouble? That should be fixed. > How could I mask the values corresponding to > MA.masked in the final list, without having to check every single element? Latest ma allows you to pass masked arrays directly to ufuncs. In order for this to work a ufunc should be registered in the "domains" and "fills" dictionaries. Not much documentation on this feature exists yet, so you will have to read the code in ma.py to figure this out. > Date: Sat, 25 Feb 2006 18:36:19 -0500 > From: pgmdevlist at mailcan.com > Subject: [SciPy-user] Messing with missing values > To: scipy-user at scipy.net > Message-ID: <200602251836.20406.pgmdevlist at mailcan.com> > Content-Type: text/plain; charset="us-ascii" > > Folks, > Most of the data I work with have missing values, and I rely on 'ma' a lot. > Unfortunately, most of the numpy/scipy functions don't handle missing values > nicely. Not a problem I thought, I just have to adapt the functions I need. > I have 2 options: wrapping the initial functions in tests, or recode the > initial functions. > * According to your experience, which is the most efficient way to go ? > * I have a function that outputs either a float or MA.masked. I call this > function recursively, appending the result to a list, and then trying to > process the list as an array. How could I mask the values corresponding to > MA.masked in the final list, without having to check every single element? 
>
> Thanks for your ideas

From robert.kern at gmail.com Sun Feb 26 12:28:08 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Sun Feb 26 12:28:08 2006
Subject: [Numpy-discussion] Re: Triangular window function
In-Reply-To: <003d01c63ad3$5d4b6350$6363630a@dsp.sun.ac.za>
References: <003d01c63ad3$5d4b6350$6363630a@dsp.sun.ac.za>
Message-ID: 

Albert Strasheim wrote:
> Hello all
>
> NumPy already has Hamming, Hanning and some other window functions. It would
> be useful if the triangular window in SciPy could also be brought over.
>
> Any thoughts?

Please, no more.  Of course it would be useful if a function in scipy
were brought over to numpy.  *Any* function in scipy.  Without a more
compelling reason, like special functions that are provided by C99, I'm
-1 on moving anything else from scipy to numpy.

-- 
Robert Kern
robert.kern at gmail.com

"In the fields of hell where the grass grows high
 Are the graves of dreams allowed to die."
  -- Richard Harter

From pgmdevlist at mailcan.com Sun Feb 26 21:20:03 2006
From: pgmdevlist at mailcan.com (pgmdevlist at mailcan.com)
Date: Sun Feb 26 21:20:03 2006
Subject: [Numpy-discussion] Re: [SciPy-user] Messing with missing values
In-Reply-To: 
References: 
Message-ID: <200602270019.24151.pgmdevlist@mailcan.com>

On Sunday 26 February 2006 14:19, Sasha wrote:
> I am replying on "numpy-discussion" because this is really a numpy
> rather than scipy topic.

My bad, sorry for that.

> > Unfortunately, most of the numpy/scipy functions don't handle missing
> > values nicely.
>
> Can you specify which *numpy* functions are giving you trouble?
> That should be fixed.

Typical examples: median, stdev, diff...  `stdev` is obvious, and
`median` is straightforward for 1d arrays (I'm still looking for an
optimal method for higher dimensions).  The couple of `shape_base`
functions I tried (`hstack`, `column_stack`...) required filling the
array beforehand and superimposing the corresponding mask.  Or even
some methods such as `ndim` (more for convenience than anything; a
`len(x.shape)` does the trick for both masked and unmasked versions),
or r_[].

I remember a message a couple of weeks ago wondering whether ma should
be kept up to date with the rest of numpy (and of course, I can't find
the reference right now).  What's the status on ma?

> > How could I mask the values corresponding to
> > MA.masked in the final list, without having to check every single
> > element?
>
> Latest ma allows you to pass masked arrays directly to ufuncs.  In
> order for this to work a ufunc should be registered in the "domains"
> and "fills" dictionaries.  Not much documentation on this feature
> exists yet, so you will have to read the code in ma.py to figure this
> out.

Let's take the `median` example for 2D arrays.  I end up with something
like:

---
med = []
for x_i in x:
    med.append(median1d(x_i.compressed()))
---

with `median1d` a slightly modified version of the basic numpy
`median`, outputting `MA.masked` if `x_i.compressed()` is `None`.  I
need the `med` list to be a masked array.  Paul Dubois suggests:

---
return ma.array(med, mask=[x is ma.masked for x in med])
---

I guess that's more efficient than the

---
return MA.masked_values(med.filled(nodata), nodata)
---

I had come up with.  AAMOF, it seems even faster to hardcode the
`median1d` part in the loop.

But yes, I'm gonna check the sources for the ufunc.  Thanks again.
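For concreteness, a sketch of the loop Pierre describes with Paul
Dubois's mask idiom applied.  The name median2d is illustrative, the
import path is the 0.9-era one, and this assumes (as Paul's suggestion
does) that MA.array accepts a list containing MA.masked:

import numpy as N
import numpy.core.ma as MA        # spelled numpy.ma in later releases

def median2d(x):
    # Row-wise median that ignores masked entries, in the spirit of
    # R's na.rm=TRUE; a fully masked row yields MA.masked.
    med = []
    for row in x:
        data = row.compressed()
        if len(data):
            med.append(N.median(data))
        else:
            med.append(MA.masked)
    # Paul Dubois's idiom: rebuild the mask by identity with MA.masked.
    return MA.array(med, mask=[m is MA.masked for m in med])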
-- Pierre GM From zpincus at stanford.edu Sun Feb 26 21:35:01 2006 From: zpincus at stanford.edu (Zachary Pincus) Date: Sun Feb 26 21:35:01 2006 Subject: [Numpy-discussion] Simplest ndarray subclass __new__ possible? Message-ID: <39FC8FA0-48DE-408D-8AD8-DDF72474F1C1@stanford.edu> Hi folks, I'm trying to write an ndarray subclass with a constructor like the matrix constructor -- one which can take matrix objects, array objects, or things that can be turned into array objects. I've copied the __new__ method from matrix (and tried to eliminate the matrix-specific stuff), but there's a lot of code there. So I'm trying to figure out what the absolute minimum I need is for correct behavior. (This would be a useful wiki entry somewhere. In fact, a whole page about subclassing ndarray would be good.) What follows is what I have so far. Have I missed anything, or can anything else be removed? Zach class contour(numpy.ndarray): def __new__(subtype, data, dtype=None, copy=True): ##### Do I need this first if block? ##### Wouldn't the second block would do fine on its own? if isinstance(data, contour): dtype2 = data.dtype if (dtype is None): dtype = dtype2 if (dtype2 == dtype) and (not copy): return data return data.astype(dtype) if isinstance(data, numpy.ndarray): if dtype is None: intype = data.dtype else: intype = numpy.dtype(dtype) new = data.view(contour) if intype != data.dtype: return new.astype(intype) if copy: return new.copy() else: return new # now convert data to an array arr = numpy.array(data, dtype=dtype, copy=copy) ##### Do I need this if block? if not (arr.flags.fortran or arr.flags.contiguous): arr = arr.copy() ##### Do I need the fortran flag? ret = numpy.ndarray.__new__(subtype, arr.shape, arr.dtype, buffer=arr, fortran=arr.flags.fortran) return ret From schofield at ftw.at Mon Feb 27 02:57:03 2006 From: schofield at ftw.at (Ed Schofield) Date: Mon Feb 27 02:57:03 2006 Subject: [Numpy-discussion] Sparse matrix hooks Message-ID: <4402DAA8.3030501@ftw.at> I'm trying to improve integration between SciPy's sparse matrices and NumPy's dense array/matrix objects. One problem I'm facing is that NumPy matrices redefine the * operator to call NumPy's dot() function. Since dot() has no awareness of SciPy's sparse matrix objects, this doesn't work for the operation 'dense * sparse'. (It does work for sparse * dense, which calls sparse.__mul__ first.) I'd like to propose the addition of a basic sparse matrix object to NumPy. This wouldn't need to provide any actual functionality, but could instead provide a skeletal interface or base class from which sparse matrices in other packages (SciPy and potentially PySparse) could derive. The spmatrix object in SciPy would be a good starting point. The benefit of this would be that hooks for proper handling of sparse matrices would be easy to provide for functions like dot(), where(), and var(). There may be other ways to make 'dense * sparse' work in SciPy, but I haven't been able to come up with any. This solution would at least be quite flexible and quite straightforward. -- Ed From pearu at scipy.org Mon Feb 27 03:09:04 2006 From: pearu at scipy.org (Pearu Peterson) Date: Mon Feb 27 03:09:04 2006 Subject: [Numpy-discussion] Sparse matrix hooks In-Reply-To: <4402DAA8.3030501@ftw.at> References: <4402DAA8.3030501@ftw.at> Message-ID: On Mon, 27 Feb 2006, Ed Schofield wrote: > > I'm trying to improve integration between SciPy's sparse matrices and > NumPy's dense array/matrix objects. 
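(Stepping back to Zachary's question above: the construction logic can
most likely be collapsed to a view-based pattern, since numpy.array
already handles the ndarray and sequence cases.  This is a sketch, not
a tested minimum, and it changes the copy semantics slightly, as noted
in the comments.)

import numpy

class contour(numpy.ndarray):
    def __new__(subtype, data, dtype=None, copy=True):
        # numpy.array handles contour, ndarray and sequence inputs
        # alike, honouring dtype and the copy flag...
        arr = numpy.array(data, dtype=dtype, copy=copy)
        # ...and view() re-types the result without copying its buffer.
        # Unlike the long version, this returns a fresh view object
        # even when copy=False and data is already a contour.
        return arr.view(subtype)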
One problem I'm facing is that > NumPy matrices redefine the * operator to call NumPy's dot() function. > Since dot() has no awareness of SciPy's sparse matrix objects, this > doesn't work for the operation 'dense * sparse'. (It does work for > sparse * dense, which calls sparse.__mul__ first.) Have you tried defining sparse.__rmul__? dense.__mul__ should raise an exception when it does not know about the rhs operant and then Python calls .__rmul__. Pearu From schofield at ftw.at Mon Feb 27 04:38:01 2006 From: schofield at ftw.at (Ed Schofield) Date: Mon Feb 27 04:38:01 2006 Subject: [Numpy-discussion] Sparse matrix hooks In-Reply-To: References: <4402DAA8.3030501@ftw.at> Message-ID: <4402F293.8050606@ftw.at> Pearu Peterson wrote: > On Mon, 27 Feb 2006, Ed Schofield wrote: > >> I'm trying to improve integration between SciPy's sparse matrices and >> NumPy's dense array/matrix objects. One problem I'm facing is that >> NumPy matrices redefine the * operator to call NumPy's dot() function. >> Since dot() has no awareness of SciPy's sparse matrix objects, this >> doesn't work for the operation 'dense * sparse'. (It does work for >> sparse * dense, which calls sparse.__mul__ first.) > > Have you tried defining sparse.__rmul__? dense.__mul__ should raise > an exception when it does not know about the rhs operant and then > Python calls .__rmul__. Yes, we've defined __rmul__, and this works fine for dense arrays, whose __mul__ raises an exception. The problem is that matrix.__mul__ calls dot(), which doesn't raise an exception, but rather creates an oddball object array: matrix([[ (1, 0) 0.0 (2, 1) 0.0 (3, 0) 0.0, (1, 0) 0.0 (2, 1) 0.0 (3, 0) 0.0, (1, 0) 0.0 (2, 1) 0.0 (3, 0) 0.0]], dtype=object) We could potentially modify the __mul__ function of numpy's matrix objects to make a guess about whether an array constructed out of the argument will somehow be sane or whether, like here, it should raise an exception. But this would be difficult to get right, since the sparse matrix formats are quite varied (some supporting the map/sequence protocols, some not, etc.). But being able to test isinstance(arg, spmatrix) would make this easy. -- Ed From cjw at sympatico.ca Mon Feb 27 05:07:51 2006 From: cjw at sympatico.ca (Colin J. Williams) Date: Mon Feb 27 05:07:51 2006 Subject: [Numpy-discussion] Simplest ndarray subclass __new__ possible? In-Reply-To: <39FC8FA0-48DE-408D-8AD8-DDF72474F1C1@stanford.edu> References: <39FC8FA0-48DE-408D-8AD8-DDF72474F1C1@stanford.edu> Message-ID: <4402F958.7010902@sympatico.ca> Zachary Pincus wrote: > Hi folks, > > I'm trying to write an ndarray subclass with a constructor like the > matrix constructor -- one which can take matrix objects, array > objects, or things that can be turned into array objects. > > I've copied the __new__ method from matrix (and tried to eliminate > the matrix-specific stuff), but there's a lot of code there. So I'm > trying to figure out what the absolute minimum I need is for correct > behavior. (This would be a useful wiki entry somewhere. In fact, a > whole page about subclassing ndarray would be good.) > > What follows is what I have so far. Have I missed anything, or can > anything else be removed? > > Zach > > class contour(numpy.ndarray): > def __new__(subtype, data, dtype=None, copy=True): > > ##### Do I need this first if block? > ##### Wouldn't the second block would do fine on its own? 
> if isinstance(data, contour): > dtype2 = data.dtype > if (dtype is None): > dtype = dtype2 > if (dtype2 == dtype) and (not copy): > return data > return data.astype(dtype) > > if isinstance(data, numpy.ndarray): > if dtype is None: > intype = data.dtype > else: > intype = numpy.dtype(dtype) > new = data.view(contour) > if intype != data.dtype: > return new.astype(intype) > if copy: return new.copy() > else: return new > > # now convert data to an array > arr = numpy.array(data, dtype=dtype, copy=copy) > > ##### Do I need this if block? > if not (arr.flags.fortran or arr.flags.contiguous): > arr = arr.copy() > > ##### Do I need the fortran flag? > ret = numpy.ndarray.__new__(subtype, arr.shape, arr.dtype, > buffer=arr, fortran=arr.flags.fortran) > return ret > > Would there be any merit in breaking this into two parts, __new__ to allocate space and __init__ to initialize the data? Colin W. From pearu at scipy.org Mon Feb 27 05:10:03 2006 From: pearu at scipy.org (Pearu Peterson) Date: Mon Feb 27 05:10:03 2006 Subject: [Numpy-discussion] Sparse matrix hooks In-Reply-To: <4402F293.8050606@ftw.at> References: <4402DAA8.3030501@ftw.at> <4402F293.8050606@ftw.at> Message-ID: On Mon, 27 Feb 2006, Ed Schofield wrote: > Pearu Peterson wrote: >> On Mon, 27 Feb 2006, Ed Schofield wrote: >> >>> I'm trying to improve integration between SciPy's sparse matrices and >>> NumPy's dense array/matrix objects. One problem I'm facing is that >>> NumPy matrices redefine the * operator to call NumPy's dot() function. >>> Since dot() has no awareness of SciPy's sparse matrix objects, this >>> doesn't work for the operation 'dense * sparse'. (It does work for >>> sparse * dense, which calls sparse.__mul__ first.) >> >> Have you tried defining sparse.__rmul__? dense.__mul__ should raise >> an exception when it does not know about the rhs operant and then >> Python calls .__rmul__. > > Yes, we've defined __rmul__, and this works fine for dense arrays, whose > __mul__ raises an exception. The problem is that matrix.__mul__ calls > dot(), which doesn't raise an exception, but rather creates an oddball > object array: > > matrix([[ (1, 0) 0.0 > (2, 1) 0.0 > (3, 0) 0.0, > (1, 0) 0.0 > (2, 1) 0.0 > (3, 0) 0.0, > (1, 0) 0.0 > (2, 1) 0.0 > (3, 0) 0.0]], dtype=object) > > > We could potentially modify the __mul__ function of numpy's matrix > objects to make a guess about whether an array constructed out of the > argument will somehow be sane or whether, like here, it should raise an > exception. But this would be difficult to get right, since the sparse > matrix formats are quite varied (some supporting the map/sequence > protocols, some not, etc.). But being able to test isinstance(arg, > spmatrix) would make this easy. Sure, isinstance(arg,spmatrix) would work but it is not a general solution for performing binary operations with matrices and such user defined objects that numpy is not aware of. But these objects may be aware of numpy matrices or arrays. Sparse matrix is one example. Other example is defining a symbolic matrix. Etc. So, IMHO matrix.__mul__ (or dot) should be fixed instead. Pearu From martin.wiechert at gmx.de Mon Feb 27 05:47:00 2006 From: martin.wiechert at gmx.de (Martin Wiechert) Date: Mon Feb 27 05:47:00 2006 Subject: [Numpy-discussion] object members Message-ID: <200602271436.12659.martin.wiechert@gmx.de> Hi developers, any plans on when object members will be back? a = ndarray (shape = (10,), dtype = {'names': ['x'], 'formats': ['|O4']}) TypeError: fields with object members not yet supported. 
in numpy 0.9.5. Of course, this is better than the segfault in 0.9.4, but it would be quite inconvenient for my project to not have object members. My C code still produces dtypes with object members. Can I safely use them, as long as I make sure new arrays are properly initialised? Thanks, Martin. From schofield at ftw.at Mon Feb 27 06:41:08 2006 From: schofield at ftw.at (Ed Schofield) Date: Mon Feb 27 06:41:08 2006 Subject: [Numpy-discussion] Sparse matrix hooks In-Reply-To: References: <4402DAA8.3030501@ftw.at> <4402F293.8050606@ftw.at> Message-ID: <44030F74.1060508@ftw.at> Pearu Peterson wrote: > On Mon, 27 Feb 2006, Ed Schofield wrote: >> Yes, we've defined __rmul__, and this works fine for dense arrays, whose >> __mul__ raises an exception. The problem is that matrix.__mul__ calls >> dot(), which doesn't raise an exception, but rather creates an oddball >> object array: >> >> matrix([[ (1, 0) 0.0 >> (2, 1) 0.0 >> (3, 0) 0.0, >> (1, 0) 0.0 >> (2, 1) 0.0 >> (3, 0) 0.0, >> (1, 0) 0.0 >> (2, 1) 0.0 >> (3, 0) 0.0]], dtype=object) >> >> >> We could potentially modify the __mul__ function of numpy's matrix >> objects to make a guess about whether an array constructed out of the >> argument will somehow be sane or whether, like here, it should raise an >> exception. But this would be difficult to get right, since the sparse >> matrix formats are quite varied (some supporting the map/sequence >> protocols, some not, etc.). But being able to test isinstance(arg, >> spmatrix) would make this easy. > > Sure, isinstance(arg,spmatrix) would work but it is not a general > solution for performing binary operations with matrices and such user > defined objects that numpy is not aware of. But these objects may be > aware of numpy matrices or arrays. Sparse matrix is one example. Other > example is defining a symbolic matrix. Etc. > So, IMHO matrix.__mul__ (or dot) should be fixed instead. Ah, yes, this could be the simplest solution (at least to the __mul__ problem). We could redefine matrix.__mul__ as def __mul__(self, other): if isinstance(other, N.ndarray) or not hasattr(other, '__rmul__') \ or N.isscalar(other): return N.dot(self, other) else: return NotImplemented This seems to fix multiplication. I may make a case later for sparse matrix hooks for other functions, but I don't see a pressing need right now. ;) -- Ed From ndarray at mac.com Mon Feb 27 08:14:02 2006 From: ndarray at mac.com (Sasha) Date: Mon Feb 27 08:14:02 2006 Subject: [Numpy-discussion] Re: [SciPy-user] Messing with missing values In-Reply-To: <200602270019.24151.pgmdevlist@mailcan.com> References: <200602270019.24151.pgmdevlist@mailcan.com> Message-ID: On 2/27/06, pgmdevlist at mailcan.com wrote: > ... > I remmbr a message a couple of weeks ago wondering whether ma should be kpet > uptodate with the rest of numpy (and of course, I can't find the reference > right now). What's the status on ma ? > Ma will be supported in numpy. See FAQ "Does NumPy support nan ("not a number")?." Ma development page is at http://projects.scipy.org/scipy/numpy/wiki/MaskedArray . Feel free to add contents there. I would welcome a section listing numpy functions that are still not available in ma. > Let's take the `median` example for 2D arrays. Median is one of those examples where Paul's recommendation does not work because missing values should be ignored rather than filled. 
For example, in R median has two modes: to ignore missing values and to return missing value if any value is missing: > median(c(1,NA)) [1] NA > help(median) > median(c(1,NA),na.rm=TRUE) [1] 1 > median(c(1,0)) [1] 0.5 From zpincus at stanford.edu Mon Feb 27 08:50:10 2006 From: zpincus at stanford.edu (Zachary Pincus) Date: Mon Feb 27 08:50:10 2006 Subject: [Numpy-discussion] Simplest ndarray subclass __new__ possible? In-Reply-To: <4402F958.7010902@sympatico.ca> References: <39FC8FA0-48DE-408D-8AD8-DDF72474F1C1@stanford.edu> <4402F958.7010902@sympatico.ca> Message-ID: <44724794-B6E1-4927-8D63-F3BC74F2CD10@stanford.edu> On Feb 27, 2006, at 5:06 AM, Colin J. Williams wrote: > [snip] > Would there be any merit in breaking this into two parts, __new__ > to allocate space and __init__ to initialize the data? What I presented below is exactly the __new__ grabbed from the matrix class definition in defmatrix.py (less matrix-specific stuff). I assume it's overkill for what I need, but it seemed like a good starting place. Figuring out what bits are really necessary and what bits aren't would be step 1, before I can even think about whether some of the bits should really live in __init__ (though I can't actually see any of the below lining in init, because most of the code is given over to deciding whether or not to allocate space). Zach > Zachary Pincus wrote: >> >> What follows is what I have so far. Have I missed anything, or >> can anything else be removed? >> >> Zach >> >> class contour(numpy.ndarray): >> def __new__(subtype, data, dtype=None, copy=True): >> >> ##### Do I need this first if block? >> ##### Wouldn't the second block would do fine on its own? >> if isinstance(data, contour): >> dtype2 = data.dtype >> if (dtype is None): >> dtype = dtype2 >> if (dtype2 == dtype) and (not copy): >> return data >> return data.astype(dtype) >> >> if isinstance(data, numpy.ndarray): >> if dtype is None: >> intype = data.dtype >> else: >> intype = numpy.dtype(dtype) >> new = data.view(contour) >> if intype != data.dtype: >> return new.astype(intype) >> if copy: return new.copy() >> else: return new >> >> # now convert data to an array >> arr = numpy.array(data, dtype=dtype, copy=copy) >> >> ##### Do I need this if block? >> if not (arr.flags.fortran or arr.flags.contiguous): >> arr = arr.copy() >> >> ##### Do I need the fortran flag? >> ret = numpy.ndarray.__new__(subtype, arr.shape, arr.dtype, >> buffer=arr, fortran=arr.flags.fortran) >> return ret >> >> > Would there be any merit in breaking this into two parts, __new__ > to allocate space and __init__ to initialize the data? > > Colin W. > > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting > language > that extends applications into web and mobile media. Attend the > live webcast > and join the prime developer group breaking into this new coding > territory! > http://sel.as-us.falkag.net/sel? 
> cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion From oliphant.travis at ieee.org Mon Feb 27 10:10:02 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Mon Feb 27 10:10:02 2006 Subject: [Numpy-discussion] object members In-Reply-To: <200602271436.12659.martin.wiechert@gmx.de> References: <200602271436.12659.martin.wiechert@gmx.de> Message-ID: <4403403B.1050405@ieee.org> Martin Wiechert wrote: >Hi developers, > >any plans on when object members will be back? > > Probably not for a while unless somebody wants to tackle the issues that are present. >My C code still produces dtypes with object members. Can I safely use them, as >long as I make sure new arrays are properly initialised? > > You can only safely use dtypes that are a single OBJECT array. If a record data-type has object members, it is not accounted for in all of the code that special-checks for object arrays. Basically, all of that code would need to be adjusted to deal with has-object (i.e. void arrays that have a part of their memory layout that is an object). The issue is that reference counts need to be handled correctly for that portion of the data-record. This was not considered as the code was written, so it would involve a bit of work to get right. It absolutely could be done, but it's not on my priority list and my time for working on NumPy has dwindled. -Travis From ndarray at mac.com Mon Feb 27 10:17:16 2006 From: ndarray at mac.com (Sasha) Date: Mon Feb 27 10:17:16 2006 Subject: [Numpy-discussion] Unexpected property of ndarray.fill Message-ID: The function ndarray.fill is documented as taking a scalar value, but in the current implementation it accepts arrays as well and ignores all but the first element Example 1: >>> x = empty(10) >>> x.fill([1,2]) >>> x array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1]) Example 2: >>> y = empty((2,3)) >>> y.fill([1,2,3]) >>> y array([[1, 1, 1], [1, 1, 1]]) I believe this somewhat unexpected. I would expect one of the following to apply: 1. Exception in both examples 2. Exception in Example 1 and array([[1,2,3], [1,2,3]]) in Example 2 3. array([1, 2, 1, 2, 1, 2, 1, 2, 1, 2]) in Example 1 and array([[1,2,3], [1,2,3]]) in Example 2 From oliphant at ee.byu.edu Mon Feb 27 11:26:04 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Mon Feb 27 11:26:04 2006 Subject: [Numpy-discussion] subclassing ndaray In-Reply-To: <20060225083924.GF21117@alpha> References: <43FF2C92.3060304@sympatico.ca> <20060224191426.GD21117@alpha> <43FFB580.80307@ieee.org> <20060225083924.GF21117@alpha> Message-ID: <44035227.9010609@ee.byu.edu> Stefan van der Walt wrote: >>The __init__ and __new__ methods are not called because they may have >>arbitrary signatures. Instead, the __array_finalize__ method is always >>called. So, you should use that instead of __init__. >> >> This is now true in SVN. Previously, __array_finalize__ was not called if the "parent" was NULL. However, now, it is still called with None as the value of the first argument. Thus __array_finalize__ will be called whenever ndarray.__new__(,...) is called. 
-Travis From oliphant at ee.byu.edu Mon Feb 27 11:38:03 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Mon Feb 27 11:38:03 2006 Subject: [Numpy-discussion] subclassing ndaray In-Reply-To: <20060227190844.GA28750@alpha> References: <43FF2C92.3060304@sympatico.ca> <20060224191426.GD21117@alpha> <43FFB580.80307@ieee.org> <20060225083924.GF21117@alpha> <44033E75.3010200@ieee.org> <20060227190844.GA28750@alpha> Message-ID: <440354DB.5020006@ee.byu.edu> Stefan van der Walt wrote: >If I understand correctly, the __array_finalize__ method should copy >all meta data from the parent object to the child. In other words, I >might have something like: > >def __init__(self, ...): > self.v = 3 > >def __array_finalize__(self, parent): > self.v = parent.v > >I do not want to do something like > >def __array_finalize__(self, parent): > self.v = 3 > >Because then, every time I do an operation on my array, self.v will be >reset. Shouldn't array_finalize look for such methods/properties and copy >them automatically? > You need to set up __array_finalize__ to do that. I did not want to do that for every sub-class because different things are required for different sub classes. So, this seems like a workable compromise. It should work a bit better now that I've changed it so that __array_finalize__ is called on every sub-class creation. If there is no "parent" then parent will be None. -Travis From aisaac at american.edu Mon Feb 27 11:50:08 2006 From: aisaac at american.edu (Alan G Isaac) Date: Mon Feb 27 11:50:08 2006 Subject: [Numpy-discussion] Unexpected property of ndarray.fill In-Reply-To: References: Message-ID: On Mon, 27 Feb 2006, Sasha apparently wrote: > 3. array([1, 2, 1, 2, 1, 2, 1, 2, 1, 2]) in Example 1 and > array([[1,2,3], [1,2,3]]) in Example 2 A user preference for 3. And if the fill array is too long, the rest should be discarded. Cheers, Alan Isaac From chanley at stsci.edu Mon Feb 27 11:57:02 2006 From: chanley at stsci.edu (Christopher Hanley) Date: Mon Feb 27 11:57:02 2006 Subject: [Numpy-discussion] subclassing ndaray In-Reply-To: <440354DB.5020006@ee.byu.edu> References: <43FF2C92.3060304@sympatico.ca> <20060224191426.GD21117@alpha> <43FFB580.80307@ieee.org> <20060225083924.GF21117@alpha> <44033E75.3010200@ieee.org> <20060227190844.GA28750@alpha> <440354DB.5020006@ee.byu.edu> Message-ID: <4403593F.4090201@stsci.edu> Travis Oliphant wrote: > So, this seems like a workable compromise. It should work a bit better > now that I've changed it so that __array_finalize__ is called on every > sub-class creation. If there is no "parent" then parent will be None. 
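Returning briefly to Sasha's ndarray.fill examples a little further up:
the semantics Alan prefers (option 3) can already be approximated
without fill, which may be a useful data point for the discussion.  A
minimal sketch, using only functions present in numpy at the time:

from numpy import empty, resize

x = empty(10)
x[:] = resize([1, 2], 10)   # cycles the fill values: 1, 2, 1, 2, ...

y = empty((2, 3))
y[:] = [1, 2, 3]            # broadcasts row-wise: [[1, 2, 3], [1, 2, 3]]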
Hi Travis, The change you made just broke the following bit of code: class FITS_rec(rec.recarray): def __new__(subtype, input): """Construct a FITS record array from a recarray.""" # input should be a record array self = rec.recarray.__new__(subtype, input.shape, input.dtype, buf=input.data, strides=input.strides) self._nfields = len(self.dtype.fields[-1]) self._convert = [None]*len(self.dtype.fields[-1]) self._coldefs = None return self def __array_finalize__(self,obj): self._convert = obj._convert self._coldefs = obj._coldefs self._nfields = obj._nfields In [1]: from pyfits import FITS_rec In [2]: from numpy import rec In [3]: data = FITS_rec(rec.array(None,formats="i4,i4,i4",names="c1,c2,c3",shape=3)) --------------------------------------------------------------------------- exceptions.AttributeError Traceback (most recent call last) /data/sparty1/dev/pyfits-numpy/test/ /data/sparty1/dev/site-packages/lib/python/pyfits.py in __new__(subtype, input) 3136 # self.__setstate__(input.__getstate__()) 3137 self = rec.recarray.__new__(subtype, input.shape, input.dtype, -> 3138 buf=input.data, strides=input.strides) 3139 3140 # _parent is the original (storage) array, /data/sparty1/dev/site-packages/lib/python/numpy/core/records.py in __new__(subtype, shape, formats, names, titles, buf, offset, strides, byteorder, aligned) 153 self = sb.ndarray.__new__(subtype, shape, (record, descr), 154 buffer=buf, offset=offset, --> 155 strides=strides) 156 return self 157 /data/sparty1/dev/site-packages/lib/python/pyfits.py in __array_finalize__(self, obj) 3150 def __array_finalize__(self,obj): 3151 # self._parent = obj._parent -> 3152 self._convert = obj._convert 3153 self._coldefs = obj._coldefs 3154 self._nfields = obj._nfields AttributeError: 'NoneType' object has no attribute '_convert' Given what you have just said I would not have expected this to be broken. Chris From oliphant at ee.byu.edu Mon Feb 27 12:01:06 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Mon Feb 27 12:01:06 2006 Subject: [Numpy-discussion] subclassing ndaray In-Reply-To: <4403593F.4090201@stsci.edu> References: <43FF2C92.3060304@sympatico.ca> <20060224191426.GD21117@alpha> <43FFB580.80307@ieee.org> <20060225083924.GF21117@alpha> <44033E75.3010200@ieee.org> <20060227190844.GA28750@alpha> <440354DB.5020006@ee.byu.edu> <4403593F.4090201@stsci.edu> Message-ID: <44035A5E.2030100@ee.byu.edu> Christopher Hanley wrote: > Travis Oliphant wrote: > >> So, this seems like a workable compromise. It should work a bit >> better now that I've changed it so that __array_finalize__ is called >> on every sub-class creation. If there is no "parent" then parent >> will be None. > > > Hi Travis, > > The change you made just broke the following bit of code: Right. It's because obj can sometimes be None... > > class FITS_rec(rec.recarray): > def __new__(subtype, input): > """Construct a FITS record array from a recarray.""" > # input should be a record array > self = rec.recarray.__new__(subtype, input.shape, input.dtype, > buf=input.data, strides=input.strides) > > self._nfields = len(self.dtype.fields[-1]) > self._convert = [None]*len(self.dtype.fields[-1]) > self._coldefs = None > return self > > def __array_finalize__(self,obj): > self._convert = obj._convert > self._coldefs = obj._coldefs > self._nfields = obj._nfields Add as the first line. if obj is None: return -Travis > > Given what you have just said I would not have expected this to be > broken. 
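Spelled out, the guarded finalizer Travis is describing would
presumably read:

def __array_finalize__(self, obj):
    if obj is None:            # direct construction: nothing to inherit
        return
    self._convert = obj._convert
    self._coldefs = obj._coldefs
    self._nfields = obj._nfields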
The change is that now __array_finalize__ is always called (even if there is no "parent" in which case obj is None). This seems easier to explain and a bit more consistent. Feedback encouraged... -Travis From strawman at astraw.com Mon Feb 27 12:16:03 2006 From: strawman at astraw.com (Andrew Straw) Date: Mon Feb 27 12:16:03 2006 Subject: [Numpy-discussion] bug report: nonzero on masked arrays Message-ID: <44035DEB.6030703@astraw.com> I know there's been some discussion along these lines lately, but I haven't followed it closely and thought I could still give a simple bug report. I can post this on Trac if no one can tackle it immediately -- let me know. --Andrew In [1]: import numpy In [2]: numpy.__file__ Out[2]: '/home/astraw/py2.3-linux-x86_64/lib/python2.3/site-packages/numpy-0.9.6.2172-py2.3-linux-x86_64.egg/numpy/__init__.pyc' In [3]: jj= numpy.ma.masked_array( [0,1,2,3,0,4,5,6], mask=[0,0,0,1,1,1,0,0]) In [4]: numpy.ma.nonzero(jj) --------------------------------------------------------------------------- exceptions.NameError Traceback (most recent call last) /home/astraw/src/kookaburra/flydra/analysis/ /home/astraw/py2.3-linux-x86_64/lib/python2.3/site-packages/numpy-0.9.6.2172-py2.3-linux-x86_64.egg/numpy/core/ma.py in __call__(self, a, *args, **kwargs) 317 else: 318 if m.shape != shape: --> 319 m = mask_or(getmaskarray(a), getmaskarray(b)) 320 return masked_array(result, m) 321 NameError: global name 'b' is not defined In [5]: From paul at pfdubois.com Mon Feb 27 12:55:04 2006 From: paul at pfdubois.com (Paul F. Dubois) Date: Mon Feb 27 12:55:04 2006 Subject: [Numpy-discussion] Re: [SciPy-user] Messing with missing values In-Reply-To: References: <200602270019.24151.pgmdevlist@mailcan.com> Message-ID: <440366F2.8090705@pfdubois.com> Some things don't make sense for missing value arrays. For example, FFT's. Median does not make sense if you fill so you have to use compress. I think for each operation you must decide what you mean exactly. If it fits the "easy" wrapper model, you can do that, but otherwise, you have to code it. Sasha wrote: > On 2/27/06, pgmdevlist at mailcan.com wrote: >> ... >> I remmbr a message a couple of weeks ago wondering whether ma should be kpet >> uptodate with the rest of numpy (and of course, I can't find the reference >> right now). What's the status on ma ? >> > > Ma will be supported in numpy. See FAQ > "Does NumPy support nan ("not a > number")?." > > Ma development page is at > http://projects.scipy.org/scipy/numpy/wiki/MaskedArray . > Feel free to add contents there. I would welcome a section listing > numpy functions that are still not available in ma. > > >> Let's take the `median` example for 2D arrays. > > Median is one of those examples where Paul's recommendation does not > work because missing values should be ignored rather than filled. For > example, in R median has two modes: to ignore missing values and to > return missing value if any value is missing: > >> median(c(1,NA)) > [1] NA >> help(median) >> median(c(1,NA),na.rm=TRUE) > [1] 1 >> median(c(1,0)) > [1] 0.5 > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting language > that extends applications into web and mobile media. Attend the live webcast > and join the prime developer group breaking into this new coding territory! 
> http://sel.as-us.falkag.net/sel?cmd=k&kid0944&bid$1720&dat1642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From aisaac at american.edu Mon Feb 27 14:03:15 2006 From: aisaac at american.edu (Alan G Isaac) Date: Mon Feb 27 14:03:15 2006 Subject: [Numpy-discussion] missing array type Message-ID: The recent discussion of Matlab's repmat plus some recent use of grids leads me to ask: should numpy contain a true repeated array object, where one single copy of the data supports a full set of array operations? (So that, e.g., repmat(x,(3,2)) would simply point to the x data or, if desired, make a single copy of it.) And actually this seems a special case of a truly space-saving Kronecker product. Cheers, Alan Isaac From ndarray at mac.com Mon Feb 27 14:27:13 2006 From: ndarray at mac.com (Sasha) Date: Mon Feb 27 14:27:13 2006 Subject: [Numpy-discussion] missing array type In-Reply-To: References: Message-ID: It looks like what you want is a zero-stride array that I proposed some time ago. See "zeros in strides" thread . I've posted a patch to the list, but it was met with a mild opposition from Travis, so I've never committed it to SVN. The final word was: """ I would also like to get more opinions about Sasha's proposal for zero-stride arrays. -Travis """ If you agree that zero-stride array would provide the functionality that you need, it may tip the ballance towards accepting that patch. On 2/27/06, Alan G Isaac wrote: > The recent discussion of Matlab's repmat > plus some recent use of grids leads me to ask: > should numpy contain a true repeated array object, > where one single copy of the data supports > a full set of array operations? (So that, > e.g., repmat(x,(3,2)) would simply point > to the x data or, if desired, make a single > copy of it.) > > And actually this seems a special case of > a truly space-saving Kronecker product. > > Cheers, > Alan Isaac > > > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting language > that extends applications into web and mobile media. Attend the live webcast > and join the prime developer group breaking into this new coding territory! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From ndarray at mac.com Mon Feb 27 14:48:03 2006 From: ndarray at mac.com (Sasha) Date: Mon Feb 27 14:48:03 2006 Subject: [Numpy-discussion] Re: [SciPy-user] Messing with missing values In-Reply-To: <200602271658.44855.pgmdevlist@mailcan.com> References: <200602270019.24151.pgmdevlist@mailcan.com> <200602271658.44855.pgmdevlist@mailcan.com> Message-ID: Please reply to the list rather than in private e-mail. Private e-mail is likely to end up in a spam folder. See more below. On 2/27/06, pierregm wrote: > Sasha, > Thanks for your answer. > > Ma development page is at > > http://projects.scipy.org/scipy/numpy/wiki/MaskedArray . > > Feel free to add contents there. I would welcome a section listing > > numpy functions that are still not available in ma. > > OK, I'll work on that > Great! > > Median is one of those examples where Paul's recommendation does not > > work because missing values should be ignored rather than filled. 
For
> > example, in R median has two modes: to ignore missing values and to
> > return missing value if any value is missing:
>
> OK, good idea.  I already implemented a mamedian for 1d and 2d arrays.
> Could you tell me where I could upload it to be double-checked/tested?

I see three logical locations for patches:
1. Attach to the wiki page: .
2. Post to the list: .
3. Upload to sourceforge: .

I don't have any preference, but if you choose 1 or 3, please announce
the URL on the list.  Also it is best to post patches as the output of
"svn diff".

From aisaac at american.edu Mon Feb 27 15:43:02 2006
From: aisaac at american.edu (Alan G Isaac)
Date: Mon Feb 27 15:43:02 2006
Subject: [Numpy-discussion] missing array type
In-Reply-To: 
References: 
Message-ID: 

On Mon, 27 Feb 2006, Sasha apparently wrote:
> If you agree that zero-stride array would provide the
> functionality that you need, it may tip the balance
> towards accepting that patch.

I am out of my technical depth here.  Based on your examples,
zero-stride arrays seem both logical and desirable.  They do seem just
right for a simpler (and substantially generalized) nd_grid.  But I
think they only partially address two other problems that have come up.

1. repmat
http://www.mathworks.com/access/helpdesk/help/techdoc/ref/repmat.html#998661
Perhaps you will say that the best representation of a repmat will use
4 dimensions, two with zero strides?  I wonder if a more general
cycling is needed for a natural repeated matrix.

2. Kronecker product:
http://www.mathworks.com/access/helpdesk/help/techdoc/ref/kron.html#998881
This seems a different issue altogether.  I suspect the right way to
produce kron(x,y) is usually as a class whose data is x and y, with the
Kronecker product never actually stored in memory.  I do not see
zero-stride arrays as helping here.

Cheers,
Alan Isaac

From ndarray at mac.com Mon Feb 27 16:13:07 2006
From: ndarray at mac.com (Sasha)
Date: Mon Feb 27 16:13:07 2006
Subject: [Numpy-discussion] Faster fill
Message-ID: 

I've just posted a patch at
http://projects.scipy.org/scipy/numpy/wiki/PossibleOptimizationAreas
that results in a 6-14x speed-up of ndarray.fill for simple datatypes.
That's a bigger change than what I am comfortable submitting to svn
without a review.  Also, since I am not familiar with record arrays, I
am not sure whether or not this change would break anything in that
area.  Finally, the patch only addresses the single-segment case;
"strided" arrays would still use the old code.

From oliphant at ee.byu.edu Mon Feb 27 16:22:20 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Mon Feb 27 16:22:20 2006
Subject: [Numpy-discussion] missing array type
In-Reply-To: 
References: 
Message-ID: <440397A1.9020607@ee.byu.edu>

Sasha wrote:

>It looks like what you want is a zero-stride array that I proposed
>some time ago.  See "zeros in strides" thread
>.
>
>I've posted a patch to the list, but it was met with a mild opposition
>from Travis, so I've never committed it to SVN.  The final word was:
>
>"""
>I would also like to get more opinions about Sasha's proposal for
>zero-stride arrays.
>
>-Travis
>"""
>
>If you agree that zero-stride arrays would provide the functionality
>that you need, it may tip the balance towards accepting that patch.
>

Actually, I think it does.  I think 0-stride arrays are acceptable (I
think you can make them now, but you have to provide your own memory,
right?)

From one perspective, all we are proposing to do is allow numpy to
create the memory *and* allow differently strided arrays, right?

Now, if it creates the memory, the strides must be C-contiguous or
fortran-contiguous.  We are going to allow user-specified strides, now,
even on memory creation.

Sasha, your initial patch was pretty good, but I was concerned about
the speed of array creation being changed for other cases.  If you can
speed up PyArray_NewFromDescr (probably by only changing the branch
that currently raises an error), then I think your proposed changes
should be O.K.

The check on the provided strides argument needs to be thought through
so that we don't accept strides that will allow walking outside the
memory that *is* or *will be* allocated.

I have not reviewed your code for this, but I'm assuming you've thought
that through?

-Travis
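For readers trying to picture what a zero-stride array buys: in
present-day NumPy the effect can be previewed with the stride-tricks
helper.  This helper is not part of the 2006 codebase under discussion;
the sketch is purely illustrative:

import numpy as np
from numpy.lib.stride_tricks import as_strided

x = np.arange(3)                       # the only data actually stored
# A repmat-like view: 4 "rows" that all alias the same 3 elements.
r = as_strided(x, shape=(4, 3), strides=(0, x.strides[0]))
assert r.sum() == 4 * x.sum()          # 12, from 3 stored elements

Note that writing through such a view touches the shared elements, so
it is only safe for read-only use, which is exactly the in-place hazard
raised later in this thread.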
From oliphant at ee.byu.edu Mon Feb 27 16:44:02 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Mon Feb 27 16:44:02 2006
Subject: [Numpy-discussion] Faster fill
In-Reply-To: 
References: 
Message-ID: <44039CA8.4080102@ee.byu.edu>

Sasha wrote:

>I've just posted a patch at
>http://projects.scipy.org/scipy/numpy/wiki/PossibleOptimizationAreas
>that results in a 6-14x speed-up of ndarray.fill for simple datatypes.
>That's a bigger change than what I am comfortable submitting to svn
>without a review.  Also, since I am not familiar with record arrays, I
>am not sure whether or not this change would break anything in that
>area.  Finally, the patch only addresses the single-segment case;
>"strided" arrays would still use the old code.
>

This looks like a good procedure.

For those less familiar with the code: Sasha added an additional
"data-type function" to the structure of data-type-specific functions.
Each builtin data-type has a pointer to a list of functions.  A while
ago I moved these functions out from the data-type object so that they
could grow.

This is probably a good fundamental operation for speeding up.

I would probably also not define the misaligned case, but just use the
default code for that case as well.  This is consistent with the idea
that misaligned data will have slower operations.

To make your code work for record arrays you need to handle the VOID
case.  I would just not define it for the void case at this point.
Several other data-type functions need to be improved to handle record
arrays better anyway (look at setitem and getitem for guidance).

You also need to add a check so that if the function pointer is NULL,
the optimized function is not called.

But, generally, this is the right use of the data-type functions.

Good job.

-Travis

From jh at oobleck.astro.cornell.edu Mon Feb 27 18:24:05 2006
From: jh at oobleck.astro.cornell.edu (Joe Harrington)
Date: Mon Feb 27 18:24:05 2006
Subject: [Numpy-discussion] wiki page for record arrays
In-Reply-To: <20060221145310.A69A912C2A@sc8-sf-spam2.sourceforge.net>
	(numpy-discussion-request@lists.sourceforge.net)
References: <20060221145310.A69A912C2A@sc8-sf-spam2.sourceforge.net>
Message-ID: <200602280223.k1S2NZL3020466@oobleck.astro.cornell.edu>

I fixed a small error.

I found myself a bit lost.  Are cookbook pages supposed to be
introductory, or are they aimed at users who already know a fair bit?
From the page, I can vaguely gather that record arrays allow you to
access your data using text strings as partial indexers, but I found
myself focusing so much on testing whether this was true and figuring
out how record arrays work that I was completely distracted from the
example.

A paragraph or two at the top explaining what a record array is, why
it's useful, and what the basic properties are would be good.  Then
give the example.
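One possible minimal example of the kind Joe asks for (the field names
are made up for illustration); it stores the RGB channels as bytes,
which also speaks to the point raised next:

import numpy as N

# Three pixels as named records: 8-bit RGB channels plus a float weight.
img = N.array([(255, 0, 0, 0.50),
               (0, 255, 0, 1.00),
               (0, 0, 255, 0.25)],
              dtype=[('r', N.uint8), ('g', N.uint8),
                     ('b', N.uint8), ('weight', N.float64)])

print img['r']     # all red values by field name: [255   0   0]
print img[0]       # one whole record: (255, 0, 0, 0.5)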
Also, it's less than intuitive why you are storing RGB values (usually thought of as integers in the range 0-255 or 0x00 - 0xff) in 32-bit floating-point numbers.

--jh--

From ndarray at mac.com Mon Feb 27 18:36:02 2006
From: ndarray at mac.com (Sasha)
Date: Mon Feb 27 18:36:02 2006
Subject: [Numpy-discussion] missing array type
In-Reply-To: <440397A1.9020607@ee.byu.edu>
References: <440397A1.9020607@ee.byu.edu>
Message-ID:

On 2/27/06, Travis Oliphant wrote:
> .... I think 0-stride arrays are acceptable (I
> think you can make them now but you have to provide your own memory,
> right?)

Not really. The ndarray constructor has never allowed zeros in strides. It was possible to set strides to a tuple containing zeros after construction in some cases. I've changed that in r2054. Currently zero strides are not allowed.

> From one perspective, all we are proposing to do is allow numpy to
> create the memory *and* allow differently strided arrays, right?

Another way to view this is that we are proposing to change the computation of memory requirements to consider strides instead of item size and number of items. Zero strides require only one item to be allocated for any number of items in the array.

> Now, if it creates the memory, the strides must be C-contiguous or
> Fortran-contiguous. We are going to allow user-specified strides now,
> even on memory creation.

Yes.

> Sasha, your initial patch was pretty good but I was concerned about the
> speed of array creation being changed for other cases. If you can speed
> up PyArray_NewFromDescr (probably by only changing the branch that
> currently raises an error), then I think your proposed changes should be
> O.K.

I will probably not be able to do it right away. Meanwhile I've created a wiki page for this mini-project.

> The check on the provided strides argument needs to be thought through
> so that we don't accept strides that will allow walking outside the
> memory that *is* or *will be* allocated.
>
> I have not reviewed your code for this, but I'm assuming you've thought
> that through?

That was the central issue in the patch: how to compute the size of the buffer in the presence of zero strides, so I hope I got it right.

In order to make zero-stride arrays really useful, they should survive transformation by ufunc. With my patch, if x is a zero-stride array of length N, then exp(x) is a regular array and exp is called N times to compute the result. That would be a much bigger project. As a first step, I would just disallow using zero-stride arrays as output to avoid problems with inplace operations.

In any case, everyone interested in this feature is invited to edit the wiki page.

From cjw at sympatico.ca Mon Feb 27 19:23:04 2006
From: cjw at sympatico.ca (Colin J. Williams)
Date: Mon Feb 27 19:23:04 2006
Subject: [Numpy-discussion] subclassing ndarray
In-Reply-To: <44035227.9010609@ee.byu.edu>
References: <43FF2C92.3060304@sympatico.ca> <20060224191426.GD21117@alpha> <43FFB580.80307@ieee.org> <20060225083924.GF21117@alpha> <44035227.9010609@ee.byu.edu>
Message-ID: <4403C20A.5060603@sympatico.ca>

Travis Oliphant wrote:

> Stefan van der Walt wrote:
>
>>> The __init__ and __new__ methods are not called because they may
>>> have arbitrary signatures. Instead, the __array_finalize__ method
>>> is always called. So, you should use that instead of __init__.
>
> This is now true in SVN. Previously, __array_finalize__ was not
> called if the "parent" was NULL. However, now, it is still called
> with None as the value of the first argument.
> Thus __array_finalize__ will be called whenever
> ndarray.__new__(<some subclass>, ...) is called.

Why this change in style from the common Python idiom of __new__, __init__ with the same signature to __new__, __array_finalize__ with possibly different signatures?

Incidentally, what are the signatures? The doc string is empty:

    [Dbg]>>> _n.ndarray.__array_finalize__.__doc__
    [Dbg]>>>

Colin W.

From cjw at sympatico.ca Mon Feb 27 19:30:00 2006
From: cjw at sympatico.ca (Colin J. Williams)
Date: Mon Feb 27 19:30:00 2006
Subject: [Numpy-discussion] missing array type
In-Reply-To:
References:
Message-ID: <4403C3AB.1040600@sympatico.ca>

Sasha wrote:

> It looks like what you want is a zero-stride array that I proposed
> some time ago. See "zeros in strides" thread.
>
> I've posted a patch to the list, but it was met with a mild opposition
> from Travis, so I've never committed it to SVN. The final word was:
>
> """
> I would also like to get more opinions about Sasha's proposal for
> zero-stride arrays.
>
> -Travis
> """
>
> If you agree that zero-stride array would provide the functionality
> that you need, it may tip the balance towards accepting that patch.

(-1) The proposal adds complexity. I don't see the compensating benefit.

Colin W.

> On 2/27/06, Alan G Isaac wrote:
>
> > The recent discussion of Matlab's repmat
> > plus some recent use of grids leads me to ask:
> > should numpy contain a true repeated array object,
> > where one single copy of the data supports
> > a full set of array operations? (So that,
> > e.g., repmat(x,(3,2)) would simply point
> > to the x data or, if desired, make a single
> > copy of it.)
> >
> > And actually this seems a special case of
> > a truly space-saving Kronecker product.
> >
> > Cheers,
> > Alan Isaac

From ndarray at mac.com Mon Feb 27 19:43:05 2006
From: ndarray at mac.com (Sasha)
Date: Mon Feb 27 19:43:05 2006
Subject: [Numpy-discussion] missing array type
In-Reply-To: <4403C3AB.1040600@sympatico.ca>
References: <4403C3AB.1040600@sympatico.ca>
Message-ID:

On 2/27/06, Colin J. Williams wrote:
> ...
> (-1) The proposal adds complexity. I don't see the compensating benefit.

Did you read the original thread? I thought I clearly explained what the benefit was. Added complexity is minimal because at the C level zero-stride arrays are already possible. All I was proposing was to safely expose existing C level functionality to Python.
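For readers unfamiliar with the idea: a zero stride makes every index alias the same memory, so an arbitrarily long array costs one item of storage. A minimal sketch using numpy.lib.stride_tricks.as_strided (which postdates this thread) in place of the proposed constructor:

    import numpy as np
    from numpy.lib.stride_tricks import as_strided

    base = np.array([7])
    x = as_strided(base, shape=(5,), strides=(0,))  # five "elements", one item of memory
    print(x)        # [7 7 7 7 7]
    base[0] = 9     # writing the single backing item changes every element
    print(x)        # [9 9 9 9 9]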
From ndarray at mac.com Mon Feb 27 21:11:06 2006
From: ndarray at mac.com (Sasha)
Date: Mon Feb 27 21:11:06 2006
Subject: [Numpy-discussion] Faster fill
In-Reply-To: <44039CA8.4080102@ee.byu.edu>
References: <44039CA8.4080102@ee.byu.edu>
Message-ID:

On 2/27/06, Travis Oliphant wrote:
> Sasha wrote:
>
> > I've just posted a patch at
> > http://projects.scipy.org/scipy/numpy/wiki/PossibleOptimizationAreas
> ...
> This looks like a good procedure.

On second thought, it may be better to commit experimental changes to a branch in svn and merge to the trunk after review. What do you think?

> I would probably also not define the misaligned case, but just use the
> default code for that case as well. This is consistent with the idea
> that misaligned data will have slower operations.

It turns out my approach does not speed up the misaligned case. I've posted a new patch that incorporates your suggestions. (Note the change in location.)

> You also need to add a check so that if the function pointer is NULL,
> the optimized function is not called.

Done in the new patch.

The patch passes numpy.test(10), but I don't think it tests ndarray.fill in any meaningful way. I will probably need to add some tests.

From oliphant.travis at ieee.org Mon Feb 27 21:43:02 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Mon Feb 27 21:43:02 2006
Subject: [Numpy-discussion] Faster fill
In-Reply-To:
References: <44039CA8.4080102@ee.byu.edu>
Message-ID: <4403E2BE.1090801@ieee.org>

Sasha wrote:

> On second thought, it may be better to commit experimental changes
> to a branch in svn and merge to the trunk after review. What do you
> think?

This is always possible. It really depends on how significant the changes are. These changes are somewhat isolated to a single bit of functionality (adding a new item to the function structure pointed to by each PyArray_Descr shouldn't change anything else). As long as svn compiles and passes all tests, I think it can be merged directly. I see branches as being needed when a feature requires more testing and it is less clear how invasive the changes will be. In this case, I would say go ahead and apply the feature directly.

> The patch passes numpy.test(10), but I don't think it tests
> ndarray.fill in any meaningful way. I will probably need to add some
> tests.

Tests are always good. In fact, it's an easy way for someone to contribute, since there are a lot of features I have only tested using examples in my book (the book examples serve as an additional set of tests that I regularly run).

-Travis

From zpincus at stanford.edu Mon Feb 27 21:53:02 2006
From: zpincus at stanford.edu (Zachary Pincus)
Date: Mon Feb 27 21:53:02 2006
Subject: [Numpy-discussion] Simplest ndarray subclass __new__ possible?
In-Reply-To: <44724794-B6E1-4927-8D63-F3BC74F2CD10@stanford.edu>
References: <39FC8FA0-48DE-408D-8AD8-DDF72474F1C1@stanford.edu> <4402F958.7010902@sympatico.ca> <44724794-B6E1-4927-8D63-F3BC74F2CD10@stanford.edu>
Message-ID: <80583703-75EB-4503-90A5-F4EEBE09CCEF@stanford.edu>

Hi again.

I would like to put together a wiki page about writing ndarray subclasses because this is obviously a difficult topic, and the available documentation (e.g. looking at defmatrix) doesn't cover all -- or even the most common -- uses. As such, I am trying to put together a "skeleton" ndarray subclass which has all the basic features (a __new__ that allows for direct construction of such objects from other data objects, and propagation of simple attributes in __array_finalize__).
Right now I am trying to figure out what the minimum complement of things that need to go into such a __new__ method is. Below is my first effort, derived from defmatrix. Any help identifying parts of this code that are unnecessary, or parts that need to be added, would directly result in a better wiki page once I figure everything out.

Zach

> What follows is what I have so far. Have I missed anything, or can
> anything else be removed?
>
> Zach
>
> class contour(numpy.ndarray):
>     def __new__(subtype, data, dtype=None, copy=True):
>
>         ##### Do I need this first if block?
>         ##### Wouldn't the second block do fine on its own?
>         if isinstance(data, contour):
>             dtype2 = data.dtype
>             if (dtype is None):
>                 dtype = dtype2
>             if (dtype2 == dtype) and (not copy):
>                 return data
>             return data.astype(dtype)
>
>         if isinstance(data, numpy.ndarray):
>             if dtype is None:
>                 intype = data.dtype
>             else:
>                 intype = numpy.dtype(dtype)
>             new = data.view(contour)
>             if intype != data.dtype:
>                 return new.astype(intype)
>             if copy: return new.copy()
>             else: return new
>
>         # now convert data to an array
>         arr = numpy.array(data, dtype=dtype, copy=copy)
>
>         ##### Do I need this if block?
>         if not (arr.flags.fortran or arr.flags.contiguous):
>             arr = arr.copy()
>
>         ##### Do I need the fortran flag?
>         ret = numpy.ndarray.__new__(subtype, arr.shape, arr.dtype,
>                                     buffer=arr, fortran=arr.flags.fortran)
>         return ret

From oliphant.travis at ieee.org Mon Feb 27 22:02:04 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Mon Feb 27 22:02:04 2006
Subject: [Numpy-discussion] missing array type
In-Reply-To:
References: <440397A1.9020607@ee.byu.edu>
Message-ID: <4403E72E.4040007@ieee.org>

Sasha wrote:

> On 2/27/06, Travis Oliphant wrote:
>
> > .... I think 0-stride arrays are acceptable (I
> > think you can make them now but you have to provide your own memory,
> > right?)
>
> Not really. The ndarray constructor has never allowed zeros in strides. It
> was possible to set strides to a tuple containing zeros after
> construction in some cases. I've changed that in r2054. Currently
> zero strides are not allowed.

Ah, right. It was only possible to do it from C code --- but it is possible there. Since Colin has expressed some reservations, it's probably a good idea to continue the discussion before doing anything.

One issue I have with zero-stride arrays is that this is essentially what broadcasting is all about. Recently there has been a discussion about bringing repmat functionality over. The repmat function is used in some array languages largely because there is no such thing as broadcasting and arrays are not ND. Perhaps what is desired instead is that, rather than playing games with indexing on a two-dimensional array, you simply define the appropriate 4-dimensional array. Currently you can define the size of the new dimensions to be 1 and they will act like 0-strided arrays when you operate with other arrays of any desired shape. Zero-strided arrays are actually quite fundamental to the notion of broadcasting.

[Soap Box]
I've been annoyed for several years that the idea of linear operators is constrained in most libraries to 2 dimensions. There are many times I want to find an inverse of an operator that is most naturally expressed with 6 dimensions. I have to play games with indexing myself to give the computer a matrix it can understand. Why is that? I think the computer should be doing the work of raveling and unraveling those indices for me. I think we have the opportunity in NumPy/SciPy to be much more general.
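The sort of thing being described can be sketched with numpy.linalg.tensorsolve (added to numpy after this thread), which does the raveling and unraveling internally:

    import numpy as np

    # an operator on 3-d arrays, expressed naturally with 6 indices
    A = np.random.rand(2, 3, 4, 2, 3, 4)
    b = np.random.rand(2, 3, 4)
    x = np.linalg.tensorsolve(A, b)                    # reshapes to a 24x24 system internally
    print(np.allclose(np.tensordot(A, x, axes=3), b))  # True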
The "index-raveling" that so many people have become conditioned to think is necessary could and should be handled by a tensor class. If you've ever written finite-element code you should know exactly what I mean.
[End Soap Box]

On the one hand, we could just tell people to try and use broadcasting so that zero-strided arrays show up in Python in definitive ways. On the other, we can just expose the power of zero-strided arrays to Python and let people come up with their own rules. I lean toward giving people the capability and letting them show me what it can do.

The only thing controversial, I think, is the behavior of outputs of ufuncs for zero-strided arrays. Currently ufunc outputs always have full strides unless an output array is given. Changing this default behavior would require some justification (not to mention some code tweaking). I'm not immediately inclined to change it even if zero-strided arrays are allowed to be created from Python.

> In order to make zero-stride arrays really useful, they should survive
> transformation by ufunc. With my patch, if x is a zero-stride array of
> length N, then exp(x) is a regular array and exp is called N times to
> compute the result. That would be a much bigger project. As a first
> step, I would just disallow using zero-stride arrays as output to
> avoid problems with inplace operations.

Hmm.. Could you show us again what you mean by these problems and the better behavior that could happen if ufuncs were changed?

-Travis

From oliphant.travis at ieee.org Mon Feb 27 22:13:03 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Mon Feb 27 22:13:03 2006
Subject: [Numpy-discussion] subclassing ndarray
In-Reply-To: <4403C20A.5060603@sympatico.ca>
References: <43FF2C92.3060304@sympatico.ca> <20060224191426.GD21117@alpha> <43FFB580.80307@ieee.org> <20060225083924.GF21117@alpha> <44035227.9010609@ee.byu.edu> <4403C20A.5060603@sympatico.ca>
Message-ID: <4403E9CA.8050506@ieee.org>

> Travis Oliphant wrote:
>
>> Stefan van der Walt wrote:
>>
>>>> The __init__ and __new__ methods are not called because they may
>>>> have arbitrary signatures. Instead, the __array_finalize__ method
>>>> is always called. So, you should use that instead of __init__.
>>
>> This is now true in SVN. Previously, __array_finalize__ was not
>> called if the "parent" was NULL. However, now, it is still called
>> with None as the value of the first argument.
>>
>> Thus __array_finalize__ will be called whenever
>> ndarray.__new__(<some subclass>, ...) is called.
>
> Why this change in style from the common Python idiom of __new__,
> __init__ with the same signature to __new__, __array_finalize__ with
> possibly different signatures?

I don't see it as a change in style but as adding a capability to ndarray subclasses. The problem is that arrays can be created in many ways (slicing, ufuncs, etc.). Not all of these ways should go through the __new__/__init__-style creation mechanism.

Try inheriting from the float builtin and adding attributes. Then add your float to an instance of your new class and see what happens. You will get a plain float out. This is the essence of Paul's insight that sub-classing is rarely useful, because you end up having to re-define all the operators anyway to return the value that you want. He knows whereof he speaks as well, because he wrote MA and UserArray and had experience with Python sub-classing.
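The float experiment is easy to reproduce; this sketch (names invented for illustration) shows both the subclass and its attribute vanishing under arithmetic:

    class tagged_float(float):
        def __new__(cls, value, tag=None):
            obj = float.__new__(cls, value)
            obj.tag = tag
            return obj

    t = tagged_float(1.5, tag="meters")
    r = t + 2.0                 # float.__add__ returns a plain float
    print(type(r))              # <class 'float'> -- the subclass is lost
    print(hasattr(r, 'tag'))    # False -- and so is the attribute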
I wanted a mechanism to make it easier to sub-class arrays and have the operators return your object if possible (including all of its attributes). Thus,

    __array_priority__ (a floating-point attribute)
    __array_finalize__ (a method called on internal construction of the array wrapper)

were invented (along with __array_wrap__, which any class can define to have its objects survive ufuncs). It was easy enough to see where to call __array_finalize__ in the C code, if somewhat difficult to explain (and to get exception handling to work, because of my initial over-thinking). The signature is just

    __array_finalize__(self, parent): return

i.e. any return value is ignored (but exceptions are caught). I've used the feature successfully on at least 3 subclasses (chararray, memmap, and matrix) and so I'm actually pretty happy with it.

__new__ and __init__ are still relevant for constructing your brand-new object. The __array_finalize__ function is just what the internal constructor that actually allocates memory will always call to let you set final attributes *every* time your sub-class gets created.

-Travis

From zpincus at stanford.edu Mon Feb 27 22:31:01 2006
From: zpincus at stanford.edu (Zachary Pincus)
Date: Mon Feb 27 22:31:01 2006
Subject: [Numpy-discussion] subclassing ndarray
In-Reply-To: <4403E9CA.8050506@ieee.org>
References: <43FF2C92.3060304@sympatico.ca> <20060224191426.GD21117@alpha> <43FFB580.80307@ieee.org> <20060225083924.GF21117@alpha> <44035227.9010609@ee.byu.edu> <4403C20A.5060603@sympatico.ca> <4403E9CA.8050506@ieee.org>
Message-ID: <8626BCED-0630-47E0-8DB8-C225579EA9C6@stanford.edu>

> __array_priority__ (a floating-point attribute)
> __array_wrap__ (which any class can define to have its objects
> survive ufuncs)

What do these do again? That is, how are they used internally? What happens if they're not used?

If I (or anyone else) is to put together a wiki page about this (see the other thread I just emailed to, please), getting good concise descriptions of what ndarray subclasses need to do/can do would be very helpful.

Zach

From oliphant.travis at ieee.org Mon Feb 27 22:33:00 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Mon Feb 27 22:33:00 2006
Subject: [Numpy-discussion] Simplest ndarray subclass __new__ possible?
In-Reply-To: <80583703-75EB-4503-90A5-F4EEBE09CCEF@stanford.edu>
References: <39FC8FA0-48DE-408D-8AD8-DDF72474F1C1@stanford.edu> <4402F958.7010902@sympatico.ca> <44724794-B6E1-4927-8D63-F3BC74F2CD10@stanford.edu> <80583703-75EB-4503-90A5-F4EEBE09CCEF@stanford.edu>
Message-ID: <4403EE5C.709@ieee.org>

Zachary Pincus wrote:
> Hi again.
>
> I would like to put together a wiki page about writing ndarray
> subclasses because this is obviously a difficult topic, and the
> available documentation (e.g. looking at defmatrix) doesn't cover all
> -- or even the most common -- uses.

Great. Let me see if I can help out with some concepts.

First of all, you should probably mention that UserArray might be what people want. UserArray is a standard "container" class, meaning that the array is just one of its attributes and UserArray doesn't inherit from the array. I finally realized that even though you can "directly" inherit from the ndarray, sometimes that is not what you want to do; instead you want a capable container class. I think multiple inheritance with other builtins is a big reason to want this feature, as was pointed out on the list a few days ago.
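Pulling the pieces above together, a sketch of the attribute-propagation pattern (modern numpy; the class and attribute names are invented):

    import numpy as np

    class info_array(np.ndarray):
        def __new__(cls, data, info=None):
            arr = np.asarray(data).view(cls)
            arr.info = info
            return arr

        def __array_finalize__(self, parent):
            # called for every new info_array, including slices and views;
            # parent is None when the array is constructed from scratch
            self.info = getattr(parent, 'info', None)

    a = info_array([1, 2, 3], info='calibrated')
    b = a[1:]                          # slicing passes through __array_finalize__
    print(type(b).__name__, b.info)    # info_array calibrated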
The biggest reason to inherit from the ndarray is if you want your object to satisfy the isinstance(obj, ndarray) test...

Inheriting from the ndarray is as simple as

    class myarray(numpy.ndarray):
        pass

Now, your array will be constructed. To get your arrays you can either use the standard array constructor

    array([1,2,3,4]).view(myarray)

or use the new myarray constructor --- which now has the same signature as ndarray.__new__. For several reasons (probably not any good ones, though :-) ), the default ndarray signature is built for "wrapping" around memory (exposed by another object through the buffer protocol in Python) or for creating uninitialized new memory.

So, if you over-write the __new__ constructor for your class and want to call the ndarray.__new__ constructor, you have to realize that you need to think of what you are doing in terms of "wrapping" some other created piece of memory or "initializing your memory".

If you want your array to be able to "convert" arbitrary objects to arrays instead, then your constructor could in fact be as simple as

    class myarray(numpy.ndarray):
        def __new__(cls, obj):
            return numpy.array(obj).view(cls)

Then, if you want, you can define an __init__ method to handle setting of attributes --- however, if you set some attributes, then you need to think about what you want to happen when your new array gets "sliced" or added to. Because the internal code will create your new array (without calling __new__) and then call

    __array_finalize__(self, parent)

where parent could be None (if there is no parent --- i.e. this is a new array). Any attributes you define should also be defined here so they get passed on to all arrays that are created.

I hope this helps some.

-Travis

From zpincus at stanford.edu Tue Feb 28 01:28:04 2006
From: zpincus at stanford.edu (Zachary Pincus)
Date: Tue Feb 28 01:28:04 2006
Subject: [Numpy-discussion] Simplest ndarray subclass __new__ possible?
In-Reply-To: <4403EE5C.709@ieee.org>
References: <39FC8FA0-48DE-408D-8AD8-DDF72474F1C1@stanford.edu> <4402F958.7010902@sympatico.ca> <44724794-B6E1-4927-8D63-F3BC74F2CD10@stanford.edu> <80583703-75EB-4503-90A5-F4EEBE09CCEF@stanford.edu> <4403EE5C.709@ieee.org>
Message-ID: <135C7D6D-F504-425D-BD5D-13574184AEBA@stanford.edu>

Thanks Travis, I think I'm getting a hold on most of what's going on. The __array_priority__ bit remains a bit opaque (can anyone offer guidance?), and I still have some questions about why the __new__ of the matrix subclass has so much complexity.

> So, if you over-write the __new__ constructor for your class and
> want to call the ndarray.__new__ constructor, you have to realize
> that you need to think of what you are doing in terms of "wrapping"
> some other created piece of memory or "initializing your memory".

This makes good sense.

> If you want your array to be able to "convert" arbitrary objects to
> arrays instead, then your constructor could in fact be as simple as
>
>     class myarray(numpy.ndarray):
>         def __new__(cls, obj):
>             return numpy.array(obj).view(cls)

Ok, gotcha. However, the matrix class's __new__ has a lot more complexity. Is *all* of the complexity there in the service of ensuring that matrices are only 2d? It seems like there's more going on there than just that...

> Then, if you want, you can define an __init__ method to handle
> setting of attributes --- however, if you set some attributes, then
> you need to think about what you want to happen when your new array
> gets "sliced" or added to.
> Because the internal code will create
> your new array (without calling __new__) and then call
>
>     __array_finalize__(self, parent)
>
> where parent could be None (if there is no parent --- i.e. this is
> a new array).
>
> Any attributes you define should also be defined here so they get
> passed on to all arrays that are created.

All of this also makes sense.

> I hope this helps some.
>
> -Travis

From stefan at sun.ac.za Tue Feb 28 01:57:07 2006
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Tue Feb 28 01:57:07 2006
Subject: [Numpy-discussion] wiki page for record arrays
In-Reply-To: <200602280223.k1S2NZL3020466@oobleck.astro.cornell.edu>
References: <20060221145310.A69A912C2A@sc8-sf-spam2.sourceforge.net> <200602280223.k1S2NZL3020466@oobleck.astro.cornell.edu>
Message-ID: <20060228095607.GB7085@sun.ac.za>

Hi Joe,

On Mon, Feb 27, 2006 at 09:23:35PM -0500, Joe Harrington wrote:
> I fixed a small error. I found myself a bit lost. Are cookbook pages
> supposed to be introductory, or are they aimed at users who already
> know a fair bit? From the page, I can vaguely get that record arrays
> allow you to access your data using text strings as partial indexers,
> but I found myself focusing so much on testing whether this was true
> and figuring out how they work that I was completely distracted from
> the example. A paragraph or two at the top explaining what a record
> array is, why it's useful, and what the basic properties are would be
> good. Then give the example. Also, it's less than intuitive why you
> are storing RGB values (usually thought of as integers in the range
> 0-255 or 0x00 - 0xff) in 32-bit floating-point numbers.

Thanks for the feedback. Of course, any cookbook should aim to be as simple as possible. I wrote this as I figured out record arrays, and proposed it as a starting point -- so feel free to improve it as you see fit.

While <= 8-bit images can be stored as integers in [0-255], it is common to use floating point numbers in [0-1] for any depth image.

Regards
Stéfan

From ndarray at mac.com Tue Feb 28 04:32:05 2006
From: ndarray at mac.com (Sasha)
Date: Tue Feb 28 04:32:05 2006
Subject: [Numpy-discussion] missing array type
In-Reply-To: <4403E72E.4040007@ieee.org>
References: <440397A1.9020607@ee.byu.edu> <4403E72E.4040007@ieee.org>
Message-ID:

On 2/28/06, Travis Oliphant wrote:
> Hmm.. Could you show us again what you mean by these problems and the
> better behavior that could happen if ufuncs were changed?

From my original post:

"""
3. Fix augmented assignment operators. Currently:

>>> x = zeros(5)
>>> x.strides = 0
>>> x += 1
>>> x
array([5, 5, 5, 5, 5])
>>> x += arange(5)
>>> x
array([15, 15, 15, 15, 15])

Desired:

>>> x = zeros(5)
>>> x.strides = 0
>>> x += 1
>>> x
array([1, 1, 1, 1, 1])
>>> x += arange(5)
>>> x
array([1, 2, 3, 4, 5])
"""

From zpincus at stanford.edu Tue Feb 28 04:36:01 2006
From: zpincus at stanford.edu (Zachary Pincus)
Date: Tue Feb 28 04:36:01 2006
Subject: [Numpy-discussion] can't resize ndarray subclass
Message-ID: <1DED40B2-E761-4B8E-86DF-6F9C36242D0C@stanford.edu>

Thus far, it seems like the way to get instances of ndarray subclasses is to use the 'view(subclass)' method on a proper ndarray, either in the subclass __new__, or after constructing the array.

However, subclass instances so created *do not own their own data*, as far as I can tell. They are just "views" on another array's data. This means that these objects can't be resized, or have other operations performed on them which require 'owning' the data buffer.
E.g.:

    class f(numpy.ndarray):
        def __new__(cls, *p, **kw):
            return numpy.array(*p, **kw).view(cls)

    f([1,2,3]).resize((10,))
    ValueError: cannot resize this array: it does not own its data

    numpy.array([1,2,3]).view(f).resize((10,))
    ValueError: cannot resize this array: it does not own its data

    numpy.array([1,2,3]).resize((10,))
    (no problem)

Is there another way to create ndarray subclasses which do own their own data?

Note that numpy.resize(f([1,2,3]), (10,)) works fine. But this isn't the same as having an object that owns its data. Specifically, there's just no way to have an ndarray subclass resize itself as a result of calling a method if that object doesn't own its data. (Imagine a variable-resolution polygon type that could interpolate or decimate vertices as needed: such a class would need to resize itself.)

Zach

From Chris.Barker at noaa.gov Tue Feb 28 09:28:07 2006
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Tue Feb 28 09:28:07 2006
Subject: [Numpy-discussion] missing array type
In-Reply-To: <4403E72E.4040007@ieee.org>
References: <440397A1.9020607@ee.byu.edu> <4403E72E.4040007@ieee.org>
Message-ID: <44048816.5020707@noaa.gov>

Travis Oliphant wrote:
> [Soap Box]
> I've been annoyed for several years that the idea of linear operators is
> constrained in most libraries to 2 dimensions. There are many times I
> want to find an inverse of an operator that is most naturally expressed
> with 6 dimensions.

Yes, yes, yes! "numpy is not matlab"

One of the things I love most about numpy is that it is an n-d array package, NOT a matrix package. I also love broadcasting. Similar to Travis, I was recently helping out a friend using Matlab for a graduate structural mechanics course. The machinations required to shoehorn the natural tensor math into 2-d matrices were pretty ugly indeed. I'd much rather see numpy encourage the use of higher-dimension arrays and broadcasting over traditional 2-d matrix solutions. However....

> I have to play games with indexing myself to give
> the computer a matrix it can understand. Why is that?

One of the reasons is that we want to use other people's already-optimized code (i.e. LAPACK). Those routines only work with 2-d data structures. I suppose we could do the translation to the LAPACK data structures under the hood, but that would take some work.

However, this makes me wonder.... I'm unclear on the details, but from what I understand of the post that started this thread, repmat is used in order to turn some operations into standard linear algebra operations, and that's done for performance purposes. The repmat matrix would therefore need to be in a form usable by LAPACK and friends, and thus would need to be dense anyway ... a zero-stride array would not work, so maybe the potential advantages of the compact storage wouldn't really be realized (until we write our own LAPACK).

This also brings me to...

Sasha wrote:
> Desired:
> >>> x = zeros(5)
> >>> x.strides = 0
> >>> x += 1
> >>> x
> array([1, 1, 1, 1, 1])
> >>> x += arange(5)
> >>> x
> array([1, 2, 3, 4, 5])

So what the heck is a zero-strided array? My understanding was that the whole point was that what looked like multiple values was really a single, shared value. In this case, it shouldn't be possible to in-place add more than one value. I wouldn't say that what Sasha presented as "desired" is desired... an in-place operation shouldn't fundamentally change the nature of the array. That array should ALWAYS remain single-valued. So what should the result of x += arange(5) be?
I say it should raise an exception.

Maybe zero-stride arrays are only really useful read-only?

This is a complicated can of worms.....

-Chris

--
Christopher Barker, Ph.D.
Oceanographer
NOAA/OR&R/HAZMAT
(206) 526-6959 voice
7600 Sand Point Way NE
(206) 526-6329 fax
Seattle, WA 98115
(206) 526-6317 main reception
Chris.Barker at noaa.gov

From ndarray at mac.com Tue Feb 28 09:38:05 2006
From: ndarray at mac.com (Sasha)
Date: Tue Feb 28 09:38:05 2006
Subject: [Numpy-discussion] missing array type
In-Reply-To: <4403E72E.4040007@ieee.org>
References: <440397A1.9020607@ee.byu.edu> <4403E72E.4040007@ieee.org>
Message-ID:

Travis,

I've noticed that you changed the code to allow x.strides = 0, but it does not look like your changes allow creation of memory-saving zero-stride arrays:

>>> b = array([1])
>>> ndarray((5,), strides=(0,), buffer=b)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: buffer is too small for requested array

I would think memory-saving is the only justification for allowing zero strides. What use does your change enable?

From tim.hochberg at cox.net Tue Feb 28 09:53:03 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Tue Feb 28 09:53:03 2006
Subject: [Numpy-discussion] Re: Method to shift elements in an array?
In-Reply-To:
References:
Message-ID: <44048DCE.80408@cox.net>

Alan G Isaac wrote:

> Tim wrote:
>
>> import numpy
>> def roll(A, n):
>>     "Roll the array A in place. Positive n -> roll right, negative n -> roll left"
>>     if n > 0:
>>         n = abs(n)
>>         temp = A[-n:]
>>         A[n:] = A[:-n]
>>         A[:n] = temp
>>     elif n < 0:
>>         n = abs(n)
>>         temp = A[:n]
>>         A[:-n] = A[n:]
>>         A[-n:] = temp
>>     else:
>>         pass
>
> This probably counts as a gotcha:
>
> >>> a = N.arange(10)
> >>> temp = a[-6:]
> >>> a[6:] = a[:-6]
> >>> a[:6] = temp
> >>> a
> array([4, 5, 0, 1, 2, 3, 0, 1, 2, 3])

Ack! Right, those temp variables needed to be copies. That's why I added the caveat about only rolling a few elements, since otherwise it gets expensive. Then I forgot to make the copies in the code, doh!

-tim

> Cheers,
> Alan Isaac
>
> PS Here's something close to the rotater functionality.
>
> # rotater: rotate row elements
> #   Format:  y = rotater(x,r,copydata)
> #   Input:   x         RxC array
> #            rotateby  size R integer array, or integer (rotation amounts)
> #            inplace   boolean (default is False -> copies data)
> #   Output:  y         RxC array:
> #                      rows rotated by rotateby
> #                      or None (if inplace=True)
> #   Remarks: Intended for use with 2D arrays.
> #            rotateby values are positive for rightward rotation,
> #            negative for leftward rotation
> #   :author: Alan G Isaac (aisaac AT american DOT edu)
> #   :date: 24 Feb 2006
> def rotater(x,rotateby,inplace=False):
>     assert(len(x.shape)==2), "For 2-d arrays only."
>     xrotate = numpy.array(x,copy=(not inplace))
>     xrows = xrotate.shape[0]
>     # make an iterator of row shifts
>     if isinstance(rotateby,int):
>         from itertools import repeat
>         rowshifts = repeat(rotateby,xrows)
>     else:
>         rowshifts = numpy.asarray(rotateby)
>         assert(rowshifts.size==xrows)
>         rowshifts = rowshifts.flat
>     # perform rotation on each row (do nothing if rs==0)
>     for row in xrange(xrows):
>         rs = rowshifts.next()
>         if rs != 0:
>             xrotate[row] = numpy.concatenate([xrotate[row][-rs:],xrotate[row][:-rs]])
>     if inplace:
>         return None
>     else:
>         return xrotate

From ndarray at mac.com Tue Feb 28 10:38:03 2006
From: ndarray at mac.com (Sasha)
Date: Tue Feb 28 10:38:03 2006
Subject: [Numpy-discussion] missing array type
In-Reply-To: <44048852.3080701@sympatico.ca>
References: <4403C3AB.1040600@sympatico.ca> <440449B2.8060007@sympatico.ca> <4404680F.9020006@sympatico.ca> <44048852.3080701@sympatico.ca>
Message-ID:

On 2/28/06, Colin J. Williams wrote:
> ...
> >> Would it not be better to have def zeroes(..., zeroStrides=False):
> >> and a = zeros(..., zeroStrides=True)?
> >
> > This is equivalent to what I proposed: xzeros(shape) and xones(shape)
> > functions as a shorthand to ndarray(shape, strides=(0,)*len(shape))
>
> Wot, more names to remember? [sorry, I can't give you the graphic to go
> along with this. :-) ]

Oh, please - you don't have to be so emphatic. Your solution above (adding a zeroStrides parameter) would require a much more arbitrary name to remember. Even worse, someone will write zeros(shape, int, False, True) to mean zeros(shape, dtype=int, fortran=False, zeroStrides=True), and anyone reading that code will have to look up the manual to understand what each boolean means.

Boolean parameters are generally considered bad design. For example, it would be much better to have array(..., memory_layout='Fortran') instead of the current array(..., fortran=True). (Well, fortran=True is not that bad, but fortran=False is really puzzling - if it is not fortran, what is it?) Arguably an even better solution would be array(..., strides=fortran_strides(shape)), but that's a different story.

The names xzeros and xones are a natural choice because the functionality they provide is very similar to what xrange provides compared to range: a memory-saving way to achieve the same result.

From ndarray at mac.com Tue Feb 28 10:48:09 2006
From: ndarray at mac.com (Sasha)
Date: Tue Feb 28 10:48:09 2006
Subject: [Numpy-discussion] missing array type
In-Reply-To: <44048816.5020707@noaa.gov>
References: <440397A1.9020607@ee.byu.edu> <4403E72E.4040007@ieee.org> <44048816.5020707@noaa.gov>
Message-ID:

On 2/28/06, Christopher Barker wrote:
> ...
> So what should the result of x += arange(5) be? I say it should raise an
> exception.

Agree. That's what I was proposing as a feasible (as opposed to ideal) solution.
Ideally, x += [1,1,1,1,1] would be fine, but not x += [1,2,1,2,1]. I quoted too much from an earlier post.

> Maybe zero-stride arrays are only really useful read-only?

Maybe. It is hard to justify x[1] = 2 changing the result of x[0], but x[:] = 2 may still be ok.

> This is a complicated can of worms.....

Completely agree. That's why I made x.strides = 0 illegal some time ago. I don't think it is a good idea to bring it back without understanding all the consequences. If we allow it now, it will be harder to change the behavior of the result later. Someone's code will rely on x += ones(5) incrementing x five times for zero-stride x.

From zpincus at stanford.edu Tue Feb 28 11:30:04 2006
From: zpincus at stanford.edu (Zachary Pincus)
Date: Tue Feb 28 11:30:04 2006
Subject: [Numpy-discussion] can't resize ndarray subclass
In-Reply-To: <1DED40B2-E761-4B8E-86DF-6F9C36242D0C@stanford.edu>
References: <1DED40B2-E761-4B8E-86DF-6F9C36242D0C@stanford.edu>
Message-ID:

In answer to my previous question, to get an ndarray subclass that owns its own data, copy() must be called on the new "view" of the subclass. This makes sense and is reasonable.

However, a new problem has me nearly tearing my hair out. Calling the resize method on an instance of such a subclass works fine. However, calling a method that calls 'self.resize' breaks! And worse, it breaks in such a way that subsequent calls to resize also break. Check it out:

    class f(numpy.ndarray):
        def __new__(cls, obj):
            return numpy.array(obj).view(cls).copy()
        def expand(self):
            self.resize([self.shape[0] + 1, self.shape[1]])

    g = f([[1,2],[3,4]])
    g.resize([3,2])  # this works, thanks to the '.copy()' above

    g = f([[1,2],[3,4]])
    g.expand()  # just internally calls self.resize([3,2])
    ValueError: cannot resize an array that has been referenced or is
    referencing another array in this way. Use the resize function

    g.resize([3,2])  # this NOW DOES NOT WORK!
    ValueError: cannot resize an array that has been referenced or is
    referencing another array in this way. Use the resize function

Can anyone help? Please?

Zach

On Feb 28, 2006, at 4:35 AM, Zachary Pincus wrote:

> Thus far, it seems like the way to get instances of ndarray
> subclasses is to use the 'view(subclass)' method on a proper
> ndarray, either in the subclass __new__, or after constructing the
> array.
>
> However, subclass instances so created *do not own their own data*,
> as far as I can tell. They are just "views" on another array's
> data. This means that these objects can't be resized, or have other
> operations performed on them which require 'owning' the data buffer.
>
> E.g.:
>
>     class f(numpy.ndarray):
>         def __new__(cls, *p, **kw):
>             return numpy.array(*p, **kw).view(cls)
>
>     f([1,2,3]).resize((10,))
>     ValueError: cannot resize this array: it does not own its data
>     numpy.array([1,2,3]).view(f).resize((10,))
>     ValueError: cannot resize this array: it does not own its data
>     numpy.array([1,2,3]).resize((10,))
>     (no problem)
>
> Is there another way to create ndarray subclasses which do own
> their own data?
>
> Note that numpy.resize(f([1,2,3]), (10,)) works fine. But this
> isn't the same as having an object that owns its data.
> Specifically, there's just no way to have an ndarray subclass
> resize itself as a result of calling a method if that object
> doesn't own its data. (Imagine a variable-resolution polygon type
> that could interpolate or decimate vertices as needed: such a class
> would need to resize itself.)
>
> Zach
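The reference-count check being tripped here can be seen without any subclass at all; a minimal sketch in modern numpy (run as a script --- interactive shells hold extra references that also trip the check):

    import numpy as np

    a = np.arange(6)
    v = a[:2]           # a view: a second object now refers to a's memory
    try:
        a.resize(8)     # refused: reallocating would strand the view
    except ValueError as e:
        print(e)
    del v
    a.resize(8)         # with no outstanding references the resize succeeds
    print(a.shape)      # (8,)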
From tim.hochberg at cox.net Tue Feb 28 11:41:15 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Tue Feb 28 11:41:15 2006
Subject: [Numpy-discussion] Numpy and PEP 343
Message-ID: <4404A71B.10600@cox.net>

An idea that has popped up from time to time is delaying evaluation of complicated expressions so that the result can be computed more efficiently. For instance, the matrix expression:

    a = b*c + d*e

results in the creation of two, potentially large, temporary matrices, and also does a couple more loops at the C level than the equivalent expression implemented in C would.

The general idea has been to construct some sort of pseudo-object when the numerical operations are indicated, then do the actual numerical operations at some later time. This would be very problematic if implemented for all arrays, since it would quickly become impossible to figure out what was going on, particularly with view semantics. However, it could result in large performance improvements without becoming incomprehensible if implemented in small enough chunks.

A "straightforward" approach would look something like:

    numpy.begin_defer()  # Now all numpy operations (in this thread) are deferred
    a = b*c + d*e        # 'a' is a special object that holds pointers to
                         # 'b', 'c', 'd' and 'e' and knows what ops to perform.
    numpy.end_defer()    # 'a' performs the operations and now looks like an array

Since 'a' knows the whole series of operations in advance, it can perform them more efficiently than would be possible using the basic numpy machinery. Ideally, the space for 'a' could be allocated up front, and all of the operations could be done in a single loop. In practice the optimization might be somewhat less ambitious, depending on how much energy people put into this. However, this approach has some problems. One is the syntax, which is clunky and a bit unsafe (a missing end_defer in a function could cause stuff to break very far away). The other is that I suspect that this sort of deferred evaluation makes multiple views of an array even more likely to bite the unwary.

The syntax issue can be cleanly addressed now that PEP 343 (the 'with' statement) is going into Python 2.5. Thus the above would look like:

    with numpy.deferral():
        a = b*c + d*e

Just removing the extra allocation of temporary variables can result in a 30% speedup for this case[1], so the payoff would likely be large. On the down side, it could be quite a can of worms, and would likely require a lot of work to implement.

Food for thought anyway.
-tim

[1]

    from timeit import Timer
    print Timer('a = b*c + d*e',
                'from numpy import arange;b=c=d=e=arange(100000.)').timeit(10000)
    print Timer('a = b*c; multiply(d,e,temp); a+=temp',
                'from numpy import arange, zeros, multiply;'
                'b=c=d=e=arange(100000.);temp=zeros([100000], dtype=float)').timeit(10000)

    => 94.8665989672
       62.6143562939

From oliphant.travis at ieee.org Tue Feb 28 12:03:09 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Tue Feb 28 12:03:09 2006
Subject: [Numpy-discussion] can't resize ndarray subclass
In-Reply-To: <1DED40B2-E761-4B8E-86DF-6F9C36242D0C@stanford.edu>
References: <1DED40B2-E761-4B8E-86DF-6F9C36242D0C@stanford.edu>
Message-ID: <4404AC42.9030405@ieee.org>

Zachary Pincus wrote:
> Thus far, it seems like the way to get instances of ndarray
> subclasses is to use the 'view(subclass)' method on a proper ndarray,
> either in the subclass __new__, or after constructing the array.
>
> However, subclass instances so created *do not own their own data*,
> as far as I can tell. They are just "views" on another array's data.
> This means that these objects can't be resized, or have other
> operations performed on them which require 'owning' the data buffer.

Yes, that is true. To own your own data, your subclass would have to create its own memory using

    ndarray.__new__(mysubclass, shape, dtype)

or make a copy, as you suggested later.

-Travis

From oliphant.travis at ieee.org Tue Feb 28 12:11:11 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Tue Feb 28 12:11:11 2006
Subject: [Numpy-discussion] can't resize ndarray subclass
In-Reply-To:
References: <1DED40B2-E761-4B8E-86DF-6F9C36242D0C@stanford.edu>
Message-ID: <4404AE2E.9030405@ieee.org>

Zachary Pincus wrote:
> In answer to my previous question, to get an ndarray subclass that
> owns its own data, copy() must be called on the new "view" of the
> subclass. This makes sense and is reasonable.
>
> However, a new problem has me nearly tearing my hair out. Calling the
> resize method on an instance of such a subclass works fine. However,
> calling a method that calls 'self.resize' breaks! And worse, it
> breaks in such a way that subsequent calls to resize also break.

Yeah, this is one difficult aspect of the resize method. Because memory is being re-allocated, the method has to make sure the memory isn't being shared by another object. Right now, it's checking the reference count. Unfortunately, that isn't a fool-proof mechanism, as you've discovered: a reference can be held onto in somewhat unpredictable ways that would not be bothered by a resize of the memory, and this messes up the resize method.

What is really needed is some way to determine if any other object is actually pointing to the memory of the ndarray (and not just holding on to the object). But nobody has figured out a way to do that. It would be possible to let the user "force" the issue, leaving it up to them to make sure they don't share the memory and then reallocate it. In other words, an extra argument to the resize method could be used to bypass the memory check.
I'd be willing to do that because I know the check being performed is not foolproof.

-Travis

From oliphant.travis at ieee.org Tue Feb 28 12:19:02 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Tue Feb 28 12:19:02 2006
Subject: [Numpy-discussion] missing array type
In-Reply-To:
References: <440397A1.9020607@ee.byu.edu> <4403E72E.4040007@ieee.org>
Message-ID: <4404B002.3010407@ieee.org>

Sasha wrote:

> >>> b = array([1])
> >>> ndarray((5,), strides=(0,), buffer=b)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> TypeError: buffer is too small for requested array
>
> I would think memory-saving is the only justification for allowing
> zero strides.
>
> What use does your change enable?

I was just simplifying the PyArray_CheckStrides code. I didn't try to actually enable creating 0-stride arrays in PyArray_NewFromDescr. I don't mind if it is enabled, though. I just haven't done it yet.

I agree that 0-stride arrays are a "can of worms", and I also do not see changing ufunc behavior. The current behavior is understandable and exactly what one would expect with zero-stride, multidimensional arrays, i.e.

    b = arange(5); b.strides = 0
    add(b, 1, b)

where b has shape (5,) and strides (0,) would add 1 to the first element of b 5 times. Since all elements of the array b are obtained from the first element (that's what stride=0 means), you end up with an array of all 5's. This may not be useful, I agree, but it is understandable, and changing it would be too much of an exception. If somebody is creating 0-stride arrays on their own, then they must know what they are doing.

-Travis

From oliphant.travis at ieee.org Tue Feb 28 13:25:14 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Tue Feb 28 13:25:14 2006
Subject: [Numpy-discussion] can't resize ndarray subclass
In-Reply-To:
References: <1DED40B2-E761-4B8E-86DF-6F9C36242D0C@stanford.edu>
Message-ID: <4404BF92.5040200@ieee.org>

Zachary Pincus wrote:

> However, a new problem has me nearly tearing my hair out. Calling the
> resize method on an instance of such a subclass works fine. However,
> calling a method that calls 'self.resize' breaks! And worse, it
> breaks in such a way that subsequent calls to resize also break.

In the SVN version of numpy, there is a new keyword argument to resize (refcheck). If this keyword argument is 0 (it defaults to 1), the reference-count check is not performed. Thus, if you are sure that your array has not exposed its memory to another object, then you can set refcheck=0 and the resize will proceed.

If you really did expose your memory to another object, this could lead to segfaults in exactly the same way that exposing the memory to a Python array (array module) and then later resizing (which Python currently allows) would cause problems.

Be careful...

-Travis

From cookedm at physics.mcmaster.ca Tue Feb 28 13:48:02 2006
From: cookedm at physics.mcmaster.ca (David M. Cooke)
Date: Tue Feb 28 13:48:02 2006
Subject: [Numpy-discussion] Numpy and PEP 343
In-Reply-To: <4404A71B.10600@cox.net> (Tim Hochberg's message of "Tue, 28 Feb 2006 12:40:11 -0700")
References: <4404A71B.10600@cox.net>
Message-ID:

Tim Hochberg writes:

> An idea that has popped up from time to time is delaying evaluation of
> complicated expressions so that the result can be computed more
> efficiently.
> For instance, the matrix expression:
>
>     a = b*c + d*e
>
> results in the creation of two, potentially large, temporary matrices,
> and also does a couple more loops at the C level than the
> equivalent expression implemented in C would.
>
> The general idea has been to construct some sort of pseudo-object
> when the numerical operations are indicated, then do the actual
> numerical operations at some later time. This would be very
> problematic if implemented for all arrays, since it would quickly
> become impossible to figure out what was going on, particularly with
> view semantics. However, it could result in large performance
> improvements without becoming incomprehensible if implemented in small
> enough chunks.
>
> A "straightforward" approach would look something like:
>
>     numpy.begin_defer()  # Now all numpy operations (in this thread) are deferred
>     a = b*c + d*e        # 'a' is a special object that holds pointers to
>                          # 'b', 'c', 'd' and 'e' and knows what ops to perform.
>     numpy.end_defer()    # 'a' performs the operations and now looks like an array
>
> Since 'a' knows the whole series of operations in advance, it can
> perform them more efficiently than would be possible using the basic
> numpy machinery. Ideally, the space for 'a' could be allocated up
> front, and all of the operations could be done in a single loop. In
> practice the optimization might be somewhat less ambitious, depending
> on how much energy people put into this. However, this approach has
> some problems. One is the syntax, which is clunky and a bit unsafe (a
> missing end_defer in a function could cause stuff to break very far
> away). The other is that I suspect that this sort of deferred
> evaluation makes multiple views of an array even more likely to bite
> the unwary.

This is a good idea; probably a bit difficult. I don't like the global defer context, though. That could get messy, especially if you start calling functions.

> The syntax issue can be cleanly addressed now that PEP 343 (the 'with'
> statement) is going into Python 2.5. Thus the above would look like:
>
>     with numpy.deferral():
>         a = b*c + d*e
>
> Just removing the extra allocation of temporary variables can result
> in a 30% speedup for this case[1], so the payoff would likely be large.
> On the down side, it could be quite a can of worms, and would likely
> require a lot of work to implement.

Alternatively, make some sort of expression type:

    ex = VirtualExpression()
    ex.a = ex.b * ex.c + ex.d * ex.e

then,

    compute = ex.compile(a=(shape_of_a, dtype_of_a), etc.....)

This could return a function that would look like

    def compute(b, c, d, e):
        a = empty(shape_of_a, dtype=dtype_of_a)
        multiply(b, c, a)
        # ok, I'm making this one up :-)
        fused_multiply_add(d, e, a)
        return a

    a = compute(b, c, d, e)

Or, it could use some sort of numeric-specific bytecode that can be interpreted quickly in C. With some sort of optimizing compiler for that bytecode it could be really fun (it could use BLAS when appropriate, for instance!).

or ... use weave :-)

--
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke              http://arbutus.physics.mcmaster.ca/dmc/
|cookedm at physics.mcmaster.ca

From tim.hochberg at cox.net Tue Feb 28 13:57:02 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Tue Feb 28 13:57:02 2006
Subject: [Numpy-discussion] missing array type
In-Reply-To:
References: <4403C3AB.1040600@sympatico.ca> <440449B2.8060007@sympatico.ca> <4404680F.9020006@sympatico.ca> <44048852.3080701@sympatico.ca>
Message-ID: <4404C6F2.1060402@cox.net>

Sasha wrote:

> On 2/28/06, Colin J. Williams wrote:
> > ...
> > >> Would it not be better to have def zeroes(..., zeroStrides=False):
> > >> and a = zeros(..., zeroStrides=True)?
> > >
> > > This is equivalent to what I proposed: xzeros(shape) and xones(shape)
> > > functions as a shorthand to ndarray(shape, strides=(0,)*len(shape))
> >
> > Wot, more names to remember? [sorry, I can't give you the graphic to go
> > along with this. :-) ]
>
> Oh, please - you don't have to be so emphatic. Your solution above
> (adding a zeroStrides parameter) would require a much more arbitrary
> name to remember. Even worse, someone will write zeros(shape, int,
> False, True) to mean zeros(shape, dtype=int, fortran=False,
> zeroStrides=True), and anyone reading that code will have to look up
> the manual to understand what each boolean means.
>
> Boolean parameters are generally considered bad design. For example,
> it would be much better to have array(..., memory_layout='Fortran')
> instead of the current array(..., fortran=True). (Well, fortran=True is
> not that bad, but fortran=False is really puzzling - if it is not
> fortran, what is it?) Arguably an even better solution would be
> array(..., strides=fortran_strides(shape)), but that's a different
> story.

I agree with this.

> The names xzeros and xones are a natural choice because the
> functionality they provide is very similar to what xrange provides
> compared to range: a memory-saving way to achieve the same result.

But not this. I don't think using xrange as a template for naming anything is a good idea. If xrange were being added to Python now, it would almost certainly be called irange and live in itertools. I have a strong suspicion that the *name* xrange will go the way of the dodo eventually, although the functionality will survive in some other form.

Also, while I can see how there might well be some good uses for zero-stride arrays, I'm having a hard time getting excited by xzeros and xones. The only applications I can come up with can already be done in more efficient ways without using xones and xzeros. [Let me apologize in advance if I missed a compelling example earlier in this thread -- I just got back from vacation and I may have missed something in my email reading frenzy]

-tim

From zpincus at stanford.edu Tue Feb 28 13:58:02 2006
From: zpincus at stanford.edu (Zachary Pincus)
Date: Tue Feb 28 13:58:02 2006
Subject: [Numpy-discussion] can't resize ndarray subclass
In-Reply-To: <4404BF92.5040200@ieee.org>
References: <1DED40B2-E761-4B8E-86DF-6F9C36242D0C@stanford.edu> <4404BF92.5040200@ieee.org>
Message-ID: <11799E00-E0A1-459D-957E-EEC62910ACDE@stanford.edu>

Thanks Travis for the replies and the new functionality in the SVN!

I think I have enough (well, maybe once I figure out __array_priority__) to get a decent wiki entry for subclassing ndarray, and maybe some template subclasses that others can use.

I presume __array_priority__ determines the resulting type when two arrays of different types are ufunc'd together?
The exact mechanism of which object gets selected as the parent, etc.,
is still unclear though.

Zach

On Feb 28, 2006, at 1:24 PM, Travis Oliphant wrote:

> Zachary Pincus wrote:
>
>> However a new problem has me nearly tearing my hair out. Calling
>> the resize method on an instance of such a subclass works fine.
>> However, calling a method that calls 'self.resize' breaks! And
>> worse, it breaks in such a way that then subsequent calls to
>> resize also break.
>
> In SVN version of numpy, there is a new keyword argument to resize
> (refcheck). If this keyword argument is 0 (it defaults to 1), the
> reference-count check is not performed. Thus, if you are sure that
> your array has not exposed its memory to another object, then you
> can set refcheck=0 and the resize will proceed.
>
> If you really did expose your memory to another object, this could
> lead to segfaults in exactly the same way that exposing the memory
> to a Python array (array module) and then later resizing (which
> Python currently allows) would cause problems.
>
> Be careful...
>
> -Travis

From tim.hochberg at cox.net Tue Feb 28 14:06:09 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Tue Feb 28 14:06:09 2006
Subject: [Numpy-discussion] can't resize ndarray subclass
In-Reply-To: <4404BF92.5040200@ieee.org>
References: <1DED40B2-E761-4B8E-86DF-6F9C36242D0C@stanford.edu> <4404BF92.5040200@ieee.org>
Message-ID: <4404C900.4080001@cox.net>

Travis Oliphant wrote:

> Zachary Pincus wrote:
>
>> However a new problem has me nearly tearing my hair out. Calling the
>> resize method on an instance of such a subclass works fine. However,
>> calling a method that calls 'self.resize' breaks! And worse, it
>> breaks in such a way that then subsequent calls to resize also break.
>
> In SVN version of numpy, there is a new keyword argument to resize
> (refcheck). If this keyword argument is 0 (it defaults to 1), the
> reference-count check is not performed. Thus, if you are sure that
> your array has not exposed its memory to another object, then you can
> set refcheck=0 and the resize will proceed.

I'd suggest that this get exposed as a separate function, for instance
A._unchecked_resize(size). It seems much less likely that this will
accidentally get called than that someone will mistakenly throw a
second boolean argument into resize.

-tim

> If you really did expose your memory to another object, this could
> lead to segfaults in exactly the same way that exposing the memory to
> a Python array (array module) and then later resizing (which Python
> currently allows) would cause problems.
>
> Be careful...
>
> -Travis
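[For illustration: a minimal sketch of the separate-function idea Tim
proposes, built on the refcheck keyword Travis describes in the quoted
message. The subclass and helper name here are hypothetical, not numpy
API.]

import numpy

class resizable(numpy.ndarray):
    def _unchecked_resize(self, shape):
        # Hypothetical helper: skip the reference-count check.
        # Only safe when this array's memory has not been exposed
        # to another object.
        self.resize(shape, refcheck=0)

a = resizable((3, 2))        # a fresh array that owns its memory
a._unchecked_resize((2, 3))  # succeeds even though resize is being
                             # called from inside a method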
From oliphant.travis at ieee.org Tue Feb 28 14:15:01 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Tue Feb 28 14:15:01 2006
Subject: [Numpy-discussion] can't resize ndarray subclass
In-Reply-To: <4404C900.4080001@cox.net>
References: <1DED40B2-E761-4B8E-86DF-6F9C36242D0C@stanford.edu> <4404BF92.5040200@ieee.org> <4404C900.4080001@cox.net>
Message-ID: <4404CB22.9070203@ieee.org>

Tim Hochberg wrote:

> I'd suggest that this get exposed as a separate function, for instance
> A._unchecked_resize(size). It seems much less likely that this will
> accidentally get called than that someone will mistakenly throw a
> second boolean argument into resize.

The way the resize method is written, you can't mistakenly throw in
another argument. You would have to provide a "refcheck" keyword
argument. I can't see that being a mistake.

The resize method can be called with either a sequence or with several
shapes, i.e.

a.resize((3,2))
a.resize(3,2)

Both of these are equivalent. To get the refcheck functionality you
would have to explicitly provide a keyword argument.

a.resize((3,2), refcheck=0)
a.resize(3,2, refcheck=0)

-Travis

From oliphant.travis at ieee.org Tue Feb 28 14:20:00 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Tue Feb 28 14:20:00 2006
Subject: [Numpy-discussion] can't resize ndarray subclass
In-Reply-To: <11799E00-E0A1-459D-957E-EEC62910ACDE@stanford.edu>
References: <1DED40B2-E761-4B8E-86DF-6F9C36242D0C@stanford.edu> <4404BF92.5040200@ieee.org> <11799E00-E0A1-459D-957E-EEC62910ACDE@stanford.edu>
Message-ID: <4404CC58.8010101@ieee.org>

Zachary Pincus wrote:

> Thanks Travis for the replies and the new functionality in the SVN!
>
> I think I have enough (well, maybe once I figure out
> __array_priority__) to get a decent wiki entry for subclassing
> ndarray, and maybe some template subclasses that others can use.
>
> I presume __array_priority__ determines the resulting type when two
> different type arrays are ufunc'd together? The exact mechanism of
> which object gets selected as the parent, etc., is still unclear though.

When two different subclasses appear in a ufunc (or in other places in
ndarray), the subclass chosen for creation is the one with highest
__array_priority__. The "parent" concept is entirely separate; it is
the object passed to __array_finalize__ and should be of the same type
as self (or None).

-Travis
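[For illustration: a small sketch of the rule Travis describes -- the
subclass with the highest __array_priority__ is chosen for the ufunc
output. The two subclasses are hypothetical, and the sketch assumes
view() accepts a subclass, as it does in current numpy.]

import numpy

class low(numpy.ndarray):
    __array_priority__ = 1.0

class high(numpy.ndarray):
    __array_priority__ = 10.0

a = numpy.zeros(3).view(low)
b = numpy.zeros(3).view(high)

# 'high' has the larger __array_priority__, so the ufunc result
# should be an instance of 'high' regardless of operand order.
print type(a + b)
print type(b + a)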
From stefan at sun.ac.za Tue Feb 28 14:21:01 2006
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Tue Feb 28 14:21:01 2006
Subject: [Numpy-discussion] setting package path
Message-ID: <20060228221937.GE10590@alpha>

In numpytest.py, set_package_path is provided for handling path
changes while doing unit tests. It reads

def set_package_path(level=1):
    """ Prepend package directory to sys.path.

    set_package_path should be called from a test_file.py that
    satisfies the following tree structure:

      //test_file.py

    Then the first existing path name from the following list

      /build/lib.-
      /..

    is prepended to sys.path.
    ...

However, the line that supposedly calculates "somepath/.." is

d = os.path.dirname(os.path.dirname(os.path.abspath(testfile)))

which calculates "somepath". Which is wrong: the docstring, the code
or my interpretation?

Stéfan

From tim.hochberg at cox.net Tue Feb 28 15:15:02 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Tue Feb 28 15:15:02 2006
Subject: [Numpy-discussion] Numpy and PEP 343
In-Reply-To:
References: <4404A71B.10600@cox.net>
Message-ID: <4404D92D.7000604@cox.net>

David M. Cooke wrote:

>Tim Hochberg writes:
>
>>An idea that has popped up from time to time is delaying evaluation of
>>complicated expressions so that the result can be computed more
>>efficiently. For instance, the matrix expression:
>>
>>a = b*c + d*e
>>
>>results in the creation of two, potentially large, temporary matrices
>>and also does a couple more loops at the C level than the
>>equivalent expression implemented in C would.
>>
>>The general idea has been to construct some sort of pseudo-object,
>>when the numerical operations are indicated, then do the actual
>>numerical operations at some later time. This would be very
>>problematic if implemented for all arrays since it would quickly
>>become impossible to figure out what was going on, particularly with
>>view semantics. However, it could result in large performance
>>improvements without becoming incomprehensible if implemented in small
>>enough chunks.
>>
>>A "straightforward" approach would look something like:
>>
>>    numpy.begin_defer()  # Now all numpy operations (in this thread)
>>                         # are deferred
>>    a = b*c + d*e        # 'a' is a special object that holds pointers to
>>                         # 'b', 'c', 'd' and 'e' and knows what ops to
>>                         # perform.
>>    numpy.end_defer()    # 'a' performs the operations and now looks like
>>                         # an array
>>
>>Since 'a' knows the whole series of operations in advance it can
>>perform them more efficiently than would be possible using the basic
>>numpy machinery. Ideally, the space for 'a' could be allocated up
>>front, and all of the operations could be done in a single loop. In
>>practice the optimization might be somewhat less ambitious, depending
>>on how much energy people put into this. However, this approach has
>>some problems. One is the syntax, which is clunky and a bit unsafe (a
>>missing end_defer in a function could cause stuff to break very far
>>away). The other is that I suspect that this sort of deferred
>>evaluation makes multiple views of an array even more likely to bite
>>the unwary.
>
>This is a good idea; probably a bit difficult.

It's not original. I think this idea comes around periodically, but
dies from a combination of it being nontrivial and the resulting
syntax being too heavyweight.

>I don't like the global
>defer context though. That could get messy, especially if you start
>calling functions.

I'm not crazy about it either. You could localize it with an
appropriate (ab)use of sys._getframe, but that's another potential can
of worms. Something like:

class deferral:
    frames = set()
    def __enter__(self):
        self.frame = sys._getframe(1)
        self.frames.add(self.frame)
    def __exit__(self, *args):
        self.frames.discard(self.frame)
        self.frame = None

def should_defer():
    return (sys._getframe(1) in deferral.frames)

Then:

with deferral():
    #stuff

Should be localized to just 'stuff', even if it calls other
functions[1]. The details might be sticky though....

>>The syntax issue can be cleanly addressed now that PEP 343 (the 'with'
>>statement) is going into Python 2.5. Thus the above would look like:
>>
>>with numpy.deferral():
>>    a = b*c + d*e
>>
>>Just removing the extra allocation of temporary variables can result
>>in 30% speedup for this case[1], so the payoff would likely be large.
>>On the down side, it could be quite a can of worms, and would likely
>>require a lot of work to implement.
>
>Alternatively, make some sort of expression type:
>
>ex = VirtualExpression()
>
>ex.a = ex.b * ex.c + ex.d * ex.e
>
>then,
>
>compute = ex.compile(a=(shape_of_a, dtype_of_a), etc.....)
>
>This could return a function that would look like
>
>def compute(b, c, d, e):
>    a = empty(shape_of_a, dtype=dtype_of_a)
>    multiply(b, c, a)
>    # ok, I'm making this one up :-)
>    fused_multiply_add(d, e, a)
>    return a
>
>a = compute(b, c, d, e)

The syntax seems too heavy to me. It would be significantly lighter if
the explicit compile step is optional, allowing:

ex = VirtualExpression()
ex.a = ex.b * ex.c + ex.d * ex.e
a = ex(b=b, c=c, d=d, e=e)

'ex' could then figure out all of the sizes and types itself, create
the function, compute the result. The created function would be cached
and whenever the input parameters matched it would just be reused, so
there shouldn't be too much more overhead than with the compiled
version you suggest. The syntax is still heavy relative to the 'with'
version though.

>Or, it could use some sort of numeric-specific bytecode that can be
>interpreted quickly in C. With some sort of optimizing compiler for
>that bytecode it could be really fun (it could use BLAS when
>appropriate, for instance!).
>
>or ... use weave :-)

I'll have to look at weave again. Last time I looked at it (quite a
while ago) it didn't work for me. I can't recall if it was a licensing
issue or it didn't work with my compiler or what, but I decided I
couldn't use it.

-tim

[1] Here's an example that fakes with and tests deferral:

import sys

class deferral:
    frames = set()
    def __enter__(self):
        self.frame = sys._getframe(1)
        self.frames.add(self.frame)
    def __exit__(self, *args):
        self.frames.discard(self.frame)
        self.frame = None

def should_defer():
    return (sys._getframe(1) in deferral.frames)

def f(n):
    if not n:
        return
    if n % 4:
        print "should_defer() =", should_defer(), "for n =", n
        f(n-1)
    else:
        # This is a rough translation of:
        # with deferral():
        #     print "should_defer() =", should_defer(), "in f"
        #     g(n-1)
        d = deferral()
        d.__enter__()
        try:
            print "should_defer() =", should_defer(), "for n =", n
            f(n-1)
        finally:
            d.__exit__(None, None, None)

f(10)

From tim.hochberg at cox.net Tue Feb 28 15:20:03 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Tue Feb 28 15:20:03 2006
Subject: [Numpy-discussion] Numpy and PEP 343
In-Reply-To: <4404D92D.7000604@cox.net>
References: <4404A71B.10600@cox.net> <4404D92D.7000604@cox.net>
Message-ID: <4404DA6D.70703@cox.net>

Tim Hochberg wrote:

[SNIP]

Ugh. This last part got mangled somehow.
> [1] Here's an example that fakes with and tests deferral:
>
> import sys
>
> class deferral:
>     frames = set()
>     def __enter__(self):
>         self.frame = sys._getframe(1)
>         self.frames.add(self.frame)
>     def __exit__(self, *args):
>         self.frames.discard(self.frame)
>         self.frame = None
>
> def should_defer():
>     return (sys._getframe(1) in deferral.frames)

[This got all jammed together, sorry]

> def f(n):
>     if not n:
>         return
>     if n % 4:
>         print "should_defer() =", should_defer(), "for n =", n
>         f(n-1)
>     else:
>         # This is a rough translation of:
>         # with deferral():
>         #     print "should_defer() =", should_defer(), "in f"
>         #     g(n-1)
>         d = deferral()
>         d.__enter__()
>         try:
>             print "should_defer() =", should_defer(), "for n =", n
>             f(n-1)
>         finally:
>             d.__exit__(None, None, None)
>
> f(10)

From cookedm at physics.mcmaster.ca Tue Feb 28 15:30:05 2006
From: cookedm at physics.mcmaster.ca (David M. Cooke)
Date: Tue Feb 28 15:30:05 2006
Subject: [Numpy-discussion] missing array type
In-Reply-To: <4404C6F2.1060402@cox.net> (Tim Hochberg's message of "Tue, 28 Feb 2006 14:56:02 -0700")
References: <4403C3AB.1040600@sympatico.ca> <440449B2.8060007@sympatico.ca> <4404680F.9020006@sympatico.ca> <44048852.3080701@sympatico.ca> <4404C6F2.1060402@cox.net>
Message-ID:

Tim Hochberg writes:

> I don't think using xrange as a template for naming anything is a good
> idea. If xrange were being added to Python now, it would almost
> certainly be called irange and live in itertools. I have a strong
> suspicion that the *name* xrange will go the way of the dodo
> eventually, although the functionality will survive in some other
> form.

In Python 3.0 it'll be named "range" :-)

--
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke                      http://arbutus.physics.mcmaster.ca/dmc/
|cookedm at physics.mcmaster.ca

From pgmdevlist at mailcan.com Tue Feb 28 15:56:05 2006
From: pgmdevlist at mailcan.com (pierregm)
Date: Tue Feb 28 15:56:05 2006
Subject: [Numpy-discussion] Re: [SciPy-user] Messing with missing values
In-Reply-To:
References: <200602271658.44855.pgmdevlist@mailcan.com>
Message-ID: <200602281855.20239.pgmdevlist@mailcan.com>

Folks,

Following Sasha's recommendation, I added a short list of features yet
missing in MA to the wiki page:
http://projects.scipy.org/scipy/numpy/wiki/MaskedArray

The list corresponds to the features *I* miss (I doubt I'm the only
one) and for which I have/had to find a workaround. I tried to organize
the features by potential problems/fixes. I'll add more missing
features as I run into them.

I also attached an example of implementation of `std` and `median` for
masked arrays. It's a bit crude, it's not fully tested, it's definitely
not in svn diff format (sorry about that, I need to figure this one
out), but it does the job I wanted it to.
http://projects.scipy.org/scipy/numpy/attachment/wiki/MaskedArray/ma_examples.py

Of course, your feedback is more than welcome.
Thx again
-- Pierre GM

From cjw at sympatico.ca Tue Feb 28 17:01:13 2006
From: cjw at sympatico.ca (Colin J. Williams)
Date: Tue Feb 28 17:01:13 2006
Subject: [Numpy-discussion] subclassing ndaray
In-Reply-To: <4403E9CA.8050506@ieee.org>
References: <43FF2C92.3060304@sympatico.ca> <20060224191426.GD21117@alpha> <43FFB580.80307@ieee.org> <20060225083924.GF21117@alpha> <44035227.9010609@ee.byu.edu> <4403C20A.5060603@sympatico.ca> <4403E9CA.8050506@ieee.org>
Message-ID: <4404F243.5030707@sympatico.ca>

Travis Oliphant wrote:

>> Travis Oliphant wrote:
>>
>>> Stefan van der Walt wrote:
>>>
>>>>> The __init__ and __new__ methods are not called because they may
>>>>> have arbitrary signatures. Instead, the __array_finalize__
>>>>> method is always called. So, you should use that instead of
>>>>> __init__.
>>>>
>>> This is now true in SVN. Previously, __array_finalize__ was not
>>> called if the "parent" was NULL. However, now, it is still called
>>> with None as the value of the first argument.
>>>
>>> Thus __array_finalize__ will be called whenever
>>> ndarray.__new__(,...) is called.
>>
>> Why this change in style from the common Python idiom of __new__,
>> __init__, with the same signature to __new__, __array_finalize__ with
>> possibly different signatures?
>
> I don't see it as a change in style but adding a capability to the
> ndarray subclass. The problem is that arrays can be created in many
> ways (slicing, ufuncs, etc). Not all of these ways should go through
> the __new__/__init__ -- style creation mechanism. Try inheriting
> from a float builtin and add attributes. Then add your float to an
> instance of your new class and see what happens.

Yes, I've tried this with ndarray - it didn't work. Later, I realized
that it wasn't a good thing to try.

Colin W.

> You will get a float-type on the output. This is the essence of
> Paul's insight that sub-classing is rarely useful because you end up
> having to re-define all the operators anyway to return the value that
> you want. He knows whereof he speaks as well, because he wrote MA and
> UserArray and had experience with Python sub-classing.
>
> I wanted a mechanism to make it easier to sub-class arrays and have
> the operators return your object if possible (including all of its
> attributes).
>
> Thus,
>
> __array_priority__ (a floating point attribute)
> __array_finalize__ (a method called on internal construction of the
> array wrapper).
>
> were invented (along with __array_wrap__ which any class can define
> to have their objects survive ufuncs).
>
> It was easy enough to see where to call __array_finalize__ in the
> C-code if somewhat difficult to explain (and get exception handling to
> work because of my initial over-thinking).
>
> The signature is just
>
> __array_finalize__(self, parent):
>     return
>
> i.e. any return value is ignored (but exceptions are caught).
>
> I've used the feature successfully on at least 3 sub-classes
> (chararray, memmap, and matrix) and so I'm actually pretty happy with
> it. __new__ and __init__ are still relevant for constructing your
> brand-new object. The __array_finalize__ function is just what the
> internal constructor that actually allocates memory will always call
> to let you set final attributes *every* time your sub-class gets
> created.
>
> -Travis
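[For illustration: a minimal subclass using the hooks Travis describes
above. The attribute name is made up; the __array_finalize__(self,
parent) signature, with the return value ignored, is the one given in
the message.]

import numpy

class info_array(numpy.ndarray):
    def __array_finalize__(self, parent):
        # Called on every internal construction of the array wrapper
        # (slicing, ufunc results, views, ...). 'parent' is the array
        # this one was created from, or None.
        self.info = getattr(parent, 'info', None)

a = numpy.arange(4).view(info_array)
a.info = "degrees"
b = a[1:]        # slicing also goes through __array_finalize__
print b.info     # -> degrees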
From paul at pfdubois.com Tue Feb 28 17:20:16 2006
From: paul at pfdubois.com (Paul F. Dubois)
Date: Tue Feb 28 17:20:16 2006
Subject: [Numpy-discussion] Numpy and PEP 343
In-Reply-To:
References: <4404A71B.10600@cox.net>
Message-ID: <4404F6B3.9080809@pfdubois.com>

You're reinventing C++ expression templates, although since Python is
dynamically typed you don't need templates. The crucial feature in C++
that lets it all work is that you can override the action for
assignment.

a = b*c + d*e

If we could realize we were at the "equals" sign we could evaluate the
RHS, and assign it to a.

This is not possible in Python; to make it possible would require
slowing down regular assignment, which is perhaps a definition of bad.

a[...] = RHS

could be overridden but it is ugly and 'naive' users will forget.

a := RHS

could be added to the language with the semantics that it tries to do
a.__assignment__(RHS) but Guido told me no long ago. (:->. Also, you
might forget the : in :=.

a.assign(RHS)

would also work but then the original statement would produce a strange
object with surprising results.

David M. Cooke wrote:
> Tim Hochberg writes:
>
>> An idea that has popped up from time to time is delaying evaluation of
>> complicated expressions so that the result can be computed more
>> efficiently. For instance, the matrix expression:
>>
>> a = b*c + d*e
>>
>> results in the creation of two, potentially large, temporary matrices
>> and also does a couple more loops at the C level than the
>> equivalent expression implemented in C would.
>>
>> The general idea has been to construct some sort of pseudo-object,
>> when the numerical operations are indicated, then do the actual

From ndarray at mac.com Tue Feb 28 17:51:20 2006
From: ndarray at mac.com (Sasha)
Date: Tue Feb 28 17:51:20 2006
Subject: [Numpy-discussion] Numpy and PEP 343
In-Reply-To: <4404F6B3.9080809@pfdubois.com>
References: <4404A71B.10600@cox.net> <4404F6B3.9080809@pfdubois.com>
Message-ID:

Lazy evaluation has been part of many array languages since the early
days of APL (which makes this idea almost 50 years old). I was
entertaining the idea of bringing lazy evaluation to python myself and
concluded that there are two places where it might fit.

1. At the level of the python optimizer: a * x + y, for example, can be
translated into a call to axpy if a, x and y are known to be arrays.
This approach quickly brings you to the optional static typing idea.

2. Overload arithmetic operators for ufunc objects. This will allow
some form of tacit programming and you would be able to write

f = multiply + multiply
f(x, y, z, t)

and have it evaluated without temporaries.

Both of these ideas are of the pie-in-the-sky variety.

On 2/28/06, Paul F. Dubois wrote:
> You're reinventing C++ expression templates, although since Python is
> dynamically typed you don't need templates. The crucial feature in C++
> that lets it all work is that you can override the action for
> assignment.
>
> a = b*c + d*e
>
> If we could realize we were at the "equals" sign we could evaluate the
> RHS, and assign it to a.
>
> This is not possible in Python; to make it possible would require
> slowing down regular assignment, which is perhaps a definition of bad.
>
> a[...] = RHS
>
> could be overridden but it is ugly and 'naive' users will forget.
>
> a := RHS
>
> could be added to the language with the semantics that it tries to do
> a.__assignment__(RHS) but Guido told me no long ago. (:->. Also, you
> might forget the : in :=.
>
> a.assign(RHS)
>
> would also work but then the original statement would produce a
> strange object with surprising results.
> David M. Cooke wrote:
> > Tim Hochberg writes:
> >
> >> An idea that has popped up from time to time is delaying evaluation
> >> of complicated expressions so that the result can be computed more
> >> efficiently. For instance, the matrix expression:
> >>
> >> a = b*c + d*e
> >>
> >> results in the creation of two, potentially large, temporary
> >> matrices and also does a couple more loops at the C level than the
> >> equivalent expression implemented in C would.
> >>
> >> The general idea has been to construct some sort of pseudo-object,
> >> when the numerical operations are indicated, then do the actual

From charlesr.harris at gmail.com Tue Feb 28 19:42:01 2006
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue Feb 28 19:42:01 2006
Subject: [Numpy-discussion] Numpy and PEP 343
In-Reply-To: <4404F6B3.9080809@pfdubois.com>
References: <4404A71B.10600@cox.net> <4404F6B3.9080809@pfdubois.com>
Message-ID:

On 2/28/06, Paul F. Dubois wrote:
> You're reinventing C++ expression templates, although since Python is

Yes indeedy, and although they might work well enough they produce the
most godawful looking assembly code I have ever looked at. The boost
ublas template library takes this approach and I regard it more as a
meta-compiler research project written in a template language than as
an array library.

I think that there are two main users of arrays: those who want quick
and convenient (optimize programmer time) and those who want super-fast
execution (optimize cpu time). Because a human can generally do a
better job and knows more about the intent than the average compiler, I
think that the best bet is to provide the tools needed to write
efficient code if the programmer so desires, but otherwise concentrate
on convenience. When absolute speed is essential it is worth budgeting
programmer time to achieve it, but generally I don't think that is the
case.

Chuck

From oliphant.travis at ieee.org Tue Feb 28 19:55:04 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Tue Feb 28 19:55:04 2006
Subject: [Numpy-discussion] Numpy and PEP 343
In-Reply-To: <4404F6B3.9080809@pfdubois.com>
References: <4404A71B.10600@cox.net> <4404F6B3.9080809@pfdubois.com>
Message-ID: <44051AC1.9000908@ieee.org>

Charles R Harris wrote:

>Yes indeedy, and although they might work well enough they produce the
>most godawful looking assembly code I have ever looked at. The boost
>ublas template library takes this approach and I regard it more as a
>meta-compiler research project written in a template language than as
>an array library. I think that there are two main users of arrays:
>those who want quick and convenient (optimize programmer time) and
>those who want super-fast execution (optimize cpu time).
>Because a human can generally do a better job and knows more about the
>intent than the average compiler, I think that the best bet is to
>provide the tools needed to write efficient code if the programmer so
>desires, but otherwise concentrate on convenience. When absolute speed
>is essential it is worth budgeting programmer time to achieve it, but
>generally I don't think that is the case.
>

I think this is ultimately why nothing has been done except to make it
easier and easier to write compiled code that gets called from Python.
I'm sure most have heard that ctypes will be added to Python 2.5. This
will make it very easy to write a C function to do what you want and
just call it from Python. Weave can still help with the
"auto-compilation" of the specific library for your type. Ultimately
such code will be faster than NumPy can ever be.

-Travis

From eric at enthought.com Tue Feb 28 21:28:01 2006
From: eric at enthought.com (eric jones)
Date: Tue Feb 28 21:28:01 2006
Subject: [Numpy-discussion] Numpy and PEP 343
In-Reply-To: <44051AC1.9000908@ieee.org>
References: <4404A71B.10600@cox.net> <4404F6B3.9080809@pfdubois.com> <44051AC1.9000908@ieee.org>
Message-ID: <440530AC.2010000@enthought.com>

Travis Oliphant wrote:
>
> Weave can still help with the "auto-compilation" of the specific
> library for your type. Ultimately such code will be faster than NumPy
> can ever be.

Yes. weave.blitz() can be used to do the equivalent of this lazy
evaluation for you in many cases without much effort. For example:

import weave
from scipy import arange

a = arange(1e7)
b = arange(1e7)
c = 2.0*a+3.0*b

# or with weave
weave.blitz("c=2.0*a+3.0*b")

As Paul D. mentioned, what Tim outlined is essentially template
expressions in C++. blitz++ (http://www.oonumerics.org/blitz/) is a C++
template expressions library for array operations, and weave.blitz
translates a Numeric expression into C++ blitz code. For the example
above, you get about a factor of 4 speed-up on large arrays. (Notice,
the first time you run the example it will be much slower because of
compile time. Use timings from subsequent runs.)

C:\temp>weave_time.py
Expression: c=2.0*a+3.0*b
Numeric: 0.678311899322
Weave: 0.162177084984
Speed-up: 4.18253848494

This isn't as good as you can do with hand coded C, but it isn't so bad
for the effort involved...

I have wished for time to write a weave.f90("c=2.0*a+3.0*b") function
because it is very feasible. My guess from simple experimentation is
that it would be about as fast as hand coded C for this sort of
expression. This might give us another factor of two or three in
execution speed and the compile times would come down from tens of
seconds to tenths of seconds.

Incidentally, the weave calling overhead is large enough to limit its
benefit on small arrays. Pat Miller pointed out some ways to get rid of
that overhead, and I even wrote some experimental fixes to weave that
helped out a lot. Alas, they never were completed fully. Revisiting
these would make weave.blitz useful for small arrays as well. Fixing
these is probably more work than writing blitz.f90().

All this to say, I think weave basically accomplishes what Tim wants
with a different mechanism (letting C++ compilers do the optimization
instead of writing this optimization at the python level). It does
require a compiler on client machines in its current form (even that
can be fixed...), but I think it might prove faster than
re-implementing a numeric expression compiler at the python level
(though that sounds fun as well).
see ya,
eric

###############################
# weave_time.py
###############################
import timeit

array_size = 1e7
iterations = 10

setup = """\
import weave
from scipy import arange
a=arange(%f)
b=arange(%f)
c=arange(%f)  # needed by weave test
""" % (array_size, array_size, array_size)

expr = "c=2.0*a+3.0*b"
print "Expression:", expr

numeric_timer = timeit.Timer(expr, setup)
numeric_time = numeric_timer.timeit(number=iterations)
print "Numeric:", numeric_time/iterations

weave_timer = timeit.Timer('weave.blitz("%s")' % expr, setup)
weave_time = weave_timer.timeit(number=iterations)
print "Weave:", weave_time/iterations

print "Speed-up:", numeric_time/weave_time

> -Travis

From pearu at scipy.org Tue Feb 28 22:35:02 2006
From: pearu at scipy.org (Pearu Peterson)
Date: Tue Feb 28 22:35:02 2006
Subject: [Numpy-discussion] setting package path
In-Reply-To: <20060228221937.GE10590@alpha>
References: <20060228221937.GE10590@alpha>
Message-ID:

On Wed, 1 Mar 2006, Stefan van der Walt wrote:

> In numpytest.py, set_package_path is provided for handling path
> changes while doing unit tests. It reads
>
> def set_package_path(level=1):
>     """ Prepend package directory to sys.path.
>
>     set_package_path should be called from a test_file.py that
>     satisfies the following tree structure:
>
>       //test_file.py
>
>     Then the first existing path name from the following list
>
>       /build/lib.-
>       /..
>
>     is prepended to sys.path.
>     ...
>
> However, the line that supposedly calculates "somepath/.." is
>
> d = os.path.dirname(os.path.dirname(os.path.abspath(testfile)))
>
> which calculates "somepath". Which is wrong: the docstring, the code
> or my interpretation?

You also have to read the following code:

d1 = os.path.join(d,'build','lib.%s-%s'%(get_platform(),sys.version[:3]))
if not os.path.isdir(d1):
    d1 = os.path.dirname(d)  # <- here we get "somepath/.."
sys.path.insert(0,d1)

Pearu

From oliphant.travis at ieee.org Tue Feb 28 23:16:07 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Tue Feb 28 23:16:07 2006
Subject: [Numpy-discussion] Re: [SciPy-user] Table like array
In-Reply-To: <16761e100602282240y5bcf869fme9dd2f42771066c4@mail.gmail.com>
References: <16761e100602282240y5bcf869fme9dd2f42771066c4@mail.gmail.com>
Message-ID: <44054A19.6040202@ieee.org>

Michael Sorich wrote:

> Hi,
>
> I am looking for a table like array. Something like a 'data frame'
> object to those familiar with the statistical languages R and Splus.
> This is mainly to hold and manipulate 2D spreadsheet like data, which
> tends to be of relatively small size (compared to what many people
> seem to use numpy for), heterogeneous, have column and row names, and
> often contains missing data.

You could subclass the ndarray to produce one of these fairly easily,
I think. The missing data item could be handled by a mask stored along
with the array (or even in the array itself).
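[For illustration: a minimal sketch of the "mask stored along with the
array" idea from the paragraph above -- a hypothetical helper class,
not numpy code.]

import numpy

class masked_column:
    """A 1-d array plus a boolean mask that marks missing entries."""
    def __init__(self, data, missing=()):
        self.data = numpy.asarray(data)
        self.mask = numpy.zeros(len(self.data), dtype=bool)
        self.mask[list(missing)] = True    # True means "missing"
    def compressed(self):
        # Only the values that are not missing.
        return self.data[self.mask == False]

col = masked_column([1.0, 2.0, 3.0], missing=[1])
print col.compressed()    # -> [ 1.  3.]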
Or you could use a masked array as your core object (though I'm not
sure how it handles the arbitrary (i.e. record-like) data-types yet).
Alternatively, and probably the easiest way to get started, you could
just create your own table-like class and use simple 1-d arrays or 1-d
masked arrays for each of the columns --- this has always been a way to
store record-like tables.

It really depends what you want the data-frames to be able to do and
what you want them to "look-like."

> A RecArray seems potentially useful, as it allows different fields to
> have different data types and holds the name of the field. However it
> doesn't seem easy to manipulate the data. Or perhaps I am simply
> having difficulty finding documentation on their features.

Adding a new column/field means basically creating a new array with a
new data-type and copying data over into the already-defined fields.
Data-types always have a fixed number of bytes per item. What those
bytes represent can be quite arbitrary but it's always fixed. So, it
is always "more work" to insert a new column. You could make that
seamless in your table class so the user doesn't see it though.

You'll want to thoroughly understand the dtype object including its
attributes and methods. Particularly the fields attribute of the dtype
object.

> eg
> adding a new column/field (and to a lesser extent a new row/record) to
> the recarray

Adding a new row or record is actually similar because once an array
is created it is usually resized by creating another array and copying
the old array into it in the right places.

> Changing the field/column names
> make a new table by selecting a subset of fields/columns. (you can
> select a single field/column, but not multiple).

Right. So far you can't select multiple columns. It would be possible
to add this feature with a little bit of effort if there were a strong
demand for it, but it would be much easier to do it in your subclass
and/or container class. How many people would like to see
x['f1','f2','f5'] return a new array with a new data-type descriptor
constructed from the provided fields?

> It would also be nice for the table to be able to deal easily with
> masked data (I have not tried this with recarray yet) and perhaps also
> to be able to give the rows/records unique ids that could be used to
> select the rows/records (in addition to the row/record index), in the
> same way that the fieldnames can select the fields.

Adding fieldnames to the "rows" is definitely something that a
subclass would be needed for. I'm not sure how you would even propose
to select using row names. Would you also use getitem semantics?

> Can anyone comment on this issue? Particularly whether code exists for
> this purpose, and if not ideas about how best to go about developing
> such a Table like array (this would need to be limited to python
> programming as my ability to program in c is very limited).

I don't know of code that already exists for this, but I don't think
it would be too hard to construct your own data-frame object. I would
probably start with an implementation that just used standard arrays
of a particular type to represent the internal columns and then handle
the indexing using your own over-riding of the __getitem__ and
__setitem__ special methods. This would be the easiest to get working,
I think.
-Travis
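[For illustration: a minimal sketch of the container approach suggested
in the message above -- plain 1-d arrays for the columns, with
__getitem__ and __setitem__ doing the indexing. The class itself is
hypothetical.]

import numpy

class table:
    """Columns stored as 1-d arrays, indexed by field name or row number."""
    def __init__(self, **columns):
        self.columns = dict([(name, numpy.asarray(data))
                             for name, data in columns.items()])
    def __getitem__(self, key):
        if isinstance(key, str):
            return self.columns[key]      # a whole column
        # otherwise treat key as a row index
        return dict([(name, col[key])
                     for name, col in self.columns.items()])
    def __setitem__(self, name, data):
        # adding a column is cheap here: no record-array copying needed
        self.columns[name] = numpy.asarray(data)

t = table(x=[1, 2, 3], y=[1.5, 2.5, 3.5])
print t['x']      # -> [1 2 3]
print t[0]        # -> {'y': 1.5, 'x': 1}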
From gerard.vermeulen at grenoble.cnrs.fr Wed Feb 1 09:16:59 2006
From: gerard.vermeulen at grenoble.cnrs.fr (Gerard Vermeulen)
Date: Wed Feb 1 09:16:59 2006
Subject: [Numpy-discussion] RE: [Numpy-user] possible error with isarrtype
In-Reply-To: <43E0DE8D.6020907@sympatico.ca>
References: <43DFD598.5000503@colorado.edu> <43E0309D.5050700@sympatico.ca> <43E03349.10208@ieee.org> <43E0DE8D.6020907@sympatico.ca>
Message-ID: <20060201181434.0cbb368a.gerard.vermeulen@grenoble.cnrs.fr>

On Wed, 01 Feb 2006 11:15:09 -0500
"Colin J. Williams" wrote:
Williams" wrote: [ depration + style ] > [Dbg]>>> import types > [Dbg]>>> dir(types) > ['BooleanType', 'BufferType', 'BuiltinFunctionType', > 'BuiltinMethodType', 'ClassType', 'CodeType', 'ComplexType', > 'DictProxyType', 'DictType', 'DictionaryType', 'EllipsisType', > 'FileType', 'FloatType', 'FrameType', 'FunctionType', 'GeneratorType', > 'Instance > Type', 'IntType', 'LambdaType', 'ListType', 'LongType', 'MethodType', > 'ModuleType', 'NoneType', 'NotImplementedType', 'ObjectType', > 'SliceType', 'StringType', 'StringTypes', 'TracebackType', 'TupleType', > 'TypeType', 'UnboundMethodType', 'UnicodeType', 'XRan > geType', '__builtins__', '__doc__', '__file__', '__name__'] > [Dbg]>>> > Isn't the types module becoming superfluous? [packer at zombie ~]$ python -E Python 2.4 (#2, Feb 12 2005, 00:29:46) [GCC 3.4.3 (Mandrakelinux 10.2 3.4.3-3mdk)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> long >>> int >>> str >>> dict >>> Gerard From oliphant.travis at ieee.org Wed Feb 1 09:31:08 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Wed Feb 1 09:31:08 2006 Subject: [Numpy-discussion] numpy.dtype problem In-Reply-To: <200602011736.53745.faltet@carabos.com> References: <43E0C9A8.6080200@stsci.edu> <200602011736.53745.faltet@carabos.com> Message-ID: <43E0F037.9030503@ieee.org> Francesc Altet wrote: >A Dimecres 01 Febrer 2006 15:46, Christopher Hanley va escriure: > > >>The following seems to have stopped working: >> >> >>In [6]: import numpy >> >>In [7]: a = numpy.ones((3,3),dtype=numpy.int32) >> >>In [8]: a.dtype.name >>--------------------------------------------------------------------------- >>exceptions.MemoryError Traceback (most >>recent call last) >> >>/data/sparty1/dev/devCode/ >> >>MemoryError: >> >> > >Below is a patch for this. It seems to me that Travis is introducing >new *scalar data types. I'm not sure if they should appear in this >case, but perhaps he can throw some light on this. > > No, I'm not introducing anything new. I just changed the name of the scalar type objects. They used to be conveying type information for the array (but that is now handled by the dtype objects), and so I changed the name of the scalars from _arrtype to scalar to better convey what they are. The code in the name attribute getter was expecting an underscore which isn't there anymore. -Travis From oliphant at ee.byu.edu Wed Feb 1 11:37:09 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed Feb 1 11:37:09 2006 Subject: [Numpy-discussion] Re: NumPy Behavior In-Reply-To: <1138822085.2972.38.camel@d-128-95-235-238.dhcp4.washington.edu> References: <1138759409.5372.14.camel@d-128-95-235-238.dhcp4.washington.edu> <43E0328F.5000400@ieee.org> <1138768894.2965.10.camel@zen> <43E05EC6.9090302@ieee.org> <1138779155.4596.20.camel@zen> <43E0676A.8050401@ieee.org> <1138822085.2972.38.camel@d-128-95-235-238.dhcp4.washington.edu> Message-ID: <43E10DC8.4040002@ee.byu.edu> Jay Painter wrote: >Travis, > >I updated to the latest svn code this morning and ran some mmlib tests. >Your commit has fixed the segmentation fault/illegal instruction error, >but the mmlib test suite fails. I'll look into it tonight, it may just >be intentional differences between Numeric and NumPy. I'll let you know >what I find. > > Be sure to look over the list of differences in the sample chapter of my book (available at numeric.scipy.org) The numpy.lib.convertcode module can be used to make most of the changes (but there may be a few it misses). 
>As I've been working with Numeric, I have often desired some particular
>features which I'd be willing to work on with NumPy. Maybe these have
>come up before?
>
>1) Alternative implementations of matrixmultiply() and
>eigenvalues()/eigenvectors() for symmetric matrices. For example, there
>is an analytic expression for the eigenvalues of a 3x3 symmetric matrix.
>
>2) New C implemented vectorM and matrixMN classes which support the
>array interface. This could allow for lower memory usage via pool
>allocations and the customized implementations in item #1. The ones I
>wish were there are:
>
>class vector3:
>class vector4:
>class matrix33:
>class matrix44:
>class symmetric_matrix33:
>class symmetric_matrix44:
>
>Given this, here's a useful function for graphics applications:
>
>matrixmultiply431(type matrix44, type vector3)
>
>This function multiplies the 4x4 matrix by the three dimensional vector
>by implicitly adding a fourth element with a value of 1.0 to the vector.
>

This is actually a benefit of the array interface. It allows many
different objects to *be* arrays and allows fast conversions when
possible.

Specialized small-arrays are a good idea, I think, just like
specialized (sparse) large arrays. Perhaps it would make sense to
define a base-class array object that has only a very few things
defined like the number of dimensions, a pointer to the actual memory,
the flags, and perhaps a pointer to the typeobject. This would leave
things like how the dimensions are stored up for sub-classes to define.

-Travis

From oliphant at ee.byu.edu Wed Feb 1 14:17:13 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Wed Feb 1 14:17:13 2006
Subject: [Numpy-discussion] Examples of new (nested) record-types
Message-ID: <43E13345.9020701@ee.byu.edu>

I've just checked in some tests for the nested-record support in
numpy. These tests were written by Francesc Altet and are very useful
(they helped track down at least two reference-counting errors).

But, a big utility they have is to show a method for defining and
constructing arrays of nested records. Anybody wanting to figure out
how to use that facility in NumPy better would benefit by looking at
the code in /numpy/core/tests/test_numerictypes.py in the SVN version
of NumPy.

-Travis

P.S. Here is an example of the kind of structure he makes arrays of in
this file...

# This is the structure of the table used for nested objects (DON'T PANIC!):
#
# +-+---------------------------------+-----+----------+-+-+
# |x|Info                             |color|info      |y|z|
# | +-----+--+----------------+----+--+     +----+-----+ | |
# | |value|y2|Info2           |name|z2|     |Name|Value| | |
# | |     |  +----+-----+--+--+    |  |     |    |     | | |
# | |     |  |name|value|y3|z3|    |  |     |    |     | | |
# +-+-----+--+----+-----+--+--+----+--+-----+----+-----+-+-+
#

After defining an array of these guys you could get at an array of y3
fields using

a['Info']['Info2']['y3']

Or,

reca = a.view(recarray)
reca.Info.Info2.y3

-Travis
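[For illustration: a much smaller nested record in the same spirit --
two levels instead of three, with made-up field names; the
list-of-tuples descriptor syntax is the one numpy uses.]

import numpy

# A record with a plain field 'x' and a nested record 'Info'.
dt = numpy.dtype([('x', 'i4'),
                  ('Info', [('value', 'f8'),
                            ('name', 'S10')])])

a = numpy.zeros(3, dtype=dt)
print a['Info']['value']    # the array of nested 'value' fields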
Williams" wrote: > >[ depration + style ] > > > >>[Dbg]>>> import types >>[Dbg]>>> dir(types) >>['BooleanType', 'BufferType', 'BuiltinFunctionType', >>'BuiltinMethodType', 'ClassType', 'CodeType', 'ComplexType', >>'DictProxyType', 'DictType', 'DictionaryType', 'EllipsisType', >>'FileType', 'FloatType', 'FrameType', 'FunctionType', 'GeneratorType', >>'Instance >>Type', 'IntType', 'LambdaType', 'ListType', 'LongType', 'MethodType', >>'ModuleType', 'NoneType', 'NotImplementedType', 'ObjectType', >>'SliceType', 'StringType', 'StringTypes', 'TracebackType', 'TupleType', >>'TypeType', 'UnboundMethodType', 'UnicodeType', 'XRan >>geType', '__builtins__', '__doc__', '__file__', '__name__'] >>[Dbg]>>> >> >> >> > >Isn't the types module becoming superfluous? > > > That's the point I was trying to make. ArrayType is to ndarray as DictionaryType is to dict. My understanding is that the use of types.DictionaryType is discouraged. -Travis From chris at pseudogreen.org Wed Feb 1 14:40:13 2006 From: chris at pseudogreen.org (Christopher Stawarz) Date: Wed Feb 1 14:40:13 2006 Subject: [Numpy-discussion] Patch for scalartypes.inc.src Message-ID: <6e9a368d2cc315a6b218e210b361afdd@pseudogreen.org> I ran into a couple bugs in scalartypes.inc.src: - The switch statement in PyArray_ScalarAsCtype was missing some breaks. - When SIZEOF_LONGDOUBLE == SIZEOF_DOUBLE, PREC_REPR and PREC_STR should both be 17, not 15. (This matches what's done in floatobject.c for Python's float.) The attached patch (against SVN revision 2045) fixes both problems. Cheers, Chris -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: scalartypes_patch.txt URL: From agn at noc.soton.ac.uk Wed Feb 1 15:04:01 2006 From: agn at noc.soton.ac.uk (George Nurser) Date: Wed Feb 1 15:04:01 2006 Subject: [Numpy-discussion] numpy with ACML In-Reply-To: <200601282102.53281.luszczek@cs.utk.edu> References: <4A52806B-348E-4138-92B8-1D3F50E1D39B@noc.soton.ac.uk> <200601282102.53281.luszczek@cs.utk.edu> Message-ID: <7577860F-0907-4EA0-9A61-3EABE50A90F4@noc.soton.ac.uk> > There is code for that on netlib: > http://www.netlib.org/blas/blast-forum/cblas.tgz > > I used it myself for my C code before and it worked just fine. > > Piotr Piotr, Thanks. I got numpy to work using the cblas & acml. Details at the bottom of the email. I then ran the bench.py tests on numpy [1 processor Opteron ?1.8 GHZ] and got slightly unexpected answers: numpy times given both linked to cblas+acml and not linked. 
Neither of numarray, Numeric linked to any blas:

python bench.py

Tests    x.T*y    x*y.T    A*x      A*B       A.T*x    half     2in2

Dimension: 5
Array    0.5700   0.1600   0.1200   0.1600    0.6200   0.4300   0.4800  --acml+cblas
Matrix   3.1000   0.9300   0.4000   0.4600    0.6500   1.7000   2.6200  --acml+cblas
Array    0.6400   0.1700   0.1500   0.1800    0.6100   0.3600   0.4000
Matrix   3.2300   0.6900   0.4100   0.4600    0.6700   1.4900   2.3400
NumArr   1.2100   2.8500   0.2700   2.8600    5.0000   4.1100   6.8300
Numeri   0.7300   0.1800   0.1600   0.2000    0.4100   0.3300   0.4300

Dimension: 50
Array    5.9200   0.8400   0.2900   6.9300    8.0900   2.3600   2.4500  --acml+cblas
Matrix   30.5500  1.8500   0.6000   7.4500    0.9300   3.7100   4.6400  --acml+cblas
Array    6.5900   2.7100   0.7500   25.3100   8.5000   0.5600   0.6100
Matrix   32.5200  3.2600   1.0200   25.6100   1.2900   1.7400   2.5900
NumArr   12.6600  3.9700   0.7400   27.7900   6.4900   4.5500   7.1900
Numeri   7.9700   1.5000   0.6500   24.2700   7.4200   0.6000   2.3200

Dimension: 500
Array    0.9800   3.2900   0.6100   65.0000   10.8600  2.3100   2.5500  --acml+cblas
Matrix   3.5300   3.3500   0.6400   64.9300   0.6500   2.3300   2.6100  --acml+cblas
Array    1.0900   4.5600   0.8300   589.0000  11.0700  0.1300   0.2600
Matrix   3.7000   4.5800   0.8400   593.7300  1.1700   0.1300   0.3200
NumArr   1.6700   3.3100   0.7700   417.5600  4.3900   0.8500   1.1000
Numeri   1.1900   3.5200   0.7800   559.8100  9.7400   0.8000   2.4100

-- acml+blas indeed speeds up matrix multiplication by a factor of 10, but
-- doesn't really help vector dot products.
-- slows down searching operations half, 2in2 by a factor of 10.

Matrices are generally much slower than arrays, except for A.T*x, which
is ~10x faster for matrices.

I also tried with the goto blas library linked in with cblas. Similar
results, except slightly faster x.T*y. But trickier to get linked.

--George Nurser

------------------------------------------------------------------------

Making the cblas.a library was straightforward. I just changed the
flags in Makefile.LINUX to:

CFLAGS = -O3 -DADD_ -pthread -fno-strict-aliasing -m64 -msse2 -mfpmath=sse -march=opteron -fPIC
FFLAGS = -Wall -fno-second-underscore -fPIC -O3 -funroll-loops -march=opteron -mmmx -msse2 -msse -m3dnow
RANLIB = ranlib
BLLIB = where libacml.so lives/libacml.so

then link Makefile.LINUX to Makefile.in and make. The resulting cblas.a
must then be moved or linked to libcblas.a in the *same* directory as
the libacml.so. This directory then needs to be added to the
$LD_LIBRARY_PATH if it is not a standard one.

I needed a site.cfg in numpy/numpy/distutils/site.cfg as follows:

[blas]
blas_libs = cblas, acml
library_dirs = where libacml.so lives
include_dirs = where cblas.h lives

[lapack]
language = f77
lapack_libs = acml
library_dirs = where libacml.so lives
include_dirs = where acml *.h live

Then numpy and scipy both seem to build fine. numpy passes
t=numpy.test(), scipy passes scipy.test(level=10).

From oliphant at ee.byu.edu Wed Feb 1 15:46:03 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Wed Feb 1 15:46:03 2006
Subject: [Numpy-discussion] NumPy SVN not bulding....
Message-ID: <43E1480D.8040503@ee.byu.edu>

After changeset 2046, I'm not able to build NumPy. This is what I'm
getting... Revision 2045 works fine.
##### msg: Extension instance has no attribute '__getitem__'
Extension instance has no attribute '__getitem__'
  FOUND:
    libraries = ['ptf77blas', 'ptcblas', 'atlas']
    library_dirs = ['/usr/lib/atlas']
    language = c
    define_macros = [('NO_ATLAS_INFO', 2)]
    include_dirs = ['/usr/include/atlas']

Warning: distutils distribution has been initialized, it may be too
late to add an extension _dotblas

Traceback (most recent call last):
  File "setup.py", line 76, in ?
    setup_package()
  File "setup.py", line 63, in setup_package
    config.add_subpackage('numpy')
  File "/home/oliphant/numpy/numpy/distutils/misc_util.py", line 543, in add_subpackage
    config = self.get_subpackage(subpackage_name,subpackage_path)
  File "/home/oliphant/numpy/numpy/distutils/misc_util.py", line 533, in get_subpackage
    config = setup_module.configuration(*args)
  File "/home/oliphant/numpy/numpy/setup.py", line 10, in configuration
    config.add_subpackage('core')
  File "/home/oliphant/numpy/numpy/distutils/misc_util.py", line 543, in add_subpackage
    config = self.get_subpackage(subpackage_name,subpackage_path)
  File "/home/oliphant/numpy/numpy/distutils/misc_util.py", line 533, in get_subpackage
    config = setup_module.configuration(*args)
  File "numpy/core/setup.py", line 207, in configuration
    config.add_data_dir('tests')
  File "/home/oliphant/numpy/numpy/distutils/misc_util.py", line 594, in add_data_dir
    self.add_data_files((ds,filenames))
  File "/home/oliphant/numpy/numpy/distutils/misc_util.py", line 660, in add_data_files
    dist.data_files.extend(data_dict.items())
AttributeError: 'NoneType' object has no attribute 'extend'
> setup_package() > File "setup.py", line 63, in setup_package > config.add_subpackage('numpy') > File "/home/oliphant/numpy/numpy/distutils/misc_util.py", line 543, > in add_subpackage > config = self.get_subpackage(subpackage_name,subpackage_path) > File "/home/oliphant/numpy/numpy/distutils/misc_util.py", line 533, > in get_subpackage > config = setup_module.configuration(*args) > File "/home/oliphant/numpy/numpy/setup.py", line 10, in configuration > config.add_subpackage('core') > File "/home/oliphant/numpy/numpy/distutils/misc_util.py", line 543, > in add_subpackage > config = self.get_subpackage(subpackage_name,subpackage_path) > File "/home/oliphant/numpy/numpy/distutils/misc_util.py", line 533, > in get_subpackage > config = setup_module.configuration(*args) > File "numpy/core/setup.py", line 207, in configuration > config.add_data_dir('tests') > File "/home/oliphant/numpy/numpy/distutils/misc_util.py", line 594, > in add_data_dir > self.add_data_files((ds,filenames)) > File "/home/oliphant/numpy/numpy/distutils/misc_util.py", line 660, > in add_data_files > dist.data_files.extend(data_dict.items()) > AttributeError: 'NoneType' object has no attribute 'extend' -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From cookedm at physics.mcmaster.ca Wed Feb 1 16:43:02 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Wed Feb 1 16:43:02 2006 Subject: [Numpy-discussion] NumPy SVN not building.... In-Reply-To: (David M. Cooke's message of "Wed, 01 Feb 2006 18:53:25 -0500") References: <43E1480D.8040503@ee.byu.edu> Message-ID: cookedm at physics.mcmaster.ca (David M. Cooke) writes: > Travis Oliphant writes: > >> After changeset 2046, I'm not able to build NumPy. > > Obviously my fault then. I'll poke at it. > > [dave learns yet again to test before committing...] Ok, 2048 fixes it. (Used a wrong variable name when refactoring) -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From oliphant.travis at ieee.org Thu Feb 2 06:30:12 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Feb 2 06:30:12 2006 Subject: [Numpy-discussion] Re: [SciPy-user] scipy 0.4.6 release? In-Reply-To: <43E1F355.1050001@ftw.at> References: <43DA1A6F.3050103@hoc.net> <1138380174.43da4d8ed9ae6@webmail.colorado.edu> <43E1A561.6000103@dslextreme.com> <43E1F355.1050001@ftw.at> Message-ID: <43E21738.4060105@ieee.org> Ed Schofield wrote: >Erick Tryzelaar wrote: > > > >>Any chance we could get a minor version bump to fix this and the >>dtypechar/dtype.char bug in Lib/weave/standard_array_spec.py (both >>already fixed in svn)? These two changes get my weave test code and then >>I can release my darwinports package. Thanks, >> >> >> >> >I think this is a good idea.
The most recent release (0.4.4) also isn't >compatible with the latest NumPy (0.9.4). I could work on making a new >release this weekend if people agree. > > I'll roll out NumPy 0.9.5 at the same time so we have two versions that work together. There have been some bug-fixes and a few (minor) feature changes. But, I am running out of numbers for 1.0 release :-) -Travis From schofield at ftw.at Thu Feb 2 07:39:15 2006 From: schofield at ftw.at (Ed Schofield) Date: Thu Feb 2 07:39:15 2006 Subject: [Numpy-discussion] Re: [SciPy-user] scipy 0.4.6 release? In-Reply-To: <43E21738.4060105@ieee.org> References: <43DA1A6F.3050103@hoc.net> <1138380174.43da4d8ed9ae6@webmail.colorado.edu> <43E1A561.6000103@dslextreme.com> <43E1F355.1050001@ftw.at> <43E21738.4060105@ieee.org> Message-ID: <43E2275F.3050705@ftw.at> Travis Oliphant wrote: > Ed Schofield wrote: > >> I think this is a good idea. The most recent release (0.4.4) also isn't >> compatible with the latest NumPy (0.9.4). I could work on making a new >> release this weekend if people agree. > > I'll roll out NumPy 0.9.5 at the same time so we have two versions > that work together. There have been some bug-fixes and a few (minor) > feature changes. But, I am running out of numbers for 1.0 release :-) That sounds good :) How about a stream of 1.0 release candidates for Numpy, starting with 1.0-rc1? For what it's worth, I think we should exercise some patience and caution before releasing a 1.0 version of NumPy, because this is likely to signify an API freeze. The recent dtype changes are a case in point -- the API is cleaner now, but the change required many small changes in SciPy. SciPy is lucky to have helpful developers close to NumPy too, but some other projects won't be able to respond as quickly to compatibility-breaking improvements. Some things I have in mind: stronger type-checking for unsafe casts, and ensuring operations on matrices return matrices ... ;) -- Ed From svetosch at gmx.net Thu Feb 2 08:07:03 2006 From: svetosch at gmx.net (Sven Schreiber) Date: Thu Feb 2 08:07:03 2006 Subject: [Numpy-discussion] Re: [SciPy-user] scipy 0.4.6 release? In-Reply-To: <43E2275F.3050705@ftw.at> References: <43DA1A6F.3050103@hoc.net> <1138380174.43da4d8ed9ae6@webmail.colorado.edu> <43E1A561.6000103@dslextreme.com> <43E1F355.1050001@ftw.at> <43E21738.4060105@ieee.org> <43E2275F.3050705@ftw.at> Message-ID: <43E22E01.3000604@gmx.net> Ed Schofield schrieb: > ensuring operations on matrices return matrices ... ;) > Yes please! I'm so glad to have you on my side... -Sven
From faltet at carabos.com Thu Feb 2 10:20:03 2006 From: faltet at carabos.com (Francesc Altet) Date: Thu Feb 2 10:20:03 2006 Subject: [Numpy-discussion] Beta support for Numpy in PyTables Message-ID: <200602021919.31191.faltet@carabos.com> Hi, As some of you know (and are impatiently waiting for ;-) I'm in the process of giving support to numpy in PyTables, so that the users can transparently make use of numpy objects (or Numeric or numarray). Well, I'm glad to say that the process is almost done (bar some issues with character array support and unicode arrays). Fortunately, thanks to the provision of the array interface, saving and reading numpy objects has nearly the same performance as using numarray objects (which, as you know, are in the core of PyTables). So, if you want to have a try at the new PyTables, you can download this preliminary version of it from: http://pytables.carabos.com/download/preliminary/pytables-1.3beta1.tar.gz I'm attaching some examples so that you can see how to use numpy in combination with PyTables, by simply specifying the correct flavor in the data type definition for Table and EArray (the same goes for CArray and VLArray, although no examples are provided here) objects. Also, as I have already said on other occasions, when numpy stabilizes enough, we are planning to make numpy the core data container for PyTables. Meanwhile, please test the PyTables/numpy combination and report any error or glitch that you may notice, so that it can get as stable as possible in order to ease the transition numarray-->numpy. Cheers, -- >0,0< Francesc Altet http://www.carabos.com/ V V Cárabos Coop. V. Enjoy Data "-" -------------- next part -------------- A non-text attachment was scrubbed... Name: array1-numpy.py Type: application/x-python Size: 1200 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: earray1-numpy.py Type: application/x-python Size: 490 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: table1-numpy.py Type: application/x-python Size: 1218 bytes Desc: not available URL: From mfmorss at aep.com Thu Feb 2 12:00:06 2006 From: mfmorss at aep.com (mfmorss at aep.com) Date: Thu Feb 2 12:00:06 2006 Subject: [Numpy-discussion] Beta support for Numpy in PyTables In-Reply-To: <200602021919.31191.faltet@carabos.com> Message-ID: This is good news.
We're planning, tentatively, to implement a big project in Python that would make heavy use both of Numpy and Pytables. We're also waiting, however, for Numpy to stabilize. It's a little disconcerting to see how turbulent it is right now. Mark F. Morss Principal Analyst, Market Risk American Electric Power

Francesc Altet (sent by numpy-discussion-admin at lists.sourceforge.net) wrote to pytables-users at lists.sourceforge.net, cc numpy-discussion at lists.sourceforge.net, on 02/02/2006 01:19 PM, Subject: [Numpy-discussion] Beta support for Numpy in PyTables: [...] [attachments array1-numpy.py, earray1-numpy.py and table1-numpy.py deleted by Mark F Morss/OR3/AEPIN]

From oliphant.travis at ieee.org Thu Feb 2 12:38:02 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Feb 2 12:38:02 2006 Subject: [Numpy-discussion] Beta support for Numpy in PyTables In-Reply-To: References: Message-ID: <43E26D5F.7030404@ieee.org> mfmorss at aep.com wrote: >This is good news. We're planning, tentatively, to implement a big project >in Python that would make heavy use both of Numpy and Pytables. We're also >waiting, however, for Numpy to stabilize. It's a little disconcerting >to see how turbulent it is right now. > > The only way for it to stabilize is for people to start using it. So, dive in. The upheaval of the first of the year is behind us. I don't see any major changes in the works. The only possibility is a few new C-API calls to make scalar math work more easily. -Travis
From ndarray at mac.com Thu Feb 2 15:54:16 2006 From: ndarray at mac.com (Sasha) Date: Thu Feb 2 15:54:16 2006 Subject: [Numpy-discussion] Learning "strides" Message-ID: I don't know if this came from numarray or not, but for me, as someone transitioning from Numeric, the "strides" attribute of an ndarray is a new feature. I've spent some time playing with it and there are some properties that I dislike. Some of these undesired properties are probably bugs and easy to fix, but others require some discussion.

1. Negative strides:

>>> x = zeros(5)
>>> x.strides = (-4,)
>>> x
array([ 0, 25, 0, -136009696, -136009536])

Looks like a bug. PyArray_CheckStrides only checks for one end of the buffer. It is easy to fix that by disallowing negative strides, but I think that would be wrong. In my view, the right solution is to pass the offset to PyArray_CheckStrides and check for both ends of the buffer. The latter will change the C-API.

2. Zero strides:

>>> x = arange(5)
>>> x.strides = 0
>>> x
array([0, 0, 0, 0, 0])
>>> x += 1
>>> x
array([5, 5, 5, 5, 5])

These are somewhat puzzling properties unless you know the internals. I believe ndarrays with 0s in strides are quite useful and will follow up with a description of the properties I would expect from them.

3. "Fractional" strides: I call strides "fractional" when they are not a multiple of "itemsize".

>>> x = arange(5)
>>> x.strides = 3
>>> x
array([ 0, 256, 131072, 50331648, 3])

I think these should be disallowed. It is just too easy to forget that strides are given in bytes, not in elements.
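To see how easy the trap is, compare the intended "every other element" view with what byte-based strides actually give you (a sketch, assuming 4-byte integers):

>>> x = arange(10)
>>> ndarray([5], dtype=x.dtype, buffer=x, strides=[2*x.itemsize])
array([0, 2, 4, 6, 8])

The "obvious" strides=[2], by contrast, is accepted without complaint and produces a misaligned view whose contents are byte-order-dependent garbage.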
Ideally, rather than checking for strides[i] % itemsize, I would just make strides[i] be expressed in number of elements, not in bytes. This can be done without changing the way strides are stored internally - just multiply by itemsize in set_strides and divide in get_strides. If the strides attribute was not introduced before numpy, this change should not cause any compatibility problems. If it has some history of use, it may be possible to deprecate "strides" (with a deprecation warning) and introduce a different attribute, say "steps", that will be expressed in number of elements. From oliphant.travis at ieee.org Thu Feb 2 16:45:01 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Feb 2 16:45:01 2006 Subject: [Numpy-discussion] Learning "strides" In-Reply-To: References: Message-ID: <43E2A751.1060807@ieee.org> Sasha wrote: >I don't know if this came from numarray or not, but for me, as someone >transitioning from Numeric, the "strides" attribute of an ndarray is >a new feature. I've spent some time playing with it and there are >some properties that I dislike. Some of these undesired properties >are probably bugs and easy to fix, but others require some discussion. > > Of course strides have always been there, they've just never been visible from Python. Allowing the user to set the strides may not be a good idea. It was done largely so that the code that deals with misaligned data could be tested. However, it also allows you a lot of flexibility for interacting with arbitrary data-buffers that might be useful, so I'm inclined to allow it if the possible problems can be fixed. Users that set strides will have to know what they are doing, of course. The average user wouldn't bother with it. >1. Negative strides: > > > >>>>x = zeros(5) >>>>x.strides= (-4,) >>>>x >>>> >>>> >array([ 0, 25, 0, -136009696, -136009536]) > >Looks like a bug. PyArray_CheckStrides only checks for one end of the >buffer. > Right. PyArray_CheckStrides needs to be better or we can't allow negative strides. > > >3. "Fractional" strides: >I call strides "fractional" when they are not a multiple of "itemsize". > > In dealing with an arbitrary data-buffer, I could see this as being useful, so I'm not sure if disallowing it is a good idea. Again, setting strides is not something that should be done by the average user so I'm not as concerned about "forgetting" the units strides are in. If a user is going to be setting strides you have to assume they are being careful. A separate attribute called steps that uses element-sizes instead of byte-sizes is a possible idea. -Travis From ndarray at mac.com Thu Feb 2 17:03:07 2006 From: ndarray at mac.com (Sasha) Date: Thu Feb 2 17:03:07 2006 Subject: [Numpy-discussion] Zeros in strides Message-ID: As I explained in my previous post, numpy allows zeros in the "strides" tuple, but arrays with such strides have unexpected properties. In this post I will try to explain why arrays with zeros in strides are desirable and what properties they should have. A rank-1 array with strides=0 behaves almost like a scalar; in fact, scalar arithmetic is currently implemented by setting the stride to 0 in generic umath loops. Like a scalar, a rank-1 array with stride=0 only needs a buffer of size 1*itemsize, but currently numpy does not allow creation of rank-1 arrays with a buffer smaller than size*itemsize:

>>> ndarray([5], strides=[0], buffer=array([1]))
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: buffer is too small for requested array

An array with 0 stride is a better stand-in for x + zeros(n) than a scalar or rank-0 x, because an array with zero stride knows its size. (With the current umath implementation, adding two arrays with stride=0 would still require n operations, but this would probably not be the case if BLAS is used instead of a generic loop.) I propose to make a few changes to the way zeros in strides are handled. This looks like undocumented territory, so I don't think there are any compatibility issues.

1. Change the buffer size requirements so that dimensions with zero stride count as size=1.

2. Use strides provided to the ndarray even when buffer is not provided. Currently they are silently ignored:

>>> ndarray([5], strides=[0]).strides
(4,)

3. Fix augmented assignment operators. Currently:

>>> x = zeros(5)
>>> x.strides = 0
>>> x += 1
>>> x
array([5, 5, 5, 5, 5])
>>> x += arange(5)
>>> x
array([15, 15, 15, 15, 15])

Desired:

>>> x = zeros(5)
>>> x.strides = 0
>>> x += 1
>>> x
array([1, 1, 1, 1, 1])
>>> x += arange(5)
>>> x
array([1, 2, 3, 4, 5])

This will probably require proper handling of the stride=0 case in the output arguments of ufuncs in general, so this may be harder to get right than the first two proposals.

4. Introduce xzeros and xones functions that will create stride=0 arrays as a super-fast alternative to zeros and ones.

From oliphant.travis at ieee.org Thu Feb 2 17:39:02 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Feb 2 17:39:02 2006 Subject: [Numpy-discussion] Zeros in strides In-Reply-To: References: Message-ID: <43E2B420.1060801@ieee.org> Sasha wrote: >A rank-1 array with strides=0 behaves almost like a scalar; in fact, >scalar arithmetic is currently implemented by setting the stride to 0 in >generic umath loops. Like a scalar, a rank-1 array with stride=0 only >needs a buffer of size 1*itemsize, but currently numpy does not allow >creation of rank-1 arrays with a buffer smaller than size*itemsize: > > As you noted, broadcasting is actually done by setting strides equal to 0 in the affected dimensions. The changes you describe, however, require serious thought with C-level explanations because you will be changing some fundamental assumptions that are made throughout the code. For example, currently there is no way you can construct new memory for an array and have different strides assigned (that's why strides is ignored if no buffer is given). You would have to change the behavior of the C-level function PyArray_NewFromDescr. You need to propose how exactly you would change that. Checking for strides that won't cause later segfaults can be tricky, especially if you start allowing buffer-sizes to be different from array dimensions. How do you propose to ensure that you won't walk outside of allocated memory when somebody changes the strides later? I'm concerned that your proposal has too many potential pitfalls. At least you haven't addressed them sufficiently. My current inclination is to simply disallow setting the strides attribute now that the misaligned segments of code have been tested. -Travis From ndarray at mac.com Thu Feb 2 18:00:01 2006 From: ndarray at mac.com (Sasha) Date: Thu Feb 2 18:00:01 2006 Subject: [Numpy-discussion] Learning "strides" In-Reply-To: <43E2A751.1060807@ieee.org> References: <43E2A751.1060807@ieee.org> Message-ID: On 2/2/06, Travis Oliphant wrote: > Sasha wrote: > > > Of course strides have always been there, they've just never been > visible from Python.
I know that strides were always part of the C-API, but I don't know if they were exposed to Python in numarray. If they were, there is probably some history of use. Can someone confirm or deny that? > Allowing the user to set the strides may not be a good idea. It was > done largely so that the code that deals with misaligned data could be > tested. Presently the settable strides attribute does not feel like an "experts only" feature. (You've documented it in your book!) > However, it also allows you a lot of flexibility for > interacting with arbitrary data-buffers that might be useful, so I'm > inclined to allow it if the possible problems can be fixed. > This is a great feature and I can see it being used to explain ndarrays to novices. I don't think it should be regarded as "for experts only." > > > >Looks like a bug. PyArray_CheckStrides only checks for one end of the > >buffer. > > > Right. PyArray_CheckStrides needs to be better or we can't allow > negative strides. > Please let me know if you plan to change PyArray_CheckStrides so that we don't duplicate effort. > >3. "Fractional" strides: > >I call strides "fractional" when they are not a multiple of "itemsize". > In dealing with an arbitrary data-buffer, I could see this as being > useful, so I'm not sure if disallowing it is a good idea. Can you suggest a use-case? I cannot think of anything that cannot be handled using a record-array view of the buffer. > Again, > setting strides is not something that should be done by the average user > so I'm not as concerned about "forgetting" the units strides are in. > If a user is going to be setting strides you have to assume they are > being careful. > The problem is that many people (including myself) think that they know what strides are when they come to numpy because they used strides in other libraries (e.g. BLAS). Most people expect element-based strides. A footnote in your book "Our definition of stride here is an element-based stride, while the strides attribute returns a byte-based stride." also suggests that element-based strides are more natural. > A separate attribute called steps that uses element-sizes instead of > byte-sizes is a possible idea. Assuming the strides attribute is not used except for testing, would you object to renaming the current byte-based strides to "byte_strides" and implementing element-based "strides"? I would even suggest "_byte_strides" as a clearly "don't use it unless you know what you are doing" name. From ndarray at mac.com Thu Feb 2 18:17:05 2006 From: ndarray at mac.com (Sasha) Date: Thu Feb 2 18:17:05 2006 Subject: [Numpy-discussion] Zeros in strides In-Reply-To: <43E2B420.1060801@ieee.org> References: <43E2B420.1060801@ieee.org> Message-ID: On 2/2/06, Travis Oliphant wrote: > The changes you describe, however, require serious thought with C-level > explanations because you will be changing some fundamental assumptions > that are made throughout the code. > I agree, but I would like to discuss this at the conceptual level first and maybe hear from people not intimately familiar with the C code about what they would expect from a zero stride. > For example, currently there is no way you can construct new memory for > an array and have different strides assigned (that's why strides is > ignored if no buffer is given). You would have to change the behavior > of the C-level function PyArray_NewFromDescr. You need to propose how > exactly you would change that. > Sure. I've started working on a "proof of concept" patch and will post it soon.
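To preview it: the core of the patch is a helper that computes how large a buffer the requested dims/strides combination actually touches, instead of assuming product(dims)*itemsize. A Python sketch of the intended C logic (the names are mine, not a final API; non-negative strides assumed):

def buffer_size_needed(dims, strides, itemsize):
    # Each axis reaches at most (dims[i] - 1)*strides[i] bytes past the
    # start of the buffer, plus itemsize for the element itself.  An axis
    # with stride 0 therefore contributes only itemsize, however large
    # dims[i] is.
    sizes = [(d - 1)*s + itemsize for d, s in zip(dims, strides) if d > 0]
    return max(sizes) if sizes else 0

With this rule a rank-1, stride-0 array of any length needs only itemsize bytes.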
> Checking for strides that won't cause later segfaults can be tricky, > especially if you start allowing buffer-sizes to be different from array > dimensions. How do you propose to ensure that you won't walk outside > of allocated memory when somebody changes the strides later? > I think PyArray_CheckStrides would catch that, but I will have to test that once I have some code ready. > I'm concerned that your proposal has too many potential pitfalls. At > least you haven't addressed them sufficiently. My current inclination > is to simply disallow setting the strides attribute now that the > misaligned segments of code have been tested. That would be an unfortunate result of my post :-( I would suggest just to disallow zero strides in PyArray_CheckStrides until I can convince you that they are not that dangerous. From oliphant.travis at ieee.org Thu Feb 2 18:51:12 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Feb 2 18:51:12 2006 Subject: [Numpy-discussion] Zeros in strides In-Reply-To: References: <43E2B420.1060801@ieee.org> Message-ID: <43E2C50D.3090103@ieee.org> Sasha wrote: >Sure. I've started working on a "proof of concept" patch and will post it soon. > > > Great. >>I'm concerned that your proposal has too many potential pitfalls. At >>least you haven't addressed them sufficiently. My current inclination >>is to simply disallow setting the strides attribute now that the >>misaligned segments of code have been tested. >> >> > >That would be an unfortunate result of my post :-( I would suggest >just to disallow zero strides in PyArray_CheckStrides until I can >convince you that they are not that dangerous. > > Inclinations to... and actual plans to... are quite different things :-) So, I'm waiting and seeing. You may be on to something. Let's see what others think and what you really have in mind. -Travis From oliphant.travis at ieee.org Thu Feb 2 19:03:08 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Feb 2 19:03:08 2006 Subject: [Numpy-discussion] Learning "strides" In-Reply-To: References: <43E2A751.1060807@ieee.org> Message-ID: <43E2C7BB.8010808@ieee.org> Sasha wrote: >On 2/2/06, Travis Oliphant wrote: > > >Please let me know if you plan to change PyArray_CheckStrides so that >we don't duplicate effort. > > I won't do anything with it in the near future. >Can you suggest a use-case? I cannot think of anything that cannot be >handled using a record-array view of the buffer. > > Here's the issue. With records it is quite easy to generate strides that are not integer multiples of the data. For example, a record [('field1', 'f8'),('field2', 'i2')] data-type would have floating point data separated by 10 bytes. When you get a view of field1 (by getting that attribute) you would get such "misaligned" data. Look at the following:

temp = array([(1.8,2),(1.7,3)],dtype='f8,i2')
temp['f1'].strides
(10,)

How would you represent that in the element-based strides report? So, fractional strides are actually fundamental to the ability to have record arrays. >The problem is that many people (including myself) think that they >know what strides are when they come to numpy because they used >strides in other libraries (e.g. BLAS). > > >Most people expect element-based strides. A footnote in your book >"Our definition of stride here is an element-based stride, while the >strides attribute returns a byte-based stride." also suggests that >element-based strides are more natural.
It's easier to explain striding when you have contiguous chunks of memory of the same data-type, but record-arrays change that and require byte-based striding. >Assuming the strides attribute is not used except for testing, would you >object to renaming the current byte-based strides to "byte_strides" and >implementing element-based "strides"? > > I wouldn't have a problem with that, necessarily (though there is already an __array_strides__ attribute that is byte-based for the array interface --- except it returns None for C-style contiguous so we really don't need another attribute). The remaining issue is how will fractional strides be represented? -Travis From faltet at carabos.com Fri Feb 3 08:43:43 2006 From: faltet at carabos.com (Francesc Altet) Date: Fri Feb 3 08:43:43 2006 Subject: [Numpy-discussion] Beta support for Numpy in PyTables In-Reply-To: <43E3639C.7090708@web.de> References: <200602021919.31191.faltet@carabos.com> <43E3639C.7090708@web.de> Message-ID: <200602031742.35235.faltet@carabos.com> On Friday 03 February 2006 15:07, N. Volbers wrote: > I tried to install the beta and discovered that it is not possible to > build w/o numarray. So is numpy just optional and numarray a requirement > or will it be possible to build pytables only with numpy support ? No, numarray is still a *requirement* for compiling PyTables; NumPy and Numeric are *not needed* at all for compilation. However, if they are present (I mean, at run-time, not at compile-time), they can be used both to provide input data to be written to disk and to get output data read from disk. You can even have different objects with different flavors (currently "numarray", "numpy", "numeric" or "python") in the same PyTables file, so that you can retrieve different objects (numarray, Numpy, Numeric or pure Python) in the same session depending on its flavor (but of course, this is not for the faint-hearted ;-). It is the magic of the array interface: http://numeric.scipy.org/array_interface.html that allows doing this in a very efficient manner. Cheers, -- >0,0< Francesc Altet http://www.carabos.com/ V V Cárabos Coop. V. Enjoy Data "-" From ndarray at mac.com Fri Feb 3 10:10:02 2006 From: ndarray at mac.com (Sasha) Date: Fri Feb 3 10:10:02 2006 Subject: [Numpy-discussion] Learning "strides" In-Reply-To: <6C7F52FD-C2A6-45D1-AB68-AE5D04C61BE5@local> References: <43E2A751.1060807@ieee.org> <43E2C7BB.8010808@ieee.org> <6C7F52FD-C2A6-45D1-AB68-AE5D04C61BE5@local> Message-ID: On Feb 2, 2006, at 10:02 PM, Travis Oliphant wrote: >> >> Please let me know if you plan to change PyArray_CheckStrides so that >> we don't duplicate effort. >> >> > I won't do anything with it in the near future. > Attached patch deals with negative strides and prohibits zero strides. I think we can agree that this is the right behavior while zero-stride semantics are being discussed. Since I am touching the C-API, I would like you to take a look before I commit. Also, I am not sure "self->data - new->data" is always the right way to compute the offset in array_strides_set.
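In doctest form, the behavior the patch is meant to allow mirrors the new tests (expected output, so treat this as a sketch until it is reviewed):

>>> from numpy import arange, ndarray
>>> x = arange(10)
>>> ndarray([4], buffer=x, offset=4*x.itemsize, strides=[-x.itemsize])
array([4, 3, 2, 1])

The same view with offset=3*x.itemsize should raise ValueError: for a negative stride the check conservatively requires offset + stride*dims[i] >= 0, i.e. the walk toward the start of the buffer must leave room for one whole element.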
-- sasha -------------- next part -------------- Index: numpy/core/include/numpy/arrayobject.h =================================================================== --- numpy/core/include/numpy/arrayobject.h (revision 2053) +++ numpy/core/include/numpy/arrayobject.h (working copy) @@ -74,7 +74,7 @@ #define PY_SUCCEED 1 /* Helpful to distinguish what is installed */ -#define NDARRAY_VERSION 0x00090403 +#define NDARRAY_VERSION 0x00090404 /* Some platforms don't define bool, long long, or long double. Handle that here. Index: numpy/core/src/arrayobject.c =================================================================== --- numpy/core/src/arrayobject.c (revision 2053) +++ numpy/core/src/arrayobject.c (working copy) @@ -3517,7 +3517,7 @@ */ /*OBJECT_API*/ static Bool -PyArray_CheckStrides(int elsize, int nd, intp numbytes, +PyArray_CheckStrides(int elsize, int nd, intp numbytes, intp offset, intp *dims, intp *newstrides) { int i; @@ -3526,7 +3526,17 @@ numbytes = PyArray_MultiplyList(dims, nd) * elsize; for (i=0; i numbytes) { + intp stride = newstrides[i]; + if (stride > 0) { + if (offset + stride*(dims[i]-1)+elsize > numbytes) { + return FALSE; + } + } + else if (stride < 0) { + if (offset + stride*dims[i] < 0) { + return FALSE; + } + } else { return FALSE; } } @@ -4064,10 +4074,8 @@ } } else { /* buffer given -- use it */ - buffer.len -= offset; - buffer.ptr += offset; if (dims.len == 1 && dims.ptr[0] == -1) { - dims.ptr[0] = buffer.len / itemsize; + dims.ptr[offset] = buffer.len / itemsize; } else if (buffer.len < itemsize* \ PyArray_MultiplyList(dims.ptr, dims.len)) { @@ -4084,7 +4092,7 @@ goto fail; } if (!PyArray_CheckStrides(itemsize, strides.len, - buffer.len, + buffer.len, offset, dims.ptr, strides.ptr)) { PyErr_SetString(PyExc_ValueError, "strides is incompatible "\ @@ -4104,7 +4112,7 @@ PyArray_NewFromDescr(subtype, descr, dims.len, dims.ptr, strides.ptr, - (char *)buffer.ptr, + offset + (char *)buffer.ptr, buffer.flags, NULL); if (ret == NULL) {descr=NULL; goto fail;} PyArray_UpdateFlags(ret, UPDATE_ALL_FLAGS); @@ -4222,7 +4230,8 @@ numbytes = PyArray_MultiplyList(new->dimensions, new->nd)*new->descr->elsize; - if (!PyArray_CheckStrides(self->descr->elsize, self->nd, numbytes, + if (!PyArray_CheckStrides(self->descr->elsize, self->nd, numbytes, + self->data - new->data, self->dimensions, newstrides.ptr)) { PyErr_SetString(PyExc_ValueError, "strides is not "\ "compatible with available memory"); Index: numpy/core/tests/test_multiarray.py =================================================================== --- numpy/core/tests/test_multiarray.py (revision 2053) +++ numpy/core/tests/test_multiarray.py (working copy) @@ -62,6 +62,34 @@ assert_equal(self.one.dtype.str[1], 'i') assert_equal(self.three.dtype.str[1], 'f') + def check_stridesattr(self): + x = self.one + def make_array(size, offset, strides): + return ndarray([size], buffer=x, + offset=offset*x.itemsize, + strides=strides*x.itemsize) + assert_equal(make_array(4, 4, -1), array([4, 3, 2, 1])) + self.failUnlessRaises(ValueError, make_array, 4, 4, -2) + self.failUnlessRaises(ValueError, make_array, 4, 3, -1) + self.failUnlessRaises(ValueError, make_array, 8, 3, 1) + self.failUnlessRaises(ValueError, make_array, 8, 3, 0) + + def check_set_stridesattr(self): + x = self.one + def make_array(size, offset, strides): + try: + r = ndarray([size], buffer=x, offset=offset*x.itemsize) + except: + pass + r.strides = strides=strides*x.itemsize + return r + assert_equal(make_array(4, 4, -1), array([4, 3, 2, 1])) + 
self.failUnlessRaises(ValueError, make_array, 4, 4, -2) + self.failUnlessRaises(ValueError, make_array, 4, 3, -1) + self.failUnlessRaises(ValueError, make_array, 8, 3, 1) + self.failUnlessRaises(ValueError, make_array, 8, 3, 0) + + class test_dtypedescr(ScipyTestCase): def check_construction(self): d1 = dtype('i4') From matthew.brett at gmail.com Fri Feb 3 10:36:16 2006 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri Feb 3 10:36:16 2006 Subject: [Numpy-discussion] Huge performance hit for NaNs with Intel P3, P4 Message-ID: <1e2af89e0602031009s1c36f178ne2941ca0678c8f9f@mail.gmail.com> Hi, This is just to flag up a problem I ran into for matlab, which is that Pentium 3s and 4s have very very slow standard math performance with NaN values - for example, adding to a NaN value on my machine is about 22 times slower than adding to a non-NaN value. This can become a very big problem with matrix multiplication if there are a significant number of NaNs. I explained the problem here, for matlab and the software I have been working with: http://www.mrc-cbu.cam.ac.uk/Imaging/Common/spm_intel_tune.shtml To illustrate, I've attached a timing script, running on current svn numpy linked with a standard P4-optimized ATLAS library. It (dot) multiplies a 200x200 array of ones by a) another 200x200 array of ones and b) a 200x200 array of NaNs:

ones * ones: 0.017460
ones * NaNs: 2.323742
proportion: 133.090452

Happily, for the Pentium 4, you can solve the problem by forcing the chip to do floating point math with the SSE instructions, which do not have this NaN penalty. So, the solution was only to recompile the ATLAS libraries with extra gcc flags forcing the use of SSE math (see the page above) - or use the Intel Math Kernel libraries, which appear to have already used this trick. Here's output from numpy linked to the recompiled ATLAS libraries:

ones * ones: 0.026638
ones * NaNs: 0.023987
proportion: 0.900473

I wonder if it would be worth considering distributing the recompiled libraries by default in any binary releases? Or including a test like this one in the benchmarks to warn users about this problem? Best, Matthew -------------- next part -------------- A non-text attachment was scrubbed... Name: nan_timer.py Type: text/x-python Size: 360 bytes Desc: not available URL: From mithrandir42 at web.de Fri Feb 3 13:06:08 2006 From: mithrandir42 at web.de (N. Volbers) Date: Fri Feb 3 13:06:08 2006 Subject: [Numpy-discussion] retrieving type objects for void array-scalar objects Message-ID: <43E3C596.2010904@web.de> Hello everyone! I think I have finally understood the 'void array-scalar object', but now I need some help with the following. Assume I have an array, e.g.

>>> dtype = numpy.dtype({'names': ['name', 'weight'],'formats': ['U30', 'f4']})
>>> a = numpy.array([(u'Bill', 71.2), (u'Fred', 94.3)], dtype=dtype)

and this array is displayed in a graphical list. When the user modifies a value in the GUI, the value, which is a string, needs to be converted to the appropriate type, which in this example might either be a unicode string for the 'name' _or_ a float for the 'weight'. If the row already exists, I can get the type object easily:

>>> my_type = type(ds.array['weight'][0])

and using this type object, I can convert the string:

>>> value = my_type(user_value)

Is there some way to retrieve the type object directly from the array (not using any existing row) using only the name of the item?
I have checked the dtype attribute, but I could only get the character representation for the item types (e.g. 'f4'). Any help would be appreciated, Niklas Volbers. From ndarray at mac.com Fri Feb 3 13:43:01 2006 From: ndarray at mac.com (Sasha) Date: Fri Feb 3 13:43:01 2006 Subject: [Numpy-discussion] Zeros in strides In-Reply-To: <43E2C50D.3090103@ieee.org> References: <43E2B420.1060801@ieee.org> <43E2C50D.3090103@ieee.org> Message-ID: On 2/2/06, Travis Oliphant wrote: > Sasha wrote: > > >Sure. I've started working on a "proof of concept" patch and will post it soon. > > > Great. The attached patch allows numpy to create memory-saving zero-stride arrays. Here is a sample session:

>>> from numpy import *
>>> x = ndarray([5], strides=0)
>>> x
array([12998768, 12998768, 12998768, 12998768, 12998768])
>>> x[0] = 0
>>> x
array([0, 0, 0, 0, 0])
>>> x.strides = 4
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: strides is not compatible with available memory
>>> x.strides
(0,)
>>> x.data
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: cannot get single-segment buffer for discontiguous array
>>> exp(x)
array([ 1., 1., 1., 1., 1.])

# Only single-element buffer is required for zero-stride array:
>>> y = ones(1)
>>> z = ndarray([10], strides=0, buffer=y)
>>> z
array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1])

I probably missed some places where buffer size is computed as a product of dimensions, but it should not be hard to review the code for those if we agree that having zero-stride arrays is a good idea. Note that I did not attempt to change any behaviors; the only change is that zero-stride arrays do not use more memory than they need. -------------- next part -------------- Index: numpy/core/src/arrayobject.c =================================================================== --- numpy/core/src/arrayobject.c (revision 2055) +++ numpy/core/src/arrayobject.c (working copy) @@ -3517,8 +3517,7 @@ For axes with a positive stride this function checks for a walk beyond the right end of the buffer, for axes with a negative stride, - it checks for a walk beyond the left end of the buffer. Zero strides - are disallowed. + it checks for a walk beyond the left end of the buffer.
*/ /*OBJECT_API*/ static Bool @@ -3532,27 +3531,17 @@ for (i=0; i 0) { + if (stride >= 0) { /* The last stride does not need to be fully inside the buffer, only its first elsize bytes */ if (offset + stride*(dims[i]-1)+elsize > numbytes) { return FALSE; } } - else if (stride < 0) { + else { if (offset + stride*dims[i] < 0) { return FALSE; } - } else { - /* XXX: Zero strides may be useful, but currently - XXX: allowing them would lead to strange results, - XXX: for example : - XXX: >>> x = arange(5) - XXX: >>> x.strides = 0 - XXX: >>> x += 1 - XXX: >>> x - XXX: array([5, 5, 5, 5, 5]) */ - return FALSE; } } return TRUE; @@ -3602,6 +3591,33 @@ } return itemsize; } +/* computes the buffer size needed to accomodate dims and strides */ +static intp +_array_buffer_size(int nd, intp *dims, intp *strides, intp itemsize) +{ + intp bufsize = 0, size; + int i; + for (i = 0; i < nd; ++i) { + if (dims[i] < 0) { + PyErr_Format(PyExc_ValueError, + "negative dimension (%d) for axis %d", + dims[i], i); + return -1; + } + if (strides[i] < 0) { + PyErr_Format(PyExc_ValueError, + "negative stride (%d) for axis %d", + strides[i], i); + return -1; + } + if (dims[i] == 0) + continue; + size = (dims[i] - 1)*strides[i] + itemsize; + if (size > bufsize) + bufsize = size; + } + return bufsize; +} /*OBJECT_API Generic new array creation routine. @@ -3768,13 +3784,8 @@ flags, &(self->flags)); } else { - if (data == NULL) { - PyErr_SetString(PyExc_ValueError, - "if 'strides' is given in " \ - "array creation, data must " \ - "be given too"); - goto fail; - } + sd = _array_buffer_size(nd, dims, strides, sd); + if (sd < 0) goto fail; memcpy(self->strides, strides, sizeof(intp)*nd); } } @@ -4092,7 +4103,7 @@ if (dims.len == 1 && dims.ptr[0] == -1) { dims.ptr[offset] = buffer.len / itemsize; } - else if (buffer.len < itemsize* \ + else if (strides.ptr == NULL && buffer.len < itemsize* \ PyArray_MultiplyList(dims.ptr, dims.len)) { PyErr_SetString(PyExc_TypeError, "buffer is too small for " \ @@ -4242,9 +4253,9 @@ if (PyArray_Check(new->base)) new = (PyArrayObject *)new->base; } - numbytes = PyArray_MultiplyList(new->dimensions, - new->nd)*new->descr->elsize; - + numbytes = _array_buffer_size(new->nd, new->dimensions, new->strides, + new->descr->elsize); + if (numbytes < 0) goto fail; if (!PyArray_CheckStrides(self->descr->elsize, self->nd, numbytes, self->data - new->data, self->dimensions, newstrides.ptr)) { From oliphant at ee.byu.edu Fri Feb 3 13:54:03 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Fri Feb 3 13:54:03 2006 Subject: [Numpy-discussion] Zeros in strides In-Reply-To: References: <43E2B420.1060801@ieee.org> <43E2C50D.3090103@ieee.org> Message-ID: <43E3D0D8.2060609@ee.byu.edu> Sasha wrote: >Attached patch allows numpy create memory-saving zero-stride arrays. > > > A good first cut. I'm very concerned about the speed of PyArray_NewFromDescr. So, I don't really want to make changes that will cause it to be slower for all cases unless absolutely essential. Could you give more examples of how you will be using these zero-stride arrays? What problem are they actually solving? I would also like to get more opinions about Sasha's proposal for zero-stride arrays. 
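To make the question concrete for everyone: as I read Sasha's patch, the semantics under discussion are these (the construction is from his session; the aliasing consequence is my expectation, so treat it as hypothetical):

>>> y = ones(1)
>>> z = ndarray([5], strides=0, buffer=y)

Every element of z is then a view of the single element y[0] -- assigning to y[0] would change all five elements of z -- and z occupies one element of memory no matter how long it is.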
-Travis From alexander.belopolsky at gmail.com Fri Feb 3 14:04:08 2006 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Fri Feb 3 14:04:08 2006 Subject: [Numpy-discussion] Learning "strides" In-Reply-To: <43E2C7BB.8010808@ieee.org> References: <43E2A751.1060807@ieee.org> <43E2C7BB.8010808@ieee.org> Message-ID: On 2/2/06, Travis Oliphant wrote: > ... > Here's the issue. With records it is quite easy to generate strides > that are not integer multiples of the data. For example, a record > [('field1', 'f8'),('field2', 'i2')] data-type would have floating point > data separated by 10 bytes. When you get a view of field1 (by getting > that attribute) you would get such "misaligned" data. > > Look at the following: > > temp = array([(1.8,2),(1.7,3)],dtype='f8,i2') > temp['f1'].strides > (10,) > > How would you represent that in the element-based strides report? You are right. I cannot think of anything better than just byte-based strides in this case. Maybe we could add a restriction abs(strides[i]) >= itemsize? This will probably catch some of the more common mistakes that are due to using number of elements instead of number of bytes. From jswhit at fastmail.fm Fri Feb 3 14:35:09 2006 From: jswhit at fastmail.fm (Jeff Whitaker) Date: Fri Feb 3 14:35:09 2006 Subject: [Numpy-discussion] treating numpy arrays like lists is slow Message-ID: <43E3DA64.5080506@fastmail.fm> Hi: I've noticed that code like this is really slow in numpy (0.9.4):

import numpy as NP
a = NP.ones(10000,'d')
a = [2.*a1 for a1 in a]

the last line takes 0.17 seconds on my G5, while for Numeric and numarray it takes only 0.01. Anyone know the reason for this? -Jeff -- Jeffrey S. Whitaker Phone : (303)497-6313 Meteorologist FAX : (303)497-6449 NOAA/OAR/PSD R/PSD1 Email : Jeffrey.S.Whitaker at noaa.gov 325 Broadway Office : Skaggs Research Cntr 1D-124 Boulder, CO, USA 80303-3328 Web : http://tinyurl.com/5telg From ndarray at mac.com Fri Feb 3 15:03:16 2006 From: ndarray at mac.com (Sasha) Date: Fri Feb 3 15:03:16 2006 Subject: [Numpy-discussion] Zeros in strides In-Reply-To: <43E3D0D8.2060609@ee.byu.edu> References: <43E2B420.1060801@ieee.org> <43E2C50D.3090103@ieee.org> <43E3D0D8.2060609@ee.byu.edu> Message-ID:
> It is easy to change the code so that it only affects the branch in PyArray_NewFromDescr that currently raises an exception -- providing both strides but no buffer. There is no need to call _array_buffer_size if data is provided. > Could you give more examples of how you will be using these zero-stride > arrays? What problem are they actually solving? > Currently when I need to represent a statistic that is constant across population, I use scalars. In many cases this works because thanks to broadcasting rules a scalar behaves almost like a vector with equal elements. With the changes introduced in numpy, generic code that works on both scalars and vectors is becoming increasingly easier to write, but there are some cases where scalars cannot replace a vector with equal elements. For example, if you want to combine data for two populations and the data comes as two scalars, you need to somehow know the size of each population to add to the size of the result. A zero-stride array would solve this problem: it takes little memory, but unlike scalar knows its size. Another use that I was contemplating was to represent per-row or per-column mask in ma. It is often the case that in a rectangular matrix data may be missing only for an entire row. It is tempting to use rank-1 mask with an element for each row to represent this case. That will work fine, but if you would not be able to use vectors to specify either per-row or per-column mask. With zero-stride array, you can use strides=(1,0) or strides=(0,1) and have the same memory use as with a vector. From ndarray at mac.com Fri Feb 3 15:11:10 2006 From: ndarray at mac.com (Sasha) Date: Fri Feb 3 15:11:10 2006 Subject: [Numpy-discussion] treating numpy arrays like lists is slow In-Reply-To: <43E3DA64.5080506@fastmail.fm> References: <43E3DA64.5080506@fastmail.fm> Message-ID: This is so because scalar math is very slow in numpy. This will improve with the introduction of the scalarmath module. > python -m timeit -s "from numpy import float_; x = float_(2)" "2.*x" 100000 loops, best of 3: 15.8 usec per loop > python -m timeit -s "x = 2." "2.*x" 1000000 loops, best of 3: 0.261 usec per loop On 2/3/06, Jeff Whitaker wrote: > > Hi: > > I've noticed that code like this is really slow in numpy (0.9.4): > > import numpy as NP > a = NP.ones(10000,'d') > a = [2.*a1 for a1 in a] > > > the last line takes 0.17 seconds on my G5, while for Numeric and > numarray it takes only 0.01. Anyone know the reason for this? > > -Jeff > > -- > Jeffrey S. Whitaker Phone : (303)497-6313 > Meteorologist FAX : (303)497-6449 > NOAA/OAR/PSD R/PSD1 Email : Jeffrey.S.Whitaker at noaa.gov > 325 Broadway Office : Skaggs Research Cntr 1D-124 > Boulder, CO, USA 80303-3328 Web : http://tinyurl.com/5telg > > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through log files > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! 
From jswhit at fastmail.fm Fri Feb 3 19:08:02 2006 From: jswhit at fastmail.fm (Jeff Whitaker) Date: Fri Feb 3 19:08:02 2006 Subject: [Numpy-discussion] treating numpy arrays like lists is slow] Message-ID: <43E419F1.8070608@fastmail.fm> Travis Oliphant wrote: > Jeff Whitaker wrote: > >> >> Hi: >> >> I've noticed that code like this is really slow in numpy (0.9.4): >> >> import numpy as NP >> a = NP.ones(10000,'d') >> a = [2.*a1 for a1 in a] >> >> >> the last line takes 0.17 seconds on my G5, while for Numeric and >> numarray it takes only 0.01. Anyone know the reason for this? >> > We could actually change this right now, before the introduction of > scalar math by using the standard float table for the corresponding > array scalars. The only reason I didn't do this initially was that I > wanted consistency in behavior for "division-by-zero" between arrays > and scalars. > Using the Python float math you will get divide-by-zero errors whereas > you don't (unless you ask for them), with numpy arrays. > > Thus, current scalars are treated as 0-d arrays in the internals and > go through the entire ufunc machinery for every operation. > Now, the real question is why are you doing this? Using arrays in > this way defeats their purpose :-) > > What is wrong with 2*a? Now, of course there will be situations that > require this. > > -Travis > Travis: Of course I know this is a dumb thing to do - but sometimes it does happen that a function that expects a list actually gets a rank-1 array. The workaround in that case is to just pass it a.tolist() instead of a. -Jeff -- Jeffrey S. Whitaker Phone : (303)497-6313 Meteorologist FAX : (303)497-6449 NOAA/OAR/PSD R/PSD1 Email : Jeffrey.S.Whitaker at noaa.gov 325 Broadway Office : Skaggs Research Cntr 1D-124 Boulder, CO, USA 80303-3328 Web : http://tinyurl.com/5telg From tim.hochberg at cox.net Fri Feb 3 19:29:05 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Fri Feb 3 19:29:05 2006 Subject: [Numpy-discussion] Is there a known problem compiling numpy with VC7? Message-ID: <43E41E52.6060805@cox.net> Hi, I recently installed the Visual Studio .NET 2003 (AKA VC7) compiler and took a stab at compiling numpy. I've tried previously with the free, toolkit version of VC7 with little success, but I was hoping this would be a piece of cake. No joy! It's quite possible that my compiler setup is gummed up by the previous existence of the toolkit compiler. A bunch of paths were set to this and that and there may be some residue that is messing things up. However, I successfully compiled numarray 1.5 and a couple of my own extensions, so things *seem* OK. So before I go hunting, I thought I'd ask and see if there are some known issues with compiling numpy 0.9.4 with VC7. The symptoms I'm seeing are, first, that it can't run configure. It can't find python24.lib. An abbreviated traceback is shown at the bottom.
I kludged my way past this by replacing line 33 of numpy/core/setup.py with the two lines: python_lib = sysconfig.EXEC_PREFIX + '/libs' result = config_cmd.try_run(tc,include_dirs=[python_include],library_dirs=[python_lib]) That got me a little farther, but I quickly ran into trouble compiling multiarray module: C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\bin\cl.exe /c /nologo /Ox /MD /W3 /GX /DNDEBU G -Ibuild\src\numpy\core\src -Inumpy\core\include -Ibuild\src\numpy\core -Inumpy\core\src -Inumpy\li b\..\core\include -IC:\Python24\include -IC:\Python24\PC /Tcnumpy\core\src\multiarraymodule.c /Fobui ld\temp.win32-2.4\Release\numpy\core\src\multiarraymodule.obj multiarraymodule.c build\src\numpy\core\src\arraytypes.inc(5305) : error C2036: 'void *' : unknown size build\src\numpy\core\src\arraytypes.inc(5885) : error C2036: 'void *' : unknown size build\src\numpy\core\src\arraytypes.inc(6465) : error C2036: 'void *' : unknown size ...a bunch of warnings... c:\Documents and Settings\End-user\Desktop\numpy\numpy-0.9.4\numpy\core\src\arrayobject.c(4049) : er ror C2036: 'void *' : unknown size ...some more warnings... error: Command ""C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\bin\cl.exe" /c /nologo /Ox / MD /W3 /GX /DNDEBUG -Ibuild\src\numpy\core\src -Inumpy\core\include -Ibuild\src\numpy\core -Inumpy\c ore\src -Inumpy\lib\..\core\include -IC:\Python24\include -IC:\Python24\PC /Tcnumpy\core\src\multiar raymodule.c /Fobuild\temp.win32-2.4\Release\numpy\core\src\multiarraymodule.obj" failed with exit st atus 2 Anyway, like I said, my compiler could be broken, but if there is a known issue with VC7 or this rings a bell with anyone please let me know. I certainly wouldn't mind a hint. -tim Traceback from configure failure: ----------------------------------------------------------------------------------------------------------------------- C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\bin\cl.exe /c /nologo /Ox /MD /W3 /GX /DNDEBU G -IC:\Python24\include -Inumpy\core\src -Inumpy\lib\..\core\include -IC:\Python24\include -IC:\Pyth on24\PC /Tc_configtest.c /Fo_configtest.obj C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\bin\link.exe /nologo /INCREMENTAL:NO _configt est.obj /OUT:_configtest.exe LINK : fatal error LNK1104: cannot open file 'python24.lib' failure. removing: _configtest.c _configtest.obj Traceback (most recent call last): File "C:\Documents and Settings\End-user\Desktop\numpy\numpy-0.9.4\setup.py", line 73, in ? 
setup_package() File "C:\Documents and Settings\End-user\Desktop\numpy\numpy-0.9.4\setup.py", line 66, in setup_pa ckage setup( **config.todict() ) File "C:\Documents and Settings\End-user\Desktop\numpy\numpy-0.9.4\numpy\distutils\core.py", line 93, in setup return old_setup(**new_attr) File "C:\Python24\lib\distutils\core.py", line 149, in setup dist.run_commands() File "C:\Python24\lib\distutils\dist.py", line 946, in run_commands self.run_command(cmd) File "C:\Python24\lib\distutils\dist.py", line 966, in run_command cmd_obj.run() File "C:\Python24\lib\distutils\command\build.py", line 112, in run self.run_command(cmd_name) File "C:\Python24\lib\distutils\cmd.py", line 333, in run_command self.distribution.run_command(command) File "C:\Python24\lib\distutils\dist.py", line 966, in run_command cmd_obj.run() File "C:\Documents and Settings\End-user\Desktop\numpy\numpy-0.9.4\numpy\distutils\command\build_s rc.py", line 86, in run self.build_sources() File "C:\Documents and Settings\End-user\Desktop\numpy\numpy-0.9.4\numpy\distutils\command\build_s rc.py", line 99, in build_sources self.build_extension_sources(ext) File "C:\Documents and Settings\End-user\Desktop\numpy\numpy-0.9.4\numpy\distutils\command\build_s rc.py", line 143, in build_extension_sources sources = self.generate_sources(sources, ext) File "C:\Documents and Settings\End-user\Desktop\numpy\numpy-0.9.4\numpy\distutils\command\build_s rc.py", line 199, in generate_sources source = func(extension, build_dir) File "numpy\core\setup.py", line 35, in generate_config_h raise "ERROR: Failed to test configuration" ERROR: Failed to test configuration ----------------------------------------------------------------------------------------------------------------------- From faltet at carabos.com Sat Feb 4 01:34:02 2006 From: faltet at carabos.com (Francesc Altet) Date: Sat Feb 4 01:34:02 2006 Subject: [Numpy-discussion] retrieving type objects for void array-scalar objects In-Reply-To: <43E3C596.2010904@web.de> References: <43E3C596.2010904@web.de> Message-ID: <1139045585.7529.22.camel@localhost.localdomain> El dv 03 de 02 del 2006 a les 22:05 +0100, en/na N. Volbers va escriure: > >>> dtype = numpy.dtype({'names': ['name', 'weight'],'formats': ['U30', 'f4']}) > >>> a = numpy.array([(u'Bill', 71.2), (u'Fred', 94.3)], dtype=dtype) > Is there some way to retrieve the type object directly from the array (not using any existing row) using only the name of the item? I have checked the dtype attribute, but I could only get the character representation for the item types (e.g. 'f4'). To retrieve the type directly from the array, you can use a function like this: def get_field_type_flat(descr, fname): """Get the type associated with a field named `fname`. If the field name is not found, None is returned. """ for item in descr: if fname == item[0]: return numpy.typeDict[item[1][1]] return None That one is very simple and fast. However, it can't deal with nested types. The next one is more general: def get_field_type_nested(descr, fname): """Get the type associated with a field named `fname`. This funcion looks recursively in possible nested descriptions. If the field is not found anywhere in the hierarchy, None is returned. If there are two names that are equal in the hierarchy, the first one (from top to bottom and from left to the right) found is returned. 
""" for item in descr: descr = item[1] if fname == item[0]: return numpy.dtype(descr).type else: if isinstance(descr, list): return get_field_type(descr, fname) return None The drawback here is that you can not select a field that is named the same way and that lives in different levels of the hierarchy. For example, selecting 'name' in a type structure like this: +-----------+ |name |x | | +-----+ | |name | +-----+-----+ is ambiguous (in the algorithm implemented above, the top level 'name' would be selected). Addressing this problem would imply to define a way to univocally specify nested fields. Anyway, I'm attaching a file with several examples on these functions. HTH, -- >0,0< Francesc Altet http://www.carabos.com/ V V C?rabos Coop. V. Enjoy Data "-" -------------- next part -------------- A non-text attachment was scrubbed... Name: prova.py Type: text/x-python Size: 1778 bytes Desc: not available URL: From lmbrmxdr at webquill.com Sat Feb 4 13:26:03 2006 From: lmbrmxdr at webquill.com (Bragg Megan) Date: Sat Feb 4 13:26:03 2006 Subject: [Numpy-discussion] Hey numpy-discussion Message-ID: <219570723.20060204222530@lists.sourceforge.net> Hi, numpy-discussion. globe vice pleas grumbled freshness? slithering returning hardly false rent serves delighted empire leading scrutinizing sleeper ripens bent works mountain purpose? birthday maltre forecasters mania wrong categorically hysterics neighbour business singing linden decide saved she hidden redbearded curlingirons? hysterically nonetoofresh father discovered burdock insistence chain emperors lucid peeked exclaims ratify sparkles clatter listlessly ladder diddled naive habit sector seats expects -- Best Regards, Bragg Megan mailto:numpy-discussion at lists.sourceforge.net -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: wwdzjpx.gif Type: image/gif Size: 9269 bytes Desc: not available URL: From faltet at carabos.com Sun Feb 5 04:33:04 2006 From: faltet at carabos.com (Francesc Altet) Date: Sun Feb 5 04:33:04 2006 Subject: [Numpy-discussion] retrieving type objects for void array-scalar objects In-Reply-To: <43E3C596.2010904@web.de> References: <43E3C596.2010904@web.de> Message-ID: <1139142764.7534.20.camel@localhost.localdomain> El dv 03 de 02 del 2006 a les 22:05 +0100, en/na N. Volbers va escriure: > Is there some way to retrieve the type object directly from the array (not using any existing row) using only the name of the item? I have checked the dtype attribute, but I could only get the character representation for the item types (e.g. 'f4'). Ops, I've just discovered a new way to get the type in a simpler way: In [17]:dtype = numpy.dtype({'names': ['name', 'weight'],'formats': ['U30', 'f4']}) In [18]:a = numpy.array([(u'Bill', 71.2), (u'Fred', 94.3)], dtype=dtype) In [20]:a.dtype.fields['name'][0].type Out[20]: In [21]:a.dtype.fields['weight'][0].type Out[21]: For nested types, something like this should work: ntype = a.dtype.fields['name'][0].fields['nested_field'][0].type By the way, you will need numpy 0.9.5 (at least) for this to work. Incidentally, Travis, what do you think about allowing: In [30]:a.dtype.fields['weight'] Out[30]:dtype('0,0< Francesc Altet http://www.carabos.com/ V V C?rabos Coop. V. Enjoy Data "-" From jswhit at fastmail.fm Sun Feb 5 07:25:04 2006 From: jswhit at fastmail.fm (Jeff Whitaker) Date: Sun Feb 5 07:25:04 2006 Subject: [Numpy-discussion] how to get data out of an object array in pyrex? 
From jswhit at fastmail.fm Sun Feb 5 07:57:03 2006
From: jswhit at fastmail.fm (Jeff Whitaker)
Date: Sun Feb 5 07:57:03 2006
Subject: [Numpy-discussion] how to get data out of an object array in pyrex?
Message-ID: <43E6202D.90906@fastmail.fm>

Hi: I've successfully used the examples at http://www.scipy.org/Wiki/Cookbook/Pyrex_and_NumPy to access the data in a 'normal' numpy array, but have had no success adapting these examples to work with object arrays. I understand that the .data attribute holds pointers to the objects which actually contain the data in an object array, but how do you use those pointers to get the data in C/pyrex?

-Jeff

--
Jeffrey S. Whitaker         Phone : (303)497-6313
Meteorologist               FAX   : (303)497-6449
NOAA/OAR/PSD R/PSD1         Email : Jeffrey.S.Whitaker at noaa.gov
325 Broadway                Office: Skaggs Research Cntr 1D-124
Boulder, CO, USA 80303-3328 Web   : http://tinyurl.com/5telg

From oliphant.travis at ieee.org Sun Feb 5 20:22:10 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Sun Feb 5 20:22:10 2006
Subject: [Numpy-discussion] Re: Numpy 0.9.4 install
In-Reply-To: <200602051655.01719.j.simons@planet.nl>
References: <200602051655.01719.j.simons@planet.nl>
Message-ID: <43E6CEBC.7070405@ieee.org>

Jan Simons @planet.nl wrote:

>Dear Travis,
>
>Thank you for all the work that you put into numerical Python. I believe that
>it makes Python applicable to serious numerical work.
>
>I just attempted to install the package on my Suse 10.0 system (which does
>have the recent Python 2.4.1).
>

I think the problem with the rpm binary is that I built the binary rpm versions against a debug version of Python. Most people must be installing from source on Linux, because this is the first time somebody has complained, though I'm sure others have stumbled on this.

I've been using a debug version of Python for a few months. I will probably switch back soon, which should make these issues less of a problem. Try building from source directly.

Best,

-travis

From oliphant.travis at ieee.org Sun Feb 5 20:41:01 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Sun Feb 5 20:41:01 2006
Subject: [Numpy-discussion] how to get data out of an object array in pyrex?
In-Reply-To: <43E6188F.20703@fastmail.fm>
References: <43E6188F.20703@fastmail.fm>
Message-ID: <43E6D329.4060001@ieee.org>

Jeff Whitaker wrote:
>
> Hi: I've successfully used the examples at
> http://www.scipy.org/Wiki/Cookbook/Pyrex_and_NumPy to access the data
> in a 'normal' numpy array, but have had no success adapting these
> examples to work with object arrays. I understand that the .data
> attribute holds pointers to the objects which actually contain the
> data in an object array, but how do you use those pointers to get the
> data in C/pyrex?

You have a pointer to a PyObject *object in the data. Thus, data should be recast to PyObject **.

I don't know how to do that in PyRex. But, it's easy in C. In C, you will need to be concerned about reference counts. I don't know how pyrex handles this.
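At the Python level that indirection never shows: each slot of an object array hands back the stored object itself. A toy illustration of the layout being discussed (current numpy spelling; this is not the C-level answer Jeff is after):

import numpy

a = numpy.empty(3, dtype=object)  # three slots, each holding a pointer to a Python object
a[0] = 42
a[1] = 'spam'
a[2] = [1.0, 2.0]

# indexing dereferences the stored pointer and returns the object
# itself, so heterogeneous types survive unchanged
for i in range(len(a)):
    print(i, type(a[i]))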
From jswhit at fastmail.fm Mon Feb 6 05:01:09 2006
From: jswhit at fastmail.fm (Jeff Whitaker)
Date: Mon Feb 6 05:01:09 2006
Subject: [Numpy-discussion] how to get data out of an object array in pyrex?
In-Reply-To: <43E6D329.4060001@ieee.org>
References: <43E6188F.20703@fastmail.fm> <43E6D329.4060001@ieee.org>
Message-ID: <43E7487C.2060600@fastmail.fm>

Travis Oliphant wrote:
> Jeff Whitaker wrote:
>
>> Hi: I've successfully used the examples at
>> http://www.scipy.org/Wiki/Cookbook/Pyrex_and_NumPy to access the data
>> in a 'normal' numpy array, but have had no success adapting these
>> examples to work with object arrays. I understand that the .data
>> attribute holds pointers to the objects which actually contain the
>> data in an object array, but how do you use those pointers to get the
>> data in C/pyrex?
>
> You have a pointer to a PyObject *object in the data. Thus, data
> should be recast to PyObject **. I don't know how to do that in PyRex.

Travis: Apparently not. If I try to do this pyrex says

115:25: Pointer base type cannot be a Python object

> But, it's easy in C.
> In C, you will need to be concerned about reference counts.

OK, I was hoping to avoid hand-coding an extension in C (which I'm woefully unqualified to do).

-Jeff

--
Jeffrey S. Whitaker         Phone : (303)497-6313
Meteorologist               FAX   : (303)497-6449
NOAA/OAR/PSD R/PSD1         Email : Jeffrey.S.Whitaker at noaa.gov
325 Broadway                Office: Skaggs Research Cntr 1D-124
Boulder, CO, USA 80303-3328 Web   : http://tinyurl.com/5telg

From nicolist at limare.net Mon Feb 6 07:14:14 2006
From: nicolist at limare.net (Nico)
Date: Mon Feb 6 07:14:14 2006
Subject: [Numpy-discussion] new on the list
In-Reply-To: <20060206144906.627F28821B@sc8-sf-spam1.sourceforge.net>
References: <20060206144906.627F28821B@sc8-sf-spam1.sourceforge.net>
Message-ID: <43E76798.4050602@limare.net>

Hi.

I'm a new user of the numpy-discussion and scipy-user mailing lists. So, as I usually do, here are a few words about me and my use of numpy/scipy.

I am a doctoral student in Paris; I will work on numerical analysis, mesh generation and image processing, and I intend to do the prototyping (and maybe everything) of my work with python. I recently chose python because...

- flexible and rich language for array manipulation
- seems a good language to help me write clean, clear, bug-free and reusable code
- seems possible to make a GUI frontend without too much pain
- seems OK to glue with various other C/fortran applications without too much pain
- free, as in free beer (I had to work on Matlab previously, and I don't like to force people to pay for an expensive licence if they are interested in my work)
- free, as in free speech (... I also had serious problems needing compatibility of Matlab with a linux kernel that was not officially supported)

I use numpy/scipy on Debian/Ubuntu, building from the release tarballs. And I am currently reading the available documentation...

Last thing: what about a #scipy irc channel? I feel there are too many people on irc.freenode.org/#python for efficient use.

Happy coding!

--
Nico
From faltet at carabos.com Mon Feb 6 10:25:07 2006
From: faltet at carabos.com (Francesc Altet)
Date: Mon Feb 6 10:25:07 2006
Subject: [Numpy-discussion] Extent of unicode types in numpy
Message-ID: <1139250278.7538.52.camel@localhost.localdomain>

Hi,

I'm a bit surprised by the fact that unicode types are the only ones breaking the rule, in that they must be specified with a different number of bytes than they really take. For example:

In [120]:numpy.dtype([('x','c16')])
Out[120]:dtype([('x', ' 64-bit issues?).

OTOH, I thought that Python would internally represent unicode strings with 16-bit chars. Oh well, I'm a bit lost on this. Can anybody shed some light?

Cheers,

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"

From faltet at carabos.com Mon Feb 6 10:53:14 2006
From: faltet at carabos.com (Francesc Altet)
Date: Mon Feb 6 10:53:14 2006
Subject: [Numpy-discussion] Mapping protocol to nested types in descriptor
Message-ID: <1139251961.7538.73.camel@localhost.localdomain>

Hi,

I've implemented a simple mapping protocol in the descriptor type so that the user would be able to do:

In [138]:dtype = numpy.dtype([
   .....: ('x', ' instead of the current:

In [141]:dtype.fields['Info'][0].name
Out[141]:'void3872'
In [142]:dtype.fields['Info'][0].fields['name'][0].type
Out[142]:

which I find cumbersome to type. Find the patch for this in the attachments.

OTOH, I've completed the tests for heterogeneous objects in test_numerictypes.py. Now there is a better check for both flat and nested fields, as well as explicit checking of type descriptors (including tests for the new mapping interface in descriptors). So far, no more problems have been detected by the new tests :-). Please note that you will need the patch above applied in order to run the tests.

Travis, if you think that it would be better not to apply the patch, the tests can be easily adapted by changing lines like:

self.assert_(h.dtype['x'][0].name[:4] == 'void')

to:

self.assert_(h.dtype.fields['x'][0].name[:4] == 'void')

Cheers,

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"

-------------- next part --------------
A non-text attachment was scrubbed...
Name: arrayobject.c.patch
Type: text/x-patch
Size: 2654 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test_numerictypes.py
Type: text/x-python
Size: 12824 bytes
Desc: not available
URL:

From faltet at carabos.com Mon Feb 6 11:06:02 2006
From: faltet at carabos.com (Francesc Altet)
Date: Mon Feb 6 11:06:02 2006
Subject: [Numpy-discussion] Properties of fields in numpy
Message-ID: <1139252739.7538.84.camel@localhost.localdomain>

Hi,

I don't especially like the 'void*' typecasting that the types in fields are receiving in situations like:

In [143]:dtype = numpy.dtype([
   .....: ('x', '0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.
Enjoy Data "-" From oliphant at ee.byu.edu Mon Feb 6 11:17:00 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Mon Feb 6 11:17:00 2006 Subject: [Numpy-discussion] Extent of unicode types in numpy In-Reply-To: <1139250278.7538.52.camel@localhost.localdomain> References: <1139250278.7538.52.camel@localhost.localdomain> Message-ID: <43E7A062.9030508@ee.byu.edu> Francesc Altet wrote: >Hi, > >I'm a bit surprised by the fact that unicode types are the only ones >breaking the rule that must be specified with a different number of >bytes than it really takes. For example: > > Yeah, it's a bit annoying. There are special checks throughout the code for this. The problem, though is that sizeof(Py_UNICODE) can be 4 or 2 depending on how Python was compiled. Also, Python treats unicode and string characters as having the same length (even though internally, there is a different number of bytes required). So, I'm not sure exactly what to do, short of introducing a new code for "Unicode with specific number of bytes." I think the inconsistency should be removed, though. I'm just not sure how to do it. -Travis From oliphant at ee.byu.edu Mon Feb 6 11:21:01 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Mon Feb 6 11:21:01 2006 Subject: [Numpy-discussion] Properties of fields in numpy In-Reply-To: <1139252739.7538.84.camel@localhost.localdomain> References: <1139252739.7538.84.camel@localhost.localdomain> Message-ID: <43E7A16D.3050801@ee.byu.edu> Francesc Altet wrote: >Hi, > >I don't specially like the 'void*' typecasting that are receiving the >types in fields in situations like: > >In [143]:dtype = numpy.dtype([ > .....: ('x', ' .....: ('Info',[ > .....: ('name', ' .....: ('weight', ' >In [147]:dtype.fields['x'][0].name >Out[147]:'void64' > >were you can see that we have lost the information about the native type >of the 'x' field. Rather, I'd expect something like: > > Well, it's actually there. Look at dtype.fields['x'][0].subdtype[0] dtype.fields['x'][0].subdtype[1] The issue is that the base data-type of the 'x' field is void-64 (that's the dtype object the array "sees"). -Travis From strawman at astraw.com Mon Feb 6 12:33:05 2006 From: strawman at astraw.com (Andrew Straw) Date: Mon Feb 6 12:33:05 2006 Subject: [Numpy-discussion] how to get data out of an object array in pyrex? In-Reply-To: <43E7487C.2060600@fastmail.fm> References: <43E6188F.20703@fastmail.fm> <43E6D329.4060001@ieee.org> <43E7487C.2060600@fastmail.fm> Message-ID: <43E7B254.3040200@astraw.com> Hi Jeff, I've significantly updated the page at http://scipy.org/Wiki/Cookbook/Pyrex_and_NumPy Pyrex should be able to do everything you need. I hope you find the revised page more useful. Please let me know (or fix the page) if you have any issues or questions. Cheers! Andrew From jswhit at fastmail.fm Mon Feb 6 13:32:03 2006 From: jswhit at fastmail.fm (Jeff Whitaker) Date: Mon Feb 6 13:32:03 2006 Subject: [Numpy-discussion] how to get data out of an object array in pyrex? In-Reply-To: <43E7B254.3040200@astraw.com> References: <43E6188F.20703@fastmail.fm> <43E6D329.4060001@ieee.org> <43E7487C.2060600@fastmail.fm> <43E7B254.3040200@astraw.com> Message-ID: <43E7C03C.4060806@fastmail.fm> Andrew Straw wrote: > Hi Jeff, > > I've significantly updated the page at > http://scipy.org/Wiki/Cookbook/Pyrex_and_NumPy > > Pyrex should be able to do everything you need. > > I hope you find the revised page more useful. Please let me know (or > fix the page) if you have any issues or questions. > > Cheers! > Andrew Andrew: Thanks! 
That looks like exactly what I need. -Jeff -- Jeffrey S. Whitaker Phone : (303)497-6313 Meteorologist FAX : (303)497-6449 NOAA/OAR/PSD R/PSD1 Email : Jeffrey.S.Whitaker at noaa.gov 325 Broadway Office : Skaggs Research Cntr 1D-124 Boulder, CO, USA 80303-3328 Web : http://tinyurl.com/5telg From oliphant at ee.byu.edu Mon Feb 6 14:16:02 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Mon Feb 6 14:16:02 2006 Subject: [Numpy-discussion] Extent of unicode types in numpy In-Reply-To: <1139250278.7538.52.camel@localhost.localdomain> References: <1139250278.7538.52.camel@localhost.localdomain> Message-ID: <43E7CA57.2040907@ee.byu.edu> Francesc Altet wrote: >Hi, > >I'm a bit surprised by the fact that unicode types are the only ones >breaking the rule that must be specified with a different number of >bytes than it really takes. For example: > > Right now, the array protocol typestring is a little ambiguous on unicode characters. Ideally, the array interface would describe what kind of Unicode characters are being dealt with so that 2-byte and 4-byte unicode characters have a different description in the typestring. Python can be compiled with Unicode as either 2-byte or 4-byte. The 'U#' descriptor is supposed to be the Python unicode data-type with # representing the number of characters. If this data-type is handed off to a Python that is compiled with a different representation for Unicode, then we have a problem. Right now, the typestring value gives the number of bytes in the type. Thus, "U4" gives dtype(" References: <1139252739.7538.84.camel@localhost.localdomain> Message-ID: <43E7E4FD.9050303@ee.byu.edu> Francesc Altet wrote: >Hi, > >I don't specially like the 'void*' typecasting that are receiving the >types in fields in situations like: > >In [143]:dtype = numpy.dtype([ > .....: ('x', ' .....: ('Info',[ > .....: ('name', ' .....: ('weight', ' >In [147]:dtype.fields['x'][0].name >Out[147]:'void64' > >were you can see that we have lost the information about the native type >of the 'x' field. Rather, I'd expect something like: > > In SVN of numpy, the dtype objects now have a .base attribute and a .shape attribute. The .shape attribute returns (1,) or the shape of the sub-array. The .base attribute returns the data-type object of the base-type, or a new reference to self, if the object has no base.type. Thus, in current SVN dtype['x'].base.name would always give you what you want. -Travis From tim.hochberg at cox.net Mon Feb 6 17:14:11 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Mon Feb 6 17:14:11 2006 Subject: [Numpy-discussion] Extent of unicode types in numpy In-Reply-To: <43E7CA57.2040907@ee.byu.edu> References: <1139250278.7538.52.camel@localhost.localdomain> <43E7CA57.2040907@ee.byu.edu> Message-ID: <43E7F40F.7030303@cox.net> Travis Oliphant wrote: > Francesc Altet wrote: > >> Hi, >> >> I'm a bit surprised by the fact that unicode types are the only ones >> breaking the rule that must be specified with a different number of >> bytes than it really takes. For example: >> >> > > Right now, the array protocol typestring is a little ambiguous on > unicode characters. Ideally, the array interface would describe what > kind of Unicode characters are being dealt with so that 2-byte and > 4-byte unicode characters have a different description in the typestring. > > Python can be compiled with Unicode as either 2-byte or 4-byte. The > 'U#' descriptor is supposed to be the Python unicode data-type with # > representing the number of characters. 
If this data-type is handed > off to a Python that is compiled with a different representation for > Unicode, then we have a problem. > > Right now, the typestring value gives the number of bytes in the > type. Thus, "U4" gives dtype(" sizeof(Py_UNICODE)==2, but on another system it could give dtype(" I know only a little-bit about unicode. The full Unicode character is > a 4-byte entity, but there are standard 2-byte (UTF-16) and even > 1-byte (UTF-8) encoders. > > I changed the source so that (" (i.e. if you specify an endianness then you are being byte-conscious > anyway and so the number is interpreted as a byte, otherwise the > number is interpreted as a length). This fixes issues on the same > platform, but does not fix issues where data is saved out with one > Python interpreter and read in by another with a different value of > sizeof(Py_UNICODE). This sounds like a mess. I'm not sure what the level of Unicode expertise is one this list (I certainly don't add to it), but I'd be tempted to raise this issue on PythonDev and see if anyone there has any good suggestions. I'm way out of my depth here, but it really sounds like there needs to be one descriptor for each type. Just for example "U" could be 2-byte unicode and "V" (assuming it's not taken already) could be 4-byte unicode. Then the size for a given descriptor would be constant and things would be much less confusing. -tim From oliphant at ee.byu.edu Mon Feb 6 17:28:19 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Mon Feb 6 17:28:19 2006 Subject: [Numpy-discussion] Extent of unicode types in numpy In-Reply-To: <43E7F40F.7030303@cox.net> References: <1139250278.7538.52.camel@localhost.localdomain> <43E7CA57.2040907@ee.byu.edu> <43E7F40F.7030303@cox.net> Message-ID: <43E7F78F.7080304@ee.byu.edu> Tim Hochberg wrote: >> Right now, the typestring value gives the number of bytes in the >> type. Thus, "U4" gives dtype("> sizeof(Py_UNICODE)==2, but on another system it could give >> dtype("> I know only a little-bit about unicode. The full Unicode character >> is a 4-byte entity, but there are standard 2-byte (UTF-16) and even >> 1-byte (UTF-8) encoders. >> >> I changed the source so that ("> "U4" (i.e. if you specify an endianness then you are being >> byte-conscious anyway and so the number is interpreted as a byte, >> otherwise the number is interpreted as a length). This fixes issues >> on the same platform, but does not fix issues where data is saved out >> with one Python interpreter and read in by another with a different >> value of sizeof(Py_UNICODE). > > > This sounds like a mess. I'm not sure what the level of Unicode > expertise is one this list (I certainly don't add to it), but I'd be > tempted to raise this issue on PythonDev and see if anyone there has > any good suggestions. > I'm not a unicode expert, but I have read-up on it so I think I at least understand the issues involved. > I'm way out of my depth here, but it really sounds like there needs to > be one descriptor for each type. Just for example "U" could be 2-byte > unicode and "V" (assuming it's not taken already) could be 4-byte > unicode. Then the size for a given descriptor would be constant and > things would be much less confusing. > This is what I'm currently thinking. The question is would we have to define a new basic data-type for 4-byte unicode or would we just handle this on the input. Would we also define a 1-byte unicode data-type or just let the user deal with that using standard strings and encoding as is currently done in Python. 
-Travis

From oliphant.travis at ieee.org Mon Feb 6 20:04:08 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Mon Feb 6 20:04:08 2006
Subject: [Numpy-discussion] Is there a known problem compiling numpy with VC7?
In-Reply-To: <43E81650.2040204@cox.net>
References: <43E41E52.6060805@cox.net> <43E42B26.2090304@ee.byu.edu> <43E42C2C.6000208@cox.net> <43E45A95.6040601@ieee.org> <43E81650.2040204@cox.net>
Message-ID: <43E81BFA.7060600@ieee.org>

Tim Hochberg wrote:
>
> Just a little update on this:
>
> It appears that all (or almost all) of the checks in generate_config_h
> must be failing. I would guess from a missing library or some such. I
> will investigate some more and see what I find.
>

That shouldn't be a big problem. It just means that NumPy will provide the missing features instead of using the system functions. More problematic is the strange errors you are getting about void * not having a size. The line numbers you show are where we have variable declarations like

register intp i

Is it possible that integers the size of void * cannot be placed in a register??

-Travis

From oliphant.travis at ieee.org Mon Feb 6 22:14:01 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Mon Feb 6 22:14:01 2006
Subject: [Numpy-discussion] Is there a known problem compiling numpy with VC7?
In-Reply-To: <43E82930.7070103@cox.net>
References: <43E41E52.6060805@cox.net> <43E42B26.2090304@ee.byu.edu> <43E42C2C.6000208@cox.net> <43E45A95.6040601@ieee.org> <43E81650.2040204@cox.net> <43E81BFA.7060600@ieee.org> <43E82930.7070103@cox.net>
Message-ID: <43E83A88.9080607@ieee.org>

Tim Hochberg wrote:
> Travis Oliphant wrote:
>
>> Tim Hochberg wrote:
>>
>>> Just a little update on this:
>>>
>>> It appears that all (or almost all) of the checks in
>>> generate_config_h must be failing. I would guess from a missing
>>> library or some such. I will investigate some more and see what I find.
>>>
>> That shouldn't be a big problem. It just means that NumPy will
>> provide the missing features instead of using the system functions.
>> More problematic is the strange errors you are getting about void *
>> not having a size. The line numbers you show are where we have
>> variable declarations like
>>
>> register intp i
>>
>> Is it possible that integers the size of void * cannot be placed in a
>> register??
>
> OK, I think I found what causes the problem. What we have is lines like:
>
> for(i=0; i<n; i++)
>
> where op is declared (void*).

There shouldn't be anything like that. These should all be char *. Where did you see these?

> Of course, ufuncmodule then failed to compile. A quick peek shows
> that it's throwing a lot of syntax errors. It appears to happen
> whenever there's a longdouble function defined. For example:
>
> longdouble sinl(longdouble x) {
>     return (longdouble) sin((double)x);
> }

On your platform longdouble should be equivalent to double, so I'm not sure why this would fail.

-Travis

From oliphant.travis at ieee.org Mon Feb 6 22:40:06 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Mon Feb 6 22:40:06 2006
Subject: [Numpy-discussion] Need compilations with compilers other than gcc
Message-ID: <43E840BE.5060204@ieee.org>

We need to test numpy on other compilers besides gcc, so that we can ferret out any gnu-isms that we may be relying on.

Anybody out there with compilers they are willing to try out and/or report on?

Thanks,

-Travis
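For anyone volunteering: the C compiler is selected with the stock Distutils switch, and numpy.distutils adds the Fortran-side --fcompiler option that comes up later in this thread. For example (the compiler names are the standard Distutils ones; use whichever you actually have):

# choose a C compiler known to distutils (unix, msvc, cygwin, mingw32, bcpp, ...)
python setup.py build --compiler=mingw32

# choose a Fortran compiler through numpy.distutils' config_fc command
python setup.py config_fc --fcompiler=intel build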
From oliphant.travis at ieee.org Mon Feb 6 23:17:04 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Mon Feb 6 23:17:04 2006
Subject: [Numpy-discussion] Extent of unicode types in numpy
In-Reply-To: <43E7F40F.7030303@cox.net>
References: <1139250278.7538.52.camel@localhost.localdomain> <43E7CA57.2040907@ee.byu.edu> <43E7F40F.7030303@cox.net>
Message-ID: <43E8495C.9020008@ieee.org>

> I'm way out of my depth here, but it really sounds like there needs to
> be one descriptor for each type. Just for example "U" could be 2-byte
> unicode and "V" (assuming it's not taken already) could be 4-byte
> unicode. Then the size for a given descriptor would be constant and
> things would be much less confusing.

In current SVN, numpy assumes 'w' is 2-byte unicode and 'W' is 4-byte unicode in the array interface typestring. Right now these codes require that the number of bytes be specified explicitly (to satisfy the array interface requirement). There is still only one Unicode data-type on the platform, and it has the size of Python's Py_UNICODE type. The character 'U' continues to be useful in data-type construction to stand for a unicode string of a specific character length. Its internal dtype representation will use 'w' or 'W' depending on how Python was compiled.

This may not solve all issues, but at least it's a bit more consistent and solves the problem of dtype(dtype('U8').str) not producing the same data-type.

It also solves the problem of unicode written out with one compilation of Python and read back in with another (it won't let you, because only one of 'w#' or 'W#' is supported on a given platform).

-Travis
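Which flavor a given interpreter is (and hence which of 'w#'/'W#' it would support) can be checked directly; sys.maxunicode is standard Python, and this complements the buffer-length check Francesc posts later in the thread:

import sys

# 0x10FFFF on a UCS4 ("wide") build, 0xFFFF on a UCS2 ("narrow") one
if sys.maxunicode > 0xFFFF:
    print("wide build: sizeof(Py_UNICODE) == 4")
else:
    print("narrow build: sizeof(Py_UNICODE) == 2")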
From a.h.jaffe at gmail.com Tue Feb 7 01:10:03 2006
From: a.h.jaffe at gmail.com (Andrew Jaffe)
Date: Tue Feb 7 01:10:03 2006
Subject: [Numpy-discussion] Re: Need compilations with compilers other than gcc
In-Reply-To: <43E840BE.5060204@ieee.org>
References: <43E840BE.5060204@ieee.org>
Message-ID:

Also, what is the status of gcc 4.0 support (on Mac OS X at least)? It's a bit of a pain to have to switch between the two (are there any other disadvantages?).

Andrew

Travis Oliphant wrote:
>
> We need to test numpy on other compilers besides gcc, so that we can
> ferret out any gnu-isms that we may be relying on.
>
> Anybody out there with compilers they are willing to try out and/or
> report on?
>
> Thanks,
>
> -Travis

From faltet at carabos.com Tue Feb 7 02:55:10 2006
From: faltet at carabos.com (Francesc Altet)
Date: Tue Feb 7 02:55:10 2006
Subject: [Numpy-discussion] Properties of fields in numpy
In-Reply-To: <43E7E4FD.9050303@ee.byu.edu>
References: <1139252739.7538.84.camel@localhost.localdomain> <43E7E4FD.9050303@ee.byu.edu>
Message-ID: <200602071154.39285.faltet@carabos.com>

On Tuesday 07 February 2006 01:08, Travis Oliphant wrote:
> In SVN of numpy, the dtype objects now have a .base attribute and a
> .shape attribute.
>
> The .shape attribute returns (1,) or the shape of the sub-array.

Uh, wouldn't it be better to put .shape = 1 in the case of a scalar field and (...) for a non-scalar field? Remember that this is the current convention for the numpy protocol.

> The .base attribute returns the data-type object of the base-type, or a
> new reference to self, if the object has no base type.
>
> Thus, in current SVN
>
> dtype['x'].base.name would always give you what you want.

Great. I like it. Thanks!

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"

From arnd.baecker at web.de Tue Feb 7 03:13:01 2006
From: arnd.baecker at web.de (Arnd Baecker)
Date: Tue Feb 7 03:13:01 2006
Subject: [Numpy-discussion] how to get data out of an object array in pyrex?
In-Reply-To: <43E7C03C.4060806@fastmail.fm>
References: <43E6188F.20703@fastmail.fm> <43E6D329.4060001@ieee.org> <43E7487C.2060600@fastmail.fm> <43E7B254.3040200@astraw.com> <43E7C03C.4060806@fastmail.fm>
Message-ID:

On Mon, 6 Feb 2006, Jeff Whitaker wrote:

> Andrew Straw wrote:
>
> > Hi Jeff,
> >
> > I've significantly updated the page at
> > http://scipy.org/Wiki/Cookbook/Pyrex_and_NumPy
> >
> > Pyrex should be able to do everything you need.
> >
> > I hope you find the revised page more useful. Please let me know (or
> > fix the page) if you have any issues or questions.
> >
> > Cheers!
> > Andrew
>
> Andrew: Thanks! That looks like exactly what I need. -Jeff

Very nice! Would it be a better policy for any runnable .py file to be an attachment to the page (see tst.py in http://scipy.org/Wiki/WikiSandBox), so that it can be easily downloaded? Presently one has to disable line numbers, copy the text, paste it into an editor and save it with the right file name...

Best, Arnd

From pearu at scipy.org Tue Feb 7 03:55:05 2006
From: pearu at scipy.org (Pearu Peterson)
Date: Tue Feb 7 03:55:05 2006
Subject: [Numpy-discussion] how to get data out of an object array in pyrex?
In-Reply-To: <43E7B254.3040200@astraw.com>
References: <43E6188F.20703@fastmail.fm> <43E6D329.4060001@ieee.org> <43E7487C.2060600@fastmail.fm> <43E7B254.3040200@astraw.com>
Message-ID:

On Mon, 6 Feb 2006, Andrew Straw wrote:

> I've significantly updated the page at
> http://scipy.org/Wiki/Cookbook/Pyrex_and_NumPy

FYI, numpy.distutils now supports building pyrex extension modules. See numpy/distutils/tests/pyrex_ext/ for a working example. In the case of Cookbook/Pyrex_and_NumPy, the corresponding setup.py file is:

#!/usr/bin/env python
def configuration(parent_package='', top_path=None):
    from numpy.distutils.misc_util import Configuration
    config = Configuration('mypackage', parent_package, top_path)
    config.add_extension('pyrex_and_numpy',
                         sources=['test.pyx'],
                         depends=['c_python.pxd', 'c_numpy.pxd'])
    return config

if __name__ == "__main__":
    from numpy.distutils.core import setup
    setup(**configuration(top_path='').todict())

And to build the package inplace, use

python setup.py build_src build_ext --inplace

Pearu

From arnd.baecker at web.de Tue Feb 7 06:02:05 2006
From: arnd.baecker at web.de (Arnd Baecker)
Date: Tue Feb 7 06:02:05 2006
Subject: [Numpy-discussion] Need compilations with compilers other than gcc
In-Reply-To: <43E840BE.5060204@ieee.org>
References: <43E840BE.5060204@ieee.org>
Message-ID:

Hi Travis,

On Mon, 6 Feb 2006, Travis Oliphant wrote:

> We need to test numpy on other compilers besides gcc, so that we can
> ferret out any gnu-isms that we may be relying on.
>
> Anybody out there with compilers they are willing to try out and/or
> report on?

Alright, we might need the asbestos suit thing:

One thing up front: I normally used

python numpy/distutils/system_info.py lapack_opt

to figure out which library numpy is going to use.
With current svn I get the folloowing error: Traceback (most recent call last): File "numpy/distutils/system_info.py", line 111, in ? from exec_command import find_executable, exec_command, get_pythonexe File "/work/home/baecker/INSTALL_PYTHON5_icc/CompileDir/numpy/numpy/distutils/exec_command.py", line 56, in ? from numpy.distutils.misc_util import is_sequence ImportError: No module named numpy.distutils.misc_util Concerning icc compilation I used: export FC_VENDOR=Intel export F77=ifort export CC=icc export CXX=icc python setup.py config --compiler=intel install --prefix=$DESTnumpyDIR | tee ../build_log_numpy_${nr}.txt The build log shows 1393 warnings 3362 remarks Should I post them off-list or on scipy-dev? Trying to test the resulting numpy gives: In [1]: import numpy import core -> failed: /home/baecker/python2/scipy_icc5_lintst_n_N0/lib/python2.4/site-packages/numpy/core/multiarray.so: undefined symbol: ?1__serial_memmove import random -> failed: 'module' object has no attribute 'dtype' import lib -> failed: /home/baecker/python2/scipy_icc5_lintst_n_N0/lib/python2.4/site-packages/numpy/core/multiarray.so: undefined symbol: ?1__serial_memmove import linalg -> failed: /home/baecker/python2/scipy_icc5_lintst_n_N0/lib/python2.4/site-packages/numpy/linalg/lapack_lite.so: undefined symbol: ?1__serial_memmove import dft -> failed: /home/baecker/python2/scipy_icc5_lintst_n_N0/lib/python2.4/site-packages/numpy/core/multiarray.so: undefined symbol: ?1__serial_memmove --------------------------------------------------------------------------- exceptions.ImportError Traceback (most recent call last) /work/home/baecker/ /home/baecker/python2/scipy_icc5_lintst_n_N0/lib/python2.4/site-packages/numpy/__init__.py 43 44 test = ScipyTest('numpy').test ---> 45 import add_newdocs 46 47 __doc__ += """ /home/baecker/python2/scipy_icc5_lintst_n_N0/lib/python2.4/site-packages/numpy/add_newdocs.py ----> 2 from lib import add_newdoc 3 4 add_newdoc('numpy.core','dtypedescr', 5 [('fields', "Fields of the data-typedescr if any."), 6 ('alignment', "Needed alignment for this data-type"), /home/baecker/python2/scipy_icc5_lintst_n_N0/lib/python2.4/site-packages/numpy/lib/__init__.py 3 from numpy.version import version as __version__ 4 ----> 5 from type_check import * 6 from index_tricks import * 7 from function_base import * /home/baecker/python2/scipy_icc5_lintst_n_N0/lib/python2.4/site-packages/numpy/lib/type_check.py 6 'common_type'] 7 ----> 8 import numpy.core.numeric as _nx 9 from numpy.core.numeric import ndarray, asarray, array, isinf, isnan, \ 10 isfinite, signbit, ufunc, ScalarType, obj2sctype /home/baecker/python2/scipy_icc5_lintst_n_N0/lib/python2.4/site-packages/numpy/core/__init__.py 3 from numpy.version import version as __version__ 4 ----> 5 import multiarray 6 import umath 7 import numerictypes as nt ImportError: /home/baecker/python2/scipy_icc5_lintst_n_N0/lib/python2.4/site-packages/numpy/core/multiarray.so: undefined symbol: ?1__serial_memmove I already reported this a month ago with a bit more information on a possible solution http://aspn.activestate.com/ASPN/Mail/Message/scipy-dev/2983903 Best, Arnd From faltet at carabos.com Tue Feb 7 06:44:01 2006 From: faltet at carabos.com (Francesc Altet) Date: Tue Feb 7 06:44:01 2006 Subject: [Numpy-discussion] Extent of unicode types in numpy In-Reply-To: <43E8495C.9020008@ieee.org> References: <1139250278.7538.52.camel@localhost.localdomain> <43E7F40F.7030303@cox.net> <43E8495C.9020008@ieee.org> Message-ID: <200602071542.35593.faltet@carabos.com> A 
Dimarts 07 Febrer 2006 08:16, Travis Oliphant va escriure: > In current SVN, numpy assumes 'w' is 2-byte unicode and 'W' is 4-byte > unicode in the array interface typestring. Right now these codes > require that the number of bytes be specified explicitly (to satisfy the > array interface requirement). There is still only 1 Unicode data-type > on the platform and it has the size of Python's Py_UNICODE type. The > character 'U' continues to be useful on data-type construction to stand > for a unicode string of a specific character length. It's internal dtype > representation will use 'w' or 'W' depending on how Python was compiled. > > This may not solve all issues, but at least it's a bit more consistent > and solves the problem of > > dtype(dtype('U8').str) not producing the same datatype. > > It also solves the problem of unicode written out with one compilation > of Python and attempted to be written in with another (it won't let you > because only one of 'w#' or 'W#' is supported on a platform. While I agree that this solution is more consistent, I must say that I'm not very confortable with having to deal with two different widths for unicode characters. What bothers me is the lack portability of unicode strings when saving them to disk in python interpreters UCS4-enabled and retrieving with UCS2-enabled ones in the context of PyTables (or any other database). Let's suppose that a user have a numpy object of type unicode that has been created in a python with UCS4. This would look like: # UCS4-aware interpreter here >>> numpy.array(u"\U000110fc", "U1") array(u'\U000110fc', dtype=(unicode,4)) Now, suppose that you save this in a PyTables file (for example) and you want to regenerate it on a python interpreter compiled with UCS2. As the buffer on-disk has a fixed length, we are forced to use unicode types twice as larger as containers for this data. So the net effect is that we will end in the UCS2 interpreter with an object like: # UCS2-aware interpreter here >>> numpy.array(u"\U000110fc", "U2") array(u'\U000110fc', dtype=(unicode,4)) which, apparently is the same than the one above, but not quite. To begin with, the former is an array that is an unicode scalar with only *one* character, while the later has *two* characters. But worse than that, the interpretation of the original content changes drastically in the UCS2 platform. For example, if we select the first and second characters of the string in the UCS2-aware platform, we have: >>> numpy.array(u"\U000110fc", "U2")[()][0] u'\ud804' >>> numpy.array(u"\U000110fc", "U2")[()][1] u'\udcfc' that have nothing to do with the original \U000110fc character (I'd expect to get at least the truncated values \u0001 and \u10fc). I think this is because of the conventions that are used to represent 32-bit unicode characters in UTF-16 using a technique called "surrogate pairs" (see: http://www.unicode.org/glossary/). All in all, my opinion is that allowing the coexistence of different sizes of unicode types in numpy would be a receipt for disaster when one wants to transport unicode characters between platforms with python interpreters compiled with different unicode sizes. Consequently I'd propose to suport just one size of unicode sizes in numpy, namely, the 4-byte one, and if this size doesn't match the underlying python platform, then refuse to deliver native unicode objects if the user is asking for them. 
Something like would work: # UCS2-aware interpreter here >>> h=numpy.array(u"\U000110fc", "U1") >>> h # This is a 'true' 32-bit unicode array in numpy array(u'\U000110fc', dtype=(unicode,4)) >>> h[()] # Try to get a native unicode object in python Traceback (most recent call last): File "", line 1, in ? ValueError: unicode sizes in numpy and your python interpreter doesn't match. Sorry, but you should get an UCS4-enable python interpreter if you want to successfully complete this operation. As a premium, we can get rid of the 'w' and 'W' typecodes that has been introduced a bit forcedly, IMO. I don't know, however, how difficult would be implementing this in numpy. Another option can be to refuse to compile numpy with UCS2-aware interpreters, but this sounds a bit extreme, but see below. OTOH, I'm not an expert in Unicode, but after googling a bit, I've found interesting recommendations about its use in Python. The first is from Uge Ubuchi in http://www.xml.com/pub/a/2005/06/15/py-xml.html. Here is the relevant excerpt: """ I also want to mention another general principle to keep in mind: if possible, use a Python install compiled to use UCS4 character storage [...] UCS4 uses more space to store characters, but there are some problems for XML processing in UCS2, which the Python core team is reluctant to address because the only known fixes would be too much of a burden on performance. Luckily, most distributors have heeded this advice and ship UCS4 builds of Python. """ So, it seems that the Python crew is not interested in solving problems with with UCS2. Now, towards the end of the PEP 261 ('Support for "wide" Unicode characters') one can read this as a final conclusion: """ This PEP represents the least-effort solution. Over the next several years, 32-bit Unicode characters will become more common and that may either convince us that we need a more sophisticated solution or (on the other hand) convince us that simply mandating wide Unicode characters is an appropriate solution. """ This PEP dates from 27-Jun-2001, so the "next several years" the author is referring to is nowadays. In fact, the interpreters in my Debian based Linux, are both compiled with UCS4. Despite of this, it seems that the default for compiling python is using UCS2 provided that you still need to pass the flag "--enable-unicode=ucs4" if you want to end with a UCS4-enabled interpreter. I wonder why they are doing this if that can positively lead to problems with XML as Uge Ubuchi said (?). Anyway, I don't know if the recommendation of compiling Python with UCS4 is spread enough or not in the different distributions, but people can easily check this with: >>> len(buffer(u"u")) 4 if the output of this is 4 (as in my example), then the interpreter is using UCS4; if it is 2, it is using UCS2. Finally, I agree that asking for help about these issues in the python list would be a good idea. Cheers, -- >0,0< Francesc Altet ? ? http://www.carabos.com/ V V C?rabos Coop. V. ??Enjoy Data "-" From pearu at scipy.org Tue Feb 7 07:00:05 2006 From: pearu at scipy.org (Pearu Peterson) Date: Tue Feb 7 07:00:05 2006 Subject: [Numpy-discussion] Need compilations with compilers other than gcc In-Reply-To: References: <43E840BE.5060204@ieee.org> Message-ID: On Tue, 7 Feb 2006, Arnd Baecker wrote: > Alright, we might need the asbestos suite thing: > > Something ahead: I normally used > python numpy/distutils/system_info.py lapack_opt > to figure out which library numpy is going to use. 
> With current svn I get the folloowing error: > > Traceback (most recent call last): > File "numpy/distutils/system_info.py", line 111, in ? > from exec_command import find_executable, exec_command, get_pythonexe > File > "/work/home/baecker/INSTALL_PYTHON5_icc/CompileDir/numpy/numpy/distutils/exec_command.py", > line 56, in ? > from numpy.distutils.misc_util import is_sequence > ImportError: No module named numpy.distutils.misc_util This occurs probably because numpy is not installed. > Concerning icc compilation I used: > > export FC_VENDOR=Intel This has no effect anymore. Use --fcompiler=intel instead. > export F77=ifort > export CC=icc > export CXX=icc > python setup.py config --compiler=intel install --prefix=$DESTnumpyDIR > | tee ../build_log_numpy_${nr}.txt There is no intel compiler. Allowed C compilers are unix,msvc,cygwin,mingw32,bcpp,mwerks,emx. Distutils should have given an exception when using --compiler=intel. If you are using IFC compiled blas/lapack libraries then --fcompiler=intel might produce importable extension modules (because then ifc is used for linking that knows about which intel libraries need be linked to a shared library). > Trying to test the resulting numpy gives: > > In [1]: import numpy > import core -> failed: > /home/baecker/python2/scipy_icc5_lintst_n_N0/lib/python2.4/site-packages/numpy/core/multiarray.so: > undefined symbol: ?1__serial_memmove > I already reported this a month ago with a bit more information > on a possible solution > http://aspn.activestate.com/ASPN/Mail/Message/scipy-dev/2983903 When Python is compiled with a different compiler than numpy (or any extension module) is going to be installed then proper libraries must be specified manually. Which libraries and flags are needed exactly, this is described in compilers manual. So, a recommended fix would be to build Python with icc and as a result correct libraries will be used for building 3rd party extension modules. Otherwise one has to read compilers manual, sections like about gcc-compatibility and linking might be useful. See also http://www.scipy.org/Wiki/FAQ#head-8371c35ef08b877875217aaac5489fc747b4aceb Pearu From arnd.baecker at web.de Tue Feb 7 07:27:17 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Tue Feb 7 07:27:17 2006 Subject: [Numpy-discussion] Need compilations with compilers other than gcc In-Reply-To: References: <43E840BE.5060204@ieee.org> Message-ID: On Tue, 7 Feb 2006, Pearu Peterson wrote: > On Tue, 7 Feb 2006, Arnd Baecker wrote: > > > Alright, we might need the asbestos suite thing: > > > > Something ahead: I normally used > > python numpy/distutils/system_info.py lapack_opt > > to figure out which library numpy is going to use. > > With current svn I get the folloowing error: > > > > Traceback (most recent call last): > > File "numpy/distutils/system_info.py", line 111, in ? > > from exec_command import find_executable, exec_command, get_pythonexe > > File > > "/work/home/baecker/INSTALL_PYTHON5_icc/CompileDir/numpy/numpy/distutils/exec_command.py", > > line 56, in ? > > from numpy.distutils.misc_util import is_sequence > > ImportError: No module named numpy.distutils.misc_util > > This occurs probably because numpy is not installed. Maybe I am wrong, but I thought that I could run the above command before any installation to see which libraries will be used. My installation notes on this give me the feeling that this used to work... > > Concerning icc compilation I used: > > > > export FC_VENDOR=Intel > > This has no effect anymore. 
Use --fcompiler=intel instead. OK - I have to confess that I am really confused about which options might work and which not. Is there a document which describes this? > > export F77=ifort > > export CC=icc > > export CXX=icc But these are still needed? > > python setup.py config --compiler=intel install --prefix=$DESTnumpyDIR > > | tee ../build_log_numpy_${nr}.txt > > There is no intel compiler. Allowed C compilers are > unix,msvc,cygwin,mingw32,bcpp,mwerks,emx. Distutils should have given an > exception when using --compiler=intel. > > If you are using IFC compiled blas/lapack libraries then --fcompiler=intel > might produce importable extension modules (because then ifc is used for > linking that knows about which intel libraries need be linked to a shared > library). For this test I haven't used any blas/lapack. But it is good to know. > > Trying to test the resulting numpy gives: > > > > In [1]: import numpy > > import core -> failed: > > /home/baecker/python2/scipy_icc5_lintst_n_N0/lib/python2.4/site-packages/numpy/core/multiarray.so: > > undefined symbol: ?1__serial_memmove > > > > > I already reported this a month ago with a bit more information > > on a possible solution > > http://aspn.activestate.com/ASPN/Mail/Message/scipy-dev/2983903 > > When Python is compiled with a different compiler than numpy (or any > extension module) is going to be installed then proper libraries must be > specified manually. Which libraries and flags are needed exactly, this is > described in compilers manual. > > So, a recommended fix would be to build Python with icc and as a > result correct libraries will be used for building 3rd party extension > modules. This would also mean that all dependent packages will have to be installed again, right? I am sorry but then I won't be able to help with icc at the moment as I am completely swamped with other stuff... > Otherwise one has to read compilers manual, sections like > about gcc-compatibility and linking might be useful. See also > http://www.scipy.org/Wiki/FAQ#head-8371c35ef08b877875217aaac5489fc747b4aceb I thought that supplying ``--libraries="irc"`` might cure the problem, but (quoting from http://aspn.activestate.com/ASPN/Mail/Message/scipy-dev/2983903 ) """ However, in the build log I only found -lirc for the config_tests but nowhere else. What should I do instead of the above? """ Best, Arnd From pearu at scipy.org Tue Feb 7 08:07:06 2006 From: pearu at scipy.org (Pearu Peterson) Date: Tue Feb 7 08:07:06 2006 Subject: [Numpy-discussion] Need compilations with compilers other than gcc In-Reply-To: References: <43E840BE.5060204@ieee.org> Message-ID: On Tue, 7 Feb 2006, Arnd Baecker wrote: > On Tue, 7 Feb 2006, Pearu Peterson wrote: > >> On Tue, 7 Feb 2006, Arnd Baecker wrote: >> >>> Alright, we might need the asbestos suite thing: >>> >>> Something ahead: I normally used >>> python numpy/distutils/system_info.py lapack_opt >>> to figure out which library numpy is going to use. >>> With current svn I get the folloowing error: >>> >>> Traceback (most recent call last): >>> File "numpy/distutils/system_info.py", line 111, in ? >>> from exec_command import find_executable, exec_command, get_pythonexe >>> File >>> "/work/home/baecker/INSTALL_PYTHON5_icc/CompileDir/numpy/numpy/distutils/exec_command.py", >>> line 56, in ? >>> from numpy.distutils.misc_util import is_sequence >>> ImportError: No module named numpy.distutils.misc_util >> >> This occurs probably because numpy is not installed. 
> > Maybe I am wrong, but I thought that I could run the above > command before any installation to see which > libraries will be used. > My installation notes on this give me the feeling that > this used to work... from numpy.distutils.misc_util import is_sequence, is_string should be changed to from misc_util import is_sequence, is_string to fix this. >>> Concerning icc compilation I used: >>> >>> export FC_VENDOR=Intel >> >> This has no effect anymore. Use --fcompiler=intel instead. > > OK - I have to confess that I am really confused about > which options might work and which not. > Is there a document which describes this? FC_VENDOR env. variable was used in old f2py long time ago. When Fortran compiler support was moved to scipy_distutils, --fcompiler option was introduced to config, config_fc, build_ext,.. setup.py commands. One should use any of these commands to specify a Fortran compiler and config_fc to change various Fortran compiler flags. See python setup.py config_fc --help for more information. How to enhance C compiler options, see standard Distutils documentation. >>> export F77=ifort >>> export CC=icc >>> export CXX=icc > > But these are still needed? No for F77, using --fcompiler=.. should be enough. I am not sure about CC, CXX, must try it out.. >> When Python is compiled with a different compiler than numpy (or any >> extension module) is going to be installed then proper libraries must be >> specified manually. Which libraries and flags are needed exactly, this is >> described in compilers manual. >> >> So, a recommended fix would be to build Python with icc and as a >> result correct libraries will be used for building 3rd party extension >> modules. > > This would also mean that all dependent packages will have > to be installed again, right? > I am sorry but then I won't be able to help with icc at the moment > as I am completely swamped with other stuff... > >> Otherwise one has to read compilers manual, sections like >> about gcc-compatibility and linking might be useful. See also >> http://www.scipy.org/Wiki/FAQ#head-8371c35ef08b877875217aaac5489fc747b4aceb > > I thought that supplying ``--libraries="irc"`` > might cure the problem, but > (quoting from > http://aspn.activestate.com/ASPN/Mail/Message/scipy-dev/2983903 > ) > """ > However, in the build log I only found -lirc for > the config_tests but nowhere else. > What should I do instead of the above? > """ Try: export CC=icc python setup.py build build_ext -lirc This will probably use gcc for linking but might fix undefined symbol problems. Pearu From cjw at sympatico.ca Tue Feb 7 10:02:15 2006 From: cjw at sympatico.ca (Colin J. Williams) Date: Tue Feb 7 10:02:15 2006 Subject: [Numpy-discussion] Is the Python types module superfluous? In-Reply-To: <43E13502.6050207@ee.byu.edu> References: <43DFD598.5000503@colorado.edu> <43E0309D.5050700@sympatico.ca> <43E03349.10208@ieee.org> <43E0DE8D.6020907@sympatico.ca> <20060201181434.0cbb368a.gerard.vermeulen@grenoble.cnrs.fr> <43E13502.6050207@ee.byu.edu> Message-ID: <43E8CDCF.10303@sympatico.ca> Travis Oliphant wrote: > Gerard Vermeulen wrote: > >> On Wed, 01 Feb 2006 11:15:09 -0500 >> "Colin J. 
Williams" wrote: >> >> [ currently numpy uses ndarray, with synonym ArrayType, for a >> multidimensional array ] >> >> >> >>> [Dbg]>>> import types >>> [Dbg]>>> dir(types) >>> ['BooleanType', 'BufferType', 'BuiltinFunctionType', >>> 'BuiltinMethodType', 'ClassType', 'CodeType', 'ComplexType', >>> 'DictProxyType', 'DictType', 'DictionaryType', 'EllipsisType', >>> 'FileType', 'FloatType', 'FrameType', 'FunctionType', 'GeneratorType', >>> 'Instance >>> Type', 'IntType', 'LambdaType', 'ListType', 'LongType', 'MethodType', >>> 'ModuleType', 'NoneType', 'NotImplementedType', 'ObjectType', >>> 'SliceType', 'StringType', 'StringTypes', 'TracebackType', 'TupleType', >>> 'TypeType', 'UnboundMethodType', 'UnicodeType', 'XRan >>> geType', '__builtins__', '__doc__', '__file__', '__name__'] >>> [Dbg]>>> >>> >>> >> >> >> Isn't the types module becoming superfluous? >> >> >> > That's the point I was trying to make. ArrayType is to ndarray as > DictionaryType is to dict. My understanding is that the use of > types.DictionaryType is discouraged. > > -Travis > I was simply trying to suggest that the name ArrayType is more appropriate name that ndbigarray or ndarray for the multidimensional array. Since the intent is, in the long run, to integrate numpy with the Python distribution, the use of a name in the style of the existing Python types would appear to be better. Is the types module becoming superfluous? I've cross posted to c.l.p to seek information on this. Colin W. From arnd.baecker at web.de Tue Feb 7 10:13:26 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Tue Feb 7 10:13:26 2006 Subject: [Numpy-discussion] Need compilations with compilers other than gcc In-Reply-To: References: <43E840BE.5060204@ieee.org> Message-ID: On Tue, 7 Feb 2006, Pearu Peterson wrote: [... /numpy/distutils/exec_command.py ...] > from numpy.distutils.misc_util import is_sequence, is_string > > should be changed to > > from misc_util import is_sequence, is_string > > to fix this. Making the same type of change in numpy/distutils/system_info.py worked if ATLAS is not used (`export ATLAS=None`). Otherwise I get: python numpy/distutils/system_info.py lapack_opt lapack_opt_info: lapack_mkl_info: mkl_info: NOT AVAILABLE NOT AVAILABLE atlas_threads_info: Setting PTATLAS=ATLAS system_info.atlas_threads_info Setting PTATLAS=ATLAS Setting PTATLAS=ATLAS FOUND: libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas'] library_dirs = ['/home/baecker/python2/lib/atlas'] language = f77 include_dirs = ['/usr/include'] Traceback (most recent call last): File "numpy/distutils/system_info.py", line 1693, in ? show_all() File "numpy/distutils/system_info.py", line 1689, in show_all r = c.get_info() File "/work/home/baecker/INSTALL_PYTHON5_icc/CompileDir/numpy/numpy/distutils/system_info.py", line 338, in get_info self.calc_info() File "/work/home/baecker/INSTALL_PYTHON5_icc/CompileDir/numpy/numpy/distutils/system_info.py", line 1123, in calc_info atlas_version = get_atlas_version(**version_info) File "/work/home/baecker/INSTALL_PYTHON5_icc/CompileDir/numpy/numpy/distutils/system_info.py", line 1028, in get_atlas_version from core import Extension, setup File "/work/home/baecker/INSTALL_PYTHON5_icc/CompileDir/numpy/numpy/distutils/core.py", line 12, in ? from numpy.distutils.extension import Extension ImportError: No module named numpy.distutils.extension numpy/distutils/core.py is full of `from numpy.distutils.command import ...`. 
> >>> Concerning icc compilation I used: > >>> > >>> export FC_VENDOR=Intel > >> > >> This has no effect anymore. Use --fcompiler=intel instead. > > > > OK - I have to confess that I am really confused about > > which options might work and which not. > > Is there a document which describes this? > > FC_VENDOR env. variable was used in old f2py long time ago. When Fortran > compiler support was moved to scipy_distutils, --fcompiler option was > introduced to config, config_fc, build_ext,.. setup.py commands. > One should use any of these commands to specify a Fortran compiler and > config_fc to change various Fortran compiler flags. See > python setup.py config_fc --help > for more information. > > How to enhance C compiler options, see standard Distutils documentation. > > >>> export F77=ifort > >>> export CC=icc > >>> export CXX=icc > > > > But these are still needed? > > No for F77, using --fcompiler=.. should be enough. I am not sure about CC, > CXX, must try it out.. > > >> When Python is compiled with a different compiler than numpy (or any > >> extension module) is going to be installed then proper libraries must be > >> specified manually. Which libraries and flags are needed exactly, this is > >> described in compilers manual. > >> > >> So, a recommended fix would be to build Python with icc and as a > >> result correct libraries will be used for building 3rd party extension > >> modules. > > > > This would also mean that all dependent packages will have > > to be installed again, right? > > I am sorry but then I won't be able to help with icc at the moment > > as I am completely swamped with other stuff... > > > >> Otherwise one has to read compilers manual, sections like > >> about gcc-compatibility and linking might be useful. See also > >> http://www.scipy.org/Wiki/FAQ#head-8371c35ef08b877875217aaac5489fc747b4aceb > > > > I thought that supplying ``--libraries="irc"`` > > might cure the problem, but > > (quoting from > > http://aspn.activestate.com/ASPN/Mail/Message/scipy-dev/2983903 > > ) > > """ > > However, in the build log I only found -lirc for > > the config_tests but nowhere else. > > What should I do instead of the above? > > """ > > Try: > > export CC=icc > python setup.py build build_ext -lirc > > This will probably use gcc for linking Yes, it does use gcc for linking. I also had to specify the location of `libirc`, export CC=icc python setup.py build build_ext -L/opt/intel/cc_90/lib/ -lirc followed by python setup.py config --fcompiler=intel install worked. On import I get another error import core -> failed: /home/baecker/python2/scipy_icc5_lintst_n_N3/lib/python2.4/site-packages/numpy/core/umath.so: undefined symbol: __libm_sincos import random -> failed: 'module' object has no attribute 'dtype' import lib -> failed: /home/baecker/python2/scipy_icc5_lintst_n_N3/lib/python2.4/site-packages/numpy/core/umath.so: undefined symbol: __libm_sincos import linalg -> failed: /opt/intel/fc_90/lib/libunwind.so.6: undefined symbol: ?1__serial_memmove import dft -> failed: /home/baecker/python2/scipy_icc5_lintst_n_N3/lib/python2.4/site-packages/numpy/core/umath.so: undefined symbol: __libm_sincos So it seems I will have to specify more libraries, Would this be the correct syntax: python setup.py build build_ext -L/opt/intel/cc_90/lib/:SomeOtherPath -lirc:someotherlibrary ? 
>From ``python setup.py build build_ext --help`` --libraries (-l) external C libraries to link with --library-dirs (-L) directories to search for external C libraries (separated by ':') it is not clear how to specify several libraries with "-l"? But that did not work (neither did -lirc -lm) > but might fix undefined symbol problems. Many thanks, Arnd From oliphant.travis at ieee.org Tue Feb 7 10:16:13 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue Feb 7 10:16:13 2006 Subject: [Numpy-discussion] Re: [SciPy-user] Replacing Numeric With NumPy In-Reply-To: References: Message-ID: <43E8E39D.5020605@ieee.org> Rich Shepard wrote: > Last evening I downloaded numpy-0.9.4 and scipy-0.4.4. I have an earlier >version of Numeric in /usr/lib/python2.4/site-packages/Numeric/. Should I >remove all references to Numeric before installing NumPy? > >Rich > > > No need to do that. Numeric and NumPy (import numpy) can live happily together. With versions of Numeric about 24.0, then can even share the same data. -Travis From rshepard at appl-ecosys.com Tue Feb 7 10:18:40 2006 From: rshepard at appl-ecosys.com (Rich Shepard) Date: Tue Feb 7 10:18:40 2006 Subject: [Numpy-discussion] Re: [SciPy-user] Replacing Numeric With NumPy In-Reply-To: <43E8E39D.5020605@ieee.org> References: <43E8E39D.5020605@ieee.org> Message-ID: On Tue, 7 Feb 2006, Travis Oliphant wrote: > No need to do that. Numeric and NumPy (import numpy) can live happily > together. With versions of Numeric about 24.0, then can even share the same > data. Travis, Are there advantages to having both on the system? I read the Numeric manual a couple of times, but haven't looked deeply at the division between the two. Many thanks, Rich -- Richard B. Shepard, Ph.D. | Author of "Quantifying Environmental Applied Ecosystem Services, Inc. (TM) | Impact Assessments Using Fuzzy Logic" Voice: 503-667-4517 Fax: 503-667-8863 From efiring at hawaii.edu Tue Feb 7 10:22:25 2006 From: efiring at hawaii.edu (Eric Firing) Date: Tue Feb 7 10:22:25 2006 Subject: [Numpy-discussion] Extent of unicode types in numpy In-Reply-To: <200602071542.35593.faltet@carabos.com> References: <1139250278.7538.52.camel@localhost.localdomain> <43E7F40F.7030303@cox.net> <43E8495C.9020008@ieee.org> <200602071542.35593.faltet@carabos.com> Message-ID: <43E8E529.6030308@hawaii.edu> Francesc, Travis, Francesc Altet wrote: [...] > All in all, my opinion is that allowing the coexistence of different > sizes of unicode types in numpy would be a receipt for disaster when > one wants to transport unicode characters between platforms with > python interpreters compiled with different unicode sizes. I agree--it would be a nightmare. > Anyway, I don't know if the recommendation of compiling Python with > UCS4 is spread enough or not in the different distributions, but > people can easily check this with: > > >>>>len(buffer(u"u")) > > 4 > > if the output of this is 4 (as in my example), then the interpreter is > using UCS4; if it is 2, it is using UCS2. No, it is not sufficiently widespread; Mandriva 2006 python is compiled for UCS2. 
Eric From tim.hochberg at cox.net Tue Feb 7 10:34:09 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Tue Feb 7 10:34:09 2006 Subject: [Numpy-discussion] Extent of unicode types in numpy In-Reply-To: <43E8E529.6030308@hawaii.edu> References: <1139250278.7538.52.camel@localhost.localdomain> <43E7F40F.7030303@cox.net> <43E8495C.9020008@ieee.org> <200602071542.35593.faltet@carabos.com> <43E8E529.6030308@hawaii.edu> Message-ID: <43E8E7C8.3030206@cox.net> Eric Firing wrote: > Francesc, Travis, > > Francesc Altet wrote: > [...] > >> All in all, my opinion is that allowing the coexistence of different >> sizes of unicode types in numpy would be a receipt for disaster when >> one wants to transport unicode characters between platforms with >> python interpreters compiled with different unicode sizes. > > > I agree--it would be a nightmare. > > >> Anyway, I don't know if the recommendation of compiling Python with >> UCS4 is spread enough or not in the different distributions, but >> people can easily check this with: >> >> >>>>> len(buffer(u"u")) >>>> >> >> 4 >> >> if the output of this is 4 (as in my example), then the interpreter is >> using UCS4; if it is 2, it is using UCS2. > > > No, it is not sufficiently widespread; Mandriva 2006 python is > compiled for UCS2. Also the default build for MS Windows is compiled for UCS2. How about always storing data as UCS4 and converting it on the fly to UCS2 when extracting a python string from the array, if on a UCS2 python build. Isn't converting to UCS2 simply a matter of lopping off the top two bytes? If so, converting it should be simply a check that the value is not out of range, followed by the aforementioned lopping. -tim From gerard.vermeulen at grenoble.cnrs.fr Tue Feb 7 10:50:04 2006 From: gerard.vermeulen at grenoble.cnrs.fr (Gerard Vermeulen) Date: Tue Feb 7 10:50:04 2006 Subject: [Numpy-discussion] Extent of unicode types in numpy In-Reply-To: <200602071542.35593.faltet@carabos.com> References: <1139250278.7538.52.camel@localhost.localdomain> <43E7F40F.7030303@cox.net> <43E8495C.9020008@ieee.org> <200602071542.35593.faltet@carabos.com> Message-ID: <20060207194858.67cd0698.gerard.vermeulen@grenoble.cnrs.fr> On Tue, 7 Feb 2006 15:42:34 +0100 Francesc Altet wrote: > A Dimarts 07 Febrer 2006 08:16, Travis Oliphant va escriure: > > In current SVN, numpy assumes 'w' is 2-byte unicode and 'W' is 4-byte > > unicode in the array interface typestring. Right now these codes > > require that the number of bytes be specified explicitly (to satisfy the > > array interface requirement). There is still only 1 Unicode data-type > > on the platform and it has the size of Python's Py_UNICODE type. The > > character 'U' continues to be useful on data-type construction to stand > > for a unicode string of a specific character length. It's internal dtype > > representation will use 'w' or 'W' depending on how Python was compiled. > > > > This may not solve all issues, but at least it's a bit more consistent > > and solves the problem of > > > > dtype(dtype('U8').str) not producing the same datatype. > > > > It also solves the problem of unicode written out with one compilation > > of Python and attempted to be written in with another (it won't let you > > because only one of 'w#' or 'W#' is supported on a platform. > > While I agree that this solution is more consistent, I must say that > I'm not very confortable with having to deal with two different widths > for unicode characters. 
What bothers me is the lack portability of > unicode strings when saving them to disk in python interpreters > UCS4-enabled and retrieving with UCS2-enabled ones in the context of > PyTables (or any other database). Let's suppose that a user have a > numpy object of type unicode that has been created in a python with > UCS4. This would look like: > > # UCS4-aware interpreter here > >>> numpy.array(u"\U000110fc", "U1") > array(u'\U000110fc', dtype=(unicode,4)) > > Now, suppose that you save this in a PyTables file (for example) and > you want to regenerate it on a python interpreter compiled with UCS2. > As the buffer on-disk has a fixed length, we are forced to use unicode > types twice as larger as containers for this data. So the net effect > is that we will end in the UCS2 interpreter with an object like: > > # UCS2-aware interpreter here > >>> numpy.array(u"\U000110fc", "U2") > array(u'\U000110fc', dtype=(unicode,4)) > > which, apparently is the same than the one above, but not quite. To > begin with, the former is an array that is an unicode scalar with only > *one* character, while the later has *two* characters. But worse than > that, the interpretation of the original content changes drastically > in the UCS2 platform. For example, if we select the first and second > characters of the string in the UCS2-aware platform, we have: > > >>> numpy.array(u"\U000110fc", "U2")[()][0] > u'\ud804' > >>> numpy.array(u"\U000110fc", "U2")[()][1] > u'\udcfc' > > that have nothing to do with the original \U000110fc character (I'd > expect to get at least the truncated values \u0001 and \u10fc). I > think this is because of the conventions that are used to represent > 32-bit unicode characters in UTF-16 using a technique called > "surrogate pairs" (see: http://www.unicode.org/glossary/). > > All in all, my opinion is that allowing the coexistence of different > sizes of unicode types in numpy would be a receipt for disaster when > one wants to transport unicode characters between platforms with > python interpreters compiled with different unicode sizes. > Consequently I'd propose to suport just one size of unicode sizes in > numpy, namely, the 4-byte one, and if this size doesn't match the > underlying python platform, then refuse to deliver native unicode > objects if the user is asking for them. Something like would work: > > # UCS2-aware interpreter here > >>> h=numpy.array(u"\U000110fc", "U1") > >>> h # This is a 'true' 32-bit unicode array in numpy > array(u'\U000110fc', dtype=(unicode,4)) > >>> h[()] # Try to get a native unicode object in python > Traceback (most recent call last): > File "", line 1, in ? > ValueError: unicode sizes in numpy and your python interpreter doesn't > match. Sorry, but you should get an UCS4-enable python interpreter if > you want to successfully complete this operation. > > As a premium, we can get rid of the 'w' and 'W' typecodes that has > been introduced a bit forcedly, IMO. I don't know, however, how > difficult would be implementing this in numpy. Another option can be > to refuse to compile numpy with UCS2-aware interpreters, but this > sounds a bit extreme, but see below. > > OTOH, I'm not an expert in Unicode, but after googling a bit, I've > found interesting recommendations about its use in Python. The first > is from Uge Ubuchi in http://www.xml.com/pub/a/2005/06/15/py-xml.html. 
> Here is the relevant excerpt: > > """ > I also want to mention another general principle to keep in mind: if > possible, use a Python install compiled to use UCS4 character storage > [...] UCS4 uses more space to store characters, but there are some > problems for XML processing in UCS2, which the Python core team is > reluctant to address because the only known fixes would be too much of > a burden on performance. Luckily, most distributors have heeded this > advice and ship UCS4 builds of Python. > """ > > So, it seems that the Python crew is not interested in solving > problems with with UCS2. Now, towards the end of the PEP 261 ('Support > for "wide" Unicode characters') one can read this as a final > conclusion: > > """ > This PEP represents the least-effort solution. Over the next several > years, 32-bit Unicode characters will become more common and that may > either convince us that we need a more sophisticated solution or (on > the other hand) convince us that simply mandating wide Unicode > characters is an appropriate solution. > """ > > This PEP dates from 27-Jun-2001, so the "next several years" the > author is referring to is nowadays. In fact, the interpreters in my > Debian based Linux, are both compiled with UCS4. Despite of this, it > seems that the default for compiling python is using UCS2 provided > that you still need to pass the flag "--enable-unicode=ucs4" if you > want to end with a UCS4-enabled interpreter. I wonder why they are > doing this if that can positively lead to problems with XML as Uge > Ubuchi said (?). > > Anyway, I don't know if the recommendation of compiling Python with > UCS4 is spread enough or not in the different distributions, but > people can easily check this with: > > >>> len(buffer(u"u")) > 4 > > if the output of this is 4 (as in my example), then the interpreter is > using UCS4; if it is 2, it is using UCS2. > > Finally, I agree that asking for help about these issues in the python > list would be a good idea. > I have no good solution for this problem, but the standard Python on my 1-year old Mandrake is still UCS2 and I quote from PEP-261: Windows builds will be narrow for a while based on the fact that there have been few requests for wide characters, those requests are mostly from hard-core programmers with the ability to buy their own Python and Windows itself is strongly biased towards 16-bit characters. Suppose that is still true. Maybe Vista will change that. Wouldn't it be possible that numpy takes care of the "surrogate pairs" when transferring unicode strings from UCS2-interpreters to UCS4-ndarrays and vice-versa? It would be nice to be able to cast explicitly between UCS2- and UCS4- arrays, too. Requesting users to recompile their Python is a rather brutal solution :-) Gerard From oliphant.travis at ieee.org Tue Feb 7 11:09:06 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue Feb 7 11:09:06 2006 Subject: [Numpy-discussion] Is there a known problem compiling numpy with VC7? In-Reply-To: <43E8EE72.6070101@cox.net> References: <43E41E52.6060805@cox.net> <43E42B26.2090304@ee.byu.edu> <43E42C2C.6000208@cox.net> <43E45A95.6040601@ieee.org> <43E81650.2040204@cox.net> <43E81BFA.7060600@ieee.org> <43E82930.7070103@cox.net> <43E83FD3.90206@ieee.org> <43E8EE72.6070101@cox.net> Message-ID: <43E8EFE9.5040207@ieee.org> Tim Hochberg wrote: > > A couple of more minor issues. > > 1. numpy/random/mtrand/distributions.c needs M_PI defined if is not > already. 
I used the def from umathmodule.c: > > #ifndef M_PI > #define M_PI 3.14159265358979323846264338328 > #endif > > 2. The math library m.lib was hardcoded into numpy/random/setup.py. > I simply replaced ['m'] with [], which is probably not right in > general. It should probably be grabbed from config.h. > > 3. This made it through all the compiling, but blew up on linking > randomkit because sever CryptXXX functions were not defined. I added > 'Advapi32' to the libraries list. (In total libraries went from ['m'] > to ['Advapi32']. > > With this I got a full compile. I successfully imported numpy and > added a couple of matrices. Hooray! > > Is there a way to run it through some regression tests? That seems > like it should be the next step. > > Let's see if we can't fix up the setup.py file to handle this common platform correctly.... import numpy numpy.test(1,1) -Travis From oliphant.travis at ieee.org Tue Feb 7 11:12:08 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue Feb 7 11:12:08 2006 Subject: [Numpy-discussion] Re: [SciPy-user] Replacing Numeric With NumPy In-Reply-To: References: <43E8E39D.5020605@ieee.org> Message-ID: <43E8F0B4.6090502@ieee.org> Rich Shepard wrote: > On Tue, 7 Feb 2006, Travis Oliphant wrote: > >> No need to do that. Numeric and NumPy (import numpy) can live happily >> together. With versions of Numeric about 24.0, then can even share >> the same >> data. > > > Travis, > > Are there advantages to having both on the system? I read the Numeric > manual a couple of times, but haven't looked deeply at the division > between > the two. The only real advantage is to ease the transition burden. Several third-party libraries have not converted yet, so to use those you still need Numeric. -Travis From oliphant.travis at ieee.org Tue Feb 7 11:27:04 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue Feb 7 11:27:04 2006 Subject: [Numpy-discussion] Extent of unicode types in numpy In-Reply-To: <20060207194858.67cd0698.gerard.vermeulen@grenoble.cnrs.fr> References: <1139250278.7538.52.camel@localhost.localdomain> <43E7F40F.7030303@cox.net> <43E8495C.9020008@ieee.org> <200602071542.35593.faltet@carabos.com> <20060207194858.67cd0698.gerard.vermeulen@grenoble.cnrs.fr> Message-ID: <43E8F449.2060600@ieee.org> Gerard Vermeulen wrote: >>While I agree that this solution is more consistent, I must say that >>I'm not very confortable with having to deal with two different widths >>for unicode characters. >> Python itself hands us this difference. Is it really so different then the fact that python integers are either 32-bit or 64-bit depending on the platform. Perhaps what this is telling us, is that we do indeed need another data-type for 4-byte unicode. It's how we solve the problem of 32-bit or 64-bit integers (we have a 64-bit integer on all platforms). Then in NumPy we can support going back and forth between UCS-2 (which we can then say is UTF-16) and UCS-4. The issue with saving to disk is really one of encoding anyway. So, if PyTables want's do do this correctly, then it should be using a particular encoding anyway. The internal representation of Unicode should not technically matter as it's only input and output that is important. I won't support requiring a UCS-4 build of Python, though. That's too stringent. Most characters are contained within the 0th plane of UCS-2. For the additional characters (only up to 0x0010FFFF are defined), the surrogate pairs can be used. 
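The surrogate-pair arithmetic itself is mechanical. A small sketch (a
hypothetical helper, not numpy code); for U+110FC it reproduces exactly
the two code units quoted earlier in this thread:

def to_surrogate_pair(cp):
    # split a code point beyond the BMP (0x10000..0x10FFFF) into a
    # UTF-16 high/low surrogate pair
    assert 0x10000 <= cp <= 0x10FFFF
    cp -= 0x10000
    return 0xD800 + (cp >> 10), 0xDC00 + (cp & 0x3FF)

print map(hex, to_surrogate_pair(0x110FC))   # ['0xd804', '0xdcfc']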
I think the best solution is to define separate UCS4 and UCS2 data-types and handle conversion between them using the casting functions. This is a bit of work to implement, but not too bad... >Wouldn't it be possible that numpy takes care of the "surrogate pairs" >when transferring unicode strings from UCS2-interpreters to UCS4-ndarrays >and vice-versa? > >It would be nice to be able to cast explicitly between UCS2- and UCS4- arrays, >too. > >Requesting users to recompile their Python is a rather brutal solution :-) > > I agree. I much prefer an additional data-type since that is after-all what UCS2 and UCS4 are... different data-types. -Travis From rshepard at appl-ecosys.com Tue Feb 7 11:32:03 2006 From: rshepard at appl-ecosys.com (Rich Shepard) Date: Tue Feb 7 11:32:03 2006 Subject: [Numpy-discussion] Re: [SciPy-user] Replacing Numeric With NumPy In-Reply-To: <43E8F0B4.6090502@ieee.org> References: <43E8E39D.5020605@ieee.org> <43E8F0B4.6090502@ieee.org> Message-ID: On Tue, 7 Feb 2006, Travis Oliphant wrote: > The only real advantage is to ease the transition burden. Several > third-party libraries have not converted yet, so to use those you still > need Numeric. Thank you. Rich -- Richard B. Shepard, Ph.D. | Author of "Quantifying Environmental Applied Ecosystem Services, Inc. (TM) | Impact Assessments Using Fuzzy Logic" Voice: 503-667-4517 Fax: 503-667-8863 From faltet at carabos.com Tue Feb 7 12:09:05 2006 From: faltet at carabos.com (Francesc Altet) Date: Tue Feb 7 12:09:05 2006 Subject: [Numpy-discussion] Extent of unicode types in numpy In-Reply-To: <43E8F449.2060600@ieee.org> References: <1139250278.7538.52.camel@localhost.localdomain> <43E7F40F.7030303@cox.net> <43E8495C.9020008@ieee.org> <200602071542.35593.faltet@carabos.com> <20060207194858.67cd0698.gerard.vermeulen@grenoble.cnrs.fr> <43E8F449.2060600@ieee.org> Message-ID: <1139342874.7544.37.camel@localhost.localdomain> El dt 07 de 02 del 2006 a les 12:26 -0700, en/na Travis Oliphant va escriure: > Python itself hands us this difference. Is it really so different then > the fact that python integers are either 32-bit or 64-bit depending on > the platform. > > Perhaps what this is telling us, is that we do indeed need another > data-type for 4-byte unicode. It's how we solve the problem of 32-bit > or 64-bit integers (we have a 64-bit integer on all platforms). Agreed. > Then in NumPy we can support going back and forth between UCS-2 (which > we can then say is UTF-16) and UCS-4. If this could be implemented, then excellent! > The issue with saving to disk is really one of encoding anyway. So, if > PyTables want's do do this correctly, then it should be using a > particular encoding anyway. The problem with unicode encodings is that most (I'm thinking in UTF-8 and UTF-16) choose (correct me if I'm wrong here) a technique of surrogating pairs when trying to encode values that doesn't fit in a single word (7 bits for UTF-8 and 15 bits for UTF-16), which brings to a *variable* length of the coded output. And this is precisely the point: PyTables (as NumPy itself, or any other piece of software with efficiency in mind) would require a *fixed* space for keeping data, not a space that can be bigger or smaller depending on the number of surrogate pairs that should be used to encode a certain unicode string. 
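The variable length is easy to see from Python itself -- a quick
demonstration (assuming a UCS4 build, so the third literal is a single
character):

for u in (u"a", u"\u20ac", u"\U000110fc"):
    print repr(u), len(u.encode("utf-8")), len(u.encode("utf-16-be"))
# u'a'          -> 1 byte  UTF-8, 2 bytes UTF-16
# u'\u20ac'     -> 3 bytes UTF-8, 2 bytes UTF-16
# u'\U000110fc' -> 4 bytes UTF-8, 4 bytes UTF-16 (a surrogate pair)

A fixed-width cell on disk therefore has to be sized for the worst case.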
But, if what you are saying is that NumPy would adopt a 32-bit unicode type internally and then do the appropriate conversion to/from the python interpreter, then this is perfect, because it is the buffer of NumPy that will be used to be written/read to/from disk, not the Python object, and the buffer of such a NumPy object meets the requisites to become an efficient buffer: fixed length *and* large enough to keep *every* Unicode character without a need to use encodings. > I think the best solution is to define separate UCS4 and UCS2 data-types > and handle conversion between them using the casting functions. This > is a bit of work to implement, but not too bad... Well, I don't understand well here. I thought that you were proposing a 32-bit unicode type for NumPy and then converting it appropriately to UCS2 (conversion to UCS4 wouldn't be necessary as it would be the same as the native NumPy unicode type) just in case that the user requires an scalar out of the NumPy object. But you are talking here about defining separate UCS4 and UCS2 data-types. I admit that I'm loosed here... Regards, -- >0,0< Francesc Altet http://www.carabos.com/ V V C?rabos Coop. V. Enjoy Data "-" From oliphant.travis at ieee.org Tue Feb 7 12:37:03 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue Feb 7 12:37:03 2006 Subject: [Numpy-discussion] Extent of unicode types in numpy In-Reply-To: <1139342874.7544.37.camel@localhost.localdomain> References: <1139250278.7538.52.camel@localhost.localdomain> <43E7F40F.7030303@cox.net> <43E8495C.9020008@ieee.org> <200602071542.35593.faltet@carabos.com> <20060207194858.67cd0698.gerard.vermeulen@grenoble.cnrs.fr> <43E8F449.2060600@ieee.org> <1139342874.7544.37.camel@localhost.localdomain> Message-ID: <43E904AC.5060002@ieee.org> Francesc Altet wrote: >El dt 07 de 02 del 2006 a les 12:26 -0700, en/na Travis Oliphant va >escriure: > > >>Python itself hands us this difference. Is it really so different then >>the fact that python integers are either 32-bit or 64-bit depending on >>the platform. >> >>Perhaps what this is telling us, is that we do indeed need another >>data-type for 4-byte unicode. It's how we solve the problem of 32-bit >>or 64-bit integers (we have a 64-bit integer on all platforms). >> >> > >Agreed. > > > >>Then in NumPy we can support going back and forth between UCS-2 (which >>we can then say is UTF-16) and UCS-4. >> >> > >If this could be implemented, then excellent! > > Sure it could be implemented. It's just a matter of effort. Python itself always defines a Py_UCS4 type even on UCS2 builds. We would just have to make sure Py_UCS2 is always defined as well. The biggest hassle is implementing the corresponding scalar type. The one corresponding to the build for Python comes free. The other would have to be implemented directly. >The problem with unicode encodings is that most (I'm thinking in UTF-8 >and UTF-16) choose (correct me if I'm wrong here) a technique of >surrogating pairs when trying to encode values that doesn't fit in a >single word (7 bits for UTF-8 and 15 bits for UTF-16), which brings to a >*variable* length of the coded output. And this is precisely the point: >PyTables (as NumPy itself, or any other piece of software with >efficiency in mind) would require a *fixed* space for keeping data, not >a space that can be bigger or smaller depending on the number of >surrogate pairs that should be used to encode a certain unicode string. 
> > You are correct that encoding introduces a variable byte-length per character (up to 6 for UTF-8 and up to 2 for UTF-16 I think). I've seen data-bases handle this by warning the user to make sure the size of their data area is large enough to handle their longest use case. You can still used fixed-sizes you just have to make sure they are large enough (or risk truncation). >But, if what you are saying is that NumPy would adopt a 32-bit unicode >type internally and then do the appropriate conversion to/from the >python interpreter, then this is perfect, because it is the buffer of >NumPy that will be used to be written/read to/from disk, not the Python >object, and the buffer of such a NumPy object meets the requisites to >become an efficient buffer: fixed length *and* large enough to keep >*every* Unicode character without a need to use encodings. > > I see the value in such a buffer, I really do. I'm just concerned about forcing everyone to use Python UCS4 builds. That is way too stringent. I'm afraid the only real solution is to implement a UCS2 and a UCS4 data-type. >Well, I don't understand well here. I thought that you were proposing a >32-bit unicode type for NumPy and then converting it appropriately to >UCS2 (conversion to UCS4 wouldn't be necessary as it would be the same >as the native NumPy unicode type) just in case that the user requires an >scalar out of the NumPy object. But you are talking here about defining >separate UCS4 and UCS2 data-types. I admit that I'm loosed here... > > > I suppose that is another approach: we could internally have all UNICODE data-types use 4-bytes and do the conversions necessary. But, it would still require us to do most of work of supporting two data-types. Currently, the unicode scalar object is a simple inheritance from Python's UNICODE data-type. That would have to change and the work to do that is most of the work to support two different data-types. So, if we are going to go through that effort. I would rather see the result be two different Unicode data-types supported. -Travis From oliphant.travis at ieee.org Tue Feb 7 16:52:04 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue Feb 7 16:52:04 2006 Subject: ***[Possible UCE]*** Re: [Numpy-discussion] Is there a known problem compiling numpy with VC7? Success In-Reply-To: <43E9235E.70004@cox.net> References: <43E41E52.6060805@cox.net> <43E42B26.2090304@ee.byu.edu> <43E42C2C.6000208@cox.net> <43E45A95.6040601@ieee.org> <43E81650.2040204@cox.net> <43E81BFA.7060600@ieee.org> <43E82930.7070103@cox.net> <43E83FD3.90206@ieee.org> <43E8EE72.6070101@cox.net> <43E8EFE9.5040207@ieee.org> <43E9235E.70004@cox.net> Message-ID: <43E94090.5080609@ieee.org> Tim Hochberg wrote: > > OK, I finally got it to pass all of the tests. The final two pieces of > the puzzle were using _isnan and _finite and then realizing the > _finite was not in fact the opposite of isinf. Thanks for finding this. I've updated the ufuncobject.h file with definitions for isinf, isfinite, and isnan. Presumably this should allow the SVN version of numpy to build. Let me know what happens. 
-Travis From faltet at carabos.com Wed Feb 8 00:09:10 2006 From: faltet at carabos.com (Francesc Altet) Date: Wed Feb 8 00:09:10 2006 Subject: [Numpy-discussion] Extent of unicode types in numpy In-Reply-To: <43E904AC.5060002@ieee.org> References: <1139250278.7538.52.camel@localhost.localdomain> <43E7F40F.7030303@cox.net> <43E8495C.9020008@ieee.org> <200602071542.35593.faltet@carabos.com> <20060207194858.67cd0698.gerard.vermeulen@grenoble.cnrs.fr> <43E8F449.2060600@ieee.org> <1139342874.7544.37.camel@localhost.localdomain> <43E904AC.5060002@ieee.org> Message-ID: <1139386100.7534.35.camel@localhost.localdomain> El dt 07 de 02 del 2006 a les 13:35 -0700, en/na Travis Oliphant va escriure: > Sure it could be implemented. It's just a matter of effort. Python > itself always defines a Py_UCS4 type even on UCS2 builds. We would just > have to make sure Py_UCS2 is always defined as well. Be careful with this because you can run into problems. For example, trying to import numpy compiled with a UCS4 python from a UCS2 one, gives me the following: $ python Python 2.4.2 (#1, Feb 8 2006, 08:16:44) [GCC 4.0.3 20060115 (prerelease) (Debian 4.0.2-7)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy import core -> failed: /usr/lib/python2.4/site-packages/numpy/core/multiarray.so: undefined symbol: _PyUnicodeUCS4_IsWhitespace import random -> failed: 'module' object has no attribute 'dtype' import lib -> failed: /usr/lib/python2.4/site-packages/numpy/core/multiarray.so: undefined symbol: _PyUnicodeUCS4_IsWhitespace Although I guess that this would be not a problem when using a numpy compiled with a proper interpreter. Just wanted to point out this. > The biggest hassle is implementing the corresponding scalar type. The > one corresponding to the build for Python comes free. The other would > have to be implemented directly. Yeah, it seems like we should end implementing a new Unicode type entirely in NumPy in a way or other. > I've seen data-bases handle this by warning the user to make sure the > size of their data area is large enough to handle their longest use > case. You can still used fixed-sizes you just have to make sure they > are large enough (or risk truncation). Ok. I can admit that data can be truncated (you may end with a corrupted Unicode string, but this is the responsability of the user :-(). However, another thing that I feel unconfortable with is the additional encoding/decoding steps that potentially introduces UCS2 for doing I/O. Well, perhaps this is faster than I suppose and that I/O speed will not be too affected, but still... > >Well, I don't understand well here. I thought that you were proposing a > >32-bit unicode type for NumPy and then converting it appropriately to > >UCS2 (conversion to UCS4 wouldn't be necessary as it would be the same > >as the native NumPy unicode type) just in case that the user requires an > >scalar out of the NumPy object. But you are talking here about defining > >separate UCS4 and UCS2 data-types. I admit that I'm loosed here... > > > > > > > I suppose that is another approach: we could internally have all > UNICODE data-types use 4-bytes and do the conversions necessary. But, > it would still require us to do most of work of supporting two > data-types. Currently, the unicode scalar object is a simple > inheritance from Python's UNICODE data-type. That would have to change > and the work to do that is most of the work to support two different > data-types. 
So, if we are going to go through that effort, I would
> rather see the result be two different Unicode data-types supported.

Ok. I see that you got my point. Well, maybe I'm wrong here, but my
proposal would result in implementing just one new data-type for 32-bit
unicode when the python platform is UCS2 aware. If, as you said above, a
Py_UCS4 type is always defined, even on UCS2 interpreters, that should
be relatively easy to do. So we can make all the NumPy unicode *arrays*
based on this new type. The NumPy unicode *scalars* will inherit
directly from the native Py_UCS2 type for this interpreter. Then, we
just have to implement the necessary conversions between UCS4 <--> UCS2
to communicate data from the NumPy array into/from the scalar type.

The only drawback that I see in this approach is that you will end up
having UCS4 types in numpy ndarrays and UCS2 types when getting scalars
from them (however, the user will hardly notice this, IMO). The
advantage would be that NumPy arrays will always be UCS4 regardless of
the platform they are on, making access to their data from C much
easier and more portable (and yes, efficient!).

Of course, if you are using a UCS4 platform, then you can choose the
same native Py_UCS4 type for NumPy arrays and scalars and you are done.

Well, probably I've overlooked something, but I really think that this
would be a nice thing to do.

Regards,

--
>0,0<   Francesc Altet     http://www.carabos.com/
V V     Cárabos Coop. V.   Enjoy Data
 "-"

From oliphant.travis at ieee.org  Wed Feb  8 00:42:03 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Wed Feb  8 00:42:03 2006
Subject: [Numpy-discussion] Extent of unicode types in numpy
In-Reply-To: <1139386100.7534.35.camel@localhost.localdomain>
References: <1139250278.7538.52.camel@localhost.localdomain>
	<43E7F40F.7030303@cox.net> <43E8495C.9020008@ieee.org>
	<200602071542.35593.faltet@carabos.com>
	<20060207194858.67cd0698.gerard.vermeulen@grenoble.cnrs.fr>
	<43E8F449.2060600@ieee.org>
	<1139342874.7544.37.camel@localhost.localdomain>
	<43E904AC.5060002@ieee.org>
	<1139386100.7534.35.camel@localhost.localdomain>
Message-ID: <43E9AEAE.5020306@ieee.org>

Francesc Altet wrote:

>Ok. I see that you got my point. Well, maybe I'm wrong here, but my
>proposal would result in implementing just one new data-type for 32-bit
>unicode when the python platform is UCS2 aware. If, as you said above,
>Py_UCS4 type is always defined, even on UCS2 interpreters, that should
>be relatively easy to do.
>

Hmm. I think I'm beginning to like your idea. We could in fact make
the NumPy Unicode type always UCS4 and then keep the Python Unicode
scalar. On Python UCS2 builds the conversion would use UTF-16 to go to
the Python scalar (which would always inherit from the native unicode
type).

It would be one data-type where there was not an identical match in the
memory layout of the scalar and the array data-type, but because in
this case there are conversions to go back and forth, it may not
matter. This would not be too difficult to implement, actually --- it
would require new functions to handle conversions in arraytypes.inc.src
and some modifications to PyArray_Scalar.

The only drawback is that now all unicode arrays are twice as large,
and there is the aforementioned asymmetry between the data-type and the
array-scalar on Python UCS2 builds.

But, all in all, it sounds like a good plan. If the time comes that
somebody wants to add a reduced-size UCS2 array of unicode characters
then we can cross that bridge if and when it comes up.
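In pure-Python terms, the array-to-scalar direction of that conversion
amounts to something like the following sketch; the real code would live
in C (arraytypes.inc.src / PyArray_Scalar), so this only pins down the
intended semantics:

def ucs4_buffer_to_scalar(codepoints):
    # 'codepoints' are the raw 32-bit values from the array's UCS4
    # buffer; values above the BMP are turned into UTF-16 surrogate
    # pairs, which is the conversion a UCS2 Python build needs (a UCS4
    # build could simply take unichr(cp) for every value)
    units = []
    for cp in codepoints:
        if cp < 0x10000:
            units.append(unichr(cp))
        else:
            cp -= 0x10000
            units.append(unichr(0xD800 + (cp >> 10)))
            units.append(unichr(0xDC00 + (cp & 0x3FF)))
    return u"".join(units)

s = ucs4_buffer_to_scalar([0x110FC])
print len(s), [hex(ord(u)) for u in s]   # 2 ['0xd804', '0xdcfc']

The scalar-to-array direction is the inverse: pair up any surrogates and
widen each character back to four bytes.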
I still like using explicit typecode characters in the array interface to denote UCS2 or the UCS4 data-type. We could still change from 'W', 'w' to other characters... >Well, probably I've overlooked something, but I really think that this >would be a nice thing to do. > > There are details in the scalar-array conversions (getitem and setitem that would have to be implemented but it is possible. The UCS4 --> UTF-16 encoding is one of the easiest. It's done in unicodeobject.h in Python, but I'm not sure it's exposed other than going through the interpreter. Does this seem like a solution that everyone can live with? -Travis From gerard.vermeulen at grenoble.cnrs.fr Wed Feb 8 01:30:02 2006 From: gerard.vermeulen at grenoble.cnrs.fr (Gerard Vermeulen) Date: Wed Feb 8 01:30:02 2006 Subject: [Numpy-discussion] Extent of unicode types in numpy In-Reply-To: <43E9AEAE.5020306@ieee.org> References: <1139250278.7538.52.camel@localhost.localdomain> <43E7F40F.7030303@cox.net> <43E8495C.9020008@ieee.org> <200602071542.35593.faltet@carabos.com> <20060207194858.67cd0698.gerard.vermeulen@grenoble.cnrs.fr> <43E8F449.2060600@ieee.org> <1139342874.7544.37.camel@localhost.localdomain> <43E904AC.5060002@ieee.org> <1139386100.7534.35.camel@localhost.localdomain> <43E9AEAE.5020306@ieee.org> Message-ID: <20060208102906.6191d180.gerard.vermeulen@grenoble.cnrs.fr> On Wed, 08 Feb 2006 01:41:18 -0700 Travis Oliphant wrote: > >Well, probably I've overlooked something, but I really think that this > >would be a nice thing to do. > > > > > There are details in the scalar-array conversions (getitem and setitem > that would have to be implemented but it is possible. The UCS4 --> > UTF-16 encoding is one of the easiest. It's done in unicodeobject.h in > Python, but I'm not sure it's exposed other than going through the > interpreter. > > Does this seem like a solution that everyone can live with? > Yes. The only point that worries me a little bit that some problems are limited by memory or memory bandwidth and for those cases UCS2 arrays are better than UCS4 arrays. I have run into memory problems before and I don't know if it will happen for unicode strings. Time will tell. Gerard From faltet at carabos.com Wed Feb 8 02:10:07 2006 From: faltet at carabos.com (Francesc Altet) Date: Wed Feb 8 02:10:07 2006 Subject: [Numpy-discussion] Extent of unicode types in numpy In-Reply-To: <43E9AEAE.5020306@ieee.org> References: <1139250278.7538.52.camel@localhost.localdomain> <1139386100.7534.35.camel@localhost.localdomain> <43E9AEAE.5020306@ieee.org> Message-ID: <200602081109.14883.faltet@carabos.com> A Dimecres 08 Febrer 2006 09:41, Travis Oliphant va escriure: > Hmm. I think I'm beginning to like your idea. We could in fact make Good :-) > the NumPy Unicode type always UCS4 and then keep the Python Unicode > scalar. On Python UCS2 builds the conversion would use UTF-16 to go to > the Python scalar (which would always inherit from the native unicode > type). Yes, exactly. > But, all in all, it sounds like a good plan. If the time comes that > somebody wants to add a reduced-size USC2 array of unicode characters > then we can cross that bridge if and when it comes up. Well, provided the recommendations about migrating to 32-bit unicode objects, I'd say that this would be a strange desire. If the problem is memory consumption, the users can always choose regular 8-bit strings (of course, without supporting completely general unicode characters). 
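The factor-of-two cost that memory-bound problems would pay is easy to
quantify. For, say, a million cells of up to ten characters each:

n, length = 10**6, 10
print n * length * 1   # 8-bit strings ('S10'):  10000000 bytes
print n * length * 2   # a UCS2 unicode type:    20000000 bytes
print n * length * 4   # always-UCS4 unicode:    40000000 bytes

So always-UCS4 doubles the footprint relative to a native UCS2 type, and
quadruples it relative to plain 8-bit strings.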
> I still like using explicit typecode characters in the array interface
> to denote UCS2 or the UCS4 data-type. We could still change from 'W',
> 'w' to other characters...

But, why do you want to do this? If the data type for unicode in arrays
is always UCS4 and in scalars is always determined by the python build,
then why would we want to try to distinguish them with specific type
codes? At the C level there should be straightforward ways to determine
whether a scalar is UCS2 or UCS4 (just by looking at the native python
type), and at the python level there is no evident way to distinguish
(correct me if I'm wrong here) between a UCS2 and a UCS4 unicode string;
in fact, the user will not notice the difference in general (but see
later). Besides, having 'U' as the indicator for unicode is consistent
with the way Python expresses 32-bit unicode chars (i.e. \Uxxxxxxxx).
So I find that keeping 'U' for specifying unicode types would be more
than enough, and that introducing 'w' and 'W' (or whatever) will only
introduce unnecessary burden, IMO.

Moreover, if a user tries to know the type using the .dtype descriptor,
he will find that the type continues to be 'U' regardless of the build
he is using. Something like:

# We are in a UCS2 interpreter
In [30]: numpy.array([1],dtype="U2")[0].dtype
Out[30]: dtype('<U2')

Anyway, my recommendation stays: UCS4 > UCS2. I'm still wondering why
this is not the default... :-/

Cheers,

--
>0,0<   Francesc Altet     http://www.carabos.com/
V V     Cárabos Coop. V.   Enjoy Data
 "-"

From a.h.jaffe at gmail.com  Wed Feb  8 04:36:04 2006
From: a.h.jaffe at gmail.com (Andrew Jaffe)
Date: Wed Feb  8 04:36:04 2006
Subject: [Numpy-discussion] GCC 4 (OS X?) support for numpy?
Message-ID: 

Hi All,

[originally posted in a slightly off-topic thread, so I thought I'd try
here -- sorry for the duplication!]

What is the status of gcc 4.0 support (on Mac OS X at least)? It's a
bit of a pain to have to switch between the two (are there any other
disadvantages?). As of my last attempt, numpy.test() fails due to some
machar issues if I recall correctly.

Andrew

From stefan at sun.ac.za  Wed Feb  8 06:09:12 2006
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Wed Feb  8 06:09:12 2006
Subject: [Numpy-discussion] creating column vectors
Message-ID: <20060208141052.GB5734@alpha>

This is probably a silly question, but what is the best way of
creating column vectors? 'arange' always returns a row vector, on
which you cannot perform 'transpose' since it has only one dimension.

mat(arange(1,10)).transpose()

works, but seems a bit long-winded (in comparison to MATLAB's [1:10]').

I'd appreciate pointers in the right direction.

Regards
Stéfan

From svetosch at gmx.net  Wed Feb  8 06:35:19 2006
From: svetosch at gmx.net (Sven Schreiber)
Date: Wed Feb  8 06:35:19 2006
Subject: [Numpy-discussion] creating column vectors
In-Reply-To: <20060208141052.GB5734@alpha>
References: <20060208141052.GB5734@alpha>
Message-ID: <43EA0157.9000908@gmx.net>

Stefan van der Walt schrieb:
> This is probably a silly question, but what is the best way of
> creating column vectors? 'arange' always returns a row vector, on
> which you cannot perform 'transpose' since it has only one dimension.
>
> mat(arange(1,10)).transpose()
>

mat(range(1,10)).T is a bit shorter, but I would agree that doing
matrix algebra in numpy is not as natural as with explicitly
matrix-oriented languages; my understanding is that this is due to
numpy's broader (n-dimensional) scope.
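The point about dimensionality can be made concrete: transposing a 1-d
array is a no-op, since there is no second axis to swap, which is why
arange() alone cannot yield a column. A short sketch, assuming
"from numpy import *":

a = arange(1, 10)           # shape (9,): a single axis
print a.transpose().shape   # (9,)  -- unchanged, nothing to swap
m = mat(arange(1, 10))      # matrices are always 2-d: shape (1, 9)
print m.T.shape             # (9, 1) -- a real column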
Numpy-masters: Is there a way to set a user- or project-specific config switch or something like that to always get matrix results when dealing with 1d and 2d arrays? I think that would make numpy much more attractive for people like Stefan and me coming from the 2d world. cheers, Sven From luszczek at cs.utk.edu Wed Feb 8 07:03:03 2006 From: luszczek at cs.utk.edu (Piotr Luszczek) Date: Wed Feb 8 07:03:03 2006 Subject: [Numpy-discussion] creating column vectors In-Reply-To: <43EA0157.9000908@gmx.net> References: <20060208141052.GB5734@alpha> <43EA0157.9000908@gmx.net> Message-ID: <200602081001.54312.luszczek@cs.utk.edu> On Wednesday 08 February 2006 09:33, Sven Schreiber wrote: > Stefan van der Walt schrieb: > > This is probably a silly question, but what is the best way of > > creating column vectors? 'arange' always returns a row vector, on > > which you cannot perform 'transpose' since it has only one > > dimension. > > > > mat(arange(1,10)).transpose() > > mat(range(1,10)).T is a bit shorter, but I would agree that doing > matrix algebra in numpy is not as natural as with explicitly > matrix-oriented languages; my understanding is that this is due to > numpy's broader (n-dimensional) scope. > > Numpy-masters: Is there a way to set a user- or project-specific > config switch or something like that to always get matrix results > when dealing with 1d and 2d arrays? I think that would make numpy > much more attractive for people like Stefan and me coming from the 2d > world. I'm not a master by far but I heard that question before. Isn't the mlab module just for that purpose? I was explained that the problem with a "switch" is that the same code will behave differently depending on which installation you run. If you run on my n-D installation it will do one thing and if you run it on your 2-D installation (with the 2D world "switch" enabled) you get subtly different result. It might become a bug hunting nighmare. I think this is when Python's explicit vs. implicit rule kicks in: python -c 'import this' Piotr From gerard.vermeulen at grenoble.cnrs.fr Wed Feb 8 07:22:04 2006 From: gerard.vermeulen at grenoble.cnrs.fr (Gerard Vermeulen) Date: Wed Feb 8 07:22:04 2006 Subject: [Numpy-discussion] creating column vectors In-Reply-To: <20060208141052.GB5734@alpha> References: <20060208141052.GB5734@alpha> Message-ID: <20060208162115.205360dd.gerard.vermeulen@grenoble.cnrs.fr> On Wed, 8 Feb 2006 16:10:52 +0200 Stefan van der Walt wrote: > This is probably a silly question, but what is the best way of > creating column vectors? 'arange' always returns a row vector, on > which you cannot perform 'transpose' since it has only one dimension. > > mat(arange(1,10)).transpose() > > works, but seems a bit long-winded (in comparison to MATLAB's [1:10]'). > > I'd appreciate pointers in the right direction. > What about this? arange(1, 10)[:, NewAxis] Gerard From arnd.baecker at web.de Wed Feb 8 09:24:02 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Wed Feb 8 09:24:02 2006 Subject: [Numpy-discussion] Need compilations with compilers other than gcc In-Reply-To: References: <43E840BE.5060204@ieee.org> Message-ID: On Tue, 7 Feb 2006, Arnd Baecker wrote: > On Tue, 7 Feb 2006, Pearu Peterson wrote: [...] > > >> So, a recommended fix would be to build Python with icc and as a > > >> result correct libraries will be used for building 3rd party extension > > >> modules. OK, I went for this. 
With numpy.__version__ '0.9.5.2069' I get for numpy.test(10):

======================================================================
FAIL: check_basic (numpy.lib.function_base.test_function_base.test_cumprod)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/work/home/baecker/INSTALL_PYTHON_again_with_icc/Inst/lib/python2.4/site-packages/numpy/lib/tests/test_function_base.py", line 169, in check_basic
    1320, 6600, 26400],ctype))
  File "/work/home/baecker/INSTALL_PYTHON_again_with_icc/Inst/lib/python2.4/site-packages/numpy/testing/utils.py", line 156, in assert_array_equal
    assert cond,\
AssertionError:
Arrays are not equal (mismatch 57.1428571429%):
        Array 1: [ 1. 2. 20. 0. 0. 0. 0.]
        Array 2: [ 1.0000000000000000e+00 2.0000000000000000e+00
2.0000000000000000e+01 2.2000000000000000e+02 1.32000000000000...

======================================================================
FAIL: check_basic (numpy.lib.function_base.test_function_base.test_cumsum)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/work/home/baecker/INSTALL_PYTHON_again_with_icc/Inst/lib/python2.4/site-packages/numpy/lib/tests/test_function_base.py", line 128, in check_basic
    assert_array_equal(cumsum(a), array([1,3,13,24,30,35,39],ctype))
  File "/work/home/baecker/INSTALL_PYTHON_again_with_icc/Inst/lib/python2.4/site-packages/numpy/testing/utils.py", line 156, in assert_array_equal
    assert cond,\
AssertionError:
Arrays are not equal (mismatch 57.1428571429%):
        Array 1: [ 1. 3. 13. 11. 17. 5. 9.]
        Array 2: [ 1. 3. 13. 24. 30. 35. 39.]

======================================================================
FAIL: check_simple (numpy.lib.function_base.test_function_base.test_unwrap)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/work/home/baecker/INSTALL_PYTHON_again_with_icc/Inst/lib/python2.4/site-packages/numpy/lib/tests/test_function_base.py", line 273, in check_simple
    assert(all(diff(unwrap(rand(10)*100)) < pi))
...

From: svetosch at gmx.net (Sven Schreiber)
Subject: [Numpy-discussion] creating column vectors
In-Reply-To: <20060208162115.205360dd.gerard.vermeulen@grenoble.cnrs.fr>
References: <20060208141052.GB5734@alpha>
	<20060208162115.205360dd.gerard.vermeulen@grenoble.cnrs.fr>
Message-ID: <43EA2E16.906@gmx.net>

Gerard Vermeulen schrieb:

>> mat(arange(1,10)).transpose()
>>
>> works, but seems a bit long-winded (in comparison to MATLAB's [1:10]').
>
> What about this?
>
> arange(1, 10)[:, NewAxis]
>

The numpy-book beats both of us (see my previous post) in terms of
minimal typing overhead by suggesting r_[1:10,'c'] which produces a
matrix type, very nice. Compared to [1:10]', that's quite good
already...

-sven

From stefan at sun.ac.za  Wed Feb  8 12:25:02 2006
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Wed Feb  8 12:25:02 2006
Subject: [Numpy-discussion] creating column vectors
In-Reply-To: <43EA2E16.906@gmx.net>
References: <20060208141052.GB5734@alpha>
	<20060208162115.205360dd.gerard.vermeulen@grenoble.cnrs.fr>
	<43EA2E16.906@gmx.net>
Message-ID: <20060208202658.GC5734@alpha>

On Wed, Feb 08, 2006 at 06:44:54PM +0100, Sven Schreiber wrote:
> Gerard Vermeulen schrieb:
>
> >> mat(arange(1,10)).transpose()
> >>
> >> works, but seems a bit long-winded (in comparison to MATLAB's [1:10]').
> >
> > What about this?
> >
> > arange(1, 10)[:, NewAxis]
> >
>
> The numpy-book beats both of us (see my previous post) in terms of
> minimal typing overhead by suggesting r_[1:10,'c'] which produces a
> matrix type, very nice.

Thanks for your effort, that's exactly what I was looking for! Time to
get hold of that book...
From Chris.Barker at noaa.gov Wed Feb 8 14:01:02 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed Feb 8 14:01:02 2006 Subject: [Numpy-discussion] creating column vectors In-Reply-To: <20060208141052.GB5734@alpha> References: <20060208141052.GB5734@alpha> Message-ID: <43EA69CC.3030504@noaa.gov> Stefan van der Walt wrote: > This is probably a silly question, but what is the best way of > creating column vectors? I do this: >>> import numpy as N >>> v = N.arange(10) >>> v array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) >>> v.shape = (-1,1) >>> v array([[0], [1], [2], [3], [4], [5], [6], [7], [8], [9]]) > 'arange' always returns a row vector no, it doesn't, it returns a 1-dimensional vector. > Numpy-masters: Is there a way to set a user- or project-specific config > switch or something like that to always get matrix results when dealing > with 1d and 2d arrays? I think that would make numpy much more > attractive for people like Stefan and me coming from the 2d world. numpy is not a Matlab clone, nor should it be. That's exactly why I use it! Take a little time to get used to it, and you'll become very glad that numpy works the way it does, rather than like Matlab. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From robert.kern at gmail.com Wed Feb 8 14:04:16 2006 From: robert.kern at gmail.com (Robert Kern) Date: Wed Feb 8 14:04:16 2006 Subject: [Numpy-discussion] arange(start, stop, step) and floating point (Ticket #8) Message-ID: <43EA6A34.2000202@gmail.com> Sasha (from the ticket comments): """I would prefer if arange would do what range does and round step, possibly with a warning for fractional steps. In other words, arange(start, stop, step, dtype) should be an optimized version of array(range(start, stop, step), dtype). If this is not acceptable, I think arange(start,stop,step)[-1] < stop should be an invariant and floating point issues should be properly addressed. """ arange() does allow for fractional steps unlike range(). You may fix the docstring if you like. However, I don't think it is possible to ensure that invariant in the face of floating point. That's why we have linspace(). -- Robert Kern robert.kern at gmail.com "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From oliphant at ee.byu.edu Wed Feb 8 14:40:19 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed Feb 8 14:40:19 2006 Subject: [Numpy-discussion] newunicode branch started to fix unicode to always be UCS4 Message-ID: <43EA72D5.8090508@ee.byu.edu> I've started a branch on SVN to fix the unicode implementation in NumPy so that internally all unicode arrays use UCS4. When a scalar is obtained it will be the Python unicode scalar and the required conversions (and data-copying) will be done. If anybody would like to help the branch is http://svn.scipy.org/svn/numpy/branches/newunicode -Travis From andorxor at gmx.de Wed Feb 8 14:47:20 2006 From: andorxor at gmx.de (Stephan Tolksdorf) Date: Wed Feb 8 14:47:20 2006 Subject: [Numpy-discussion] Constructing array from generator expression/iterator Message-ID: <43EA7441.4020500@gmx.de> Hi I'm new to Numpy and just stumbled over the following problem in Numpy 0.9.4: array(x**2 for x in range(10)) does not return what one (me) would expect, i.e. array([x**2 for x in range(10)]) Is this expected behavior? Stephan
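A minimal sketch of the workaround for the generator case: materialize the values first. The commented fromiter line assumes a later numpy release that grew numpy.fromiter; it is not part of the 0.9.4 API under discussion.

import numpy as np

# array() sees the generator as a single opaque object, so build a list first:
squares = np.array([x**2 for x in range(10)])

# in later numpy releases an iterator can be consumed directly,
# given an explicit dtype (assumption: numpy >= 1.0):
# squares = np.fromiter((x**2 for x in range(10)), dtype=int)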
From ndarray at mac.com Wed Feb 8 14:51:10 2006 From: ndarray at mac.com (Sasha) Date: Wed Feb 8 14:51:10 2006 Subject: [Numpy-discussion] arange(start, stop, step) and floating point (Ticket #8) In-Reply-To: <43EA6A34.2000202@gmail.com> References: <43EA6A34.2000202@gmail.com> Message-ID: On 2/8/06, Robert Kern wrote: > ... > arange() does allow for fractional steps unlike range(). You may fix the > docstring if you like. However, I don't think it is possible to ensure that > invariant in the face of floating point. That's why we have linspace(). There is certainly a way to ensure that arange(..., stop, ...)[-1] < stop in the face of floating point -- just repeat start += step with start in a volatile double variable until it exceeds stop to get the length of the result. There might be an O(1) solution as well, but it may require some assumptions about the floating point unit. In any case, I can do one of the following depending on a vote: 1 (default). Document length=ceil((stop - start)/step) in the arange docstring 2. Change arange to be a fast equivalent of array(range(start, stop, step), dtype). 3. Change arange to ensure that arange(..., stop, ...)[-1] < stop. Please vote on 1-3. -- sasha From oliphant at ee.byu.edu Wed Feb 8 14:59:03 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed Feb 8 14:59:03 2006 Subject: [Numpy-discussion] Constructing array from generator expression/iterator In-Reply-To: <43EA7441.4020500@gmx.de> References: <43EA7441.4020500@gmx.de> Message-ID: <43EA773A.5010200@ee.byu.edu> Stephan Tolksdorf wrote: > Hi > > I'm new to Numpy and just stumbled over the following problem in Numpy > 0.9.4: > > array(x**2 for x in range(10)) > > does not return what one (me) would expect, i.e. > > array([x**2 for x in range(10)]) > > Is this expected behavior? The array constructor does not currently "understand" generator objects. It only understands sequence objects. It could be made to work but is based on code written long before there were generators. So, instead you get a 0-d Object-array containing the generator. Just use list comprehensions instead. -Travis From oliphant at ee.byu.edu Wed Feb 8 15:00:33 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed Feb 8 15:00:33 2006 Subject: [Numpy-discussion] arange(start, stop, step) and floating point (Ticket #8) In-Reply-To: References: <43EA6A34.2000202@gmail.com> Message-ID: <43EA779F.2080302@ee.byu.edu> Sasha wrote: >On 2/8/06, Robert Kern wrote: > > >> ... >>arange() does allow for fractional steps unlike range(). You may fix the >>docstring if you like. However, I don't think it is possible to ensure that >>invariant in the face of floating point. That's why we have linspace(). >> >> > >There is certainly a way to ensure that arange(..., stop, ...)[-1] < >stop in the face of floating point -- just repeat start += step with >start in a volatile double variable until it exceeds stop to get the >length of the result. There might be an O(1) solution as well, but >it may require some assumptions about the floating point unit. > >In any case, I can do one of the following depending on a vote: > >1 (default). Document length=ceil((stop - start)/step) in the arange docstring > > +5 We can't really do anything else at this point since this behavior has been with us for a long time. -Travis
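A small sketch of the behavior being documented by option 1, under the assumption of typical IEEE-754 doubles; linspace is the endpoint-safe alternative mentioned earlier:

import math
import numpy as np

start, stop, step = 0.0, 1.0, 0.1
a = np.arange(start, stop, step)
# the documented invariant: the element count is ceil((stop - start)/step),
# so with a fractional step the last element may land on or past `stop`
assert len(a) == math.ceil((stop - start) / step)

# linspace sidesteps the rounding question by fixing the point count:
b = np.linspace(0.0, 1.0, 11)   # includes both endpoints exactly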
From ndarray at mac.com Wed Feb 8 15:05:03 2006 From: ndarray at mac.com (Sasha) Date: Wed Feb 8 15:05:03 2006 Subject: [Numpy-discussion] Constructing array from generator expression/iterator In-Reply-To: <43EA7441.4020500@gmx.de> References: <43EA7441.4020500@gmx.de> Message-ID: The array constructor does not support arbitrary iterables. For example: >>> array(iter([1,2,3])) array(<listiterator object at 0x...>, dtype=object) In Numeric, it was not possible to try to iterate through the object in the array constructor because rank-0 arrays were iterable and would lead to infinite recursion. Since this problem was fixed in numpy, I don't see much of a problem in implementing such a feature. On 2/8/06, Stephan Tolksdorf wrote: > Hi > > I'm new to Numpy and just stumbled over the following problem in Numpy > 0.9.4: > > array(x**2 for x in range(10)) > > does not return what one (me) would expect, i.e. > > array([x**2 for x in range(10)]) > > Is this expected behavior? > > Stephan > From tim.hochberg at cox.net Wed Feb 8 15:41:31 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Wed Feb 8 15:41:31 2006 Subject: [Numpy-discussion] arange(start, stop, step) and floating point (Ticket #8) In-Reply-To: References: <43EA6A34.2000202@gmail.com> Message-ID: <43EA7FE7.2040902@cox.net> Sasha wrote: >On 2/8/06, Robert Kern wrote: > > >> ... >>arange() does allow for fractional steps unlike range(). You may fix the >>docstring if you like. However, I don't think it is possible to ensure that >>invariant in the face of floating point. That's why we have linspace(). >> >> > >There is certainly a way to ensure that arange(..., stop, ...)[-1] < >stop in the face of floating point -- just repeat start += step with >start in a volatile double variable until it exceeds stop to get the >length of the result. There might be an O(1) solution as well, but >it may require some assumptions about the floating point unit. > > Isn't that bad numerically? That is, isn't (n*step) much more accurate than (step + step + ....)? It also seems needlessly inefficient; you should be able to do it in at most a few steps: length = (stop - start)/step while length * step < stop: length += 1 while length * step >= stop: length -= 1 Fix errors, convert to C and enjoy. It should normally take only a few tries to get the right N. I see that the internals of range use repeated adding to make the range. I imagine that is why you proposed the repeated adding. I think that results in error that's on the order of length ULP, while multiplying would result in error on the order of 1 ULP. So perhaps we should fix XXX_fill to be more accurate if nothing else. >In any case, I can do one of the following depending on a vote: > >1 (default). Document length=ceil((stop - start)/step) in the arange docstring > > That has the virtue of being easy to explain. >2. Change arange to be a fast equivalent of array(range(start, stop, >step), dtype). > > No thank you. >3. Change arange to ensure that arange(..., stop, ...)[-1] < stop.
> > I see that Travis has vetoed this in any event, but perhaps we should fix up the fill functions to be more accurate and maybe most of the problem would just magically go away. -tim From ndarray at mac.com Wed Feb 8 16:05:26 2006 From: ndarray at mac.com (Sasha) Date: Wed Feb 8 16:05:26 2006 Subject: [Numpy-discussion] arange(start, stop, step) and floating point (Ticket #8) In-Reply-To: <43EA779F.2080302@ee.byu.edu> References: <43EA6A34.2000202@gmail.com> <43EA779F.2080302@ee.byu.edu> Message-ID: On 2/8/06, Travis Oliphant wrote: > +5 > > We can't really do anything else at this point since this behavior has > been with us for a long time. I guess this closes the dispute. I've committed a new docstring to SVN. From wbaxter at gmail.com Wed Feb 8 16:05:29 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Wed Feb 8 16:05:29 2006 Subject: [Numpy-discussion] creating column vectors In-Reply-To: <43EA69CC.3030504@noaa.gov> References: <20060208141052.GB5734@alpha> <43EA69CC.3030504@noaa.gov> Message-ID: On 2/9/06, Christopher Barker wrote: > > numpy is not a Matlab clone, nor should it be. That's exactly why I use > it! Take a little time to get used to it, and you'll become very glad > that numpy works the way it does, rather than like Matlab. If you can spare the time, I'd love to hear you elaborate on that. What are some specifics that make you say 'thank goodness for numpy!'? If you have some good ones, I'd like to put them up on http://www.scipy.org/NumPy_for_Matlab_Addicts (of course you're more than welcome to cut out the middle man and just post them directly on the wiki there yourself...) --Bill Baxter From svetosch at gmx.net Wed Feb 8 16:12:53 2006 From: svetosch at gmx.net (Sven Schreiber) Date: Wed Feb 8 16:12:53 2006 Subject: [Numpy-discussion] creating column vectors In-Reply-To: <43EA69CC.3030504@noaa.gov> References: <20060208141052.GB5734@alpha> <43EA69CC.3030504@noaa.gov> Message-ID: <43EA87EE.1030808@gmx.net> Christopher Barker schrieb: > > numpy is not a Matlab clone, nor should it be. That's exactly why I use > it! Take a little time to get used to it, and you'll become very glad > that numpy works the way it does, rather than like Matlab. > well, I have taken that time because I was already into python (glue everything together, you know), but I bet you won't be very successful in the Gauss et al. camp with that marketing slogan... also, your statement does not sound very pythonic; the "you'll get used to it, and trust me, even if you don't understand it now, it's great afterwards"-approach sounds more like the pre-python era (you may insert a language of your choice here ;-) I don't see why numpy cannot preserve the features that are important to you (and which I know nothing about) and at the same time make life more intuitive and easier for 2d-dummies like myself -- in a lot of ways, it's already accomplished, I'd say it just needs the finishing touch.
cheers, sven From oliphant at ee.byu.edu Wed Feb 8 16:17:31 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed Feb 8 16:17:31 2006 Subject: [Numpy-discussion] arange(start, stop, step) and floating point (Ticket #8) In-Reply-To: <43EA7FE7.2040902@cox.net> References: <43EA6A34.2000202@gmail.com> <43EA7FE7.2040902@cox.net> Message-ID: <43EA8867.5080109@ee.byu.edu> Tim Hochberg wrote: >> > > > I see that Travis has vetoed this in any event, but perhaps we should > fix up the fill functions to be more accurate and maybe most of the > problem would just magically go away. To do something different than arange has always done we need a new function, not change what arange does and thus potentially break lots of code. How do you propose to make the fill functions more accurate? I'm certainly willing to see improvements there. -Travis From svetosch at gmx.net Wed Feb 8 16:29:06 2006 From: svetosch at gmx.net (Sven Schreiber) Date: Wed Feb 8 16:29:06 2006 Subject: [Numpy-discussion] creating column vectors In-Reply-To: References: <20060208141052.GB5734@alpha> <43EA69CC.3030504@noaa.gov> Message-ID: <43EA8B46.60202@gmx.net> Bill Baxter schrieb: > > If you can spare the time, I'd love to hear you elaborate on that. What > are some specifics that make you say 'thank goodness for numpy!'? If > you have some good ones, I'd like to put them up on > http://www.scipy.org/NumPy_for_Matlab_Addicts (of course you're more > than welcome to cut out the middle man and just post them directly on > the wiki there yourself...) > > --Bill Baxter Just one addition/correction for your page (sorry won't do it myself, all those different wiki engines/syntaxes...): a * b is only element-wise if a and b are not numpy-matrices, afaik that's the main reason why it's so important to know whether you're working with numpy-arrays or with its subclass numpy-matrix. -sven From wbaxter at gmail.com Wed Feb 8 16:45:22 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Wed Feb 8 16:45:22 2006 Subject: [Numpy-discussion] creating column vectors In-Reply-To: <43EA8B46.60202@gmx.net> References: <20060208141052.GB5734@alpha> <43EA69CC.3030504@noaa.gov> <43EA8B46.60202@gmx.net> Message-ID: Thanks. To be honest I wrote that page in the middle of composing my email to Chris just so something would be there when people clicked on the link. :-) I think my 1-minute draft of that chart needs a different organization, because, as you point out, things are different in NumPy depending on whether you have a Matrix or an Array. Maybe it should be a 3-way comparison of Matlab / NumPy Array / NumPy Matrix instead. --bb On 2/9/06, Sven Schreiber wrote: > Bill Baxter schrieb: > > > > > If you can spare the time, I'd love to hear you elaborate on that. What > > are some specifics that make you say 'thank goodness for numpy!'? If > > you have some good ones, I'd like to put them up on > > http://www.scipy.org/NumPy_for_Matlab_Addicts (of course you're more > > than welcome to cut out the middle man and just post them directly on > > the wiki there yourself...) > > > > --Bill Baxter > > Just one addition/correction for your page (sorry won't do it myself, > all those different wiki engines/syntaxes...): a * b is only > element-wise if a and b are not numpy-matrices, afaik that's the main > reason why it's so important to know whether you're working with > numpy-arrays or with its subclass numpy-matrix. > -sven
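A minimal illustration of Sven's point about `*` for the wiki chart; the 2x2 values are arbitrary choices:

import numpy as np

a = np.array([[1., 2.], [3., 4.]])
m = np.mat(a)

print(a * a)   # element-wise:   [[ 1.  4.] [ 9. 16.]]
print(m * m)   # matrix product: [[ 7. 10.] [15. 22.]]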
From Chris.Barker at noaa.gov Wed Feb 8 17:02:22 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed Feb 8 17:02:22 2006 Subject: [Numpy-discussion] creating column vectors In-Reply-To: <43EA87EE.1030808@gmx.net> References: <20060208141052.GB5734@alpha> <43EA69CC.3030504@noaa.gov> <43EA87EE.1030808@gmx.net> Message-ID: <43EA93C4.7000904@noaa.gov> Sven Schreiber wrote: >> Take a little time to get used to it, and you'll become very glad >> that numpy works the way it does, rather than like Matlab. > I bet you won't be very successful in > the Gauss et al. camp with that marketing slogan... It's not a marketing slogan. It's a suggestion for someone that has already decided to learn Python+Numpy. Whenever you use something new, you shouldn't try to use it the same way that you use a different tool. We say the same thing to people that try to write Python like it's C. > does not sound very pythonic; the "you'll get used to it, and trust me, > even if you don't understand it now, it's great afterwards"-approach > sounds more like the pre-python era (you may insert a language of your > choice here ;-) The difference is that you really will like it better, not just get used to it. > I don't see why numpy cannot preserve the features that are important to > you (and which I know nothing about) and at the same time make life more > intuitive and easier for 2d-dummies like myself -- Because a matrix is not the same as an array. A matrix can be represented by a 2-d array, but a matrix can not represent an arbitrary n-d array (at least not easily!). If you're really doing a lot of linear algebra, then you want to use the matrix package. I haven't used it, but it should have a way to easily create a column vector for you. Python (and NumPy) is a much more powerful and flexible language than Matlab (or Gauss, or IDL, or...) Once you learn to use it, you will be happy you did. I was a major Matlab fan a while back. I spent 5 years in grad school using it, and did my entire dissertation with it. I've recently been helping a friend with some Matlab code, and I find it painful to use. You'll see. Or was that too smug? -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From ndarray at mac.com Wed Feb 8 20:10:02 2006 From: ndarray at mac.com (Sasha) Date: Wed Feb 8 20:10:02 2006 Subject: [Numpy-discussion] arange(start, stop, step) and floating point (Ticket #8) In-Reply-To: <43EA7FE7.2040902@cox.net> References: <43EA6A34.2000202@gmail.com> <43EA7FE7.2040902@cox.net> Message-ID: On 2/8/06, Tim Hochberg wrote: > Sasha wrote: > Isn't that bad numerically? That is, isn't (n*step) much more accurate > than (step + step + ....)? It does not matter whether n*step is more accurate than step+...+step. As long as arange uses a stop+=step loop to fill in the values, the last element may exceed stop even if start + length*step does not.
One may argue that filling with start + i*step is more accurate, but that will probably be much slower (even than my O(N) algorithm). > It also seems needlessly inefficient; I proposed the O(N) algorithm just to counter Robert's argument that it is not possible to ensure the invariant. On the other hand I don't think it is that bad - I would expect the length computing loop to be much faster than the main loop that involves main memory. > you > should be able to do it in at most a few steps: > > length = (stop - start)/step > while length * step < stop: > length += 1 > while length * step >= stop: > length -= 1 > > Fix errors, convert to C and enjoy. It should normally take only a few > tries to get the right N. This will not work (even if you fix the error of missing start+ in the conditions :-): start + length*step < stop does not guarantee that start + step + ... + step < stop. > I see that the internals of range use repeated adding to make the range. > I imagine that is why you proposed the repeated adding. I think that > results in error that's on the order of length ULP, while multiplying > would result in error on the order of 1 ULP. So perhaps we should fix > XXX_fill to be more accurate if nothing else. > I don't think accuracy of XXX_fill for fractional steps is worth improving. In the cases where accuracy matters, one can always use an integral step and multiply the result by a float. However, if anything is done to that end, I would suggest generalizing the XXX_fill functions to allow accumulation to be performed using a different type similarly to the way op.reduce and op.accumulate functions use their (new in numpy) dtype argument. > >3. Change arange to ensure that arange(..., stop, ...)[-1] < stop. > > I see that Travis has vetoed this in any event, but perhaps we should fix up the fill functions to be more accurate and maybe most of the problem would just magically go away. The more I think about this, the more I am convinced that using arange with a non-integer step is a bad idea. Since making it illegal is not an option, I don't see much of a point in changing exactly how bad it is. Users who want fractional steps should just be educated about linspace. From tim.hochberg at cox.net Wed Feb 8 21:01:01 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Wed Feb 8 21:01:01 2006 Subject: [Numpy-discussion] arange(start, stop, step) and floating point (Ticket #8) In-Reply-To: References: <43EA6A34.2000202@gmail.com> <43EA7FE7.2040902@cox.net> Message-ID: <43EACC42.6030102@cox.net> Sasha wrote: >On 2/8/06, Tim Hochberg wrote: > > >>Sasha wrote: >>Isn't that bad numerically? That is, isn't (n*step) much more accurate >>than (step + step + ....)? >> >> > >It does not matter whether n*step is more accurate than step+...+step. > As long as arange uses a stop+=step loop to fill in the values, the >last element may exceed stop even if start + length*step does not. >One may argue that filling with start + i*step is more accurate, but >that will probably be much slower (even than my O(N) algorithm). > > >>It also seems needlessly inefficient; >> >> >I proposed the O(N) algorithm just to counter Robert's argument that it is >not possible to ensure the invariant. On the other hand I don't think >it is that bad - I would expect the length computing loop to be much >faster than the main loop that involves main memory.
> > > >>you >>should be able to do it in at most a few steps: >> >>length = (stop - start)/step >>while length * step < stop: >> length += 1 >>while length * step >= stop: >> length -= 1 >> >>Fix errors, convert to C and enjoy. It should normally take only a few >>tries to get the right N. >> >> > >This will not work (even if you fix the error of missing start+ in the >conditions :-): start + length*step < stop does not guarantee that >start + step + ... + step < stop. > > Indeed. When I first was thinking about this, I assumed that arange was computed as essentially start + range(0, n)*step, not as an accumulation. After I actually looked at what arange did, I failed to update my thinking -- my mistake, sorry. > > >>I see that the internals of range use repeated adding to make the range. >>I imagine that is why you proposed the repeated adding. I think that >>results in error that's on the order of length ULP, while multiplying >>would result in error on the order of 1 ULP. So perhaps we should fix >>XXX_fill to be more accurate if nothing else. >> >> >> > >I don't think accuracy of XXX_fill for fractional steps is worth improving. > > I would think that would depend on (a) how hard it is to do (I think the answer to that is not hard at all), (b) how much of a performance impact it would have (some, probably, since one's adding a multiply), and (c) how much one values the minor increase in accuracy versus whatever performance impact this might have. The change I was referring to would look more or less like: static void FLOAT_fill(float *buffer, intp length, void *ignored) { intp i; float start = buffer[0]; float delta = buffer[1]; delta -= start; /*start += (delta + delta); */ buffer += 2; for (i=2; i<length; i++, buffer++) { *buffer = start + i*delta; } } >In the cases where accuracy matters, one can always use an integral step >and multiply the result by a float. However, if anything is done to >that end, I would suggest generalizing the XXX_fill functions to allow >accumulation to be performed using a different type similarly to the way >op.reduce and op.accumulate functions use their (new in numpy) dtype >argument. > > Really? That seems unnecessarily baroque. >>>3. Change arange to ensure that arange(..., stop, ...)[-1] < stop. >>> >>> >>I see that Travis has vetoed this in any event, but perhaps we should >>fix up the fill functions to be more accurate and maybe most of the >>problem would just magically go away. >> >> > >The more I think about this, the more I am convinced that using arange >with a non-integer step is a bad idea. Since making it illegal is not >an option, I don't see much of a point in changing exactly how bad it >is. Users who want fractional steps should just be educated about >linspace. > > Are integer steps with noninteger start and stop safe? For that matter, are integer steps safe for sufficiently large, floating-point but integral, values of start and stop? It seems like they might well not be, but I haven't thought it through very well. I suppose that even if this was technically unsafe, in practice it would probably be pretty hard to get into trouble in that way. Regards, -tim
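A pure-Python sketch of the two error behaviours contrasted here, repeated adding versus a single multiply; the printed values assume typical IEEE-754 doubles:

step = 0.1
acc = 0.0
for i in range(1000):
    acc += step            # error accumulates, roughly one rounding per add

print(repr(acc))           # about 99.9999999999986 on IEEE-754 doubles
print(repr(1000 * step))   # 100.0 -- a single rounding of 1000*0.1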
From oliphant.travis at ieee.org Wed Feb 8 21:37:03 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Wed Feb 8 21:37:03 2006 Subject: [Numpy-discussion] newunicode branch started to fix unicode to always be UCS4 In-Reply-To: <43EA72D5.8090508@ee.byu.edu> References: <43EA72D5.8090508@ee.byu.edu> Message-ID: <43EAD4D1.800@ieee.org> Travis Oliphant wrote: > > I've started a branch on SVN to fix the unicode implementation in > NumPy so that internally all unicode arrays use UCS4. When a scalar > is obtained it will be the Python unicode scalar and the required > conversions (and data-copying) will be done. > If anybody would like to help the branch is > Well, it turned out not to be too difficult. It is done. All Unicode arrays are now always 4 bytes per character in NumPy. The length is specified in terms of characters (not bytes). This is different from other types, but it's consistent with the use of Unicode as characters. The array-scalar that a unicode array produces inherits directly from the Python unicode type, which has either 2 or 4 bytes per character depending on the build. On narrow builds where Python unicode is only 2-byte, the 4-byte unicode is converted to 2-byte using surrogate pairs. There may be lingering bugs of course, so please try it out and report problems. -Travis From wbaxter at gmail.com Thu Feb 9 00:22:03 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Thu Feb 9 00:22:03 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki Message-ID: I added some content to the "NumPy/SciPy for Matlab users" page on the scipy wiki. But my knowledge of NumPy/SciPy isn't sufficient to fill in the whole chart of equivalents that I laid out. If folks who know both could browse by and maybe fill in a blank or two, that would be great. I think this will be a helpful "getting started" page for newbies to NumPy coming from matlab, like me. One of the most frustrating things is when you sit down and can't figure out how to do the most basic things that you do in your sleep in another environment (like making a column vector). So hopefully this page will help. The URL is: http://www.scipy.org/Wiki/NumPy_for_Matlab_Addicts Thanks, Bill Baxter From matthew.brett at gmail.com Thu Feb 9 01:15:00 2006 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu Feb 9 01:15:00 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: Message-ID: <1e2af89e0602090046g55f80ecdi26b24afc5dbe8a1d@mail.gmail.com> Hi Bill, On 2/9/06, Bill Baxter wrote: > I added some content to the "NumPy/SciPy for Matlab users" page on the scipy > wiki. Thanks a lot for doing this. Did you see this excellent reference? Maybe it would be useful to combine efforts in some way? http://www.37mm.no/matlab-python-xref.html Best, Matthew From dd55 at cornell.edu Thu Feb 9 04:05:16 2006 From: dd55 at cornell.edu (Darren Dale) Date: Thu Feb 9 04:05:16 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: Message-ID: <200602090703.45993.dd55@cornell.edu> On Thursday 09 February 2006 3:21 am, Bill Baxter wrote: > I added some content to the "NumPy/SciPy for Matlab users" page on the > scipy wiki. > > But my knowledge of NumPy/SciPy isn't sufficient to fill in the whole chart > of equivalents that I laid out. > If folks who know both could browse by and maybe fill in a blank or two, > that would be great.
I think this will be a helpful "getting started" page > for newbies to NumPy coming from matlab, like me. One of the most > frustrating things is when you sit down and can't figure out how to do the > most basic things that you do in your sleep in another environment (like making > a column vector). So hopefully this page will help. > > The URL is: http://www.scipy.org/Wiki/NumPy_for_Matlab_Addicts I filled in a couple of places, where I could. I have a question about upcasting related to this example: a with elements less than 0.5 zeroed out: Matlab: a .* (a>0.5) NumPy: where(a<0.5, 0, a) I think numpy should be able to do a*a>0.5 as well, but instead one must do: a*(a>0.5).astype('i') Is it desirable to upcast bools in this case? I think so, one could always recover the current behavior by doing: (a*(a>0.5)).astype('?') Darren From martin.wiechert at gmx.de Thu Feb 9 04:14:28 2006 From: martin.wiechert at gmx.de (Martin Wiechert) Date: Thu Feb 9 04:14:28 2006 Subject: [Numpy-discussion] segfault when calling PyArray_DescrFromType Message-ID: <200602091304.59062.martin.wiechert@gmx.de> Hi list, I'm trying to build a C extension, which uses arrays. It builds, and I can import it from python, but the very first call to a numpy function ea = (PyObject *) PyArray_DescrFromType (PyArray_INT); gives me a segfault. I have absolutely no clue, but nm -l mymodule.so | grep rray gives 000026a0 b PyArray_API /usr/lib/python2.4/site-packages/numpy/core/include/numpy/__multiarray_api.h:316 and this line reads static void **PyArray_API=NULL; which looks suspicious to me. Something wrong with my setup.py? Any suggestions? Regards, Martin. From dd55 at cornell.edu Thu Feb 9 04:24:01 2006 From: dd55 at cornell.edu (Darren Dale) Date: Thu Feb 9 04:24:01 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: <200602090703.45993.dd55@cornell.edu> References: <200602090703.45993.dd55@cornell.edu> Message-ID: <200602090723.03522.dd55@cornell.edu> On Thursday 09 February 2006 7:03 am, Darren Dale wrote: > I have a question about upcasting related to this example: > > a with elements less than 0.5 zeroed out: > Matlab: a .* (a>0.5) > NumPy: where(a<0.5, 0, a) > > I think numpy should be able to do a*a>0.5 as well, but instead one must > do: a*(a>0.5).astype('i') > > Is it desirable to upcast bools in this case? I think so, one could always > recover the current behavior by doing: > (a*(a>0.5)).astype('?') oops: I should have been doing a*(a>0.5), the order of operations is important. My mistake. From gruben at bigpond.net.au Thu Feb 9 04:27:06 2006 From: gruben at bigpond.net.au (Gary Ruben) Date: Thu Feb 9 04:27:06 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: <1e2af89e0602090046g55f80ecdi26b24afc5dbe8a1d@mail.gmail.com> References: <1e2af89e0602090046g55f80ecdi26b24afc5dbe8a1d@mail.gmail.com> Message-ID: <43EB34F8.6030006@bigpond.net.au> Vidar's documentation is under a GNU Free Documentation License. This is probably a problem with incorporating it directly into the scipy site, although Vidar was at one point happy to incorporate the MATLAB parts into Perry Greenfield and Robert Jedrzejewski's interactive data analysis tutorial. This tutorial used to be on the numarray page but has now disappeared and hasn't quite found its way onto the scipy site, although it may just be due to a broken link. I've added a link to Vidar's site to the wiki. Gary R. Matthew Brett wrote: > Hi Bill, > > On 2/9/06, Bill Baxter wrote: >> I added some content to the "NumPy/SciPy for Matlab users" page on the scipy >> wiki. > > Thanks a lot for doing this. Did you see this excellent reference? > Maybe it would be useful to combine efforts in some way? > > http://www.37mm.no/matlab-python-xref.html > > Best, > > Matthew
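A short sketch of the masking idioms Darren compares earlier in this thread, with the precedence fix from his follow-up; it assumes a numpy in which booleans upcast in arithmetic, so no astype call is needed:

import numpy as np

a = np.random.rand(5)

b = np.where(a < 0.5, 0, a)   # explicit selection, as on the wiki page
c = a * (a > 0.5)             # boolean mask upcasts in the product
# note: a * a > 0.5 parses as (a * a) > 0.5 -- the precedence slip
# corrected in the follow-up message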
From faltet at carabos.com Thu Feb 9 04:50:03 2006 From: faltet at carabos.com (Francesc Altet) Date: Thu Feb 9 04:50:03 2006 Subject: [Numpy-discussion] newunicode branch started to fix unicode to always be UCS4 In-Reply-To: <43EAD4D1.800@ieee.org> References: <43EA72D5.8090508@ee.byu.edu> <43EAD4D1.800@ieee.org> Message-ID: <200602091349.47672.faltet@carabos.com> On Thursday 09 February 2006 06:36, Travis Oliphant wrote: > Travis Oliphant wrote: > > I've started a branch on SVN to fix the unicode implementation in > > NumPy so that internally all unicode arrays use UCS4. When a scalar > > is obtained it will be the Python unicode scalar and the required > > conversions (and data-copying) will be done. > > If anybody would like to help the branch is > > Well, it turned out not to be too difficult. It is done. Oh my! If I hadn't met you in person I would tend to think that you are not human ;-) > All Unicode > arrays are now always 4 bytes per character in NumPy. The length is > specified in terms of characters (not bytes). This is different from > other types, but it's consistent with the use of Unicode as characters. Yes, I think this is a good idea. > The array-scalar that a unicode array produces inherits directly from > the Python unicode type, which has either 2 or 4 bytes per character depending on the build. > > On narrow builds where Python unicode is only 2-byte, the 4-byte > unicode is converted to 2-byte using surrogate pairs. Very good! > There may be lingering bugs of course, so please try it out and report > problems. Well, I've tried it for a while and it seems to me that you did a very good job! Just a little thing: # Using a UCS4 interpreter here >>> len(buffer(numpy.array("qsds", 'U4')[()])) 16 >>> numpy.array("qsds", 'U4')[()].dtype dtype('<U4') >>> len(buffer(numpy.array("qsds", 'U3')[()])) 12 >>> numpy.array("qsds", 'U3')[()].dtype dtype('<U3') # Using a UCS2 interpreter here >>> len(buffer(numpy.array("qsds", 'U4')[()])) 8 # Fine >>> numpy.array("qsds", 'U4')[()].dtype dtype('<U4') >>> len(buffer(numpy.array("qsds", 'U3')[()])) 6 # Fine >>> numpy.array("qsds", 'U3')[()].dtype dtype('<U3') -- >0,0< Francesc Altet http://www.carabos.com/ V V Cárabos Coop. V. Enjoy Data "-" From bsouthey at gmail.com Thu Feb 9 05:57:04 2006 From: bsouthey at gmail.com (Bruce Southey) Date: Thu Feb 9 05:57:04 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: Message-ID: Hi, The example of ndim to give the rank is not the same as the Matlab function rank(a). See http://en.wikipedia.org/wiki/Rank_of_a_matrix for the definition of rank that most people coming from Matlab would expect, which is what rank(a) provides. I have not used the latest numpy but the equivalent function is not present in numarray/Numeric (to my knowledge), so you have to find some other way, like using svd. Regards Bruce On 2/9/06, Bill Baxter wrote: > I added some content to the "NumPy/SciPy for Matlab users" page on the scipy wiki. > > But my knowledge of NumPy/SciPy isn't sufficient to fill in the whole chart of equivalents that I laid out. > If folks who know both could browse by and maybe fill in a blank or two, that would be great. I think this will be a helpful "getting started" page for newbies to NumPy coming from matlab, like me. One of the most frustrating things is when you sit down and can't figure out how to do the most basic things that you do in your sleep in another environment (like making a column vector). So hopefully this page will help. > > The URL is: http://www.scipy.org/Wiki/NumPy_for_Matlab_Addicts > > Thanks, > Bill Baxter >
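A sketch of the SVD route Bruce mentions for the linear-algebra rank; the helper name and tolerance are arbitrary choices, not an existing numpy function of that era:

import numpy as np

def matrix_rank(a, tol=1e-10):
    """Linear-algebra rank: the number of 'nonzero' singular values."""
    s = np.linalg.svd(a, compute_uv=False)   # singular values only
    return int((s > tol * s.max()).sum())

print(matrix_rank(np.array([[1., 2.], [2., 4.]])))   # 1: the rows are dependent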
From aisaac at american.edu Thu Feb 9 06:39:11 2006 From: aisaac at american.edu (Alan G Isaac) Date: Thu Feb 9 06:39:11 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: Message-ID: On Thu, 9 Feb 2006, Bruce Southey apparently wrote: > The example of ndim to give the rank is not the same as the Matlab > function rank(a). See > http://en.wikipedia.org/wiki/Rank_of_a_matrix for the definition of rank > that most people coming from Matlab would expect Coming from GAUSS and having never studied tensors, I was also surprised by the 'rank' terminology. I believe this is why Travis changed to ndim, which is less likely to confuse users coming from a linear algebra perspective. Unfortunately the SciPy book currently uses the term 'rank' in two conflicting ways. (It uses 'rank' in the linear algebra sense only in the discussion of lstsq on p.145.) It might be helpful for the tensor sense to always be qualified as 'tensor rank'? Cheers, Alan Isaac From pau.gargallo at gmail.com Thu Feb 9 06:44:01 2006 From: pau.gargallo at gmail.com (Pau Gargallo) Date: Thu Feb 9 06:44:01 2006 Subject: [Numpy-discussion] ufuncs question Message-ID: <6ef8f3380602090643m382c0b46ndf025e39f67894c0@mail.gmail.com> hi all, I have the following code: def foo(x): '''takes an nd-array x and returns another md-array''' do something return an md-array A = an array of nd-arrays #A has 1+n dimensions B = an array of md-arrays #B has 1+m dimensions for i in range(len(A)): B[i] = foo(A[i]) and I was wondering if there is an easy way to speed it up. I guess that something using ufuncs could be used (?). Something like B = ufunced_foo( A ). thanks in advance, pau From martin.wiechert at gmx.de Thu Feb 9 07:01:31 2006 From: martin.wiechert at gmx.de (Martin Wiechert) Date: Thu Feb 9 07:01:31 2006 Subject: [Numpy-discussion] Re: [SciPy-user] segfault when calling PyArray_DescrFromType In-Reply-To: <200602091141.51520.martin.wiechert@gmx.de> References: <200602091141.51520.martin.wiechert@gmx.de> Message-ID: <200602091552.11896.martin.wiechert@gmx.de> Found it (in the "old" docs). Must #define PY_ARRAY_UNIQUE_SYMBOL and call import_array (). Sorry to bother. Martin. On Thursday 09 February 2006 11:41, Martin Wiechert wrote: > Hi list, > > I'm trying to build a C extension, which uses arrays. It builds, and I can > import it from python, but the very first call to a numpy function > > ea = (PyObject *) PyArray_DescrFromType (PyArray_INT); > > gives me a segfault. > > I have absolutely no clue, but > > nm -l mymodule.so | grep rray > > gives > > 000026a0 b > PyArray_API > /usr/lib/python2.4/site-packages/numpy/core/include/numpy/__multiarray_api. >h:316 > > and this line reads > > static void **PyArray_API=NULL; > > which looks suspicious to me. Something wrong with my setup.py? > > Any suggestions? > > Regards, Martin.
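A hedged sketch of the simplest answer to Pau's question, short of writing a real ufunc in C: build the stacked result from a list in one pass. The function foo below is a hypothetical stand-in for his per-subarray computation.

import numpy as np

def foo(x):
    # hypothetical stand-in for the per-subarray computation
    return x - x.mean()

A = np.random.rand(5, 3, 3)          # five "nd-arrays" stacked along axis 0
B = np.array([foo(a) for a in A])    # one pass, shape (5, 3, 3) here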
From oliphant.travis at ieee.org Thu Feb 9 08:11:04 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Feb 9 08:11:04 2006 Subject: [Numpy-discussion] Re: ***[Possible UCE]*** Re: ***[Possible UCE]*** Re: [Numpy-discussion] Is there a known problem compiling numpy with VC7? Success In-Reply-To: <43EA3DE0.1070608@cox.net> References: <43E41E52.6060805@cox.net> <43E42B26.2090304@ee.byu.edu> <43E42C2C.6000208@cox.net> <43E45A95.6040601@ieee.org> <43E81650.2040204@cox.net> <43E81BFA.7060600@ieee.org> <43E82930.7070103@cox.net> <43E83FD3.90206@ieee.org> <43E8EE72.6070101@cox.net> <43E8EFE9.5040207@ieee.org> <43E9235E.70004@cox.net> <43E94090.5080609@ieee.org> <43EA3DE0.1070608@cox.net> Message-ID: <43EB695D.9050501@ieee.org> Tim Hochberg wrote: > I'm attaching the two modified setup files. The first is > numpy/core/setup.py and the second is numpy/random/setup.py. I tried > to keep the modifications as minimal as possible. With these two setup > files, and adding M_PI to numpy\random\mtrand\distributions.c, numpy > compiles fine and passes all tests except for the test_minrelpath path > I mentioned in my last message. I'm trying to incorporate your changes. 1) M_PI was easy to fix. 2) In the core/setup.py file you sent you added the line: python_libs = join(distutils.sysconfig.EXEC_PREFIX, 'libs') I'm not sure what this is supposed to do. What problem does it fix on your system? It makes no sense on mine as this becomes python_libs = '/usr/libs' which is not a directory. 3) For the setup.py file in random you are using Advapi for all win32 platforms. But, this seems to be a Windows NT file or at least only needed when compiling with certain compilers. Mingw32 built just fine without it. So, I'm not sure how to handle this. Suggestions?
-Travis > > -tim > >------------------------------------------------------------------------ > > >import imp >import os >import sys >import distutils.sysconfig >from os.path import join >from glob import glob >from distutils.dep_util import newer,newer_group > >def configuration(parent_package='',top_path=None): > from numpy.distutils.misc_util import Configuration,dot_join > from numpy.distutils.system_info import get_info > > config = Configuration('core',parent_package,top_path) > local_dir = config.local_path > codegen_dir = join(local_dir,'code_generators') > > generate_umath_py = join(codegen_dir,'generate_umath.py') > n = dot_join(config.name,'generate_umath') > generate_umath = imp.load_module('_'.join(n.split('.')), > open(generate_umath_py,'U'),generate_umath_py, > ('.py','U',1)) > > header_dir = join(*(config.name.split('.')+['include','numpy'])) > > def generate_config_h(ext, build_dir): > target = join(build_dir,'config.h') > if newer(__file__,target): > config_cmd = config.get_config_cmd() > print 'Generating',target > # > tc = generate_testcode(target) > from distutils import sysconfig > python_include = sysconfig.get_python_inc() > python_libs = join(distutils.sysconfig.EXEC_PREFIX, 'libs') > result = config_cmd.try_run(tc,include_dirs=[python_include], library_dirs=[python_libs]) > if not result: > raise "ERROR: Failed to test configuration" > moredefs = [] > > # > mathlibs = [] > tc = testcode_mathlib() > mathlibs_choices = [[],['m'],['cpml']] > mathlib = os.environ.get('MATHLIB') > if mathlib: > mathlibs_choices.insert(0,mathlib.split(',')) > for libs in mathlibs_choices: > if config_cmd.try_run(tc,libraries=libs): > mathlibs = libs > break > else: > raise "math library missing; rerun setup.py after setting the MATHLIB env variable" > ext.libraries.extend(mathlibs) > moredefs.append(('MATHLIB',','.join(mathlibs))) > > libs = mathlibs > kws_args = {'libraries':libs,'decl':0,'headers':['math.h']} > if config_cmd.check_func('expl', **kws_args): > moredefs.append('HAVE_LONGDOUBLE_FUNCS') > if config_cmd.check_func('expf', **kws_args): > moredefs.append('HAVE_FLOAT_FUNCS') > if config_cmd.check_func('log1p', **kws_args): > moredefs.append('HAVE_LOG1P') > if config_cmd.check_func('expm1', **kws_args): > moredefs.append('HAVE_EXPM1') > if config_cmd.check_func('asinh', **kws_args): > moredefs.append('HAVE_INVERSE_HYPERBOLIC') > if config_cmd.check_func('atanhf', **kws_args): > moredefs.append('HAVE_INVERSE_HYPERBOLIC_FLOAT') > if config_cmd.check_func('atanhl', **kws_args): > moredefs.append('HAVE_INVERSE_HYPERBOLIC_LONGDOUBLE') > if config_cmd.check_func('isnan', **kws_args): > moredefs.append('HAVE_ISNAN') > if config_cmd.check_func('isinf', **kws_args): > moredefs.append('HAVE_ISINF') > > if sys.version[:3] < '2.4': > kws_args['headers'].append('stdlib.h') > if config_cmd.check_func('strtod', **kws_args): > moredefs.append(('PyOS_ascii_strtod', 'strtod')) > > if moredefs: > target_f = open(target,'a') > for d in moredefs: > if isinstance(d,str): > target_f.write('#define %s\n' % (d)) > else: > target_f.write('#define %s %s\n' % (d[0],d[1])) > target_f.close() > else: > mathlibs = [] > target_f = open(target) > for line in target_f.readlines(): > s = '#define MATHLIB' > if line.startswith(s): > value = line[len(s):].strip() > if value: > mathlibs.extend(value.split(',')) > target_f.close() > > ext.libraries.extend(mathlibs) > > incl_dir = os.path.dirname(target) > if incl_dir not in config.numpy_include_dirs: > config.numpy_include_dirs.append(incl_dir) > > 
config.add_data_files((header_dir,target)) > return target > > def generate_api_func(header_file, module_name): > def generate_api(ext,build_dir): > target = join(build_dir, header_file) > script = join(codegen_dir, module_name + '.py') > if newer(script, target): > sys.path.insert(0, codegen_dir) > try: > m = __import__(module_name) > print 'executing',script > m.generate_api(build_dir) > finally: > del sys.path[0] > config.add_data_files((header_dir,target)) > return target > return generate_api > > generate_array_api = generate_api_func('__multiarray_api.h', > 'generate_array_api') > generate_ufunc_api = generate_api_func('__ufunc_api.h', > 'generate_ufunc_api') > > def generate_umath_c(ext,build_dir): > target = join(build_dir,'__umath_generated.c') > script = generate_umath_py > if newer(script,target): > f = open(target,'w') > f.write(generate_umath.make_code(generate_umath.defdict, > generate_umath.__file__)) > f.close() > return [] > > config.add_data_files(join('include','numpy','*.h')) > config.add_include_dirs('src') > > config.numpy_include_dirs.extend(config.paths('include')) > > deps = [join('src','arrayobject.c'), > join('src','arraymethods.c'), > join('src','scalartypes.inc.src'), > join('src','arraytypes.inc.src'), > join('src','_signbit.c'), > join('src','_isnan.c'), > join('include','numpy','*object.h'), > join(codegen_dir,'genapi.py'), > join(codegen_dir,'*.txt') > ] > > config.add_extension('multiarray', > sources = [join('src','multiarraymodule.c'), > generate_config_h, > generate_array_api, > join('src','scalartypes.inc.src'), > join('src','arraytypes.inc.src'), > join(codegen_dir,'generate_array_api.py'), > join('*.py') > ], > depends = deps, > ) > > config.add_extension('umath', > sources = [generate_config_h, > join('src','umathmodule.c.src'), > generate_umath_c, > generate_ufunc_api, > join('src','scalartypes.inc.src'), > join('src','arraytypes.inc.src'), > ], > depends = [join('src','ufuncobject.c'), > generate_umath_py, > join(codegen_dir,'generate_ufunc_api.py'), > ]+deps, > ) > > config.add_extension('_sort', > sources=[join('src','_sortmodule.c.src'), > generate_config_h, > generate_array_api, > ], > ) > > # Configure blasdot > blas_info = get_info('blas_opt',0) > #blas_info = {} > def get_dotblas_sources(ext, build_dir): > if blas_info: > return ext.depends[:1] > return None # no extension module will be built > > config.add_extension('_dotblas', > sources = [get_dotblas_sources], > depends=[join('blasdot','_dotblas.c'), > join('blasdot','cblas.h'), > ], > include_dirs = ['blasdot'], > extra_info = blas_info > ) > > > config.add_data_dir('tests') > config.make_svn_version_py() > > return config > >def testcode_mathlib(): > return """\ >/* check whether libm is broken */ >#include <math.h> >int main(int argc, char *argv[]) >{ > return exp(-720.)
> 1.0; /* typically an IEEE denormal */ >} >""" > >import sys >def generate_testcode(target): > if sys.platform == 'win32': > target = target.replace('\\','\\\\') > testcode = [r''' >#include <Python.h> >#include <limits.h> >#include <stdio.h> > >int main(int argc, char **argv) >{ > > FILE *fp; > > fp = fopen("'''+target+'''","w"); > '''] > > c_size_test = r''' >#ifndef %(sz)s > fprintf(fp,"#define %(sz)s %%d\n", sizeof(%(type)s)); >#else > fprintf(fp,"/* #define %(sz)s %%d */\n", %(sz)s); >#endif >''' > for sz, t in [('SIZEOF_SHORT', 'short'), > ('SIZEOF_INT', 'int'), > ('SIZEOF_LONG', 'long'), > ('SIZEOF_FLOAT', 'float'), > ('SIZEOF_DOUBLE', 'double'), > ('SIZEOF_LONG_DOUBLE', 'long double'), > ('SIZEOF_PY_INTPTR_T', 'Py_intptr_t'), > ]: > testcode.append(c_size_test % {'sz' : sz, 'type' : t}) > > testcode.append('#ifdef PY_LONG_LONG') > testcode.append(c_size_test % {'sz' : 'SIZEOF_LONG_LONG', > 'type' : 'PY_LONG_LONG'}) > testcode.append(c_size_test % {'sz' : 'SIZEOF_PY_LONG_LONG', > 'type' : 'PY_LONG_LONG'}) > > > testcode.append(r''' >#else > fprintf(fp, "/* PY_LONG_LONG not defined */\n"); >#endif >#ifndef CHAR_BIT > { > unsigned char var = 2; > int i=0; > while (var >= 2) { > var = var << 1; > i++; > } > fprintf(fp,"#define CHAR_BIT %d\n", i+1); > } >#else > fprintf(fp, "/* #define CHAR_BIT %d */\n", CHAR_BIT); >#endif > fclose(fp); > return 0; >} >''') > testcode = '\n'.join(testcode) > return testcode > >if __name__=='__main__': > from numpy.distutils.core import setup > setup(**configuration(top_path='').todict()) > >------------------------------------------------------------------------ > >import sys >from os.path import join > >def configuration(parent_package='',top_path=None): > from numpy.distutils.misc_util import Configuration > config = Configuration('random',parent_package,top_path) > > # Configure mtrand > # Note that I'm mimicking the original behaviour of always using 'm' for > # the math library. This should probably use the logic from numpy/core/setup.py > # to choose the math libraries, but I'm going for minimal changes -- TAH > if sys.platform == "win32": > libraries = ['Advapi32'] > else: > libraries = ['m'] > config.add_extension('mtrand', > sources=[join('mtrand', x) for x in > ['mtrand.c', 'randomkit.c', 'initarray.c', > 'distributions.c']], > libraries=libraries, > depends = [join('mtrand','*.h'), > join('mtrand','*.pyx'), > join('mtrand','*.pxi'), > ] > ) > > return config > >if __name__ == '__main__': > from numpy.distutils.core import setup > setup(**configuration(top_path='').todict()) > > From tim.hochberg at cox.net Thu Feb 9 08:30:10 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Thu Feb 9 08:30:10 2006 Subject: [Numpy-discussion] Is there a known problem compiling numpy with VC7? Success In-Reply-To: <43EB695D.9050501@ieee.org> References: <43E41E52.6060805@cox.net> <43E42B26.2090304@ee.byu.edu> <43E42C2C.6000208@cox.net> <43E45A95.6040601@ieee.org> <43E81650.2040204@cox.net> <43E81BFA.7060600@ieee.org> <43E82930.7070103@cox.net> <43E83FD3.90206@ieee.org> <43E8EE72.6070101@cox.net> <43E8EFE9.5040207@ieee.org> <43E9235E.70004@cox.net> <43E94090.5080609@ieee.org> <43EA3DE0.1070608@cox.net> <43EB695D.9050501@ieee.org> Message-ID: <43EB6DF5.6010705@cox.net> Travis Oliphant wrote: > Tim Hochberg wrote: >> I'm attaching the two modified setup files. The first is >> numpy/core/setup.py and the second is numpy/random/setup.py. I tried >> to keep the modifications as minimal as possible.
With these two >> setup files, and adding M_PI to numpy\random\mtrand\distributions.c, numpy >> compiles fine and passes all tests except for the test_minrelpath path >> I mentioned in my last message. > > I'm trying to incorporate your changes. Great. > 1) M_PI was easy to fix. > > 2) In the core/setup.py file you sent you added the line: > > python_libs = join(distutils.sysconfig.EXEC_PREFIX, 'libs') > > I'm not sure what this is supposed to do. What problem does it fix on > your system? It makes no sense on mine as this becomes > > python_libs = '/usr/libs' > > which is not a directory. OK, we'll have to work out something that works for both. The issue here on windows is that compiling the testcode requires python.lib, and it doesn't get found unless that directory is specified. The problem is perhaps related to the following comment in system_info.py if sys.platform == 'win32': # line 116 default_lib_dirs = ['C:\\'] # probably not very helpful... In any event, it does seem like there should be a better way to find where python.lib lives, but I couldn't find it in my perusal of the distutils docs. > > 3) For the setup.py file in random you are using Advapi for all win32 > platforms. But, this seems to be a Windows NT file I'm compiling on XP FWIW. > or at least only needed when compiling with certain compilers. > Mingw32 built just fine without it. So, I'm not sure how to handle > this. My guess, and it's only a guess because I use neither mingw nor the windows crypto stuff, is that defines are set differently by mingw so that the parts that need that library are not being compiled when you use mingw. The code in question is all guarded by: #ifdef _WIN32 #ifndef RK_NO_WINCRYPT As far as I can tell, RK_NO_WINCRYPT never gets defined anywhere, so the important test is for _WIN32. So, does mingw define _WIN32? If it does not, then that's what's going on. In that case, the proper test is probably to check if _WIN32 is defined by the compiler in question and include Advapi only then. If it does define _WIN32, then I dunno! -tim From strawman at astraw.com Thu Feb 9 09:00:03 2006 From: strawman at astraw.com (Andrew Straw) Date: Thu Feb 9 09:00:03 2006 Subject: [Numpy-discussion] segfault when calling PyArray_DescrFromType In-Reply-To: <200602091304.59062.martin.wiechert@gmx.de> References: <200602091304.59062.martin.wiechert@gmx.de> Message-ID: <43EB74DD.3050608@astraw.com> Martin Wiechert wrote: >Hi list, > >I'm trying to build a C extension, which uses arrays. It builds, and I can >import it from python, but the very first call to a numpy function > > ea = (PyObject *) PyArray_DescrFromType (PyArray_INT); > >gives me a segfault. > >I have absolutely no clue, but > >nm -l mymodule.so | grep rray > >gives > >000026a0 b >PyArray_API /usr/lib/python2.4/site-packages/numpy/core/include/numpy/__multiarray_api.h:316 > >and this line reads > >static void **PyArray_API=NULL; > >which looks suspicious to me. Something wrong with my setup.py? > >Any suggestions? > > Did you do import_array() beforehand? From oliphant.travis at ieee.org Thu Feb 9 09:30:03 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Feb 9 09:30:03 2006 Subject: [Numpy-discussion] Is there a known problem compiling numpy with VC7?
Success In-Reply-To: <43EB6DF5.6010705@cox.net> References: <43E41E52.6060805@cox.net> <43E42B26.2090304@ee.byu.edu> <43E42C2C.6000208@cox.net> <43E45A95.6040601@ieee.org> <43E81650.2040204@cox.net> <43E81BFA.7060600@ieee.org> <43E82930.7070103@cox.net> <43E83FD3.90206@ieee.org> <43E8EE72.6070101@cox.net> <43E8EFE9.5040207@ieee.org> <43E9235E.70004@cox.net> <43E94090.5080609@ieee.org> <43EA3DE0.1070608@cox.net> <43EB695D.9050501@ieee.org> <43EB6DF5.6010705@cox.net> Message-ID: <43EB7BE3.70706@ieee.org> Tim Hochberg wrote: >> >> >> I'm trying to incorporate your changes. > > > > Great. > > > OK, we'll have to work out something that works for both. The issue > here on windows is that compiling the testcode requires python.lib, > and it doesn't get found unless that directory is specified. The > problem is perhaps related to the following comment in system_info.py > > if sys.platform == 'win32': # line 116 > default_lib_dirs = ['C:\\'] # probably not very helpful... I added the change you made in setup.py to default_lib_dirs, here. See if this fixes it. >> >> 3) For the setup.py file in random you are using Advapi for all win32 >> platforms. But, this seems to be a Windows NT file > > I'm compiling on XP FWIW. > >> or at least only needed when compiling with certain compilers. >> Mingw32 built just fine without it. So, I'm not sure how to handle >> this. > I see now. On _WIN32 platforms it's using the registry instead of the file system to store things. I modified the random/setup.py script to test for _WIN32 in the compiler and add the dll to the list of libraries if it is found. I'm also reading the configuration file to determine MATHLIB. Can you try out the new SVN and see if it builds for you without modification. -Travis From tim.hochberg at cox.net Thu Feb 9 09:47:14 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Thu Feb 9 09:47:14 2006 Subject: [Numpy-discussion] arange(start, stop, step) and floating point (Ticket #8) In-Reply-To: <43EA8867.5080109@ee.byu.edu> References: <43EA6A34.2000202@gmail.com> <43EA7FE7.2040902@cox.net> <43EA8867.5080109@ee.byu.edu> Message-ID: <43EB7FE5.1050000@cox.net> Travis Oliphant wrote: > Tim Hochberg wrote: > >>> >> >> >> I see that Travis has vetoed this in any event, but perhaps we should >> fix up the fill functions to be more accurate and maybe most of the >> problem would just magically go away. > > > > To do something different than arange has always done we need a new > function, not change what arange does and thus potentially break lots > of code. > > How do you propose to make the fill functions more accurate? I'm > certainly willing to see improvements there. OK, I experimented with this.
I replaced the code for @NAME@_fill in arraytypes.inc.src with:

/**begin repeat
#NAME=BYTE,UBYTE,SHORT,USHORT,INT,UINT,LONG,ULONG,LONGLONG,ULONGLONG,FLOAT,DOUBLE,LONGDOUBLE#
#typ=byte,ubyte,short,ushort,int,uint,long,ulong,longlong,ulonglong,float,double,longdouble#
*/
static void
@NAME@_fill(@typ@ *buffer, intp length, void *ignored)
{
    intp i;
    @typ@ start = buffer[0];
    @typ@ delta = buffer[1];
    delta -= start;
    buffer += 2;
    for (i=2; i<length; i++, buffer++) {
        *buffer = start + i*delta;
    }
}

I then ran my test script (attached as testarange.py). Before the change it reported failures like these:

arange(0, 100.68000000000001, 0.12) failed 100.68 >= 100.68
arange(0, 100.65000000000001, 0.074999999999999997) failed 100.65 >= 100.65
arange(0, 100.26000000000001, 0.059999999999999998) failed 100.26 >= 100.26
arange(0, 100.62, 0.059999999999999998) failed 100.62 >= 100.62
arange(0, 100.68000000000001, 0.059999999999999998) failed 100.68 >= 100.68
arange(0, 100.98, 0.059999999999999998) failed 100.98 >= 100.98
arange(0, 100.5, 0.031914893617021274) failed 100.5 >= 100.5
arange(10000) took 2.25123220968 seconds for 100000 reps
arange(10000.0) took 4.31864636427 seconds for 100000 reps

After the change:

arange(10000) took 1.82795662577 seconds for 100000 reps
arange(10000.0) took 3.93278363591 seconds for 100000 reps

That is, not only did all of the incorrect end cases go away, it actually got marginally faster. Why it got faster I can't say; there's not much to be gained in second-guessing an optimizing compiler. It's quite possible that this may be compiler-dependent, so I'd be interested in the results with other compilers. Also, I only sampled a small chunk of the input space of arange, so if you have some other failing input values, please send them to me and I can test them and see if this change fixes them also. I didn't mess with the complex version of fill yet. Is that just there to support arange(0, 100, 2, dtype=complex), or is there some other use for _fill besides arange? Regards, -tim -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: testarange.py URL: From tim.hochberg at cox.net Thu Feb 9 10:34:03 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Thu Feb 9 10:34:03 2006 Subject: [Numpy-discussion] Is there a known problem compiling numpy with VC7? Success In-Reply-To: <43EB7BE3.70706@ieee.org> References: <43E41E52.6060805@cox.net> <43E42B26.2090304@ee.byu.edu> <43E42C2C.6000208@cox.net> <43E45A95.6040601@ieee.org> <43E81650.2040204@cox.net> <43E81BFA.7060600@ieee.org> <43E82930.7070103@cox.net> <43E83FD3.90206@ieee.org> <43E8EE72.6070101@cox.net> <43E8EFE9.5040207@ieee.org> <43E9235E.70004@cox.net> <43E94090.5080609@ieee.org> <43EA3DE0.1070608@cox.net> <43EB695D.9050501@ieee.org> <43EB6DF5.6010705@cox.net> <43EB7BE3.70706@ieee.org> Message-ID: <43EB8AFE.2060704@cox.net> Travis Oliphant wrote: > Tim Hochberg wrote: > >>> >>> >>> I'm trying to incorporate your changes. >> >> >> >> >> Great. >> >> >> OK, we'll have to work out something that works for both. The issue >> here on windows is that compiling the testcode requires python.lib, >> and it doesn't get found unless that directory is specified. The >> problem is perhaps related to the following comment in system_info.py >> >> if sys.platform == 'win32': # line 116 >> default_lib_dirs = ['C:\\'] # probably not very helpful... > > > > I added the change you made in setup.py to default_lib_dirs, here. > See if this fixes it. > >>> >>> 3) For the setup.py file in random you are using Advapi for all >>> win32 platforms. But, this seems to be a windows NT file >> >> >> >> I'm compiling on XP FWIW.
>> >>> or at least only needed when compiling with certain compilers. >>> Mingw32 built just fine without it. So, I'm not sure how to handle >>> this. >> >> > I see now. On _WIN32 platforms it's using the registry instead of the > file system to store things. I modified the random/setup.py script to > test for _WIN32 in the compiler and add the dll to the list of > libraries if it is found. I'm also reading the configuration file to > determine MATHLIB. > > Can you try out the new SVN and see if it builds for you without > modification? There's a shallow error in system_info.py

File "C:\Documents and Settings\End-user\Desktop\numpy\svn\numpy\numpy\distutils\system_info.py", line 118, in ?
    default_lib_dirs = ['C:\\',
NameError: name 'join' is not defined

Just replacing join with os.path.join fixed that. However, it didn't help. I had this fantasy that default_lib_dirs would get picked up automagically; however that does not happen. I still ended up putting:

    from numpy.distutils import system_info
    library_dirs = system_info.default_lib_dirs
    result = config_cmd.try_run(tc, include_dirs=[python_include],
                                library_dirs=library_dirs)

into setup.py. Is that acceptable? It's not very elegant. The changes to setup.py in random and the M_PI fix seem to have worked, since with the changes above it compiles and passes all of the tests except for the previously mentioned test_minrelpath. -tim From oliphant.travis at ieee.org Thu Feb 9 12:30:03 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Feb 9 12:30:03 2006 Subject: [Numpy-discussion] arange(start, stop, step) and floating point (Ticket #8) In-Reply-To: <43EB7FE5.1050000@cox.net> References: <43EA6A34.2000202@gmail.com> <43EA7FE7.2040902@cox.net> <43EA8867.5080109@ee.byu.edu> <43EB7FE5.1050000@cox.net> Message-ID: <43EBA615.8020101@ieee.org> Tim Hochberg wrote: > That is, not only did all of the incorrect end cases go away, it > actually got marginally faster. Why it got faster I can't say; there's > not much to be gained in second-guessing an optimizing compiler. It's > quite possible that this may be compiler-dependent, so I'd be > interested in the results with other compilers. Also, I only sampled a > small chunk of the input space of arange, so if you have some other > failing input values, please send them to me and I can test them and > see if this change fixes them also. > > I didn't mess with the complex version of fill yet. Is that just there > to support arange(0, 100, 2, dtype=complex), or is there some other > use for _fill besides arange? This is a simple change and one we could easily do. The complex versions are there to support complex arange. There are no other uses "currently" for fill. Although you could use it with two equal values to fill an array with the same thing quickly. I have yet to test a version of ones using fill against the current implementation which adds 1 to a zeros array. Thanks for the changes. -Travis From oliphant.travis at ieee.org Thu Feb 9 12:33:06 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Feb 9 12:33:06 2006 Subject: [Numpy-discussion] Is there a known problem compiling numpy with VC7?
Success In-Reply-To: <43EB8AFE.2060704@cox.net> References: <43E41E52.6060805@cox.net> <43E42B26.2090304@ee.byu.edu> <43E42C2C.6000208@cox.net> <43E45A95.6040601@ieee.org> <43E81650.2040204@cox.net> <43E81BFA.7060600@ieee.org> <43E82930.7070103@cox.net> <43E83FD3.90206@ieee.org> <43E8EE72.6070101@cox.net> <43E8EFE9.5040207@ieee.org> <43E9235E.70004@cox.net> <43E94090.5080609@ieee.org> <43EA3DE0.1070608@cox.net> <43EB695D.9050501@ieee.org> <43EB6DF5.6010705@cox.net> <43EB7BE3.70706@ieee.org> <43EB8AFE.2060704@cox.net> Message-ID: <43EBA6D5.9020906@ieee.org> Tim Hochberg wrote: > > > There's a shallow error in system_info.py > > File "C:\Documents and > Settings\End-user\Desktop\numpy\svn\numpy\numpy\distutils\system_info.py", > > line 118, in ? > default_lib_dirs = ['C:\\', > NameError: name 'join' is not defined > > Just replacing join with os.path.join fixed that. However, it didn't > help. I had this fantasy that default_lib_dirs would get picked up > automagically; however that does not happen. I still ended up putting: > > from numpy.distutils import system_info > library_dirs = system_info.default_lib_dirs > result = config_cmd.try_run(tc, include_dirs=[python_include], > library_dirs=library_dirs) > > into setup.py. Is that acceptable? It's not very elegant. I think it's probably O.K. as long as it doesn't produce errors on other systems (and it doesn't on mine). > > The changes to setup.py in random and the M_PI fix seem to have worked, > since with the changes above it compiles and passes all of the tests > except for the previously mentioned test_minrelpath. > I thought I fixed minrelpath too by doing a search and replace. Perhaps that did not help. -Travis From ndarray at mac.com Thu Feb 9 12:51:02 2006 From: ndarray at mac.com (Sasha) Date: Thu Feb 9 12:51:02 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: Message-ID: On 2/9/06, Alan G Isaac wrote: > Unfortunately the SciPy book currently uses the term 'rank' > in the two conflicting ways. (It uses 'rank' in the linear > algebra sense only in the discussion of lstsq on p.145.) > It might be helpful for the tensor sense to always be > qualified as 'tensor rank'? Another alternative would be "number of axes." I also find a glossary used by the J language (an APL descendant) useful in array discussions. See . Here is how the J documentation explains the difference between its terminology and that of the C language: "What C calls an n-dimensional array of rank i×j×…×k is in J an array of rank n with axes of length i,j,…,k." From tim.hochberg at cox.net Thu Feb 9 13:44:06 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Thu Feb 9 13:44:06 2006 Subject: [Numpy-discussion] arange(start, stop, step) and floating point (Ticket #8) In-Reply-To: <43EBA615.8020101@ieee.org> References: <43EA6A34.2000202@gmail.com> <43EA7FE7.2040902@cox.net> <43EA8867.5080109@ee.byu.edu> <43EB7FE5.1050000@cox.net> <43EBA615.8020101@ieee.org> Message-ID: <43EBB782.1010509@cox.net> Travis Oliphant wrote: > Tim Hochberg wrote: >> That is, not only did all of the incorrect end cases go away, it >> actually got marginally faster. Why it got faster I can't say; there's >> not much to be gained in second-guessing an optimizing compiler. It's >> quite possible that this may be compiler-dependent, so >> I'd be interested in the results with other compilers.
Also, I only >> sampled a small chunk of the input space of arange, so if you have >> some other failing input values, please send them to me and I can >> test them and see if this change fixes them also. >> >> I didn't mess with the complex version of fill yet. Is that just >> there to support arange(0, 100, 2, dtype=complex), or is there some >> other use for _fill besides arange? > > > > This is a simple change and one we could easily do. The complex > versions are there to support complex arange. There are no other uses > "currently" for fill. Just for truth in advertising, after the last svn update I did, the speed advantage mostly went away:

# baseline
arange(10000) took 2.27355292363 seconds for 100000 reps
arange(10000.0) took 4.39404812623 seconds for 100000 reps
arange(10000,dtype=complex) took 4.01601209092 seconds for 100000 reps

# multiply instead of repeated add.
arange(10000) took 2.20859410903 seconds for 100000 reps
arange(10000.0) took 4.34652784083 seconds for 100000 reps
arange(10000,dtype=complex) took 6.02266433304 seconds for 100000 reps

I'm not sure if this is a result of the changes that you made stripping out the unneeded 'i' or if my machine was in some sort of different state or what. Note that I modified the complex fills as well now and they are much slower. Is it possible for delta to be complex? If not, we could speed up the complex case a little by exploiting the fact that delta.real is always zero. If, in addition, we can assume both that start.imag is zero and that the array is zeroed out to start with, we could speed things up some more. This seems like a no-brainer for floats (and a no-op for ints), since it alleviates the problem of arange(start,stop,step)[-1] sometimes being >= stop without costing anything performance-wise. (I don't know that it cures the problem, but it seems to make it a lot less likely.) For complex the situation is more, uh, complex. I'd like to make the change, since in general I'd rather be right than fast. Still, it's a significant performance hit in this case. Thoughts? -tim > Although you could use it with two equal values to fill an array with > the same thing quickly. I have yet to test a version of ones using > fill against the current implementation which adds 1 to a zeros array. > > Thanks for the changes. > > -Travis From pearu at scipy.org Thu Feb 9 14:35:07 2006 From: pearu at scipy.org (Pearu Peterson) Date: Thu Feb 9 14:35:07 2006 Subject: [Numpy-discussion] Is there a known problem compiling numpy with VC7?
Success In-Reply-To: <43EB8AFE.2060704@cox.net> References: <43E41E52.6060805@cox.net> <43E42B26.2090304@ee.byu.edu> <43E42C2C.6000208@cox.net> <43E45A95.6040601@ieee.org> <43E81650.2040204@cox.net> <43E81BFA.7060600@ieee.org> <43E82930.7070103@cox.net> <43E83FD3.90206@ieee.org> <43E8EE72.6070101@cox.net> <43E8EFE9.5040207@ieee.org> <43E9235E.70004@cox.net> <43E94090.5080609@ieee.org> <43EA3DE0.1070608@cox.net> <43EB695D.9050501@ieee.org> <43EB6DF5.6010705@cox.net> <43EB7BE3.70706@ieee.org> <43EB8AFE.2060704@cox.net> Message-ID: On Thu, 9 Feb 2006, Tim Hochberg wrote: > I had this fantasy that default_lib_dirs would get picked up > automagically; however that does not happen. I still ended up putting: > > from numpy.distutils import system_info > library_dirs = system_info.default_lib_dirs > result = config_cmd.try_run(tc,include_dirs=[python_include], > library_dirs=library_dirs) > > into setup.py. Is that acceptable? It's not very elegant. No, don't use system_info.default_lib_dirs. Use distutils.sysconfig.get_python_lib() to get the directory that contains Python library. Pearu From tim.hochberg at cox.net Thu Feb 9 14:46:04 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Thu Feb 9 14:46:04 2006 Subject: [Numpy-discussion] Is there a known problem compiling numpy with VC7? Success In-Reply-To: References: <43E41E52.6060805@cox.net> <43E42B26.2090304@ee.byu.edu> <43E42C2C.6000208@cox.net> <43E45A95.6040601@ieee.org> <43E81650.2040204@cox.net> <43E81BFA.7060600@ieee.org> <43E82930.7070103@cox.net> <43E83FD3.90206@ieee.org> <43E8EE72.6070101@cox.net> <43E8EFE9.5040207@ieee.org> <43E9235E.70004@cox.net> <43E94090.5080609@ieee.org> <43EA3DE0.1070608@cox.net> <43EB695D.9050501@ieee.org> <43EB6DF5.6010705@cox.net> <43EB7BE3.70706@ieee.org> <43EB8AFE.2060704@cox.net> Message-ID: <43EBC5DF.9090709@cox.net> Pearu Peterson wrote: > > > On Thu, 9 Feb 2006, Tim Hochberg wrote: > >> I had this fantasy that default_lib_dirs would get picked up >> automagically; however that does not happen. I still ended up putting: >> >> from numpy.distutils import system_info >> library_dirs = system_info.default_lib_dirs >> result = >> config_cmd.try_run(tc,include_dirs=[python_include], >> library_dirs=library_dirs) >> >> into setup.py. Is that acceptable? It's not very elegant. > > > No, don't use system_info.default_lib_dirs. > > Use distutils.sysconfig.get_python_lib() to get the directory that > contains Python library. That's the wrong library. Get_python_lib gives you the location of the python standard library, not the location of python24.lib . The former being python24/Lib (or python24/Lib/site-packages depending what options you feed get_python_lib) 'and the latter being python24/libs on my box. -tim From vidar+list at 37mm.no Thu Feb 9 16:00:15 2006 From: vidar+list at 37mm.no (Vidar Gundersen) Date: Thu Feb 9 16:00:15 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: <43EB34F8.6030006@bigpond.net.au> (Gary Ruben's message of "Thu, 09 Feb 2006 23:26:32 +1100") References: <1e2af89e0602090046g55f80ecdi26b24afc5dbe8a1d@mail.gmail.com> <43EB34F8.6030006@bigpond.net.au> Message-ID: ===== Original message from Gary Ruben | 9 Feb 2006: > Vidar's documentation is under a GNU Free Documentation License. This is > probably a problem with incorporating it directly into the scipy site, > although Vidar was at one point happy to incorporate the MATLAB parts this was not the intention when i picked an open license for it. 
i'm not that familiar with the legal stuff, and i guess once a GPL/GFDL has first been used it always has to stay that way? i also considered CC, but didn't want to spend a lot of time digging into legal stuff: i wanted to make the reference available and reusable to anyone. i don't mind, i wanted to achieve openness and encourage contributions and derivations, and be able to use these to improve and update the original reference. i need to update it with the new NumPy package, but i haven't taken the time to buy the manual and start looking into it yet. will including NumPy commands be a problem related to licensing on the NumPy documentation? also, i'd prefer to publish it on a more appropriate site (like scipy.org, sourceforge.net, or wherever useful) when i feel the documents are more complete. but note that this is not really a Numerical Python and Matlab thing, but a framework to get from math environment a to b. it could also (when i include NumPy) help transition between Numeric/numarray/NumPy: this can easily be generated as a separate reference (i use XSL and LaTeX). (although i did this to support my own transition from Matlab to non-commercial alternatives, e.g. Python and R/RPy, Gnuplot, etc. for plotting.) thanks for cross-posting this to me, Gary. i'm jumping right into this list, so please be indulgent if i seem uninformed on recent talks here. A brief observation on "NumPy for Matlab Addicts": The section "Some Key Differences" says nothing about the number of routines found in Matlab toolboxes for Optimization, Control engineering, Wavelets, etc.; for these there are no real alternatives. kind regards, Vidar Bronken Gundersen From wbaxter at gmail.com Thu Feb 9 16:02:15 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Thu Feb 9 16:02:15 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: Message-ID: Oh, yeh. I can see the problem with the phrase "rank of a matrix". It does sound like it means the linear algebra rank of rank/nullity fame. I changed the description on the page a bit. Thanks for catching that. --bb On 2/9/06, Bruce Southey wrote: > Hi, > The example of ndim to give the rank is not the same as the Matlab > function rank(a). See > http://en.wikipedia.org/wiki/Rank_of_a_matrix for the definition of rank > that I would think most people would use if they use Matlab, and which > is provided by rank(a). > > I have not used the latest numpy but the equivalent function is not > present in numarray/Numeric (to my knowledge), so you have to find some > other way, like using svd. > > Regards > Bruce > > On 2/9/06, Bill Baxter wrote: > > I added some content to the "NumPy/SciPy for Matlab users" page on the > scipy wiki. > > > > But my knowledge of NumPy/SciPy isn't sufficient to fill in the whole > chart of equivalents that I laid out. If folks who know both could browse by and maybe fill in a blank or two, > that would be great. I think this will be a helpful "getting started" page > for newbies to NumPy coming from matlab, like me. One of the most > frustrating things is when you sit down and can't figure out how to do the > most basic things that you do in your sleep in another environment (like making > a column vector). So hopefully this page will help. > > > > The URL is: http://www.scipy.org/Wiki/NumPy_for_Matlab_Addicts > > > > Thanks, > > Bill Baxter
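To make the two meanings of "rank" in this exchange concrete, a minimal sketch along the lines of Bruce's svd suggestion (assuming a numpy where ndim and numpy.linalg.svd are available; the 1e-8 tolerance is an arbitrary illustrative choice, not anything from the thread):

>>> import numpy
>>> a = numpy.array([[1., 2.], [2., 4.], [3., 6.]])
>>> a.ndim                    # "rank" in the array/tensor sense: number of axes
2
>>> s = numpy.linalg.svd(a, compute_uv=False)   # singular values only
>>> int((s > 1e-8).sum())     # "rank" in the linear-algebra sense, Matlab-style
1

The second column is twice the first, so the matrix rank is 1 even though the array has 2 axes.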
From wbaxter at gmail.com Thu Feb 9 16:07:33 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Thu Feb 9 16:07:33 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: Message-ID: Some kind soul added 'svd' and 'inv' to the NumPy/SciPy columns, but those don't seem to be defined, at least for the versions of NumPy/SciPy that I have. Are they new? Or are they perhaps defined by a 3rd package in your environment? By the way, is there any python way to tell which package a symbol is coming from? --bb From cookedm at physics.mcmaster.ca Thu Feb 9 16:17:02 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Thu Feb 9 16:17:02 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: (Bill Baxter's message of "Fri, 10 Feb 2006 09:06:49 +0900") References: Message-ID: Bill Baxter writes: > Some kind soul added 'svd' and 'inv' to the NumPy/SciPy columns, but those > don't seem to be defined, at least for the versions of NumPy/SciPy that I > have. Are they new? Or are they perhaps defined by a 3rd package in your > environment? They're in numpy.linalg. > By the way, is there any python way to tell which package a symbol is coming > from? Check its __module__ attribute. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From wbaxter at gmail.com Thu Feb 9 16:19:24 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Thu Feb 9 16:19:24 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: Message-ID: I find 'number of axes' to be even more confusing than 'dimension'. Both sound to me like they're talking about the number of components in a vector (e.g. 3-dimensional space vs 2-dimensional space), but axes more so. The word dimension has a lot of uses, and in most programming languages arrays are described as being one, two or three dimensional etc. So that makes sense. But I can't think of any common usages of axis that aren't related to vectors in a vector space. But that's just me. Seems like this debate probably came and went a long time ago. What is right probably depends mostly on what sort of math you spend your time doing. --bb On 2/10/06, Sasha wrote: > On 2/9/06, Alan G Isaac wrote: > > Unfortunately the SciPy book currently uses the term 'rank' > > in the two conflicting ways. (It uses 'rank' in the linear > > algebra sense only in the discussion of lstsq on p.145.) > > It might be helpful for the tensor sense to always be > > qualified as 'tensor rank'? > > Another alternative would be "number of axes." I also find a > glossary used by the J language (an APL descendant) useful in array > discussions. See > . > > Here is how the J documentation explains the difference between its > terminology and that of the C language: "What C calls an n-dimensional > array of rank i×j×…×k is in J an array of rank n with axes of length > i,j,…,k." From wbaxter at gmail.com Thu Feb 9 16:24:14 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Thu Feb 9 16:24:14 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: Message-ID: > Some kind soul added 'svd' and 'inv' to the NumPy/SciPy columns, but those > > don't seem to be defined, at least for the versions of NumPy/SciPy that > I > > have. Are they new?
Or are they perhaps defined by a 3rd package in > your > > environment? > > They're in numpy.linalg. Ooooh! Lots of goodies there! > By the way, is there any python way to tell which package a symbol is > coming > > from? > > Check its __module__ attribute. Ah, perfect. I see it's also mentioned in help(thing) for thing. Thanks. --bb From Fernando.Perez at colorado.edu Thu Feb 9 16:35:03 2006 From: Fernando.Perez at colorado.edu (Fernando Perez) Date: Thu Feb 9 16:35:03 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: Message-ID: <43EBDF88.6040504@colorado.edu> [ regarding the way of describing arrays vs. matlab's matrices, and the use of 'dimension', 'rank', 'number of axes', etc.] Let's not introduce new terms where none are needed. These concepts have had well-established names (tensor rank and matrix rank) for a long time. It may be a good idea to add a local glossary page reminding anyone of what the definitions are, but for as long as I can remember reading literature on these topics, the two terms have been fully unambiguous. A numpy array with len(array.shape)==d is closest to a rank-d tensor (minus the geometric co/contravariance information). A d=2 array can be used to represent a matrix, and linear algebra operations can be performed on it; if a Matrix object is built out of it, a number of things (notably the * operator) are then performed in the linear algebra sense (and not element-wise). The rank of a matrix has nothing to do with the shape attribute of the underlying array, but with the number of non-zero singular values (and for floating-point matrices, is best defined up to a given tolerance). Since numpy is an n-dimensional array package, it may be convenient to introduce a matrix_rank() routine which does what matlab's rank() does for 2-d arrays and matrices, while raising an error for any other shape. This would also make it explicit that this operation is only well-defined for 2-d objects. My 1e-2, f From tim.hochberg at cox.net Thu Feb 9 17:07:21 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Thu Feb 9 17:07:21 2006 Subject: [Numpy-discussion] Is there a known problem compiling numpy with VC7? Success In-Reply-To: <43EBC5DF.9090709@cox.net> References: <43E41E52.6060805@cox.net> <43E42B26.2090304@ee.byu.edu> <43E42C2C.6000208@cox.net> <43E45A95.6040601@ieee.org> <43E81650.2040204@cox.net> <43E81BFA.7060600@ieee.org> <43E82930.7070103@cox.net> <43E83FD3.90206@ieee.org> <43E8EE72.6070101@cox.net> <43E8EFE9.5040207@ieee.org> <43E9235E.70004@cox.net> <43E94090.5080609@ieee.org> <43EA3DE0.1070608@cox.net> <43EB695D.9050501@ieee.org> <43EB6DF5.6010705@cox.net> <43EB7BE3.70706@ieee.org> <43EB8AFE.2060704@cox.net> <43EBC5DF.9090709@cox.net> Message-ID: <43EBE721.6000602@cox.net> Tim Hochberg wrote: > Pearu Peterson wrote: >> >> >> On Thu, 9 Feb 2006, Tim Hochberg wrote: >> >>> I had this fantasy that default_lib_dirs would get picked up >>> automagically; however that does not happen. I still ended up putting: >>> >>> from numpy.distutils import system_info >>> library_dirs = system_info.default_lib_dirs >>> result = >>> config_cmd.try_run(tc,include_dirs=[python_include], >>> library_dirs=library_dirs) >>> >>> into setup.py. Is that acceptable? It's not very elegant. >> >> >> No, don't use system_info.default_lib_dirs. >> >> Use distutils.sysconfig.get_python_lib() to get the directory that >> contains Python library. > > That's the wrong library.
Get_python_lib gives you the location of the > python standard library, not the location of python24.lib. The former > being python24/Lib (or python24/Lib/site-packages depending what > options you feed get_python_lib) and the latter being python24/libs > on my box. To follow up on this a little bit, I investigated how distutils itself finds python24.lib. It turns out that it is in build_ext.py, near line 168. The relevant code is:

    # also Python's library directory must be appended to library_dirs
    if os.name == 'nt':
        self.library_dirs.append(os.path.join(sys.exec_prefix, 'libs'))

Unfortunately, there's no obvious, clean way to extract the library information from there. You can grab it using the following magic formula:

    from distutils.core import Distribution
    from distutils.command import build_ext
    be = build_ext.build_ext(Distribution())
    be.finalize_options()
    library_dirs = be.library_dirs

However, that seems worse than what we're doing now. I haven't actually tried this in the code either -- for all I know instantiating an extra Distribution may have some horrible side effect that I don't know about. If someone can come up with a cleaner way to get to this info, that'd be great; otherwise I'd say we might as well just keep things as they are for the time being. Regards, -tim From wbaxter at gmail.com Thu Feb 9 17:10:37 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Thu Feb 9 17:10:37 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: <43EBDF88.6040504@colorado.edu> References: <43EBDF88.6040504@colorado.edu> Message-ID: For what it's worth, matlab's rank function just calls svd, and returns the number of singular values greater than a tolerance. The implementation is a whopping 5 lines long. On 2/10/06, Fernando Perez wrote: > Since numpy is an n-dimensional array package, it may be convenient to > introduce a matrix_rank() routine which does what matlab's rank() does for 2-d > arrays and matrices, while raising an error for any other shape. This > would > also make it explicit that this operation is only well-defined for 2-d > objects. Or put it in numpy.linalg, which also makes it pretty clear what the scope is. --bb From wbaxter at gmail.com Thu Feb 9 17:14:15 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Thu Feb 9 17:14:15 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: <43EBDF88.6040504@colorado.edu> Message-ID: Anyone know what the terms are for redistribution of applications built with Matlab? I searched around their site a bit but couldn't find anything conclusive. One page seemed to be saying there was a per-application fee for distributing a matlab-based application, but other pages made it sound more like it was no extra charge. If the former, then that's another point that should go in the 'key differences' section. --bb From Fernando.Perez at colorado.edu Thu Feb 9 17:27:39 2006 From: Fernando.Perez at colorado.edu (Fernando Perez) Date: Thu Feb 9 17:27:39 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: <43EBDF88.6040504@colorado.edu> Message-ID: <43EBEBD8.4050705@colorado.edu> Bill Baxter wrote: > For what it's worth, matlab's rank function just calls svd, and returns the > number of singular values greater than a tolerance. The implementation is a > whopping 5 lines long.
Yup, and it would be pretty much the same 5 lines in numpy, with the same semantics. Here's a quick and dirty implementation for old-scipy (I don't have new-scipy on this box):

def matrix_rank(arr, tol=1e-8):
    """Return the matrix rank of an input array."""
    arr = scipy.asarray(arr)
    if len(arr.shape) != 2:
        raise ValueError('Input must be a 2-d array or Matrix object')
    svdvals = scipy.linalg.svdvals(arr)
    return sum(scipy.where(svdvals > tol, 1, 0))

If you really hate readability and error-checking, it's a one-liner :)

matrix_rank = lambda arr, tol=1e-8: sum(scipy.where(scipy.linalg.svdvals(arr) > tol, 1, 0))

Looks OK (RA is RandomArray from Numeric):

In [21]: matrix_rank([[1,0],[0,0]])
Out[21]: 1
In [22]: matrix_rank(RA.random((3,3)))
Out[22]: 3
In [23]: matrix_rank([[1,0],[0,0]])
Out[23]: 1
In [24]: matrix_rank([[1,0],[1,0]])
Out[24]: 1
In [25]: matrix_rank([[1,0],[0,1]])
Out[25]: 2
In [26]: matrix_rank(RA.random((3,3)),1e-1)
Out[26]: 2
In [48]: matrix_rank([[1,0],[1,1e-8]])
Out[48]: 1
In [49]: matrix_rank([[1,0],[1,1e-4]])
Out[49]: 2
In [50]: matrix_rank([[1,0],[1,1e-8]],1e-9)
Out[50]: 2

Cheers, f From aisaac at american.edu Thu Feb 9 17:33:00 2006 From: aisaac at american.edu (Alan G Isaac) Date: Thu Feb 9 17:33:00 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: Message-ID: On Fri, 10 Feb 2006, Bill Baxter apparently wrote: > Some kind soul added 'svd' and 'inv' to the NumPy/SciPi > columns, but those don't seem to be defined, at least for > the versions of NumPy/SciPy that I have.

Python 2.4.1 (#65, Mar 30 2005, 09:13:57) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from numpy.linalg import svd, inv
>>>

hth, Alan Isaac From ndarray at mac.com Thu Feb 9 17:41:39 2006 From: ndarray at mac.com (Sasha) Date: Thu Feb 9 17:41:39 2006 Subject: [Numpy-discussion] NumPy Glossary Was:Matlab page on scipy wiki Message-ID: I've created a rough draft of NumPy Glossary on the developer's wiki . Please comment/edit. When it is ready, we can move it to scipy.org. From Fernando.Perez at colorado.edu Thu Feb 9 17:49:35 2006 From: Fernando.Perez at colorado.edu (Fernando Perez) Date: Thu Feb 9 17:49:35 2006 Subject: [Numpy-discussion] NumPy Glossary Was:Matlab page on scipy wiki In-Reply-To: References: Message-ID: <43EBF101.3090401@colorado.edu> Sasha wrote: > I've created a rough draft of NumPy Glossary on the developer's wiki > . Please > comment/edit. When it is ready, we can move it to scipy.org. A humble suggestion: move it NOW. It will never 'be ready', and that's just the wiki way: put it in early, mark it at the top as a stub (so we don't falsely claim it to be in great shape when it isn't), and let it be improved in-place. The trac wiki should be for developers to work on pure development things, and it requires an SVN login (well, it doesn't right now, but this should be changed ASAP: spammers WILL show up sooner or later, and they will destroy the wiki. They did it to Enthought's and to IPython's in the past, they will also do it here; it's just a matter of time, and the cleanup later will be more work than running trac-admin now and closing wiki edit permissions for anonymous users). The public wiki is where this content should be: a non-developer can do a perfectly good job of contributing content here, so there's no reason to keep this material in the dev wiki.
Cheers, f From ndarray at mac.com Thu Feb 9 18:03:47 2006 From: ndarray at mac.com (Sasha) Date: Thu Feb 9 18:03:47 2006 Subject: [Numpy-discussion] NumPy Glossary Was:Matlab page on scipy wiki In-Reply-To: <43EBF101.3090401@colorado.edu> References: <43EBF101.3090401@colorado.edu> Message-ID: On 2/9/06, Fernando Perez wrote: > A humble suggestion: move it NOW. Done. See http://scipy.org/NumPyGlossary . From ndarray at mac.com Thu Feb 9 20:36:09 2006 From: ndarray at mac.com (Sasha) Date: Thu Feb 9 20:36:09 2006 Subject: [Numpy-discussion] arange(start, stop, step) and floating point (Ticket #8) In-Reply-To: <43EBB782.1010509@cox.net> References: <43EA6A34.2000202@gmail.com> <43EA7FE7.2040902@cox.net> <43EA8867.5080109@ee.byu.edu> <43EB7FE5.1050000@cox.net> <43EBA615.8020101@ieee.org> <43EBB782.1010509@cox.net> Message-ID: Well, my results are different.

SVN r2087:

> python -m timeit -s "from numpy import arange" "arange(10000.0)"
10000 loops, best of 3: 21.1 usec per loop

SVN r2088:

> python -m timeit -s "from numpy import arange" "arange(10000.0)"
10000 loops, best of 3: 25.6 usec per loop

I am using gcc version 3.3.4 with the following flags: -msse2 -mfpmath=sse -fno-strict-aliasing -DNDEBUG -g -O3. The timing is consistent with the change in the DOUBLE_fill loop:

r2087:
 1b8f0: f2 0f 11 08    movsd %xmm1,(%eax)
 1b8f4: f2 0f 58 ca    addsd %xmm2,%xmm1
 1b8f8: 83 c0 08       add $0x8,%eax
 1b8fb: 39 c8          cmp %ecx,%eax
 1b8fd: 72 f1          jb 1b8f0

r2088:
 1b9d0: f2 0f 2a c2    cvtsi2sd %edx,%xmm0
 1b9d4: 42             inc %edx
 1b9d5: f2 0f 59 c1    mulsd %xmm1,%xmm0
 1b9d9: f2 0f 58 c2    addsd %xmm2,%xmm0
 1b9dd: f2 0f 11 00    movsd %xmm0,(%eax)
 1b9e1: 83 c0 08       add $0x8,%eax
 1b9e4: 39 ca          cmp %ecx,%edx
 1b9e6: 7c e8          jl 1b9d0

The loop was 5 instructions before the change and 8 instructions after. It is possible that the 387 FPU may do addition and multiplication in parallel, and this is why you don't see the difference. Nevertheless, I would like to withdraw my prior objections. I think the code is now more numerically correct and that is worth the slow-down on some platforms. By the way, as I was playing with the code, I've also tried the recommendation of using a[i] instead of *p:

--- numpy/core/src/arraytypes.inc.src   (revision 2088)
+++ numpy/core/src/arraytypes.inc.src   (working copy)
@@ -1652,9 +1652,8 @@
     @typ@ start = buffer[0];
     @typ@ delta = buffer[1];
     delta -= start;
-    buffer += 2;
-    for (i=2; i<length; i++, buffer++) {
-        *buffer = start + i*delta;
-    }
+    for (i=2; i<length; i++) {
+        buffer[i] = start + i*delta;
+    }
 }

This is one instruction less because "add $0x8,%eax" was eliminated, and all the pointer arithmetic and the store (buffer[i] = ...) is now done in a single instruction, "movsd %xmm0,(%edx,%eax,8)". The timing, however, did not change:

> python -m timeit -s "from numpy import arange" "arange(10000.0)"
10000 loops, best of 3: 25.6 usec per loop

My change may be worth committing because the C code is shorter and arguably more understandable (at least by Fortran addicts :-). Travis? On 2/9/06, Tim Hochberg wrote: > # baseline > arange(10000.0) took 4.39404812623 seconds for 100000 reps > # multiply instead of repeated add.
> arange(10000.0) took 4.34652784083 seconds for 100000 reps From jh at oobleck.astro.cornell.edu Thu Feb 9 21:08:01 2006 From: jh at oobleck.astro.cornell.edu (Joe Harrington) Date: Thu Feb 9 21:08:01 2006 Subject: [Numpy-discussion] Re: NumPy Glossary Was:Matlab page on scipy wiki In-Reply-To: <20060210044507.32E368884A@sc8-sf-spam1.sourceforge.net> (numpy-discussion-request@lists.sourceforge.net) References: <20060210044507.32E368884A@sc8-sf-spam1.sourceforge.net> Message-ID: <200602100507.k1A5753M023391@oobleck.astro.cornell.edu> > http://scipy.org/NumPyGlossary In general, such things should be birthed on the Developer_Zone page. This is where the front page directs people to go if they are interested in contributing. We're getting a lot of new interest now, so posting hidden pages on the mailing list will miss new talent. Items can move to Documentation (or wherever) when they are somewhat stable, and continue to grow there. There's now a link for this page under the heading DOCUMENTATION: Projects on the Developer_Zone page. --jh-- From Fernando.Perez at colorado.edu Thu Feb 9 21:12:02 2006 From: Fernando.Perez at colorado.edu (Fernando Perez) Date: Thu Feb 9 21:12:02 2006 Subject: [Numpy-discussion] Re: NumPy Glossary Was:Matlab page on scipy wiki In-Reply-To: <200602100507.k1A5753M023391@oobleck.astro.cornell.edu> References: <20060210044507.32E368884A@sc8-sf-spam1.sourceforge.net> <200602100507.k1A5753M023391@oobleck.astro.cornell.edu> Message-ID: <43EC208E.5000705@colorado.edu> Joe Harrington wrote: >>http://scipy.org/NumPyGlossary > > In general, such things should be birthed on the Developer_Zone page. > This is where the front page directs people to go if they are > interested in contributing. We're getting a lot of new interest now, > so posting hidden pages on the mailing list will miss new talent. > Items can move to Documentation (or wherever) when they are somewhat > stable, and continue to grow there. Why? Did you read the argument I made for putting it on the main wiki? How are you going to get contributions on the dev wiki once anonymous edits are locked out (which they will hopefully be very soon, before the wiki is spammed out of recognition)? The less friction and committee-ness we impose on this whole thing, the better off we'll all be. Let's be _less_ bureaucratic, not more. f From aisaac at american.edu Thu Feb 9 21:18:10 2006 From: aisaac at american.edu (Alan G Isaac) Date: Thu Feb 9 21:18:10 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: <1e2af89e0602090046g55f80ecdi26b24afc5dbe8a1d@mail.gmail.com><43EB34F8.6030006@bigpond.net.au> Message-ID: On Fri, 10 Feb 2006, Vidar Gundersen apparently wrote: > i guess once a GPL/GFDL has first been used it always has to stay that way? If you own the copyright, you can license it any way you want at any time. You have already licensed it under the GFDL, but you can license it other ways as well. > i also considered CC, but didn't want to spend a lot of > time digging into legal stuff: i wanted to make the > reference available and reusable to anyone. If that is really the goal, then just include a statement placing it in the public domain. E.g., Copyright: This document has been placed in the public domain. If you want attribution, use an attribution license: http://creativecommons.org/licenses/by/2.5/ (Be sure to say what you want as attribution.) Cheers, Alan Isaac PS IANAL!
From jh at oobleck.astro.cornell.edu Thu Feb 9 21:32:01 2006 From: jh at oobleck.astro.cornell.edu (Joe Harrington) Date: Thu Feb 9 21:32:01 2006 Subject: [Numpy-discussion] Re: NumPy Glossary Was:Matlab page on scipy wiki In-Reply-To: <43EC208E.5000705@colorado.edu> (message from Fernando Perez on Thu, 09 Feb 2006 22:11:42 -0700) References: <20060210044507.32E368884A@sc8-sf-spam1.sourceforge.net> <200602100507.k1A5753M023391@oobleck.astro.cornell.edu> <43EC208E.5000705@colorado.edu> Message-ID: <200602100531.k1A5VFjs023566@oobleck.astro.cornell.edu> >>>http://scipy.org/NumPyGlossary >> >> >> In general, such things should be birthed on the Developer_Zone page. >> This is where the front page directs people to go if they are >> interested in contributing. We're getting a lot of new interest now, >> so posting hidden pages on the mailing list will miss new talent. >> Items can move to Documentation (or wherever) when they are somewhat >> stable, and continue to grow there. >Why? Did you read the argument I made for putting it on the main wiki? How are >you going to get contributions on the dev wiki once anonymous edits are locked >out (which they will hopefully be very soon, before the wiki is spammed out of >recognition)? Fernando, that page *is* on the main wiki (I don't deal with the developers' wiki at all). Go to scipy.org, click on Developer Zone in the navigation tabs, scroll down to DOCUMENTATION: Projects. There are two reasons to put it there. First, there are now many people who are looking for projects to do. This is where we can list stuff we want to call attention to as needing work. Once someone is happy with it, they can link it from the Documentation page as well, but it should also stay in Developer Zone until it's mature enough that we'd rather people spent their time on other projects. This is the "work on me first" page. Second, it might not belong on the Documentation page until it gets at least a little review for scope, correctness, and readability. Remember that too many stubs and apologies for being under construction will turn people away. > The less friction and committee-ness we impose on this whole thing, the better > of we'll all be. Let's be _less_ bureaucratic, not more. It just takes one happy person (you?) to link it under Documentation (and one unhappy person to take it off, but that won't be me, in this case). --jh-- From Fernando.Perez at colorado.edu Thu Feb 9 21:40:00 2006 From: Fernando.Perez at colorado.edu (Fernando Perez) Date: Thu Feb 9 21:40:00 2006 Subject: [Numpy-discussion] Re: NumPy Glossary Was:Matlab page on scipy wiki In-Reply-To: <200602100531.k1A5VFjs023566@oobleck.astro.cornell.edu> References: <20060210044507.32E368884A@sc8-sf-spam1.sourceforge.net> <200602100507.k1A5753M023391@oobleck.astro.cornell.edu> <43EC208E.5000705@colorado.edu> <200602100531.k1A5VFjs023566@oobleck.astro.cornell.edu> Message-ID: <43EC26F1.2050101@colorado.edu> Joe Harrington wrote: >>Why? Did you read the argument I made for putting it on the main wiki? How are >>you going to get contributions on the dev wiki once anonymous edits are locked >>out (which they will hopefully be very soon, before the wiki is spammed out of >>recognition)? > > > Fernando, that page *is* on the main wiki (I don't deal with the > developers' wiki at all). Go to scipy.org, click on Developer Zone in > the navigation tabs, scroll down to DOCUMENTATION: Projects. 
I misunderstood something: I thought you wanted it moved over to the dev wiki, which is the first link on the DeveloperZone page. I read the DeveloperZone page (on the main wiki) and thought you wanted the glossary moved over to the pages linked there, and since the first ones are for the Trac wiki, I (mis)understood you wanted the glossary pushed over there. Sorry for the confusion. Cheers, f From mithrandir42 at web.de Thu Feb 9 22:02:02 2006 From: mithrandir42 at web.de (N. Volbers) Date: Thu Feb 9 22:02:02 2006 Subject: [Numpy-discussion] Using ndarray for 2-dimensional, heterogeneous data Message-ID: <43EC2C21.9020509@web.de> Hello everyone, I am re-thinking the design of my evaluation software, but I am not quite sure if I am making the right decision, so let me state my problem: I am writing a simple evaluation program to read scientific (ASCII) data and plot it both via gnuplot and matplotlib. The data is typically very simple: numbers arranged in columns. Before numpy I was using Numeric arrays to store this data in a list of 1-dimensional arrays, e.g.:

a = [ array([1,2,3,4]), array([2.3,17.2,19.1,22.2]) ]

This layout made it very easy to add, remove or rearrange columns, because these were simple list operations. It also had the nice effect of allowing different data types for different columns. However, row access was hard and I had to use my own iterator object to do so. When I read about heterogeneous arrays in numpy I started a new implementation which would store the same data as above like this:

b = numpy.array( [(1,2,3,4), (2.3,17.2,19.1,22.2)],
                 dtype={'names':['col1','col2'], 'formats': ['i2','f4']})

Row operations are much easier now, because I can use numpy's intrinsic capabilities. However, column operations require creating a new array based on the old one. Now I am wondering if the use of such an array has drawbacks that I am not aware of. E.g. is it possible to mask values in such an array? And is it slower to get a certain column by using b['col1'] than it would be using a homogeneous array c and the notation c[:,0]? Does anyone else use such a data layout and can report on problems with it? Best regards, Niklas Volbers. From mithrandir42 at web.de Thu Feb 9 22:12:02 2006 From: mithrandir42 at web.de (N. Volbers) Date: Thu Feb 9 22:12:02 2006 Subject: [Numpy-discussion] Using ndarray for 2-dimensional, heterogeneous data In-Reply-To: <43EC2C21.9020509@web.de> References: <43EC2C21.9020509@web.de> Message-ID: <43EC2E8A.2040200@web.de> N. Volbers wrote: > Hello everyone, > > I am re-thinking the design of my evaluation software, but I am not > quite sure if I am making the right decision, so let me state my problem: > > I am writing a simple evaluation program to read scientific (ASCII) > data and plot it both via gnuplot and matplotlib. The data is > typically very simple: numbers arranged in columns. Before numpy I was > using Numeric arrays to store this data in a list of 1-dimensional > arrays, e.g.: > > a = [ array([1,2,3,4]), array([2.3,17.2,19.1,22.2]) ] > > This layout made it very easy to add, remove or rearrange columns, > because these were simple list operations. It also had the nice effect > of allowing different data types for different columns. However, row > access was hard and I had to use my own iterator object to do so.
> When I read about heterogeneous arrays in numpy I started a new > implementation which would store the same data as above like this: > > b = numpy.array( [(1,2,3,4), (2.3,17.2,19.1,22.2)], > dtype={'names':['col1','col2'], 'formats': ['i2','f4']}) Sorry, I meant of course

b = numpy.array( [(1,2.3), (2, 17.2), (3, 19.1), (4, 22.2)],
                 dtype={'names':['col1','col2'], 'formats': ['i2','f4']})

> Row operations are much easier now, because I can use numpy's > intrinsic capabilities. However, column operations require creating a > new array based on the old one. > > Now I am wondering if the use of such an array has drawbacks that > I am not aware of. E.g. is it possible to mask values in such an array? > > And is it slower to get a certain column by using b['col1'] than it > would be using a homogeneous array c and the notation c[:,0]? > > Does anyone else use such a data layout and can report on problems > with it? The mathematical operations I want to use will be limited to operations acting on the columns, e.g. creating a new column = b['col1'] + b['col2'] and such. So of course I am aware of the basic difference that slicing works differently if I have a heterogeneous array, due to the fact that each row is considered a single item. Niklas. From oliphant.travis at ieee.org Thu Feb 9 22:28:02 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Feb 9 22:28:02 2006 Subject: [Numpy-discussion] arange(start, stop, step) and floating point (Ticket #8) In-Reply-To: References: <43EA6A34.2000202@gmail.com> <43EA7FE7.2040902@cox.net> <43EA8867.5080109@ee.byu.edu> <43EB7FE5.1050000@cox.net> <43EBA615.8020101@ieee.org> <43EBB782.1010509@cox.net> Message-ID: <43EC3261.2060601@ieee.org> Sasha wrote: >Well, my results are different. > >SVN r2087: > >>python -m timeit -s "from numpy import arange" "arange(10000.0)" >10000 loops, best of 3: 21.1 usec per loop > >SVN r2088: > >>python -m timeit -s "from numpy import arange" "arange(10000.0)" >10000 loops, best of 3: 25.6 usec per loop > >I am using gcc version 3.3.4 with the following flags: -msse2 >-mfpmath=sse -fno-strict-aliasing -DNDEBUG -g -O3. > >The timing is consistent with the change in the DOUBLE_fill loop: >
>r2087:
> 1b8f0: f2 0f 11 08    movsd %xmm1,(%eax)
> 1b8f4: f2 0f 58 ca    addsd %xmm2,%xmm1
> 1b8f8: 83 c0 08       add $0x8,%eax
> 1b8fb: 39 c8          cmp %ecx,%eax
> 1b8fd: 72 f1          jb 1b8f0
>
>r2088:
> 1b9d0: f2 0f 2a c2    cvtsi2sd %edx,%xmm0
> 1b9d4: 42             inc %edx
> 1b9d5: f2 0f 59 c1    mulsd %xmm1,%xmm0
> 1b9d9: f2 0f 58 c2    addsd %xmm2,%xmm0
> 1b9dd: f2 0f 11 00    movsd %xmm0,(%eax)
> 1b9e1: 83 c0 08       add $0x8,%eax
> 1b9e4: 39 ca          cmp %ecx,%edx
> 1b9e6: 7c e8          jl 1b9d0

Nice to see some real hacking on this list :-) >My change may be worth committing because the C code is shorter and >arguably more understandable (at least by Fortran addicts :-). >Travis? Yes, I think it's worth submitting. Most of the suggestions for pointer arithmetic for fast C code were developed when processors spent more time computing than fetching memory. Now it seems it's all about fetching memory intelligently. The buffer[i]= style is even recommended according to the AMD optimization book Sasha pointed out. So, I say go ahead unless somebody can point out something we are missing... -Travis From pearu at scipy.org Thu Feb 9 23:15:02 2006 From: pearu at scipy.org (Pearu Peterson) Date: Thu Feb 9 23:15:02 2006 Subject: [Numpy-discussion] Is there a known problem compiling numpy with VC7?
Success In-Reply-To: <43EBE721.6000602@cox.net> References: <43E41E52.6060805@cox.net> <43E42B26.2090304@ee.byu.edu> <43E42C2C.6000208@cox.net> <43E45A95.6040601@ieee.org> <43E81650.2040204@cox.net> <43E81BFA.7060600@ieee.org> <43E82930.7070103@cox.net> <43E83FD3.90206@ieee.org> <43E8EE72.6070101@cox.net> <43E8EFE9.5040207@ieee.org> <43E9235E.70004@cox.net> <43E94090.5080609@ieee.org> <43EA3DE0.1070608@cox.net> <43EB695D.9050501@ieee.org> <43EB6DF5.6010705@cox.net> <43EB7BE3.70706@ieee.org> <43EB8AFE.2060704@cox.net> <43EBC5DF.9090709@cox.net> <43EBE721.6000602@cox.net> Message-ID: On Thu, 9 Feb 2006, Tim Hochberg wrote: > Tim Hochberg wrote: > >> Pearu Peterson wrote: >> >>> >>> >>> On Thu, 9 Feb 2006, Tim Hochberg wrote: >>> >>>> I had this fantasy that default_lib_dirs would get picked up >>>> automagically; however that does not happen. I still ended up putting: >>>> >>>> from numpy.distutils import system_info >>>> library_dirs = system_info.default_lib_dirs >>>> result = config_cmd.try_run(tc,include_dirs=[python_include], >>>> library_dirs=library_dirs) >>>> >>>> into setup.py. Is that acceptable? It's not very elegant. >>> >>> No, don't use system_info.default_lib_dirs. >>> >>> Use distutils.sysconfig.get_python_lib() to get the directory that >>> contains Python library. >> >> That's the wrong library. Get_python_lib gives you the location of the >> python standard library, not the location of python24.lib. The former being >> python24/Lib (or python24/Lib/site-packages depending what options you feed >> get_python_lib) and the latter being python24/libs on my box. Ok, but using system_info.default_lib_dirs is still wrong; this list is not designed for this purpose. > To follow up on this a little bit, I investigated how distutils itself finds > python24.lib. It turns out that it is in build_ext.py, near line 168. The > relevant code is: > > # also Python's library directory must be appended to library_dirs > if os.name == 'nt': > self.library_dirs.append(os.path.join(sys.exec_prefix, 'libs')) Hmm, this should be effective also for numpy.distutils. self.library_dirs and other such attributes are used in the distutils.command.build_ext.run() method, while our numpy.distutils.command.build_ext.run() doesn't use them. So, what we have to do is to update the numpy.distutils.command.build_ext.run() method to resolve this issue. This should also fix the rpath issues that were reported on this list for certain platforms. I'll look at fixing it today. Pearu From wbaxter at gmail.com Thu Feb 9 23:22:03 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Thu Feb 9 23:22:03 2006 Subject: [Numpy-discussion] Re: NumPy Glossary Was:Matlab page on scipy wiki In-Reply-To: <43EC26F1.2050101@colorado.edu> References: <20060210044507.32E368884A@sc8-sf-spam1.sourceforge.net> <200602100507.k1A5753M023391@oobleck.astro.cornell.edu> <43EC208E.5000705@colorado.edu> <200602100531.k1A5VFjs023566@oobleck.astro.cornell.edu> <43EC26F1.2050101@colorado.edu> Message-ID: Well, I'm with Fernando. Wikis are meant for having people muck with them. I am much more annoyed, for instance, by the "NumPy Tutorial" teaser link on the main documentation page that goes nowhere than I would be by a half-finished page that acknowledges it's half finished. If I'd known I was supposed to go to some Dev Zone and put in a provisional page or whatnot I probably wouldn't have made the NumPy for Matlab users page. Which is still half finished, thank-you-very-much, but nonetheless still contains useful information.
More useful information than the "NumPy Tutorial" link at least. :-) --bb On 2/10/06, Fernando Perez wrote: > Joe Harrington wrote: > > >>Why? Did you read the argument I made for putting it on the main wiki? How are > >>you going to get contributions on the dev wiki once anonymous edits are locked > >>out (which they will hopefully be very soon, before the wiki is spammed out of > >>recognition)? > > > > Fernando, that page *is* on the main wiki (I don't deal with the > > developers' wiki at all). Go to scipy.org, click on Developer Zone in > > the navigation tabs, scroll down to DOCUMENTATION: Projects. > > I misunderstood something: I thought you wanted it moved over to the dev wiki, > which is the first link on the DeveloperZone page. I read the DeveloperZone > page (on the main wiki) and thought you wanted the glossary moved over to the > pages linked there, and since the first ones are for the Trac wiki, I > (mis)understood you wanted the glossary pushed over there. Sorry for the > confusion. > > Cheers, > > f -- William V. Baxter III OLM Digital Kono Dens Building Rm 302 1-8-8 Wakabayashi Setagaya-ku Tokyo, Japan 154-0023 +81 (3) 3422-3380 From faltet at carabos.com Thu Feb 9 23:40:03 2006 From: faltet at carabos.com (Francesc Altet) Date: Thu Feb 9 23:40:03 2006 Subject: [Numpy-discussion] Using ndarray for 2-dimensional, heterogeneous data In-Reply-To: <43EC2E8A.2040200@web.de> References: <43EC2C21.9020509@web.de> <43EC2E8A.2040200@web.de> Message-ID: <1139557152.7537.9.camel@localhost.localdomain> On Friday 10 February 2006 at 06:11 +0000, N. Volbers wrote: > N. Volbers wrote: > Sorry, I meant of course > > b = numpy.array( [(1,2.3), (2, 17.2), (3, 19.1), (4, 22.2)], > dtype={'names':['col1','col2'], 'formats': ['i2','f4']}) > > > Row operations are much easier now, because I can use numpy's > > intrinsic capabilities. However, column operations require creating a > > new array based on the old one. Yes, but this should be a pretty fast operation, as no data copy is implied in doing b['col1'], for example. > > Now I am wondering if the use of such an array has drawbacks that > > I am not aware of. E.g. is it possible to mask values in such an array? I'm not familiar with masked arrays, but my understanding is that such column arrays are the same as regular arrays, so I'd say yes. > > And is it slower to get a certain column by using b['col1'] than it > > would be using a homogeneous array c and the notation c[:,0]? Well, you should do some benchmarks, but I'd be surprised if there is a big speed difference. > > Does anyone else use such a data layout and can report on problems > > with it? I use column data *a lot* in numarray and have had no problems with this. With NumPy things should be similar in terms of stability. > The mathematical operations I want to use will be limited to operations > acting on the columns, e.g.
creating a new column = b['col1'] + b['col2'] > and such. So of course I am aware of the basic difference that slicing > works differently if I have a heterogeneous array, due to the fact that > each row is considered a single item. Exactly, these array columns are the same as regular homogeneous arrays. The only difference is that there is a 'hole' between elements. However, this is handled internally by NumPy through the use of the stride. My 2 cents, -- >0,0< Francesc Altet http://www.carabos.com/ V V Cárabos Coop. V. Enjoy Data "-"
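A minimal sketch of the column-access pattern discussed above, using the example's field names and values; the stride comment assumes the packed 'i2,f4' record layout (2 + 4 = 6 bytes per record):

#----------------------------
import numpy

b = numpy.array([(1, 2.3), (2, 17.2), (3, 19.1), (4, 22.2)],
                dtype={'names': ['col1', 'col2'], 'formats': ['i2', 'f4']})

col1 = b['col1']      # a view onto the field -- no data copy
print col1            # [1 2 3 4]
print col1.strides    # (6,): each record is 2 + 4 = 6 bytes, so the
                      # stride steps over the f4 'hole' after each i2

print b['col1'] + b['col2']   # column arithmetic on the views,
                              # approximately [ 3.3 19.2 22.1 26.2]
#----------------------------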
From pearu at scipy.org Fri Feb 10 03:10:03 2006 From: pearu at scipy.org (Pearu Peterson) Date: Fri Feb 10 03:10:03 2006 Subject: [Numpy-discussion] BUG: numpy.put raises TypeError. Message-ID: Hi, There seems to be a bug in numpy.put function (i.e. array.put method). Consider the following example: a = array([0, 0, 0, 0, 0]) a.put([1.1],[2]) # works as documented a.put(array([1.1]),[2]) # raises the following exception: TypeError: array cannot be safely cast to required type The bug seems to boil down to calling PyArray_FromArray function but I got a bit lost on debugging this issue.. Pearu From pearu at scipy.org Fri Feb 10 03:30:06 2006 From: pearu at scipy.org (Pearu Peterson) Date: Fri Feb 10 03:30:06 2006 Subject: [Numpy-discussion] BUG(?): array([None])==None test Message-ID: Hi, While converting some Numeric based code to numpy, I noticed that in Numeric array([None])==None returns array([1]) while in numpy it returns False. Is this expected behaviour or a bug? Pearu From svetosch at gmx.net Fri Feb 10 03:36:02 2006 From: svetosch at gmx.net (Sven Schreiber) Date: Fri Feb 10 03:36:02 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: <43EBEBD8.4050705@colorado.edu> References: <43EBDF88.6040504@colorado.edu> <43EBEBD8.4050705@colorado.edu> Message-ID: <43EC7A6E.2020702@gmx.net> Fernando Perez schrieb: > Bill Baxter wrote: >> For what it's worth, matlab's rank function just calls svd, and >> returns the >> number of singular values greater than a tolerance. The implementation is a >> whopping 5 lines long. > > Yup, and it would be pretty much the same 5 lines in numpy, with the > same semantics. > > Here's a quick and dirty implementation for old-scipy (I don't have > new-scipy on this box): > Is there any reason not to use the algorithm implicit in lstsq, as in: rk = linalg.lstsq(M, ones(p))[2] (where M is the matrix to check, and p==M.shape[0]) thanks, sven From stefan at sun.ac.za Fri Feb 10 04:24:01 2006 From: stefan at sun.ac.za (Stefan van der Walt) Date: Fri Feb 10 04:24:01 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: Message-ID: <20060210122242.GA21950@sun.ac.za> On Thu, Feb 09, 2006 at 05:21:10PM +0900, Bill Baxter wrote: > I added some content to the "NumPy/SciPy for Matlab users" page on the scipy > wiki. > > But my knowledge of NumPy/SciPy isn't sufficient to fill in the whole chart of > equivalents that I laid out. One of my colleagues also asked about the shortest way to do array concatenation. In Octave that would be [1, 0, 1:4, 0, 1] Using numpy we currently do concatenate([[1, 0], arange(1,5), [0, 1]]) or vstack(...)
The "+" operator now means something else, so you can't do [1,0] + arange(1,5) + [0,1], while [1, 0, arange(1,5), 0, 1] produces [1, 0, array([1, 2, 3, 4]), 0, 1] which can't be converted to an array by simply doing array([[1, 0, array([1, 2, 3, 4]), 0, 1]]) I'll add it to the wiki once I know what the best method is. Regards St?fan From aisaac at american.edu Fri Feb 10 06:20:01 2006 From: aisaac at american.edu (Alan G Isaac) Date: Fri Feb 10 06:20:01 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: <20060210122242.GA21950@sun.ac.za> References: <20060210122242.GA21950@sun.ac.za> Message-ID: On Fri, 10 Feb 2006, Stefan van der Walt apparently wrote: > In Octave that would be > [1, 0, 1:4, 0, 1] > Using numpy we currently do > concatenate([[1, 0], arange(1,5), [0, 1]]) or > vstack(...) numpy.r_[1,0,range(1,5),0,1] fwiw, Alan Isaac From cjw at sympatico.ca Fri Feb 10 07:14:07 2006 From: cjw at sympatico.ca (Colin J. Williams) Date: Fri Feb 10 07:14:07 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: <20060210122242.GA21950@sun.ac.za> Message-ID: <43ECAD98.60802@sympatico.ca> Alan G Isaac wrote: >On Fri, 10 Feb 2006, Stefan van der Walt apparently wrote: > > >>In Octave that would be >>[1, 0, 1:4, 0, 1] >>Using numpy we currently do >>concatenate([[1, 0], arange(1,5), [0, 1]]) or >>vstack(...) >> >> > >numpy.r_[1,0,range(1,5),0,1] > >fwiw, >Alan Isaac > > This seems to be a neat idea but not in the usual Python style. >>> help(numpy.r_) Help on concatenator in module numpy.lib.index_tricks object: class concatenator(__builtin__.object) | Translates slice objects to concatenation along an axis. | | Methods defined here: | | __getitem__(self, key) | | __getslice__(self, i, j) | | __init__(self, axis=0, matrix=False) | | __len__(self) | | ---------------------------------------------------------------------- | Data and other attributes defined here: | | __dict__ = | dictionary for instance variables (if defined) | | __weakref__ = | list of weak references to the object (if defined) The help refers to concatenator, presumably r_ is a synonym, but that name is not available to the user: >>> numpy.concatenator Traceback (most recent call last): File "", line 1, in ? AttributeError: 'module' object has no attribute 'concatenator' >>> If r_ is a class, couldn't it have have a more mnemonic name and, in the usual Python style, start with an upper case letter? help(numpy.r_.__init__) Help on method __init__ in module numpy.lib.index_tricks: __init__(self, axis=0, matrix=False) unbound numpy.lib.index_tricks.concatenator method >>> Colin W. From jh at oobleck.astro.cornell.edu Fri Feb 10 08:17:02 2006 From: jh at oobleck.astro.cornell.edu (Joe Harrington) Date: Fri Feb 10 08:17:02 2006 Subject: [Numpy-discussion] Re: NumPy Glossary Was:Matlab page on scipy wiki In-Reply-To: <20060210072301.9608D12D10@sc8-sf-spam2.sourceforge.net> (numpy-discussion-request@lists.sourceforge.net) References: <20060210072301.9608D12D10@sc8-sf-spam2.sourceforge.net> Message-ID: <200602101616.k1AGG0Vj026623@oobleck.astro.cornell.edu> Look, people are making this into something it isn't, and hating that. I'm the last person in the world to want rules and bureaucrazy! However, to make the wiki accomplish our goals, we need to agree on some basic standards of judgement that we apply to ourselves. 
We did agree on some goals after SciPy '04: we want Python to be the environment of choice for EVERYONE to do numerical manipulation and data display, and to get there we need to present ourselves well (see the ASP roadmap, linked at the bottom of DevZone). I'm going to lay out my reasoning why a particular workflow will reach that goal. The idea is to take full advantage of the Wiki Way without having the liabilities of the Wiki Way. I'm sorry this is long. I have three proposals due Thursday and I don't have time to edit much. Bill, your argument is that you want to see a work in progress, something you and anyone else can just go and contribute to whenever you see a need, and that benefits from others making contributions continually. That's great from the point of view of an experienced user of array math (in Python or otherwise) who is used to and interested in contributing in the Wiki Way. I'm sure most list members, even of the scipy-user list, are still in this category. It is therefore crucial for all of us to remember that such people are *not* the main viewers of the site, at least not in the future we said we hoped for, and probably not even now. Most viewers are dyed-in-the-wool users and potential users. They want to see a clean, professional, straightforward site with the simple life laid out before them: bulletproof, current, binary installs for all their preferred platform(s); readable, grammatical, complete, current, well-assembled documentation for beginners and experts, both tutorial and reference; examples, screenshots, demos; lots of good, indexed, well-documented, shared software; and an active community. "Under construction" is an annoyance at best, and a deal-killer to many. They may contribute someday, but that's not in their minds yet. Recall that we have in our vision not just practicing scientists, but also secondary-school students and their teachers, students taking their only math or science course in college (under direction from their professors and TAs), and even photographers and others who would have need of the manipulations NumPy allows without necessarily understanding them. Our failure to gather a large following to date is largely due to our not (yet) delivering on the site/project vision above. The reunification that is NumPy allows us to change that. There are a LOT of people lurking, waiting for things to clean up and professionalize. Once they jump in, they'll tell their peers, who have never heard of us, and so on. The pure Wiki Way will never produce the site that will start this hoped-for avalanche of ordinary users. Wikis are always under construction, always bleeding-edge, always overdone in areas where their developers have interest, and always weak in crucial places where they don't. That said, wikis are *great* incubators. Wikis get individual subprojects completed fast and well by breaking down the barriers to contribution and review. DevZone tries to take advantage of the Wiki Way while still producing a polished site. The point of DevZone is twofold: first, focus contributors looking for a project on the things that most need help: our weak spots, like documentation. I'm getting about an email a week from new people looking to help. This wasn't the case a month ago or before; it was more like 1-2 a year. Second, isolate the worst of "under construction" from the general view, so we don't look like a pile of stubs, go-nowhere links, and abandoned or outdated projects.
The second item probably makes more sense for larger projects (a user's guide, etc.) than for pages like the glossary. The model for workflow is that projects are announced to the lists and immediately started in DevZone. When they reach a basic level of completeness and cleanliness, something like a 1.0 release, they get a link from the main site. When they are no longer in need of new helpers or when their development curve levels off, the DevZone link goes away. Right now, it's pretty loose when to put the link in from the main site to a project's page. Anyone can do it. To avoid a Wiki War, we need to have some common vision for how much of a construction zone we want the main site to be. The development model I'm discussing is a little like the Plone model, except that there is no elite administrator who decides when things move up to the main site. We dumped Plone for performance reasons, and because the administrators were too busy with their Enthought work to have time to do much on the site. But, the basic Plone idea of internal incubation is a good one. In my ideal world, a group (potentially including everyone, at least to a small degree) would write a (large) doc and periodically ask the list to take a look and comment. At some point, they'd ask for objections to putting it on the main site. They'd try to satisfy any objections, and then put it up. I'd trust the authors of a smaller doc (that therefore was both easier to write and had more people willing to give a quick review) to make the decision to promote to the main site themselves. This is exactly what code developers do when cutting releases: ensure that when a version goes public, it is reasonably clean, consistent, and complete. At the moment, some pages, most notably Documentation, are a mess. Clear out all the "incompletes", and those pages will look like Cookbook: cool but with gaping holes. I think that would be an improvement, particularly if we state on those pages that documents in early stages of construction are in DevZone, and provide a link. We could link those docs at the bottom of Documentation, but there is a point in a project's life cycle when it would be linked both from the main site and in DevZone. Do we want those projects to have two links on the same page? And do we really want all that "under construction" in people's faces? In the future, we will have good docs and a more spanning set of recipes. At that point, if we have embraced the pure Wiki Way, we will have a hard time agreeing to stop doing early construction on the main site. The loads of construction between the gems will turn away many of the huge class of non-expert users. SciPy will thus gain fewer users, and therefore attract fewer contributors and grow more slowly. My point now is to get our community culture to include a sense of professionalism and pride about what we present to the world. Unless you're a fool or you have no competition, you dress well for a job interview. We're not fools, and we have very healthy competition. The main site is the first impression we make on new users. My goal is to prevent it from being the last. --jh-- From cjw at sympatico.ca Fri Feb 10 08:26:06 2006 From: cjw at sympatico.ca (Colin J. Williams) Date: Fri Feb 10 08:26:06 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: Message-ID: <43EC96F5.6020500@sympatico.ca> David M.
Cooke wrote: >Bill Baxter writes: > > > >>Some kind soul added 'svd' and 'inv' to the NumPy/SciPi columns, but those >>don't seem to be defined, at least for the versions of NumPy/SciPy that I >>have. Are they new? Or are they perhaps defined by a 3rd package in your >>environment? >> >> > >They're in numpy.linalg. > > > >>By the way, is there any python way to tell which package a symbol is coming >>from? >> >> > >Check its __module__ attribute. > > > Yes, but not all objects have this attribute and some do not yet have a docstring. Colin W. From bblais at bryant.edu Fri Feb 10 08:40:04 2006 From: bblais at bryant.edu (Brian Blais) Date: Fri Feb 10 08:40:04 2006 Subject: [Numpy-discussion] gnuplot problem with numpy Message-ID: <43ECC154.1000004@bryant.edu> Hello, I have been trying to use the Gnuplot1.7.py module, but it doesn't seem to work with numpy (although it works with Numeric). The following code plots two "identical" sets of data, but the numpy data gets rounded to the nearest integer when passed to Gnuplot. What is odd is that the offending code in utils.py is the function float_array(m), which does the conversion that I do in this script, but it doesn't seem to work. Any ideas? #---------------------------- import numpy import Numeric import Gnuplot g = Gnuplot.Gnuplot(debug=1) dh=.1; x=numpy.arange(dh,2+dh,dh,'d') y1 = x**2 y2=y1 d1 = Gnuplot.Data(x, y1, title='numpy', with='points') # doesn't work d2 = Gnuplot.Data(Numeric.asarray(x,'f'), Numeric.asarray(y2,'f'), title='Numeric', with='points') # works g.plot(d1,d2) #---------------------------- thanks, bb -- ----------------- bblais at bryant.edu http://web.bryant.edu/~bblais From Fernando.Perez at colorado.edu Fri Feb 10 09:15:02 2006 From: Fernando.Perez at colorado.edu (Fernando Perez) Date: Fri Feb 10 09:15:02 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: <43EC7A6E.2020702@gmx.net> References: <43EBDF88.6040504@colorado.edu> <43EBEBD8.4050705@colorado.edu> <43EC7A6E.2020702@gmx.net> Message-ID: <43ECC9D0.8070809@colorado.edu> Sven Schreiber wrote: > Fernando Perez schrieb: > >>Bill Baxter wrote: >> >>>For what it's worth, matlab's rank function just calls svd, and >>>returns the >>>number of singular values greater than a tolerance. The implementation is a >>>whopping 5 lines long. >> >>Yup, and it would be pretty much the same 5 lines in numpy, with the >>same semantics. >> >>Here's a quick and dirty implementation for old-scipy (I don't have >>new-scipy on this box): >> > > > Is there any reason not to use the algorithm implicit in lstsq, as in: > rk = linalg.lstsq(M, ones(p))[2] Simplicity? lstsq goes through a lot of contortions (needed for other reasons), and uses lapack's *gelss. If you read its man page: PURPOSE DGELSS computes the minimum norm solution to a real linear least squares problem: Minimize 2-norm(| b - A*x |). using the singular value decomposition (SVD) of A. A is an M-by-N matrix which may be rank-deficient. Several right hand side vectors b and solution vectors x can be handled in a single call; they are stored as the columns of the M-by-NRHS right hand side matrix B and the N-by-NRHS solution matrix X. The effective rank of A is determined by treating as zero those singular values which are less than RCOND times the largest singular value. So you've gone through all that extra complexity, to get back what a direct call to svd would give you (caveat: the quick version I posted used absolute tolerance, while this one is relative; that can be trivially fixed).
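For reference, a minimal sketch of the direct-SVD rank estimate being discussed here (this is not Fernando's posted code; it assumes numpy.linalg.svd accepts compute_uv=False, and it uses a relative tolerance, mirroring the RCOND criterion quoted above):

#----------------------------
import numpy
import numpy.linalg as la

def rank(A, rtol=1e-10):
    # count the singular values greater than rtol times the largest one
    s = la.svd(A, compute_uv=False)
    return int((s > rtol * s.max()).sum())

print rank(numpy.array([[1., 2.], [2., 4.], [0., 1.]]))  # 2
print rank(numpy.array([[1., 2.], [2., 4.]]))            # 1 (rows proportional)
#----------------------------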
Given that a direct SVD call fits the definition of what we are computing (a numerical estimation of a matrix rank), I completely fail to see the point of going through several additional layers of unnecessary complexity, which both add cost and obscure the intent of the calculation. But perhaps I'm missing something... Cheers, f From tim.hochberg at cox.net Fri Feb 10 09:41:12 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Fri Feb 10 09:41:12 2006 Subject: [Numpy-discussion] Is there a known problem compiling numpy with VC7? Success In-Reply-To: References: <43E41E52.6060805@cox.net> <43E42B26.2090304@ee.byu.edu> <43E42C2C.6000208@cox.net> <43E45A95.6040601@ieee.org> <43E81650.2040204@cox.net> <43E81BFA.7060600@ieee.org> <43E82930.7070103@cox.net> <43E83FD3.90206@ieee.org> <43E8EE72.6070101@cox.net> <43E8EFE9.5040207@ieee.org> <43E9235E.70004@cox.net> <43E94090.5080609@ieee.org> <43EA3DE0.1070608@cox.net> <43EB695D.9050501@ieee.org> <43EB6DF5.6010705@cox.net> <43EB7BE3.70706@ieee.org> <43EB8AFE.2060704@cox.net> <43EBC5DF.9090709@cox.net> <43EBE721.6000602@cox.net> Message-ID: <43ECCFFF.2000209@cox.net> Pearu Peterson wrote: > Ok, but using system_info.default_lib_dirs is still wrong; this list > is not designed for this purpose. OK. > >> To follow up on this a little bit, I investigated how distutils >> itself finds python24.lib. It turns out that it is in build_ext.py, >> near line 168. The relevant code is: >> >> # also Python's library directory must be appended to library_dirs >> if os.name == 'nt': >> self.library_dirs.append(os.path.join(sys.exec_prefix, >> 'libs')) > > Hmm, this should also be effective for numpy.distutils. > self.library_dirs and other such attributes are used in the > distutils.command.build_ext.run() method, while our > numpy.distutils.command.build_ext.run() doesn't use them. So, what we have to do > is update the numpy.distutils.command.build_ext.run() method to resolve > this issue. This should also fix the rpath issues that were reported on > this list for certain platforms. I'll look at fixing it today. While you're looking at it, keep in mind that the original failure that I was trying to fix occurs when numpy/core/setup.py calls config_cmd.try_run. I'm not certain, but I suspect that this isn't going to go through numpy.distutils.command.build_ext.
One strategy would be to put a function that returns these extra library directories somewhere appropriate and call it from both numpy.distutils.command.build_ext and numpy/core/setup.py. It could look like: def get_extra_library_dirs(): if os.name == 'nt': return [os.path.join(sys.exec_prefix, 'libs')] else: return [] I'm not sure what would be an appropriate place for it though. -tim > > Pearu > > From tim.hochberg at cox.net Fri Feb 10 11:17:01 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Fri Feb 10 11:17:01 2006 Subject: [Numpy-discussion] Test test_minrelpath.py on unix for me Message-ID: <43ECE677.2040104@cox.net> Could someone try the attached diff on a unixy system? It works under windows, but it's easy to mess up those \/'s. Thanks, -tim -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: fix_minrelpathtests.diff URL: From nicolist at limare.net Fri Feb 10 12:25:04 2006 From: nicolist at limare.net (Nico) Date: Fri Feb 10 12:25:04 2006 Subject: [Numpy-discussion] array shift and |=, copy, and backward operations Message-ID: <43ECF661.1010003@limare.net> Hi! a = array([0,1,0,0]) a[1:] |= a[:-1] gives the unexpected result [0 1 1 1] instead of [0 1 1 0] because python performs the |= on the first cell, then on the second, and so on. I found two ways to get it right, with a copy: b = a.copy() a[1:] |= b[:-1] or working backward: a[-1:1:-1] |= a[-2:0:-1] which is better, in terms of speed (and memory management), for large 3D arrays? -- Nico From nicolist at limare.net Fri Feb 10 12:47:02 2006 From: nicolist at limare.net (Nico) Date: Fri Feb 10 12:47:02 2006 Subject: [Numpy-discussion] array shift and |=, copy, and backward operations In-Reply-To: <43ECF661.1010003@limare.net> References: <43ECF661.1010003@limare.net> Message-ID: <43ECFB86.6030600@limare.net> > a = array([0,1,0,0]) > a[1:] |= a[:-1] > > gives the unexpected result [0 1 1 1] instead of [0 1 1 0] because > python performs the |= on the first cell, then on the second, and so on. > > I found two ways to get it right, with a copy: > b = a.copy() > a[1:] |= b[:-1] > > or working backward: > a[-1:1:-1] |= a[-2:0:-1] I finally noticed that a = array([0,1,0,0]) a[1:] |= a[:-1] | False also works, but I can't figure out why... -- Nico From tim.hochberg at cox.net Fri Feb 10 12:51:02 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Fri Feb 10 12:51:02 2006 Subject: [Numpy-discussion] array shift and |=, copy, and backward operations In-Reply-To: <43ECF661.1010003@limare.net> References: <43ECF661.1010003@limare.net> Message-ID: <43ECFC86.8020603@cox.net> Nico wrote: >Hi! > >a = array([0,1,0,0]) >a[1:] |= a[:-1] > >gives the unexpected result [0 1 1 1] instead of [0 1 1 0] because >python performs the |= on the first cell, then on the second, and so on. > >I found two ways to get it right, with a copy: >b = a.copy() >a[1:] |= b[:-1] > > You could also do: a[1:] = a[1:] | a[:-1] This is nearly the same as the copy version, but uses very slightly less space and is clearer IMO. >or working backward: >a[-1:1:-1] |= a[-2:0:-1] > >which is better, in terms of speed (and memory management), for large 3D >arrays? > > The backwards version will be better in terms of memory usage and almost certainly in terms of speed as well, since it avoids an extra copy and should have better locality of reference (also a consequence of skipping the extra copy). It's a little obscure though. I'd be tempted to do something like: a_rev = a[::-1] a_rev[:-1] |= a_rev[1:] # a holds result.
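A runnable recap of this exchange (a sketch; the first printed result assumes a NumPy of this vintage, where in-place ufuncs do not detect overlapping operands -- much later releases copy overlapping inputs and give [0 1 1 0] even in the first case):

#----------------------------
import numpy

a = numpy.array([0, 1, 0, 0])
a[1:] |= a[:-1]          # overlapping views: freshly written values are
print a                  # read back in -> [0 1 1 1] instead of [0 1 1 0]

a = numpy.array([0, 1, 0, 0])
a[1:] = a[1:] | a[:-1]   # the right-hand side builds a new array first
print a                  # [0 1 1 0]

a = numpy.array([0, 1, 0, 0])
a_rev = a[::-1]          # reversed view: iterating backwards reads each
a_rev[:-1] |= a_rev[1:]  # source element before it can be overwritten
print a                  # [0 1 1 0]
#----------------------------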
Another question / suggestion is: are you using a[0]? This looks like an operation where you may well be throwing away a[0] when you are done anyway. If that is the case, would it work to use: a[:-1] |= a[1:] a = a[:-1] This last will give you the same result as the others except that the first value will be missing. -tim From perry at stsci.edu Fri Feb 10 12:55:01 2006 From: perry at stsci.edu (Perry Greenfield) Date: Fri Feb 10 12:55:01 2006 Subject: [Numpy-discussion] array shift and |=, copy, and backward operations In-Reply-To: <43ECFB86.6030600@limare.net> References: <43ECF661.1010003@limare.net> <43ECFB86.6030600@limare.net> Message-ID: <5f2eeb776ec6e857cc6c92c73f0249bc@stsci.edu> On Feb 10, 2006, at 3:45 PM, Nico wrote: > > I finally noticed that > > a = array([0,1,0,0]) > a[1:] |= a[:-1] | False > > also works, but I can't figure out why... > Because the expression on the right generates a new copy, thus eliminating the problem of overwriting itself. From Chris.Barker at noaa.gov Fri Feb 10 13:34:04 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Fri Feb 10 13:34:04 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: Message-ID: <43ED06A6.7040600@noaa.gov> Bill Baxter wrote: > By the way, is there any python way to tell which package a symbol is coming > from? Yes. Don't use "import *". There is a long tradition of using NumPy this way: from numpy import * But now I always use it this way: import numpy as N (or nx, or whatever short name you want). I like it, because it's always clear where stuff is coming from. numpy's addition of a number of methods for what used to be functions helps make this more convenient too. "Namespaces are one honking great idea -- let's do more of those!" from: http://www.python.org/doc/Humor.html#zen -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Fri Feb 10 13:58:03 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Fri Feb 10 13:58:03 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: <43ECAD98.60802@sympatico.ca> References: <20060210122242.GA21950@sun.ac.za> <43ECAD98.60802@sympatico.ca> Message-ID: <43ED0C5D.6030300@noaa.gov> Colin J. Williams wrote: >> numpy.r_[1,0,range(1,5),0,1] > This seems to be a neat idea but not in the usual Python style. Exactly. Couldn't it at least get a meaningful, but short, name? And is there a way to use it to concatenate along another axis? I couldn't see a way. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From oliphant at ee.byu.edu Fri Feb 10 14:18:01 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Fri Feb 10 14:18:01 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: <20060210122242.GA21950@sun.ac.za> Message-ID: <43ED10D0.1000405@ee.byu.edu> Alan G Isaac wrote: >On Fri, 10 Feb 2006, Stefan van der Walt apparently wrote: > > >>In Octave that would be >>[1, 0, 1:4, 0, 1] >>Using numpy we currently do >>concatenate([[1, 0], arange(1,5), [0, 1]]) or >>vstack(...) >> >> > >numpy.r_[1,0,range(1,5),0,1] > > or even faster numpy.r_[1,0,1:5,0,1] The whole point of r_ is to allow you to use slice notation to build ranges easily.
I wrote it precisely to make it easier to construct arrays in a similar style to the one Matlab allows. -Travis From oliphant at ee.byu.edu Fri Feb 10 14:29:05 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Fri Feb 10 14:29:05 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: <43ED0C5D.6030300@noaa.gov> References: <20060210122242.GA21950@sun.ac.za> <43ECAD98.60802@sympatico.ca> <43ED0C5D.6030300@noaa.gov> Message-ID: <43ED138C.7040301@ee.byu.edu> Christopher Barker wrote: > Colin J. Williams wrote: > >>> numpy.r_[1,0,range(1,5),0,1] >> > >> This seems to be a neat idea but not in the usual Python style. > > > Exactly. Couldn't it at least get a meaningful, but short, name? It is meaningful :-) r_ means row concatenation... (but, it has taken on more functionality than that). What name do you suggest? > > And is there a way to use it to concatenate along another axis? I > couldn't see a way. Yes, add a string at the end with the number of the axis you want to concatenate along. But, you have to have that axis to start with or the result is no different. The default is to concatenate along the last axis. Thus (the ndmin keyword forces the array to have a minimum number of dimensions, with the new axes prepended): a = array([1,2,3],ndmin=2) b = array([1,2,3],ndmin=2) c = r_[a,b,'0'] print c [[1 2 3] [1 2 3]] print r_[a,b,'1'] [[1 2 3 1 2 3]] -Travis From ndarray at mac.com Fri Feb 10 14:31:05 2006 From: ndarray at mac.com (Sasha) Date: Fri Feb 10 14:31:05 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: <43ED10D0.1000405@ee.byu.edu> References: <20060210122242.GA21950@sun.ac.za> <43ED10D0.1000405@ee.byu.edu> Message-ID: On 2/10/06, Travis Oliphant wrote: > The whole point of r_ is to allow you to use slice notation to build > ranges easily. I wrote it precisely to make it easier to construct > arrays in a similar style to the one Matlab allows. Maybe it is just me, but r_ is rather unintuitive. I would expect something like this to be called "c" for "combine" or "concatenate." This is the name used by S+ and R. From R manual: """ c package:base R Documentation Combine Values into a Vector or List ... Examples: c(1,7:9) ... """ From ndarray at mac.com Fri Feb 10 14:50:01 2006 From: ndarray at mac.com (Sasha) Date: Fri Feb 10 14:50:01 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: <20060210122242.GA21950@sun.ac.za> <43ED10D0.1000405@ee.byu.edu> Message-ID: To tell you the truth, I dislike the trailing underscore much more than the choice of letter. In my code I will probably be renaming all these foo_ to delete the underscore; foo_(...) or foo_[...] is way too ugly for my taste. However, I fully admit that it is just a matter of taste and it is trivial to rename things on import in Python. PS: Trailing underscore reminds me of C++ - the language that I happily live without :-) On 2/10/06, Ryan Krauss wrote: > The problem is that c_ at least used to mean "column concatenate" and > concatenate is too long to type. From ndarray at mac.com Fri Feb 10 14:58:02 2006 From: ndarray at mac.com (Sasha) Date: Fri Feb 10 14:58:02 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: <20060210122242.GA21950@sun.ac.za> <43ED10D0.1000405@ee.byu.edu> Message-ID: Actually, what would be wrong with a single letter "c" or "r" for the concatenator? NumPy already has one single-letter global identifier - "e", so it will not be against any naming standard. I don't think either "c" or "r" will conflict with anything in the standard library. I would still prefer "c" because "r" is taken by RPy. On 2/10/06, Sasha wrote: > To tell you the truth, I dislike the trailing underscore much more than the > choice of letter. In my code I will probably be renaming all these > foo_ to delete the underscore; foo_(...) or foo_[...] is way too ugly > for my taste. However, I fully admit that it is just a matter of taste > and it is trivial to rename things on import in Python. From efiring at hawaii.edu Fri Feb 10 14:59:03 2006 From: efiring at hawaii.edu (Eric Firing) Date: Fri Feb 10 14:59:03 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: <43ED138C.7040301@ee.byu.edu> References: <20060210122242.GA21950@sun.ac.za> <43ECAD98.60802@sympatico.ca> <43ED0C5D.6030300@noaa.gov> <43ED138C.7040301@ee.byu.edu> Message-ID: <43ED1A71.1080605@hawaii.edu> Travis Oliphant wrote: > Christopher Barker wrote: > >> Colin J. Williams wrote: >> >>>> numpy.r_[1,0,range(1,5),0,1] >>> >>> >> >>> This seems to be a neat idea but not in the usual Python style. >> >> >> >> Exactly. Couldn't it at least get a meaningful, but short, name? > > > It is meaningful :-) r_ means row concatenation... (but, it has taken > on more functionality than that). What name do you suggest? "cat"? "rcat"? "catr"? "catter"? Eric From ndarray at mac.com Fri Feb 10 15:16:01 2006 From: ndarray at mac.com (Sasha) Date: Fri Feb 10 15:16:01 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: <43ED1A71.1080605@hawaii.edu> References: <20060210122242.GA21950@sun.ac.za> <43ECAD98.60802@sympatico.ca> <43ED0C5D.6030300@noaa.gov> <43ED138C.7040301@ee.byu.edu> <43ED1A71.1080605@hawaii.edu> Message-ID: I would be against any meaningful name because it will look too much like a function and people will be trying to use (...) instead of [...] with it. A single-letter identifier will look more like syntax and the concatenator is really just a clever way to take advantage of Python syntax that recognizes slices inside []. Novices may just think that something like c[1:3,9:20] is an array literal like r"xyz" for raw strings (another argument against "r"!). On 2/10/06, Eric Firing wrote: > "cat"? "rcat"? "catr"? "catter"? > > Eric From ndarray at mac.com Fri Feb 10 16:09:01 2006 From: ndarray at mac.com (Sasha) Date: Fri Feb 10 16:09:01 2006 Subject: [Numpy-discussion] Array literal Message-ID: Recent discussion of the numpy catenator (r_) made me realize that Python syntax allows us to effectively implement an array literal. >>> from numpy import r_ as a >>> a[1:3,5:9] array([1, 2, 5, 6, 7, 8]) >>> a[1, 2, 5, 6, 7, 8] array([1, 2, 5, 6, 7, 8]) One can think of a[1, 2, 5, 6, 7, 8] as an array literal.
To me it looks very "pythonic": [...] already has a meaning of list literal and Python uses a single-letter modifier in string literals to denote raw strings. In other words a[...] is to [...] what r"..." is to "...". The catenator can probably be generalized to cover all use cases of the "array" constructor. For example: a(shape=(2,3))[1:3,5:9] may return array([[1,2,5],[6,7,8]]) a(shape=(2,3))[1] may return ones((2,3)) a(shape=(2,3))[...] may return empty((2,3)) a(shape=(2,3))[1, 2, ...] may return array([[1,2,1],[2,1,2]]) dtype and other array(...) arguments can be passed similarly to shape above. If this syntax proves successful, ndarray repr may be changed to return "a[...]" instead of "array([...])" and thus make new users immediately aware of this way to represent arrays. From cookedm at physics.mcmaster.ca Fri Feb 10 16:12:03 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Fri Feb 10 16:12:03 2006 Subject: [Numpy-discussion] Re: NumPy Glossary Was:Matlab page on scipy wiki In-Reply-To: <200602101616.k1AGG0Vj026623@oobleck.astro.cornell.edu> (Joe Harrington's message of "Fri, 10 Feb 2006 11:16:00 -0500") References: <20060210072301.9608D12D10@sc8-sf-spam2.sourceforge.net> <200602101616.k1AGG0Vj026623@oobleck.astro.cornell.edu> Message-ID: Joe Harrington writes: > [...] Put this on the wiki (seriously). Another thing to look at is the "Producing Open Source Software" book that's been mentioned before (http://producingoss.com/). There's a section on wikis that's useful to keep in mind at http://producingoss.com/html-chunk/index.html -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From oliphant at ee.byu.edu Fri Feb 10 16:52:02 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Fri Feb 10 16:52:02 2006 Subject: [Numpy-discussion] BUG(?): array([None])==None test In-Reply-To: References: Message-ID: <43ED3525.3020502@ee.byu.edu> Pearu Peterson wrote: > > Hi, > > While converting some Numeric based code to numpy, I noticed that > in Numeric > > array([None])==None > > returns array([1]) while in numpy it returns False. > > Is this expected behaviour or a bug? It's expected behavior. If you do an equality test on None then False is returned while True is returned on an inequality test to None. -Travis From gruben at bigpond.net.au Fri Feb 10 17:30:05 2006 From: gruben at bigpond.net.au (Gary Ruben) Date: Fri Feb 10 17:30:05 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: <20060210122242.GA21950@sun.ac.za> <43ED10D0.1000405@ee.byu.edu> Message-ID: <43ED3DDB.4030806@bigpond.net.au> Sasha wrote: > On 2/10/06, Travis Oliphant wrote: >> The whole point of r_ is to allow you to use slice notation to build >> ranges easily. I wrote it precisely to make it easier to construct >> arrays in a similar style to the one Matlab allows. > > Maybe it is just me, but r_ is rather unintuitive. I would expect > something like this to be called "c" for "combine" or "concatenate." > This is the name used by S+ and R. I agree that c or c_ (don't care which) is more intuitive but I can understand why it's ended up as it has. Even v or v_ for 'vector' or a or a_ for 'array' would also make sense to me. I must say that Travis's example numpy.r_[1,0,1:5,0,1] highlights my pet hate with python - that the upper limit on an integer range is non-inclusive. I'm sure the BDFL has some excuse for this silliness.
Gary R From ndarray at mac.com Fri Feb 10 18:37:04 2006 From: ndarray at mac.com (Sasha) Date: Fri Feb 10 18:37:04 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: <43ED3DDB.4030806@bigpond.net.au> References: <20060210122242.GA21950@sun.ac.za> <43ED10D0.1000405@ee.byu.edu> <43ED3DDB.4030806@bigpond.net.au> Message-ID: On 2/10/06, Gary Ruben wrote: >... I must say that Travis's > example numpy.r_[1,0,1:5,0,1] highlights my pet hate with python - that > the upper limit on an integer range is non-inclusive. In this case you must hate that an integer range starts at 0 (I don't think you would want len(range(10)) to be 11). If this is the case, I don't blame you: it is silly to start counting at 0, but algorithmically it is quite natural. Semi-closed integer ranges have many algorithmic advantages as well: length = (stop - start)/step; an empty range can be recognized by a start == stop test, regardless of step; adjacent ranges satisfy start2 == stop1 (again, no need to know the step); etc. > I'm sure the BDFL has some excuse for this silliness. Maybe he does not like Fortran :-) PS: What's your second favorite language (I assume that python is the first :-)? From gruben at bigpond.net.au Fri Feb 10 19:42:00 2006 From: gruben at bigpond.net.au (Gary Ruben) Date: Fri Feb 10 19:42:00 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: <20060210122242.GA21950@sun.ac.za> <43ED10D0.1000405@ee.byu.edu> <43ED3DDB.4030806@bigpond.net.au> Message-ID: <43ED5CCA.8000805@bigpond.net.au> Sasha wrote: > On 2/10/06, Gary Ruben wrote: >> ... I must say that Travis's >> example numpy.r_[1,0,1:5,0,1] highlights my pet hate with python - that >> the upper limit on an integer range is non-inclusive. > > In this case you must hate that an integer range starts at 0 (I don't > think you would want len(range(10)) to be 11). Actually, that wouldn't bother me and I'm not really fussed by whether a language chooses 0- or 1-based integer ranges, as long as you can override the default, but 0 seems more natural for any programming language. > If this is the case, > I don't blame you: it is silly to start counting at 0, but > algorithmically it is quite natural. Semi-closed integer ranges have > many algorithmic advantages as well: length = (stop - start)/step; > an empty range can be recognized by a start == stop test, regardless of step; > adjacent ranges satisfy start2 == stop1 (again, no need to know the step); etc. Thanks for the explanation Sasha. It does make some sense in terms of your examples, but I'll remain unconvinced. >> I'm sure the BDFL has some excuse for this silliness. > > Maybe he does not like Fortran :-) > > PS: What's your second favorite language (I assume that python is the first :-)? It's not Fortran-77! If I say it's Object Pascal (i.e. Delphi) you may begin to see where my range-specifier preference comes from. Pascal lets you define things like enumeration-type ranges such as Monday..Friday. It would seem nonsensical to define the range of working weekdays as Monday..Saturday. I'm pretty competent with C, less so with C++, and I've totally missed out on Java. One day I might have a play with Haskell and Ruby. Actually I see that Ruby sidesteps my pet hate by providing both types of range specifiers. I can't see myself defecting to the enemy just because of this though, Gary R.
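The half-open range properties from this exchange, written out as checks (a sketch in Python 2, where range() returns a list; the last line is the r_ example under discussion):

#----------------------------
import numpy

start, stop, step = 2, 10, 2
assert len(range(start, stop, step)) == (stop - start) / step  # length rule
assert len(range(5, 5)) == 0          # start == stop means empty, any step
assert range(0, 4) + range(4, 9) == range(0, 9)  # adjacent: stop1 == start2

print numpy.r_[1, 0, 1:5, 0, 1]       # [1 0 1 2 3 4 0 1] -- the 1:5 slice
                                      # is half-open as well
#----------------------------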
From ndarray at mac.com Fri Feb 10 20:39:07 2006 From: ndarray at mac.com (Sasha) Date: Fri Feb 10 20:39:07 2006 Subject: [Numpy-discussion] NumPy Glossary: a request for review Message-ID: Sorry for cross posting. This request is clearly relevant to the NumPy list, but the Wiki instructs that such requests should be posted on scipy-dev. Please review http://scipy.org/NumPyGlossary . From gruben at bigpond.net.au Fri Feb 10 21:04:02 2006 From: gruben at bigpond.net.au (Gary Ruben) Date: Fri Feb 10 21:04:02 2006 Subject: [Numpy-discussion] NumPy Glossary: a request for review In-Reply-To: References: Message-ID: <43ED700B.10903@bigpond.net.au> Hi Sasha, A couple of points: Stride The distance (in bytes) between the two concecutive elements along an axis. Stride isn't the distance in bytes, is it? Isn't it just the index increment or, alternatively, the distance in terms of the multiplier of the word length of the contained type? Also, a slight typo: concecutive -> consecutive. Record A composite element of an array similar to C struct. This implies that you can contain different types in a record, which I think is only true if you have an object array. Everything else looks OK. Gary R. Sasha wrote: > Sorry for cross posting. This request is clearly relevant to the NumPy > list, but the Wiki instructs that such requests should be posted on > scipy-dev. Please review http://scipy.org/NumPyGlossary . From cookedm at physics.mcmaster.ca Fri Feb 10 21:20:03 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Fri Feb 10 21:20:03 2006 Subject: [Numpy-discussion] NumPy Glossary: a request for review In-Reply-To: <43ED700B.10903@bigpond.net.au> (Gary Ruben's message of "Sat, 11 Feb 2006 16:03:07 +1100") References: <43ED700B.10903@bigpond.net.au> Message-ID: Gary Ruben writes: > Hi Sasha, > A couple of points: > > Stride > The distance (in bytes) between the two concecutive elements > along an axis. > > Stride isn't the distance in bytes, is it? Isn't it just the index > increment or, alternatively, the distance in terms of the multiplier of > the word length of the contained type? Also, a slight typo: > concecutive -> consecutive. In numpy usage, it's bytes. It's particularly important when you've got a record array of mixed types. Travis's example is temp = array([(1.8,2),(1.7,3)],dtype='f8,i2') temp['f1'].strides (10,) > Record > A composite element of an array similar to C struct. > > This implies that you can contain different types in a record, which I > think is only true if you have an object array. Nope; see above. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From ndarray at mac.com Fri Feb 10 21:31:02 2006 From: ndarray at mac.com (Sasha) Date: Fri Feb 10 21:31:02 2006 Subject: [Numpy-discussion] NumPy Glossary: a request for review In-Reply-To: <43ED700B.10903@bigpond.net.au> References: <43ED700B.10903@bigpond.net.au> Message-ID: On 2/11/06, Gary Ruben wrote: > Stride isn't the distance in bytes, is it? Isn't it just the index > increment or, alternatively, the distance in terms of the multiplier of > the word length of the contained type? Unfortunately it is in bytes and Travis convinced me that there is no way to change it. > Also, a slight typo: concecutive -> consecutive. I've changed that. In the future, please just edit the wiki for obvious misspellings.
Spell check does not work for me on that wiki and English is not my first language, so any spelling/grammar corrections are more than welcome. > > Record > A composite element of an array similar to C struct. > > This implies that you can contain different types in a record, which I > think is only true if you have an object array. Record arrays are a new feature in numpy. I think what I wrote is correct, but this entry will definitely benefit from a review by someone familiar with record arrays since I am not. From gruben at bigpond.net.au Fri Feb 10 21:56:02 2006 From: gruben at bigpond.net.au (Gary Ruben) Date: Fri Feb 10 21:56:02 2006 Subject: [Numpy-discussion] NumPy Glossary: a request for review In-Reply-To: References: <43ED700B.10903@bigpond.net.au> Message-ID: <43ED7C3D.6070705@bigpond.net.au> David Cooke corrected my misconceptions, so the glossary all looks good to me. Gary R. Sasha wrote: > On 2/11/06, Gary Ruben wrote: >> Stride isn't the distance in bytes, is it? Isn't it just the index >> increment or, alternatively, the distance in terms of the multiplier of >> the word length of the contained type? > > Unfortunately it is in bytes and Travis convinced me that there is no > way to change it. > >> Also, a slight typo: concecutive -> consecutive. > > I've changed that. In the future, please just edit the wiki for > obvious misspellings. Spell check does not work for me on that wiki > and English is not my first language, so any spelling/grammar > corrections are more than welcome. > >> Record >> A composite element of an array similar to C struct. >> >> This implies that you can contain different types in a record, which I >> think is only true if you have an object array. > > Record arrays are a new feature in numpy. I think what I wrote is > correct, but this entry will definitely benefit from a review by > someone familiar with record arrays since I am not. From zpincus at stanford.edu Fri Feb 10 22:41:01 2006 From: zpincus at stanford.edu (Zachary Pincus) Date: Fri Feb 10 22:41:01 2006 Subject: [Numpy-discussion] NumPy Glossary: a request for review In-Reply-To: <43ED7C3D.6070705@bigpond.net.au> References: <43ED700B.10903@bigpond.net.au> <43ED7C3D.6070705@bigpond.net.au> Message-ID: The "broadcasting" entry is somewhat unclear in terms of what "conforming" array shapes are. Perhaps "compatible shapes" would be better, coupled with an example or two of "compatible" shapes, and/or a precise definition of how compatibility is determined. Zach Pincus Program in Biomedical Informatics and Department of Biochemistry Stanford University School of Medicine On Feb 10, 2006, at 9:55 PM, Gary Ruben wrote: > David Cooke corrected my misconceptions, so the glossary all looks > good to me. > > Gary R. > > Sasha wrote: >> On 2/11/06, Gary Ruben wrote: >>> Stride isn't the distance in bytes, is it? Isn't it just the index >>> increment or, alternatively, the distance in terms of the >>> multiplier of >>> the word length of the contained type? >> Unfortunately it is in bytes and Travis convinced me that there is no >> way to change it. >>> Also, a slight typo: concecutive -> consecutive. >> I've changed that. In the future, please just edit the wiki for >> obvious misspellings. Spell check does not work for me on that wiki >> and English is not my first language, so any spelling/grammar >> corrections are more than welcome. >>> Record >>> A composite element of an array similar to C struct.
>>> >>> This implies that you can contain different types in a record, >>> which I >>> think is only true if you have an object array. >> Record arrays are a new feature in numpy. I think what I wrote is >> correct, but this entry will definitely benefit from a review by >> someone familiar with record arrays since I am not. From ndarray at mac.com Fri Feb 10 23:10:02 2006 From: ndarray at mac.com (Sasha) Date: Fri Feb 10 23:10:02 2006 Subject: [Numpy-discussion] NumPy Glossary: a request for review In-Reply-To: References: <43ED700B.10903@bigpond.net.au> <43ED7C3D.6070705@bigpond.net.au> Message-ID: I updated the "broadcasting" entry. I don't think examples belong in a glossary. I think a glossary should be more like a quick reference than a tutorial. Unfortunately broadcasting is one of those concepts that will never be clear without examples. On 2/11/06, Zachary Pincus wrote: > The "broadcasting" entry is somewhat unclear in terms of what > "conforming" array shapes are. Perhaps "compatible shapes" would be > better, coupled with an example or two of "compatible" shapes, and/or > a precise definition of how compatibility is determined. From wbaxter at gmail.com Sat Feb 11 03:59:01 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Sat Feb 11 03:59:01 2006 Subject: [Numpy-discussion] Re: NumPy Glossary Was:Matlab page on scipy wiki In-Reply-To: References: <20060210072301.9608D12D10@sc8-sf-spam2.sourceforge.net> <200602101616.k1AGG0Vj026623@oobleck.astro.cornell.edu> Message-ID: Definitely a very clear and convincing explanation, Joe. I guess my problem is mostly that I pretty much just walked in here, and I'm wondering how I was supposed to know that new things go in this DevZone? I saw a wiki, it didn't have the page I wished were there, so I registered and added it, figuring that's how Wikis are supposed to work. Seems like the information in your email needs to be communicated to new registrants to the wiki. And maybe permissions for creating new pages should be limited to the DevZone? --bb On 2/11/06, David M. Cooke wrote: > > Joe Harrington writes: > > > [...] > > Put this on the wiki (seriously). > > Another thing to look at is the "Producing Open Source Software" book > that's been mentioned before (http://producingoss.com/). There's a > section on wikis that's useful to keep in mind at > http://producingoss.com/html-chunk/index.html > > --
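The kind of example Zachary asks for above might look like this (a minimal sketch of the compatibility rule: shapes are compared from the trailing axes, and two axis lengths are compatible when they are equal or one of them is 1):

#----------------------------
import numpy

a = numpy.ones((3, 4))
b = numpy.ones((4,))
print (a + b).shape   # (3, 4): the (4,) operand is repeated along axis 0

c = numpy.ones((3, 1))
print (a + c).shape   # (3, 4): the length-1 axis is stretched to 4

d = numpy.ones((3,))
# a + d raises ValueError: the trailing axes (4 and 3) are incompatible
#----------------------------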
From wbaxter at gmail.com Sat Feb 11 04:06:02 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Sat Feb 11 04:06:02 2006 Subject: [Numpy-discussion] Re: NumPy Glossary Was:Matlab page on scipy wiki In-Reply-To: <200602101616.k1AGG0Vj026623@oobleck.astro.cornell.edu> References: <20060210072301.9608D12D10@sc8-sf-spam2.sourceforge.net> <200602101616.k1AGG0Vj026623@oobleck.astro.cornell.edu> Message-ID: On the point of professionalism, I'd like to change the matlab page's title from "NumPy for Matlab Addicts" to simply "NumPy for Matlab Users". It's been bugging me since I put it up there initially... but I'm not really sure how to change the name of a page in the wiki. On 2/11/06, Joe Harrington wrote: > > My point now is to get our community culture to include a > sense of professionalism and pride about what we present to the world. > Unless you're a fool or you have no competition, you dress well for a > job interview. We're not fools, and we have very healthy competition. > The main site is the first impression we make on new users. My goal > is to prevent it from being the last. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mithrandir42 at web.de Sat Feb 11 04:46:02 2006 From: mithrandir42 at web.de (N. Volbers) Date: Sat Feb 11 04:46:02 2006 Subject: [Numpy-discussion] dtype names and titles Message-ID: <43EDDC6C.6040005@web.de> I continue to learn all about the heterogeneous arrays... When I was reading through the records.py code I discovered that besides the 'names' and 'formats' for the fields of a numpy array you can also specify 'titles'. Playing around with this feature I discovered a bug: >>> import numpy >>> mydata = [(1,1), (2,4), (3,9)] >>> mytype = {'names': ['col1','col2'], 'formats':['i2','f4'], 'titles': ['col2', 'col1']} >>> b = numpy.array( mydata, dtype=mytype) >>> print b [(1.0, 1.0) (4.0, 4.0) (9.0, 9.0)] This seems to be caused by the fact that you can access a field by both the name and the field title. Why would you want to have two names anyway? By the way, is there an easy way to access a field vector by its index? Right now I retrieve the field name from dtype.fields[-1][index] and then return the 'column' by using myarray[name]. Best regards, Niklas. From pearu at scipy.org Sat Feb 11 07:21:02 2006 From: pearu at scipy.org (Pearu Peterson) Date: Sat Feb 11 07:21:02 2006 Subject: [Numpy-discussion] Test test_minrelpath.py on unix for me In-Reply-To: <43ECE677.2040104@cox.net> References: <43ECE677.2040104@cox.net> Message-ID: On Fri, 10 Feb 2006, Tim Hochberg wrote: > > Could someone try the attached diff on a unixy system? It works under > windows, but it's easy to mess up those \/'s. Hmm, minrelpath does not need if os.sep != '/': path = path.replace('/',os.sep) Functions that call minrelpath (see njoin, for instance) have already applied this codelet. I have committed the test fixes with some modifications to svn, tested on Linux. Pearu From tim.hochberg at cox.net Sat Feb 11 07:37:02 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Sat Feb 11 07:37:02 2006 Subject: [Numpy-discussion] Test test_minrelpath.py on unix for me In-Reply-To: References: <43ECE677.2040104@cox.net> Message-ID: <43EE0466.8070608@cox.net> Pearu Peterson wrote: > > > On Fri, 10 Feb 2006, Tim Hochberg wrote: > >> >> Could someone try the attached diff on a unixy system? It works under >> windows, but it's easy to mess up those \/'s.
> > > Hmm, minrelpath does not need > > if os.sep != '/': > path = path.replace('/',os.sep) > > Functions that call minrelpath (see njoin, for instance) have already > applied this codelet. > > I have committed the test fixes with some modifications to svn, tested > on Linux. That seems to do the trick under VC7 as well. Thanks, -tim > > Pearu From ndarray at mac.com Sat Feb 11 08:03:04 2006 From: ndarray at mac.com (Sasha) Date: Sat Feb 11 08:03:04 2006 Subject: [Numpy-discussion] NumPy Glossary: a request for review In-Reply-To: References: <43ED700B.10903@bigpond.net.au> <43ED7C3D.6070705@bigpond.net.au> Message-ID: At this point the only change I would like to make to the glossary page is to rename it to NumPy_Glossary. I don't have permission to change file names on the wiki, so I have to defer this task to someone else. I don't have any view on where this page should be linked, so I will not make any more changes relating to this page. From david.trem at gmail.com Sat Feb 11 09:05:03 2006 From: david.trem at gmail.com (David TREMOUILLES) Date: Sat Feb 11 09:05:03 2006 Subject: [Numpy-discussion] gnuplot problem with numpy In-Reply-To: <43ECC154.1000004@bryant.edu> References: <43ECC154.1000004@bryant.edu> Message-ID: <129e1cd10602110904i9177e55t@mail.gmail.com> Hello, Maybe you have to upgrade your Numeric to 24.2 Refer to the recent thread in gnuplot-py-user list: http://sourceforge.net/mailarchive/forum.php?forum_id=11272&max_rows=25&style=nested&viewmonth=200602 David 2006/2/10, Brian Blais : > > Hello, > > I have been trying to use the Gnuplot1.7.py module, but it doesn't seem to > work with > numpy (although it works with Numeric). The following code plots two > "identical" > sets of data, but the numpy data gets rounded to the nearest integer when > passed to > Gnuplot. What is odd is that the offending code in utils.py, is the > function > float_array(m), which does the conversion that I do in this script, but it > doesn't > seem to work. Any ideas? > > #---------------------------- > import numpy > import Numeric > import Gnuplot > > g = Gnuplot.Gnuplot(debug=1) > dh=.1; > x=numpy.arange(dh,2+dh,dh,'d') > y1 = x**2 > > > y2=y1 > > d1 = Gnuplot.Data(x, y1, > title='numpy', > with='points') # doesn't work > d2 = Gnuplot.Data(Numeric.asarray(x,'f'), Numeric.asarray(y2,'f'), > title='Numeric', > with='points') # works > > g.plot(d1,d2) > > #---------------------------- > > > > > thanks, > > bb > > -- > ----------------- > > bblais at bryant.edu > http://web.bryant.edu/~bblais -------------- next part -------------- An HTML attachment was scrubbed... URL:
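As a quick sanity check when chasing this kind of mismatch, the Numeric version actually installed can be read off directly; a small interactive sketch (output illustrative):

>>> import Numeric
>>> Numeric.__version__
'24.2'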
From pearu at scipy.org Sat Feb 11 09:08:01 2006 From: pearu at scipy.org (Pearu Peterson) Date: Sat Feb 11 09:08:01 2006 Subject: [Numpy-discussion] More numpy and Numeric differences Message-ID: I have created a wiki page http://scipy.org/PearuPeterson/NumpyVersusNumeric that reports my findings on how numpy and Numeric behave on various corner cases. Travis O., could you take a look at it? Here is the most recent addition: """ Clipping integer array with Inf In Numeric (v24.2) clip returns a float array: >>> from Numeric import * >>> Inf = array(1.0)/0 >>> clip([1,2],0,Inf) array([ 1., 2.]) >>> In numpy (v0.9.5.2092) an OverflowError is raised: >>> from numpy import * >>> clip([1,2],0,Inf) Traceback (most recent call last): File "<stdin>", line 1, in ? File "/usr/local/lib/python2.3/site-packages/numpy/core/oldnumeric.py", line 336, in clip return asarray(m).clip(m_min, m_max) OverflowError: cannot convert float infinity to long Comment: is it a numpy bug? Should clip take an optional dtype argument and return asarray(m, dtype=dtype).clip(m_min, m_max)? Then array.clip should also have an optional dtype. """ Pearu
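A workaround in the meantime, along the lines Pearu suggests: make the float conversion explicit before clipping, so that Inf never has to be coerced to an integer. A sketch against numpy of that vintage (output illustrative):

>>> from numpy import *
>>> Inf = array(1.0)/0
>>> clip(asarray([1,2], dtype=float64), 0, Inf)
array([ 1.,  2.])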
From charlesr.harris at gmail.com Sat Feb 11 09:35:01 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat Feb 11 09:35:01 2006 Subject: [Numpy-discussion] arange(start, stop, step) and floating point (Ticket #8) In-Reply-To: <43EC3261.2060601@ieee.org> References: <43EA6A34.2000202@gmail.com> <43EA7FE7.2040902@cox.net> <43EA8867.5080109@ee.byu.edu> <43EB7FE5.1050000@cox.net> <43EBA615.8020101@ieee.org> <43EBB782.1010509@cox.net> <43EC3261.2060601@ieee.org> Message-ID: On 2/9/06, Travis Oliphant wrote: > Sasha wrote: > > >Well, my results are different. > > > > [snip] > Yes, I think it's worth submitting. Most of the suggestions for > pointer-arithmetic for fast C-code were developed when processors spent > more time computing than fetching memory. Now it seem it's all about > fetching memory intelligently. > > The buffer[i]= > > style is even recommended according to the AMD-optimization book Sasha > pointed out. Pointers vs indexing is architecture and compiler dependent. My own experience is that recent gcc compilers produce better indexing code than they used to and the indexing instructions on newer cpus are faster. When I wrote the sorting routines pointers were faster, so I used them for quicksort and mergesort. Now I think indexing is faster and I am tempted to change the code. Indexing also looks cleaner to me. Chuck From martin.wiechert at gmx.de Sat Feb 11 10:52:02 2006 From: martin.wiechert at gmx.de (Martin Wiechert) Date: Sat Feb 11 10:52:02 2006 Subject: [Numpy-discussion] bug with NO_IMPORT_ARRAY / PY_ARRAY_UNIQUE_SYMBOL? was Re: [SciPy-user] segfault when calling PyArray_DescrFromType In-Reply-To: <43EB6436.1050307@ieee.org> References: <200602091141.51520.martin.wiechert@gmx.de> <200602091552.11896.martin.wiechert@gmx.de> <43EB6436.1050307@ieee.org> Message-ID: <200602111942.50704.martin.wiechert@gmx.de> Hi Travis, thanks for your help! I think there is a small bug with NO_IMPORT_ARRAY / PY_ARRAY_UNIQUE_SYMBOL in numpy-0.9.4. For ease of reference I've pasted part of __multiarray_api.h below. The problem I ran into is that my "non-importing" source files, the ones defining NO_IMPORT_ARRAY, cannot see PyArray_API, because they obviously cannot know which name I chose in the importing file. E.g.
I do #define PY_ARRAY_UNIQUE_SYMBOL my_name in the file which calls import_array (). Then the object generated will not have the symbol PyArray_API, because PyArray_API is replaced with my_name. But the sources with NO_IMPORT_ARRAY look for PyArray_API, because for them it is not replaced. Indeed inserting #define PyArray_API my_name into these files seems to fix the problem for me. Regards, Martin. #if defined(PY_ARRAY_UNIQUE_SYMBOL) #define PyArray_API PY_ARRAY_UNIQUE_SYMBOL #endif #if defined(NO_IMPORT) || defined(NO_IMPORT_ARRAY) extern void **PyArray_API; #else #if defined(PY_ARRAY_UNIQUE_SYMBOL) void **PyArray_API; #else static void **PyArray_API=NULL; #endif #endif On Thursday 09 February 2006 16:48, Travis Oliphant wrote: > Martin Wiechert wrote: > >Found it (in the "old" docs). > >Must #define PY_ARRAY_UNIQUE_SYMBOL and call import_array (). > > To be clear, you must call import_array() in the module's init function. > This is the only requirement. > > You only have to define PY_ARRAY_UNIQUE_SYMBOL if your extension module > uses more than one file. In the files without the module initialization > code you also have to define NO_IMPORT_ARRAY. > > -Travis From oliphant.travis at ieee.org Sat Feb 11 14:24:04 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Sat Feb 11 14:24:04 2006 Subject: [Numpy-discussion] dtype names and titles In-Reply-To: <43EDDC6C.6040005@web.de> References: <43EDDC6C.6040005@web.de> Message-ID: <43EE63E1.10601@ieee.org> N. Volbers wrote: > I continue to learn all about the heterogeneous arrays... > > When I was reading through the records.py code I discovered that > besides the 'names' and 'formats' for the fields of a numpy array you > can also specify 'titles'. Playing around with this feature I > discovered a bug: > > >>> import numpy > >>> mydata = [(1,1), (2,4), (3,9)] > >>> mytype = {'names': ['col1','col2'], 'formats':['i2','f4'], > 'titles': ['col2', 'col1']} > >>> b = numpy.array( mydata, dtype=mytype) > >>> print b > [(1.0, 1.0) (4.0, 4.0) (9.0, 9.0)] > > This seems to be caused by the fact that you can access a field by > both the name and the field title. Why would you want to have two > names anyway? This lets you use attribute lookup on the names but have the titles be the "true name" of the field. I've fixed this in SVN, so that it raises an error when the titles have the same names as the columns. Thanks for the test. -Travis From oliphant.travis at ieee.org Sat Feb 11 14:53:01 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Sat Feb 11 14:53:01 2006 Subject: [Numpy-discussion] More numpy and Numeric differences In-Reply-To: References: Message-ID: <43EE6AAB.9010408@ieee.org> Pearu Peterson wrote: > > I have created a wiki page > > http://scipy.org/PearuPeterson/NumpyVersusNumeric > > that reports my findings on how numpy and Numeric behave on various > corner cases. Travis O., could you take a look at it? > Here is the most recent addition: > I fixed the put issue. The problem with clip is actually in choose (clip is just a specific application of choose). The problem is in PyArray_ConvertToCommonType. You have an integer array, an integer scalar, and a floating-point scalar. I think the rules implemented in PyArray_ConvertToCommonType are not allowing the scalar to dictate anything. But, this should clearly be changed to allow scalars of different "kinds" to up-cast the array. This would be consistent with the umath module. So, PyArray_ConvertToCommonType needs to be improved. This will have an impact on several other functions that use this C-API. -Travis
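The umath rule he is appealing to is easy to demonstrate interactively; a small sketch (values arbitrary, output illustrative):

>>> from numpy import *
>>> array([1, 2]) * 2      # scalar of the same kind: the array dtype is preserved
array([2, 4])
>>> array([1, 2]) * 1.5    # float scalar, a different kind: the array is up-cast
array([ 1.5,  3. ])

PyArray_ConvertToCommonType, which feeds choose (and hence clip), did not yet follow this rule; that is the inconsistency being fixed.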
From mithrandir42 at web.de Sun Feb 12 01:03:04 2006 From: mithrandir42 at web.de (N. Volbers) Date: Sun Feb 12 01:03:04 2006 Subject: [Numpy-discussion] dtype names and titles In-Reply-To: <43EE63E1.10601@ieee.org> References: <43EDDC6C.6040005@web.de> <43EE63E1.10601@ieee.org> Message-ID: <43EEF995.10706@web.de> (sorry Travis, I accidentally replied first to you directly, and not to the list) Travis Oliphant wrote: > N. Volbers wrote: > >> [...]This seems to be caused by the fact that you can access a field >> by both the name and the field title. Why would you want to have two >> names anyway? > > > This lets you use attribute lookup on the names but have the titles > be the "true name" of the field. > I still don't understand the reason for keeping two different names. IMHO it adds some extra complexity and might be a potential source for errors. If I keep an extra title in the array, then I think I should be allowed to name it whatever I want. If this is not the case, then I would be better off to just have unique field names and keep my extra information about the fields in a separate dictionary with the field names as keys and the extra information as value. This is my current approach, which works quite well; unfortunately the extra information is not saved in the array itself. Is anybody actually using both names and titles? Best regards, Niklas. From bblais at bryant.edu Sun Feb 12 05:03:02 2006 From: bblais at bryant.edu (Brian Blais) Date: Sun Feb 12 05:03:02 2006 Subject: [Numpy-discussion] gnuplot problem with numpy In-Reply-To: <129e1cd10602110904i9177e55t@mail.gmail.com> References: <43ECC154.1000004@bryant.edu> <129e1cd10602110904i9177e55t@mail.gmail.com> Message-ID: <43EF3156.9040601@bryant.edu> David TREMOUILLES wrote: > Hello, > Maybe you have to upgrade your numeric to 24.2 bingo! thanks. I had already upgraded my numpy, and since I kept seeing "numpy=Numeric" in many threads, I didn't think to upgrade that as well. thanks, Brian Blais -- ----------------- bblais at bryant.edu http://web.bryant.edu/~bblais From gerard.vermeulen at grenoble.cnrs.fr Sun Feb 12 05:26:07 2006 From: gerard.vermeulen at grenoble.cnrs.fr (Gerard Vermeulen) Date: Sun Feb 12 05:26:07 2006 Subject: [Numpy-discussion] ANN: first release of IVuPy-0.1 Message-ID: <20060212142525.45b6b924.gerard.vermeulen@grenoble.cnrs.fr> I am proud to announce IVuPy-0.1 (I-View-Py). IVuPy is a Python extension module developed to write Python programs for 3D visualization of large data sets using Qt and PyQt. Python is extended by IVuPy with more than 600 classes of two of the Coin3D C++ class libraries: Coin and SoQt. Coin is compatible with the Open Inventor API. Open Inventor is an object-oriented 3D toolkit built on OpenGL that provides a 3D scene database, a built-in event model for user interaction, and the ability to print objects and exchange data with other graphics formats. The SoQt library interfaces Coin to Qt. See http://www.coin3d.org for more information on Coin3D. IVuPy requires at least one of the Numerical Python extension modules: NumPy, Numeric, or numarray (IVuPy works with all of them at once). Data transfer between the Numerical Python arrays and the Coin data structures has been implemented by copying. The design of the Open Inventor API favors ease of use over performance. The API is a natural match for Python, and in my opinion it is fun to program with IVuPy.
The performance penalty of the design choice is small. The first example at http://ivupy.sourceforge.net/examples.html demonstrates this: NumPy calculates a surface with a million nodes in 1.7 seconds and Coin3D redisplays the surface in 0.3 seconds on my Linux system with a 3.6 GHz Pentium and an nVidia graphics card (NV41.1). The Inventor Mentor ( http://www.google.com/search?q=inventor+mentor ) is essential for learning IVuPy. The IVuPy documentation supplements the Inventor Mentor. IVuPy includes all C++ examples from the Inventor Mentor and their Python translations. There are also more advanced examples to show the integration of IVuPy and PyQt. IVuPy has been used for almost 6 months on Linux and Windows in the development of a preprocessor for a finite element flow solver and has proven to be very stable. Prerequisites for IVuPy are: - Python-2.4.x or -2.3.x - at least one of NumPy, numarray, or Numeric - Qt-3.3.x, -3.2.x, or -3.1.x - SIP-4.3.x or -4.2.1 - PyQt-3.15.x or -3.14.1 - Coin-2.4.4 or -2.4.3 - SoQt-1.3.0 or -1.2.0 IVuPy is licensed under the terms of the GPL. Contact me if the GPL is an obstacle for you. http://ivupy.sourceforge.net is the home page of IVuPy. Have fun -- Gerard Vermeulen From faltet at carabos.com Mon Feb 13 01:03:03 2006 From: faltet at carabos.com (Francesc Altet) Date: Mon Feb 13 01:03:03 2006 Subject: [Numpy-discussion] dtype names and titles In-Reply-To: <43EEF995.10706@web.de> References: <43EDDC6C.6040005@web.de> <43EE63E1.10601@ieee.org> <43EEF995.10706@web.de> Message-ID: <1139821316.7532.5.camel@localhost.localdomain> On Sunday 12 February 2006 at 09:02 +0000, N. Volbers wrote: > I still don't understand the reason for keeping two different names. > IMHO it adds some extra complexity and might be a potential source for > errors. If I keep an extra title in the array, then I think I should be > allowed to name it whatever I want. If this is not the case, then I > would be better off to just have unique field names and keep my extra > information about the fields in a separate dictionary with the field > names as keys and the extra information as value. This is my current > approach, which works quite well; unfortunately the extra information is > not saved in the array itself. Yes. I agree that accessing fields by both name and title might become a common source of confusion. So, in order to avoid problems in the future, I wouldn't let users access the fields by title. > > Is anybody actually using both names and titles? Not me. -- >0,0< Francesc Altet http://www.carabos.com/ V V Cárabos Coop. V. Enjoy Data "-" From aisaac at american.edu Mon Feb 13 05:30:07 2006 From: aisaac at american.edu (Alan G Isaac) Date: Mon Feb 13 05:30:07 2006 Subject: [Numpy-discussion] dtype names and titles In-Reply-To: <1139821316.7532.5.camel@localhost.localdomain> References: <43EDDC6C.6040005@web.de> <43EE63E1.10601@ieee.org><43EEF995.10706@web.de><1139821316.7532.5.camel@localhost.localdomain> Message-ID: >> Is anybody actually using both names and titles? On Mon, 13 Feb 2006, Francesc Altet apparently wrote: > Not me. Is the "title" the appropriate storage for the "displayname" for fields that are to be plotted? Or not?
Thanks, Alan Isaac From faltet at carabos.com Mon Feb 13 06:08:15 2006 From: faltet at carabos.com (Francesc Altet) Date: Mon Feb 13 06:08:15 2006 Subject: [Numpy-discussion] dtype names and titles In-Reply-To: References: <43EDDC6C.6040005@web.de> <43EE63E1.10601@ieee.org> <43EEF995.10706@web.de><1139821316.7532.5.camel@localhost.localdomain> Message-ID: <1139839637.7532.14.camel@localhost.localdomain> On Monday 13 February 2006 at 08:35 -0500, Alan G Isaac wrote: > >> Is anybody actually using both names and titles? > > On Mon, 13 Feb 2006, Francesc Altet apparently wrote: > > Not me. > > Is the "title" the appropriate storage for the "displayname" > for fields that are to be plotted? Or not? Uh, yes. Perhaps I messed things up. Of course it is interesting to have both a name and a title. What I meant is that accessing fields by *both* names and titles might introduce confusion. For example, allowing: >>> mydata = [(1,1), (2,4), (3,9)] >>> mytype = {'names': ['col1','col2'], 'formats':['i2','f4'],'titles': ['col 2', 'col 1']} >>> b = numpy.array( mydata, dtype=mytype) >>> b array([(1, 1.0), (2, 4.0), (3, 9.0)], dtype=(void,6)) >>> b['col1'] array([1, 2, 3], dtype=int16) >>> b['col 2'] array([1, 2, 3], dtype=int16) seems quite strange to me. My point is that I think that keys in arrays for accessing fields should be unique, and thus, I'd no longer allow the last statement as a valid one. But of course I think that having both names and titles is a good thing. Sorry for the confusion. Cheers, -- >0,0< Francesc Altet http://www.carabos.com/ V V Cárabos Coop. V. Enjoy Data "-" From wbaxter at gmail.com Mon Feb 13 06:46:00 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Mon Feb 13 06:46:00 2006 Subject: [Numpy-discussion] matplotlib Message-ID: Anyone know if matplotlib is supposed to work with the new NumPy or if there is work afoot to make it work? It seems to truncate all numpy.array and numpy.matrix inputs to integer values: import matplotlib matplotlib.interactive(True) matplotlib.use('WXAgg') import matplotlib.pylab as g g.plot(rand(5),rand(5),'bo') just puts a dot at (0,0), while this g.plot(rand(5)*10,rand(5)*10,'bo') generates a plot of 5 points but all at integer coordinates. --bb -------------- next part -------------- An HTML attachment was scrubbed... URL: From ryanlists at gmail.com Mon Feb 13 06:53:16 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Mon Feb 13 06:53:16 2006 Subject: [Numpy-discussion] matplotlib In-Reply-To: References: Message-ID: What version are you using? I know that CVS matplotlib works with numpy and I think the latest releases do as well. I think the current version is 0.86.2 On 2/13/06, Bill Baxter wrote: > Anyone know if matplotlib is supposed to work with the new NumPy or if there > is work afoot to make it work? > It seems to truncate all numpy.array and numpy.matrix inputs to integer > values: > > import matplotlib > matplotlib.interactive(True) > matplotlib.use('WXAgg') > import matplotlib.pylab as g > > g.plot(rand(5),rand(5),'bo') > > just puts a dot at (0,0), while this > > g.plot(rand(5)*10,rand(5)*10,'bo') > > generates a plot of 5 points but all at integer coordinates. > > > --bb > From wbaxter at gmail.com Mon Feb 13 07:07:01 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Mon Feb 13 07:07:01 2006 Subject: [Numpy-discussion] matplotlib In-Reply-To: References: Message-ID: I've got 0.86.2. It looks like if I do 'import pylab as g' it doesn't work, but 'from pylab import *' does for some reason. --bb -- William V. Baxter III OLM Digital Kono Dens Building Rm 302 1-8-8 Wakabayashi Setagaya-ku Tokyo, Japan 154-0023 +81 (3) 3422-3380 -------------- next part -------------- An HTML attachment was scrubbed... URL:
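The diagnosis that emerges further down the thread is a namespace collision between numpy and a pylab still configured for Numeric. In outline (a sketch reconstructing the import sequence, not the complete scripts):

from numpy import *               # numpy's array() and rand()
import matplotlib.pylab as g      # this pylab is still backed by Numeric
g.plot(rand(5), rand(5), 'bo')    # numpy floats fed to Numeric-based code: values truncate

from pylab import *               # rebinds array, rand, ... to pylab's Numeric-backed versions
plot(rand(5), rand(5), 'bo')      # Numeric data through Numeric-based code: plots correctly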
From jdhunter at ace.bsd.uchicago.edu Mon Feb 13 07:13:08 2006 From: jdhunter at ace.bsd.uchicago.edu (John Hunter) Date: Mon Feb 13 07:13:08 2006 Subject: [Numpy-discussion] matplotlib In-Reply-To: (Bill Baxter's message of "Mon, 13 Feb 2006 23:45:00 +0900") References: Message-ID: <87vevj4818.fsf@peds-pc311.bsd.uchicago.edu> >>>>> "Bill" == Bill Baxter writes: Bill> Anyone know if matplotlib is supposed to work with the new Bill> NumPy or if there is work afoot to make it work? It seems Bill> to truncate all numpy.array and numpy.matrix inputs to Bill> integer values: Your script as posted is incomplete import matplotlib matplotlib.interactive(True) matplotlib.use('WXAgg') import matplotlib.pylab as g g.plot(rand(5),rand(5),'bo') Where, for example, is rand coming from? My guess is you have an import statement you are not showing us. If you are using a recent numpy and matplotlib, and set numerix to numpy in your matplotlib rc file (~/.matplotlib/matplotlibrc) everything should work if you get your array symbols from pylab, numpy or matplotlib.numerix (all of which will get their symbols from numpy....) JDH From wbaxter at gmail.com Mon Feb 13 07:29:02 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Mon Feb 13 07:29:02 2006 Subject: [Numpy-discussion] matplotlib In-Reply-To: <87vevj4818.fsf@peds-pc311.bsd.uchicago.edu> References: <87vevj4818.fsf@peds-pc311.bsd.uchicago.edu> Message-ID: from numpy import * was the only line missing, called before the rest. It seems to work fine if I use from pylab import * instead of import pylab as g And actually if I do both in this order: import pylab as g from pylab import * then plot() and g.plot() both do the right thing (no truncating of floats). Seems as if there's some initialization code that only gets run with the 'from pylab import *' version. --bb On 2/14/06, John Hunter wrote: > > >>>>> "Bill" == Bill Baxter writes: > > Bill> Anyone know if matplotlib is supposed to work with the new > Bill> NumPy or if there is work afoot to make it work? It seems > Bill> to truncate all numpy.array and numpy.matrix inputs to > Bill> integer values: > > Your script as posted is incomplete > > import matplotlib > matplotlib.interactive(True) > matplotlib.use('WXAgg') > import matplotlib.pylab as g > > g.plot(rand(5),rand(5),'bo') > > Where, for example, is rand coming from? My guess is you have an import > statement you are not showing us.
> > If you are using a recent numpy and matplotlib, and set numerix to > numpy in your matplotlib rc file (~/.matplotlib/matplotlibrc) > everything should work if you get your array symbols from pylab, numpy > or matplotlib.numerix (all of which will get their symbols from > numpy....) > > JDH > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jdhunter at ace.bsd.uchicago.edu Mon Feb 13 07:34:05 2006 From: jdhunter at ace.bsd.uchicago.edu (John Hunter) Date: Mon Feb 13 07:34:05 2006 Subject: [Numpy-discussion] matplotlib In-Reply-To: (Bill Baxter's message of "Tue, 14 Feb 2006 00:28:07 +0900") References: <87vevj4818.fsf@peds-pc311.bsd.uchicago.edu> Message-ID: <87pslr2shy.fsf@peds-pc311.bsd.uchicago.edu> >>>>> "Bill" == Bill Baxter writes: Bill> from numpy import * was the only line missing, called before Bill> the rest. It seems to work fine if I use from pylab import Bill> * instead of import pylab as g Bill> And actually if I do both in this order: import pylab as g Bill> from pylab import * Bill> Seems as if there's some Bill> initialization code that only gets run with the 'from pylab Bill> import *' version. As far as I know that is a python impossibility, unless perhaps you do some deep dark magic that is beyond my grasp. pylab doesn't know how it is imported. Are you sure you have your numerix set properly? I suggest creating two free standing scripts, one with the problem and one without, and running both with --verbose-helpful to make sure that your settings are what you think they are. If you verify that numerix is set properly and still see the problem, I would like to see both scripts in case it is exposing a problem with matplotlib. Of course, doing multiple import * commands is a recipe for long term pain, especially with packages that have so much overlapping namespace and numpy/scipy/pylab. JDH From wbaxter at gmail.com Mon Feb 13 07:59:03 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Mon Feb 13 07:59:03 2006 Subject: [Numpy-discussion] matplotlib In-Reply-To: <87pslr2shy.fsf@peds-pc311.bsd.uchicago.edu> References: <87vevj4818.fsf@peds-pc311.bsd.uchicago.edu> <87pslr2shy.fsf@peds-pc311.bsd.uchicago.edu> Message-ID: Ah, ok. You're right. Doing from pylab import * was actually just overwriting the definition of array and rand() to be those from Numeric, which pylab was picking to use by default. I guess my expectation was that pylab would default to using the best numerical package installed. With "numerix : numpy" in my ~/.matplotlib/matplotlibrc file, it seems to be working properly now. Thanks for the help! --bb On 2/14/06, John Hunter wrote: > > >>>>> "Bill" == Bill Baxter writes: > > Bill> from numpy import * was the only line missing, called before > Bill> the rest. It seems to work fine if I use from pylab import > Bill> * instead of import pylab as g > > Bill> And actually if I do both in this order: import pylab as g > Bill> from pylab import * > > Bill> Seems as if there's some > Bill> initialization code that only gets run with the 'from pylab > Bill> import *' version. > > As far as I know that is a python impossibility, unless perhaps you do > some deep dark magic that is beyond my grasp. pylab doesn't know how > it is imported. > > Are you sure you have your numerix set properly? I suggest creating > two free standing scripts, one with the problem and one without, and > running both with --verbose-helpful to make sure that your settings > are what you think they are. 
If you verify that numerix is set > properly and still see the problem, I would like to see both scripts > in case it is exposing a problem with matplotlib. > > Of course, doing multiple import * commands is a recipe for long term > pain, especially with packages that have so much overlapping namespace > and numpy/scipy/pylab. > > JDH > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ryanlists at gmail.com Mon Feb 13 09:01:03 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Mon Feb 13 09:01:03 2006 Subject: [Numpy-discussion] matplotlib In-Reply-To: References: <87vevj4818.fsf@peds-pc311.bsd.uchicago.edu> <87pslr2shy.fsf@peds-pc311.bsd.uchicago.edu> Message-ID: The point of the numerix setting in the rc file is that matplotlib can't tell you what is the best numerical package to use for your problem. On 2/13/06, Bill Baxter wrote: > Ah, ok. You're right. Doing from pylab import * was actually just > overwriting the definition of array and rand() to be those from Numeric, > which pylab was picking to use by default. I guess my expectation was that > pylab would default to using the best numerical package installed. > > With "numerix : numpy" in my ~/.matplotlib/matplotlibrc file, it seems to be > working properly now. > > Thanks for the help! > > --bb > > On 2/14/06, John Hunter wrote: > > >>>>> "Bill" == Bill Baxter writes: > > > > Bill> from numpy import * was the only line missing, called before > > Bill> the rest. It seems to work fine if I use from pylab import > > Bill> * instead of import pylab as g > > > > Bill> And actually if I do both in this order: import pylab as g > > Bill> from pylab import * > > > > Bill> Seems as if there's some > > Bill> initialization code that only gets run with the 'from pylab > > Bill> import *' version. > > > > As far as I know that is a python impossibility, unless perhaps you do > > some deep dark magic that is beyond my grasp. pylab doesn't know how > > it is imported. > > > > Are you sure you have your numerix set properly? I suggest creating > > two free standing scripts, one with the problem and one without, and > > running both with --verbose-helpful to make sure that your settings > > are what you think they are. If you verify that numerix is set > > properly and still see the problem, I would like to see both scripts > > in case it is exposing a problem with matplotlib. > > > > Of course, doing multiple import * commands is a recipe for long term > > pain, especially with packages that have so much overlapping namespace > > and numpy/scipy/pylab. > > > > JDH > > > From ryanlists at gmail.com Mon Feb 13 09:45:04 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Mon Feb 13 09:45:04 2006 Subject: [Numpy-discussion] indexing problem Message-ID: I am having a problem with indexing an array and not getting the expected scalar behavior for complex128scalar: In [44]: c Out[44]: array([ 3.31781200e+06, 2.20157529e+13, 1.46088259e+20, 9.69386754e+26, 6.43248601e+33, 4.26835585e+40, 2.83232045e+47, 1.87942136e+54, 1.24711335e+61, 8.27537526e+67]) In [45]: s=c[-1]*1.0j In [46]: type(s) Out[46]: <type 'complex128scalar'> In [47]: s**2 Out[47]: (-6.848183561893313e+135+8.3863291020365108e+119j) In [48]: s=8.27537526e+67*1.0j In [49]: type(s) Out[49]: <type 'complex'> In [50]: s**2 Out[50]: (-6.8481835693820068e+135+0j) Why does result 47 have a non-zero imaginary part?
Ryan From ryanlists at gmail.com Mon Feb 13 09:54:02 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Mon Feb 13 09:54:02 2006 Subject: [Numpy-discussion] Re: indexing problem In-Reply-To: References: Message-ID: This may only be a problem for ridiculously large numbers. I actually meant to be dealing with these values: In [75]: d Out[75]: array([ 246.74011003, 986.96044011, 2220.66099025, 3947.84176044, 6168.50275068, 8882.64396098, 12090.26539133, 15791.36704174, 19985.94891221, 24674.01100272]) In [76]: s=d[-1]*1.0j In [77]: s Out[77]: 24674.011002723393j In [78]: type(s) Out[78]: <type 'complex128scalar'> In [79]: s**2 Out[79]: (-608806818.96251547+7.4554869875188623e-08j) So perhaps the previous difference of 26 orders of magnitude really did mean that the imaginary part was negligibly small, that just got obscured by the fact that the real part was order 1e+135. Ryan From russel at appliedminds.com Mon Feb 13 10:08:13 2006 From: russel at appliedminds.com (Russel Howe) Date: Mon Feb 13 10:08:13 2006 Subject: [Numpy-discussion] String array equality test does not broadcast Message-ID: <99AE3D0B-E106-4EE0-B70C-E84DE1BCC2C7@appliedminds.com> I am converting some numarray code to numpy and I noticed this behavior: >>> from numpy import * >>> sta=array(['abc', 'def', 'ghi']) >>> stb=array(['abc', 'jkl', 'ghi']) >>> sta==stb False I expected the same as this: >>> a1=array([1,2,3]) >>> a2=array([1,4,3]) >>> a1==a2 array([True, False, True], dtype=bool) I am trying to figure out how to fix this now... From chanley at stsci.edu Mon Feb 13 10:57:03 2006 From: chanley at stsci.edu (Christopher Hanley) Date: Mon Feb 13 10:57:03 2006 Subject: [Numpy-discussion] dtype names and titles In-Reply-To: <43EDDC6C.6040005@web.de> References: <43EDDC6C.6040005@web.de> Message-ID: <43F0D640.3000308@stsci.edu> N. Volbers wrote: > > By the way, is there an easy way to access a field vector by its index? > Right now I retrieve the field name from dtype.fields[-1][index] and > then return the 'column' by using myarray[name]. Travis, Perhaps we could add a field method to recarray like numarray's? This would allow access by both field name and "column" index. This would be nice for people who are using this convention and are making the switch from numarray. Chris From Fernando.Perez at colorado.edu Mon Feb 13 11:06:01 2006 From: Fernando.Perez at colorado.edu (Fernando Perez) Date: Mon Feb 13 11:06:01 2006 Subject: [Numpy-discussion] matplotlib In-Reply-To: <87pslr2shy.fsf@peds-pc311.bsd.uchicago.edu> References: <87vevj4818.fsf@peds-pc311.bsd.uchicago.edu> <87pslr2shy.fsf@peds-pc311.bsd.uchicago.edu> Message-ID: <43F0D851.7000807@colorado.edu> John Hunter wrote: >>>>>>"Bill" == Bill Baxter writes: > > > Bill> from numpy import * was the only line missing, called before > Bill> the rest.
It seems to work fine if I use from pylab import > Bill> * instead of import pylab as g > > Bill> And actually if I do both in this order: import pylab as g > Bill> from pylab import * > > Bill> Seems as if there's some > Bill> initialization code that only gets run with the 'from pylab > Bill> import *' version. > > As far as I know that is a python impossibility, unless perhaps you do > some deep dark magic that is beyond my grasp. pylab doesn't know how > it is imported. Actually, a little creative use of sys._getframe() can tell you that, in some instances (if the import was done via pure python and the source can be found via inspect, it will fail for extension code and if inspect runs into trouble). If you _really_ want, you can also use dis.dis() on the frame above you and analyze the bytecode. But I seriously doubt matplotlib goes to such unpleasant extremes in this case :) Cheers, f ps - for the morbidly curious, here's how to do this: planck[import_tricks]> cat all.py from trick import * planck[import_tricks]> cat mod.py import trick planck[import_tricks]> cat trick.py import sys,dis f = sys._getframe(1) print f.f_code print dis.dis(f.f_code) planck[import_tricks]> python all.py 1 0 LOAD_CONST 0 (('*',)) 3 IMPORT_NAME 0 (trick) 6 IMPORT_STAR 7 LOAD_CONST 1 (None) 10 RETURN_VALUE None planck[import_tricks]> python mod.py 1 0 LOAD_CONST 0 (None) 3 IMPORT_NAME 0 (trick) 6 STORE_NAME 0 (trick) 9 LOAD_CONST 0 (None) 12 RETURN_VALUE None #### Since the code object has a file path and line number, you could fetch that and look at the source directly instead of dealing with the bytecode. From jdhunter at ace.bsd.uchicago.edu Mon Feb 13 11:09:05 2006 From: jdhunter at ace.bsd.uchicago.edu (John Hunter) Date: Mon Feb 13 11:09:05 2006 Subject: [Numpy-discussion] matplotlib In-Reply-To: <43F0D851.7000807@colorado.edu> (Fernando Perez's message of "Mon, 13 Feb 2006 12:04:49 -0700") References: <87vevj4818.fsf@peds-pc311.bsd.uchicago.edu> <87pslr2shy.fsf@peds-pc311.bsd.uchicago.edu> <43F0D851.7000807@colorado.edu> Message-ID: <87d5hrccil.fsf@peds-pc311.bsd.uchicago.edu> >>>>> "Fernando" == Fernando Perez writes: Fernando> Actually, a little creative use of sys._getframe() can Fernando> tell you that, in some instances (if the import was done Which is why I wrote "as far as I know..." because in real life almost nothing is impossible in python if you are willing to get in and inspect and modify the stack. Fernando> But I seriously doubt matplotlib goes to such unpleasant Fernando> extremes in this case :) No, we'll leave that kind of magic to you and the ipython crew :-) JDH From tim.hochberg at cox.net Mon Feb 13 11:31:05 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Mon Feb 13 11:31:05 2006 Subject: [Numpy-discussion] Re: indexing problem Message-ID: <43F0DE44.9040506@cox.net> I've been trying to look into the problem described below, but I just can't find where complex multiplication is being done (all the other multiplication, but not complex). Could someone with a grasp of the innards of numpy please point me in the right direction? Thanks, -tim Ryan Krauss wrote: >This may only be a problem for ridiculously large numbers. 
I actually >meant to be dealing with these values: > >In [75]: d >Out[75]: >array([ 246.74011003, 986.96044011, 2220.66099025, 3947.84176044, > 6168.50275068, 8882.64396098, 12090.26539133, 15791.36704174, > 19985.94891221, 24674.01100272]) > >In [76]: s=d[-1]*1.0j > >In [77]: s >Out[77]: 24674.011002723393j > >In [78]: type(s) >Out[78]: > >In [79]: s**2 >Out[79]: (-608806818.96251547+7.4554869875188623e-08j) > >So perhaps the previous difference of 26 orders of magnitude really >did mean that the imaginary part was negligibly small, that just got >obscured by the fact that the real part was order 1e+135. > >On 2/13/06, Ryan Krauss wrote: > > >>I am having a problem with indexing an array and not getting the >>expected scalar behavior for complex128scalar: >> >>In [44]: c >>Out[44]: >>array([ 3.31781200e+06, 2.20157529e+13, 1.46088259e+20, >> 9.69386754e+26, 6.43248601e+33, 4.26835585e+40, >> 2.83232045e+47, 1.87942136e+54, 1.24711335e+61, >> 8.27537526e+67]) >> >>In [45]: s=c[-1]*1.0j >> >>In [46]: type(s) >>Out[46]: >> >>In [47]: s**2 >>Out[47]: (-6.848183561893313e+135+8.3863291020365108e+119j) >> >>In [48]: s=8.27537526e+67*1.0j >> >>In [49]: type(s) >>Out[49]: >> >>In [50]: s**2 >>Out[50]: (-6.8481835693820068e+135+0j) >> >>Why does result 47 have a non-zero imaginary part? >> >>Ryan >> >> >> > > >------------------------------------------------------- >This SF.net email is sponsored by: Splunk Inc. Do you grep through log files >for problems? Stop! Download the new AJAX search engine that makes >searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! >http://sel.as-us.falkag.net/sel?cmd=k&kid3432&bid#0486&dat1642 >_______________________________________________ >Numpy-discussion mailing list >Numpy-discussion at lists.sourceforge.net >https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > > > From aisaac at american.edu Mon Feb 13 12:13:01 2006 From: aisaac at american.edu (Alan G Isaac) Date: Mon Feb 13 12:13:01 2006 Subject: [Numpy-discussion] dtype names and titles In-Reply-To: <1139839637.7532.14.camel@localhost.localdomain> References: <43EDDC6C.6040005@web.de> <43EE63E1.10601@ieee.org><43EEF995.10706@web.de><1139821316.7532.5.camel@localhost.localdomain><1139839637.7532.14.camel@localhost.localdomain> Message-ID: On Mon, 13 Feb 2006, Francesc Altet apparently wrote: > My point is that I think that keys in arrays for accessing > fields should be unique > But of course I think that having both names and titles is > a good thing. OK. We're in agreement then. Thanks, Alan From oliphant at ee.byu.edu Mon Feb 13 13:59:01 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Mon Feb 13 13:59:01 2006 Subject: [Numpy-discussion] String array equality test does not broadcast In-Reply-To: <99AE3D0B-E106-4EE0-B70C-E84DE1BCC2C7@appliedminds.com> References: <99AE3D0B-E106-4EE0-B70C-E84DE1BCC2C7@appliedminds.com> Message-ID: <43F100F6.1020200@ee.byu.edu> Russel Howe wrote: > I am converting some numarray code to numpy and I noticed this behavior: > > >>> from numpy import * > >>> sta=array(['abc', 'def', 'ghi']) > >>> stb=array(['abc', 'jkl', 'ghi']) > >>> sta==stb > False > > I expected the same as this: > >>> a1=array([1,2,3]) > >>> a2=array([1,4,3]) > >>> a1==a2 > array([True, False, True], dtype=bool) > > I am trying to figure out how to fix this now... Equality testing on string arrays does not work (equality testing uses ufuncs internally which are not supported generally for flexible arrays). You must use chararray's. 
Thus, sta.view(chararray) == stb.view(chararray) Or create chararrays from the beginning: sta = char.array(['abc','def','ghi']) stb = char.array(['abc','jkl','ghi']) Char arrays are a special subclass of the ndarray that give array elements all the methods of strings (and unicode) and allow (rich) comparison operations. -Travis From oliphant at ee.byu.edu Mon Feb 13 14:08:03 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Mon Feb 13 14:08:03 2006 Subject: [Numpy-discussion] Re: indexing problem In-Reply-To: <43F0DE44.9040506@cox.net> References: <43F0DE44.9040506@cox.net> Message-ID: <43F10318.5090507@ee.byu.edu> Tim Hochberg wrote: > > I've been trying to look into the problem described below, but I just > can't find where complex multiplication is being done (all the other > multiplication, but not complex). Could someone with a grasp of the > innards of numpy please point me in the right direction? Look in the build directory for __umath_generated.c. In there you will see that multiplication for complex numbers is done using PyUFunc_FF_F and friends (i.e. using a generic interface for wrapping a "scalar" function). The scalar function wrapped into a ufunc vectorized function is given in multiply_data. In that file you should see it present as nc_prodf, nc_prod, nc_prodl. nc_prod and friends are implemented in umathmodule.c.src -Travis From russel at appliedminds.com Mon Feb 13 14:43:14 2006 From: russel at appliedminds.com (Russel Howe) Date: Mon Feb 13 14:43:14 2006 Subject: [Numpy-discussion] String array equality test does not broadcast In-Reply-To: <43F100F6.1020200@ee.byu.edu> References: <99AE3D0B-E106-4EE0-B70C-E84DE1BCC2C7@appliedminds.com> <43F100F6.1020200@ee.byu.edu> Message-ID: OK, Thanks. Russel From tim.hochberg at cox.net Mon Feb 13 14:52:01 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Mon Feb 13 14:52:01 2006 Subject: [Numpy-discussion] Re: indexing problem In-Reply-To: <43F10318.5090507@ee.byu.edu> References: <43F0DE44.9040506@cox.net> <43F10318.5090507@ee.byu.edu> Message-ID: <43F10D4B.9050501@cox.net> Travis Oliphant wrote: > Tim Hochberg wrote: > >> >> I've been trying to look into the problem described below, but I just >> can't find where complex multiplication is being done (all the other >> multiplication, but not complex). Could someone with a grasp of the >> innards of numpy please point me in the right direction?
> > Look in the build directory for __umath_generated.c. In there you > will see that multiplication for complex numbers is done using > PyUFunc_FF_F and friends (i.e. using a generic interface for wrapping > a "scalar" function). The scalar function wrapped into a ufunc > vectorized function is given in multiply_data. In that file you > should see it present as nc_prodf, nc_prod, nc_prodl. > > nc_prod and friends are implemented in umathmodule.c.src Thanks Travis, it would have taken me a while to track them down. As it turns out I was going off on the wrong track as I'll report in my next message. -tim From tim.hochberg at cox.net Mon Feb 13 15:05:05 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Mon Feb 13 15:05:05 2006 Subject: [Numpy-discussion] Re: indexing problem In-Reply-To: References: <43F0DBA3.9010405@cox.net> Message-ID: <43F11086.308@cox.net> >> >>Ryan Krauss wrote: >> >> >> >>>This may only be a problem for ridiculously large numbers. I actually >>>meant to be dealing with these values: >>> >>>In [75]: d >>>Out[75]: >>>array([ 246.74011003, 986.96044011, 2220.66099025, 3947.84176044, >>> 6168.50275068, 8882.64396098, 12090.26539133, 15791.36704174, >>> 19985.94891221, 24674.01100272]) >>> >>>In [76]: s=d[-1]*1.0j >>> >>>In [77]: s >>>Out[77]: 24674.011002723393j >>> >>>In [78]: type(s) >>>Out[78]: <type 'complex128scalar'> >>> >>>In [79]: s**2 >>>Out[79]: (-608806818.96251547+7.4554869875188623e-08j) >>> >>>So perhaps the previous difference of 26 orders of magnitude really >>>did mean that the imaginary part was negligibly small, that just got >>>obscured by the fact that the real part was order 1e+135. >>> >>>On 2/13/06, Ryan Krauss wrote: >>> >>> I got myself all tied up in a knot over this because I couldn't figure out how multiplying two purely complex numbers was going to result in something with a complex portion. Since I couldn't find the complex routines my imagination went wild: perhaps, I thought, numpy uses the complex multiplication routine that uses 3 multiplies instead of the more straightforward one that uses 4 multiplies, etc, etc. None of these panned out, and of course they all evaporated when I got pointed to the code that implements this which is pure vanilla. All the time I was overlooking the obvious: Ryan is using s**2, not s*s. So the obvious answer is that he's just seeing normal error in the function that is implementing pow. If this inaccuracy is a problem, I'd just replace s**2 with s*s. It will probably be both faster and more accurate anyway. Foolishly, -tim From Chris.Barker at noaa.gov Mon Feb 13 15:07:06 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon Feb 13 15:07:06 2006 Subject: [Numpy-discussion] NumPy Glossary: a request for review In-Reply-To: References: <43ED700B.10903@bigpond.net.au> <43ED7C3D.6070705@bigpond.net.au> Message-ID: <43F110D6.6060302@noaa.gov> Sasha wrote: > I updated the "broadcasting" entry. I don't think examples belong to a > glossary. I think a glossary should be more like a quick reference > rather than a tutorial. Unfortunately the broadcasting is one of > those concepts that will never be clear without examples. Then a wiki page on broadcasting may be in order, and the glossary could link to it. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov
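A page like that could open with a handful of minimal shape-compatibility illustrations; a sketch (shapes arbitrary, comments summarize the rule):

>>> from numpy import *
>>> a = ones((3, 4))
>>> b = arange(4)
>>> (a + b).shape        # (3,4) with (4,): trailing dimensions agree, b repeats along rows
(3, 4)
>>> c = arange(3).reshape(3, 1)
>>> (a + c).shape        # (3,4) with (3,1): a length-1 axis stretches to fit
(3, 4)
>>> a + arange(3)        # (3,4) with (3,): trailing dimensions disagree, raises ValueError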
From ryanlists at gmail.com Mon Feb 13 15:16:03 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Mon Feb 13 15:16:03 2006 Subject: [Numpy-discussion] Re: indexing problem In-Reply-To: <43F11086.308@cox.net> References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> Message-ID: At the risk of sounding silly, can you explain to me in simple terms why s**2 is less accurate than s*s? I can sort of intuitively appreciate that that would be true, but might like just a little more detail. Thanks, Ryan From tim.hochberg at cox.net Mon Feb 13 15:34:01 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Mon Feb 13 15:34:01 2006 Subject: [Numpy-discussion] Re: indexing problem In-Reply-To: References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> Message-ID: <43F11716.9050204@cox.net> Ryan Krauss wrote: >At the risk of sounding silly, can you explain to me in simple terms >why s**2 is less accurate than s*s?
I can sort of intuitively >appreciate that that would be true, but but might like just a little >more detail. > > I don't know that it has to be *less* accurate, although it's unlikely to be more accurate since s*s should be nearly as accurate as you get with floating point. Multiplying two complex numbers in numpy is done in the most straightforward way imaginable: result.real = z1.real*z2.real - z1.imag*z2.imag result.image = z1.real*z2.imag + z1.imag*z2.real The individual results lose very little precision and the overall result will be nearly exact to within the limits of floating point. On the other hand, s**2 is being calculated by a completely different route. Something that will look like: result = pow(s, 2.0) Pow is some general function that computes the value of s to any power. As such it's a lot more complicated than the above simple expression. I don't think that there's any reason in principle that pow(s,2) couldn't be as accurate as s*s, but there is a tradeoff between accuracy, speed and simplicity of implementation. That being said, it may be worthwhile having a look at complex pow and see if there's anything suspicious that might make the error larger than it needs to be. If all of that sounds a little bit like "I really know", there's some of that in there too. Regards, -tim >Thanks, > >Ryan > >On 2/13/06, Tim Hochberg wrote: > > >>>>Ryan Krauss wrote: >>>> >>>> >>>> >>>> >>>> >>>>>This may only be a problem for ridiculously large numbers. I actually >>>>>meant to be dealing with these values: >>>>> >>>>>In [75]: d >>>>>Out[75]: >>>>>array([ 246.74011003, 986.96044011, 2220.66099025, 3947.84176044, >>>>> 6168.50275068, 8882.64396098, 12090.26539133, 15791.36704174, >>>>> 19985.94891221, 24674.01100272]) >>>>> >>>>>In [76]: s=d[-1]*1.0j >>>>> >>>>>In [77]: s >>>>>Out[77]: 24674.011002723393j >>>>> >>>>>In [78]: type(s) >>>>>Out[78]: >>>>> >>>>>In [79]: s**2 >>>>>Out[79]: (-608806818.96251547+7.4554869875188623e-08j) >>>>> >>>>>So perhaps the previous difference of 26 orders of magnitude really >>>>>did mean that the imaginary part was negligibly small, that just got >>>>>obscured by the fact that the real part was order 1e+135. >>>>> >>>>>On 2/13/06, Ryan Krauss wrote: >>>>> >>>>> >>>>> >>>>> >>I got myself all tied up in a knot over this because I couldn't figure >>out how multiplying two purely complex numbers was going to result in >>something with a complex portion. Since I couldn't find the complex >>routines my imagination went wild: perhaps, I thought, numpy uses the >>complex multiplication routine that uses 3 multiplies instead of the >>more straightforward one that uses 4 multiples, etc, etc. None of these >>panned out, and of course they all evaporated when I got pointed to the >>code that implements this which is pure vanilla. All the time I was >>overlooking the obvious: >> >>Ryan is using s**2, not s*s. >> >>So the obvious answer, is that he's just seeing normal error in the >>function that is implementing pow. >> >>If this is inacuracy is problem, I'd just replace s**2 with s*s. It will >>probably be both faster and more accurate anyway >> >>Foolishly, >> >>-tim >> >> >> >> >> >> >> >>------------------------------------------------------- >>This SF.net email is sponsored by: Splunk Inc. Do you grep through log files >>for problems? Stop! Download the new AJAX search engine that makes >>searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! 
>>http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642 >>_______________________________________________ >>Numpy-discussion mailing list >>Numpy-discussion at lists.sourceforge.net >>https://lists.sourceforge.net/lists/listinfo/numpy-discussion >> >> >> > > >------------------------------------------------------- >This SF.net email is sponsored by: Splunk Inc. Do you grep through log files >for problems? Stop! Download the new AJAX search engine that makes >searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! >http://sel.as-us.falkag.net/sel?cmd=k&kid3432&bid#0486&dat1642 >_______________________________________________ >Numpy-discussion mailing list >Numpy-discussion at lists.sourceforge.net >https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > > > From cookedm at physics.mcmaster.ca Mon Feb 13 16:22:05 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Mon Feb 13 16:22:05 2006 Subject: [Numpy-discussion] Re: indexing problem In-Reply-To: <43F11716.9050204@cox.net> (Tim Hochberg's message of "Mon, 13 Feb 2006 16:32:38 -0700") References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> Message-ID: Tim Hochberg writes: > Ryan Krauss wrote: > >>At the risk of sounding silly, can you explain to me in simple terms >>why s**2 is less accurate than s*s. I can sort of intuitively >>appreciate that that would be true, but but might like just a little >>more detail. >> >> > I don't know that it has to be *less* accurate, although it's unlikely > to be more accurate since s*s should be nearly as accurate as you get > with floating point. Multiplying two complex numbers in numpy is done > in the most straightforward way imaginable: > > result.real = z1.real*z2.real - z1.imag*z2.imag > result.image = z1.real*z2.imag + z1.imag*z2.real > > The individual results lose very little precision and the overall > result will be nearly exact to within the limits of floating point. > > On the other hand, s**2 is being calculated by a completely different > route. Something that will look like: > > result = pow(s, 2.0) > > Pow is some general function that computes the value of s to any > power. As such it's a lot more complicated than the above simple > expression. I don't think that there's any reason in principle that > pow(s,2) couldn't be as accurate as s*s, but there is a tradeoff > between accuracy, speed and simplicity of implementation. On a close tangent, I had a patch at one point for Numeric (never committed) that did pow(s, 2.0) (= s**2) actually as s*s at the C level (no pow), which helped a lot in speed (currently, s**2 is slower than s*s). I should have another look at that. The difference is speed is pretty bad: for an array of 100 complex elements, s**2 is 68.4 usec/loop as opposed to s*s with 4.13 usec/loop on my machine. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From oliphant at ee.byu.edu Mon Feb 13 16:34:07 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Mon Feb 13 16:34:07 2006 Subject: [Numpy-discussion] Re: ***[Possible UCE]*** Test units on unicode types In-Reply-To: <1139858712.7532.33.camel@localhost.localdomain> References: <1139858712.7532.33.camel@localhost.localdomain> Message-ID: <43F12540.4050006@ee.byu.edu> Francesc Altet wrote: >Hi Travis, > >I've finished a series of tests on your recent new implementation of >unicode types in NumPy. 
They discovered a couple of issues in Numpy: one >is clearly a bug that show up in UCS2 builds (see the patch attached). >The other, well, it is not clear to me if it is a bug or not: > > Thanks very much for these tests... They are very, very useful. I recently realized that the getitem material must make copies for misaligned data because the convert to UCS2 functions expect aligned data (on Solaris anyway it would cause a segfault). You caught an obvious mistake in that code. >>>>ia1=numpy.array([1]) >>>>type(ia1) >>>> >>>> > > > >>>>type(ia1.view()) >>>> >>>> > > >However, for 0-dimensional arrays: > > > >>>ia0=numpy.array(1) >>>type(ia0) >>> >>> Francesc Altet wrote: >>>type(ia0.view()) >>> >>> !!!!!! Do you think the next this is a bug or a feature? My opinion is that it is a bug, but maybe I'm wrong. In fact, this has a very bad effect on unicode objects in UCS2 interpreters: Almost all of the methods right now, return scalars instead of 0-dimensional arrays on purpose. This was intentional because 0-dimensional arrays were not supposed to be handed around in Python. But, we were unable to completely eliminate them at this point. So, I suppose, though there are a select few methods that should not automatically convert 0-dimensional arrays to the equivalent scalar. .copy() is one of them *already changed* .view() is probably another *should be changed*. If you can think of other methods that should not return scalars instead of 0-dimensional arrays, post it. -Travis From wbaxter at gmail.com Mon Feb 13 16:35:03 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Mon Feb 13 16:35:03 2006 Subject: [Numpy-discussion] matplotlib In-Reply-To: References: <87vevj4818.fsf@peds-pc311.bsd.uchicago.edu> <87pslr2shy.fsf@peds-pc311.bsd.uchicago.edu> Message-ID: Sorry, I wasn't very clear. My thinking was like this: - matplotlib web pages don't mention support for numpy anywhere, just numeric and numarray - matplotlib web page says that the default is to use numeric - numpy is basically the successor to numeric plus numarray functionality - conclusion: if matplotlib actually does support numpy, and the web pages are just out of date, then probably numpy would now be the default instead of numeric, since it is the successor to numeric. But apparently there's a flaw in that thinking somewhere. --bb On 2/14/06, Ryan Krauss wrote: > > The point of the numerix setting in the rc file is that matplotlib > can't tell you what is the best numerical package to use for your > problem. > > On 2/13/06, Bill Baxter wrote: > > Ah, ok. You're right. Doing from pylab import * was actually just > > overwriting the definition of array and rand() to be those from Numeric, > > which pylab was picking to use by default. I guess my expectation was > that > > pylab would default to using the best numerical package installed. > > > > With "numerix : numpy" in my ~/.matplotlib/matplotlibrc file, it seems > to be > > working properly now. > > > > Thanks for the help! > > > > --bb > > > > On 2/14/06, John Hunter wrote: > > > >>>>> "Bill" == Bill Baxter writes: > > > > > > Bill> from numpy import * was the only line missing, called before > > > Bill> the rest. It seems to work fine if I use from pylab import > > > Bill> * instead of import pylab as g > > > > > > Bill> And actually if I do both in this order: import pylab as g > > > Bill> from pylab import * > > > > > > Bill> Seems as if there's some > > > Bill> initialization code that only gets run with the 'from pylab > > > Bill> import *' version. 
> > > > > > As far as I know that is a python impossibility, unless perhaps you do > > > some deep dark magic that is beyond my grasp. pylab doesn't know how > > > it is imported. > > > > > > Are you sure you have your numerix set properly? I suggest creating > > > two free standing scripts, one with the problem and one without, and > > > running both with --verbose-helpful to make sure that your settings > > > are what you think they are. If you verify that numerix is set > > > properly and still see the problem, I would like to see both scripts > > > in case it is exposing a problem with matplotlib. > > > > > > Of course, doing multiple import * commands is a recipe for long term > > > pain, especially with packages that have so much overlapping namespace > > > and numpy/scipy/pylab. > > > > > > JDH > > > > > > > > > > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through log > files > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! > http://sel.as-us.falkag.net/sel?cmdlnk&kid3432&bid#0486&dat1642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > -- William V. Baxter III OLM Digital Kono Dens Building Rm 302 1-8-8 Wakabayashi Setagaya-ku Tokyo, Japan 154-0023 +81 (3) 3422-3380 -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndarray at mac.com Mon Feb 13 17:46:18 2006 From: ndarray at mac.com (Sasha) Date: Mon Feb 13 17:46:18 2006 Subject: [Numpy-discussion] NumPy Glossary: a request for review In-Reply-To: <43F110D6.6060302@noaa.gov> References: <43ED700B.10903@bigpond.net.au> <43ED7C3D.6070705@bigpond.net.au> <43F110D6.6060302@noaa.gov> Message-ID: On 2/13/06, Christopher Barker wrote: > Then a wiki page on broadcasting may be in order, and the glossary could > link to it. I don't think a glossary should link to anything. I envisioned the glossary as a way to resolve ambiguities for people who already know more than one meaning of the terms. However, if others think a link to more detailed explanation belongs to glossary entries, the natural destination of the link would be a page in Travis' book. Travis, can you suggest a future-proof way to refer to a page in your book? From tim.hochberg at cox.net Mon Feb 13 17:47:04 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Mon Feb 13 17:47:04 2006 Subject: [Numpy-discussion] Re: indexing problem In-Reply-To: <43F11716.9050204@cox.net> References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> Message-ID: <43F13655.5010907@cox.net> Tim Hochberg wrote: > Ryan Krauss wrote: > >> At the risk of sounding silly, can you explain to me in simple terms >> why s**2 is less accurate than s*s. I can sort of intuitively >> appreciate that that would be true, but but might like just a little >> more detail. >> >> > I don't know that it has to be *less* accurate, although it's unlikely > to be more accurate since s*s should be nearly as accurate as you get > with floating point. 
Multiplying two complex numbers in numpy is done > in the most straightforward way imaginable: > > result.real = z1.real*z2.real - z1.imag*z2.imag > result.image = z1.real*z2.imag + z1.imag*z2.real > > The individual results lose very little precision and the overall > result will be nearly exact to within the limits of floating point. > > On the other hand, s**2 is being calculated by a completely different > route. Something that will look like: > > result = pow(s, 2.0) > > Pow is some general function that computes the value of s to any > power. As such it's a lot more complicated than the above simple > expression. I don't think that there's any reason in principle that > pow(s,2) couldn't be as accurate as s*s, but there is a tradeoff > between accuracy, speed and simplicity of implementation. > > That being said, it may be worthwhile having a look at complex pow and > see if there's anything suspicious that might make the error larger > than it needs to be. > > If all of that sounds a little bit like "I really know", there's some > of that in there too. To add a little more detail to this, the formula that numpy uses to compute a**b is: exp(b*log(a)) where: log(x) = log(|x|) + 1j*arctan2(x.imag, x.real) exp(x) = exp(x.real) * (cos(x.imag) + 1j*sin(x.imag)) With these definitions in hand, it should be apparent what's happening when *a* = |a|j and *b* = 2. First, let's compute 2*log(a) for *a* = 24674.011002723393j: 2*log(a) = (20.227011565110185+3.1415926535897931j) Now it's clear what's happening: ideally the sine of the imaginary part of the above number should be zero. However: sin(3.1415926535897931) = 1.2246063538223773e-016 And this in turn leads to the error we see here. -tim > > Regards, > > -tim > > >> Thanks, >> >> Ryan >> >> On 2/13/06, Tim Hochberg wrote: >> >> >>>>> Ryan Krauss wrote: >>>>> >>>>> >>>>> >>>>> >>>>>> This may only be a problem for ridiculously large numbers. I >>>>>> actually >>>>>> meant to be dealing with these values: >>>>>> >>>>>> In [75]: d >>>>>> Out[75]: >>>>>> array([ 246.74011003, 986.96044011, 2220.66099025, >>>>>> 3947.84176044, >>>>>> 6168.50275068, 8882.64396098, 12090.26539133, >>>>>> 15791.36704174, >>>>>> 19985.94891221, 24674.01100272]) >>>>>> >>>>>> In [76]: s=d[-1]*1.0j >>>>>> >>>>>> In [77]: s >>>>>> Out[77]: 24674.011002723393j >>>>>> >>>>>> In [78]: type(s) >>>>>> Out[78]: >>>>>> >>>>>> In [79]: s**2 >>>>>> Out[79]: (-608806818.96251547+7.4554869875188623e-08j) >>>>>> >>>>>> So perhaps the previous difference of 26 orders of magnitude really >>>>>> did mean that the imaginary part was negligibly small, that just got >>>>>> obscured by the fact that the real part was order 1e+135. >>>>>> >>>>>> On 2/13/06, Ryan Krauss wrote: >>>>>> >>>>>> >>>>>> >>>>> >>> I got myself all tied up in a knot over this because I couldn't figure >>> out how multiplying two purely complex numbers was going to result in >>> something with a complex portion. Since I couldn't find the complex >>> routines my imagination went wild: perhaps, I thought, numpy uses the >>> complex multiplication routine that uses 3 multiplies instead of the >>> more straightforward one that uses 4 multiples, etc, etc. None of these >>> panned out, and of course they all evaporated when I got pointed to the >>> code that implements this which is pure vanilla. All the time I was >>> overlooking the obvious: >>> >>> Ryan is using s**2, not s*s. >>> >>> So the obvious answer, is that he's just seeing normal error in the >>> function that is implementing pow.
>>> >>> If this is inacuracy is problem, I'd just replace s**2 with s*s. It >>> will >>> probably be both faster and more accurate anyway >>> >>> Foolishly, >>> >>> -tim >>> >>> >>> >>> >>> >>> >>> >>> ------------------------------------------------------- >>> This SF.net email is sponsored by: Splunk Inc. Do you grep through >>> log files >>> for problems? Stop! Download the new AJAX search engine that makes >>> searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! >>> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642 >>> >>> _______________________________________________ >>> Numpy-discussion mailing list >>> Numpy-discussion at lists.sourceforge.net >>> https://lists.sourceforge.net/lists/listinfo/numpy-discussion >>> >>> >> >> >> >> ------------------------------------------------------- >> This SF.net email is sponsored by: Splunk Inc. Do you grep through >> log files >> for problems? Stop! Download the new AJAX search engine that makes >> searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! >> http://sel.as-us.falkag.net/sel?cmd=k&kid3432&bid#0486&dat1642 >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at lists.sourceforge.net >> https://lists.sourceforge.net/lists/listinfo/numpy-discussion >> >> >> >> > > > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through log > files > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > From wbaxter at gmail.com Mon Feb 13 17:49:15 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Mon Feb 13 17:49:15 2006 Subject: [Numpy-discussion] avoiding the matrix copy performance hit Message-ID: Is there anyway to get around this timing difference? * >>> import timeit ** >>> t1 = timeit.Timer("a = zeros((1000,1000),'d'); a += 1.;", 'from numpy import zeros,mat') ** >>> t2 = timeit.Timer("a = mat(zeros((1000,1000),'d')); a += 1.;", 'from numpy import zeros,mat') >>> **t1.timeit(100) 1.8391627591141742 >>> t2.timeit(100) 3.2988266117713465 *Copying all the data of the input array seems wasteful when the array is just going to go out of scope. Or is this not something to be concerned about? It seems like a copy-by-reference version of mat() would be useful. Really I can't imagine any case when I'd want both a matrix and the original version of the array both hanging around as separate copies. I can imagine either 1) the array is just a temp and I won't ever need it again or 2) temporarily wanting a "matrix view" on the array's data to do some linalg, after which I'll go back to using the original (now modified) array as an array again. --bill -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.hochberg at cox.net Mon Feb 13 18:02:02 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Mon Feb 13 18:02:02 2006 Subject: [Numpy-discussion] avoiding the matrix copy performance hit In-Reply-To: References: Message-ID: <43F139CD.8010407@cox.net> Bill Baxter wrote: > Is there anyway to get around this timing difference? 
> * > >>> import timeit > ** >>> t1 = timeit.Timer("a = zeros((1000,1000),'d'); a += 1.;", > 'from numpy import zeros,mat') > ** >>> t2 = timeit.Timer("a = mat(zeros((1000,1000),'d')); a += > 1.;", 'from numpy import zeros,mat') > >>> **t1.timeit(100) > 1.8391627591141742 > >>> t2.timeit(100) > 3.2988266117713465 > > *Copying all the data of the input array seems wasteful when the array > is just going to go out of scope. Or is this not something to be > concerned about? You could try using copy=False: >>> import timeit >>> t1 = timeit.Timer("a = zeros((1000,1000),'d'); a += 1.;", 'from numpy import zeros,mat') >>> t2 = timeit.Timer("a = mat(zeros((1000,1000),'d'), copy=False); a += 1.;", 'from numpy import z eros,mat') >>> t1.timeit(100) 3.6538127052460578 >>> t2.timeit(100) 3.6567186611706237 I'd also like to point out that your computer appears to be much faster than mine. -tim > > It seems like a copy-by-reference version of mat() would be useful. > Really I can't imagine any case when I'd want both a matrix and the > original version of the array both hanging around as separate copies. > I can imagine either 1) the array is just a temp and I won't ever need > it again or 2) temporarily wanting a "matrix view" on the array's data > to do some linalg, after which I'll go back to using the original (now > modified) array as an array again. > > --bill From tim.hochberg at cox.net Mon Feb 13 18:07:02 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Mon Feb 13 18:07:02 2006 Subject: [Numpy-discussion] Re: indexing problem In-Reply-To: References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> Message-ID: <43F13B20.3000301@cox.net> David M. Cooke wrote: >Tim Hochberg writes: > > > >>Ryan Krauss wrote: >> >> >> >>>At the risk of sounding silly, can you explain to me in simple terms >>>why s**2 is less accurate than s*s. I can sort of intuitively >>>appreciate that that would be true, but but might like just a little >>>more detail. >>> >>> >>> >>> >>I don't know that it has to be *less* accurate, although it's unlikely >>to be more accurate since s*s should be nearly as accurate as you get >>with floating point. Multiplying two complex numbers in numpy is done >>in the most straightforward way imaginable: >> >> result.real = z1.real*z2.real - z1.imag*z2.imag >> result.image = z1.real*z2.imag + z1.imag*z2.real >> >>The individual results lose very little precision and the overall >>result will be nearly exact to within the limits of floating point. >> >>On the other hand, s**2 is being calculated by a completely different >>route. Something that will look like: >> >> result = pow(s, 2.0) >> >>Pow is some general function that computes the value of s to any >>power. As such it's a lot more complicated than the above simple >>expression. I don't think that there's any reason in principle that >>pow(s,2) couldn't be as accurate as s*s, but there is a tradeoff >>between accuracy, speed and simplicity of implementation. >> >> > >On a close tangent, I had a patch at one point for Numeric (never >committed) that did pow(s, 2.0) (= s**2) actually as s*s at the C level (no >pow), which helped a lot in speed (currently, s**2 is slower than s*s). > >I should have another look at that. The difference is speed is pretty >bad: for an array of 100 complex elements, s**2 is 68.4 usec/loop as >opposed to s*s with 4.13 usec/loop on my machine. > > Python's complex object also special cases integer powers. 
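Both the speed gap and the accuracy gap are easy to check from Python; a minimal sketch along these lines (the array size and loop count are arbitrary, and the figures will of course vary by machine):

    import timeit

    setup = "import numpy; s = (numpy.arange(100) + 1) * 1.0j"
    n = 1000
    t_pow = timeit.Timer("s**2", setup).timeit(n)
    t_mul = timeit.Timer("s*s", setup).timeit(n)
    print "s**2: %.3g s   s*s: %.3g s" % (t_pow, t_mul)
    # For the accuracy side: s*s of a purely imaginary s has an exactly
    # zero imaginary part, while the pow route leaves a small residual,
    # so comparing abs((s*s).imag).max() with abs((s**2).imag).max()
    # shows the difference directly.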
Which is why you won't see the inaccuracy that started this thread using basic complex objects. However, I'm not convinced this is a good idea for numpy. This would introduce a discontinuity in a**b that could cause problems in some cases. If, for instance, one were running an iterative solver of some sort (something I've been known to do), and b was a free variable, it could get stuck at b = 2 since things would go nonmonotonic there. I would recomend that we just prominently document that x*x is faster and more accurate than x**2 and that people should use x*x where that's a concern. -tim From ryanlists at gmail.com Mon Feb 13 18:11:01 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Mon Feb 13 18:11:01 2006 Subject: [Numpy-discussion] Re: indexing problem In-Reply-To: <43F13B20.3000301@cox.net> References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> <43F13B20.3000301@cox.net> Message-ID: I agree. I already made that change in my code. Ryan On 2/13/06, Tim Hochberg wrote: > David M. Cooke wrote: > > >Tim Hochberg writes: > > > > > > > >>Ryan Krauss wrote: > >> > >> > >> > >>>At the risk of sounding silly, can you explain to me in simple terms > >>>why s**2 is less accurate than s*s. I can sort of intuitively > >>>appreciate that that would be true, but but might like just a little > >>>more detail. > >>> > >>> > >>> > >>> > >>I don't know that it has to be *less* accurate, although it's unlikely > >>to be more accurate since s*s should be nearly as accurate as you get > >>with floating point. Multiplying two complex numbers in numpy is done > >>in the most straightforward way imaginable: > >> > >> result.real = z1.real*z2.real - z1.imag*z2.imag > >> result.image = z1.real*z2.imag + z1.imag*z2.real > >> > >>The individual results lose very little precision and the overall > >>result will be nearly exact to within the limits of floating point. > >> > >>On the other hand, s**2 is being calculated by a completely different > >>route. Something that will look like: > >> > >> result = pow(s, 2.0) > >> > >>Pow is some general function that computes the value of s to any > >>power. As such it's a lot more complicated than the above simple > >>expression. I don't think that there's any reason in principle that > >>pow(s,2) couldn't be as accurate as s*s, but there is a tradeoff > >>between accuracy, speed and simplicity of implementation. > >> > >> > > > >On a close tangent, I had a patch at one point for Numeric (never > >committed) that did pow(s, 2.0) (= s**2) actually as s*s at the C level (no > >pow), which helped a lot in speed (currently, s**2 is slower than s*s). > > > >I should have another look at that. The difference is speed is pretty > >bad: for an array of 100 complex elements, s**2 is 68.4 usec/loop as > >opposed to s*s with 4.13 usec/loop on my machine. > > > > > Python's complex object also special cases integer powers. Which is why > you won't see the inaccuracy that started this thread using basic > complex objects. > > However, I'm not convinced this is a good idea for numpy. This would > introduce a discontinuity in a**b that could cause problems in some > cases. If, for instance, one were running an iterative solver of some > sort (something I've been known to do), and b was a free variable, it > could get stuck at b = 2 since things would go nonmonotonic there. I > would recomend that we just prominently document that x*x is faster and > more accurate than x**2 and that people should use x*x where that's a > concern. 
> > -tim > > > > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through log files > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From ryanlists at gmail.com Mon Feb 13 18:14:02 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Mon Feb 13 18:14:02 2006 Subject: [Numpy-discussion] avoiding the matrix copy performance hit In-Reply-To: <43F139CD.8010407@cox.net> References: <43F139CD.8010407@cox.net> Message-ID: You both seem to have cooler computers than I do: In [19]: t1.timeit(100) Out[19]: 4.9827449321746826 In [20]: t2.timeit(100) Out[20]: 4.9990239143371582 On 2/13/06, Tim Hochberg wrote: > Bill Baxter wrote: > > > Is there anyway to get around this timing difference? > > * > > >>> import timeit > > ** >>> t1 = timeit.Timer("a = zeros((1000,1000),'d'); a += 1.;", > > 'from numpy import zeros,mat') > > ** >>> t2 = timeit.Timer("a = mat(zeros((1000,1000),'d')); a += > > 1.;", 'from numpy import zeros,mat') > > >>> **t1.timeit(100) > > 1.8391627591141742 > > >>> t2.timeit(100) > > 3.2988266117713465 > > > > *Copying all the data of the input array seems wasteful when the array > > is just going to go out of scope. Or is this not something to be > > concerned about? > > You could try using copy=False: > > >>> import timeit > >>> t1 = timeit.Timer("a = zeros((1000,1000),'d'); a += 1.;", 'from > numpy import zeros,mat') > >>> t2 = timeit.Timer("a = mat(zeros((1000,1000),'d'), copy=False); a > += 1.;", 'from numpy import z > eros,mat') > >>> t1.timeit(100) > 3.6538127052460578 > >>> t2.timeit(100) > 3.6567186611706237 > > I'd also like to point out that your computer appears to be much faster > than mine. > > -tim > > > > > > It seems like a copy-by-reference version of mat() would be useful. > > Really I can't imagine any case when I'd want both a matrix and the > > original version of the array both hanging around as separate copies. > > I can imagine either 1) the array is just a temp and I won't ever need > > it again or 2) temporarily wanting a "matrix view" on the array's data > > to do some linalg, after which I'll go back to using the original (now > > modified) array as an array again. > > > > --bill > > > > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through log files > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! 
> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From gruben at bigpond.net.au Mon Feb 13 18:19:01 2006 From: gruben at bigpond.net.au (Gary Ruben) Date: Mon Feb 13 18:19:01 2006 Subject: [Numpy-discussion] Re: indexing problem In-Reply-To: <43F13B20.3000301@cox.net> References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> <43F13B20.3000301@cox.net> Message-ID: <43F13DE0.8040309@bigpond.net.au> Tim Hochberg wrote: > However, I'm not convinced this is a good idea for numpy. This would > introduce a discontinuity in a**b that could cause problems in some > cases. If, for instance, one were running an iterative solver of some > sort (something I've been known to do), and b was a free variable, it > could get stuck at b = 2 since things would go nonmonotonic there. I don't quite understand the problem here. Tim says Python special cases integer powers but then talks about the problem when b is a floating type. I think special casing x**2 and maybe even x**3 when the power is an integer is still a good idea. Gary R. From wbaxter at gmail.com Mon Feb 13 18:27:04 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Mon Feb 13 18:27:04 2006 Subject: [Numpy-discussion] avoiding the matrix copy performance hit In-Reply-To: <43F139CD.8010407@cox.net> References: <43F139CD.8010407@cox.net> Message-ID: On 2/14/06, Tim Hochberg wrote: > > Bill Baxter wrote: > > > *Copying all the data of the input array seems wasteful when the array > > is just going to go out of scope. Or is this not something to be > > concerned about? > > You could try using copy=False: Lovely. That does the trick. And the syntax isn't so bad after defining a little helper like: def matr(a): return mat(a,copy=False) >>> t1.timeit(100) > 3.6538127052460578 > >>> t2.timeit(100) > 3.6567186611706237 > > I'd also like to point out that your computer appears to be much faster > than mine. Duly noted. :-) -tim --Bill -------------- next part -------------- An HTML attachment was scrubbed... URL: From wbaxter at gmail.com Mon Feb 13 18:55:01 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Mon Feb 13 18:55:01 2006 Subject: [Numpy-discussion] number ranges (was Re: Matlab page on scipy wiki) Message-ID: On 2/11/06, Gary Ruben wrote: > > Sasha wrote: > > On 2/10/06, Gary Ruben wrote: > >> ... I must say that Travis's > >> example numpy.r_[1,0,1:5,0,1] highlights my pet hate with python - that > >> the upper limit on an integer range is non-inclusive. > > > > In this case you must hate that an integer range starts at 0 (I don't > > think you would want len(range(10)) to be 11). First, I think the range() function in python is ugly to begin with. Why can't python just support range notation directly like 'for a in 0:10'. Or with 0..10 or 0...10 syntax. That seems to make a lot more sense to me than having to call a named function. Anyway, that's a python pet peeve, and python's probably not going to change something so fundamental... Second, sometimes zero-based, non-inclusive ranges are handy, and sometimes one-based inclusive ranges are handy. For array indexing, I personally like zero based. But sometimes I just want a list of N numbers like a human would write it, from 1 to N, and in those cases it seems really odd for N+1 to show up. This is a place where numpy could do something. 
I think it would be nice if numpy had something like an 'irange' (inclusive range) function to complement the 'arange' function. They would act pretty much the same, except irange(5) would return [1,2,3,4,5], and irange(1,5) would return [1,2,3,4,5]. Anyway, I think I'm going to put a little irange function in my setup. --Bill -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.hochberg at cox.net Mon Feb 13 19:20:02 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Mon Feb 13 19:20:02 2006 Subject: [Numpy-discussion] number ranges (was Re: Matlab page on scipy wiki) In-Reply-To: References: Message-ID: <43F14C18.1040600@cox.net> Bill Baxter wrote: > On 2/11/06, *Gary Ruben* > wrote: > > Sasha wrote: > > On 2/10/06, Gary Ruben > wrote: > >> ... I must say that Travis's > >> example numpy.r_[1,0,1:5,0,1] highlights my pet hate with > python - that > >> the upper limit on an integer range is non-inclusive. > > > > In this case you must hate that an integer range starts at 0 (I > don't > > think you would want len(range(10)) to be 11). > > > First, I think the range() function in python is ugly to begin with. > Why can't python just support range notation directly like 'for a in > 0:10'. Or with 0..10 or 0...10 syntax. That seems to make a lot more > sense to me than having to call a named function. Anyway, that's a > python pet peeve, and python's probably not going to change something > so fundamental... > > Second, sometimes zero-based, non-inclusive ranges are handy, and > sometimes one-based inclusive ranges are handy. For array indexing, I > personally like zero based. But sometimes I just want a list of N > numbers like a human would write it, from 1 to N, and in those cases > it seems really odd for N+1 to show up. > > This is a place where numpy could do something. I think it would be > nice if numpy had something like an 'irange' (inclusive range) > function to complement the 'arange' function. They would act pretty > much the same, except irange(5) would return [1,2,3,4,5], and > irange(1,5) would return [1,2,3,4,5]. > > Anyway, I think I'm going to put a little irange function in my setup. FWIW, I'd recomend a different name. irange sounds like it belongs in the itertools module with ifilter, islice, izip, etc. Perhaps, rangei would work, although admittedly it's harder to see. Maybe crange for closed range (versus half-open range)? I dunno, but irange seems like it's gonna confuse someone, if not you, then other people who end up looking at your code. -tim From jdhunter at ace.bsd.uchicago.edu Mon Feb 13 19:38:03 2006 From: jdhunter at ace.bsd.uchicago.edu (John Hunter) Date: Mon Feb 13 19:38:03 2006 Subject: [Numpy-discussion] number ranges In-Reply-To: (Bill Baxter's message of "Tue, 14 Feb 2006 11:54:15 +0900") References: Message-ID: <87zmkuwrgt.fsf@peds-pc311.bsd.uchicago.edu> >>>>> "Bill" == Bill Baxter writes: Bill> This is a place where numpy could do something. I think it Bill> would be nice if numpy had something like an 'irange' Bill> (inclusive range) function to complement the 'arange' In my view, this goes a bit against the spirit of the Zen of python ('import this') There should be one -- and preferably only one -- obvious way to do it. since there is an obvious way to get the range 1..6. JDH From cjw at sympatico.ca Mon Feb 13 19:41:06 2006 From: cjw at sympatico.ca (Colin J. 
Williams) Date: Mon Feb 13 19:41:06 2006 Subject: [Numpy-discussion] Matlab page on scipy wiki In-Reply-To: References: <20060210122242.GA21950@sun.ac.za> <43ED10D0.1000405@ee.byu.edu> Message-ID: <43F15148.3060304@sympatico.ca> Sasha wrote: >Actually, what would be wrong with a single letter "c" or "r" for the >concatenator? NumPy already has one single-letter global identifier - >"e", so it will not be against any naming standard. I don't think >either "c" or "r" will conflict with anything in the standard library. > I would still prefer "c" because "r" is taken by RPy. > > It seems to me that a single letter would only be appropriate for a a function which has a very high frequency of use. I used M for the matrix constructor for my numarray based package. Why not rowCat for row catenate or colCat for column catentate - I've never understood why concatentate is used more commonly. Colin W. > >On 2/10/06, Sasha wrote: > > >>To tell you the truth I dislike trailing underscore much more than the >>choice of letter. In my code I will probably be renaming all these >>foo_ to delete the underscore foo_(...) or foo_[...] is way too ugly >>for my taste. However I fully admit that it is just a matter of taste >>and it is trivial to rename things on import in Python. >> >>PS: Trailing underscore reminds me of C++ - the language that I >>happily live without :-) >> >>On 2/10/06, Ryan Krauss wrote: >> >> >>>The problem is that c_ at least used to mean "column concatenate" and >>>concatenate is too long to type. >>> >>>On 2/10/06, Sasha wrote: >>> >>> >>>>On 2/10/06, Travis Oliphant wrote: >>>> >>>> >>>>>The whole point of r_ is to allow you to use slice notation to build >>>>>ranges easily. I wrote it precisely to make it easier to construct >>>>>arrays in a simliar style that Matlab allows. >>>>> >>>>> >>>>Maybe it is just me, but r_ is rather unintuitive. I would expect >>>>something like this to be called "c" for "combine" or "concatenate." >>>>This is the name used by S+ and R. >>>> >>>>From R manual: >>>>""" >>>>c package:base R Documentation >>>>Combine Values into a Vector or List >>>>... >>>>Examples: >>>> c(1,7:9) >>>>... >>>>""" >>>> >>>> >>>>------------------------------------------------------- >>>>This SF.net email is sponsored by: Splunk Inc. Do you grep through log files >>>>for problems? Stop! Download the new AJAX search engine that makes >>>>searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! >>>>http://sel.as-us.falkag.net/sel?cmdlnk&kid3432&bid#0486&dat1642 >>>>_______________________________________________ >>>>Numpy-discussion mailing list >>>>Numpy-discussion at lists.sourceforge.net >>>>https://lists.sourceforge.net/lists/listinfo/numpy-discussion >>>> >>>> >>>> > > >------------------------------------------------------- >This SF.net email is sponsored by: Splunk Inc. Do you grep through log files >for problems? Stop! Download the new AJAX search engine that makes >searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! >http://sel.as-us.falkag.net/sel?cmd=k&kid3432&bid#0486&dat1642 >_______________________________________________ >Numpy-discussion mailing list >Numpy-discussion at lists.sourceforge.net >https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > > From cjw at sympatico.ca Mon Feb 13 19:55:02 2006 From: cjw at sympatico.ca (Colin J. 
Williams) Date: Mon Feb 13 19:55:02 2006 Subject: [Numpy-discussion] dtype names and titles In-Reply-To: <43EE63E1.10601@ieee.org> References: <43EDDC6C.6040005@web.de> <43EE63E1.10601@ieee.org> Message-ID: <43F15469.10209@sympatico.ca> Travis Oliphant wrote: > N. Volbers wrote: > >> I continue to learn all about the heterogeneous arrays... >> >> When I was reading through the records.py code I discovered that >> besides the 'names' and 'formats' for the fields of a numpy array you >> can also specify 'titles'. Playing around with this feature I >> discovered a bug: >> >> >>> import numpy >> >>> mydata = [(1,1), (2,4), (3,9)] >> >>> mytype = {'names': ['col1','col2'], 'formats':['i2','f4'], >> 'titles': ['col2', 'col1']} >> >>> b = numpy.array( mydata, dtype=mytype) >> >>> print b >> [(1.0, 1.0) (4.0, 4.0) (9.0, 9.0)] >> >> This seems to be caused by the fact that you can access a field by >> both the name and the field title. Why would you want to have two >> names anyway? > > > This lets you use attribute look up on the names but have the titles > be the "true name" of the field. Isn't it better to use the name as the identifier and the title as an external label? e.g. As a column heading when pretty-printing. It seems to me that permitting either the name or the title as an object accessor is potentially confusing. Colin W. > > I've fixed this in SVN, so that it raises an error when the titles > have the same names as the columns. > > Thanks for the test. > > > -Travis > > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through log > files > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From cjw at sympatico.ca Mon Feb 13 19:56:09 2006 From: cjw at sympatico.ca (Colin J. Williams) Date: Mon Feb 13 19:56:09 2006 Subject: [Numpy-discussion] More numpy and Numeric differences In-Reply-To: <43EE6AAB.9010408@ieee.org> References: <43EE6AAB.9010408@ieee.org> Message-ID: <43F154CF.7070403@sympatico.ca> Travis Oliphant wrote: > Pearu Peterson wrote: > >> >> I have created a wiki page >> >> http://scipy.org/PearuPeterson/NumpyVersusNumeric >> >> that reports my findings on how numpy and Numeric behave on various >> corner cases. Travis O., could you take a look at it? >> Here is the most recent addition: >> > I fixed the put issue. The problem with clip is actually in choose > (clip is just a specific application of choose). > The problem is in PyArray_ConvertToCommonType. You have an integer > array, an integer scalar, and a floating-point scalar. > I think the rules implemented in PyArray_ConvertToCommonType are not > allowing the scalar to dictate anything. But, this should clearly be > changed to allow scalars of different "kinds" to up-cast the array. > This would be consistent with the umath module. > > So, PyArray_ConvertToCommonType needs to be improved. This will have > an impact on several other functions that use this C-API. > > -Travis > A numarray vs numpy would be helpful for some of us. Colin W. From cjw at sympatico.ca Mon Feb 13 20:09:05 2006 From: cjw at sympatico.ca (Colin J. 
Williams) Date: Mon Feb 13 20:09:05 2006 Subject: [Numpy-discussion] number ranges (was Re: Matlab page on scipy wiki) In-Reply-To: <43F14C18.1040600@cox.net> References: <43F14C18.1040600@cox.net> Message-ID: <43F157C0.1040209@sympatico.ca> Tim Hochberg wrote: > Bill Baxter wrote: > >> On 2/11/06, *Gary Ruben* > > wrote: >> >> Sasha wrote: >> > On 2/10/06, Gary Ruben > > wrote: >> >> ... I must say that Travis's >> >> example numpy.r_[1,0,1:5,0,1] highlights my pet hate with >> python - that >> >> the upper limit on an integer range is non-inclusive. >> > >> > In this case you must hate that an integer range starts at 0 (I >> don't >> > think you would want len(range(10)) to be 11). >> >> >> First, I think the range() function in python is ugly to begin with. >> Why can't python just support range notation directly like 'for a in >> 0:10'. Or with 0..10 or 0...10 syntax. That seems to make a lot >> more sense to me than having to call a named function. Anyway, >> that's a python pet peeve, and python's probably not going to change >> something so fundamental... >> >> Second, sometimes zero-based, non-inclusive ranges are handy, and >> sometimes one-based inclusive ranges are handy. For array indexing, >> I personally like zero based. But sometimes I just want a list of N >> numbers like a human would write it, from 1 to N, and in those cases >> it seems really odd for N+1 to show up. >> >> This is a place where numpy could do something. I think it would be >> nice if numpy had something like an 'irange' (inclusive range) >> function to complement the 'arange' function. They would act pretty >> much the same, except irange(5) would return [1,2,3,4,5], and >> irange(1,5) would return [1,2,3,4,5]. >> >> Anyway, I think I'm going to put a little irange function in my setup. > > > > FWIW, I'd recomend a different name. irange sounds like it belongs in > the itertools module with ifilter, islice, izip, etc. Perhaps, rangei > would work, although admittedly it's harder to see. Maybe crange for > closed range (versus half-open range)? I dunno, but irange seems like > it's gonna confuse someone, if not you, then other people who end up > looking at your code. > > -tim Wouldn't it be nice if we could express range(a, b, c) as a:b:c? Colin W. From cookedm at physics.mcmaster.ca Mon Feb 13 20:14:01 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Mon Feb 13 20:14:01 2006 Subject: [Numpy-discussion] Re: indexing problem In-Reply-To: <43F13DE0.8040309@bigpond.net.au> (Gary Ruben's message of "Tue, 14 Feb 2006 13:18:08 +1100") References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> <43F13B20.3000301@cox.net> <43F13DE0.8040309@bigpond.net.au> Message-ID: Gary Ruben writes: > Tim Hochberg wrote: > >> However, I'm not convinced this is a good idea for numpy. This would >> introduce a discontinuity in a**b that could cause problems in some >> cases. If, for instance, one were running an iterative solver of >> some sort (something I've been known to do), and b was a free >> variable, it could get stuck at b = 2 since things would go >> nonmonotonic there. > > I don't quite understand the problem here. Tim says Python special > cases integer powers but then talks about the problem when b is a > floating type. I think special casing x**2 and maybe even x**3 when > the power is an integer is still a good idea. 
Well, what I had done with Numeric did special case x**0, x**1, x**(-1), x**0.5, x**2, x**3, x**4, and x**5, and only when the exponent was a scalar (so x**y where y was an array wouldn't be). I think this is very useful, as I don't want to microoptimize my code to x*x instead of x**2. The reason for just scalar exponents was so choosing how to do the power was lifted out of the inner loop. With that, x**2 was as fast as x*x. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From tim.hochberg at cox.net Mon Feb 13 20:16:04 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Mon Feb 13 20:16:04 2006 Subject: [Numpy-discussion] String array equality test does not broadcast In-Reply-To: <43F100F6.1020200@ee.byu.edu> References: <99AE3D0B-E106-4EE0-B70C-E84DE1BCC2C7@appliedminds.com> <43F100F6.1020200@ee.byu.edu> Message-ID: <43F1593C.9080203@cox.net> Travis Oliphant wrote: > Russel Howe wrote: > >> I am converting some numarray code to numpy and I noticed this behavior: >> >> >>> from numpy import * >> >>> sta=array(['abc', 'def', 'ghi']) >> >>> stb=array(['abc', 'jkl', 'ghi']) >> >>> sta==stb >> False >> >> I expected the same as this: >> >>> a1=array([1,2,3]) >> >>> a2=array([1,4,3]) >> >>> a1==a2 >> array([True, False, True], dtype=bool) >> >> I am trying to figure out how to fix this now... > > > > Equality testing on string arrays does not work (equality testing uses > ufuncs internally which are not supported generally for flexible > arrays). You must use chararray's. Should string arrays then perhaps raise an exception here to keep people out of trouble? -tim > > Thus, > > sta.view(chararray) == stb.view(chararray) > > Or create chararray's from the beginning: > > sta = char.array(['abc','def','ghi']) > stb = char.array(['abc','jkl','ghi']) > > Char arrays are a special subclass of the ndarray that give arrays all > the methods of strings (and unicode) elements and allow (rich) > comparison operations. > > -Travis > > > > > > > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through log > files > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > From wbaxter at gmail.com Mon Feb 13 20:22:06 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Mon Feb 13 20:22:06 2006 Subject: [Numpy-discussion] number ranges (was Re: Matlab page on scipy wiki) In-Reply-To: <43F14C18.1040600@cox.net> References: <43F14C18.1040600@cox.net> Message-ID: Thanks for the suggestion. Didn't realize there were all those other common i* functions. Maybe arangei is better. It gives more indication that it's mostly like arange, but different. --bb On 2/14/06, Tim Hochberg wrote: > > > FWIW, I'd recomend a different name. irange sounds like it belongs in > the itertools module with ifilter, islice, izip, etc. Perhaps, rangei > would work, although admittedly it's harder to see. Maybe crange for > closed range (versus half-open range)? I dunno, but irange seems like > it's gonna confuse someone, if not you, then other people who end up > looking at your code. 
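For concreteness, whatever name wins, the helper being discussed could be as small as the following hypothetical sketch (it is not an existing numpy function, and the one-argument form starting at 1 is just the behaviour proposed above):

    import numpy

    def arangei(start, stop=None, step=1):
        # Inclusive range: arangei(5) -> [1,2,3,4,5] and
        # arangei(1, 5) -> [1,2,3,4,5].  A sketch only.
        if stop is None:
            start, stop = 1, start
        return numpy.arange(start, stop + step, step)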
> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gruben at bigpond.net.au Mon Feb 13 20:33:12 2006 From: gruben at bigpond.net.au (Gary Ruben) Date: Mon Feb 13 20:33:12 2006 Subject: [Numpy-discussion] number ranges (was Re: Matlab page on scipy wiki) In-Reply-To: References: Message-ID: <43F15D5A.9060202@bigpond.net.au> I agree with Bill's comments. The other languages I've used have used closed/inclusive ranges, which made a lot of sense because they supported enumerated types. The Python Zen guideline breaks down for me because it never felt 'obvious' to me and the principle of least surprise told me to expect closed ranges. I can see the arguments for both though. I discovered that Ruby has both 0..10 and 0...10 syntax depending on whether you want open or closed ranges. I don't know if numpy should break with Python convention here and try to supply a closed range specifier because there should probably be a backup Zen guideline saying 'There should be one -- and preferably only one -- way to do it, even if it's not obvious.' If there was one, I'd use it in preference though. Gary R. From oliphant.travis at ieee.org Mon Feb 13 20:35:13 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Mon Feb 13 20:35:13 2006 Subject: [Numpy-discussion] avoiding the matrix copy performance hit In-Reply-To: <43F139CD.8010407@cox.net> References: <43F139CD.8010407@cox.net> Message-ID: <43F15DDF.1030807@ieee.org> Tim Hochberg wrote: > Bill Baxter wrote: > >> Is there anyway to get around this timing difference? >> * >> >>> import timeit >> ** >>> t1 = timeit.Timer("a = zeros((1000,1000),'d'); a += 1.;", >> 'from numpy import zeros,mat') >> ** >>> t2 = timeit.Timer("a = mat(zeros((1000,1000),'d')); a += >> 1.;", 'from numpy import zeros,mat') >> >>> **t1.timeit(100) >> 1.8391627591141742 >> >>> t2.timeit(100) >> 3.2988266117713465 >> >> *Copying all the data of the input array seems wasteful when the >> array is just going to go out of scope. Or is this not something to >> be concerned about? > I think I originally tried to make mat *not* return a copy, but this actually broke code in SciPy. So, I left the default as it was as a copy on input. There is an *asmatrix* command that does not return a copy... -Travis From oliphant.travis at ieee.org Mon Feb 13 20:38:15 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Mon Feb 13 20:38:15 2006 Subject: [Numpy-discussion] String array equality test does not broadcast In-Reply-To: <43F1593C.9080203@cox.net> References: <99AE3D0B-E106-4EE0-B70C-E84DE1BCC2C7@appliedminds.com> <43F100F6.1020200@ee.byu.edu> <43F1593C.9080203@cox.net> Message-ID: <43F15E9B.7040706@ieee.org> Tim Hochberg wrote: > Travis Oliphant wrote: > >> Russel Howe wrote: >> >>> I am converting some numarray code to numpy and I noticed this >>> behavior: >>> >>> >>> from numpy import * >>> >>> sta=array(['abc', 'def', 'ghi']) >>> >>> stb=array(['abc', 'jkl', 'ghi']) >>> >>> sta==stb >>> False >>> >>> I expected the same as this: >>> >>> a1=array([1,2,3]) >>> >>> a2=array([1,4,3]) >>> >>> a1==a2 >>> array([True, False, True], dtype=bool) >>> >>> I am trying to figure out how to fix this now... >> >> >> >> >> Equality testing on string arrays does not work (equality testing >> uses ufuncs internally which are not supported generally for flexible >> arrays). You must use chararray's. > > > Should string arrays then perhaps raise an exception here to keep > people out of trouble? Probably. 
The equal (not_equal) rich comparison code has some left-over stuff from Numeric which is implemented so that if the ufunc equal (not_equal) failed False (True) was returned. I did not special-case the string arrays in this code. -Travis > > -tim > > >> >> Thus, >> >> sta.view(chararray) == stb.view(chararray) >> >> Or create chararray's from the beginning: >> >> sta = char.array(['abc','def','ghi']) >> stb = char.array(['abc','jkl','ghi']) >> >> Char arrays are a special subclass of the ndarray that give arrays >> all the methods of strings (and unicode) elements and allow (rich) >> comparison operations. >> >> -Travis >> >> >> >> >> >> >> >> >> ------------------------------------------------------- >> This SF.net email is sponsored by: Splunk Inc. Do you grep through >> log files >> for problems? Stop! Download the new AJAX search engine that makes >> searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! >> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642 >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at lists.sourceforge.net >> https://lists.sourceforge.net/lists/listinfo/numpy-discussion >> >> > > > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through log > files > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion From gruben at bigpond.net.au Mon Feb 13 20:50:02 2006 From: gruben at bigpond.net.au (Gary Ruben) Date: Mon Feb 13 20:50:02 2006 Subject: [Numpy-discussion] Re: indexing problem In-Reply-To: References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> <43F13B20.3000301@cox.net> <43F13DE0.8040309@bigpond.net.au> Message-ID: <43F16154.2050907@bigpond.net.au> Hi David, So, I think what you had done would be OK provided you removed the x**0.5 case to avoid the problem Tim raised and checked that the exponent is an integer, not just a scalar. Does anyone see a problem with this approach. Gary R. David M. Cooke wrote: > Gary Ruben writes: > >> Tim Hochberg wrote: >> >>> However, I'm not convinced this is a good idea for numpy. This would >>> introduce a discontinuity in a**b that could cause problems in some >>> cases. If, for instance, one were running an iterative solver of >>> some sort (something I've been known to do), and b was a free >>> variable, it could get stuck at b = 2 since things would go >>> nonmonotonic there. >> I don't quite understand the problem here. Tim says Python special >> cases integer powers but then talks about the problem when b is a >> floating type. I think special casing x**2 and maybe even x**3 when >> the power is an integer is still a good idea. > > Well, what I had done with Numeric did special case x**0, x**1, > x**(-1), x**0.5, x**2, x**3, x**4, and x**5, and only when the > exponent was a scalar (so x**y where y was an array wouldn't be). I > think this is very useful, as I don't want to microoptimize my code to > x*x instead of x**2. The reason for just scalar exponents was so > choosing how to do the power was lifted out of the inner loop. With > that, x**2 was as fast as x*x. 
> From tim.hochberg at cox.net Mon Feb 13 21:18:12 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Mon Feb 13 21:18:12 2006 Subject: [Numpy-discussion] Re: indexing problem In-Reply-To: References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> <43F13B20.3000301@cox.net> <43F13DE0.8040309@bigpond.net.au> Message-ID: <43F167C0.7040806@cox.net> David M. Cooke wrote: >Gary Ruben writes: > > > >>Tim Hochberg wrote: >> >> >> >>>However, I'm not convinced this is a good idea for numpy. This would >>>introduce a discontinuity in a**b that could cause problems in some >>>cases. If, for instance, one were running an iterative solver of >>>some sort (something I've been known to do), and b was a free >>>variable, it could get stuck at b = 2 since things would go >>>nonmonotonic there. >>> >>> >>I don't quite understand the problem here. Tim says Python special >>cases integer powers but then talks about the problem when b is a >>floating type. I think special casing x**2 and maybe even x**3 when >>the power is an integer is still a good idea. >> >> > >Well, what I had done with Numeric did special case x**0, x**1, >x**(-1), x**0.5, x**2, x**3, x**4, and x**5, and only when the >exponent was a scalar (so x**y where y was an array wouldn't be). I >think this is very useful, as I don't want to microoptimize my code to >x*x instead of x**2. The reason for just scalar exponents was so >choosing how to do the power was lifted out of the inner loop. With >that, x**2 was as fast as x*x. > > This is getting harder to object to since, try as I might I can't get a**b to go nonmontonic in the vicinity of b==2. I run out of floating point resolution before the slight shift due to special casing at 2 results in nonmonoticity. I suspect that I could manage it with enough work, but it would require some unlikely function of a**b. I'm not sure if I'm really on board with this, but let me float a slightly modified proposal anyway: 1. numpy.power stays as it is now. That way in the rare case that someone runs into trouble they can drop back to power. Alternatively there could be rawpower and power where rawpower has the current behaviour. While the name rawpower sounds cool/cheesy, power is used infrequently enough that I doubt it matters whether it has these special case optimazations. 2, Don't distinguish between scalars and arrays -- that just makes things harder to explain. 3. Python itself special cases all integral powers between -100 and 100. Beg/borrow/steal their code. This makes it easier to explain since all smallish integer powers are just automagically faster. 4. Is the performance advantage of special casing a**0.5 signifigant? If so use the above trick to special case all half integral and integral powers between -N and N. Since sqrt probably chews up some time the cutoff. The cutoff probably shifts somewhat if we're optimizing half integral as well as integral powers. Perhaps N would be 32 or 64. The net result of this is that a**b would be computed using a combination of repeated multiplication and sqrt for real integral and half integral values of b between -N and N. That seems simpler to explain and somewhat more useful as well. It sounds like a fun project although I'm not certain yet that it's a good idea. 
-tim From ryanlists at gmail.com Mon Feb 13 21:21:01 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Mon Feb 13 21:21:01 2006 Subject: [Numpy-discussion] Re: indexing problem In-Reply-To: <43F16154.2050907@bigpond.net.au> References: <43F11086.308@cox.net> <43F11716.9050204@cox.net> <43F13B20.3000301@cox.net> <43F13DE0.8040309@bigpond.net.au> <43F16154.2050907@bigpond.net.au> Message-ID: I think that would be great, but is there any chance there would be a problem with the scenario Tim posted earlier: If a script was running some sort of optimization on x**y, could the y value ever actually be returned as an integer and could that throw off the optimization if round off error caused the float version returned a significantly different value than the integer version? Ryan On 2/13/06, Gary Ruben wrote: > Hi David, > So, I think what you had done would be OK provided you removed the > x**0.5 case to avoid the problem Tim raised and checked that the > exponent is an integer, not just a scalar. > Does anyone see a problem with this approach. > Gary R. > > David M. Cooke wrote: > > Gary Ruben writes: > > > >> Tim Hochberg wrote: > >> > >>> However, I'm not convinced this is a good idea for numpy. This would > >>> introduce a discontinuity in a**b that could cause problems in some > >>> cases. If, for instance, one were running an iterative solver of > >>> some sort (something I've been known to do), and b was a free > >>> variable, it could get stuck at b = 2 since things would go > >>> nonmonotonic there. > >> I don't quite understand the problem here. Tim says Python special > >> cases integer powers but then talks about the problem when b is a > >> floating type. I think special casing x**2 and maybe even x**3 when > >> the power is an integer is still a good idea. > > > > Well, what I had done with Numeric did special case x**0, x**1, > > x**(-1), x**0.5, x**2, x**3, x**4, and x**5, and only when the > > exponent was a scalar (so x**y where y was an array wouldn't be). I > > think this is very useful, as I don't want to microoptimize my code to > > x*x instead of x**2. The reason for just scalar exponents was so > > choosing how to do the power was lifted out of the inner loop. With > > that, x**2 was as fast as x*x. > > > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through log files > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From cookedm at physics.mcmaster.ca Mon Feb 13 21:46:02 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Mon Feb 13 21:46:02 2006 Subject: [Numpy-discussion] Re: indexing problem In-Reply-To: <43F167C0.7040806@cox.net> (Tim Hochberg's message of "Mon, 13 Feb 2006 22:16:48 -0700") References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> <43F13B20.3000301@cox.net> <43F13DE0.8040309@bigpond.net.au> <43F167C0.7040806@cox.net> Message-ID: Tim Hochberg writes: > David M. Cooke wrote: > >>Gary Ruben writes: >> >> >> >>>Tim Hochberg wrote: >>> >>> >>> >>>>However, I'm not convinced this is a good idea for numpy. 
This would >>>>introduce a discontinuity in a**b that could cause problems in some >>>>cases. If, for instance, one were running an iterative solver of >>>>some sort (something I've been known to do), and b was a free >>>>variable, it could get stuck at b = 2 since things would go >>>>nonmonotonic there. >>>> >>>> >>>I don't quite understand the problem here. Tim says Python special >>>cases integer powers but then talks about the problem when b is a >>>floating type. I think special casing x**2 and maybe even x**3 when >>>the power is an integer is still a good idea. >>> >>> >> >>Well, what I had done with Numeric did special case x**0, x**1, >>x**(-1), x**0.5, x**2, x**3, x**4, and x**5, and only when the >>exponent was a scalar (so x**y where y was an array wouldn't be). I >>think this is very useful, as I don't want to microoptimize my code to >>x*x instead of x**2. The reason for just scalar exponents was so >>choosing how to do the power was lifted out of the inner loop. With >>that, x**2 was as fast as x*x. >> >> > This is getting harder to object to since, try as I might I can't get > a**b to go nonmontonic in the vicinity of b==2. I run out of floating > point resolution before the slight shift due to special casing at 2 > results in nonmonoticity. I suspect that I could manage it with enough > work, but it would require some unlikely function of a**b. I'm not > sure if I'm really on board with this, but let me float a slightly > modified proposal anyway: > > 1. numpy.power stays as it is now. That way in the rare case that > someone runs into trouble they can drop back to power. Alternatively > there could be rawpower and power where rawpower has the current > behaviour. While the name rawpower sounds cool/cheesy, power is used > infrequently enough that I doubt it matters whether it has these > special case optimazations. +1 > > 2, Don't distinguish between scalars and arrays -- that just makes > things harder to explain. Makes the optimizations better, though. > 3. Python itself special cases all integral powers between -100 and > 100. Beg/borrow/steal their code. This makes it easier to explain > since all smallish integer powers are just automagically faster. > > 4. Is the performance advantage of special casing a**0.5 > signifigant? If so use the above trick to special case all half > integral and integral powers between -N and N. Since sqrt probably > chews up some time the cutoff. The cutoff probably shifts somewhat if > we're optimizing half integral as well as integral powers. Perhaps N > would be 32 or 64. > > The net result of this is that a**b would be computed using a > combination of repeated multiplication and sqrt for real integral and > half integral values of b between -N and N. That seems simpler to > explain and somewhat more useful as well. > > It sounds like a fun project although I'm not certain yet that it's a > good idea. 
Basically, my Numeric code looked like this: #define POWER_UFUNC3(prefix, basetype, exptype, outtype) \ static void prefix##_power(char **args, int *dimensions, \ int *steps, void *func) { \ int i, cis1=steps[0], cis2=steps[1], cos=steps[2], n=dimensions[0]; \ int is1=cis1/sizeof(basetype); \ int is2=cis2/sizeof(exptype); \ int os=cos/sizeof(outtype); \ basetype *i1 = (basetype *)(args[0]); \ exptype *i2=(exptype *)(args[1]); \ outtype *op=(outtype *)(args[2]); \ if (is2 == 0) { \ exptype exponent = i2[0]; \ if (POWER_equal(exponent, 0.0)) { \ for (i = 0; i < n; i++, op += os) { \ POWER_one((*op)) \ } \ } else if (POWER_equal(exponent, 1.0)) { \ for (i = 0; i < n; i++, i1 += is1, op += os) { \ *op = *i1; \ } \ } else if (POWER_equal(exponent, 2.0)) { \ for (i = 0; i < n; i++, i1 += is1, op += os) { \ POWER_square((*op),(*i1)) \ } \ } else if (POWER_equal(exponent, -1.0)) { \ for (i = 0; i < n; i++, i1 += is1, op += os) { \ POWER_inverse((*op),(*i1)) \ } \ } else if (POWER_equal(exponent, 3.0)) { \ for (i = 0; i < n; i++, i1 += is1, op += os) { \ POWER_cube((*op),(*i1)) \ } \ } else if (POWER_equal(exponent, 4.0)) { \ for (i = 0; i < n; i++, i1 += is1, op += os) { \ POWER_fourth((*op),(*i1)) \ } \ } else if (POWER_equal(exponent, 0.5)) { \ for (i = 0; i < n; i++, i1 += is1, op += os) { \ POWER_sqrt((*op),(*i1)) \ } \ } else { \ for (i = 0; i < n; i++, i1 += is1, op += os) { \ POWER_pow((*op), (*i1), (exponent)) \ } \ } \ } else { \ for (i=0; i<n; i++, i1+=is1, i2+=is2, op+=os) { \ POWER_pow((*op), (*i1), (*i2)) \ } \ } \ } #define POWER_UFUNC(prefix, type) POWER_UFUNC3(prefix, type, type, type) #define FTYPE float #define POWER_equal(x,y) x == y #define POWER_one(o) o = 1.0; #define POWER_square(o,x) o = x*x; #define POWER_inverse(o,x) o = 1.0 / x; #define POWER_cube(o,x) FTYPE y=x; o = y*y*y; #define POWER_fourth(o,x) FTYPE y=x, s = y*y; o = s * s; #define POWER_sqrt(o,x) o = sqrt(x); #define POWER_pow(o,x,n) o = pow(x, n); POWER_UFUNC(FLOAT, float) POWER_UFUNC3(FLOATD, float, double, float) plus similiar definitions for float, double, complex float, and complex double. Using the POWER_square, etc. macros means the complex case was easy to add. The speed comes from the inlining of how to do the power _outside_ the inner loop. The reason x**2, etc. are slower currently is there is a function call in the inner loop. Your's and mine C library's pow() function mostly likely does something like I have above, for a single case: pow(x, 2.0) is calculated as x*x. However, each time through it has do decide _how_ to do it. That's why I limited the optimization to scalar exponents: array exponents would mean it's about as slow as the pow() call, even if the checks were inlined into the loop. It would probably be even slower for the non-optimized case, as you'd check for the special exponents, then call pow() if it fails (which would likely recheck the exponents). Maybe a simple way to add this is to rewrite x.__pow__() as something like the C equivalent of def __pow__(self, p): if p is not a scalar: return power(self, p) elif p == 1: return p elif p == 2: return square(self) elif p == 3: return cube(self) elif p == 4: return power_4(self) elif p == 0: return ones(self.shape, dtype=self.dtype) elif p == -1: return 1.0/self elif p == 0.5: return sqrt(self) and add ufuncs square, cube, power_4 (etc.). -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From thorin at gmail.com Mon Feb 13 22:33:01 2006 From: thorin at gmail.com (Curtis Spencer) Date: Mon Feb 13 22:33:01 2006 Subject: [Numpy-discussion] Python Equivalent of Matlab lpc Message-ID: Hi, I am trying to pull cepstral coefficients from wav files for a speech recognizer and I am wondering if there is a python equivalent of the lpc function in matlab? If not, anyone know of any other good ways to featurize speech vectors with python included functions? Thanks, Curtis From arnd.baecker at web.de Tue Feb 14 02:55:02 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Tue Feb 14 02:55:02 2006 Subject: [Numpy-discussion] Re: indexing problem In-Reply-To: References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> <43F13B20.3000301@cox.net> <43F13DE0.8040309@bigpond.net.au> Message-ID: On Mon, 13 Feb 2006, David M. Cooke wrote: > Gary Ruben writes: > > > Tim Hochberg wrote: > > > >> However, I'm not convinced this is a good idea for numpy. This would > >> introduce a discontinuity in a**b that could cause problems in some > >> cases. If, for instance, one were running an iterative solver of > >> some sort (something I've been known to do), and b was a free > >> variable, it could get stuck at b = 2 since things would go > >> nonmonotonic there. > > > > I don't quite understand the problem here. Tim says Python special > > cases integer powers but then talks about the problem when b is a > > floating type. I think special casing x**2 and maybe even x**3 when > > the power is an integer is still a good idea. > > Well, what I had done with Numeric did special case x**0, x**1, > x**(-1), x**0.5, x**2, x**3, x**4, and x**5, and only when the > exponent was a scalar (so x**y where y was an array wouldn't be). I > think this is very useful, as I don't want to microoptimize my code to > x*x instead of x**2.
The reason for just scalar exponents was so > choosing how to do the power was lifted out of the inner loop. With > that, x**2 was as fast as x*x. +1 from me for the special casing. A speed improvement of more than a factor 16 (from David's numbers) is something relevant! Best, Arnd From cimrman3 at ntc.zcu.cz Tue Feb 14 05:16:01 2006 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Tue Feb 14 05:16:01 2006 Subject: [Numpy-discussion] transpose function Message-ID: <43F1D7F5.6060807@ntc.zcu.cz> Just now I have stumbled on a non-intuitive thing with the transpose function. The help says: """ Help on function transpose in module numpy.core.oldnumeric: transpose(a, axes=None) transpose(a, axes=None) returns array with dimensions permuted according to axes. If axes is None (default) returns array with dimensions reversed. """ There are many functions in scipy that accept the 'axis' argument which is a single integer number. I have overlooked that here it is 'axes' and see what happens (I expected 'flipping' the array around the given single axis, well...): import scipy as nm In [26]:b = nm.zeros( (3,4) ) In [27]:b Out[27]: array([[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]) In [29]:nm.transpose( b, 0 ) Out[29]:array([0, 0, 0]) In [30]:nm.transpose( b, 1 ) Out[30]:array([0, 0, 0, 0]) So I propose either to replace 'axes' with 'order' or give an example in the docstring. It would be also good to raise an exception when the length of the 'axes' argument does not match the array rank and/or does not contain a permutation (no repetitions) of relevant indices. What do the gurus think? r. From tim.hochberg at cox.net Tue Feb 14 08:58:03 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Tue Feb 14 08:58:03 2006 Subject: [Numpy-discussion] Re: indexing problem In-Reply-To: References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> <43F13B20.3000301@cox.net> <43F13DE0.8040309@bigpond.net.au> <43F167C0.7040806@cox.net> Message-ID: <43F20BFE.5030100@cox.net> David M. Cooke wrote: >Tim Hochberg writes: > > > >>David M. Cooke wrote: >> >> >> >>>Gary Ruben writes: >>> >>> >>> >>> >>> >>>>Tim Hochberg wrote: >>>> >>>> >>>> >>>> >>>> >>>>>However, I'm not convinced this is a good idea for numpy. This would >>>>>introduce a discontinuity in a**b that could cause problems in some >>>>>cases. If, for instance, one were running an iterative solver of >>>>>some sort (something I've been known to do), and b was a free >>>>>variable, it could get stuck at b = 2 since things would go >>>>>nonmonotonic there. >>>>> >>>>> >>>>> >>>>> >>>>I don't quite understand the problem here. Tim says Python special >>>>cases integer powers but then talks about the problem when b is a >>>>floating type. I think special casing x**2 and maybe even x**3 when >>>>the power is an integer is still a good idea. >>>> >>>> >>>> >>>> >>>Well, what I had done with Numeric did special case x**0, x**1, >>>x**(-1), x**0.5, x**2, x**3, x**4, and x**5, and only when the >>>exponent was a scalar (so x**y where y was an array wouldn't be). I >>>think this is very useful, as I don't want to microoptimize my code to >>>x*x instead of x**2. The reason for just scalar exponents was so >>>choosing how to do the power was lifted out of the inner loop. With >>>that, x**2 was as fast as x*x. >>> >>> >>> >>> >>This is getting harder to object to since, try as I might I can't get >>a**b to go nonmontonic in the vicinity of b==2. 
I run out of floating >>point resolution before the slight shift due to special casing at 2 >>results in nonmonoticity. I suspect that I could manage it with enough >>work, but it would require some unlikely function of a**b. I'm not >>sure if I'm really on board with this, but let me float a slightly >>modified proposal anyway: >> >> 1. numpy.power stays as it is now. That way in the rare case that >>someone runs into trouble they can drop back to power. Alternatively >>there could be rawpower and power where rawpower has the current >>behaviour. While the name rawpower sounds cool/cheesy, power is used >>infrequently enough that I doubt it matters whether it has these >>special case optimazations. >> >> > >+1 > > > >> 2, Don't distinguish between scalars and arrays -- that just makes >>things harder to explain. >> >> > >Makes the optimizations better, though. > > Ah, Because you can hoist all the checks for what type of optimization to do, if any, out of the core loop, right? That's a good point. Still I'm not keen on a**b having different performance *and* different results depending on whether b is a scalar or matrix. The first thing to do is to measure how much overhead doing the optimization element by element is going to add. Assuming that it's signifigant that leaves us with the familiar dilema: fast, simple or general purpose; pick any two. 1. Do what I've proposed: optimize things at the c_pow level. This is general purpose and relatively simple to implement (since we can steal most of the code from complexobject.c). It may have a signifigant speed penalty versus 2 though: 2. Do what you've proposed: optimize things at the ufunc level. This fast and relatively simple to implement. It's more limited in scope and a bit harder to explain than 2. 3. Do both. This is straightforward, but adds a bunch of extra code paths with all the attendant required testing and possibility for bugs. So, fast, general purpose, but not simple. > > >> 3. Python itself special cases all integral powers between -100 and >>100. Beg/borrow/steal their code. This makes it easier to explain >>since all smallish integer powers are just automagically faster. >> >> 4. Is the performance advantage of special casing a**0.5 >>signifigant? If so use the above trick to special case all half >>integral and integral powers between -N and N. Since sqrt probably >>chews up some time the cutoff. The cutoff probably shifts somewhat if >>we're optimizing half integral as well as integral powers. Perhaps N >>would be 32 or 64. >> >>The net result of this is that a**b would be computed using a >>combination of repeated multiplication and sqrt for real integral and >>half integral values of b between -N and N. That seems simpler to >>explain and somewhat more useful as well. >> >>It sounds like a fun project although I'm not certain yet that it's a >>good idea. 
>> >> > >Basically, my Numeric code looked like this: > >#define POWER_UFUNC3(prefix, basetype, exptype, outtype) \ >static void prefix##_power(char **args, int *dimensions, \ > int *steps, void *func) { \ > int i, cis1=steps[0], cis2=steps[1], cos=steps[2], n=dimensions[0]; \ > int is1=cis1/sizeof(basetype); \ > int is2=cis2/sizeof(exptype); \ > int os=cos/sizeof(outtype); \ > basetype *i1 = (basetype *)(args[0]); \ > exptype *i2=(exptype *)(args[1]); \ > outtype *op=(outtype *)(args[2]); \ > if (is2 == 0) { \ > exptype exponent = i2[0]; \ > if (POWER_equal(exponent, 0.0)) { \ > for (i = 0; i < n; i++, op += os) { \ > POWER_one((*op)) \ > } \ > } else if (POWER_equal(exponent, 1.0)) { \ > for (i = 0; i < n; i++, i1 += is1, op += os) { \ > *op = *i1; \ > } \ > } else if (POWER_equal(exponent, 2.0)) { \ > for (i = 0; i < n; i++, i1 += is1, op += os) { \ > POWER_square((*op),(*i1)) \ > } \ > } else if (POWER_equal(exponent, -1.0)) { \ > for (i = 0; i < n; i++, i1 += is1, op += os) { \ > POWER_inverse((*op),(*i1)) \ > } \ > } else if (POWER_equal(exponent, 3.0)) { \ > for (i = 0; i < n; i++, i1 += is1, op += os) { \ > POWER_cube((*op),(*i1)) \ > } \ > } else if (POWER_equal(exponent, 4.0)) { \ > for (i = 0; i < n; i++, i1 += is1, op += os) { \ > POWER_fourth((*op),(*i1)) \ > } \ > } else if (POWER_equal(exponent, 0.5)) { \ > for (i = 0; i < n; i++, i1 += is1, op += os) { \ > POWER_sqrt((*op),(*i1)) \ > } \ > } else { \ > for (i = 0; i < n; i++, i1 += is1, op += os) { \ > POWER_pow((*op), (*i1), (exponent)) \ > } \ > } \ > } else { \ > for (i=0; i POWER_pow((*op), (*i1), (*i2)) \ > } \ > } \ >} >#define POWER_UFUNC(prefix, type) POWER_UFUNC3(prefix, type, type, type) > >#define FTYPE float >#define POWER_equal(x,y) x == y >#define POWER_one(o) o = 1.0; >#define POWER_square(o,x) o = x*x; >#define POWER_inverse(o,x) o = 1.0 / x; >#define POWER_cube(o,x) FTYPE y=x; o = y*y*y; >#define POWER_fourth(o,x) FTYPE y=x, s = y*y; o = s * s; >#define POWER_sqrt(o,x) o = sqrt(x); >#define POWER_pow(o,x,n) o = pow(x, n); >POWER_UFUNC(FLOAT, float) >POWER_UFUNC3(FLOATD, float, double, float) > >plus similiar definitions for float, double, complex float, and >complex double. Using the POWER_square, etc. macros means the complex >case was easy to add. > >The speed comes from the inlining of how to do the power _outside_ the >inner loop. The reason x**2, etc. are slower currently is there is a >function call in the inner loop. Your's and mine C library's pow() >function mostly likely does something like I have above, for a single >case: pow(x, 2.0) is calculated as x*x. However, each time through it >has do decide _how_ to do it. > > Part of our difference in perspective comes from the fact that I've just been staring at the guts of complex power. In this case you always have function calls at present, even for s*s. (At least I'm fairly certain that doesn't get inlined although I haven't checked). Since much of the work I do is with complex matrices, it's appropriate that I focus on this. Have you measured the effect of a function call on the speed here, or is that just an educated guess. If it's an educated guess, it's probably worth determining how much of speed hit the function call actually causes. I was going to try to get a handle on this by comparing multiplication of Complex numbers (which requires a function call plus more math), with multiplication of Floats which does not. Perversly, the Complex multiplication came out marginally faster, which is hard to explain whichever way you look at it. 
>>> timeit.Timer("a*b", "from numpy import arange; a = arange(10000)+0j; b = arange(10000)+0j").time it(10000) 3.2974959107959876 >>> timeit.Timer("a*b", "from numpy import arange; a = arange(10000); b = arange(10000)").timeit(100 00) 3.4541194481425919 >That's why I limited the optimization to scalar exponents: array >exponents would mean it's about as slow as the pow() call, even if the >checks were inlined into the loop. It would probably be even slower >for the non-optimized case, as you'd check for the special exponents, >then call pow() if it fails (which would likely recheck the exponents). > > Again, here I'm thinking of the complex case. In that case at least, I don't think that the non-optimized case would take a noticeable speed hit. I would put it into pow itself, which already special cases a==0 and b==0. For float pow it might, but that's already slow, so I doubt that it would make much difference. >Maybe a simple way to add this is to rewrite x.__pow__() as something >like the C equivalent of > >def __pow__(self, p): > if p is not a scalar: > return power(self, p) > elif p == 1: > return p > elif p == 2: > return square(self) > elif p == 3: > return cube(self) > elif p == 4: > return power_4(self) > elif p == 0: > return ones(self.shape, dtype=self.dtype) > elif p == -1: > return 1.0/self > elif p == 0.5: > return sqrt(self) > >and add ufuncs square, cube, power_4 (etc.). > > It sounds like we need to benchmark some stuff and see what we come up with. One approach would be for each of us to implement this for one time (say float) and see how the approaches compare speed wise. That's not entirely fair as my approach will do much better at complex than float I believe, but it's certainly easier. regards, -tim From Chris.Barker at noaa.gov Tue Feb 14 11:43:02 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue Feb 14 11:43:02 2006 Subject: [Numpy-discussion] NumPy Glossary: a request for review In-Reply-To: References: <43ED700B.10903@bigpond.net.au> <43ED7C3D.6070705@bigpond.net.au> <43F110D6.6060302@noaa.gov> Message-ID: <43F2328E.8090306@noaa.gov> Sasha wrote: > However, if others think a link to more detailed explanation belongs > to glossary entries, the natural destination of the link would be a > page in Travis' book. Not unless that glossary is part of the book. Links in the Wiki should point to the wiki, or maybe to other openly available sources on the web. A link isn't critical, but a page about broadcasting would still be nice, it's a great feature of numpy. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From oliphant.travis at ieee.org Tue Feb 14 15:02:03 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue Feb 14 15:02:03 2006 Subject: [Numpy-discussion] New release of NumPy coming Message-ID: <43F26137.5040901@ieee.org> I'd like to make a new release of NumPy in the next day or two. If there are any outstanding issues, please let me know. -Travis From cookedm at physics.mcmaster.ca Tue Feb 14 15:13:04 2006 From: cookedm at physics.mcmaster.ca (David M. 
Cooke) Date: Tue Feb 14 15:13:04 2006 Subject: [Numpy-discussion] optimizing power() for complex and real cases (was: indexing problem) In-Reply-To: <43F20BFE.5030100@cox.net> (Tim Hochberg's message of "Tue, 14 Feb 2006 09:57:34 -0700") References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> <43F13B20.3000301@cox.net> <43F13DE0.8040309@bigpond.net.au> <43F167C0.7040806@cox.net> <43F20BFE.5030100@cox.net> Message-ID: [changed subject to reflect this better] Tim Hochberg writes: > David M. Cooke wrote: >>Tim Hochberg writes: >>>David M. Cooke wrote: >>> 2, Don't distinguish between scalars and arrays -- that just makes >>>things harder to explain. >>Makes the optimizations better, though. >> >> > Ah, Because you can hoist all the checks for what type of optimization > to do, if any, out of the core loop, right? That's a good point. Still > I'm not keen on a**b having different performance *and* different > results depending on whether b is a scalar or matrix. The first thing > to do is to measure how much overhead doing the optimization element > by element is going to add. Assuming that it's signifigant that leaves > us with the familiar dilema: fast, simple or general purpose; pick any > two. > > 1. Do what I've proposed: optimize things at the c_pow level. This is > general purpose and relatively simple to implement (since we can steal > most of the code from complexobject.c). It may have a signifigant > speed penalty versus 2 though: > > 2. Do what you've proposed: optimize things at the ufunc level. This > fast and relatively simple to implement. It's more limited in scope > and a bit harder to explain than 2. > > 3. Do both. This is straightforward, but adds a bunch of extra code > paths with all the attendant required testing and possibility for > bugs. So, fast, general purpose, but not simple. Start with #1, then try #2. The problem with #2 is that you still have to include #1: if you're doing x**y when y is an array, then you have do if (y==2) etc. checks in your inner loop anyways. In that case, you might as well do it in nc_pow. At that point, it may be better to move the #1 optimization to the level of x.__pow__ (see below). >>The speed comes from the inlining of how to do the power _outside_ the >>inner loop. The reason x**2, etc. are slower currently is there is a >>function call in the inner loop. Your's and mine C library's pow() >>function mostly likely does something like I have above, for a single >>case: pow(x, 2.0) is calculated as x*x. However, each time through it >>has do decide _how_ to do it. >> >> > Part of our difference in perspective comes from the fact that I've > just been staring at the guts of complex power. In this case you > always have function calls at present, even for s*s. (At least I'm > fairly certain that doesn't get inlined although I haven't checked). > Since much of the work I do is with complex matrices, it's > appropriate that I focus on this. Ah, ok, now things are clicking. Complex power is going to be harder, because making sure that going from x**2.001 to x**2 doesn't do some funny complex branch cut stuff (I work in reals all the time :-) For the real numbers, these type of optimizations *are* a big win, and don't have the same type of continuity problems. I'll put them into numpy soon. > Have you measured the effect of a function call on the speed here, or > is that just an educated guess. 
If it's an educated guess, it's > probably worth determining how much of speed hit the function call > actually causes. I was going to try to get a handle on this by > comparing multiplication of Complex numbers (which requires a function > call plus more math), with multiplication of Floats which does not. > Perversly, the Complex multiplication came out marginally faster, > which is hard to explain whichever way you look at it. > >>>> timeit.Timer("a*b", "from numpy import arange; a = > arange(10000)+0j; b = arange(10000)+0j").time > it(10000) > 3.2974959107959876 >>>> timeit.Timer("a*b", "from numpy import arange; a = arange(10000); >>>> b > = arange(10000)").timeit(100 > 00) > 3.4541194481425919 You're not multiplying floats in the last one: you're multiplying integers. You either need to use a = arange(10000.0), or a = arange(10000.0, dtype=float) (to be more specific). Your integer numbers are about 3x better than mine, though (difference in architecture, maybe? I'm on an Athlon64). >>That's why I limited the optimization to scalar exponents: array >>exponents would mean it's about as slow as the pow() call, even if the >>checks were inlined into the loop. It would probably be even slower >>for the non-optimized case, as you'd check for the special exponents, >>then call pow() if it fails (which would likely recheck the exponents). >> >> > Again, here I'm thinking of the complex case. In that case at least, I > don't think that the non-optimized case would take a noticeable speed > hit. I would put it into pow itself, which already special cases a==0 > and b==0. For float pow it might, but that's already slow, so I doubt > that it would make much difference. It does make a bit of difference with float pow: the general case slows down a bit. >>Maybe a simple way to add this is to rewrite x.__pow__() as something >>like the C equivalent of >> >>def __pow__(self, p): >> if p is not a scalar: >> return power(self, p) >> elif p == 1: >> return p >> elif p == 2: >> return square(self) >> elif p == 3: >> return cube(self) >> elif p == 4: >> return power_4(self) >> elif p == 0: >> return ones(self.shape, dtype=self.dtype) >> elif p == -1: >> return 1.0/self >> elif p == 0.5: >> return sqrt(self) >> >>and add ufuncs square, cube, power_4 (etc.). > > It sounds like we need to benchmark some stuff and see what we come up > with. One approach would be for each of us to implement this for one > time (say float) and see how the approaches compare speed wise. That's > not entirely fair as my approach will do much better at complex than > float I believe, but it's certainly easier. The way the ufuncs are templated, we can split out the complex routines easily enough. Here's what I propose: - add a square() ufunc, where square(x) == x*x (but faster of course) - I'll fiddle with the floats - you fiddle with the complex numbers :-) I've created a new branch in svn, at http://svn.scipy.org/svn/numpy/branches/power_optimization to do this fiddling. The changes below I mention are all checked in as revision 2104 (http://projects.scipy.org/scipy/numpy/changeset/2104). I've added a square() ufunc to the power_optimization branch because I'd argue that it's probably *the* most common use of **. I've implemented it, and it's as fast as a*a for reals, and runs in 2/3 the time as a*a for complex (which makes sense: squaring a complex numbers has 3 real multiplications, while multiplying has 4 in the (simple) scheme [1]). At least with square(), there's no argument about continuity, as it only squares :-). 
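In use it is just another ufunc; a minimal sketch (this assumes the branch exports square() from the top-level numpy namespace -- an assumption on my part, so adjust the import if it lands elsewhere):

    import timeit
    setup = "from numpy import arange, square; x = arange(100000.0) + 0j"
    print timeit.Timer("x*x", setup).timeit(100)
    print timeit.Timer("square(x)", setup).timeit(100)  # expect roughly 2/3 the time of x*x

For real x (drop the + 0j) the two timings should come out about equal, per the numbers above.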
The next step I'd suggest is special-casing x.__pow__, like I suggest above. We could just test for integer scalar exponents (0, 1, 2), and just special-case those (returning ones(), x.copy(), square(x)), and leave all the rest to power(). I've also checked in code to the power_optimization branch that special cases power(x, <scalar>), or anytime the basic ufunc gets called with a stride of 0 for the exponent. It doesn't do complex x, so no problems on your side, but it's a good chunk faster for this case than what we've got now. One reason I'm also looking at adding square() is because my optimization of power() makes x**2 run (only) 1.5 slower than x*x (and I can't for the life of me see where that 0.5 is coming from! It should be 1.0 like square()!). [1] which brings up another point. Would using the 3-multiplication version for complex multiplication be good? There might be some effects with cancellation errors due to the extra subtractions... -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From ndarray at mac.com Tue Feb 14 15:21:05 2006 From: ndarray at mac.com (Sasha) Date: Tue Feb 14 15:21:05 2006 Subject: [Numpy-discussion] New release of NumPy coming In-Reply-To: <43F26137.5040901@ieee.org> References: <43F26137.5040901@ieee.org> Message-ID: I was delinquent checking in ma code that takes advantage of context in __array__. Will try to do it tomorrow. (Need to add more tests.) On 2/14/06, Travis Oliphant wrote: > > I'd like to make a new release of NumPy in the next day or two. If > there are any outstanding issues, please let me know. > > -Travis > From strawman at astraw.com Tue Feb 14 21:08:02 2006 From: strawman at astraw.com (Andrew Straw) Date: Tue Feb 14 21:08:02 2006 Subject: [Numpy-discussion] New release of NumPy coming In-Reply-To: <43F26137.5040901@ieee.org> References: <43F26137.5040901@ieee.org> Message-ID: <43F2B6FC.7040508@astraw.com> Travis Oliphant wrote: > > I'd like to make a new release of NumPy in the next day or two. If > there are any outstanding issues, please let me know. > Here's one: http://projects.scipy.org/scipy/numpy/ticket/4 From strawman at astraw.com Tue Feb 14 22:24:02 2006 From: strawman at astraw.com (Andrew Straw) Date: Tue Feb 14 22:24:02 2006 Subject: [Numpy-discussion] Re: NumPy Glossary Was:Matlab page on scipy wiki In-Reply-To: References: <20060210072301.9608D12D10@sc8-sf-spam2.sourceforge.net> <200602101616.k1AGG0Vj026623@oobleck.astro.cornell.edu> Message-ID: <43F2C8C6.5030909@astraw.com> Bill Baxter wrote: > On the point of professionalism, I'd like to change the matlab page's > title from "NumPy for Matlab Addicts" to simply "NumPy for Matlab > Users". It's been bugging me since I put it up there initially... but > I'm not really sure how to change the name of a page in the wiki. I went ahead and renamed the page and created a redirect from the old page.
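To make Cooke's footnote [1] above concrete, the 3-multiplication complex product (Gauss's trick) looks like this -- a sketch only, to show where the extra additions and subtractions come in:

    def mul3(a, b, c, d):
        # (a + b*1j) * (c + d*1j) using 3 real multiplications instead of 4
        k1 = c * (a + b)
        k2 = a * (d - c)
        k3 = b * (c + d)
        return (k1 - k3, k1 + k2)   # == (a*c - b*d, a*d + b*c)

The subtractions d - c and k1 - k3 are where cancellation error could creep in when the operands are close in magnitude, which is exactly the concern raised.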
From pearu at scipy.org Wed Feb 15 00:10:08 2006 From: pearu at scipy.org (Pearu Peterson) Date: Wed Feb 15 00:10:08 2006 Subject: [Numpy-discussion] New release of NumPy coming In-Reply-To: <43F2B6FC.7040508@astraw.com> References: <43F26137.5040901@ieee.org> <43F2B6FC.7040508@astraw.com> Message-ID: On Tue, 14 Feb 2006, Andrew Straw wrote: > Travis Oliphant wrote: > >> >> I'd like to make a new release of NumPy in the next day or two. If there >> are any outstanding issues, please let me know. >> > Here's one: > http://projects.scipy.org/scipy/numpy/ticket/4 Fixed in svn. Pearu From arnd.baecker at web.de Wed Feb 15 01:18:03 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Wed Feb 15 01:18:03 2006 Subject: [Numpy-discussion] New release of NumPy coming In-Reply-To: <43F26137.5040901@ieee.org> References: <43F26137.5040901@ieee.org> Message-ID: On Tue, 14 Feb 2006, Travis Oliphant wrote: > I'd like to make a new release of NumPy in the next day or two. If > there are any outstanding issues, please let me know. It seems that the icc stuff has fallen through the cracks, http://article.gmane.org/gmane.comp.python.numeric.general/3517/ (it is not relevant to me at this point - it was only a test after your request for compilations with compilers other than gcc ;-). Best, Arnd From faltet at carabos.com Wed Feb 15 02:02:27 2006 From: faltet at carabos.com (Francesc Altet) Date: Wed Feb 15 02:02:27 2006 Subject: [Numpy-discussion] New release of NumPy coming In-Reply-To: <43F26137.5040901@ieee.org> References: <43F26137.5040901@ieee.org> Message-ID: <200602151100.37229.faltet@carabos.com> A Dimecres 15 Febrer 2006 00:01, Travis Oliphant va escriure: > I'd like to make a new release of NumPy in the next day or two. If > there are any outstanding issues, please let me know. Well, I've run all the recently added tests for unicode in both UCS4 and UCS2 platforms and all passes flawlessly. Very good :-) However, there are some problem when trying to load the unicode tests when running the complete suite: In [1]: import numpy In [2]: numpy.test(1) Found 11 tests for numpy.core.umath Found 8 tests for numpy.lib.arraysetops Found 26 tests for numpy.core.ma Found 6 tests for numpy.core.records Found 14 tests for numpy.core.numeric Found 4 tests for numpy.distutils.misc_util Found 3 tests for numpy.lib.getlimits Found 30 tests for numpy.core.numerictypes Found 9 tests for numpy.lib.twodim_base Found 1 tests for numpy.core.oldnumeric Found 44 tests for numpy.lib.shape_base Found 4 tests for numpy.lib.index_tricks Found 42 tests for numpy.lib.type_check Found 3 tests for numpy.dft.helper Warning: !! FAILURE importing tests for /usr/lib/python2.3/site-packages/numpy/core/tests/test_multiarray.py:195: ImportError: No module named test_unicode (in ?) Found 7 tests for numpy.core.defmatrix Found 33 tests for numpy.lib.function_base Found 0 tests for __main__ ....................................................................................................................................................................................................................................................... ---------------------------------------------------------------------- Ran 247 tests in 0.951s OK I've been trying to see how to correctly load the unicode tests, but failed miserably. Perhaps Pearu can tell us about the correct way to do that. Thanks, >0,0< Francesc Altet ? ? http://www.carabos.com/ V V C?rabos Coop. V. 
??Enjoy Data "-" From schofield at ftw.at Wed Feb 15 02:32:02 2006 From: schofield at ftw.at (Ed Schofield) Date: Wed Feb 15 02:32:02 2006 Subject: [Numpy-discussion] avoiding the matrix copy performance hit In-Reply-To: <43F15DDF.1030807@ieee.org> References: <43F139CD.8010407@cox.net> <43F15DDF.1030807@ieee.org> Message-ID: <43F302F8.7080200@ftw.at> Travis Oliphant wrote: > I think I originally tried to make mat *not* return a copy, but this > actually broke code in SciPy. So, I left the default as it was as a > copy on input. There is an *asmatrix* command that does not return a > copy... All SciPy's unit tests actually pass with a default of copy=False in the matrix constructor. So SciPy needn't be the blocker here. I'd like to cast a vote for not copying by default, in the interests of efficiency and, as Bill Baxter argued, usefulness. -- Ed From pearu at scipy.org Wed Feb 15 03:20:01 2006 From: pearu at scipy.org (Pearu Peterson) Date: Wed Feb 15 03:20:01 2006 Subject: [Numpy-discussion] New release of NumPy coming In-Reply-To: <200602151100.37229.faltet@carabos.com> References: <43F26137.5040901@ieee.org> <200602151100.37229.faltet@carabos.com> Message-ID: On Wed, 15 Feb 2006, Francesc Altet wrote: > I've been trying to see how to correctly load the unicode tests, but > failed miserably. Perhaps Pearu can tell us about the correct way to > do that. I have fixed it in svn. When importing modules from tests/ directory, one must surround the corresponding import statements with set_local_path() and restore_path() calls. Regards, Pearu From faltet at carabos.com Wed Feb 15 05:07:03 2006 From: faltet at carabos.com (Francesc Altet) Date: Wed Feb 15 05:07:03 2006 Subject: [Numpy-discussion] New release of NumPy coming In-Reply-To: References: <43F26137.5040901@ieee.org> <200602151100.37229.faltet@carabos.com> Message-ID: <200602151405.47640.faltet@carabos.com> A Dimecres 15 Febrer 2006 11:18, Pearu Peterson va escriure: > On Wed, 15 Feb 2006, Francesc Altet wrote: > > I've been trying to see how to correctly load the unicode tests, but > > failed miserably. Perhaps Pearu can tell us about the correct way to > > do that. > > I have fixed it in svn. When importing modules from tests/ directory, one > must surround the corresponding import statements with set_local_path() > and restore_path() calls. Ah, ok. Is there any place where this is explained or we have to use the source to figure out these sort of things? Thanks, -- >0,0< Francesc Altet ? ? http://www.carabos.com/ V V C?rabos Coop. V. ??Enjoy Data "-" From arnd.baecker at web.de Wed Feb 15 05:25:04 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Wed Feb 15 05:25:04 2006 Subject: [Numpy-discussion] numpy/scipy transition Message-ID: Hi, concerning the transition from Numeric/scipy to the new numpy/scipy I have a couple of points for which I would be very interested in any advice/suggestions: As some of you might know we are running a computational physics course using python+Numeric+scipy+ipython. In six weeks the course will be run again and we are facing the question whether to switch to numpy/scipy right now or to delay this for one more year. Reasons to switch + numpy is better in several respects compared to Numeric + numpy/scipy installs more easily on newer machines + students will learn the most recent tools + extensive and alive documentation on www.scipy.org Reasons not to switch - there is no enthought edition yet (right?) 
- there are only packages for a few platforms/distribution - we need scipy.xplt (matplotlib is still no option at this point) Discussion/Background: To us the two main show-stoppers are scipy.xplt and the question about an Enthought Edition for Windows. For the Pool of PCs where the tutorial groups are to be held, it won't be a problem to install numpy/scipy in such a way that scipy.sandbox.xplt is visible as scipy.xplt (at least I hope). However, the students will either have windows (around 80%) or Linux at home. For windows users we have used the Enthought Edition (http://code.enthought.com/enthon/) and linux users were pointed to available packages for their machines or to install Numeric/scipy themselves. Concerning xplt another option might be to install scipy.sandbox.xplt in such a way that a `import xplt` would work. If that is possible we could try to supply `xplt` separately for some of the distributions, and maybe also for windows (which I don't use, so I have no idea how difficult that would be). If something like this was possible, the main question is whether a new enthon distribution with new numpy/scipy/ipython and all the other niceties of mayavi/VTK/wxPython/.... will come out in the near future? I would really love to use the new numpy/scipy - so any ideas are very welcome! Best, Arnd From manouchk at gmail.com Wed Feb 15 07:19:05 2006 From: manouchk at gmail.com (manouchk) Date: Wed Feb 15 07:19:05 2006 Subject: [Numpy-discussion] compressed method in doc Message-ID: <200602151326.49403.manouchk@gmail.com> Hi, First of all, I'm quite new to python and numpy (maybe to new!)... I was loooking for a convenient way to convert a (1D) masked array to a numeric array which only contain the not masked values. I spent several hours to figure out that the method "compressed" was the exact thing I was looking for! So I'm wondering what is the common way to find the right method (if there is) for a beginner ? Look at help of all methods (using completion to find all methods with tab key)? I hope the question is not too much basic for numpy mailing-list! Emmanuel From chanley at stsci.edu Wed Feb 15 08:59:16 2006 From: chanley at stsci.edu (Christopher Hanley) Date: Wed Feb 15 08:59:16 2006 Subject: [Numpy-discussion] field method in recarray Message-ID: <43F35DB6.6070905@stsci.edu> Hi Travis, I have added a field method to recarray. This allows field access via either field name or index number. Example below: In [1]: from numpy import rec In [2]: r = rec.fromrecords([[456,'dbe',1.2],[2,'de',1.3]],names='col1,col2,col3') In [3]: r.field('col1') Out[3]: array([456, 2]) In [4]: r.field(0) Out[4]: array([456, 2]) In [5]: r.field(0)[1]=1000 In [6]: r.field(0) Out[6]: array([ 456, 1000]) Chris From cwmoad at gmail.com Wed Feb 15 09:05:05 2006 From: cwmoad at gmail.com (Charlie Moad) Date: Wed Feb 15 09:05:05 2006 Subject: [Numpy-discussion] distutils env variables Message-ID: <6382066a0602150903t3d0667bes55165f23843f7b7b@mail.gmail.com> So numpy's distutils highjacking doesn't seem to respect environment variables. When I try something like... CPPFLAGS="-I/extra/include/path" python setup.py build ... it seems ignored. Is this intentional, or just missing functionality. 
Thanks, Charlie From tim.hochberg at cox.net Wed Feb 15 09:15:02 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Wed Feb 15 09:15:02 2006 Subject: [Numpy-discussion] avoiding the matrix copy performance hit In-Reply-To: <43F302F8.7080200@ftw.at> References: <43F139CD.8010407@cox.net> <43F15DDF.1030807@ieee.org> <43F302F8.7080200@ftw.at> Message-ID: <43F36174.3090703@cox.net> Ed Schofield wrote: >Travis Oliphant wrote: > > > >>I think I originally tried to make mat *not* return a copy, but this >>actually broke code in SciPy. So, I left the default as it was as a >>copy on input. There is an *asmatrix* command that does not return a >>copy... >> >> > >All SciPy's unit tests actually pass with a default of copy=False in the >matrix constructor. So SciPy needn't be the blocker here. I'd like to >cast a vote for not copying by default, in the interests of efficiency >and, as Bill Baxter argued, usefulness. > > I would like to cast a vote for keeping the behaviour the same. Note that: mat([[1,2,3], [3,4,5]]) will always create a copy of its data by necesesity. Which means that changing the default copy=False means that some data will be copied, while others will not be, potentially leading to subtle bugs. I strongly disaprove of this sort of inconstent behaviour (don't get me started on reshape!). In this situation, people should just use asmatrix. -tim >-- Ed > > >------------------------------------------------------- >This SF.net email is sponsored by: Splunk Inc. Do you grep through log files >for problems? Stop! Download the new AJAX search engine that makes >searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! >http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642 >_______________________________________________ >Numpy-discussion mailing list >Numpy-discussion at lists.sourceforge.net >https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > > > From tim.hochberg at cox.net Wed Feb 15 09:37:02 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Wed Feb 15 09:37:02 2006 Subject: [Numpy-discussion] New release of NumPy coming In-Reply-To: <43F26137.5040901@ieee.org> References: <43F26137.5040901@ieee.org> Message-ID: <43F36675.7040501@cox.net> Travis Oliphant wrote: > > I'd like to make a new release of NumPy in the next day or two. If > there are any outstanding issues, please let me know. > Just a datapoint: I just compiled a clean checkout here using VC7 on windows XP and it compiled and passed all tests. -tim From tim.hochberg at cox.net Wed Feb 15 10:03:04 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Wed Feb 15 10:03:04 2006 Subject: [Numpy-discussion] optimizing power() for complex and real cases In-Reply-To: References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> <43F13B20.3000301@cox.net> <43F13DE0.8040309@bigpond.net.au> <43F167C0.7040806@cox.net> <43F20BFE.5030100@cox.net> Message-ID: <43F36CB6.5050004@cox.net> David M. Cooke wrote: >[changed subject to reflect this better] > >Tim Hochberg writes: > > >>David M. Cooke wrote: >> >> >>>Tim Hochberg writes: >>> >>> >>>>David M. Cooke wrote: >>>> 2, Don't distinguish between scalars and arrays -- that just makes >>>>things harder to explain. >>>> >>>> >>>Makes the optimizations better, though. >>> >>> >>> >>> >>Ah, Because you can hoist all the checks for what type of optimization >>to do, if any, out of the core loop, right? That's a good point. 
Still >>I'm not keen on a**b having different performance *and* different >>results depending on whether b is a scalar or matrix. The first thing >>to do is to measure how much overhead doing the optimization element >>by element is going to add. Assuming that it's signifigant that leaves >>us with the familiar dilema: fast, simple or general purpose; pick any >>two. >> >>1. Do what I've proposed: optimize things at the c_pow level. This is >>general purpose and relatively simple to implement (since we can steal >>most of the code from complexobject.c). It may have a signifigant >>speed penalty versus 2 though: >> >>2. Do what you've proposed: optimize things at the ufunc level. This >>fast and relatively simple to implement. It's more limited in scope >>and a bit harder to explain than 2. >> >>3. Do both. This is straightforward, but adds a bunch of extra code >>paths with all the attendant required testing and possibility for >>bugs. So, fast, general purpose, but not simple. >> >> > >Start with #1, then try #2. The problem with #2 is that you still have >to include #1: if you're doing x**y when y is an array, then you have >do if (y==2) etc. checks in your inner loop anyways. In that case, you >might as well do it in nc_pow. At that point, it may be better to move >the #1 optimization to the level of x.__pow__ (see below). > > OK. >>>The speed comes from the inlining of how to do the power _outside_ the >>>inner loop. The reason x**2, etc. are slower currently is there is a >>>function call in the inner loop. Your's and mine C library's pow() >>>function mostly likely does something like I have above, for a single >>>case: pow(x, 2.0) is calculated as x*x. However, each time through it >>>has do decide _how_ to do it. >>> >>> >>> >>> >>Part of our difference in perspective comes from the fact that I've >>just been staring at the guts of complex power. In this case you >>always have function calls at present, even for s*s. (At least I'm >>fairly certain that doesn't get inlined although I haven't checked). >>Since much of the work I do is with complex matrices, it's >>appropriate that I focus on this. >> >> > >Ah, ok, now things are clicking. Complex power is going to be harder, >because making sure that going from x**2.001 to x**2 doesn't do some >funny complex branch cut stuff (I work in reals all the time :-) > > We're always dealing with the principle branch though, so probably we can just ignore any branch cut issues. We'll see I suppose. >For the real numbers, these type of optimizations *are* a big win, and >don't have the same type of continuity problems. I'll put them into numpy soon. > > Complex power is something like thirty times slower than s*s, so there is some room for optimization there. Some peril too though, as you note. > > >>Have you measured the effect of a function call on the speed here, or >>is that just an educated guess. If it's an educated guess, it's >>probably worth determining how much of speed hit the function call >>actually causes. I was going to try to get a handle on this by >>comparing multiplication of Complex numbers (which requires a function >>call plus more math), with multiplication of Floats which does not. >>Perversly, the Complex multiplication came out marginally faster, >>which is hard to explain whichever way you look at it. 
>> >> >> >>>>>timeit.Timer("a*b", "from numpy import arange; a = >>>>> >>>>> >>arange(10000)+0j; b = arange(10000)+0j").time >>it(10000) >>3.2974959107959876 >> >> >>>>>timeit.Timer("a*b", "from numpy import arange; a = arange(10000); >>>>>b >>>>> >>>>> >>= arange(10000)").timeit(100 >>00) >>3.4541194481425919 >> >> > >You're not multiplying floats in the last one: you're multiplying >integers. You either need to use a = arange(10000.0), or a = >arange(10000.0, dtype=float) (to be more specific). > > Doh! Now that's embarassing. Well, when I actually measure float multiplication, it's between two and ten times as fast. For small arrays (N=1000) the difference is relatively small (3.5x), I assume because the setup overhead starts to dominate. For midsized array (N=10,000) the difference is larger (10x). For large arrays (N=100,000) the difference becomes small (2x). Presumably the memory is no longer fitting in the cache and I'm having memory bandwidth issues. >Your integer numbers are about 3x better than mine, though (difference >in architecture, maybe? I'm on an Athlon64). > > I'm on a P4. > > >>>That's why I limited the optimization to scalar exponents: array >>>exponents would mean it's about as slow as the pow() call, even if the >>>checks were inlined into the loop. It would probably be even slower >>>for the non-optimized case, as you'd check for the special exponents, >>>then call pow() if it fails (which would likely recheck the exponents). >>> >>> >>> >>> >>Again, here I'm thinking of the complex case. In that case at least, I >>don't think that the non-optimized case would take a noticeable speed >>hit. I would put it into pow itself, which already special cases a==0 >>and b==0. For float pow it might, but that's already slow, so I doubt >>that it would make much difference. >> >> > >It does make a bit of difference with float pow: the general case >slows down a bit. > > OK. I was hoping that the difference would not be noticeable. I suspect that in the complex pow case, that will be the case since complex pow is so slow to begin with and since it already is doing some testing on the exponent. >>>Maybe a simple way to add this is to rewrite x.__pow__() as something >>>like the C equivalent of >>> >>>def __pow__(self, p): >>> if p is not a scalar: >>> return power(self, p) >>> elif p == 1: >>> return p >>> elif p == 2: >>> return square(self) >>> elif p == 3: >>> return cube(self) >>> elif p == 4: >>> return power_4(self) >>> elif p == 0: >>> return ones(self.shape, dtype=self.dtype) >>> elif p == -1: >>> return 1.0/self >>> elif p == 0.5: >>> return sqrt(self) >>> >>>and add ufuncs square, cube, power_4 (etc.). >>> >>> >>It sounds like we need to benchmark some stuff and see what we come up >>with. One approach would be for each of us to implement this for one >>time (say float) and see how the approaches compare speed wise. That's >>not entirely fair as my approach will do much better at complex than >>float I believe, but it's certainly easier. >> >> > >The way the ufuncs are templated, we can split out the complex >routines easily enough. > >Here's what I propose: > >- add a square() ufunc, where square(x) == x*x (but faster of course) >- I'll fiddle with the floats >- you fiddle with the complex numbers :-) > >I've created a new branch in svn, at >http://svn.scipy.org/svn/numpy/branches/power_optimization >to do this fiddling. The changes below I mention are all checked in as >revision 2104 (http://projects.scipy.org/scipy/numpy/changeset/2104). 
>I've added a square() ufunc to the power_optimization branch because
>I'd argue that it's probably *the* most common use of **. I've
>implemented it, and it's as fast as a*a for reals, and runs in 2/3 the
>time of a*a for complex (which makes sense: squaring a complex number
>takes 3 real multiplications, while multiplying takes 4 in the (simple)
>scheme [1]).
>
>At least with square(), there's no argument about continuity, as it
>only squares :-).

Actually, that's not entirely true. This gets back to the odd inaccuracy that started this thread:

>>> array([1234567j])**2
array([ -1.52415568e+12+0.00018665j])

If you special-case this, the extraneous imaginary value will vanish, but raising things to the 2.000001 power or the 1.999999 power will still be off by a similar amount. I played with this a bunch, though, and I couldn't come up with a plausible way for this to make things break. I suspect I could come up with an implausible one, though.

[some time passes while I sleep and otherwise try to live a normal life....]

>The next step I'd suggest is special-casing x.__pow__, like I suggest
>above. We could just test for integer scalar exponents (0, 1, 2), and
>just special-case those (returning ones(), x.copy(), square(x)), and
>leave all the rest to power().

As I've been thinking about this some more, I think the correct thing to do is not to mess with the power ufuncs at all. Rather, in x.__pow__ (since I don't know that there's anywhere else to do it), after the above checks, check the types of the arguments, and in the cases where the first argument is a float or complex array and the second argument is some sort of integer scalar or integer array, dispatch to some other helper function instead of the normal pow ufunc. In other words, optimize:

A**2, A**2.0, A**(2.0+0j), etc.

and

A**array([1,2,3])

but not

A**array([1.0, 2.0, 3.0])

I think that this takes care of the optimization slowing down power for general floats, and it optimizes the only array-array case that really matters.

>I've also checked in code to the power_optimization branch that
>special-cases power(x, scalar), or anytime the basic ufunc
>gets called with a stride of 0 for the exponent. It doesn't do complex
>x, so no problems on your side, but it's a good chunk faster for this
>case than what we've got now. One reason I'm also looking at adding
>square() is because my optimization of power() makes x**2 run (only)
>1.5x slower than x*x (and I can't for the life of me see where that 0.5
>is coming from! It should be 1.0x, like square()!).

I just checked out your branch and I'll fiddle with the complex stuff as I've got time. I've got relatives in town this week, so my extra cycles just dropped precipitously.

>[1] which brings up another point. Would using the 3-multiplication
>version for complex multiplication be good? There might be some
>effects with cancellation errors due to the extra subtractions...

I'm inclined to leave this be for now, both because I'm unsure of the rounding issues and because I'm not sure it would actually be faster. It has one less multiplication but several more additions, so it would depend on the relative speed of add/sub versus multiplication and on how things end up getting scheduled in the FP pipeline. At some point it's probably worth trying; if it turns out to be significantly faster we can think about rounding then. If it's not faster, then there's no need to think.
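For reference, the 3-multiplication scheme in question is the usual Gauss-style trick. A minimal Python sketch of the algebra (illustrative only -- the real change would be in the C inner loop, and this says nothing about the rounding behavior we're worried about):

def cmul3(a, b):
    # The usual complex multiply needs 4 real multiplications:
    #   (ar*br - ai*bi) + (ar*bi + ai*br)j
    # The 3-multiplication variant trades one multiply for extra add/subs:
    t1 = a.real * b.real
    t2 = a.imag * b.imag
    t3 = (a.real + a.imag) * (b.real + b.imag)
    # t3 - t1 - t2 == ar*bi + ai*br, which is the imaginary part.
    return complex(t1 - t2, t3 - t1 - t2)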
-tim

From oliphant.travis at ieee.org  Wed Feb 15 10:24:02 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Wed Feb 15 10:24:02 2006
Subject: [Numpy-discussion] numpy/scipy transition
In-Reply-To: 
References: 
Message-ID: <43F37183.3000103@ieee.org>

Arnd Baecker wrote:

>Reasons not to switch
>- there is no enthought edition yet (right?)
>- there are only packages for a few platforms/distributions
>- we need scipy.xplt
>  (matplotlib is still no option at this point)
>
>Discussion/Background:
>
>To us the two main show-stoppers are scipy.xplt and the question
>about an Enthought Edition for Windows.
>For the Pool of PCs where the tutorial groups are to be held,
>it won't be a problem to install numpy/scipy in such
>a way that scipy.sandbox.xplt is visible as scipy.xplt
>(at least I hope).
>However, the students will either have Windows (around 80%)
>or Linux at home. For Windows users we have used
>the Enthought Edition (http://code.enthought.com/enthon/)
>and Linux users were pointed to
>available packages for their machines or to install Numeric/scipy
>themselves.

As long as there are binaries for all the packages. Just having a list of Windows installers can also work. Were you using all of what is in the Enthon edition?

>Concerning xplt another option might be to
>install scipy.sandbox.xplt in such a way
>that an `import xplt` would work. If that is possible we could
>try to supply `xplt` separately for some of the distributions,
>and maybe also for Windows (which I don't use, so I have
>no idea how difficult that would be).

I don't think that would be hard at all. You can just run python setup.py bdist_wininst from within the sandbox/xplt directory and get a Windows installer.

>If something like this was possible, the main question is
>whether a new enthon distribution with new numpy/scipy/ipython
>and all the other niceties of mayavi/VTK/wxPython/....
>will come out in the near future?

I have no idea about that one. But, it sounds like the guy (Joe) at Enthought who did most of the work on the Enthon distribution is no longer as available for them, so I'm not sure...

-Travis

From stefan at sun.ac.za  Wed Feb 15 10:38:05 2006
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Wed Feb 15 10:38:05 2006
Subject: [Numpy-discussion] possible bug in dtype for records
Message-ID: <20060215184056.GA31926@alpha>

Using

In [3]: numpy.__version__
Out[3]: '0.9.5.2024'

I see the following:

In [4]: import numpy as N

In [5]: ctype = N.dtype({'names': ('x', 'y', 'z'), 'formats' : [N.float32, N.float32, N.float32]})

In [6]: ctype
Out[6]: dtype([('x', '<f4'), ('y', '<f4'), ('z', '<f4')])

In [7]: N.array([(1,2,3), (4,5,6)], dtype=ctype)
Segmentation fault

However, when I use a mutable list for defining dtype, i.e.

'names': ['x', 'y', 'z'] instead of
'names': ('x', 'y', 'z')

it works fine.

Is this expected behaviour?

From oliphant at ee.byu.edu  Wed Feb 15 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Wed Feb 15 2006
Subject: [Numpy-discussion] possible bug in dtype for records
In-Reply-To: <20060215184056.GA31926@alpha>
References: <20060215184056.GA31926@alpha>
Message-ID: <43F37C1E.4010409@ee.byu.edu>

Stefan van der Walt wrote:

>Using
>
>In [3]: numpy.__version__
>Out[3]: '0.9.5.2024'
>
>I see the following:
>
>In [4]: import numpy as N
>
>In [5]: ctype = N.dtype({'names': ('x', 'y', 'z'), 'formats' : [N.float32, N.float32, N.float32]})
>
>In [6]: ctype
>Out[6]: dtype([('x', '<f4'), ('y', '<f4'), ('z', '<f4')])
>
>In [7]: N.array([(1,2,3), (4,5,6)], dtype=ctype)
>Segmentation fault

A segmentation fault is never expected behavior. Thanks for pointing this out. I'll see if I can figure out what is wrong.
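In the meantime, the mutable-list form mentioned above works as a workaround. A quick sketch (assuming a numpy build of this vintage; on a fixed build both forms should behave the same):

import numpy as N

# Using a list (rather than a tuple) for 'names' avoids the crash:
ctype = N.dtype({'names': ['x', 'y', 'z'],
                 'formats': [N.float32, N.float32, N.float32]})
a = N.array([(1, 2, 3), (4, 5, 6)], dtype=ctype)
print a['x']   # -> [ 1.  4.]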
-Travis

From oliphant at ee.byu.edu  Wed Feb 15 11:27:14 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Wed Feb 15 11:27:14 2006
Subject: [Numpy-discussion] possible bug in dtype for records
In-Reply-To: <20060215184056.GA31926@alpha>
References: <20060215184056.GA31926@alpha>
Message-ID: <43F38054.4000704@ee.byu.edu>

Stefan van der Walt wrote:

>Using
>
>In [3]: numpy.__version__
>Out[3]: '0.9.5.2024'
>
>I see the following:
>
>In [4]: import numpy as N
>
>In [5]: ctype = N.dtype({'names': ('x', 'y', 'z'), 'formats' : [N.float32, N.float32, N.float32]})
>
>In [6]: ctype
>Out[6]: dtype([('x', '<f4'), ('y', '<f4'), ('z', '<f4')])
>
>In [7]: N.array([(1,2,3), (4,5,6)], dtype=ctype)
>Segmentation fault
>
>However, when I use a mutable list for defining dtype, i.e.
>
>'names': ['x', 'y', 'z'] instead of
>'names': ('x', 'y', 'z')
>
>it works fine.
>
>Is this expected behaviour?

Got it. Basically, the VOID_setitem code expected the special -1 entry in the fields dictionary to be a list, but this was never enforced, so when you entered a tuple like this, problems arose. I changed it in SVN so that the -1 entry is always a tuple (you can't change the field names anyway unless you define a new data-type), so it's more aptly described as a tuple.

Thanks again for finding this...

-Travis

From viznut at charter.net  Wed Feb 15 11:43:03 2006
From: viznut at charter.net (Randall Hopper)
Date: Wed Feb 15 11:43:03 2006
Subject: [Numpy-discussion] Numeric.identity(4) failure
Message-ID: 

Here on 64-bit Linux, I get strange Python errors with some Numeric functions (see below). Has this been fixed in more recent versions of Numeric? Currently running python-numeric-23.7-3, which is what comes with SuSE 9.3.

Thanks,

Randall

Python 2.4 (#1, Mar 22 2005, 18:42:42)
[GCC 3.3.5 20050117 (prerelease) (SUSE Linux)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import Numeric
>>> Numeric.identity(4)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/lib64/python2.4/site-packages/Numeric/Numeric.py", line 604, in identity
    return resize(array([1]+n*[0],typecode=typecode), (n,n))
  File "/usr/lib64/python2.4/site-packages/Numeric/Numeric.py", line 398, in resize
    return reshape(a, new_shape)
ValueError: total size of new array must be unchanged

From Chris.Barker at noaa.gov  Wed Feb 15 12:00:08 2006
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Wed Feb 15 12:00:08 2006
Subject: [Numpy-discussion] numpy/scipy transition
In-Reply-To: 
References: 
Message-ID: <43F3881E.6060101@noaa.gov>

Arnd Baecker wrote:
> - we need scipy.xplt
>   (matplotlib is still no option at this point)

Why not? Just curious.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer
NOAA/OR&R/HAZMAT         (206) 526-6959 voice
7600 Sand Point Way NE   (206) 526-6329 fax
Seattle, WA 98115        (206) 526-6317 main reception
Chris.Barker at noaa.gov

From oliphant at ee.byu.edu  Wed Feb 15 13:22:18 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Wed Feb 15 13:22:18 2006
Subject: [Numpy-discussion] Re: [SciPy-user] Where do I put site.cfg? What should it contain? Can I build w/o atlas?
In-Reply-To: <43F3837B.5040603@gmail.com>
References: <17395.33239.806646.855808@montanaro.dyndns.org> <43F3837B.5040603@gmail.com>
Message-ID: <43F39B20.50502@ee.byu.edu>

Robert Kern wrote:

>skip at pobox.com wrote:
>
>>I'm trying to build numpy from an svn sandbox (just updated a couple minutes
>>ago). If I grub around in numpy/distutils/system_info.py it says something
>>about creating a site.cfg file with (for example) information about locating
>>atlas.
>>It says nothing about where this file belongs.
>
>Sure it does. "The file 'site.cfg' in the same directory as this module is read
>for configuration options." I think it's a really bad place for it to be, but
>that is the state of affairs right now.

So, in particular, does this mean that it is read from (relative to the location of the main setup.py file)

numpy/distutils/site.cfg ??

Yes, that is a bad place. We need some suggestions as to where site.cfg should be read from.

I think you can set the environment variable ATLAS to 'None' and it will ignore ATLAS... I believe this is true of any of the configuration options.

-Travis

From oliphant at ee.byu.edu  Wed Feb 15 13:27:14 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Wed Feb 15 13:27:14 2006
Subject: [Numpy-discussion] Re: [SciPy-user] Where do I put site.cfg? What should it contain? Can I build w/o atlas?
In-Reply-To: <17395.33239.806646.855808@montanaro.dyndns.org>
References: <17395.33239.806646.855808@montanaro.dyndns.org>
Message-ID: <43F39C96.6080506@ee.byu.edu>

skip at pobox.com wrote:

>I'm trying to build numpy from an svn sandbox (just updated a couple minutes
>ago). If I grub around in numpy/distutils/system_info.py it says something
>about creating a site.cfg file with (for example) information about locating
>atlas. It says nothing about where this file belongs. I took a stab and
>placed it in my numpy source tree, right next to setup.py, with these lines:
>
>Failing all this, is there some way to build numpy/scipy without atlas? At
>this point I just want the damn thing to build. I'll worry about
>performance later (if at all).

Yes. You need to set the appropriate environment variables to 'None'. In particular, on my system (which is multithreaded and has a BLAS picked up from /usr/lib and an unthreaded ATLAS that the system will find)

export PTATLAS='None'
export ATLAS='None'
export BLAS='None'

did the trick.

-Travis

From robert.kern at gmail.com  Wed Feb 15 13:46:10 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Wed Feb 15 13:46:10 2006
Subject: [Numpy-discussion] Re: [SciPy-user] Where do I put site.cfg? What should it contain? Can I build w/o atlas?
In-Reply-To: <43F39B20.50502@ee.byu.edu>
References: <17395.33239.806646.855808@montanaro.dyndns.org> <43F3837B.5040603@gmail.com> <43F39B20.50502@ee.byu.edu>
Message-ID: <43F3A0FD.4000004@gmail.com>

Travis Oliphant wrote:

> So, in particular, does this mean that it is read from (relative to the
> location of the main setup.py file)
>
> numpy/distutils/site.cfg ??

Yes. And in the case of scipy, it needs to be in the *installed* numpy/distutils directory.

> Yes, that is a bad place. We need some suggestions as to where
> site.cfg should be read from.

Next to the setup.py that *invokes* numpy.distutils. AFAICT using os.getcwd() in system_info.py will give you this directory even if you start running the script from a different directory (e.g. "python ~/svn/scipy/setup.py install"). I can check this in if we agree that this is what we want.

--
Robert Kern
robert.kern at gmail.com

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
  -- Richard Harter

From cookedm at physics.mcmaster.ca  Wed Feb 15 14:06:02 2006
From: cookedm at physics.mcmaster.ca (David M. Cooke)
Date: Wed Feb 15 14:06:02 2006
Subject: [Numpy-discussion] Re: [SciPy-user] Where do I put site.cfg? What should it contain? Can I build w/o atlas?
In-Reply-To: <43F3A0FD.4000004@gmail.com>
References: <17395.33239.806646.855808@montanaro.dyndns.org> <43F3837B.5040603@gmail.com> <43F39B20.50502@ee.byu.edu> <43F3A0FD.4000004@gmail.com>
Message-ID: <20060215220433.GA30194@arbutus.physics.mcmaster.ca>

On Wed, Feb 15, 2006 at 03:45:33PM -0600, Robert Kern wrote:
> Travis Oliphant wrote:
>
> > So, in particular, does this mean that it is read from (relative to the
> > location of the main setup.py file)
> >
> > numpy/distutils/site.cfg ??
>
> Yes. And in the case of scipy, it needs to be in the *installed* numpy/distutils
> directory.
>
> > Yes, that is a bad place. We need some suggestions as to where
> > site.cfg should be read from.
>
> Next to the setup.py that *invokes* numpy.distutils. AFAICT using os.getcwd() in
> system_info.py will give you this directory even if you start running the script
> from a different directory (e.g. "python ~/svn/scipy/setup.py install"). I can
> check this in if we agree that this is what we want.

As a note: Python's distutils looks for distutils.cfg in its installed location (/usr/lib/python2.4/distutils or whatever), then in ~/.pydistutils.cfg (or $HOME/pydistutils.cfg on non-Posix systems like Windows), then for setup.cfg in the current directory. Keys in later files override ones in earlier files.

--
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke                      http://arbutus.physics.mcmaster.ca/dmc/
|cookedm at physics.mcmaster.ca

From oliphant at ee.byu.edu  Wed Feb 15 14:15:01 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Wed Feb 15 14:15:01 2006
Subject: [Numpy-discussion] Re: [SciPy-user] Where do I put site.cfg? What should it contain? Can I build w/o atlas?
In-Reply-To: <43F3A0FD.4000004@gmail.com>
References: <17395.33239.806646.855808@montanaro.dyndns.org> <43F3837B.5040603@gmail.com> <43F39B20.50502@ee.byu.edu> <43F3A0FD.4000004@gmail.com>
Message-ID: <43F3A7A4.6040703@ee.byu.edu>

Robert Kern wrote:

>Travis Oliphant wrote:
>
>>So, in particular, does this mean that it is read from (relative to the
>>location of the main setup.py file)
>>
>>numpy/distutils/site.cfg ??
>
>Yes. And in the case of scipy, it needs to be in the *installed* numpy/distutils
>directory.
>
>>Yes, that is a bad place. We need some suggestions as to where
>>site.cfg should be read from.
>
>Next to the setup.py that *invokes* numpy.distutils. AFAICT using os.getcwd() in
>system_info.py will give you this directory even if you start running the script
>from a different directory (e.g. "python ~/svn/scipy/setup.py install"). I can
>check this in if we agree that this is what we want.

I've started this process already. I think a useful search order is

1) next to the current setup.py --- os.getcwd() is probably better than what I did (backing up the frame until you can't go back anymore and getting the __file__ from that frame). Incidentally, it looks like a site.cfg present there is already copied to numpy/distutils on install --- it looks like it's just not used for the numpy build itself.

2) in the user's "HOME" directory --- not sure how to implement that.

3) in the system-wide directory (what is currently done --- except when you are installing numpy that means it has to be in numpy/distutils/site.cfg).

I created a get_site_cfg() function in system_info where this searching can be done. Feel free to change it as appropriate.
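Roughly, the shape this takes -- a sketch only (the actual get_site_cfg() in system_info.py may differ in the details; the point is that ConfigParser.read() accepts a list of files and lets later ones override earlier ones):

import os
from ConfigParser import ConfigParser  # Python 2.x

def get_site_cfg():
    # Candidate locations, lowest to highest priority:
    candidates = [
        os.path.join(os.path.dirname(__file__), 'site.cfg'),  # system-wide
        os.path.join(os.path.expanduser('~'), 'site.cfg'),    # user's HOME
        os.path.join(os.getcwd(), 'site.cfg'),                # next to setup.py
    ]
    cfg = ConfigParser()
    # read() silently skips missing files; keys from later files
    # override those from earlier ones.
    cfg.read(candidates)
    return cfg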
-Travis

From oliphant at ee.byu.edu  Wed Feb 15 14:16:02 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Wed Feb 15 14:16:02 2006
Subject: [Numpy-discussion] Re: [SciPy-user] Where do I put site.cfg? What should it contain? Can I build w/o atlas?
In-Reply-To: <20060215220433.GA30194@arbutus.physics.mcmaster.ca>
References: <17395.33239.806646.855808@montanaro.dyndns.org> <43F3837B.5040603@gmail.com> <43F39B20.50502@ee.byu.edu> <43F3A0FD.4000004@gmail.com> <20060215220433.GA30194@arbutus.physics.mcmaster.ca>
Message-ID: <43F3A800.4020508@ee.byu.edu>

David M. Cooke wrote:

>On Wed, Feb 15, 2006 at 03:45:33PM -0600, Robert Kern wrote:
>
>>Travis Oliphant wrote:
>>
>>>So, in particular, does this mean that it is read from (relative to the
>>>location of the main setup.py file)
>>>
>>>numpy/distutils/site.cfg ??
>>
>>Yes. And in the case of scipy, it needs to be in the *installed* numpy/distutils
>>directory.
>>
>>>Yes, that is a bad place. We need some suggestions as to where
>>>site.cfg should be read from.
>>
>>Next to the setup.py that *invokes* numpy.distutils. AFAICT using os.getcwd() in
>>system_info.py will give you this directory even if you start running the script
>>from a different directory (e.g. "python ~/svn/scipy/setup.py install"). I can
>>check this in if we agree that this is what we want.
>
>As a note: Python's distutils looks for distutils.cfg in its installed
>location (/usr/lib/python2.4/distutils or whatever), then in
>~/.pydistutils.cfg (or $HOME/pydistutils.cfg on non-Posix systems like
>Windows), then for setup.cfg in the current directory. Keys in later files
>override ones in earlier files.

I think this is a good plan. However, what I've started doesn't implement the overriding process properly. Anybody want to take a stab at that? It would be nice if it could get into this next release.

-Travis

From cookedm at physics.mcmaster.ca  Wed Feb 15 14:45:07 2006
From: cookedm at physics.mcmaster.ca (David M. Cooke)
Date: Wed Feb 15 14:45:07 2006
Subject: [Numpy-discussion] Re: [SciPy-user] Where do I put site.cfg? What should it contain? Can I build w/o atlas?
In-Reply-To: <43F3A7A4.6040703@ee.byu.edu>
References: <17395.33239.806646.855808@montanaro.dyndns.org> <43F3837B.5040603@gmail.com> <43F39B20.50502@ee.byu.edu> <43F3A0FD.4000004@gmail.com> <43F3A7A4.6040703@ee.byu.edu>
Message-ID: <20060215224318.GA30435@arbutus.physics.mcmaster.ca>

On Wed, Feb 15, 2006 at 03:13:56PM -0700, Travis Oliphant wrote:
> Robert Kern wrote:
> I've started this process already. I think a useful search order is
>
> 1) next to the current setup.py --- os.getcwd() is probably better than
> what I did (backing up the frame until you can't go back anymore and
> getting the __file__ from that frame). Incidentally, it looks like a
> site.cfg present there is already copied to numpy/distutils on install
> --- it looks like it's just not used for the numpy build itself.
>
> 2) in the user's "HOME" directory --- not sure how to implement that.

Have a look at distutils.dist for the Distribution.find_config_files method. Also, the parse_config_files method reads the config options in a way that keeps track of which filenames they come from.

> 3) in the system-wide directory (what is currently done --- except when
> you are installing numpy that means it has to be in
> numpy/distutils/site.cfg).

--
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke                      http://arbutus.physics.mcmaster.ca/dmc/
|cookedm at physics.mcmaster.ca
From gruben at bigpond.net.au  Wed Feb 15 15:27:08 2006
From: gruben at bigpond.net.au (Gary Ruben)
Date: Wed Feb 15 15:27:08 2006
Subject: [Numpy-discussion] optimizing power() for complex and real cases
In-Reply-To: <43F36CB6.5050004@cox.net>
References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> <43F13B20.3000301@cox.net> <43F13DE0.8040309@bigpond.net.au> <43F167C0.7040806@cox.net> <43F20BFE.5030100@cox.net> <43F36CB6.5050004@cox.net>
Message-ID: <43F3B8A9.3000507@bigpond.net.au>

Tim Hochberg wrote:

> As I've been thinking about this some more, I think the correct thing to
> do is not to mess with the power ufuncs at all. Rather, in x.__pow__
> (since I don't know that there's anywhere else to do it), after the
> above checks, check the types of the array, and in the cases where the
> first argument is a float or complex and the second argument is some
> sort of integer array, dispatch to some other helper function instead
> of the normal pow_ufunc. In other words, optimize:
>
> A**2, A**2.0, A**(2.0+0j), etc.
>
> and
>
> A**array([1,2,3])
>
> but not
>
> A**array([1.0, 2.0, 3.0])
>
> I think that this takes care of the optimization slowing down power for
> general floats and optimizes the only array-array case that really matters.

I think this might still be a tiny bit dangerous despite not observing monotonicity problems, and I would be a bit more conservative, changing it to:

optimize:

A**2, A**(2+0j), etc.

and

A**array([1,2,3])

but not

A**array([1.0, 2.0, 3.0]), A**2.0, A**(2.0+0j)

--
Gary R.

From tim.hochberg at cox.net  Wed Feb 15 16:42:02 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Wed Feb 15 16:42:02 2006
Subject: [Numpy-discussion] optimizing power() for complex and real cases
In-Reply-To: <43F3B8A9.3000507@bigpond.net.au>
References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> <43F13B20.3000301@cox.net> <43F13DE0.8040309@bigpond.net.au> <43F167C0.7040806@cox.net> <43F20BFE.5030100@cox.net> <43F36CB6.5050004@cox.net> <43F3B8A9.3000507@bigpond.net.au>
Message-ID: <43F3CA12.4000907@cox.net>

Gary Ruben wrote:

> Tim Hochberg wrote:
>
>> As I've been thinking about this some more, I think the correct thing
>> to do is not to mess with the power ufuncs at all. Rather, in
>> x.__pow__ (since I don't know that there's anywhere else to do it),
>> after the above checks, check the types of the array, and in the cases
>> where the first argument is a float or complex and the second
>> argument is some sort of integer array, dispatch to some other helper
>> function instead of the normal pow_ufunc. In other words, optimize:
>>
>> A**2, A**2.0, A**(2.0+0j), etc.
>>
>> and
>>
>> A**array([1,2,3])
>>
>> but not
>>
>> A**array([1.0, 2.0, 3.0])
>>
>> I think that this takes care of the optimization slowing down power for
>> general floats and optimizes the only array-array case that really
>> matters.
>
> I think this might still be a tiny bit dangerous despite not observing
> monotonicity problems, and I would be a bit more conservative, changing
> it to:
>
> optimize:
>
> A**2, A**(2+0j), etc.

I'm guessing here that you did not mean to include (2+0j) on both lists and that, in fact, you wanted not to optimize on complex exponents. So, optimize:

A**-1, A**0, A**1, A**2, etc.

>
> and
>
> A**array([1,2,3])
>
> but not
>
> A**array([1.0, 2.0, 3.0]), A**2.0, A**(2.0+0j)

That makes sense.
It's safer and easier to explain: "numpy optimizes raising matrices (and possibly scalars) to integer powers". The only sticking point that I see is if David is still interested in optimizing A**0.5; that's not going to mesh with this. On the other hand, perhaps he can be persuaded that sqrt(A) is just as good. After all, it's only one more character long ;)

-tim

From oliphant.travis at ieee.org  Wed Feb 15 17:10:13 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Wed Feb 15 17:10:13 2006
Subject: [Numpy-discussion] Re: New release of NumPy coming
In-Reply-To: <200602151405.47640.faltet@carabos.com>
References: <43F26137.5040901@ieee.org> <200602151100.37229.faltet@carabos.com> <200602151405.47640.faltet@carabos.com>
Message-ID: 

Francesc Altet wrote:

> A Dimecres 15 Febrer 2006 11:18, Pearu Peterson va escriure:
>
>>On Wed, 15 Feb 2006, Francesc Altet wrote:
>>
>>>I've been trying to see how to correctly load the unicode tests, but
>>>failed miserably. Perhaps Pearu can tell us about the correct way to
>>>do that.
>>
>>I have fixed it in svn. When importing modules from the tests/ directory,
>>one must surround the corresponding import statements with set_local_path()
>>and restore_path() calls.
>
> Ah, ok. Is there any place where this is explained, or do we have to use
> the source to figure out these sorts of things?

Good thing to put in the doc subdirectory.... I was obviously not certain about the use of set_local_path() myself until Pearu corrected me.

-Travis

From harrison.ian at gmail.com  Wed Feb 15 17:21:17 2006
From: harrison.ian at gmail.com (Ian Harrison)
Date: Wed Feb 15 17:21:17 2006
Subject: [Numpy-discussion] cross product of two 3xn arrays
Message-ID: <764834db0602151720r77370d7bx327e2a6dca954dbe@mail.gmail.com>

Hello,

I have two groups of 3x1 arrays that are arranged into two larger 3xn arrays. Each of the 3x1 sub-arrays represents a vector in 3D space. In Matlab, I'd use the function cross() to calculate the cross product of the corresponding 'vectors' from each array. In other words:

if ai and bi are 3x1 column vectors:

A = [ a1 a2 a3 ]

B = [ b1 b2 b3 ]

C = A x B = [ (a1 x b1)  (a2 x b2)  (a3 x b3) ]

Could someone suggest a clean way to do this? I suppose I could write a for loop to cycle through each pair of vectors and send them to numpy's cross(), but since I'm new to python/scipy/numpy, I'm guessing that there's probably a better method that I'm overlooking.

Thanks,
Ian

From gruben at bigpond.net.au  Wed Feb 15 17:25:11 2006
From: gruben at bigpond.net.au (Gary Ruben)
Date: Wed Feb 15 17:25:11 2006
Subject: [Numpy-discussion] optimizing power() for complex and real cases
In-Reply-To: <43F3CA12.4000907@cox.net>
References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> <43F13B20.3000301@cox.net> <43F13DE0.8040309@bigpond.net.au> <43F167C0.7040806@cox.net> <43F20BFE.5030100@cox.net> <43F36CB6.5050004@cox.net> <43F3B8A9.3000507@bigpond.net.au> <43F3CA12.4000907@cox.net>
Message-ID: <43F3D43B.6090301@bigpond.net.au>

Tim Hochberg wrote:

>> optimize:
>>
>> A**2, A**(2+0j), etc.
>
> I'm guessing here that you did not mean to include (2+0j) on both lists
> and that, in fact, you wanted not to optimize on complex exponents.

Oops. The complex index in the optimise list has integer parts. I assumed Python distinguished between complex numbers with integer parts and those with real (non-integer) parts, but it doesn't, so you're correct about which cases I'd vote to optimise.

> So, optimize:
>
> A**-1, A**0, A**1, A**2, etc.
>
>> and
>>
>> A**array([1,2,3])
>>
>> but not
>>
>> A**array([1.0, 2.0, 3.0]), A**2.0, A**(2.0+0j)
>
> That makes sense. It's safer and easier to explain: "numpy optimizes
> raising matrices (and possibly scalars) to integer powers". The only
> sticking point that I see is if David is still interested in optimizing
> A**0.5; that's not going to mesh with this. On the other hand, perhaps
> he can be persuaded that sqrt(A) is just as good. After all, it's only
> one more character long ;)
>
> -tim

From skip at pobox.com  Wed Feb 15 17:39:32 2006
From: skip at pobox.com (skip at pobox.com)
Date: Wed Feb 15 17:39:32 2006
Subject: [Numpy-discussion] Re: [SciPy-user] Where do I put site.cfg? What should it contain? Can I build w/o atlas?
In-Reply-To: <43F39B20.50502@ee.byu.edu>
References: <17395.33239.806646.855808@montanaro.dyndns.org> <43F3837B.5040603@gmail.com> <43F39B20.50502@ee.byu.edu>
Message-ID: <17395.55188.130052.595605@montanaro.dyndns.org>

    Travis> Yes, that is a bad place. We need some suggestions as to where
    Travis> site.cfg should be read from.

First place to look should be `pwd`.

    Travis> I think you can set the environment variable ATLAS to 'None' and
    Travis> it will ignore ATLAS...

Thank you, thank you, thank you. I now have numpy built... I'll tackle the rest of scipy mañana.

Skip

From gruben at bigpond.net.au  Wed Feb 15 18:04:02 2006
From: gruben at bigpond.net.au (Gary Ruben)
Date: Wed Feb 15 18:04:02 2006
Subject: [Numpy-discussion] cross product of two 3xn arrays
In-Reply-To: <764834db0602151720r77370d7bx327e2a6dca954dbe@mail.gmail.com>
References: <764834db0602151720r77370d7bx327e2a6dca954dbe@mail.gmail.com>
Message-ID: <43F3DD7B.1010107@bigpond.net.au>

This *almost* does what you want, I think. I can't see a neat way to give column vectors in the solution:

In [21]: a=array([[[1],[2],[3]],[[4],[5],[6]],[[7],[8],[9]]])
In [22]: b=array([[[1],[2],[4]],[[4],[5],[7]],[[7],[8],[10]]])
In [24]: cross(a.transpose(),b.transpose())
Out[24]:
array([[[ 2, -1,  0],
        [ 5, -4,  0],
        [ 8, -7,  0]]])

Gary R.

Ian Harrison wrote:
> Hello,
>
> I have two groups of 3x1 arrays that are arranged into two larger 3xn
> arrays. Each of the 3x1 sub-arrays represents a vector in 3D space. In
> Matlab, I'd use the function cross() to calculate the cross product of
> the corresponding 'vectors' from each array. In other words:
>
> if ai and bi are 3x1 column vectors:
>
> A = [ a1 a2 a3 ]
>
> B = [ b1 b2 b3 ]
>
> C = A x B = [ (a1 x b1)  (a2 x b2)  (a3 x b3) ]
>
> Could someone suggest a clean way to do this? I suppose I could write
> a for loop to cycle through each pair of vectors and send them to
> numpy's cross(), but since I'm new to python/scipy/numpy, I'm guessing
> that there's probably a better method that I'm overlooking.
>
> Thanks,
> Ian

From oliphant.travis at ieee.org  Wed Feb 15 19:16:04 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Wed Feb 15 19:16:04 2006
Subject: [Numpy-discussion] cross product of two 3xn arrays
In-Reply-To: <764834db0602151720r77370d7bx327e2a6dca954dbe@mail.gmail.com>
References: <764834db0602151720r77370d7bx327e2a6dca954dbe@mail.gmail.com>
Message-ID: <43F3EE4E.6030301@ieee.org>

Ian Harrison wrote:

>Hello,
>
>I have two groups of 3x1 arrays that are arranged into two larger 3xn
>arrays. Each of the 3x1 sub-arrays represents a vector in 3D space. In
>Matlab, I'd use the function cross() to calculate the cross product of
>the corresponding 'vectors' from each array.
>In other words:

Help on function cross in module numpy.core.numeric:

cross(a, b, axisa=-1, axisb=-1, axisc=-1)
    Return the cross product of two (arrays of) vectors.

    The cross product is performed over the last axis of a and b by default,
    and can handle axes with dimensions 2 and 3. For a dimension of 2,
    the z-component of the equivalent three-dimensional cross product is
    returned.

It's the axisa, axisb, and axisc arguments that you are interested in.

The default is to assume you have Nx3 arrays and return an Nx3 array. But you can change the axis used to find vectors.

cross(A,B,axisa=0,axisb=0,axisc=0)

will do what you want. I suppose a single axis= argument might be useful as well for the common situation of having all the other axis arguments be the same.

-Travis

>if ai and bi are 3x1 column vectors:
>
>A = [ a1 a2 a3 ]
>
>B = [ b1 b2 b3 ]
>
>C = A x B = [ (a1 x b1)  (a2 x b2)  (a3 x b3) ]
>
>Could someone suggest a clean way to do this? I suppose I could write
>a for loop to cycle through each pair of vectors and send them to
>numpy's cross(), but since I'm new to python/scipy/numpy, I'm guessing
>that there's probably a better method that I'm overlooking.
>
>Thanks,
>Ian

From oliphant.travis at ieee.org  Wed Feb 15 21:58:02 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Wed Feb 15 21:58:02 2006
Subject: [Numpy-discussion] Release tomorrow
Message-ID: <43F41440.2040504@ieee.org>

I just made some changes to the system_info file to parse site.cfg from the three standard locations (in order):

1) System-wide (the location of numpy/distutils/system_info.py)
2) User's HOME directory
3) Current working directory

I'm assuming the config parser will update the appropriate sections as later files are read.

I'd like to make a release tomorrow, but would like a code review on my changes in this section beforehand. If somebody who uses site.cfg could try out the SVN version and see if it works as expected, that would be great.

-Travis

From gruben at bigpond.net.au  Thu Feb 16 01:00:04 2006
From: gruben at bigpond.net.au (Gary Ruben)
Date: Thu Feb 16 01:00:04 2006
Subject: [Numpy-discussion] cross product of two 3xn arrays
In-Reply-To: <43F3EE4E.6030301@ieee.org>
References: <764834db0602151720r77370d7bx327e2a6dca954dbe@mail.gmail.com> <43F3EE4E.6030301@ieee.org>
Message-ID: <43F43F02.1090001@bigpond.net.au>

Hi Travis,

Have you tested this? It appears to give the wrong answer on my system. I expect to get from this

In [21]: a=array([[[1],[2],[3]],[[4],[5],[6]],[[7],[8],[9]]])
In [22]: b=array([[[1],[2],[4]],[[4],[5],[7]],[[7],[8],[10]]])

the solution

array([[[ 2],
        [-1],
        [ 0]],

       [[ 5],
        [-4],
        [ 0]],

       [[ 8],
        [-7],
        [ 0]]])

i.e. the same as my example but with column vectors instead of rows, but doing

cross(a,b,axisa=0,axisb=0,axisc=0)

gives

Out[15]:
array([[[ 0],
        [ 0],
        [-3]],

       [[ 0],
        [ 0],
        [ 6]],

       [[ 0],
        [ 0],
        [-3]]])

Gary R.

Travis Oliphant wrote:
> Ian Harrison wrote:
>
>> Hello,
>>
>> I have two groups of 3x1 arrays that are arranged into two larger 3xn
>> arrays.
>> Each of the 3x1 sub-arrays represents a vector in 3D space. In
>> Matlab, I'd use the function cross() to calculate the cross product of
>> the corresponding 'vectors' from each array. In other words:
>
> Help on function cross in module numpy.core.numeric:
>
> cross(a, b, axisa=-1, axisb=-1, axisc=-1)
>     Return the cross product of two (arrays of) vectors.
>
>     The cross product is performed over the last axis of a and b by default,
>     and can handle axes with dimensions 2 and 3. For a dimension of 2,
>     the z-component of the equivalent three-dimensional cross product is
>     returned.
>
> It's the axisa, axisb, and axisc arguments that you are interested in.
>
> The default is to assume you have Nx3 arrays and return an Nx3 array.
> But you can change the axis used to find vectors.
>
> cross(A,B,axisa=0,axisb=0,axisc=0)
>
> will do what you want. I suppose a single axis= argument might be
> useful as well for the common situation of having all the other axis
> arguments be the same.
>
> -Travis

From arnd.baecker at web.de  Thu Feb 16 01:47:03 2006
From: arnd.baecker at web.de (Arnd Baecker)
Date: Thu Feb 16 01:47:03 2006
Subject: [Numpy-discussion] numpy/scipy transition
In-Reply-To: <43F37183.3000103@ieee.org>
References: <43F37183.3000103@ieee.org>
Message-ID: 

Hi,

On Wed, 15 Feb 2006, Travis Oliphant wrote:

> Arnd Baecker wrote:
>
> >Reasons not to switch
> >- there is no enthought edition yet (right?)
> >- there are only packages for a few platforms/distributions
> >- we need scipy.xplt
> >  (matplotlib is still no option at this point)
> >
> >Discussion/Background:
> >
> >To us the two main show-stoppers are scipy.xplt and the question
> >about an Enthought Edition for Windows.
> >For the Pool of PCs where the tutorial groups are to be held,
> >it won't be a problem to install numpy/scipy in such
> >a way that scipy.sandbox.xplt is visible as scipy.xplt
> >(at least I hope).
> >However, the students will either have Windows (around 80%)
> >or Linux at home. For Windows users we have used
> >the Enthought Edition (http://code.enthought.com/enthon/)
> >and Linux users were pointed to
> >available packages for their machines or to install Numeric/scipy
> >themselves.
>
> As long as there are binaries for all the packages. Just having a list
> of Windows installers can also work.
> Were you using all of what is in
> the Enthon edition?

Of course not (there is so much stuff ;-), but the students made good use of VPython, and VTK/MayaVi was also well received. So the bare minimum might be python+numpy/scipy/ipython/VPython (+ any Windows-specific stuff?) + maybe VTK and MayaVi.

> >Concerning xplt another option might be to
> >install scipy.sandbox.xplt in such a way
> >that an `import xplt` would work. If that is possible we could
> >try to supply `xplt` separately for some of the distributions,
> >and maybe also for Windows (which I don't use, so I have
> >no idea how difficult that would be).
>
> I don't think that would be hard at all. You can just run python
> setup.py bdist_wininst from within the sandbox/xplt directory and get a
> Windows installer.

OK, some in our group have much better knowledge about Windows, so I will ask them to test this approach.

> >If something like this was possible, the main question is
> >whether a new enthon distribution with new numpy/scipy/ipython
> >and all the other niceties of mayavi/VTK/wxPython/....
> >will come out in the near future?
>
> I have no idea about that one.
> But, it sounds like the guy (Joe) at
> Enthought who did most of the work on the Enthon distribution is no
> longer as available for them, so I'm not sure...

I see - could this somehow be turned into a community effort? Surely it is a non-trivial task - in particular, the monolithic structure of an all-in-one download package seems to make upgrades of individual components difficult. Would something like a super-installer, calling the individual package installers, be possible? (Or does anything like a package-management system for Windows exist?) You see, I don't know anything about Windows, so I had better shut up on this ;-).

Many thanks,

Arnd

From pearu at scipy.org  Thu Feb 16 02:32:03 2006
From: pearu at scipy.org (Pearu Peterson)
Date: Thu Feb 16 02:32:03 2006
Subject: [Numpy-discussion] New release of NumPy coming
In-Reply-To: <200602151405.47640.faltet@carabos.com>
References: <43F26137.5040901@ieee.org> <200602151100.37229.faltet@carabos.com> <200602151405.47640.faltet@carabos.com>
Message-ID: 

On Wed, 15 Feb 2006, Francesc Altet wrote:

> A Dimecres 15 Febrer 2006 11:18, Pearu Peterson va escriure:
>> On Wed, 15 Feb 2006, Francesc Altet wrote:
>>> I've been trying to see how to correctly load the unicode tests, but
>>> failed miserably. Perhaps Pearu can tell us about the correct way to
>>> do that.
>>
>> I have fixed it in svn. When importing modules from the tests/ directory,
>> one must surround the corresponding import statements with set_local_path()
>> and restore_path() calls.
>
> Ah, ok. Is there any place where this is explained, or do we have to use
> the source to figure out these sorts of things?

There is a single note about set_local_path in numpy/doc/DISTUTILS.txt. I agree that it would be nice to have Howtos such as

"Howto write scipy-styled setup.py"
"Howto write scipy-styled unit tests"

etc.

Pearu

From faltet at carabos.com  Thu Feb 16 02:57:07 2006
From: faltet at carabos.com (Francesc Altet)
Date: Thu Feb 16 02:57:07 2006
Subject: [Numpy-discussion] [ANN] PyTables (A Hierarchical Database) 1.2.2 is out
Message-ID: <200602161156.46810.faltet@carabos.com>

===========================
 Announcing PyTables 1.2.2
===========================

This is a maintenance version. Some important improvements and bug fixes have been made in it. Go to the PyTables web site for downloading the beast:

http://pytables.sourceforge.net/

or keep reading for more info about the new features and bugs fixed in this version.

Changes more in depth
=====================

Improvements:

- Multidimensional arrays of strings are now supported as node
  attributes. They just need to be wrapped into ``CharArray`` objects
  (see the ``numarray.strings`` module).

- The limit of 512 KB for row sizes in tables has been removed. Now,
  there is no limit on the row size.

- When table row iterators are used in non-iterator contexts, a warning
  is issued recommending that users use them in iterator contexts.
  Before, when these iterators were used that way, a record read from an
  arbitrary place in memory was printed, giving a nonsensical record as
  a result.

- Compression libraries are now dynamically loaded as separate extension
  modules, so there is no longer any need to produce several binary
  packages supporting different sets of compressors.

Bug fixes:

- Solved a leak that was exposed when reading VLArray data. The problem
  was due to the usage of different heaps (C and Python) of memory.
  Thanks to Russel Howe for reporting this and providing an initial
  patch.
Known issues:

- Time datatypes are non-portable between big-endian and little-endian
  architectures. This is ultimately a consequence of an HDF5 limitation.
  See SF bug #1234709 for more info.

Backward-incompatible changes:

- Please, see the RELEASE-NOTES.txt file.

Important notes for Windows users
=================================

If you want to use PyTables with Python 2.4 on Windows platforms, you will need to get the HDF5 library compiled for MSVC 7.1, aka .NET 2003. It can be found at:

ftp://ftp.ncsa.uiuc.edu/HDF/HDF5/current/bin/windows/5-165-win-net.ZIP

Users of Python 2.3 on Windows will have to download the version of HDF5 compiled with MSVC 6.0 available at:

ftp://ftp.ncsa.uiuc.edu/HDF/HDF5/current/bin/windows/5-165-win.ZIP

Also, note that support for the UCL compressor has not been added to the binary build of PyTables for Windows because of memory problems (perhaps some bad interaction between UCL and something else). Eventually, UCL support might be dropped in the future, so, please, refrain from creating datasets compressed with it.

What it is
==========

**PyTables** is a package for managing hierarchical datasets, designed to efficiently cope with extremely large amounts of data (with support for full 64-bit file addressing). It features an object-oriented interface that, combined with C extensions for the performance-critical parts of the code, makes it a very easy-to-use tool for high-performance data storage and retrieval.

PyTables runs on top of the HDF5 library and the numarray package (Numeric is also supported and NumPy support is coming along) for achieving maximum throughput and convenient use. Besides, PyTables I/O for table objects is buffered, implemented in C and carefully tuned, so that you can reach much better performance with PyTables than with your own home-grown wrappings to the HDF5 library. PyTables sports indexing capabilities as well, allowing selections in tables exceeding one billion rows in just seconds.

Platforms
=========

This version has been extensively checked on quite a few platforms, like Linux on Intel32 (Pentium), Win on Intel32 (Pentium), Linux on Intel64 (Itanium2), FreeBSD on AMD64 (Opteron), Linux on PowerPC and MacOSX on PowerPC. For other platforms, chances are that the code can be easily compiled and run without further issues. Please, contact us in case you are experiencing problems.

Resources
=========

Go to the PyTables web site for more details:

http://pytables.sourceforge.net/

About the HDF5 library:

http://hdf.ncsa.uiuc.edu/HDF5/

About numarray:

http://www.stsci.edu/resources/software_hardware/numarray

To know more about the company behind the PyTables development, see:

http://www.carabos.com/

Acknowledgments
===============

Thanks to the various users who provided feature improvements, patches, bug reports, support and suggestions. See the THANKS file in the distribution package for an (incomplete) list of contributors. Many thanks also to SourceForge, who have helped to make and distribute this package! And last but not least, a big thank you to THG (http://www.hdfgroup.org/) for sponsoring many of the new features recently introduced in PyTables.

Share your experience
=====================

Let us know of any bugs, suggestions, gripes, kudos, etc. you may have.

----

**Enjoy data!**

-- The PyTables Team

--
>0,0<   Francesc Altet     http://www.carabos.com/
V V   Cárabos Coop. V.   Enjoy Data
 "-"
From arnd.baecker at web.de  Thu Feb 16 04:39:01 2006
From: arnd.baecker at web.de (Arnd Baecker)
Date: Thu Feb 16 04:39:01 2006
Subject: [Numpy-discussion] numpy/scipy transition
In-Reply-To: <43F3881E.6060101@noaa.gov>
References: <43F3881E.6060101@noaa.gov>
Message-ID: 

Hi Chris,

On Wed, 15 Feb 2006, Christopher Barker wrote:

> Arnd Baecker wrote:
> > - we need scipy.xplt
> >   (matplotlib is still no option at this point)
>
> Why not? Just curious.

That's a slightly longer story, but since you asked ;-):

First, I should emphasize that I really think that matplotlib is extremely good; the quality of the plots is superb! It is also used a lot in our group for research. However, in our opinion we cannot use matplotlib for our course, for the following two main reasons:

(a) some (for us crucial) bugs
(b) speed

Concerning (a), the most crucial problem is the double-buffering problem. This did not exist with matplotlib 0.82 and has been reported several times, the last one being

http://sourceforge.net/mailarchive/forum.php?thread_id=9559204&forum_id=33405

The presently suggested work-around is to use TkAgg as the backend. However, the TkAgg backend is slower than any other backend. We cannot tell our students to use TkAgg for one problem and switch to WXAgg for another problem - they already struggle enough with learning python (there are several first-time programmers as well) and the hard physics problems we give them ;-). To us this double-buffering problem is show-stopper number one. Unfortunately, I don't understand the internals of matplotlib well enough to help with tracking this one down.

There are a couple of further problems which we reported, but they have fallen through the cracks - no complaint, that's how things are. But it is (from our point of view) not worth talking about them again as long as the double-buffering problem is still there.

On the speed side (b): we have been using scipy.xplt, and even that (though generally considered to be really fast) is not as fast as, for example, pgplot ;-). In addition, many of our students run older machines, starting from PIIIs (I think the PIIs are gone by now, but two years ago quite a few were still used). So this is something to be kept in mind when talking about speed.

We hired a student to do the conversion of our exercises from scipy.xplt to matplotlib and look into some of the speed issues. With John Hunter's help this got pretty far:

http://sourceforge.net/mailarchive/forum.php?thread_id=8153459&forum_id=33405
http://sourceforge.net/mailarchive/forum.php?thread_id=8185639&forum_id=33405
http://sourceforge.net/mailarchive/forum.php?thread_id=8243168&forum_id=33405
http://sourceforge.net/mailarchive/forum.php?thread_id=8346924&forum_id=33405
http://sourceforge.net/mailarchive/forum.php?thread_id=8498518&forum_id=33405
http://sourceforge.net/mailarchive/forum.php?thread_id=8728580&forum_id=33405

I think that there was no further message after this, and the whole approach has not yet been incorporated into MPL. My impression was that it was very close to a good solution, and I would be willing to take up this issue again if there is a chance that it gets integrated into MPL. So, presumably, many of the speed issues could be resolved. The price to be paid is, in some cases, a factor of two more lines of code for the plotting (compared to scipy.xplt). By using a bit more encapsulation, this could surely be overcome.
Ok, I hope I could roughly explain why we think that we cannot yet use matplotlib - it is really almost there, so I remain very optimistic that at least next year we will be using it as the default plotting environment.

Best, Arnd

From oliphant.travis at ieee.org  Thu Feb 16 06:21:15 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Thu Feb 16 06:21:15 2006
Subject: [Numpy-discussion] cross product of two 3xn arrays
In-Reply-To: <43F43F02.1090001@bigpond.net.au>
References: <764834db0602151720r77370d7bx327e2a6dca954dbe@mail.gmail.com> <43F3EE4E.6030301@ieee.org> <43F43F02.1090001@bigpond.net.au>
Message-ID: <43F48A2D.80705@ieee.org>

Gary Ruben wrote:

> I expect to get from this
>
> In [21]: a=array([[[1],[2],[3]],[[4],[5],[6]],[[7],[8],[9]]])
> In [22]: b=array([[[1],[2],[4]],[[4],[5],[7]],[[7],[8],[10]]])
>
> the solution
>
> Out[24]:
> array([[[ 2],
>         [-1],
>         [ 0]],
>
>        [[ 5],
>         [-4],
>         [ 0]],
>
>        [[ 8],
>         [-7],
>         [ 0]]])

Why do you expect to get this solution with axis=0? Remember, axis=0 means the vectors are formed along the 0th dimension. Thus a[:,0,0], a[:,1,0] and a[:,2,0] are the vectors you are using. You appear to be thinking of the vectors along the axis=1 dimension, where the vectors would be a[0,:,0], a[1,:,0], a[2,:,0]. But this is specified with axis=1 (there is a single axis argument available now in SVN, which means axisa=axisb=axisc=axis). Thus,

cross(a,b,axis=1)

gives the solution I think you are after.

-Travis

From travis at enthought.com  Thu Feb 16 11:18:05 2006
From: travis at enthought.com (Travis N. Vaught)
Date: Thu Feb 16 11:18:05 2006
Subject: [Numpy-discussion] ANN: Python Enthought Edition Version 0.9.2 Released
Message-ID: <43F4CFB4.1080305@enthought.com>

Enthought is pleased to announce the release of Python Enthought Edition Version 0.9.2 (http://code.enthought.com/enthon/) -- a python distribution for Windows. This is a kitchen-sink-included Python distribution including the following packages/tools out of the box:

Numeric 24.2
SciPy 0.3.3
IPython 0.6.15
Enthought Tool Suite 1.0.2
wxPython 2.6.1.0
PIL 1.1.4
mingw 20030504-1
f2py 2.45.241_1926
MayaVi 1.5
Scientific Python 2.4.5
VTK 4.4
and many more...

0.9.2 Release Notes Summary
---------------------------

Version 0.9.2 of Python Enthought Edition is the first to include the Enthought Tool Suite package (http://code.enthought.com/ets/). Other changes include upgrading to Numeric 24.2, including MayaVi 1.5 (rather than 1.3) and removing the standalone PyCrust package in favor of the one included with wxPython. Also, elementtree and celementtree have been added to the distribution.

Notably, this release is still based on Python 2.3.5 and still includes SciPy 0.3.3. You'll also notice that we have changed the version numbering to a major.minor.point format (from a build-number format).

See the full release notes at:
http://code.enthought.com/release/changelog-enthon0.9.2.shtml

Best,

Travis N. Vaught

From stefan at sun.ac.za  Thu Feb 16 11:27:05 2006
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Thu Feb 16 11:27:05 2006
Subject: [Numpy-discussion] storage for records
Message-ID: <20060216192556.GA20396@alpha>

Is there any way to control the underlying storage for a record?

I am trying to use Travis' earlier example of an image with named fields:

dt = N.dtype('<f4,<f4,<f4')
img = N.array(N.empty((rows,columns)), dtype=dt)

Using this, I can access the different bands of the image using

img['r'], img['g'], img['b'] (but not img.r as mentioned in some of the posts).

'img' itself is a matrix of similar dimension as img['r'], but contains the combined items of type '<f12'.

However, I would like to store the image as a 3xMxN array, with the r, g and b bands being contained in

img[0], img[1] and img[2]

Is there a way to construct the record so that this structure is used for storage? Further, how do I specify the dtype above, i.e.

N.dtype('<f4,<f4,<f4')

in the style

N.dtype({'names' : ['r','g','b'], 'formats': ['f4','f4','f4']})

(how do I specify that the combined type is 'f12')?

From oliphant.travis at ieee.org  Thu Feb 16 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Thu Feb 16 2006
Subject: [Numpy-discussion] storage for records
In-Reply-To: <20060216192556.GA20396@alpha>
References: <20060216192556.GA20396@alpha>
Message-ID: <43F4E947.9070409@ieee.org>

Stefan van der Walt wrote:

>Is there any way to control the underlying storage for a record?
>I am trying to use Travis' earlier example of an image with named fields:
>
>dt = N.dtype('<f4,<f4,<f4')
>img = N.array(N.empty((rows,columns)), dtype=dt)
>
>Using this, I can access the different bands of the image using
>
>img['r'], img['g'], img['b'] (but not img.r as mentioned in some of
>the posts).

Attribute lookup (img.r) is the purpose of the record array subclass.

rimg = img.view(numpy.recarray)

rimg.r --- will now work.

>'img' itself is a matrix of similar dimension as img['r'], but
>contains the combined items of type '<f12'.

Beware that the '<f12' combined type is a single 12-byte scalar, not three separate floats.

>However, I would like to store the image as a 3xMxN array, with the r,
>g and b bands being contained in
>
>img[0], img[1] and img[2]

You don't need a record array to do that. Just define your array as a 3xMxN array of floats. But, you could just re-define the data-type as

img2 = img.view(('f4',3)) -- if img is MxN then img2 is MxNx3.

or use

rimg.field(0) --- field was recently added to record-arrays.

>Is there a way to construct the record so that this structure is used
>for storage? Further, how do I specify the dtype above, i.e.
>
>N.dtype('<f4,<f4,<f4')
>
>in the style
>
>N.dtype({'names' : ['r','g','b'], 'formats': ['f4','f4','f4']})
>
>(how do I specify that the combined type is 'f12')?

Use a tuple:

N.dtype(('<f12', {'names' : ['r','g','b'], 'formats': ['f4','f4','f4']}))

-Travis

From ndarray at mac.com  Thu Feb 16 2006
From: ndarray at mac.com (Sasha)
Date: Thu Feb 16 2006
Subject: [Numpy-discussion] Reference counting question

I was looking at the code implementing array_new in arrayobject.c, and for a while I could not convince myself that it handles ref. counts correctly. The cleanup code (at the "fail:" label) contains Py_XDECREF(descr), meaning that descr is unreferenced on failure unless it is NULL. This makes sense because descr is created inside array_new by PyArray_DescrConverter, but if the failure is detected in PyArg_ParseTupleAndKeywords, descr may be NULL. What was puzzling to me, failures of PyArray_NewFromDescr are handled by "if (ret == NULL) {descr=NULL; goto fail;}", which sets descr to NULL before jumping to cleanup. As I investigated further, I discovered the following helpful comment preceding PyArray_NewFromDescr: /* steals a reference to descr (even on failure) */ which explains why descr=NULL is necessary.

I wonder what was the motivation for this design choice. I don't think this is natural behavior for Python C-API functions. I am not proposing to make any changes, just curious about the design.

From oliphant.travis at ieee.org  Thu Feb 16 13:51:05 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Thu Feb 16 13:51:05 2006
Subject: [Numpy-discussion] Reference counting question
In-Reply-To: 
References: 
Message-ID: <43F4F3A1.7000400@ieee.org>

Sasha wrote:

>I was looking at the code implementing array_new in arrayobject.c, and
>for a while I could not convince myself that it handles ref. counts
>correctly. The cleanup code (at the "fail:" label) contains
>Py_XDECREF(descr), meaning that descr is unreferenced on failure
>unless it is NULL. This makes sense because descr is created inside
>array_new by PyArray_DescrConverter, but if the failure is detected
>in PyArg_ParseTupleAndKeywords, descr may be NULL. What was
>puzzling to me, failures of PyArray_NewFromDescr are handled by "if
>(ret == NULL) {descr=NULL; goto fail;}", which sets descr to NULL
>before jumping to cleanup. As I investigated further, I discovered the
>following helpful comment preceding PyArray_NewFromDescr: /* steals a
>reference to descr (even on failure) */ which explains why descr=NULL
>is necessary.
>
>I wonder what was the motivation for this design choice. I don't
>think this is natural behavior for Python C-API functions. I am not
>proposing to make any changes, just curious about the design.
I am not >proposing to make any changes, just curious about the design. > > The PyArray_Descr structure never used to be a Python object. Now it is. There is the C-API PyArray_DescrFromType that used to just return a C-structure but now it returns a reference-counted Python object. People are not used to reference counting the PyArray_Descr objects. The easiest way to make this work in my mind was to have the functions that use the Descr object steal a reference because ultimately the Descr objects purpose is to reside in an array. It is created for the purpose of being a member of an array structure which therefore steals it's reference. As an example, with this design you can write (and there are macros that do). PyArray_NewFromDescr(...., PyArray_DescrFromType(type_num), ....) and not create reference-count leaks. -Travis From stefan at sun.ac.za Thu Feb 16 13:55:05 2006 From: stefan at sun.ac.za (Stefan van der Walt) Date: Thu Feb 16 13:55:05 2006 Subject: [Numpy-discussion] storage for records In-Reply-To: <43F4E947.9070409@ieee.org> References: <20060216192556.GA20396@alpha> <43F4E947.9070409@ieee.org> Message-ID: <20060216215403.GH20396@alpha> On Thu, Feb 16, 2006 at 02:06:15PM -0700, Travis Oliphant wrote: > Stefan van der Walt wrote: > > >Is there any way to control the underlying storage for a record? > > > >I am trying to use Travis' earlier example of an image with named fields: > > > >dt = N.dtype(' >img = N.array(N.empty((rows,columns)), dtype=dt) > > > >Using this, I can access the different bands of the image using > > > >img['r'], img['g'], img['b'] (but not img.r as mentioned in some of > >the posts). > > > > > Attribute lookup (img.r) is the purpose of the record array subclass. > > rimg= img.view(numpy.recarray) > > rimg.r --- will now work. Thanks for the quick response! This is very useful information. Regards St?fan From gruben at bigpond.net.au Thu Feb 16 15:01:04 2006 From: gruben at bigpond.net.au (Gary Ruben) Date: Thu Feb 16 15:01:04 2006 Subject: [Numpy-discussion] cross product of two 3xn arrays In-Reply-To: <43F48A2D.80705@ieee.org> References: <764834db0602151720r77370d7bx327e2a6dca954dbe@mail.gmail.com> <43F3EE4E.6030301@ieee.org> <43F43F02.1090001@bigpond.net.au> <43F48A2D.80705@ieee.org> Message-ID: <43F50402.2000009@bigpond.net.au> Thanks Travis, I think this is what Ian was asking for (axis=1 rather than axis=0). I was confused by your previous reply in this thread which I blindly followed without thinking about it. >(there is a single axis argument available now in SVN which means > axisa=axisb=axisc=axis) Nice addition. Gary R. From oliphant.travis at ieee.org Thu Feb 16 16:11:08 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Feb 16 16:11:08 2006 Subject: [Numpy-discussion] Release of NumPy 0.9.5 Message-ID: <43F5147A.3020202@ieee.org> I'm pleased to announce the release of NumPy 0.9.5 The release notes and download site can be found at http://www.scipy.org Best regards, -Travis Oliphant From bblais at bryant.edu Thu Feb 16 18:41:01 2006 From: bblais at bryant.edu (Brian Blais) Date: Thu Feb 16 18:41:01 2006 Subject: [Numpy-discussion] calculating matrix values at particular indices? Message-ID: <43F53719.50907@bryant.edu> Hello, In my attempt to learn python, migrating from matlab, I have the following problem. 
Here is what I want to do (with the wrong syntax):

from numpy import *

t=arange(0,20,.1)
x=zeros(len(t),'f')

idx=where(t>5)
tau=5
x[idx]=exp(-t[idx]/tau)   # <---this line is wrong (gives a TypeError)

#------------------

what is the best way to replace the wrong line with something that
works: replace all of the values of x at the indices idx with
exp(-t/tau) for values of t at indices idx?

I do this all the time in matlab scripts, but I don't know what the
pythonic preferred method is.

thanks,

bb

--
-----------------
bblais at bryant.edu
http://web.bryant.edu/~bblais

From bblais at bryant.edu Thu Feb 16 18:47:03 2006
From: bblais at bryant.edu (Brian Blais)
Date: Thu Feb 16 18:47:03 2006
Subject: [Numpy-discussion] Re: calculating on matrix indices
In-Reply-To: 
References: 
Message-ID: <43F5388A.7010905@bryant.edu>

Colin J. Williams wrote:
> Brian Blais wrote:
>> In my attempt to learn python, migrating from matlab, I have the
>> following problem. Here is what I want to do, (with the wrong syntax):
>>
>> from numpy import *
>>
>> t=arange(0,20,.1)
>> x=zeros(len(t),'f')
>>
>> idx=(t>5)                # <---this produces a Boolean array, probably not what you want.
>> tau=5
>> x[idx]=exp(-t[idx]/tau)  # <---this line is wrong (gives a TypeError)
>>
> What are you trying to do? It is most unlikely that you need Boolean
> values in x[idx]
>

in this example, as in many that I would do in matlab, I want to
replace part of a vector with values from another vector. In this case,
I want x to be zero from t=0 to 5, and then have a value of exp(-t/tau)
for t>5. I could do it with an explicit for-loop, but that would be
both inefficient and unpython-like. For those who know matlab, what I
am doing here is:

t=0:.1:20;
idx=find(t>5);
tau=5;
x=zeros(size(t));
x(idx)=exp(-t(idx)/tau)

is that clearer? I am sure there is a nice method to do this in python,
but I haven't found it in the python or numpy docs.

thanks,

bb

--
-----------------
bblais at bryant.edu
http://web.bryant.edu/~bblais

From oliphant.travis at ieee.org Thu Feb 16 19:02:02 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Thu Feb 16 19:02:02 2006
Subject: [Numpy-discussion] Re: calculating on matrix indices
In-Reply-To: <43F5388A.7010905@bryant.edu>
References: <43F5388A.7010905@bryant.edu>
Message-ID: <43F53C87.6040903@ieee.org>

Brian Blais wrote:
> Colin J. Williams wrote:
>> Brian Blais wrote:
>>> In my attempt to learn python, migrating from matlab, I have the
>>> following problem. Here is what I want to do, (with the wrong syntax):
>>>
>>> from numpy import *
>>>
>>> t=arange(0,20,.1)
>>> x=zeros(len(t),'f')
>>>
>>> idx=(t>5)                # <---this produces a Boolean array,
>>> probably not what you want.
>>> tau=5
>>> x[idx]=exp(-t[idx]/tau)  # <---this line is wrong (gives a TypeError)
>>>
>> What are you trying to do? It is most unlikely that you need Boolean
>> values in x[idx]
>>
> in this example, as in many that I would do in matlab, I want to
> replace part of a vector with values from another vector. In this
> case, I want x to be zero from t=0 to 5, and then have a value of
> exp(-t/tau) for t>5. I could do it with an explicit for-loop, but
> that would be both inefficient and unpython-like. For those who know
> matlab, what I am doing here is:
>

from numpy import *

t = r_[0:20:0.1]
idx = t>5
tau = 5
x = zeros_like(t)
x[idx] = exp(-t[idx]/tau)

Should do it.
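A runnable recap of this fix, for reference. Nothing here goes beyond
what the thread establishes: the original failure came from x being
float32 ('f') while t was float64, as worked out below; zeros_like()
gives x the same dtype as t, so the masked assignment needs no unsafe
cast:

    from numpy import arange, zeros_like, exp

    tau = 5
    t = arange(0, 20, 0.1)       # float64 by default
    x = zeros_like(t)            # same dtype as t, so assignment is safe
    idx = t > 5                  # boolean mask
    x[idx] = exp(-t[idx] / tau)  # fill only the selected entries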
-Travis From ryanlists at gmail.com Thu Feb 16 19:16:06 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Thu Feb 16 19:16:06 2006 Subject: [Numpy-discussion] Re: calculating on matrix indices In-Reply-To: <43F53C87.6040903@ieee.org> References: <43F5388A.7010905@bryant.edu> <43F53C87.6040903@ieee.org> Message-ID: Brian's example works fine for me as long as x=zeros(len(t),'d') for some reason the error seems to come from assignment of a double array to a single precision array: In [318]: x=zeros(len(t),'f') In [319]: t=arange(0,20,.1) In [320]: idx=where(t>5) In [321]: x[idx]=exp(-t[idx]/tau) --------------------------------------------------------------------------- exceptions.TypeError Traceback (most recent call last) /home/ryan/thesis/accuracy/ TypeError: array cannot be safely cast to required type In [322]: x=zeros(len(t),'d') In [323]: x[idx]=exp(-t[idx]/tau) In [324]: Ryan On 2/16/06, Travis Oliphant wrote: > Brian Blais wrote: > > > Colin J. Williams wrote: > > > >> Brian Blais wrote: > >> > >>> In my attempt to learn python, migrating from matlab, I have the > >>> following problem. Here is what I want to do, (with the wrong syntax): > >>> > >>> from numpy import * > >>> > >>> t=arange(0,20,.1) > >>> x=zeros(len(t),'f') > >>> > >>> idx=(t>5) # <---this produces a Boolean array, > >>> probably not what you want. > >>> tau=5 > >>> x[idx]=exp(-t[idx]/tau) # <---this line is wrong (gives a TypeError) > >>> > >> What are you trying to do? It is most unlikely that you need Boolean > >> values in x[idx] > >> > > > > in this example, as in many that I would do in matlab, I want to > > replace part of a vector with values from another vector. In this > > case, I want x to be zero from t=0 to 5, and then have a value of > > exp(-t/tau) for t>5. I could do it with an explicit for-loop, but > > that would be both inefficient and unpython-like. For those who know > > matlab, what I am doing here is: > > > from numpy import * > > > t = r_[0:20:0.1] > idx = t>5 > tau = 5 > x = zeros_like(t) > x[idx] = exp(-t[idx]/tau) > > > Should do it. > > -Travis > > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through log files > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From ndarray at mac.com Thu Feb 16 19:16:07 2006 From: ndarray at mac.com (Sasha) Date: Thu Feb 16 19:16:07 2006 Subject: [Numpy-discussion] calculating matrix values at particular indices? In-Reply-To: <43F53719.50907@bryant.edu> References: <43F53719.50907@bryant.edu> Message-ID: You should make t and x the same type: either add dtype='f' to arange or change dtype='f' to dtype='d' in zeros. On 2/16/06, Brian Blais wrote: > Hello, > > In my attempt to learn python, migrating from matlab, I have the following problem. 
> Here is what I want to do, (with the wrong syntax): > > from numpy import * > > t=arange(0,20,.1) > x=zeros(len(t),'f') > > idx=where(t>5) > tau=5 > x[idx]=exp(-t[idx]/tau) # <---this line is wrong (gives a TypeError) > > #------------------ > > what is the best way to replace the wrong line with something that works: replace all > of the values of x at the indices idx with exp(-t/tau) for values of t at indices idx? > > I do this all the time in matlab scripts, but I don't know that the pythonic > preferred method is. > > > > thanks, > > bb > > > -- > ----------------- > > bblais at bryant.edu > http://web.bryant.edu/~bblais > > > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through log files > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From harrison.ian at gmail.com Thu Feb 16 19:21:03 2006 From: harrison.ian at gmail.com (Ian Harrison) Date: Thu Feb 16 19:21:03 2006 Subject: [Numpy-discussion] cross product of two 3xn arrays In-Reply-To: <43F3EE4E.6030301@ieee.org> References: <764834db0602151720r77370d7bx327e2a6dca954dbe@mail.gmail.com> <43F3EE4E.6030301@ieee.org> Message-ID: <764834db0602161920r59568392m611c7abd9a895900@mail.gmail.com> On 2/15/06, Travis Oliphant wrote: > Ian Harrison wrote: > > >Hello, > > > >I have two groups of 3x1 arrays that are arranged into two larger 3xn > >arrays. Each of the 3x1 sub-arrays represents a vector in 3D space. In > >Matlab, I'd use the function cross() to calculate the cross product of > >the corresponding 'vectors' from each array. In other words: > > > > > > > Help on function cross in module numpy.core.numeric: > > cross(a, b, axisa=-1, axisb=-1, axisc=-1) > Return the cross product of two (arrays of) vectors. > > The cross product is performed over the last axis of a and b by default, > and can handle axes with dimensions 2 and 3. For a dimension of 2, > the z-component of the equivalent three-dimensional cross product is > returned. > > It's the axisa, axisb, and axisc that you are interested in. > > The default is to assume you have Nx3 arrays and return an Nx3 array. > But you can change the axis used to find vectors. > > cross(A,B,axisa=0,axisb=0,axisc=0) > > will do what you want. I suppose, a single axis= argument might be > useful as well for the common situation of having all the other axis > arguments be the same. > > -Travis Travis, Thanks for your patience. This is what I was looking for. Ian From wbaxter at gmail.com Thu Feb 16 19:36:02 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Thu Feb 16 19:36:02 2006 Subject: [Numpy-discussion] Re: calculating on matrix indices In-Reply-To: <43F5388A.7010905@bryant.edu> References: <43F5388A.7010905@bryant.edu> Message-ID: Howdy, On 2/17/06, Brian Blais wrote: > > Colin J. Williams wrote: > > Brian Blais wrote: > >> In my attempt to learn python, migrating from matlab, I have the > >> following problem. Here is what I want to do, (with the wrong syntax): > >> > >> from numpy import * > >> > >> t=arange(0,20,.1) > >> x=zeros(len(t),'f') This was the line causing the type error. t is type double (float64). 'f' makes x be type float32. 
That causes the assignment below to fail.
Replacing that line with x=zeros(len(t),'d') should work. Or the
zeros_like() that Travis suggested.

>> idx=(t>5)                # <---this produces a Boolean array, probably
> not what you want.
>> tau=5
>> x[idx]=exp(-t[idx]/tau)  # <---this line is wrong (gives a TypeError)
>>

You could also use

    idx=where(t>5)

in place of

    idx=(t>5)

Although in this case it probably doesn't make much difference,
where(expr) is more directly equivalent to matlab's find(expr).

See http://www.scipy.org/Wiki/NumPy_for_Matlab_Users for more Matlab
equivalents. And consider contributing your own, if you have some good
ones that aren't there already.

--bb

From robert.kern at gmail.com Thu Feb 16 21:33:02 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Thu Feb 16 21:33:02 2006
Subject: [Numpy-discussion] OT: Apologies for my (relative) silence
Message-ID: <43F56002.6050408@gmail.com>

I just realized today that for some reason I haven't actually been
subscribed to this list since the end of September. Apparently, I've
only been getting mails addressed to numpy-discussion if they were
CC'ed to one of the scipy lists or to me personally. This was just
enough traffic to fool me into thinking I was still subscribed. I
wondered why you guys were so quiet.

To the people whom I redirected here from comp.lang.python and then
(seemingly) ignored, I'm sorry!

--
Robert Kern
robert.kern at gmail.com

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter

From charlesr.harris at gmail.com Thu Feb 16 21:41:05 2006
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu Feb 16 21:41:05 2006
Subject: [Numpy-discussion] cross product of two 3xn arrays
In-Reply-To: <43F48A2D.80705@ieee.org>
References: <764834db0602151720r77370d7bx327e2a6dca954dbe@mail.gmail.com> <43F3EE4E.6030301@ieee.org> <43F43F02.1090001@bigpond.net.au> <43F48A2D.80705@ieee.org>
Message-ID: 

Would anyone be interested in a quaternion version of this for nx4
arrays, with nx3 as a special case where the scalar part == 0? Looking
at the cross product implementation, it shouldn't be too hard to
duplicate this for quaternions. What should such a product be called?
Something like qprod?

Chuck

From charlesr.harris at gmail.com Thu Feb 16 21:45:04 2006
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu Feb 16 21:45:04 2006
Subject: [Numpy-discussion] Creating arrays with fromfile
Message-ID: 

Hi Travis,

I notice that the fromfile function in NumPy no longer accepts the
shape keyword that the numarray version has. The functionality can be
duplicated by reshaping the array after creating it, but I think the
shape keyword is a bit more convenient for that. Thoughts?

Chuck

From wbaxter at gmail.com Thu Feb 16 21:59:02 2006
From: wbaxter at gmail.com (Bill Baxter)
Date: Thu Feb 16 21:59:02 2006
Subject: [Numpy-discussion] cross product of two 3xn arrays
In-Reply-To: 
References: <764834db0602151720r77370d7bx327e2a6dca954dbe@mail.gmail.com> <43F3EE4E.6030301@ieee.org> <43F43F02.1090001@bigpond.net.au> <43F48A2D.80705@ieee.org>
Message-ID: 

Quaternions using which convention? [s,x,y,z] or [x,y,z,w]?
The docstring should make it very clear. Perhaps support a flag for
choosing which, unless there's some python-wide standard for
quaternions that I'm not aware of.
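For concreteness, a minimal sketch of what the qprod proposal above
could look like. qprod is only the name floated in this thread, not an
existing numpy function, and the scalar-last [x,y,z,w] convention is
assumed here purely for illustration:

    import numpy as N

    def qprod(p, q):
        # Hamilton product of two (n,4) quaternion arrays stored
        # scalar-last as [x, y, z, w].
        px, py, pz, pw = p[:,0], p[:,1], p[:,2], p[:,3]
        qx, qy, qz, qw = q[:,0], q[:,1], q[:,2], q[:,3]
        out = N.empty(p.shape, p.dtype)
        out[:,0] = pw*qx + px*qw + py*qz - pz*qy
        out[:,1] = pw*qy - px*qz + py*qw + pz*qx
        out[:,2] = pw*qz + px*qy - py*qx + pz*qw
        out[:,3] = pw*qw - px*qx - py*qy - pz*qz
        return out

A pure-vector nx3 input is then the special case [x,y,z,0] that Chuck
describes.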
--Bill On 2/17/06, Charles R Harris wrote: > > Would anyone be interested in a quaternion version of this for nx4 > arrays with nx3 as a special case where the scalar part == 0? Looking > at the the cross product implementation, it shouldn't be to hard to > duplicate this for quaternions. What should such a product be called? > Something like qprod? > > Chuck > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Feb 16 22:21:15 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu Feb 16 22:21:15 2006 Subject: [Numpy-discussion] cross product of two 3xn arrays In-Reply-To: References: <764834db0602151720r77370d7bx327e2a6dca954dbe@mail.gmail.com> <43F3EE4E.6030301@ieee.org> <43F43F02.1090001@bigpond.net.au> <43F48A2D.80705@ieee.org> Message-ID: Bill, On 2/16/06, Bill Baxter wrote: > Quaternions using which convention? [s,x,y,z] or [x,y,z,w]? > The docstring should make it very clear. Perhaps support a flag for > choosing which, unless there's some python-wide standard for quaternions > that I'm not aware of. > > --Bill > > > On 2/17/06, Charles R Harris wrote: > > Would anyone be interested in a quaternion version of this for nx4 > > arrays with nx3 as a special case where the scalar part == 0? Looking > > at the the cross product implementation, it shouldn't be to hard to > > duplicate this for quaternions. What should such a product be called? > > Something like qprod? > > > > Chuck > > > > I like to put the scalar last, but I am open to putting it first if anyone has strong feelings about it. As far as I know, there is no scipy convention on this. Hmm, maybe a flag would be useful just because folks are likely to have files sitting around full of quaternions using both conventions. Maybe one more scalar type to add to the NumPy mix? I must admit that dtype=Quaternion512 seems a bit much. Anyway, I am open to suggestions. From oliphant.travis at ieee.org Thu Feb 16 22:51:02 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Feb 16 22:51:02 2006 Subject: [Numpy-discussion] Creating arrays with fromfile In-Reply-To: References: Message-ID: <43F5721D.3030801@ieee.org> Charles R Harris wrote: >Hi Travis, > >I notice that the fromfile function in NumPy no longer accepts the >shape keyword that the numarray version has. The functionalitiy can be >duplicated by reshaping the array after creating it, but I think the >shape keyword is a bit more convenient for that. Thoughts? > > > It was just that much more effort to implement correctly in C and since it can be easily done using fromfile(....).reshape(dim1,dim2,dim3,...) I didn't think it critical. Perhaps numarray compatibility functions should be placed in a numcompat module. -Travis From cookedm at physics.mcmaster.ca Thu Feb 16 23:31:03 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Thu Feb 16 23:31:03 2006 Subject: [Numpy-discussion] optimizing power() for complex and real cases In-Reply-To: <43F36CB6.5050004@cox.net> (Tim Hochberg's message of "Wed, 15 Feb 2006 11:02:30 -0700") References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> <43F13B20.3000301@cox.net> <43F13DE0.8040309@bigpond.net.au> <43F167C0.7040806@cox.net> <43F20BFE.5030100@cox.net> <43F36CB6.5050004@cox.net> Message-ID: Tim Hochberg writes: > David M. Cooke wrote: > >>[1] which brings up another point. Would using the 3-multiplication >>version for complex multiplication be good? 
There might be some
>>effects with cancellation errors due to the extra subtractions...
>>
>>
> I'm inclined to leave this be for now. Both because I'm unsure of the
> rounding issues and because I'm not sure it would actually be faster.
> It has one less multiplication, but several more additions, so it
> would depend on the relative speed add/sub with multiplication and how
> things end up getting scheduled in the FP pipeline. At some point it's
> probably worth trying; if it turns out to be significantly faster we
> can think about rounding then. If it's not faster then no need to
> think.

I did some thinking, and looked up how to analyse it. 3M goes like this:

xy = (a+bi)(c+di) = (ac - bd) + ((a+b)(c+d) - ac - bd)i

Consider x = y = t + i/t, for which x**2 = (t**2 - 1/t**2) + 2i. Then

xy = x**2 = (t*t - (1/t)*(1/t)) + ((t + 1/t)*(t + 1/t) - t**2 - 1/t**2)i

Consider when t is large enough that (t + 1/t)**2 = t**2 in floating
point; then Im fl(xy) will be -1/t**2, instead of 2.

So...let's leave it as is.

--
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/
|cookedm at physics.mcmaster.ca

From cookedm at physics.mcmaster.ca Thu Feb 16 23:39:02 2006
From: cookedm at physics.mcmaster.ca (David M. Cooke)
Date: Thu Feb 16 23:39:02 2006
Subject: [Numpy-discussion] optimizing power() for complex and real cases
In-Reply-To: <43F3CA12.4000907@cox.net> (Tim Hochberg's message of "Wed, 15 Feb 2006 17:40:50 -0700")
References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> <43F13B20.3000301@cox.net> <43F13DE0.8040309@bigpond.net.au> <43F167C0.7040806@cox.net> <43F20BFE.5030100@cox.net> <43F36CB6.5050004@cox.net> <43F3B8A9.3000507@bigpond.net.au> <43F3CA12.4000907@cox.net>
Message-ID: 

Tim Hochberg writes:
> Gary Ruben wrote:
>
>> So, optimize:
>>
>> A**-1, A**0, A**1, A**2, etc.
>>
>> and
>>
>> A**array([1,2,3])
>>
>> but not
>>
>> A**array([1.0, 2.0, 3.0]), A**2.0, A**(2.0+0j)
>
> That makes sense. It's safer and easier to explain: "numpy optimizes
> raising matrices (and possibly scalars) to integer powers". The only
> sticking point that I see is if David is still interested in
> optimizing A**0.5, that's not going to mesh with this. On the other
> hand, perhaps he can be persuaded that sqrt(A) is just as good. After
> all, it's only one more character long ;)

sigh, ok :-)

--
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/
|cookedm at physics.mcmaster.ca

From wbaxter at gmail.com Thu Feb 16 23:42:03 2006
From: wbaxter at gmail.com (Bill Baxter)
Date: Thu Feb 16 23:42:03 2006
Subject: [Numpy-discussion] cross product of two 3xn arrays
In-Reply-To: 
References: <764834db0602151720r77370d7bx327e2a6dca954dbe@mail.gmail.com> <43F3EE4E.6030301@ieee.org> <43F43F02.1090001@bigpond.net.au> <43F48A2D.80705@ieee.org>
Message-ID: 

For folks using quats to represent rotations (which is all I use them
for, anyway), if you're batch transforming a bunch of vectors by one
quaternion, it's a lot more efficient to convert the quat to a 3x3
matrix first and transform using matrix multiply (9 mults per transform
that way vs 21 or so depending on the implementation of q*v*q^-1).
Given that, I can't see many situations when I'd need a super speedy C
version of quaternion multiply.

--Bill
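The cancellation in the 3M analysis above can be checked with a few
plain Python floats (a small numerical sketch, nothing more):

    # x = y = t + i/t; the exact imaginary part of x*y is 2.
    t = 1e9
    a, c = t, t              # real parts
    b, d = 1/t, 1/t          # imaginary parts
    im_4m = a*d + b*c                 # usual 4-mult form -> 2.0
    im_3m = (a+b)*(c+d) - a*c - b*d   # 3M form -> -1e-18; the 2 is lost
    print im_4m, im_3m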
From wbaxter at gmail.com Fri Feb 17 00:16:01 2006
From: wbaxter at gmail.com (Bill Baxter)
Date: Fri Feb 17 00:16:01 2006
Subject: [Numpy-discussion] Numpy 0.9.5 + pylab 0.86.2 = python.exe crash
Message-ID: 

After upgrading to numpy 0.9.5 (numpy-0.9.5.win32-py2.4.exe) running the
following script causes a crash of python.exe:

-------------test.py-----------------
import pylab
----------------------------------------

In my .matplotlibrc file I have the following:
-------------
backend : WXAgg
numerix : numpy
-------------

Reinstalling numpy 0.9.4 fixed the problem.

Matplotlib version is 0.86.2-win32-py2.4
I also tried reinstalling matplotlib, but that didn't help.

--Bill Baxter

From cookedm at physics.mcmaster.ca Fri Feb 17 00:18:19 2006
From: cookedm at physics.mcmaster.ca (David M. Cooke)
Date: Fri Feb 17 00:18:19 2006
Subject: [Numpy-discussion] Numpy 0.9.5 + pylab 0.86.2 = python.exe crash
In-Reply-To: (Bill Baxter's message of "Fri, 17 Feb 2006 17:15:45 +0900")
References: 
Message-ID: 

Bill Baxter writes:

> After upgrading to numpy 0.9.5 (numpy-0.9.5.win32-py2.4.exe) running the
> following script causes a crash of python.exe:
>
> -------------test.py-----------------
> import pylab
> ----------------------------------------
>
> In my .matplotlibrc file I have the following:
> -------------
> backend : WXAgg
> numerix : numpy
> -------------
>
> Reinstalling numpy 0.9.4 fixed the problem.
>
> Matplotlib version is 0.86.2-win32-py2.4
> I also tried reinstalling matplotlib, but that didn't help.

You'll have to recompile matplotlib against the newer numpy.

--
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/
|cookedm at physics.mcmaster.ca

From wbaxter at gmail.com Fri Feb 17 00:31:10 2006
From: wbaxter at gmail.com (Bill Baxter)
Date: Fri Feb 17 00:31:10 2006
Subject: [Numpy-discussion] Numpy 0.9.5 + pylab 0.86.2 = python.exe crash
In-Reply-To: 
References: 
Message-ID: 

Ew. Ok. No thanks. :-)
I'll just stick with numpy 0.9.4 for now.
I appreciate the speedy response.
--bb

On 2/17/06, David M. Cooke wrote:
> Bill Baxter writes:
>
> > After upgrading to numpy 0.9.5 (numpy-0.9.5.win32-py2.4.exe) running the
> > following script causes a crash of python.exe:
> >
> > -------------test.py-----------------
> > import pylab
> > ----------------------------------------
> >
> > In my .matplotlibrc file I have the following:
> > -------------
> > backend : WXAgg
> > numerix : numpy
> > -------------
> >
> > Reinstalling numpy 0.9.4 fixed the problem.
> >
> > Matplotlib version is 0.86.2-win32-py2.4
> > I also tried reinstalling matplotlib, but that didn't help.
>
> You'll have to recompile matplotlib against the newer numpy.
>
> --
> |>|\/|<
> /--------------------------------------------------------------------------\
> |David M. Cooke
> http://arbutus.physics.mcmaster.ca/dmc/
> |cookedm at physics.mcmaster.ca
>

--
William V. Baxter III
OLM Digital
Kono Dens Building Rm 302
1-8-8 Wakabayashi Setagaya-ku
Tokyo, Japan 154-0023
+81 (3) 3422-3380
From oliphant.travis at ieee.org Fri Feb 17 01:35:05 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Fri Feb 17 01:35:05 2006
Subject: [Numpy-discussion] Re: [matplotlib-devel] Numpy 0.9.5 + pylab 0.86.2 = python.exe crash
In-Reply-To: 
References: 
Message-ID: <43F5988A.70001@ieee.org>

Bill Baxter wrote:
> After upgrading to numpy 0.9.5 (numpy-0.9.5.win32-py2.4.exe) running
> the following script causes a crash of python.exe:
>
> -------------test.py-----------------
> import pylab
> ----------------------------------------
>
> In my .matplotlibrc file I have the following:
> -------------
> backend : WXAgg
> numerix : numpy
> -------------
>
> Reinstalling numpy 0.9.4 fixed the problem.
>
> Matplotlib version is 0.86.2-win32-py2.4
> I also tried reinstalling matplotlib, but that didn't help.
>
You have to re-compile the matplotlib extension. There are warnings
present now so that hopefully in the future such needs will be
communicated better.

-Travis

From josegomez at gmx.net Fri Feb 17 01:52:04 2006
From: josegomez at gmx.net (Jose Gomez-Dans)
Date: Fri Feb 17 01:52:04 2006
Subject: [Numpy-discussion] Problems compiling on Cygwin
Message-ID: 

Hi!
Yesterday I posted on the scipy mailing list that I could not compile
NumPy on Cygwin. I would like to provide some more information on what
the problems are, as I would really like to be able to use it on
Cygwin.

I got the 0.9.5 tarball, uncompressed it, and typed python setup.py
build. The process starts, and there is an indication that it finds
BLAS and LAPACK (cygwin versions). It stops when linking umath.dll,
complaining about missing references. Here's an extract (the linker
messages were originally in Spanish; yes, I have the Spanish locale
set :D):

"gcc options: '-fno-strict-aliasing -DNDEBUG -g -O3 -Wall -Wstrict-prototypes'
compile options: '-Ibuild/src/numpy/core/src -Inumpy/core/include -Ibuild/src/numpy/core -Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.4 -c'
gcc -shared -Wl,--enable-auto-image-base build/temp.cygwin-1.5.19-i686-2.4/build/src/numpy/core/src/umathmodule.o -L/usr/lib/python2.4/config -lpython2.4 -o build/lib.cygwin-1.5.19-i686-2.4/numpy/core/umath.dll
build/temp.cygwin-1.5.19-i686-2.4/build/src/numpy/core/src/umathmodule.o: umathmodule.c:(.text+0x2f5e): undefined reference to `_feraiseexcept'
build/temp.cygwin-1.5.19-i686-2.4/build/src/numpy/core/src/umathmodule.o: umathmodule.c:(.text+0x2fe3): undefined reference to `_feraiseexcept'
build/temp.cygwin-1.5.19-i686-2.4/build/src/numpy/core/src/umathmodule.o: umathmodule.c:(.text+0x3081): undefined reference to `_feraiseexcept'
build/temp.cygwin-1.5.19-i686-2.4/build/src/numpy/core/src/umathmodule.o: umathmodule.c:(.text+0x3139): undefined reference to `_feraiseexcept'
build/temp.cygwin-1.5.19-i686-2.4/build/src/numpy/core/src/umathmodule.o: umathmodule.c:(.text+0x320c): undefined reference to `_feraiseexcept'
build/temp.cygwin-1.5.19-i686-2.4/build/src/numpy/core/src/umathmodule.o: umathmodule.c:(.text+0x129ef): undefined reference to `_fetestexcept'
build/temp.cygwin-1.5.19-i686-2.4/build/src/numpy/core/src/umathmodule.o: umathmodule.c:(.text+0x12a1d): undefined reference to `_feclearexcept'
build/temp.cygwin-1.5.19-i686-2.4/build/src/numpy/core/src/umathmodule.o: umathmodule.c:(.text+0x12b1b): undefined reference to `_fetestexcept'
build/temp.cygwin-1.5.19-i686-2.4/build/src/numpy/core/src/umathmodule.o: umathmodule.c:(.text+0x12b27): undefined reference to `_feclearexcept'
collect2: ld returned 1 exit status
error: Command "gcc -shared -Wl,--enable-auto-image-base build/temp.cygwin-1.5.19-i686-2.4/build/src/numpy/core/src/umathmodule.o -L/usr/lib/python2.4/config -lpython2.4 -o build/lib.cygwin-1.5.19-i686-2.4/numpy/core/umath.dll" failed with exit status 1"

The functions needed are responsible for setting floating-point
exceptions, and presumably only need a simple addition to the library
linking path. Is this correct? Does anyone know how to deal with this?

Many thanks!
Jose

From oliphant.travis at ieee.org Fri Feb 17 02:20:01 2006
From: oliphant.travis at ieee.org (Travis E. Oliphant)
Date: Fri Feb 17 02:20:01 2006
Subject: [Numpy-discussion] Re: ANN: Release of NumPy 0.9.5
In-Reply-To: <45ljqdF7706gU1@individual.net>
References: <45ljqdF7706gU1@individual.net>
Message-ID: <43F5A317.8010008@ieee.org>

Thomas Gellekum wrote:
> "Travis E. Oliphant" writes:
>
>> - Improvements to numpy.distutils
>
> Stupid questions: is it really necessary to keep your own copy of
> distutils and even install it? What's wrong with the version in the
> Python distribution? How can I disable the colorized output to get
> something more readable than yellow on white (well, seashell1)?
>
Yes --- distutils does not provide enough functionality. Besides, it's
not a *copy* of distutils. It's enhancements to distutils. It builds
on top of standard distutils.

I don't know the answer to the colorized output question. Please post
to numpy-discussion at lists.sourceforge.net to get more help and/or
make suggestions.

-Travis

From oliphant.travis at ieee.org Fri Feb 17 03:22:04 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Fri Feb 17 03:22:04 2006
Subject: [Numpy-discussion] Problems compiling on Cygwin
In-Reply-To: 
References: 
Message-ID: <43F5B1AE.9080208@ieee.org>

Jose Gomez-Dans wrote:

>Hi!
>Yesterday I posted on the scipy mailing list that I could not compile NumPy on
>Cygwin. I would like to provide some more information on what the problems are,
>as I would really like to be able to use it on Cygwin.
>
>
Thanks Jose. It looks like we are not doing the right thing in the
platform-specific section of code here. But the right thing can
potentially be done. Look here:

ftp://sunsite.dk/projects/cygwinports/release/python/numpy/

It looks like somebody figured out how to make it work with cygwin (one
option, of course, is to just disable the IEEE error-setting modes for
cygwin).

-Travis

From oliphant.travis at ieee.org Fri Feb 17 03:50:10 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Fri Feb 17 03:50:10 2006
Subject: [Numpy-discussion] Problems compiling on Cygwin
In-Reply-To: 
References: 
Message-ID: <43F5B856.5060301@ieee.org>

Jose Gomez-Dans wrote:

>Hi!
>Yesterday I posted on the scipy mailing list that I could not compile NumPy on
>Cygwin. I would like to provide some more information on what the problems are,
>as I would really like to be able to use it on Cygwin.
>
>
I looked into how people at cygwin ports got the IEEE math stuff done.
They borrowed it from BSD, basically. So, I've taken their patch and
placed it in the main tree.

Jose, could you check out the latest SVN version of numpy and try to
build and install it on cygwin to see if I made the right changes?
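Once the SVN build goes through, a minimal post-build smoke test along
these lines can confirm that umath.dll (where the link failed) loads
and works. This is a hypothetical sketch, not something from the
thread:

    import numpy as N

    print N.__version__
    print N.exp(N.arange(3.0))         # exercises umath.dll
    print N.sqrt(N.array([4.0, 9.0]))  # ditto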
-Travis From bblais at bryant.edu Fri Feb 17 05:33:15 2006 From: bblais at bryant.edu (Brian Blais) Date: Fri Feb 17 05:33:15 2006 Subject: [Numpy-discussion] Re: calculating on matrix indices In-Reply-To: References: <43F50D7F.8010108@bryant.edu> Message-ID: <43F5D003.2090705@bryant.edu> Robert Kern wrote: > The traceback tells you exactly what's wrong: > > In [7]: x[idx] = exp(-t[idx]/tau) > --------------------------------------------------------------------------- > exceptions.TypeError Traceback (most recent call > last) > yes, I saw that, but all of the types (i.e. type(x)) came out to be the same, so I figured the problem was with the indexing, and that was causing a typecast problem. I didn't know about dtype > In [13]: x = zeros(len(t), float) well that is confusing! zeros(5,'f') is single precision, zeros(5,'d') is double, and zeros(5,float) is double! that's where I got whacked, because I remembered that "float" was "double" in python...but I guess, not always. thanks for your help! bb -- ----------------- bblais at bryant.edu http://web.bryant.edu/~bblais From agn at noc.soton.ac.uk Fri Feb 17 06:37:02 2006 From: agn at noc.soton.ac.uk (George Nurser) Date: Fri Feb 17 06:37:02 2006 Subject: [Numpy-discussion] maxima of masked arrays Message-ID: <9348F161-A8D2-4755-9AF5-54968D134FDF@noc.soton.ac.uk> I am trying to get the n-1 dimensional array of maxima of an array taken over a given axis. with ordinary arrays this works fine. E.g. In [49]: a = arange(1,13).reshape(3,4) In [50]: a Out[50]: array([[ 1, 2, 3, 4], [ 5, 6, 7, 8], [ 9, 10, 11, 12]]) The maximum over all elements is: In [51]: print a.max() 12 & the array of maxima over the 0-axis is OK too: In [52]: print a.max(0) [ 9 10 11 12] But with a masked array there are problems. In [54]: amask = ma.masked_where(a < 5,a) In [55]: amask Out[55]: array(data = [[999999 999999 999999 999999] [ 5 6 7 8] [ 9 10 11 12]], mask = [[True True True True] [False False False False] [False False False False]], fill_value=999999) The maximum over all elements is fine: In [56]: amask.max() Out[56]: 12 but trying to get an array of maxima over the 0-axis fails: n [57]: amask.max(0) Out[57]: array(data = [[999999 999999 999999 999999] [ 5 6 7 8] [ 9 10 11 12]], mask = [[True True True True] ....... Are there any workarounds for this? -George. From schofield at ftw.at Fri Feb 17 07:23:13 2006 From: schofield at ftw.at (Ed Schofield) Date: Fri Feb 17 07:23:13 2006 Subject: [Numpy-discussion] Dot products and casting Message-ID: <20060217152142.GA9621@ftw.at> Hi all, I think there's a bug in dot() that prevents it from operating on two arrays, neither of which can be safely cast to the other. Here's an example: >>> from numpy import * >>> a = arange(10, dtype=float32) >>> b = arange(10, dtype=float64) >>> c = arange(10, dtype=int64) >>> d = arange(10, dtype=int32) >>> e = arange(10, dtype=int16) # Dot products between b and either c or d work fine: >>> dot(b,c) 285.0 >>> dot(b,d) 285.0 # Dot products with e also work fine: >>> dot(a,e) 285.0 >>> dot(b,e) 285.0 # But dot products between a and either c or d don't work: >>> dot(a,c) Traceback (most recent call last): File "", line 1, in ? TypeError: array cannot be safely cast to required type >>> dot(a,d) Traceback (most recent call last): File "", line 1, in ? 
TypeError: array cannot be safely cast to required type The problem seems to be with the PyArray_ObjectType() calls in dotblas_matrixproduct(), which are returning typenum=PyArray_FLOAT, but this isn't sufficiently large for a safe cast from the int32 and int64 arrays. It seems like PyArray_ObjectType() should be returning PyArray_DOUBLE here instead. Here's another example: >>> f = arange(10, dtype=complex64) >>> dot(b, f) Traceback (most recent call last): File "", line 1, in ? TypeError: array cannot be safely cast to required type So it seems like the problem isn't isolated to float32 arrays, but occurs elsewhere when we need to find a minimum data type of two arrays when *both* need to be upcasted. -- Ed From stefan at sun.ac.za Fri Feb 17 07:24:01 2006 From: stefan at sun.ac.za (Stefan van der Walt) Date: Fri Feb 17 07:24:01 2006 Subject: [Numpy-discussion] storage for records In-Reply-To: <43F4E947.9070409@ieee.org> References: <20060216192556.GA20396@alpha> <43F4E947.9070409@ieee.org> Message-ID: <20060217152207.GA971@sun.ac.za> I am probably trying to do something silly, but still: In [1]: import numpy as N In [2]: N.__version__ Out[2]: '0.9.6.2127' In [3]: P = N.array(N.zeros((2,2)), N.dtype((('f4',3), {'names': ['x','y','z'], 'formats': ['f4','f4','f4']}))) *** glibc detected *** malloc(): memory corruption: 0x0830bb48 *** Aborted Regards St?fan On Thu, Feb 16, 2006 at 02:06:15PM -0700, Travis Oliphant wrote: > Stefan van der Walt wrote: > > >Is there any way to control the underlying storage for a record? > > > >I am trying to use Travis' earlier example of an image with named fields: > > > >dt = N.dtype(' >img = N.array(N.empty((rows,columns)), dtype=dt) From arnd.baecker at web.de Fri Feb 17 08:11:04 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Fri Feb 17 08:11:04 2006 Subject: [Numpy-discussion] ANN: Python Enthought Edition Version 0.9.2 Released In-Reply-To: <43F4CFB4.1080305@enthought.com> References: <43F4CFB4.1080305@enthought.com> Message-ID: On Thu, 16 Feb 2006, Travis N. Vaught wrote: > Enthought is pleased to announce the release of Python Enthought Edition > Version 0.9.2 (http://code.enthought.com/enthon/) -- a python > distribution for Windows. This is a kitchen-sink-included Python > distribution including the following packages/tools out of the box: > > Numeric 24.2 > SciPy 0.3.3 > IPython 0.6.15 > Enthought Tool Suite 1.0.2 > wxPython 2.6.1.0 > PIL 1.1.4 > mingw 20030504-1 > f2py 2.45.241_1926 > MayaVi 1.5 > Scientific Python 2.4.5 > VTK 4.4 > and many more... Brilliant - many thanks for the effort! I was just about to ask for the plans about numpy/scipy, but the changelog at http://code.enthought.com/release/changelog-enthon0.9.2.shtml shows quite a bit of activity in this direction! Do you have an estimate about when a numpy/scipy version of the Enthought Edition might happen? 
Many thanks, Arnd From charlesr.harris at gmail.com Fri Feb 17 08:23:06 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri Feb 17 08:23:06 2006 Subject: [Numpy-discussion] cross product of two 3xn arrays In-Reply-To: References: <764834db0602151720r77370d7bx327e2a6dca954dbe@mail.gmail.com> <43F3EE4E.6030301@ieee.org> <43F43F02.1090001@bigpond.net.au> <43F48A2D.80705@ieee.org> Message-ID: On 2/17/06, Bill Baxter wrote: > For folks using quats to represent rotations (which is all I use them for, > anyway), if you're batch transforming a bunch of vectors by one quaternion, > it's a lot more efficient to convert the quat to a 3x3 matrix first and > transform using matrix multiply (9 mults per transform that way vs 21 or so > depending on the implementation of q*v*q^-1). Given that, I can't see many > situations when I'd need a super speedy C version of quaternion multiply. > > --Bill True. On the other hand, I have files containing 20,000 quaternions, each of which needs to be converted to a rotation matrix and applied to some 400 vectors, so a c version would have its place. I have a python quaternion class that I use for such things but profiling shows that it is one of the prime time bandits, so I am tempted to use c for that class anyway. Note that the current NumPy cross product is implemented in Python. On a related note, indexing in numarray is some 3x faster than in NumPy and I'm wondering what needs to be done to speed that up. Chuck From travis at enthought.com Fri Feb 17 08:46:06 2006 From: travis at enthought.com (Travis N. Vaught) Date: Fri Feb 17 08:46:06 2006 Subject: [Numpy-discussion] ANN: Python Enthought Edition Version 0.9.2 Released In-Reply-To: References: <43F4CFB4.1080305@enthought.com> Message-ID: <43F5FD9D.2090706@enthought.com> Arnd Baecker wrote: > On Thu, 16 Feb 2006, Travis N. Vaught wrote: > > >> Enthought is pleased to announce the release of Python Enthought Edition >> Version 0.9.2 (http://code.enthought.com/enthon/) -- a python >> distribution for Windows. This is a kitchen-sink-included Python >> distribution including the following packages/tools out of the box: >> >> Numeric 24.2 >> SciPy 0.3.3 >> IPython 0.6.15 >> Enthought Tool Suite 1.0.2 >> wxPython 2.6.1.0 >> PIL 1.1.4 >> mingw 20030504-1 >> f2py 2.45.241_1926 >> MayaVi 1.5 >> Scientific Python 2.4.5 >> VTK 4.4 >> and many more... >> > > Brilliant - many thanks for the effort! > > I was just about to ask for the plans about numpy/scipy, > but the changelog at > http://code.enthought.com/release/changelog-enthon0.9.2.shtml > shows quite a bit of activity in this direction! > > Do you have an estimate about when a numpy/scipy version > of the Enthought Edition might happen? > > Many thanks, > > Arnd > It's a bit difficult to say with much accuracy, so I'll be transparent but imprecise. Our release of Enthon versions typically tracks the state of the platform we are using for the custom software development we do to pay the bills. Thus, our current project code typically has to be ported to build and run on a cobbled-together build of the newer versions before we do a release. I realize this is a drag on the release schedule for Enthon, but it's how we allocate resources to the builds. Enough excuses, though--we are working on the migration of our project code now (Pearu Peterson) and I expect in weeks (rather than months) we'll have an Enthon release candidate with Python 2.4.2, and the latest SciPy and NumPy on Windows. 
Robert Kern is already working on a project that is based on this tool chain, so the wedge is in place. Thanks for the interest! (and sorry for the cross-post) Travis From faltet at carabos.com Fri Feb 17 10:21:26 2006 From: faltet at carabos.com (Francesc Altet) Date: Fri Feb 17 10:21:26 2006 Subject: [Numpy-discussion] PyTables with support for NumPy 0.9.5 Message-ID: <200602171920.35075.faltet@carabos.com> Hi, I've just uploaded a new version of PyTables with (almost) complete support for the recent NumPy 0.9.5. All the range of homogeneous and heterogeneous (including those with nested fields) arrays using any combination of data-types should be supported. The only exception is the lack of support of unicode types (I have to figure out yet which HDF5 datatype would be best to mapping them; suggestions are welcome!). You can fetch the tarball from: http://pytables.carabos.com/download/preliminary/pytables-1.3beta2.tar.gz Test it as much as you can, and if you find any strange quirk, do not hesitate to report it back. Regards, -- >0,0< Francesc Altet ? ? http://www.carabos.com/ V V C?rabos Coop. V. ??Enjoy Data "-" From tim.hochberg at cox.net Fri Feb 17 11:14:12 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Fri Feb 17 11:14:12 2006 Subject: [Numpy-discussion] optimizing power() for complex and real cases In-Reply-To: References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> <43F13B20.3000301@cox.net> <43F13DE0.8040309@bigpond.net.au> <43F167C0.7040806@cox.net> <43F20BFE.5030100@cox.net> <43F36CB6.5050004@cox.net> <43F3B8A9.3000507@bigpond.net.au> <43F3CA12.4000907@cox.net> Message-ID: <43F62069.80209@cox.net> Here's a little progress report: I now have A**2 running as fast as square(A). This is done by special casing stuff in array_power so that A**2 acutally calls square(A) instead of going through power(A,2). Things still need a bunch of cleaning up (in fact right now A**1 segfaults, but I know why and it should be an easy fix). However, I think I've discovered why you couldn't get your special cased power to run as fast for A**2 as square(A) or A*A. It appears that the overhead of creating a new array object from the integer 2 is the bottleneck. I was running into the same mysterious overhead, even when dispatching from array_power, until I special cased on PyInt to avoid the array creation in that case. -tim From oliphant.travis at ieee.org Fri Feb 17 11:23:02 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Fri Feb 17 11:23:02 2006 Subject: [Numpy-discussion] Dot products and casting In-Reply-To: <20060217152142.GA9621@ftw.at> References: <20060217152142.GA9621@ftw.at> Message-ID: <43F62258.3040602@ieee.org> Ed Schofield wrote: >Hi all, > > > >The problem seems to be with the PyArray_ObjectType() calls in >dotblas_matrixproduct(), which are returning typenum=PyArray_FLOAT, but >this isn't sufficiently large for a safe cast from the int32 and int64 >arrays. It seems like PyArray_ObjectType() should be returning >PyArray_DOUBLE here instead. > > This sounds like an accurate diagnosis. I'll have to look at the type-evaluation code a bit more to see why a suitable type is not being found --- unless someone else gets there first. I won't have time for awhile today. 
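Until the promotion logic around PyArray_ObjectType() is fixed, a
workaround sketch for the failing cases reported above is simply to
upcast explicitly, so dot() never needs an unsafe cast:

    from numpy import arange, dot, float32, float64, int32

    a = arange(10, dtype=float32)
    c = arange(10, dtype=int32)
    # dot(a, c) raises TypeError; casting both to a common type works:
    print dot(a.astype(float64), c.astype(float64))   # -> 285.0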
-Travis From oliphant.travis at ieee.org Fri Feb 17 11:25:59 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Fri Feb 17 11:25:59 2006 Subject: [Numpy-discussion] cross product of two 3xn arrays In-Reply-To: References: <764834db0602151720r77370d7bx327e2a6dca954dbe@mail.gmail.com> <43F3EE4E.6030301@ieee.org> <43F43F02.1090001@bigpond.net.au> <43F48A2D.80705@ieee.org> Message-ID: <43F62313.9030607@ieee.org> Charles R Harris wrote: >On 2/17/06, Bill Baxter wrote: > > >>For folks using quats to represent rotations (which is all I use them for, >>anyway), if you're batch transforming a bunch of vectors by one quaternion, >>it's a lot more efficient to convert the quat to a 3x3 matrix first and >>transform using matrix multiply (9 mults per transform that way vs 21 or so >>depending on the implementation of q*v*q^-1). Given that, I can't see many >>situations when I'd need a super speedy C version of quaternion multiply. >> >>--Bill >> >> > >On a related note, indexing in numarray is some 3x faster than in >NumPy and I'm wondering what needs to be done to speed that up. > > Please explain with a benchmark. This is not true for all indexing operations. But, it is possible that certain use cases are faster. We can't do anything without knowing what you are talking about exactly. -Travis From cookedm at physics.mcmaster.ca Fri Feb 17 11:31:05 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Fri Feb 17 11:31:05 2006 Subject: [Numpy-discussion] optimizing power() for complex and real cases In-Reply-To: <43F62069.80209@cox.net> (Tim Hochberg's message of "Fri, 17 Feb 2006 12:13:45 -0700") References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> <43F13B20.3000301@cox.net> <43F13DE0.8040309@bigpond.net.au> <43F167C0.7040806@cox.net> <43F20BFE.5030100@cox.net> <43F36CB6.5050004@cox.net> <43F3B8A9.3000507@bigpond.net.au> <43F3CA12.4000907@cox.net> <43F62069.80209@cox.net> Message-ID: Tim Hochberg writes: > Here's a little progress report: I now have A**2 running as fast as > square(A). This is done by special casing stuff in array_power so that > A**2 acutally calls square(A) instead of going through power(A,2). > Things still need a bunch of cleaning up (in fact right now A**1 > segfaults, but I know why and it should be an easy fix). However, I > think I've discovered why you couldn't get your special cased power to > run as fast for A**2 as square(A) or A*A. It appears that the overhead > of creating a new array object from the integer 2 is the bottleneck. I > was running into the same mysterious overhead, even when dispatching > from array_power, until I special cased on PyInt to avoid the array > creation in that case. Hmm, if that's true about the overhead, that'll hit all computations of the type op(x, ). Something to look at. That ufunc code for casting the arguments is pretty big and hairy, so I'm not going to look at right now :-) -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From bryan at cole.uklinux.net Fri Feb 17 12:11:06 2006 From: bryan at cole.uklinux.net (Bryan Cole) Date: Fri Feb 17 12:11:06 2006 Subject: [Numpy-discussion] Re: number ranges (was Re: Matlab page on scipy wiki) In-Reply-To: References: Message-ID: <1139945386.3346.38.camel@pc1.cole.uklinux.net> > > > First, I think the range() function in python is ugly to begin with. 
> Why can't python just support range notation directly like 'for a in > 0:10'. Or with 0..10 or 0...10 syntax. That seems to make a lot more > sense to me than having to call a named function. Anyway, that's a > python pet peeve, and python's probably not going to change something > so fundamental... There was a python PEP on this. It got rejected as having too many 'issues'. Pity, in my view. see http://www.python.org/peps/pep-0204.html BC From Chris.Barker at noaa.gov Fri Feb 17 12:24:01 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Fri Feb 17 12:24:01 2006 Subject: [Numpy-discussion] Behavior of array scalars Message-ID: <43F630F9.9050208@noaa.gov> Hi all, It just dawned on my that the numpy array scalars might give something I have wanted once in a while: mutable scalars. However, it seems that we almost, but no quite, have them. A few questions: >>> import numpy as N >>> N.__version__ '0.9.2' >>> N.array(5) array(5) >>> >>> x = N.array(5) >>> x.shape () So it looks like a scalar. >>> y = x Now I have two names bound to the same object. >>> x += 5 I expect this to change the object in place. >>> x 10 but what is this? is it no longer an array? >>> y array(10) y changed, so it looks like the object has changed in place. >>> type(x) >>> type(y) So why did x += 5 result in a different type of object? Also: I can see that we could use += and friends to mutate an array scalar, but what if I want to set it's value, as a mutation, like: >>> x = N.array((5,)) >>> x array([5]) >>> x[0] = 10 >>> x array([10]) but I can't so that with an array scalar: >>> x = N.array(5) >>> x array(5) >>> x[0] = 10 Traceback (most recent call last): File "", line 1, in ? IndexError: 0-d arrays can't be indexed. >>> x[] = 10 File "", line 1 x[] = 10 ^ SyntaxError: invalid syntax >>> x[:] = 10 Traceback (most recent call last): File "", line 1, in ? ValueError: cannot slice a scalar Is there a way to set the value in place, without resorting to: >>> x *= 0 >>> x += 34 I think it would be really handy to have a full featured, mutable scalar. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From alexander.belopolsky at gmail.com Fri Feb 17 12:51:03 2006 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Fri Feb 17 12:51:03 2006 Subject: [Numpy-discussion] Behavior of array scalars In-Reply-To: <43F630F9.9050208@noaa.gov> References: <43F630F9.9050208@noaa.gov> Message-ID: On 2/17/06, Christopher Barker wrote: > >>> x += 5 > > I expect this to change the object in place. > > >>> x > 10 > > but what is this? is it no longer an array? I would say it is a bug, but here is an easy work-around >>> x = array(5) >>> id(x) 6425088 >>> x[()]+=5 >>> id(x) 6425088 >>> x array(10) You can also use >>> x[...]+=5 >>> x array(15) With an additional benefit that the same syntax works for any shape. > Is there a way to set the value in place, without resorting to: >>> x[...] = 10 or >>> x[()] = 10 You can see more on this feature at http://projects.scipy.org/scipy/numpy/wiki/ZeroRankArray From ndarray at mac.com Fri Feb 17 13:09:07 2006 From: ndarray at mac.com (Sasha) Date: Fri Feb 17 13:09:07 2006 Subject: [Numpy-discussion] Behavior of array scalars In-Reply-To: References: <43F630F9.9050208@noaa.gov> Message-ID: On 2/17/06, Christopher Barker wrote: > >>> x += 5 > > I expect this to change the object in place. 
> > >>> x > 10 > > but what is this? is it no longer an array? I would say it is a bug, but here is an easy work-around >>> x = array(5) >>> id(x) 6425088 >>> x[()]+=5 >>> id(x) 6425088 >>> x array(10) You can also use >>> x[...]+=5 >>> x array(15) With an additional benefit that the same syntax works for any shape. > Is there a way to set the value in place, without resorting to: >>> x[...] = 10 or >>> x[()] = 10 You can see more on this feature at http://projects.scipy.org/scipy/numpy/wiki/ZeroRankArray From tim.hochberg at cox.net Fri Feb 17 13:19:00 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Fri Feb 17 13:19:00 2006 Subject: [Numpy-discussion] optimizing power() for complex and real cases In-Reply-To: References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net> <43F11716.9050204@cox.net> <43F13B20.3000301@cox.net> <43F13DE0.8040309@bigpond.net.au> <43F167C0.7040806@cox.net> <43F20BFE.5030100@cox.net> <43F36CB6.5050004@cox.net> <43F3B8A9.3000507@bigpond.net.au> <43F3CA12.4000907@cox.net> <43F62069.80209@cox.net> Message-ID: <43F63D82.7060702@cox.net> David M. Cooke wrote: >Tim Hochberg writes: > > > >>Here's a little progress report: I now have A**2 running as fast as >>square(A). This is done by special casing stuff in array_power so that >>A**2 acutally calls square(A) instead of going through power(A,2). >>Things still need a bunch of cleaning up (in fact right now A**1 >>segfaults, but I know why and it should be an easy fix). However, I >>think I've discovered why you couldn't get your special cased power to >>run as fast for A**2 as square(A) or A*A. It appears that the overhead >>of creating a new array object from the integer 2 is the bottleneck. I >>was running into the same mysterious overhead, even when dispatching >>from array_power, until I special cased on PyInt to avoid the array >>creation in that case. >> >> > >Hmm, if that's true about the overhead, that'll hit all computations >of the type op(x, ). Something to look at. That ufunc code >for casting the arguments is pretty big and hairy, so I'm not going to >look at right now :-) > > Well, it's just a guess based on the fact that the extra time went away when I stopped calling PyArray_EnsureArray(o2) for python ints. For what it's worth, numpy scalars seem to have much less overhead. As indicated below (note that numpy scalars are not currently special cased like PyInts are). The overhead from PyInts was closer to 75% versus about 15% for numpy scalars. Of course, the percentage of overhead is going to go up for smaller arrays. >>> Timer('a**2', 'from numpy import arange;a = arange(10000.); b = a[2]').timeit(10000) 0.28345055027943999 >>> Timer('a**b', 'from numpy import arange;a = arange(10000.); b = a[2]').timeit(10000) 0.32190487897204889 >>> Timer('a*a', 'from numpy import arange;a = arange(10000.); b = a[2]').timeit(10000) 0.27305732991204223 >>> Timer('square(a)', 'from numpy import arange, square;a = arange(10000.); b = a[2]').timeit(10000) 0.27989618792332749 -tim From oliphant at ee.byu.edu Fri Feb 17 15:04:01 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Fri Feb 17 15:04:01 2006 Subject: [Numpy-discussion] Behavior of array scalars In-Reply-To: <43F630F9.9050208@noaa.gov> References: <43F630F9.9050208@noaa.gov> Message-ID: <43F65627.5080901@ee.byu.edu> Christopher Barker wrote: > Hi all, > > It just dawned on my that the numpy array scalars might give something > I have wanted once in a while: mutable scalars. However, it seems that > we almost, but no quite, have them. 
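A compact recap of the behaviour in question, using only constructs
already shown in this thread (the empty-tuple index is the workaround
quoted above; it may not exist in older builds):

    import numpy as N

    x = N.array(5)
    y = x                   # two names bound to the same 0-d array
    x += 5                  # rebinds x to the scalar add(x, 5, x) returns
    print type(x), type(y)  # the types now differ
    print y                 # array(10): the object itself did mutate
    y[()] = 42              # empty-tuple indexing mutates in place
    print y                 # array(42)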
A few questions: NumPy (starting with Numeric) has always had this love-hate relationship with zero-dimensional arrays. We use them internally to simplify the code, but try not to expose them to the user. Ultimately, we couldn't figure out how to do that cleanly and so we have the current compromise situation where 0-d arrays are available but treated as second-class citizens. Thus, we still get funny behavior in certain circumstances. I think you found another such quirky area. I'm open to suggestions. To analyze this particular case... The a+= 10 operation should be equivalent to add(a,10,a). Note that explicitly writing add(a,10,a) returns a scalar (all ufuncs return scalars if 0-d arrays are the result). But, a is modified in-place as you wanted. Perhaps what is going on is that a += 10 is begin translated to a = a + 10 rather than add(a,10,a) I'll have to look deeper to see why. -Travis From ndarray at mac.com Fri Feb 17 15:44:04 2006 From: ndarray at mac.com (Sasha) Date: Fri Feb 17 15:44:04 2006 Subject: [Numpy-discussion] Behavior of array scalars In-Reply-To: <43F65627.5080901@ee.byu.edu> References: <43F630F9.9050208@noaa.gov> <43F65627.5080901@ee.byu.edu> Message-ID: On 2/17/06, Travis Oliphant wrote: > ... > Perhaps what is going on is that > a += 10 > is begin translated to > a = a + 10 > rather than > add(a,10,a) > I'll have to look deeper to see why. It is actually being translated to "a = add(a,10,a)" by virtue of array_inplace_add supplied in the inplace_add slot. Here is the proof: >>> a = array(0) >>> a = b = array(0) >>> a += 10 >>> b array(10) >>> a 10 Another way to explain it is to note that a += 10 is equivalent to "a = a.__iadd__(10)" and a.__iadd__(10) is equivalent to "add(a, 10, a)". This is not easy to fix because the real culprit is >>> a = array(0) >>> type(a) is type(a+a) False Maybe we can change ufunc logic so that when the output argument is supplied it is returned without scalar conversion. From oliphant at ee.byu.edu Fri Feb 17 15:51:01 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Fri Feb 17 15:51:01 2006 Subject: [Numpy-discussion] Behavior of array scalars In-Reply-To: References: <43F630F9.9050208@noaa.gov> <43F65627.5080901@ee.byu.edu> Message-ID: <43F66123.204@ee.byu.edu> Sasha wrote: >Maybe we can change ufunc logic so that when the output argument is >supplied it is returned without scalar conversion. > > That seems sensible. Any objections? It is PyArray_Return that changes things from 0-d array's to scalars. It's all that function has every really done.... Notice that this behavior was always in Numeric... a = Numeric.array(5) a += 10 type(a) >>> a = Numeric.array(5) >>> type(a) == type(a+a) False But... >>> a = Numeric.array(5,'f') >>> type(a) == type(a+a) True So, we've been dealing with these issues (poorly) for a long time.... -Travis From oliphant at ee.byu.edu Fri Feb 17 16:05:01 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Fri Feb 17 16:05:01 2006 Subject: [Numpy-discussion] Behavior of array scalars In-Reply-To: References: <43F630F9.9050208@noaa.gov> <43F65627.5080901@ee.byu.edu> Message-ID: <43F6647C.6060107@ee.byu.edu> Sasha wrote: >It is actually being translated to "a = add(a,10,a)" by virtue of >array_inplace_add supplied in the inplace_add slot. Here is the >proof: > > I think we do need to fix something. Because the problem is even more apparent when you in-place add an array to a matrix. Consider... a = rand(5,5) b = mat(a) a += b What do you think the type of a now is? What should it be? 
Currently, this code would change a from an array to a matrix because
add(a,b,a) returns a matrix.

I'm thinking that we should establish the rule that if output arrays are
given, then what is returned should just be those output arrays... This
seems to make consistent sense and it will make the inplace operators
work as expected (not changing the type).

We are currently not letting in-place operators change the data-type.
Our argument against that behavior is weakened if we do let them change
the Python type....

-Travis

From cookedm at physics.mcmaster.ca  Fri Feb 17 16:21:08 2006
From: cookedm at physics.mcmaster.ca (David M. Cooke)
Date: Fri Feb 17 16:21:08 2006
Subject: [Numpy-discussion] Behavior of array scalars
In-Reply-To: <43F6647C.6060107@ee.byu.edu> (Travis Oliphant's message of
	"Fri, 17 Feb 2006 17:04:12 -0700")
References: <43F630F9.9050208@noaa.gov> <43F65627.5080901@ee.byu.edu>
	<43F6647C.6060107@ee.byu.edu>
Message-ID: 

Travis Oliphant writes:

> Consider...
>
> a = rand(5,5)
> b = mat(a)
>
> a += b
>
> What do you think the type of a now is? What should it be?
>
> Currently, this code would change a from an array to a matrix because
>
> add(a,b,a) returns a matrix.
>
> I'm thinking that we should establish the rule that if output arrays
> are given, then what is returned should just be those output arrays...

+1.

-- 
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke                      http://arbutus.physics.mcmaster.ca/dmc/
|cookedm at physics.mcmaster.ca

From ndarray at mac.com  Fri Feb 17 16:31:03 2006
From: ndarray at mac.com (Sasha)
Date: Fri Feb 17 16:31:03 2006
Subject: [Numpy-discussion] Behavior of array scalars
In-Reply-To: <43F65627.5080901@ee.byu.edu>
References: <43F630F9.9050208@noaa.gov> <43F65627.5080901@ee.byu.edu>
Message-ID: 

Sorry for a truncated post. Here is what I intended.

On 2/17/06, Travis Oliphant wrote:
> NumPy (starting with Numeric) has always had this love-hate relationship
> with zero-dimensional arrays. We use them internally to simplify the
> code, but try not to expose them to the user. Ultimately, we couldn't
> figure out how to do that cleanly and so we have the current compromise
> situation where 0-d arrays are available but treated as second-class
> citizens. Thus, we still get funny behavior in certain circumstances.

It would be nice to collect the motivations behind the current state of
affairs with rank-0 arrays in one place. Due to the "hard-hat" nature of
the issue, I would suggest doing it at
http://projects.scipy.org/scipy/numpy/wiki/ZeroRankArray .

Travis' Numeric3 design document actually leaves the issue open:

"""
What does single element indexing return? Scalars or rank-0 arrays?

Right now, a scalar is returned if there is a direct map to a Python
type, otherwise a rank-0 array (Numeric scalar) is returned. But, in
problems which reduce to an array of arbitrary size, this can lead to a
lot of code that basically just checks to see if the object is a scalar.

There are two ways I can see to solve this: 1) always return rank-0
arrays (never convert to Python scalars) and 2) always use special
functions (like alen) that handle Python scalars correctly. I'm open to
both ideas, but probably prefer #1 (never convert to Python scalars)
unless requested.
"""

I can think of two compelling reasons in favor of scalar array types:

1. Rank-0 arrays cannot be used as indices to tuples.
2. Rank-0 arrays cannot be used as keys in dicts.

Neither of these reasons is future proof.
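In session form, the two limitations look roughly like this (a sketch;
the exact exception messages vary by version):

>>> from numpy import array
>>> i = array(1)           # a rank-0 array
>>> (10, 20, 30)[i]        # reason 1: unusable as a tuple index
TypeError: tuple indices must be integers
>>> {i: 'value'}           # reason 2: unusable as a dict key
TypeError: unhashable type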
It looks like python 2.5 will introduce an __index__ slot that will fix
#1, and #2 is probably better solved by the introduction of a "frozen"
ndarray.

In any case I will collect all these thoughts on the ZeroRankArray page
unless I hear that this belongs to the main wiki.

From oliphant at ee.byu.edu  Fri Feb 17 16:54:05 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Fri Feb 17 16:54:05 2006
Subject: [Numpy-discussion] Behavior of array scalars
In-Reply-To: 
References: <43F630F9.9050208@noaa.gov> <43F65627.5080901@ee.byu.edu>
Message-ID: <43F67012.5090303@ee.byu.edu>

Sasha wrote:

>It would be nice to collect the motivations behind the current state
>of affairs with rank-0 arrays in one place. Due to the "hard-hat"
>nature of the issue, I would suggest doing it at
>http://projects.scipy.org/scipy/numpy/wiki/ZeroRankArray .
>
>Travis' Numeric3 design document actually leaves the issue open

This document is old. Please don't refer to it too stringently. It
reflected my thinking at the start of the project. There are mailing
list discussions that have more relevance. The source reflects what was
actually done.

What was done was to introduce scalar array types for every data-type
and return those. I had originally thought that the pure Python user
would *never* see rank-0 arrays. That's why PyArray_Return is called all
over the place in the code. The concept that practicality beats purity
won out and there are a few limited ways you can get zero-dimensional
arrays (i.e. using array(5) which used to return an array scalar). They
just don't *stay* 0-d arrays and are converted to array scalars at
almost every opportunity....

I have been relaxing this over time, however. I can't say I have some
grand understanding that is guiding the relaxation of this rule, except
that I still think array scalars are *better* to deal with (I think this
will be especially obvious when we get scalar math implemented). So, I
reluctantly give visibility to 0-d arrays when particular use-cases
emerge.

>In any case I will collect all these thoughts on the ZeroRankArray
>page unless I hear that this belongs to the main wiki.

It's a good start. This particular use case of course is actually
showing us a deeper flaw in our use of output arguments in the ufunc
which needs changing.

-Travis

From Chris.Barker at noaa.gov  Fri Feb 17 17:19:03 2006
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Fri Feb 17 17:19:03 2006
Subject: [Numpy-discussion] Behavior of array scalars
In-Reply-To: <43F67012.5090303@ee.byu.edu>
References: <43F630F9.9050208@noaa.gov> <43F65627.5080901@ee.byu.edu>
	<43F67012.5090303@ee.byu.edu>
Message-ID: <43F67618.8060501@noaa.gov>

Travis Oliphant wrote:

> They just don't
> *stay* 0-d arrays and are converted to array scalars at almost every
> opportunity....

I'm still confused as to what the difference is. This (recent)
conversation started with my desire for a mutable scalar. Can array
scalars fill this role?

What I mean by that role is some way to do:

x += 5        # (and friends)
x[()] = 45    # or some other notation

And have x be the same object throughout. Heck, even something like:

x.set(45)

would work for me.

Alexander Belopolsky wrote:
>>>> x[...] = 10
>
> or
>
>>>> x[()] = 10

I can't do that. Has that been added since version '0.9.2'?

-Chris

-- 
Christopher Barker, Ph.D.
Oceanographer

NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From oliphant at ee.byu.edu  Fri Feb 17 17:30:09 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Fri Feb 17 17:30:09 2006
Subject: [Numpy-discussion] Behavior of array scalars
In-Reply-To: <43F67618.8060501@noaa.gov>
References: <43F630F9.9050208@noaa.gov> <43F65627.5080901@ee.byu.edu>
	<43F67012.5090303@ee.byu.edu> <43F67618.8060501@noaa.gov>
Message-ID: <43F67875.1090302@ee.byu.edu>

Christopher Barker wrote:

> Travis Oliphant wrote:
>
>> They just don't *stay* 0-d arrays and are converted to array scalars
>> at almost every opportunity....
>
> I'm still confused as to what the difference is. This (recent)
> conversation started with my desire for a mutable scalar. Can array
> scalars fill this role?

No. Array scalars are immutable (well, except for the void array
scalar...)

> What I mean by that role is some way to do:
>
> x += 5    # (and friends)

This now works in SVN. In-place operations on 0-d arrays don't change
type on you.

-Travis

From ndarray at mac.com  Fri Feb 17 17:32:04 2006
From: ndarray at mac.com (Sasha)
Date: Fri Feb 17 17:32:04 2006
Subject: [Numpy-discussion] Behavior of array scalars
In-Reply-To: <43F67618.8060501@noaa.gov>
References: <43F630F9.9050208@noaa.gov> <43F65627.5080901@ee.byu.edu>
	<43F67012.5090303@ee.byu.edu> <43F67618.8060501@noaa.gov>
Message-ID: 

On 2/17/06, Christopher Barker wrote:
> >>>> x[()] = 10
>
> I can't do that. Has that been added since version '0.9.2'?

Yes, you need 0.9.4 or later.

From ndarray at mac.com  Fri Feb 17 17:56:05 2006
From: ndarray at mac.com (Sasha)
Date: Fri Feb 17 17:56:05 2006
Subject: [Numpy-discussion] Behavior of array scalars
In-Reply-To: <43F66123.204@ee.byu.edu>
References: <43F630F9.9050208@noaa.gov> <43F65627.5080901@ee.byu.edu>
	<43F66123.204@ee.byu.edu>
Message-ID: 

On 2/17/06, Travis Oliphant wrote:
> Sasha wrote:
>
> >Maybe we can change ufunc logic so that when the output argument is
> >supplied it is returned without scalar conversion.
>
> That seems sensible.

Attached patch implements this idea. With the patch applied:

>>> from numpy import *
>>> x = array(5)
>>> add(x,5,x)
array(10)
>>> x += 5
>>> x
array(15)

The patch passes all the tests, but I would like to hear from others
before I commit. Personally, I am unhappy that I had to change a C-API
function.
-------------- next part --------------
Index: numpy/core/src/ufuncobject.c
===================================================================
--- numpy/core/src/ufuncobject.c	(revision 2128)
+++ numpy/core/src/ufuncobject.c	(working copy)
@@ -846,7 +846,8 @@
 #undef _GETATTR_
 
 static int
-construct_matrices(PyUFuncLoopObject *loop, PyObject *args, PyArrayObject **mps)
+construct_matrices(PyUFuncLoopObject *loop, PyObject *args,
+                   PyArrayObject **mps, Bool *supplied_output)
 {
 	int nargs, i, maxsize;
 	int arg_types[MAX_ARGS];
@@ -952,6 +953,7 @@
 			mps[i] = NULL;
 			continue;
 		}
+		supplied_output[i] = TRUE;
 		Py_INCREF(mps[i]);
 		if (!PyArray_Check((PyObject *)mps[i])) {
 			PyObject *new;
@@ -1000,6 +1002,7 @@
 				    NULL, NULL, 0, 0, NULL);
 		if (mps[i] == NULL) return -1;
+		supplied_output[i] = FALSE;
 	}
 
 	/* reset types for outputs that are equivalent
@@ -1271,7 +1274,8 @@
 }
 
 static PyUFuncLoopObject *
-construct_loop(PyUFuncObject *self, PyObject *args, PyArrayObject **mps)
+construct_loop(PyUFuncObject *self, PyObject *args,
+               PyArrayObject **mps, Bool* supplied_output)
 {
 	PyUFuncLoopObject *loop;
 	int i;
@@ -1299,7 +1303,7 @@
 		       &(loop->errobj)) < 0) goto fail;
 
 	/* Setup the matrices */
-	if (construct_matrices(loop, args, mps) < 0) goto fail;
+	if (construct_matrices(loop, args, mps, supplied_output) < 0) goto fail;
 
 	PyUFunc_clearfperr();
 
@@ -1381,13 +1385,13 @@
 /*UFUNC_API*/
 static int
 PyUFunc_GenericFunction(PyUFuncObject *self, PyObject *args,
-			PyArrayObject **mps)
+			PyArrayObject **mps, Bool* supplied_output)
 {
 	PyUFuncLoopObject *loop;
 	int i;
 	BEGIN_THREADS_DEF
 
-	if (!(loop = construct_loop(self, args, mps))) return -1;
+	if (!(loop = construct_loop(self, args, mps, supplied_output))) return -1;
 	if (loop->notimplemented) {ufuncloop_dealloc(loop); return -2;}
 
 	LOOP_BEGIN_THREADS
@@ -2561,6 +2565,7 @@
 	PyTupleObject *ret;
 	PyArrayObject *mps[MAX_ARGS];
 	PyObject *retobj[MAX_ARGS];
+	Bool supplied_output[MAX_ARGS];
 	PyObject *res;
 	PyObject *wrap;
 	int errval;
@@ -2569,7 +2574,7 @@
 	   if something goes wrong. */
 	for(i=0; i<self->nargs; i++) mps[i] = NULL;
 
-	errval = PyUFunc_GenericFunction(self, args, mps);
+	errval = PyUFunc_GenericFunction(self, args, mps, supplied_output);
 	if (errval < 0) {
 		for(i=0; i<self->nargs; i++) Py_XDECREF(mps[i]);
 		if (errval == -1)
@@ -2619,7 +2624,9 @@
 			continue;
 		}
 	}
-	retobj[i] = PyArray_Return(mps[j]);
+	retobj[i] = supplied_output[j]
+		? (PyObject *)mps[j]
+		: PyArray_Return(mps[j]);
 	}
 
 	if (self->nout == 1) {

From oliphant.travis at ieee.org  Fri Feb 17 20:11:17 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Fri Feb 17 20:11:17 2006
Subject: [Numpy-discussion] Behavior of array scalars
In-Reply-To: 
References: <43F630F9.9050208@noaa.gov> <43F65627.5080901@ee.byu.edu>
	<43F66123.204@ee.byu.edu>
Message-ID: <43F69E3C.30407@ieee.org>

Sasha wrote:

>On 2/17/06, Travis Oliphant wrote:
>
>>>>from numpy
>
>The patch passes all the tests, but I would like to hear from others
>before I commit. Personally, I am unhappy that I had to change a C-API
>function.

Sorry we worked on the same code. I already committed a change that
solves the problem. It doesn't change the C-API function but instead
changes the _find_wrap code, which needed changing anyway, so that other
objects passed in as output arrays cause the returned object to be
obj.__array_wrap__() no matter what the array priority was.

Now, a += b doesn't change type no matter what a is.
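For instance, the earlier example now behaves like this (a session
sketch; the exact repr is illustrative):

>>> a = rand(5,5)
>>> b = mat(a)
>>> a += b
>>> type(a)                # a stays a plain ndarray: the supplied
<type 'numpy.ndarray'>     # output array is returned as-is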
-Travis

From ndarray at mac.com  Fri Feb 17 20:23:03 2006
From: ndarray at mac.com (Sasha)
Date: Fri Feb 17 20:23:03 2006
Subject: [Numpy-discussion] maxima of masked arrays
In-Reply-To: <9348F161-A8D2-4755-9AF5-54968D134FDF@noc.soton.ac.uk>
References: <9348F161-A8D2-4755-9AF5-54968D134FDF@noc.soton.ac.uk>
Message-ID: 

On 2/17/06, George Nurser wrote:
> But with a masked array there are problems.

What you see is a bug in ma.

> Are there any workarounds for this?

For now you can use

>>> ma.maximum.reduce(amask, 0)
array(data = [ 9 10 11 12],
      mask = [False False False False],
      fill_value=999999)

From ndarray at mac.com  Fri Feb 17 20:36:04 2006
From: ndarray at mac.com (Sasha)
Date: Fri Feb 17 20:36:04 2006
Subject: [Numpy-discussion] Behavior of array scalars
In-Reply-To: <43F69E3C.30407@ieee.org>
References: <43F630F9.9050208@noaa.gov> <43F65627.5080901@ee.byu.edu>
	<43F66123.204@ee.byu.edu> <43F69E3C.30407@ieee.org>
Message-ID: 

On 2/17/06, Travis Oliphant wrote:
> Sorry we worked on the same code.

Not a problem. My code was just a proof of concept anyway.

> I already committed a change that solves the problem.

I've seen your change on the timeline. Thanks for the extensive comments.

> It doesn't change the C-API function but instead
> changes the _find_wrap code which needed changing anyway ...

You are right, my patch could be fooled by an output arg with an
__array_wrap__. However, I am not sure calling the output argument's
__array_wrap__ is a good idea: it looks like it may lead to
"(a is add(a, 2, a)) == False" in some circumstances. Another concern is
that it looks like what output arguments are supplied is determined
twice: in _find_wrap and in construct_matrices.

From ndarray at mac.com  Fri Feb 17 21:08:01 2006
From: ndarray at mac.com (Sasha)
Date: Fri Feb 17 21:08:01 2006
Subject: [Numpy-discussion] maxima of masked arrays
In-Reply-To: 
References: <9348F161-A8D2-4755-9AF5-54968D134FDF@noc.soton.ac.uk>
Message-ID: 

On 2/17/06, Sasha wrote:
> On 2/17/06, George Nurser wrote:
> > But with a masked array there are problems.
>
> What you see is a bug in ma.

Fixed in SVN.

From ndarray at mac.com  Sat Feb 18 11:50:08 2006
From: ndarray at mac.com (Sasha)
Date: Sat Feb 18 11:50:08 2006
Subject: [Numpy-discussion] A case for rank-0 arrays
Message-ID: 

I have reviewed mailing list discussions of rank-0 arrays vs. scalars
and I concluded that the current implementation that contains both is
(almost) correct. I will address the "almost" part with a concrete
proposal at the end of this post (search for PROPOSALS if you are only
interested in the practical part).

The main criticism of supporting both scalars and rank-0 arrays is that
it is "unpythonic" in the sense that it provides two almost equivalent
ways to achieve the same result. However, I am now convinced that this
is the case where practicality beats purity. If you take the "one way"
rule to its logical conclusion, you will find that once your language
has functions, it does not need numbers or any other data type because
they all can be represented by functions (see
http://en.wikipedia.org/wiki/Church_numeral). Another example of core
python violating the "one way rule" is the presence of scalars and
length-1 tuples. In S+, for example, scalars are represented by single
element lists.

The situation with ndarrays is somewhat similar. A rank-N array is very
similar to a function with N arguments, where each argument has a finite
domain (the i-th domain of a is range(a.shape[i])).
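The analogy is easy to state in code (a quick sketch):

>>> from numpy import ones
>>> a = ones((2, 3))
>>> def f(i, j):        # a behaves like a function of two arguments,
...     return a[i, j]  # with i in range(2) and j in range(3)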
A rank-0 array is just a function with no arguments, and as such it is
quite different from a scalar. Just as a function with no arguments
cannot be replaced by a constant when the value returned may change
during the run of the program, a rank-0 array cannot be replaced by an
array scalar, because it is mutable. (See
http://projects.scipy.org/scipy/numpy/wiki/ZeroRankArray for use cases.)

Rather than trying to hide rank-0 arrays from the end-user and treat
them as an implementation artifact, I believe numpy should emphasize the
difference between rank-0 arrays and scalars and have clear rules on
when to use what.

PROPOSALS
==========

Here are three suggestions:

1. Probably the most controversial question is what getitem should
return. I believe that most of the confusion comes from the fact that
the same syntax implements two different operations: indexing and
projection (for lack of a better name). Using the analogy between
ndarrays and functions, indexing is just the application of the function
to its arguments, and projection is the function projection
((f, x) -> lambda (*args): f(x, *args)).

The problem is that the same syntax results in different operations
depending on the rank of the array. Let

>>> x = ones((2,2))
>>> y = ones(2)

then x[1] is projection and type(x[1]) is ndarray, but y[1] is indexing
and type(y[1]) is int32. Similarly, y[1,...] is indexing, while x[1,...]
is projection.

I propose to change numpy rules so that if an ellipsis is present inside
[], the operation is always projection, and both y[1,...] and x[1,1,...]
return zero-rank arrays. Note that I have previously rejected Francesc's
idea that x[...] and x[()] should have different meanings for zero-rank
arrays. I was wrong.

2. Another source of ambiguity is the various "reduce" operations such
as sum or max. Using the previous example, type(x.sum(axis=0)) is
ndarray, but type(y.sum(axis=0)) is int32. I propose three changes:

   a. Make x.sum(axis) return ndarray unless axis is None, making
      type(y.sum(axis=0)) is ndarray true in the example.

   b. Allow axis to be a sequence of ints and make
      x.sum(axis=range(rank(x))) return a rank-0 array according to
      rule 2.a above.

   c. Make x.sum() raise an error for rank-0 arrays and scalars, but
      allow x.sum(axis=()) to return x. This will make numpy sum
      consistent with the built-in sum that does not work on scalars.

3. This is a really small change. Currently

>>> empty(())
array(0)

but

>>> ndarray(())
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: need to give a valid shape as the first argument

I propose to make shape=() valid in the ndarray constructor.
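For example, under proposal 1 the following would hold (a sketch of the
*proposed* behavior, not of what numpy currently does):

>>> y = ones(2)
>>> type(y[1]) is ndarray        # no ellipsis: indexing, an array scalar
False
>>> type(y[1, ...]) is ndarray   # ellipsis present: projection,
True                             # a zero-rank array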
From tim.hochberg at cox.net  Sat Feb 18 17:19:03 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Sat Feb 18 17:19:03 2006
Subject: [Numpy-discussion] optimizing power() for complex and real cases
In-Reply-To: 
References: <43F0DBA3.9010405@cox.net> <43F11086.308@cox.net>
	<43F11716.9050204@cox.net> <43F13B20.3000301@cox.net>
	<43F13DE0.8040309@bigpond.net.au> <43F167C0.7040806@cox.net>
	<43F20BFE.5030100@cox.net> <43F36CB6.5050004@cox.net>
	<43F3B8A9.3000507@bigpond.net.au> <43F3CA12.4000907@cox.net>
	<43F62069.80209@cox.net>
Message-ID: <43F7C73B.2000806@cox.net>

OK, I now have a fairly clean implementation in C of:

def __pow__(self, p):
    if p is not a scalar:
        return power(self, p)
    elif p == 1:
        return self
    elif p == 2:
        return square(self)
#    elif p == 3:
#        return cube(self)
#    elif p == 4:
#        return power_4(self)
#    elif p == 0:
#        return ones(self.shape, dtype=self.dtype)
#    elif p == -1:
#        return 1.0/self
    elif p == 0.5:
        return sqrt(self)

First a couple of technical questions, then on to the philosophical
portion of this note.

1. Is there a nice fast way to get a matrix filled with ones from C?
I've been tempted to write a ufunc 'ones_like', but I'm afraid that
might be considered inappropriate.

2. Are people aware that array_power is sometimes passed non-arrays as
its first argument? Despite having the signature:

array_power(PyArrayObject *a1, PyObject *o2)

This caused me almost no end of headaches, not to mention crashes during
numpy.test().

I'll check this into the power_optimization branch RSN, hopefully with a
fix for the zero power case. Possibly also after extending it to inplace
power as well.

OK, now on to more important stuff. As I've been playing with this my
opinion has gone in circles a couple of times. I now think the issue of
optimizing integer powers of complex numbers and integer powers of
floats are almost completely different. Because complex powers are quite
slow and relatively inaccurate, it is appropriate to optimize them for
integer powers at the level of nc_pow. This should be just a matter of
liberal borrowing from complexobject.c, but I haven't tried it yet.

On the other hand, real powers are fast enough that doing anything at
the single element level is unlikely to help. So in that case we're left
with either optimizing the cases where the dimension is zero as David
has done, or optimizing at the __pow__ (AKA array_power) level as I've
done now based on David's original suggestion. This second approach is
faster because it avoids the mysterious python scalar -> zero-D array
conversion overhead. However, it suffers if we want to optimize lots of
different powers since one needs a ufunc for each one. So the question
becomes, which powers should we optimize?

My latest thinking on this is that we should optimize only those cases
where the optimized result is no less accurate than that produced by
pow. I'm going to assume that all C operations are equivalently
accurate, so pow(x,2) has roughly the same amount of error as x*x.
(Something on the order of 0.5 ULP I'd guess). In that case:

    pow(x, -1)  -> 1 / x
    pow(x, 0)   -> 1
    pow(x, 0.5) -> sqrt(x)
    pow(x, 1)   -> x
    pow(x, 2)   -> x*x

can all be implemented in terms of multiply or divide with the same
accuracy as the original power methods. Once we get beyond these, the
error will go up progressively.

The minimal set described above seems like it should be relatively
uncontroversial and it's what I favor.
Once we get beyond this basic set, we would need to reach some sort of
consensus on how much additional error we are willing to tolerate for
optimizing these extra cases. You'll notice that I've changed my mind,
yet again, over whether to optimize A**0.5. Since the set of additional
ufuncs needed in this case is relatively small, just square and inverse
(==1/x), this minimal set works well if optimizing in pow as I've done.

That's the state of my thinking on this at this exact moment. I'd
appreciate any comments and suggestions you might have.

From oliphant.travis at ieee.org  Sat Feb 18 17:21:10 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Sat Feb 18 17:21:10 2006
Subject: [Numpy-discussion] storage for records
In-Reply-To: <20060217152207.GA971@sun.ac.za>
References: <20060216192556.GA20396@alpha> <43F4E947.9070409@ieee.org>
	<20060217152207.GA971@sun.ac.za>
Message-ID: <43F7C7F2.4050600@ieee.org>

Stefan van der Walt wrote:

>I am probably trying to do something silly, but still:
>
>In [1]: import numpy as N
>
>In [2]: N.__version__
>Out[2]: '0.9.6.2127'
>
>In [3]: P = N.array(N.zeros((2,2)), N.dtype((('f4',3), {'names': ['x','y','z'], 'formats': ['f4','f4','f4']})))
>*** glibc detected *** malloc(): memory corruption: 0x0830bb48 ***
>Aborted
>
>Regards
>Stéfan

This code found a bug that's been there for a while in the
PyArray_CastTo code (only seen on multiple copies) which is being done
here as the 2x2 array of zeros is being cast to a 2x2x3 array of
floating-point zeros.

The bug should be fixed in SVN, now.

Despite the use of fields, the base-type is ('f4',3) which is equivalent
to (tack on a 3 to the shape of the array of 'f4'). So, on array
creation the fields will be lost and you will get a 2x2x3 array of
float32. Types like ('f4', 3) are really only meant to be used in
records. If they are used "by themselves" they simply create an array of
larger dimension.

By the way, the N.dtype in the array constructor is unnecessary, as that
is essentially what is done to the second argument anyway.

You can get two different views of the same data (which it seems you are
after) like this:

P = N.zeros((2,2), {'names': ['x','y','z'], 'formats': ['f4','f4','f4']})
Q = P.view(('f4',3))

Then

Q[...,0] = 10
print P['x']

If you want the field to vary in the first dimension, then you really
want a FORTRAN array. So,

P = N.zeros((2,2), {'names': ['x','y','z'], 'formats': ['f4','f4','f4']}, fortran=1)
Q = P.view(('f4',3))

Then

Q[0] = 20
print P['x']

Best,

-Travis

From cjw at sympatico.ca  Sun Feb 19 12:44:08 2006
From: cjw at sympatico.ca (Colin J. Williams)
Date: Sun Feb 19 12:44:08 2006
Subject: [Numpy-discussion] Re: number ranges (was Re: Matlab page on scipy wiki)
In-Reply-To: <1139945386.3346.38.camel@pc1.cole.uklinux.net>
References: <1139945386.3346.38.camel@pc1.cole.uklinux.net>
Message-ID: <43F8D881.5030804@sympatico.ca>

Bryan Cole wrote:

>>First, I think the range() function in python is ugly to begin with.
>>Why can't python just support range notation directly like 'for a in
>>0:10'. Or with 0..10 or 0...10 syntax. That seems to make a lot more
>>sense to me than having to call a named function. Anyway, that's a
>>python pet peeve, and python's probably not going to change something
>>so fundamental...
>
>There was a python PEP on this. It got rejected as having too many
>'issues'. Pity, in my view.
>
>see http://www.python.org/peps/pep-0204.html
>
>BC

This decision appears to have been made nearly six years ago.
It would be a good idea to revisit the decision, particularly since the
reasons for rejection are not clearly spelled out.

The conditional expression (PEP 308) was rejected but is currently being
implemented in Python version 2.5. It will have the syntax
A if C else B.

I have felt, as Gary Ruben says above, that the range structure is ugly.
Two alternatives have been suggested:

a) a:b:c
b) a..b..c

Do either of these create parsing problems?

    for i in a:b:
        print i

Should this be treated as an error, with a missing c (the increment)?

    for i in 2..5:
        print i

We don't know whether the 2 is an integer until the second period is
scanned.

It would be good if the range and slice could be merged in some way,
although the extended slice is rather complicated - I don't understand
it.

    The semantics for an extended slicing are as follows. The primary
    must evaluate to a mapping object, and it is indexed with a key that
    is constructed from the slice list, as follows. If the slice list
    contains at least one comma, the key is a tuple containing the
    conversion of the slice items; otherwise, the conversion of the lone
    slice item is the key. The conversion of a slice item that is an
    expression is that expression. The conversion of an ellipsis slice
    item is the built-in Ellipsis object. The conversion of a proper
    slice is a slice object (see section 4.2, The standard type
    hierarchy) whose start, stop and step attributes are the values of
    the expressions given as lower bound, upper bound and stride,
    respectively, substituting None for missing expressions.

    [source: http://www.network-theory.co.uk/docs/pylang/ref_60.html]

There seems to be a bit of a problem with slicing that needs sorting
out. The syntax for a slice list appears to allow multiple slices in a
list:

    extended_slicing ::= primary "[" slice_list "]"
    slice_list       ::= slice_item ("," slice_item )* [","]

but the current interpreter reports an error:

>>> a = range(20)
>>> a[slice(3, 9, 2)]
[3, 5, 7]
>>> a[slice(3, 9, 2), slice(5, 10)]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: list indices must be integers
>>>

I have taken the liberty of cross posting this to c.l.p.

Colin W.

From stefan at sun.ac.za  Sun Feb 19 13:37:01 2006
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Sun Feb 19 13:37:01 2006
Subject: [Numpy-discussion] storage for records
In-Reply-To: <43F7C7F2.4050600@ieee.org>
References: <20060216192556.GA20396@alpha> <43F4E947.9070409@ieee.org>
	<20060217152207.GA971@sun.ac.za> <43F7C7F2.4050600@ieee.org>
Message-ID: <20060219213542.GA15643@alpha>

On Sat, Feb 18, 2006 at 06:20:50PM -0700, Travis Oliphant wrote:
> Stefan van der Walt wrote:
>
> >I am probably trying to do something silly, but still:
> >
> >In [1]: import numpy as N
> >
> >In [2]: N.__version__
> >Out[2]: '0.9.6.2127'
> >
> >In [3]: P = N.array(N.zeros((2,2)), N.dtype((('f4',3), {'names':
> >['x','y','z'], 'formats': ['f4','f4','f4']})))
> >*** glibc detected *** malloc(): memory corruption: 0x0830bb48 ***
> >Aborted
> >
> >Regards
> >Stéfan
>
> This code found a bug that's been there for a while in the
> PyArray_CastTo code (only seen on multiple copies) which is being done
> here as the 2x2 array of zeros is being cast to a 2x2x3 array of
> floating-point zeros.
>
> The bug should be fixed in SVN, now.

Thank you very much for fixing this! (It works now).

> Despite the use of fields, the base-type is ('f4',3) which is equivalent
> to (tack on a 3 to the shape of the array of 'f4').
> So, on array creation the fields will be lost and you will get a 2x2x3
> array of float32. Types like ('f4', 3) are really only meant to be used
> in records. If they are used "by themselves" they simply create an
> array of larger dimension.

Exactly what I needed for my application! I'll write this up and put it
on the wiki.

Cheers
Stéfan

From aisaac at american.edu  Sun Feb 19 13:39:01 2006
From: aisaac at american.edu (Alan G Isaac)
Date: Sun Feb 19 13:39:01 2006
Subject: [Numpy-discussion] Re: number ranges (was Re: Matlab page on scipy wiki)
In-Reply-To: <43F8D881.5030804@sympatico.ca>
References: <1139945386.3346.38.camel@pc1.cole.uklinux.net>
	<43F8D881.5030804@sympatico.ca>
Message-ID: 

On Sun, 19 Feb 2006, "Colin J. Williams" apparently wrote:
> The conditional expression (PEP 308) was rejected but is currently being
> implemented in Python version 2.5. It will have the syntax A if C else B.

It's coming: http://www.python.org/peps/pep-0308.html
But in 2.5? http://www.python.org/dev/doc/devel/whatsnew/whatsnew25.html

Thank you,
Alan Isaac

From tim.hochberg at cox.net  Sun Feb 19 14:34:02 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Sun Feb 19 14:34:02 2006
Subject: [Numpy-discussion] complex division
Message-ID: <43F8F202.80500@cox.net>

While rummaging around Python's complexobject.c looking for code to
steal for complex power, I came across the following comment relating to
complex division:

/******************************************************************
This was the original algorithm. It's grossly prone to spurious
overflow and underflow errors. It also merrily divides by 0 despite
checking for that(!). The code still serves a doc purpose here, as
the algorithm following is a simple by-cases transformation of this
one:

Py_complex r;
double d = b.real*b.real + b.imag*b.imag;
if (d == 0.)
        errno = EDOM;
r.real = (a.real*b.real + a.imag*b.imag)/d;
r.imag = (a.imag*b.real - a.real*b.imag)/d;
return r;
******************************************************************/

/* This algorithm is better, and is pretty obvious: first divide the
 * numerators and denominator by whichever of {b.real, b.imag} has
 * larger magnitude. The earliest reference I found was to CACM
 * Algorithm 116 (Complex Division, Robert L. Smith, Stanford
 * University). As usual, though, we're still ignoring all IEEE
 * endcases.
 */

The algorithm shown, and maligned, in this comment is pretty much
exactly what is done in numpy at present. The function goes on to use
the improved algorithm, which I will include at the bottom of the post.
It seems nearly certain that using this algorithm will result in some
speed hit, although I'm not certain how much. I will probably try this
out at some point and see what the speed hit is, but in case I drop the
ball I thought I'd throw this out there as something we should at least
look at. In most cases, I'll take accuracy over raw speed (within
reason).

-tim

Py_complex r;	/* the result */
const double abs_breal = b.real < 0 ? -b.real : b.real;
const double abs_bimag = b.imag < 0 ?
-b.imag : b.imag;

if (abs_breal >= abs_bimag) {
    /* divide tops and bottom by b.real */
    if (abs_breal == 0.0) {
        errno = EDOM;
        r.real = r.imag = 0.0;
    }
    else {
        const double ratio = b.imag / b.real;
        const double denom = b.real + b.imag * ratio;
        r.real = (a.real + a.imag * ratio) / denom;
        r.imag = (a.imag - a.real * ratio) / denom;
    }
}
else {
    /* divide tops and bottom by b.imag */
    const double ratio = b.real / b.imag;
    const double denom = b.real * ratio + b.imag;
    assert(b.imag != 0.0);
    r.real = (a.real * ratio + a.imag) / denom;
    r.imag = (a.imag * ratio - a.real) / denom;
}
return r;

From charlesr.harris at gmail.com  Sun Feb 19 16:13:02 2006
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sun Feb 19 16:13:02 2006
Subject: [Numpy-discussion] complex division
In-Reply-To: <43F8F202.80500@cox.net>
References: <43F8F202.80500@cox.net>
Message-ID: 

Hmm... The new algorithm does look better with respect to overflow and
underflow, but I wonder if it is not a bit of overkill. It seems to me
that the same underflow/overflow problems attend complex multiplication,
which is pretty much all that goes on in the original algorithm. One
thing I do know is that division is expensive. I wonder if one division
and two multiplications might be cheaper than two divisions. I'll have
to check that out.

Chuck

On 2/19/06, Tim Hochberg wrote:
>
> While rummaging around Python's complexobject.c looking for code to
> steal for complex power, I came across the following comment relating
> to complex division:
>
> [...]
From cookedm at physics.mcmaster.ca  Sun Feb 19 16:25:02 2006
From: cookedm at physics.mcmaster.ca (David M. Cooke)
Date: Sun Feb 19 16:25:02 2006
Subject: [Numpy-discussion] complex division
In-Reply-To: <43F8F202.80500@cox.net>
References: <43F8F202.80500@cox.net>
Message-ID: <20060220002319.GA15783@arbutus.physics.mcmaster.ca>

On Sun, Feb 19, 2006 at 03:32:34PM -0700, Tim Hochberg wrote:
>
> While rummaging around Python's complexobject.c looking for code to
> steal for complex power, I came across the following comment relating
> to complex division:
>
> [...]
>
> -tim

The condition for accuracy on this is

    |Z - z| < epsilon |z|

where I'm using Z for the computed value of z = a/b, and epsilon is on
the order of machine accuracy. As pointed out by Stewart (ACM TOMS, v.
11, pg 238 (1985)), this doesn't mean that the real and imaginary
components are accurate. The example he gives is a = 1e70 + 1e-70i and
b = 1e56 + 1e-56i, where z = a/b = 1e14 + 1e-99i, which is susceptible
to underflow for a machine with 10 decimal digits and an exponent range
of +-99.

Priest (ACM TOMS v30, pg 389 (2004)) gives an alternative, which I won't
show here, b/c it does bit-twiddling with the double representation. But
it does a better job of handling overflow and underflow in intermediate
calculations, is competitive in terms of accuracy, and is faster (at
least on a 750 MHz UltraSPARC-III ;) than the other algorithms except
for the textbook version. One problem is that the sample code is for
double precision; for single or longdouble, we'd have to figure out some
magic constants. Maybe I'll look into it later, but for now Smith's
algorithm is better than the textbook one we were using :-)

-- 
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke                      http://arbutus.physics.mcmaster.ca/dmc/
|cookedm at physics.mcmaster.ca

From cookedm at physics.mcmaster.ca  Sun Feb 19 16:39:00 2006
From: cookedm at physics.mcmaster.ca (David M. Cooke)
Date: Sun Feb 19 16:39:00 2006
Subject: [Numpy-discussion] optimizing power() for complex and real cases
In-Reply-To: <43F7C73B.2000806@cox.net>
References: <43F20BFE.5030100@cox.net> <43F36CB6.5050004@cox.net>
	<43F3B8A9.3000507@bigpond.net.au> <43F3CA12.4000907@cox.net>
	<43F62069.80209@cox.net> <43F7C73B.2000806@cox.net>
Message-ID: <20060220003714.GB15783@arbutus.physics.mcmaster.ca>

On Sat, Feb 18, 2006 at 06:17:47PM -0700, Tim Hochberg wrote:
>
> OK, I now have a fairly clean implementation in C of:
>
> def __pow__(self, p):
>     if p is not a scalar:
>         return power(self, p)
>     elif p == 1:
>         return self
>     elif p == 2:
>         return square(self)
> #    elif p == 3:
> #        return cube(self)
> #    elif p == 4:
> #        return power_4(self)
> #    elif p == 0:
> #        return ones(self.shape, dtype=self.dtype)
> #    elif p == -1:
> #        return 1.0/self
>     elif p == 0.5:
>         return sqrt(self)
>
> First a couple of technical questions, then on to the philosophical
> portion of this note.
>
> 1. Is there a nice fast way to get a matrix filled with ones from C?
> I've been tempted to write a ufunc 'ones_like', but I'm afraid that
> might be considered inappropriate.
>
> 2. Are people aware that array_power is sometimes passed non-arrays as
> its first argument? Despite having the signature:
>
> array_power(PyArrayObject *a1, PyObject *o2)
>
> This caused me almost no end of headaches, not to mention crashes
> during numpy.test().

Yes; because it's the implementation of __pow__, the second argument can
be anything.

> I'll check this into the power_optimization branch RSN, hopefully with
> a fix for the zero power case. Possibly also after extending it to
> inplace power as well.
>
> OK, now on to more important stuff. As I've been playing with this my
> opinion has gone in circles a couple of times. I now think the issue of
> optimizing integer powers of complex numbers and integer powers of
> floats are almost completely different. Because complex powers are
> quite slow and relatively inaccurate, it is appropriate to optimize
> them for integer powers at the level of nc_pow. This should be just a
> matter of liberal borrowing from complexobject.c, but I haven't tried
> it yet.

Ok.

> On the other hand, real powers are fast enough that doing anything at
> the single element level is unlikely to help.
> So in that case we're left with either optimizing the cases where the
> dimension is zero as David has done, or optimizing at the __pow__ (AKA
> array_power) level as I've done now based on David's original
> suggestion. This second approach is faster because it avoids the
> mysterious python scalar -> zero-D array conversion overhead. However,
> it suffers if we want to optimize lots of different powers since one
> needs a ufunc for each one. So the question becomes, which powers
> should we optimize?

Hmm, ufuncs are passed a void* argument for passing info to them. Now,
that argument is defined when the ufunc is created, but maybe there's a
way to piggy-back on it.

> My latest thinking on this is that we should optimize only those cases
> where the optimized result is no less accurate than that produced by
> pow. I'm going to assume that all C operations are equivalently
> accurate, so pow(x,2) has roughly the same amount of error as x*x.
> (Something on the order of 0.5 ULP I'd guess). In that case:
>
>     pow(x, -1)  -> 1 / x
>     pow(x, 0)   -> 1
>     pow(x, 0.5) -> sqrt(x)
>     pow(x, 1)   -> x
>     pow(x, 2)   -> x*x
>
> can all be implemented in terms of multiply or divide with the same
> accuracy as the original power methods. Once we get beyond these, the
> error will go up progressively.
>
> The minimal set described above seems like it should be relatively
> uncontroversial and it's what I favor. Once we get beyond this basic
> set, we would need to reach some sort of consensus on how much
> additional error we are willing to tolerate for optimizing these extra
> cases. You'll notice that I've changed my mind, yet again, over whether
> to optimize A**0.5. Since the set of additional ufuncs needed in this
> case is relatively small, just square and inverse (==1/x), this minimal
> set works well if optimizing in pow as I've done.

Ok. I'm still not happy with the speed of pow(), though. I'll have to
sit and look at it. We may be able to optimize integer powers better.

And there's another area: integer powers of integers. Right now that
uses pow(), whereas we might be able to do better. I'm looking into
that. A framework for that could be helpful for the complex powers too.

Too bad we couldn't make a function generator :-) [Well, we could using
weave...]

-- 
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke                      http://arbutus.physics.mcmaster.ca/dmc/
|cookedm at physics.mcmaster.ca

From cjw at sympatico.ca  Sun Feb 19 17:46:03 2006
From: cjw at sympatico.ca (Colin J. Williams)
Date: Sun Feb 19 17:46:03 2006
Subject: [Numpy-discussion] number ranges
In-Reply-To: 
References: <1139945386.3346.38.camel@pc1.cole.uklinux.net>
	<43F8D881.5030804@sympatico.ca>
Message-ID: <43F91F2D.6070208@sympatico.ca>

An HTML attachment was scrubbed...

From mahmood.tariq at gmail.com  Sun Feb 19 17:58:03 2006
From: mahmood.tariq at gmail.com (Tariq Mahmood)
Date: Sun Feb 19 17:58:03 2006
Subject: [Numpy-discussion] numpy 0.9.4 on cygwin
Message-ID: 

Hi,

Has anyone been successful at installing numpy 0.9.4 on cygwin?

Details:
os.name: posix
uname -a: CYGWIN_NT-5.1
sys.platform: cygwin
sys.version: 2.4.1
numpy.version: 0.9.4

Steps taken:
1. unpacked numpy-0.9.4.tar.gz
2. changed to numpy-0.9.4 directory
3. python setup.py install

Major error messages while compiling C source:
1. undefined reference to `_feclearexcept' in umathmodule
2.
Command "gcc -shared -Wl,--enable-auto-image-base build/temp.cygwin-1.5.18-i686-2.4/build/src/numpy/core/src/umathmodule.o -L/usr/lib/python2.4/config -lpython2.4 -o build/lib.cygwin-1.5.18-i686-2.4/numpy/core/umath.dll" failed with exit status 1 Other information that might be useful: 1. successfully installed both Numeric 24.2 and numarray 1.5.0 Any help would be appreciated. Tariq From rhl at astro.princeton.edu Sun Feb 19 18:14:01 2006 From: rhl at astro.princeton.edu (Robert Lupton) Date: Sun Feb 19 18:14:01 2006 Subject: [Numpy-discussion] Multiple inheritance from ndarray Message-ID: <5809AC56-B2DF-4403-B7BC-9AEEAAC78505@astro.princeton.edu> I have a swig extension that defines a class that inherits from both a personal C-coded image struct (actImage), and also from Numeric's UserArray. This works very nicely, but I thought that it was about time to upgrade to numpy. The code looks like: from UserArray import * class Image(UserArray, actImage): def __init__(self, *args): actImage.__init__(self, *args) UserArray.__init__(self, self.getArray(), 'd', copy=False, savespace=False) I can't figure out how to convert this to use ndarray, as ndarray doesn't seem to have an __init__ method, merely a __new__. So what's the approved numpy way to handle multiple inheritance? I've a nasty idea that this is a python question that I should know the answer to, but I'm afraid that I don't... R From tim.hochberg at cox.net Sun Feb 19 19:36:12 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Sun Feb 19 19:36:12 2006 Subject: [Numpy-discussion] optimizing power() for complex and real cases In-Reply-To: <20060220003714.GB15783@arbutus.physics.mcmaster.ca> References: <43F20BFE.5030100@cox.net> <43F36CB6.5050004@cox.net> <43F3B8A9.3000507@bigpond.net.au> <43F3CA12.4000907@cox.net> <43F62069.80209@cox.net> <43F7C73B.2000806@cox.net> <20060220003714.GB15783@arbutus.physics.mcmaster.ca> Message-ID: <43F938FA.80200@cox.net> David M. Cooke wrote: >On Sat, Feb 18, 2006 at 06:17:47PM -0700, Tim Hochberg wrote: > > >>OK, I now have a faily clean implementation in C of: >> >>def __pow__(self, p): >> if p is not a scalar: >> return power(self, p) >> elif p == 1: >> return p >> elif p == 2: >> return square(self) >># elif p == 3: >># return cube(self) >># elif p == 4: >># return power_4(self) >># elif p == 0: >># return ones(self.shape, dtype=self.dtype) >># elif p == -1: >># return 1.0/self >> elif p == 0.5: >> return sqrt(self) >> >> >>First a couple of technical questions, then on to the philosophical portion >>of this note. >> >>1. Is there a nice fast way to get a matrix filled with ones from C. I've >>been tempted to write a ufunc 'ones_like', but I'm afraid that might be >>considered inappropriate. >> >>2. Are people aware that array_power is sometimes passed non arrays as its >>first argument? Despite having the signature: >> >>array_power(PyArrayObject *a1, PyObject *o2) >> >>This caused me almost no end of headaches, not to mention crashes during >>numpy.test(). >> >> > >Yes; because it's the implementation of __pow__, the second argument can >be anything. > > No, you misunderstand.. What I was talking about was that the *first* argument can also be something that's not a PyArrayObject, despite the functions signature. > > >>I'll check this into the power_optimization branch RSN, hopefully with a >>fix for the zero power case. Possibly also after extending it to inplace >>power as well. >> >> >>OK, now on to more important stuff. 
>>As I've been playing with this my opinion has gone in circles a couple
>>of times. I now think the issue of optimizing integer powers of complex
>>numbers and integer powers of floats are almost completely different.
>>Because complex powers are quite slow and relatively inaccurate, it is
>>appropriate to optimize them for integer powers at the level of nc_pow.
>>This should be just a matter of liberal borrowing from complexobject.c,
>>but I haven't tried it yet.
>
>Ok.

>>On the other hand, real powers are fast enough that doing anything at
>>the single element level is unlikely to help. So in that case we're
>>left with either optimizing the cases where the dimension is zero as
>>David has done, or optimizing at the __pow__ (AKA array_power) level as
>>I've done now based on David's original suggestion. This second
>>approach is faster because it avoids the mysterious python scalar ->
>>zero-D array conversion overhead. However, it suffers if we want to
>>optimize lots of different powers since one needs a ufunc for each one.
>>So the question becomes, which powers should we optimize?
>
>Hmm, ufuncs are passed a void* argument for passing info to them. Now,
>that argument is defined when the ufunc is created, but maybe there's a
>way to piggy-back on it.

Yeah, I really felt like I was fighting the ufuncs when I was playing
with this. On the one hand, you really want to use the ufunc machinery.
On the other hand that forces you into using the same types for both
arguments. That really wouldn't be a problem, since we could just define
an integer_power that took doubles, but did integer powers, except for
the conversion overhead of Python_Integers into arrays. It looks like
you started down this road and I played with this as well. I can think
of at least one (horrible) way around the matrix overhead, but the real
fix would be to dig into PyArray_EnsureArray and see why it's slow for
Python_Ints. It is much faster for numpy scalars.

Another approach is to actually compute (x*x)*(x*x) for pow(x,4) at the
level of array_power. I think I could make this work. It would probably
work well for medium size arrays, but might well make things worse for
large arrays that are limited by memory bandwidth since it would need to
move the array from memory into the cache multiple times.

>>My latest thinking on this is that we should optimize only those cases
>>where the optimized result is no less accurate than that produced by
>>pow. [...] You'll notice that I've changed my mind, yet again, over
>>whether to optimize A**0.5. Since the set of additional ufuncs needed
>>in this case is relatively small, just square and inverse (==1/x), this
>>minimal set works well if optimizing in pow as I've done.
Just to add a little more confusion to the mix: I did a little testing
to see how close pow(x,n) and x*x*... actually are. They are slightly
less close for small values of N and slightly closer for large values of
N than I would have expected. The upshot of this is that integer powers
between -2 and +4 all seem to vary by the same amount when computed
using pow(x,n) versus multiplies. I'm including the test code at the
end. Assuming that this result is not a fluke, that expands the
noncontroversial set by at least 3 more values. That's starting to
strain the ufunc approach, so perhaps optimizing in @TYP@_power is the
way to go after all. Or, more likely, adding @TYP@_int_power or maybe
@TYP@_fast_power (so as to be able to include some half integer powers)
and dispatching appropriately from array_power.

The problem here, of course, is the overhead that PyArray_EnsureArray
runs into. I'm not sure if the ufuncs actually call that, but I was
using that to convert things to arrays at one point and I saw the
slowdown, so I suspect that the slowdown is in something
PyArray_EnsureArray calls if not in that routine itself. I'm afraid to
dig into that stuff though. On the other hand, it would probably speed
up all kinds of stuff if that was sped up.

>Ok. I'm still not happy with the speed of pow(), though. I'll have to
>sit and look at it. We may be able to optimize integer powers better.
>And there's another area: integer powers of integers. Right now that
>uses pow(), whereas we might be able to do better.

So that's why that's so slow. I assumed it was doing some sort of
successive multiplication. For this, the code that complexobject uses
for integer powers might be helpful.

>I'm looking into that. A framework for that could be helpful for the
>complex powers too.
>
>Too bad we couldn't make a function generator :-) [Well, we could using
>weave...]

Yaigh!

-tim

def check(n=100000):
    import math
    sqrt = math.sqrt
    failures = {}
    for x in [math.pi, math.e, 1.1] + [1.0 + 1.0/y for y in range(1, 1+n)]:
        for e, expr in [
                    (-5, "1/((x*x)*(x*x)*x)"),
                    (-4, "1/((x*x)*(x*x))"),
                    (-3, "1/((x*x)*x)"),
                    (1, "x"),
                    (-2, "1/(x*x)"),
                    (-1, "1/x"),
                    (0, "1"),
                    (1, "x"),
                    (2, "x*x"),
                    (3, "x*x*x"),
                    (4, "(x*x)*(x*x)"),
                    (4, "x*x*x*x"),
                    (5, "(x*x)*(x*x)*x"),
                    (6, "(x*x)*(x*x)*(x*x)"),
                    (7, "(x*x)*(x*x)*(x*x)*x"),
                    (8, "((x*x)*(x*x))*((x*x)*(x*x))"),
                    (-1.5, "1/(sqrt(x)*x)"),
                    (-0.5, "1/sqrt(x)"),
                    (0.5, "sqrt(x)"),
                    (1.5, "x*sqrt(x)")]:
            delta = abs(pow(x, e) - eval(expr, locals())) / pow(x, e)
            if delta:
                key = (e, expr)
                if key not in failures:
                    failures[key] = [(delta, x)]
                failures[key].append((delta, x))
    for key in sorted(failures.keys()):
        e, expr = key
        fstring = ', '.join(str(x) for x in list(reversed(sorted(failures[key])))[:1])
        if len(failures[key]) > 1:
            fstring += ', ...'
        print "Failures for x**%s (%s): %s" % (e, expr, fstring)

From ndarray at mac.com  Sun Feb 19 19:38:02 2006
From: ndarray at mac.com (Sasha)
Date: Sun Feb 19 19:38:02 2006
Subject: [Numpy-discussion] What is the status of the multidimensional
	arrays PEP?
Message-ID: 

What is the status of the multidimensional arrays PEP? It seems to me
that there is one part of the PEP that can be easily separated into a
rather uncontroversial PEP. This is the part that defines the array
protocol.

Python already has a (1-dimensional) array object in the standard
library. Python array already supports the buffer protocol, and it looks
like implementing the full array protocol is straightforward.
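For instance, a minimal wrapper might look like this (a sketch using the
__array_interface__ form of the protocol; the typestr below assumes a
little-endian platform where typecode 'i' is a 4-byte int):

import array

class ProtocolArray(object):
    """Wrap a stdlib 1-d array and expose the array protocol (sketch)."""
    def __init__(self, values):
        self._data = array.array('i', values)
    def __get_interface(self):
        return {'version': 3,
                'shape': (len(self._data),),
                'typestr': '<i4',    # little-endian 4-byte int (assumed)
                'data': self._data}  # any buffer-providing object works
    __array_interface__ = property(__get_interface)

# A consumer that understands the protocol can then do, e.g.:
#     numpy.asarray(ProtocolArray([1, 2, 3]))  ->  array([1, 2, 3])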
I believe that having an object that supports the array protocol even
without multiple dimensions will be immediately useful.

From wbaxter at gmail.com  Sun Feb 19 22:13:05 2006
From: wbaxter at gmail.com (Bill Baxter)
Date: Sun Feb 19 22:13:05 2006
Subject: [Numpy-discussion] Some missing linalg things (wanted: LU
	decomposition)
Message-ID: 

This url http://www.rexx.com/~dkuhlman/scipy_course_01.html seems to
keep turning up in my searches for numpy and scipy things, but many of
the linalg operations it lists don't seem to exist in recent versions of
numpy (or scipy).

Some of them are:

* norm
* factorizations: lu, lu_factor, lu_solve, qr
* iterative solvers: cg, cgs, gmres etc.

Did these things use to exist in Numeric but they haven't been ported
over? Will they be re-introduced sometime?

In the short term, the one I'm after right now is LU decompose and solve
functionality. Anyone have a numpy implementation?

--Bill Baxter

From wbaxter at gmail.com  Sun Feb 19 22:37:02 2006
From: wbaxter at gmail.com (Bill Baxter)
Date: Sun Feb 19 22:37:02 2006
Subject: [Numpy-discussion] Re: Some missing linalg things (wanted: LU
	decomposition)
In-Reply-To: 
References: 
Message-ID: 

Upon further inspection I find that if I call 'from scipy import *'
then linalg.lu etc are defined. But if I do anything else to import
scipy, like 'import scipy' or 'import scipy as S' or 'from scipy import
linalg', then lu, cg etc are not defined.

Why is that?

I can get at them without importing * by doing 'from scipy.linalg import
lu', but it's kind of odd to have to do that.

--bb

On 2/20/06, Bill Baxter wrote:
> [...]

-- 
William V. Baxter III
OLM Digital
Kono Dens Building Rm 302
1-8-8 Wakabayashi Setagaya-ku
Tokyo, Japan  154-0023
+81 (3) 3422-3380

From wbaxter at gmail.com  Sun Feb 19 22:49:05 2006
From: wbaxter at gmail.com (Bill Baxter)
Date: Sun Feb 19 22:49:05 2006
Subject: [Numpy-discussion] Re: Some missing linalg things (wanted: LU
	decomposition)
In-Reply-To: 
References: 
Message-ID: 

Ack. I may be able to get references to lu, lu_factor, et al, but they
don't actually work with numpy arrays:

from scipy.linalg import lu, lu_factor, lu_solve
import scipy as S
A = S.rand(2,2)
lu(A)

Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "C:\Python24\Lib\site-packages\scipy\linalg\decomp.py", line 249, in lu
    flu, = get_flinalg_funcs(('lu',),(a1,))
  File "C:\Python24\Lib\site-packages\scipy\linalg\flinalg.py", line 30, in get_flinalg_funcs
    t = arrays[i].dtypechar
AttributeError: 'numpy.ndarray' object has no attribute 'dtypechar'

Ok, so, once again, does anyone have an lu_factor / lu_solve
implementation in python that I could borrow?

Apologies for the monologue.
--bb On 2/20/06, Bill Baxter wrote: > > Upon further inspection I find that if I call 'from scipy import *' then > linalg.lu etc are defined. > But if I do anything else to import scipy, like 'import scipy' or 'import > scipy as S' or 'from scipy import linalg', then lu, cg etc are not defined. > > > Why is that? > > I can get at them without importing * by doing 'from scipy.linalg import > lu', but it's kind of odd to have to do that. > > --bb > > On 2/20/06, Bill Baxter wrote: > > > > This url http://www.rexx.com/~dkuhlman/scipy_course_01.html seems to keep turning up in my searches for numpy and scipy things, > > but many of the linalg operations it lists don't seem to exist in recent > > versions of numpy (or scipy). > > > > Some of them are: > > > > * norm > > * factorizations: lu, lu_factor, lu_solve, qr > > * iterative solvers: cg, cgs, gmres etc. > > > > Did these things use to exist in Numeric but haven't been ported > > over? Will they be re-introduced sometime? > > > > In the short term, the one I'm after right now is LU decompose and solve > > functionality. Anyone have a numpy implementation? > > > > --Bill Baxter > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wbaxter at gmail.com Sun Feb 19 23:38:08 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Sun Feb 19 23:38:08 2006 Subject: [Numpy-discussion] Re: Some missing linalg things (wanted: LU decomposition) In-Reply-To: <43F96736.3020003@mecha.uni-stuttgart.de> References: <43F96736.3020003@mecha.uni-stuttgart.de> Message-ID: Should have mentioned -- I was using numpy 0.9.4 / scipy 0.4.4. Looks like it works in numpy 0.9.5 / scipy 0.4.6. But matplotlib, which I also need, hasn't been updated for numpy 0.9.5 yet. :-( It's also still pretty weird to me that you have to do "from scipy.linalg import lu" specifically. And then, after doing that one import, all the other scipy.linalg.* functions magically spring into existence too. Is that sort of thing expected behavior from Python imports?

>>> import numpy as N
>>> import scipy as S
>>> S.linalg.lu
Traceback (most recent call last):
  File "", line 1, in ?
AttributeError: 'module' object has no attribute 'lu'
>>> from scipy.linalg import lu
>>> S.linalg.lu(N.rand(2,2))
(array([[ 0., 1.], [ 1., 0.]]),
 array([[ 1. , 0. ], [ 0.18553085, 1. ]]),
 array([[ 0.71732168, 0.48540043], [ 0. , 0.61379118]]))
>>> (N.__version__, S.__version__)
('0.9.5', '0.4.6')

--bb On 2/20/06, Nils Wagner wrote: > > Bill Baxter wrote: > > Ack. I may be able to get references to lu, lu_factor, et al, but > > they don't actually work with numpy arrays: > > > > from scipy.linalg import lu,lu_factor,lu_solve > > import scipy as S > > A = S.rand(2,2) > > lu(A) > > Traceback (most recent call last): > > File "", line 1, in ? > > File "C:\Python24\Lib\site-packages\scipy\linalg\decomp.py", line > > 249, in lu > > flu, = get_flinalg_funcs(('lu',),(a1,)) > > File "C:\Python24\Lib\site-packages\scipy\linalg\flinalg.py", line > > 30, in get_flinalg_funcs > > t = arrays[i].dtypechar > > AttributeError: 'numpy.ndarray' object has no attribute 'dtypechar' > > > > > > Ok, so, once again, does anyone have an lu_factor / lu_solve > > implementation in python that I could borrow? > > > > Apologies for the monologue. > > > > --bb > > > > > > On 2/20/06, *Bill Baxter* > > wrote: > > > > Upon further inspection I find that if I call 'from scipy import > > *' then linalg.lu etc are defined.
> > But if I do anything else to import scipy like 'import scipy' or > > 'import scipy as S' or 'from scipy import linalg', then lu, cg etc > > are not defined. > > > > Why is that? > > > > I can get at them without importing * by doing 'from scipy.linalg > > import lu', but it's kind of odd to have to do that. > > > > --bb > > > > > > On 2/20/06, * Bill Baxter* > > wrote: > > > > This url http://www.rexx.com/~dkuhlman/scipy_course_01.html > > seems > > to keep turning up in my searches for numpy and scipy things, > > but many of the linalg operations it lists don't seem to exist > > in recent versions of numpy (or scipy). > > > > Some of them are: > > > > * norm > > * factorizations: lu, lu_factor, lu_solve, qr > > * iterative solvers: cg, cgs, gmres etc. > > > > Did these things use to exist in Numeric but haven't > > been ported over? Will they be re-introduced sometime? > > > > In the short term, the one I'm after right now is LU decompose > > and solve functionality. Anyone have a numpy implementation? > > > > --Bill Baxter > > > No problem here.

> >>> from scipy.linalg import lu,lu_factor,lu_solve
> >>> import scipy as S
> >>> A = S.rand(2,2)
> >>> lu(A)
> (array([[ 0., 1.], [ 1., 0.]]),
>  array([[ 1. , 0. ], [ 0.81367315, 1. ]]),
>  array([[ 0.49886054, 0.57065709], [ 0. , -0.30862809]]))
> >>> S.__version__
> '0.4.7.1614'

> Nils > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josegomez at gmx.net Mon Feb 20 00:19:01 2006 From: josegomez at gmx.net (Jose Luis Gomez Dans) Date: Mon Feb 20 00:19:01 2006 Subject: [Numpy-discussion] Problems compiling on Cygwin References: <43F5B856.5060301@ieee.org> Message-ID: <18838.1140423471@www073.gmx.net> Hi Travis, > I looked into how people at cygwin ports got the IEEE math stuff done. > They borrowed it from BSD basically. So, I've taken their patch and > placed it in the main tree. > > Jose, could you check out the latest SVN version of numpy and try to > build and install it on cygwin to see if I made the right changes? I can't access remote SVN/CVS servers from work, but I will download it at home and try it tonight. Many thanks! Jose -- Fancy earning a few euros on the side? No costs, no risk! Generous commissions for GMX partners: http://www.gmx.net/de/go/partner From curzio.basso at gmail.com Mon Feb 20 08:49:23 2006 From: curzio.basso at gmail.com (Curzio Basso) Date: Mon Feb 20 08:49:23 2006 Subject: [Numpy-discussion] [nd_image] histograms of RGB images Message-ID: Hello everybody! I was wondering if someone already had the problem of computing histograms of RGB images for all channels simultaneously (that is getting a rank-3 array) rather than on the three channels separately. Just looking for a way to avoid writing the C function :-) cheers curzio From fonnesbeck at gmail.com Mon Feb 20 11:28:06 2006 From: fonnesbeck at gmail.com (Chris Fonnesbeck) Date: Mon Feb 20 11:28:06 2006 Subject: [Numpy-discussion] selecting random array element Message-ID: <723eb6930602201101v2b8ebf9axd5801fe8b60998b3@mail.gmail.com> What is the best way to select a random element from a numpy array? I know I could index by a random integer, but was wondering if there was a built-in method or function.
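For reference, the random-integer indexing I mentioned looks like this (it works, it just feels like something that should be built in):

import numpy

a = numpy.arange(10.0)
i = numpy.random.randint(a.size)  # uniform over 0 .. a.size-1
elem = a.flat[i]                  # .flat makes it work for any shape

Thanks, C.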
-- Chris Fonnesbeck + Atlanta, GA + http://trichech.us From aisaac at american.edu Mon Feb 20 11:44:02 2006 From: aisaac at american.edu (Alan G Isaac) Date: Mon Feb 20 11:44:02 2006 Subject: [Numpy-discussion] selecting random array element In-Reply-To: <723eb6930602201101v2b8ebf9axd5801fe8b60998b3@mail.gmail.com> References: <723eb6930602201101v2b8ebf9axd5801fe8b60998b3@mail.gmail.com> Message-ID: At http://www.american.edu/econ/pytrix/pytrix.py find def permute(x): '''Return a permutation of a sequence or array. :note: Also consider numpy.random.shuffle (to permute *inplace* 1-d arrays) ''' x = numpy.asarray(x) xshape = x.shape pidx = numpy.random.random(x.size).argsort() return x.flat[pidx].reshape(xshape) Note the note. ;-) Cheers, Alan Isaac From strawman at astraw.com Mon Feb 20 11:59:02 2006 From: strawman at astraw.com (Andrew Straw) Date: Mon Feb 20 11:59:02 2006 Subject: [Numpy-discussion] [nd_image] histograms of RGB images In-Reply-To: References: Message-ID: <43FA1F58.1030204@astraw.com> See this example. You don't really need pylab/matplotlib -- you could just use numpy.histogram. import pylab import matplotlib.numerix as nx import Image im = Image.open('data/lena.jpg') imbuf = im.tostring('raw','RGB',0,-1) imnx = nx.fromstring(imbuf,nx.UInt8) imnx.shape = im.size[1], im.size[0], 3 bins = nx.arange(0,256) pylab.hist( nx.ravel(imnx[:,:,0]), bins=bins, facecolor='r', edgecolor='r' ) pylab.hist( nx.ravel(imnx[:,:,1]), bins=bins, facecolor='g', edgecolor='g' ) pylab.hist( nx.ravel(imnx[:,:,2]), bins=bins, facecolor='b', edgecolor='b' ) pylab.show() Curzio Basso wrote: >Hello everybody! > >I was wondering if someone already had the problem of computing >histograms of RGB images for all channels simultaneously (that is >getting a rank-3 array) rather than on the three channels separately. >Just looking for a way to avoid writing the C function :-) > >cheers >curzio > > >------------------------------------------------------- >This SF.net email is sponsored by: Splunk Inc. Do you grep through log files >for problems? Stop! Download the new AJAX search engine that makes >searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! >http://sel.as-us.falkag.net/sel?cmd=k&kid3432&bid#0486&dat1642 >_______________________________________________ >Numpy-discussion mailing list >Numpy-discussion at lists.sourceforge.net >https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > From robert.kern at gmail.com Mon Feb 20 14:42:06 2006 From: robert.kern at gmail.com (Robert Kern) Date: Mon Feb 20 14:42:06 2006 Subject: [Numpy-discussion] Re: selecting random array element In-Reply-To: References: <723eb6930602201101v2b8ebf9axd5801fe8b60998b3@mail.gmail.com> Message-ID: Alan G Isaac wrote: > At http://www.american.edu/econ/pytrix/pytrix.py find > def permute(x): > '''Return a permutation of a sequence or array. > > :note: Also consider numpy.random.shuffle > (to permute *inplace* 1-d arrays) > ''' > x = numpy.asarray(x) > xshape = x.shape > pidx = numpy.random.random(x.size).argsort() > return x.flat[pidx].reshape(xshape) You may want to consider numpy.random.permutation() In [22]: numpy.random.permutation? Type: builtin_function_or_method Base Class: String Form: Namespace: Interactive Docstring: Given an integer, return a shuffled sequence of integers >= 0 and < x; given a sequence, return a shuffled array copy. permutation(x) -- Robert Kern robert.kern at gmail.com "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." 
-- Richard Harter From robert.kern at gmail.com Mon Feb 20 15:47:01 2006 From: robert.kern at gmail.com (Robert Kern) Date: Mon Feb 20 15:47:01 2006 Subject: [Numpy-discussion] Re: selecting random array element In-Reply-To: <723eb6930602201101v2b8ebf9axd5801fe8b60998b3@mail.gmail.com> References: <723eb6930602201101v2b8ebf9axd5801fe8b60998b3@mail.gmail.com> Message-ID: Chris Fonnesbeck wrote: > What is the best way to select a random element from a numpy array? I > know I could index by a random integer, but was wondering if there was > a built-in method or function. Generating a random index is what I do. I think there's certainly room for a RandomState.choice() method. I think something like this covers most of the use cases: import numpy from numpy import random def choice(x, axis=None): """Select an element or subarray uniformly randomly. If axis is None, then a single element is chosen from the entire array. Otherwise, a subarray is chosen from the given axis. """ x = numpy.asarray(x) if axis is None: length = numpy.multiply.reduce(x.shape) n = random.randint(length) return x.flat[n] else: n = random.randint(x.shape[axis]) # I'm sure there's a better way of doing this idx = map(slice, x.shape) idx[axis] = n return x[tuple(idx)] -- Robert Kern robert.kern at gmail.com "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From aisaac at american.edu Mon Feb 20 17:31:13 2006 From: aisaac at american.edu (Alan G Isaac) Date: Mon Feb 20 17:31:13 2006 Subject: [Numpy-discussion] Re: selecting random array element In-Reply-To: References: <723eb6930602201101v2b8ebf9axd5801fe8b60998b3@mail.gmail.com> Message-ID: On Mon, 20 Feb 2006, Robert Kern apparently wrote: > length = numpy.multiply.reduce(x.shape) Can this be different from x.size? Thanks, Alan Isaac From robert.kern at gmail.com Mon Feb 20 17:39:06 2006 From: robert.kern at gmail.com (Robert Kern) Date: Mon Feb 20 17:39:06 2006 Subject: [Numpy-discussion] Re: selecting random array element In-Reply-To: References: <723eb6930602201101v2b8ebf9axd5801fe8b60998b3@mail.gmail.com> Message-ID: Alan G Isaac wrote: > On Mon, 20 Feb 2006, Robert Kern apparently wrote: > >> length = numpy.multiply.reduce(x.shape) > > Can this be different from x.size? No, it's just that old habits die hard. I knew there was a clean way to get that information, I just didn't remember it or bother myself to look for it. -- Robert Kern robert.kern at gmail.com "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From wbaxter at gmail.com Mon Feb 20 17:49:02 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Mon Feb 20 17:49:02 2006 Subject: [Numpy-discussion] Re: Some missing linalg things (wanted: LU decomposition) In-Reply-To: <463e11f90602200851o6885dd64l404eaf028509f544@mail.gmail.com> References: <43F96736.3020003@mecha.uni-stuttgart.de> <463e11f90602200851o6885dd64l404eaf028509f544@mail.gmail.com> Message-ID: On 2/20/06, Bill Baxter wrote: > > Should have mentioned -- I was using numpy 0.9.4 / scipy 0.4.4. > > Looks like it works in numpy 0.9.5 / scipy 0.4.6 > > > > But matplotlib, which I also need, hasn't been updated for numpy 0.9.5yet. > > :-( > > > > On 2/21/06, Jonathan Taylor wrote: > > For matplotlib, I just use tolist() like > > a = array([1,3,2,3]) > > ... > > pylab.plot(a.tolist()) > > Maybe that will work for you until you can fix your problem. > J. Excellent idea! 
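Concretely, the workaround is just this (with pylab being the matplotlib interface, and a any 1-d numpy array):

import numpy
import pylab  # matplotlib's pylab interface

a = numpy.array([1, 3, 2, 3])
pylab.plot(a.tolist())  # hand matplotlib a plain Python list, not an ndarray
pylab.show()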
That does the trick for now (if I take the numerix: numpy line out of my .matplotlibrc to stop it from crashing on import). --bb -------------- next part -------------- An HTML attachment was scrubbed... URL: From aisaac at american.edu Mon Feb 20 17:58:23 2006 From: aisaac at american.edu (Alan G Isaac) Date: Mon Feb 20 17:58:23 2006 Subject: [Numpy-discussion] Re: selecting random array element In-Reply-To: References: <723eb6930602201101v2b8ebf9axd5801fe8b60998b3@mail.gmail.com> Message-ID: On Mon, 20 Feb 2006, Robert Kern apparently wrote: > You may want to consider numpy.random.permutation() Yes indeed. Thanks, Alan From oliphant.travis at ieee.org Mon Feb 20 19:45:04 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Mon Feb 20 19:45:04 2006 Subject: [Numpy-discussion] optimizing power() for complex and real cases In-Reply-To: <43F938FA.80200@cox.net> References: <43F20BFE.5030100@cox.net> <43F36CB6.5050004@cox.net> <43F3B8A9.3000507@bigpond.net.au> <43F3CA12.4000907@cox.net> <43F62069.80209@cox.net> <43F7C73B.2000806@cox.net> <20060220003714.GB15783@arbutus.physics.mcmaster.ca> <43F938FA.80200@cox.net> Message-ID: <43FA8C9C.2020002@ieee.org> Tim Hochberg wrote: >> Hmm, ufuncs are passed a void* argument for passing info to them. Now, >> what that argument is is defined when the ufunc is created, but maybe >> there's a way to piggy-back on it. >> >> > Yeah, I really felt like I was fighting the ufuncs when I was playing > with this. On the one hand, you really want to use the ufunc > machinery. On the other hand that forces you into using the same types > for both arguments. This is not true. Ufuncs can have different types for their arguments. Perhaps you meant something else? > Just to add a little more confusion to the mix. I did a little testing > to see how close pow(x,n) and x*x*... actually are. They are slightly > less close for small values of N and slightly closer for large values > of N than I would have expected. The upshot of this is that integer > powers between -2 and +4 all seem to vary by the same amount when > computed using pow(x,n) versus multiplies. I'm including the test code > at the end. Assuming that this result is not a fluke, that expands the > noncontroversial set by at least 3 more values. That's starting to > strain the ufunc approach, so perhaps optimizing in @TYP@_power is the > way to go after all. Or, more likely, adding @TYP@_int_power or maybe > @TYP@_fast_power (so as to be able to include some half integer > powers) and dispatching appropriately from array_power. > > The problem here, of course, is the overhead that PyArray_EnsureArray > runs into. I'm not sure if the ufuncs actually call that, but I was > using that to convert things to arrays at one point and I saw the > slowdown, so I suspect that the slowdown is in something > PyArray_EnsureArray calls if not in that routine itself. I'm afraid to > dig into that stuff though. On the other hand, it would probably > speed up all kinds of stuff if that was sped up. EnsureArray simply has some short cuts and then calls PyArray_FromAny. PyArray_FromAny is the big array conversion code. It converts anything (it can) to an array. >> Too bad we couldn't make a function generator :-) [Well, we could using >> weave...] >> >> > Yaigh! That's actually an interesting approach that could use some attention.
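For concreteness, the successive multiplication in question is just binary exponentiation. A pure-Python sketch of the idea only -- the real version would be C inside the ufunc inner loop, with the float type spelled out:

def int_power(x, n):
    """x**n for integer n using O(log n) multiplies."""
    if n < 0:
        return 1.0 / int_power(x, -n)
    result = 1.0
    while n:
        if n & 1:         # low bit set: fold the current square into the result
            result *= x
        x *= x            # square
        n >>= 1
    return result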
-Travis From pearu at scipy.org Tue Feb 21 00:55:03 2006 From: pearu at scipy.org (Pearu Peterson) Date: Tue Feb 21 00:55:03 2006 Subject: [Numpy-discussion] comparing container objects with arrays Message-ID: Hi, Question: what is the recommended way to compare two array objects? And when they are contained in a tuple or list or dictionary etc.? I ask because I found that arr1.__eq__(arr2) can return either bool or an array of bools when shape(arr1)!=shape(arr2) or shape(arr1)==shape(arr2), respectively:

>>> array([1,2])==array([1,0,0])
False
>>> array([1,2])==array([1,0])
array([True, False], dtype=bool)

I wonder if numpy users are happy with that? Shouldn't arr1==arr2 return bool as well, since the current __eq__ behaviour is handled by the equal() function when the shapes are equal? Note that if __eq__ always returned bool, then the following code would work as I would expect:

>>> (1,array([1,2]))==(1,array([1,2]))
Traceback (most recent call last):
  File "", line 1, in ?
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
>>> # I would expect True, compare with
>>> (1,[1,2])==(1,[1,2])
True
>>> (1,array([1,2]))==(1,array([1,2,0]))
False

I started to write this message because object1 == object2 returns boolean for (all?) Python builtin objects, but as soon as these objects contain arrays, the test will fail with an exception. Maybe numpy needs equalobjs(obj1,obj2) that always returns boolean and can handle comparing objects like {1:array([1,2])}, [3,[array([2,2])]], etc.
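To be explicit, what I have in mind is something with these semantics (a sketch of the behaviour only, not a proposed implementation):

from numpy import ndarray, asarray

def equalobjs(obj1, obj2):
    """True if two (possibly nested) objects are equal; arrays compared elementwise."""
    if isinstance(obj1, ndarray) or isinstance(obj2, ndarray):
        a, b = asarray(obj1), asarray(obj2)
        return a.shape == b.shape and bool((a == b).all())
    if isinstance(obj1, dict) and isinstance(obj2, dict):
        if len(obj1) != len(obj2):
            return False
        for k in obj1:
            if k not in obj2 or not equalobjs(obj1[k], obj2[k]):
                return False
        return True
    if isinstance(obj1, (list, tuple)) and isinstance(obj2, (list, tuple)):
        if len(obj1) != len(obj2):
            return False
        for a, b in zip(obj1, obj2):
            if not equalobjs(a, b):
                return False
        return True
    return obj1 == obj2

Pearu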
From bblais at bryant.edu Tue Feb 21 04:27:01 2006 From: bblais at bryant.edu (Brian Blais) Date: Tue Feb 21 04:27:01 2006 Subject: [Numpy-discussion] algorithm, optimization, or other problem? Message-ID: <43FB0661.1040202@bryant.edu> Hello, I am trying to translate some Matlab/mex code to Python, for doing neural simulations. This application is definitely computing-time limited, and I need to optimize at least one inner loop of the code, or perhaps even rethink the algorithm. The procedure is very simple, after initializing any variables:

1) select a random input vector, which I will call "x". Right now I have it as an array, and I choose columns from that array randomly. In other cases, I may need to take an image, select a patch, and then make that a column vector.

2) calculate an output value, which is the dot product of the "x" and a weight vector, "w", so

   y = dot(x,w)

3) modify the weight vector based on a matrix equation, like:

   w = w + eta * (y*x - y**2*w)
           ^
           |
           +---- learning rate constant

4) repeat steps 1-3 many times

I've organized it like:

for e in range(100):       # outer loop
    for i in range(1000):  # inner loop
        (steps 1-3)
    display things.

so that the bulk of the computation is in the inner loop, and is amenable to converting to a faster language. This is my issue: straight python, in the example posted below for 250000 inner-loop steps, takes 20 seconds for each outer-loop step.
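(For scale, one pass through steps 1-3 is just the following in numpy notation -- I'm actually on Numeric, but the shapes are the same; a sketch with made-up sizes, eta being the learning-rate constant:

import numpy

numpats = 100
x = numpy.random.random((numpats, 1000))  # 1000 candidate input vectors
w = numpy.random.random(numpats)
eta = 0.001

xx = x[:, numpy.random.randint(1000)]  # step 1: pick a random input vector
y = numpy.dot(xx, w)                   # step 2: output is a scalar
w += eta * (y*xx - y*y*w)              # step 3: weight update

so there is very little actual arithmetic per step.)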
I tried Pyrex, which should work very fast on such a problem, takes about 8.5 seconds per outer-loop step. The same code as a C-mex file in matlab takes 1.5 seconds per outer-loop step. Given the huge difference between the Pyrex and the Mex, I feel that there is something I am doing wrong, because the C-code for both should run comparably. Perhaps the approach is wrong? I'm willing to take any suggestions! I don't mind coding some in C, but the Python API seemed a bit challenging to me. One note: I am using the Numeric package, not numpy, only because I want to be able to use the Enthought version for Windows. I develop on Linux, and haven't had a chance to see if I can compile numpy using the Enthought Python for Windows. If there is anything else anyone needs to know, I'll post it. I put the main script, and a dohebb.pyx code below. thanks! Brian Blais -- ----------------- bblais at bryant.edu http://web.bryant.edu/~bblais # Main script: from dohebb import * import pylab as p from Numeric import * from RandomArray import * import time x=random((100,1000)) # 1000 input vectors numpats=x.shape[0] w=random((numpats,1)); th=random((1,1)) params={} params['eta']=0.001; params['tau']=100.0; old_mx=0; for e in range(100): rnd=randint(0,numpats,250000) t1=time.time() if 0: # straight python for i in range(len(rnd)): pat=rnd[i] xx=reshape(x[:,pat],(1,-1)) y=matrixmultiply(xx,w) w=w+params['eta']*(y*transpose(xx)-y**2*w); th=th+(1.0/params['tau'])*(y**2-th); else: # pyrex dohebb(params,w,th,x,rnd) print time.time()-t1 p.plot(w,'o-') p.xlabel('weights') p.show() #============================================= # dohebb.pyx cdef extern from "Numeric/arrayobject.h": struct PyArray_Descr: int type_num, elsize char type ctypedef class Numeric.ArrayType [object PyArrayObject]: cdef char *data cdef int nd cdef int *dimensions, *strides cdef object base cdef PyArray_Descr *descr cdef int flags def dohebb(params,ArrayType w,ArrayType th,ArrayType X,ArrayType rnd): cdef int num_iterations cdef int num_inputs cdef int offset cdef double *wp,*xp,*thp cdef int *rndp cdef double eta,tau eta=params['eta'] # learning rate tau=params['tau'] # used for variance estimate cdef double y num_iterations=rnd.dimensions[0] num_inputs=w.dimensions[0] # get the pointers wp=w.data xp=X.data rndp=rnd.data thp=th.data for it from 0 <= it < num_iterations: offset=rndp[it]*num_inputs # calculate the output y=0.0 for i from 0 <= i < num_inputs: y=y+wp[i]*xp[i+offset] # change in the weights for i from 0 <= i < num_inputs: wp[i]=wp[i]+eta*(y*xp[i+offset] - y*y*wp[i]) # estimate the variance thp[0]=thp[0]+(1.0/tau)*(y**2-thp[0]) From stefan at sun.ac.za Tue Feb 21 04:29:02 2006 From: stefan at sun.ac.za (Stefan van der Walt) Date: Tue Feb 21 04:29:02 2006 Subject: [Numpy-discussion] wiki page for record arrays Message-ID: <20060221122737.GA14470@alpha> I wrote a short tutorial on using record arrays, which can be found at http://www.scipy.org/ArrayRecords The page is named ArrayRecords instead of RecordArrays, so I'd be glad if someone with priviledges could rename it. Also, please fix any mistakes I might have made. Regards St?fan From svetosch at gmx.net Tue Feb 21 05:43:03 2006 From: svetosch at gmx.net (Sven Schreiber) Date: Tue Feb 21 05:43:03 2006 Subject: [Numpy-discussion] mixing arrays and matrices: squeeze yes, flattened no? Message-ID: <43FB0A96.10803@gmx.net> Hi, sometimes I'm still struggling with peculiarities of numpy-arrays vs. 
numpy-matrices; my latest story goes like this: I first slice out a column of a 2d-numpy-array (a = somearray[:,1]). I can just manage to understand the resulting shape ( == (112,) ). Then I slice a column from a numpy-matrix, b = somematrix[:,1], and get the expected (112,1) shape. Then I do what I thought was the easiest thing in the world: I subtract the two vectors, c = a - b. I was very surprised by the bug that showed up due to the fact that c.shape == (112,112) !! First conclusion: broadcasting is nice and everything, but here I somehow think that it shouldn't be like this. I like numpy, but this is frustrating. Next, I try to work around it with b.squeeze(). That seems to work, but why is b.squeeze().shape == (1, 112) instead of (112,)? Then I thought maybe b.flattened() does the job, but then I get an error (matrix has no attr flattened). Again, I'm baffled. Could someone please explain? I already own the numpy-book, otherwise I wouldn't even have thought of using those methods, but here it hasn't enlightened me. Second (preliminary) conclusion: I will paranoically use even more asmatrix()-conversions in my code to avoid dealing with those array-beasts ;-) and get column vectors I can trust... Is there any better general advice than to say: "numpy-matrices and numpy-arrays are best kept in separate worlds"? Thanks for any insights, Sven From mpi at osc.kiku.dk Tue Feb 21 06:18:04 2006 From: mpi at osc.kiku.dk (Mads Ipsen) Date: Tue Feb 21 06:18:04 2006 Subject: [Numpy-discussion] Rookie problems - Why is C-code much faster? Message-ID: Hey, I am relatively new to python and Numeric, and am currently involved in a project of developing molecular dynamics code written in python and using Numeric for number crunching. During this, I've run into some problems that I hope I can get some assistance with here. My main issues here are:

1. around() appears to be slow
2. C code appears to be much faster

1. One of the bottlenecks in MD is the calculation of the distances between all particle pairs. In MD simulations with periodic boundary conditions, you have to estimate the shortest distance between all the particle pairs in your system. E.g., on a line of length box = 10, the distance dx between two points x0 = 1 and x1 = 9 will be dx = -2 (and NOT dx = 8). One way to do this in numpy is

dx = x1 - x0
dx -= box*around(dx/box)

My first observation here is that around() seems to be very slow. So I looked in umathmodule.c and implemented rint() from the C math library and made my own custom Numeric module. This gives a speed-up by a factor of approx. 4 compared to around(). I suggest that rint() is added as a ufunc, or are there any concerns here that I am not aware of?

2. Here is the main loop for finding all possible pair distances, which corresponds to a loop over the upper triangular part of a square matrix

# Loop over all particles
for i in range(n-1):
    dx = x[i+1:] - x[i]
    dy = y[i+1:] - y[i]

    dx -= box*rint(dx/box)
    dy -= box*rint(dy/box)

    r2 = dx**2 + dy**2    # square of dist. between points

where x and y contain the positions of the particles. A naive implementation in C is

// loop over all particles
for (int i=0; i<n-1; i++) {
    for (int j=i+1; j<n; j++) {
        dx = x[j] - x[i];
        dy = y[j] - y[i];

        dx -= box*rint(dx/box);
        dy -= box*rint(dy/box);

        r2 = dx*dx + dy*dy;
    }
}

For n = 2500 particles, i.e. 3123750 particle pairs, the C loop is approx. 10 times faster than the Python/Numeric counterpart. This is of course not satisfactory. Are there any things I am doing completely wrong here, basic approaches completely misunderstood, misuses etc? Any suggestions, guidelines, hints are most welcome. Best regards, Mads Ipsen

+---------------------------------+-------------------------+
| Mads Ipsen                      |                         |
| Dept. of Chemistry              | phone:  +45-35320220    |
| H.C. Ørsted Institute           | fax:    +45-35320322    |
| Universitetsparken 5            |                         |
| DK-2100 Copenhagen Ø, Denmark   | mpi at osc.kiku.dk      |
+---------------------------------+-------------------------+

From gruben at bigpond.net.au Tue Feb 21 06:24:24 2006 From: gruben at bigpond.net.au (Gary Ruben) Date: Tue Feb 21 06:24:24 2006 Subject: [Numpy-discussion] wiki page for record arrays In-Reply-To: <20060221122737.GA14470@alpha> References: <20060221122737.GA14470@alpha> Message-ID: <43FB2298.2080003@bigpond.net.au> Thanks for this Stéfan, Can I make some observations? I don't want to just change your formatting. I think it would be good to have some discussion about the formatting used in tutorials like this, because all should probably follow a standard presentation style. I like the usage summary at the end.
1. I'd put 'assumes from numpy import *' in the preamble.
2. Is it possible to change the formatting to make it more obvious what is input and what is output? I think it is better to show the input and output with a standard Python prompt a la idle or possibly ipython.

A couple of things specific to your examples:

3. I think it might be worth pointing out that

img = array([(0,0,0), (1,0,0), (0,1,0), (0,0,1)], [('r',Float32),('g',Float32),('b',Float32)])

is valid syntax that can be replaced by the 2-line version you present. Should the valid syntax for creating a record array be presented in EBNF format?
4. Can you explain dtype=(void,12)?
5. When the page's name is changed, a link should be put to it in the 'Getting Started and Tutorial' section of the Documentation page.

What do you and others think? Gary R. Stefan van der Walt wrote: > I wrote a short tutorial on using record arrays, which can be found at > > http://www.scipy.org/ArrayRecords > > The page is named ArrayRecords instead of RecordArrays, so I'd be glad > if someone with privileges could rename it. Also, please fix any > mistakes I might have made. > > Regards > Stéfan From alexander.belopolsky at gmail.com Tue Feb 21 06:52:16 2006 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue Feb 21 06:52:16 2006 Subject: [Numpy-discussion] Rookie problems - Why is C-code much faster? In-Reply-To: References: Message-ID: On 2/21/06, Mads Ipsen wrote: > I suggest that rint() is added as a ufunc, or are there any concerns > here that I am not aware of? You might want to use astype(int). On my system it is much faster than around:

> python -m timeit -s "from numpy import array, around; x = array([1.5]*1000)" "around(x)"
10000 loops, best of 3: 176 usec per loop
> python -m timeit -s "from numpy import array, around; x = array([1.5]*1000)" "x.astype(int)"
100000 loops, best of 3: 3.2 usec per loop

The difference is too big to be explained by the fact that around allocates twice as much memory for the result. In fact the following equivalent of rint is still very fast:

> python -m timeit -s "from numpy import array, around; x = array([1.5]*1000)" "x.astype(int).astype(float)"
100000 loops, best of 3: 6.48 usec per loop

From ndarray at mac.com Tue Feb 21 07:00:00 2006 From: ndarray at mac.com (Sasha) Date: Tue Feb 21 07:00:00 2006 Subject: [Numpy-discussion] Rookie problems - Why is C-code much faster? In-Reply-To: References: Message-ID: On second thought, the difference between around and astype is not surprising, because around operates in terms of decimals. Rather than adding rint, I would suggest making the special case decimals=0 use C rint. > On 2/21/06, Mads Ipsen wrote: > > I suggest that rint() is added as a ufunc, or are there any concerns > > here that I am not aware of? > > You might want to use astype(int). On my system it is much faster than around: > > > python -m timeit -s "from numpy import array, around; x = array([1.5]*1000)" "around(x)" > 10000 loops, best of 3: 176 usec per loop > > python -m timeit -s "from numpy import array, around; x = array([1.5]*1000)" "x.astype(int)" > 100000 loops, best of 3: 3.2 usec per loop > > the difference is too big to be explained by the fact that around > allocates twice as much memory for the result.
In fact the following > equivalent of rint is still very fast: > > > python -m timeit -s "from numpy import array, around; x = array([1.5]*1000)" "x.astype(int).astype(float)" > 100000 loops, best of 3: 6.48 usec per loop > From travis at enthought.com Tue Feb 21 07:06:02 2006 From: travis at enthought.com (Travis N. Vaught) Date: Tue Feb 21 07:06:02 2006 Subject: [Numpy-discussion] wiki page for record arrays In-Reply-To: <20060221122737.GA14470@alpha> References: <20060221122737.GA14470@alpha> Message-ID: <43FB2C19.2060001@enthought.com> Stefan van der Walt wrote: > I wrote a short tutorial on using record arrays, which can be found at > > http://www.scipy.org/ArrayRecords > > The page is named ArrayRecords instead of RecordArrays, so I'd be glad > if someone with priviledges could rename it. Also, please fix any > mistakes I might have made. > ... I've renamed it. Now the page is at: http://www.scipy.org/RecordArray Travis From travis at enthought.com Tue Feb 21 07:12:07 2006 From: travis at enthought.com (Travis N. Vaught) Date: Tue Feb 21 07:12:07 2006 Subject: [Numpy-discussion] wiki page for record arrays In-Reply-To: <43FB2C19.2060001@enthought.com> References: <20060221122737.GA14470@alpha> <43FB2C19.2060001@enthought.com> Message-ID: <43FB2DA5.1010306@enthought.com> Travis N. Vaught wrote: > > I've renamed it. Now the page is at: > > http://www.scipy.org/RecordArray > Doh! That should have been http://www.scipy.org/RecordArrays . From bsouthey at gmail.com Tue Feb 21 07:16:07 2006 From: bsouthey at gmail.com (Bruce Southey) Date: Tue Feb 21 07:16:07 2006 Subject: [Numpy-discussion] algorithm, optimization, or other problem? In-Reply-To: <43FB0661.1040202@bryant.edu> References: <43FB0661.1040202@bryant.edu> Message-ID: Hi, In the current version, note that Y is scalar so replace the squaring (Y**2) with Y*Y as you do in the dohebb function. On my system without blas etc removing the squaring removes a few seconds (16.28 to 12.4). It did not seem to help factorizing Y. Also, eta and tau are constants so define them only once as scalars outside the loops and do the division outside the loop. It only saves about 0.2 seconds but these add up. The inner loop probably can be vectorized because it is just vector operations on a matrix. You are just computing over the ith dimension of X. I think that you could be able to find the matrix version on the net. Regards Bruce On 2/21/06, Brian Blais wrote: > Hello, > > I am trying to translate some Matlab/mex code to Python, for doing neural > simulations. This application is definitely computing-time limited, and I need to > optimize at least one inner loop of the code, or perhaps even rethink the algorithm. > The procedure is very simple, after initializing any variables: > > 1) select a random input vector, which I will call "x". right now I have it as an > array, and I choose columns from that array randomly. in other cases, I may need to > take an image, select a patch, and then make that a column vector. > > 2) calculate an output value, which is the dot product of the "x" and a weight > vector, "w", so > > y=dot(x,w) > > 3) modify the weight vector based on a matrix equation, like: > > w=w+ eta * (y*x - y**2*w) > ^ > | > +---- learning rate constant > > 4) repeat steps 1-3 many times > > I've organized it like: > > for e in 100: # outer loop > for i in 1000: # inner loop > (steps 1-3) > > display things. > > so that the bulk of the computation is in the inner loop, and is amenable to > converting to a faster language. 
This is my issue: > > straight python, in the example posted below for 250000 inner-loop steps, takes 20 > seconds for each outer-loop step. I tried Pyrex, which should work very fast on such > a problem, takes about 8.5 seconds per outer-loop step. The same code as a C-mex > file in matlab takes 1.5 seconds per outer-loop step. > > Given the huge difference between the Pyrex and the Mex, I feel that there is > something I am doing wrong, because the C-code for both should run comparably. > Perhaps the approach is wrong? I'm willing to take any suggestions! I don't mind > coding some in C, but the Python API seemed a bit challenging to me. > > One note: I am using the Numeric package, not numpy, only because I want to be able > to use the Enthought version for Windows. I develop on Linux, and haven't had a > chance to see if I can compile numpy using the Enthought Python for Windows. > > If there is anything else anyone needs to know, I'll post it. I put the main script, > and a dohebb.pyx code below. > > > thanks! > > Brian Blais > > -- > ----------------- > > bblais at bryant.edu > http://web.bryant.edu/~bblais > > > > > # Main script: > > from dohebb import * > import pylab as p > from Numeric import * > from RandomArray import * > import time > > x=random((100,1000)) # 1000 input vectors > > numpats=x.shape[0] > w=random((numpats,1)); > > th=random((1,1)) > > params={} > params['eta']=0.001; > params['tau']=100.0; > old_mx=0; > for e in range(100): > > rnd=randint(0,numpats,250000) > t1=time.time() > if 0: # straight python > for i in range(len(rnd)): > pat=rnd[i] > xx=reshape(x[:,pat],(1,-1)) > y=matrixmultiply(xx,w) > w=w+params['eta']*(y*transpose(xx)-y**2*w); > th=th+(1.0/params['tau'])*(y**2-th); > else: # pyrex > dohebb(params,w,th,x,rnd) > print time.time()-t1 > > > p.plot(w,'o-') > p.xlabel('weights') > p.show() > > > #============================================= > > # dohebb.pyx > > cdef extern from "Numeric/arrayobject.h": > > struct PyArray_Descr: > int type_num, elsize > char type > > ctypedef class Numeric.ArrayType [object PyArrayObject]: > cdef char *data > cdef int nd > cdef int *dimensions, *strides > cdef object base > cdef PyArray_Descr *descr > cdef int flags > > > def dohebb(params,ArrayType w,ArrayType th,ArrayType X,ArrayType rnd): > > > cdef int num_iterations > cdef int num_inputs > cdef int offset > cdef double *wp,*xp,*thp > cdef int *rndp > cdef double eta,tau > > eta=params['eta'] # learning rate > tau=params['tau'] # used for variance estimate > > cdef double y > num_iterations=rnd.dimensions[0] > num_inputs=w.dimensions[0] > > # get the pointers > wp=w.data > xp=X.data > rndp=rnd.data > thp=th.data > > for it from 0 <= it < num_iterations: > > offset=rndp[it]*num_inputs > > # calculate the output > y=0.0 > for i from 0 <= i < num_inputs: > y=y+wp[i]*xp[i+offset] > > # change in the weights > for i from 0 <= i < num_inputs: > wp[i]=wp[i]+eta*(y*xp[i+offset] - y*y*wp[i]) > > # estimate the variance > thp[0]=thp[0]+(1.0/tau)*(y**2-thp[0]) > > > > > > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through log files > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! 
> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From mpi at osc.kiku.dk Tue Feb 21 07:24:08 2006 From: mpi at osc.kiku.dk (Mads Ipsen) Date: Tue Feb 21 07:24:08 2006 Subject: [Numpy-discussion] Rookie problems - Why is C-code much faster? In-Reply-To: References: Message-ID: On Tue, 21 Feb 2006, Alexander Belopolsky wrote: > On 2/21/06, Mads Ipsen wrote: > > I suggest that rint() is added as a ufunc or is there any concerns > > here that I am not aware of? > > You might want to use astype(int). On my system it is much faster than around: > > > python -m timeit -s "from numpy import array, around; x = array([1.5]*1000)" "around(x)" > 10000 loops, best of 3: 176 usec per loop > > python -m timeit -s "from numpy import array, around; x = array([1.5]*1000)" "x.astype(int)" > 100000 loops, best of 3: 3.2 usec per loop > > the difference is too big to be explained by the fact that around > allocates twice as much memory for the result. In fact the following > equivalent of rint is still very fast: > > > python -m timeit -s "from numpy import array, around; Maybe I am wrong here, but around() and rint() is supposed to round to the closest integer, i.e. for x = array([1.1, 1.8]) around(x) = [1.0, 2.0] whereas x.astype(int).astype(float) = [1.0, 1.0] This particular property of around() as well as rint() is crucial for my application. // Mads From ndarray at mac.com Tue Feb 21 07:32:09 2006 From: ndarray at mac.com (Sasha) Date: Tue Feb 21 07:32:09 2006 Subject: [Numpy-discussion] Rookie problems - Why is C-code much faster? In-Reply-To: References: Message-ID: On 2/21/06, Mads Ipsen wrote: > Maybe I am wrong here, but around() and rint() is supposed to round to > the closest integer, i.e. for x = array([1.1, 1.8]) You are right. In the follow-up I've suggested to speed-up the case decimals=0 in around in around instead of adding another function. I think that would be a more "pythonic" solution. From stefan at sun.ac.za Tue Feb 21 07:56:09 2006 From: stefan at sun.ac.za (Stefan van der Walt) Date: Tue Feb 21 07:56:09 2006 Subject: [Numpy-discussion] wiki page for record arrays In-Reply-To: <43FB2298.2080003@bigpond.net.au> References: <20060221122737.GA14470@alpha> <43FB2298.2080003@bigpond.net.au> Message-ID: <20060221155513.GC14470@alpha> Hi Gary Thanks for your suggestions. I incorporated them. St?fan On Wed, Feb 22, 2006 at 01:24:24AM +1100, Gary Ruben wrote: > 1. I'd put 'assumes from numpy import *' in the preamble. > 2. Is it possible to change the formatting to make it more obvious what > is input and what is output? I think it is better to show the input and > output with a standard Python prompt a'la idle or possibly ipython. > 3. I think it might be worth pointing out that > > img = array([(0,0,0), (1,0,0), (0,1,0), (0,0,1)], [('r',Float32),('g',F > loat32),('b',Float32)]) > > is valid syntax that can be replaced by the 2-line version you present. > 4. Can you explain dtype=(void,12)? > 5. When the page's name is changed, a link should be put to it in the > 'Getting Started and Tutorial' section of the Documentation page. From pau.gargallo at gmail.com Tue Feb 21 08:02:07 2006 From: pau.gargallo at gmail.com (Pau Gargallo) Date: Tue Feb 21 08:02:07 2006 Subject: [Numpy-discussion] Rookie problems - Why is C-code much faster? 
In-Reply-To: References: Message-ID: <6ef8f3380602210801j3e321795l5ade05c7c1539002@mail.gmail.com> > the closest integer, i.e. for x = array([1.1, 1.8]) > > around(x) = [1.0, 2.0] > > whereas > > x.astype(int).astype(float) = [1.0, 1.0] > (x+0.5).astype(int).astype(float) = [1.0, 2.0] i hope it helps, pau From zpincus at stanford.edu Tue Feb 21 09:19:02 2006 From: zpincus at stanford.edu (Zachary Pincus) Date: Tue Feb 21 09:19:02 2006 Subject: [Numpy-discussion] Rookie problems - Why is C-code much faster? In-Reply-To: References: Message-ID: <633BB073-0A46-416A-96DF-080CF4A6DBDF@stanford.edu> Mads, The game with numpy, just as it is with Matlab or any other interpreted numeric environment, is to try push as much of the looping down into the C code as you can. This is because, as you now know, compiled C can loop much faster than interpreted python. A simple example for averaging 1000 (x,y,z) points: print data.shape (1000, 3) # bad: explicit for loop in python avg = numpy.zeros(3, numpy.float_) for i in data: avg += i avg /= 1000.0 # good: implicit for loop in C avg = numpy.add.reduce(data, axis = 0) avg /= 1000.0 In your case, instead of explicitly looping through each point, why not do the calculations in parallel, operating on entire vectors of points at one time? Then the looping is "pushed down" into compiled C code. Or if you're really lucky, it's pushed all the way down to the vector math units on your cpu if you have a good BLAS or whatever installed. Zach Pincus Program in Biomedical Informatics and Department of Biochemistry Stanford University School of Medicine > 2. Here is the main loop for finding all possible pair distances, > which corresponds to a loop over the upper triangular part of a > square matrix > > # Loop over all particles > for i in range(n-1): > dx = x[i+1:] - x[i] > dy = y[i+1:] - y[i] > > dx -= box*rint(dx/box) > dy -= box*rint(dy/box) > > r2 = dx**2 + dy**2 # square of dist. between points > > where x and y contain the positions of the particles. A naive > implementation in C is > > > // loop over all particles > for (int i=0; i for (int j=i+1; j dx = x[j] - x[i]; > dy = y[j] - y[i]; > > dx -= box*rint(dx/box); > dy -= box*rint(dy/box); > > r2 = dx*dx + dy*dy; > } > } > > For n = 2500 particles, i.e. 3123750 particle pairs, the C loop is > app. 10 times faster than the Python/Numeric counterpart. This is of > course not satisfactory. > > Are there any things I am doing completely wrong here, basic > approaches completely misunderstood, misuses etc? > > Any suggestions, guidelines, hints are most welcome. > > Best regards, > > Mads Ipsen > > > +---------------------------------+-------------------------+ > | Mads Ipsen | | > | Dept. of Chemistry | phone: +45-35320220 | > | H.C.?rsted Institute | fax: +45-35320322 | > | Universitetsparken 5 | | > | DK-2100 Copenhagen ?, Denmark | mpi at osc.kiku.dk | > +---------------------------------+-------------------------+ > > > ------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. Do you grep through > log files > for problems? Stop! Download the new AJAX search engine that makes > searching your log files as easy as surfing the web. DOWNLOAD > SPLUNK! 
> http://sel.as-us.falkag.net/sel?cmd=lnk&kid3432&bid#0486&dat1642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion From mpi at osc.kiku.dk Tue Feb 21 10:25:04 2006 From: mpi at osc.kiku.dk (Mads Ipsen) Date: Tue Feb 21 10:25:04 2006 Subject: [Numpy-discussion] Rookie problems - Why is C-code much faster? In-Reply-To: <633BB073-0A46-416A-96DF-080CF4A6DBDF@stanford.edu> References: <633BB073-0A46-416A-96DF-080CF4A6DBDF@stanford.edu> Message-ID: On Tue, 21 Feb 2006, Zachary Pincus wrote: > Mads, > > The game with numpy, just as it is with Matlab or any other > interpreted numeric environment, is to try push as much of the > looping down into the C code as you can. This is because, as you now > know, compiled C can loop much faster than interpreted python. > > A simple example for averaging 1000 (x,y,z) points: > > print data.shape > (1000, 3) > # bad: explicit for loop in python > avg = numpy.zeros(3, numpy.float_) > for i in data: avg += i > avg /= 1000.0 > > # good: implicit for loop in C > avg = numpy.add.reduce(data, axis = 0) > avg /= 1000.0 > > In your case, instead of explicitly looping through each point, why > not do the calculations in parallel, operating on entire vectors of > points at one time? Then the looping is "pushed down" into compiled C > code. Or if you're really lucky, it's pushed all the way down to the > vector math units on your cpu if you have a good BLAS or whatever > installed. > > Zach Pincus > > Program in Biomedical Informatics and Department of Biochemistry > Stanford University School of Medicine > > > > > 2. Here is the main loop for finding all possible pair distances, > > which corresponds to a loop over the upper triangular part of a > > square matrix > > > > # Loop over all particles > > for i in range(n-1): > > dx = x[i+1:] - x[i] > > dy = y[i+1:] - y[i] > > > > dx -= box*rint(dx/box) > > dy -= box*rint(dy/box) > > > > r2 = dx**2 + dy**2 # square of dist. between points > > > > where x and y contain the positions of the particles. A naive > > implementation in C is > > > > > > // loop over all particles > > for (int i=0; i > for (int j=i+1; j > dx = x[j] - x[i]; > > dy = y[j] - y[i]; > > > > dx -= box*rint(dx/box); > > dy -= box*rint(dy/box); > > > > r2 = dx*dx + dy*dy; > > } > > } > > > > For n = 2500 particles, i.e. 3123750 particle pairs, the C loop is > > app. 10 times faster than the Python/Numeric counterpart. This is of > > course not satisfactory. > > > > Are there any things I am doing completely wrong here, basic > > approaches completely misunderstood, misuses etc? > > > > Any suggestions, guidelines, hints are most welcome. > > > > Best regards, > > > > Mads Ipsen > > > > > > +---------------------------------+-------------------------+ > > | Mads Ipsen | | > > | Dept. of Chemistry | phone: +45-35320220 | > > | H.C.?rsted Institute | fax: +45-35320322 | > > | Universitetsparken 5 | | > > | DK-2100 Copenhagen ?, Denmark | mpi at osc.kiku.dk | > > +---------------------------------+-------------------------+ > > > > > > ------------------------------------------------------- > > This SF.net email is sponsored by: Splunk Inc. Do you grep through > > log files > > for problems? Stop! Download the new AJAX search engine that makes > > searching your log files as easy as surfing the web. DOWNLOAD > > SPLUNK! 
> > http://sel.as-us.falkag.net/sel?cmd=lnk&kid3432&bid#0486&dat1642 > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at lists.sourceforge.net > > https://lists.sourceforge.net/lists/listinfo/numpy-discussion I agree completely with your comments. But, as you can see, the innermost part of the loop has been removed in the code and replaced with numpy slices. It's hard for me to see how to compress the outer loop as well, since it determines the ranges for the inner loop. Unless there is some fancy slice notation that allows you to loop over a triangular part of a matrix, i.e.

x[i] = sum(A[i,i+1:])

meaning x[i] = sum of elements in the i'th row of A, using only elements from position i+1 up to n. Of course, there is the possibility of hardcoding this in C and then making it available as a Python module. But I don't want to do this before I am sure there isn't a numpy way out of this. Let me know if you have any suggestions. // Mads From oliphant.travis at ieee.org Tue Feb 21 10:58:04 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue Feb 21 10:58:04 2006 Subject: [Numpy-discussion] optimizing power() for complex and real cases In-Reply-To: <43FA9B28.6070309@cox.net> References: <43F20BFE.5030100@cox.net> <43F36CB6.5050004@cox.net> <43F3B8A9.3000507@bigpond.net.au> <43F3CA12.4000907@cox.net> <43F62069.80209@cox.net> <43F7C73B.2000806@cox.net> <20060220003714.GB15783@arbutus.physics.mcmaster.ca> <43F938FA.80200@cox.net> <43FA8C9C.2020002@ieee.org> <43FA9B28.6070309@cox.net> Message-ID: <43FB628D.1050305@ieee.org> Tim Hochberg wrote: > Yeah, sort of. I meant that the little helper functions that ufuncs > call, such as DOUBLE_multiply, take the same types of arguments. > However, I just realized that I'm not certain that's true -- I just > assumed it because all the ones I've ever seen do. Also, this isn't > really a problem anyway -- the real problem is the slow conversion of > Python scalars to arrays in ufuncs. This is not true. Ufuncs can have different types for their arguments. Perhaps you meant something else? We have only defined multiplication for same-types. But, I just wanted to clarify that the ufunc machinery is more general than that, because others have been confused in the past. > I took a look at this earlier and it appears that the reason that > conversion of Python scalars is slow is that FromAny tries every other > conversion first. The check for Python scalars looks pretty cheap, so > it seems reasonable to check for them and do the appropriate > conversion early. Do the ufuncs call EnsureArray or FromAny? If the > former it would seem pretty straightforward to just stick another check > in there. Then David's original strategy of optimizing in DOUBLE_pow > should be close to as fast as what I'm doing. Yes, I suspect the biggest slow-downs are the two attribute lookups which allow anything with __array__ or the array interface defined to be used. I think we could special-case Python scalars in that code. -Travis From tim.hochberg at cox.net Tue Feb 21 11:12:04 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Tue Feb 21 11:12:04 2006 Subject: [Numpy-discussion] Rookie problems - Why is C-code much faster? In-Reply-To: References: <633BB073-0A46-416A-96DF-080CF4A6DBDF@stanford.edu> Message-ID: <43FB65D1.5080707@cox.net> Mads Ipsen wrote: >On Tue, 21 Feb 2006, Zachary Pincus wrote: > > >>Mads, >> >>The game with numpy, just as it is with Matlab or any other >>interpreted numeric environment, is to try push as much of the >>looping down into the C code as you can. This is because, as you now >>know, compiled C can loop much faster than interpreted python.
>> >>A simple example for averaging 1000 (x,y,z) points: >> >>print data.shape >>(1000, 3) >># bad: explicit for loop in python >>avg = numpy.zeros(3, numpy.float_) >>for i in data: avg += i >>avg /= 1000.0 >> >># good: implicit for loop in C >>avg = numpy.add.reduce(data, axis = 0) >>avg /= 1000.0 >> >>In your case, instead of explicitly looping through each point, why >>not do the calculations in parallel, operating on entire vectors of >>points at one time? Then the looping is "pushed down" into compiled C >>code. Or if you're really lucky, it's pushed all the way down to the >>vector math units on your cpu if you have a good BLAS or whatever >>installed. >> >>Zach Pincus >> >>Program in Biomedical Informatics and Department of Biochemistry >>Stanford University School of Medicine >> >> >> >> >> >>>2. Here is the main loop for finding all possible pair distances, >>> which corresponds to a loop over the upper triangular part of a >>> square matrix >>> >>> # Loop over all particles >>> for i in range(n-1): >>> dx = x[i+1:] - x[i] >>> dy = y[i+1:] - y[i] >>> >>> dx -= box*rint(dx/box) >>> dy -= box*rint(dy/box) >>> >>> r2 = dx**2 + dy**2 # square of dist. between points >>> >>>where x and y contain the positions of the particles. A naive >>>implementation in C is >>> >>> >>> // loop over all particles >>> for (int i=0; i>> for (int j=i+1; j>> dx = x[j] - x[i]; >>> dy = y[j] - y[i]; >>> >>> dx -= box*rint(dx/box); >>> dy -= box*rint(dy/box); >>> >>> r2 = dx*dx + dy*dy; >>> } >>> } >>> >>>For n = 2500 particles, i.e. 3123750 particle pairs, the C loop is >>>app. 10 times faster than the Python/Numeric counterpart. This is of >>>course not satisfactory. >>> >>>Are there any things I am doing completely wrong here, basic >>>approaches completely misunderstood, misuses etc? >>> >>>Any suggestions, guidelines, hints are most welcome. >>> >>>Best regards, >>> >>>Mads Ipsen >>> >>> >>>+---------------------------------+-------------------------+ >>>| Mads Ipsen | | >>>| Dept. of Chemistry | phone: +45-35320220 | >>>| H.C.?rsted Institute | fax: +45-35320322 | >>>| Universitetsparken 5 | | >>>| DK-2100 Copenhagen ?, Denmark | mpi at osc.kiku.dk | >>>+---------------------------------+-------------------------+ >>> >>> >>>------------------------------------------------------- >>>This SF.net email is sponsored by: Splunk Inc. Do you grep through >>>log files >>>for problems? Stop! Download the new AJAX search engine that makes >>>searching your log files as easy as surfing the web. DOWNLOAD >>>SPLUNK! >>>http://sel.as-us.falkag.net/sel?cmd=lnk&kid3432&bid#0486&dat1642 >>>_______________________________________________ >>>Numpy-discussion mailing list >>>Nump >>> >>> > >I agree completely with your comments. But, as you can, the innermost part >of the loop has been removed in the code, and replaced with numpy slices. >It's hard for me to see how to compress the outer loop as well, since it >determines the ranges for the inner loop. Unless there is some fancy slice >notation, that allows you to loop over a triangular part of a matrix, ie. > > x[i] = sum(A[i+1,i]) > >meaning x[i] = sum of elements in i'th row of A using only elements from >position i+1 up to n. > >Of course, there is the possibility of hardcoding this in C and then make >it available as a Python module. But I don't want to do this before I am >sure there isn't a numpy way out this. > >Let me know, if you have any suggestions. > Can you explain a little more about what you are trying to calculate? 
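If it's the standard minimum-image distance, one thing worth trying is to push both loops into C at once by building the full difference matrices -- an untested sketch, O(n**2) memory, using around() since rint isn't exposed as a ufunc:

import numpy as N

def pair_dist2(x, y, box):
    """Squared minimum-image separations for all pairs at once."""
    dx = N.subtract.outer(x, x)      # dx[i,j] = x[i] - x[j]
    dy = N.subtract.outer(y, y)
    dx -= box * N.around(dx / box)   # wrap into [-box/2, box/2]
    dy -= box * N.around(dy / box)
    return dx**2 + dy**2             # symmetric; your loop is the upper triangle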
The bit about subtracting off box*rint(dx/box) is a little odd. It almost seems like you should be able to do something with fmod, but I admit that I'm not sure how. If I had to guess as to source of the relative slowness I'd say it's because you are creating a lot of temporary matrices. There are ways to avoid this, but when taken to the extreme, they make your code look ugly. You might try the following, untested, code or some variation and see if it speeds things up. This makes extensive use of the little known optional destination argument for ufuncs. I only tend to do this sort of stuff where it's very critical since, as you can see, it makes things quite ugly. dx_space = x.copy() dy_space = y.copy() scratch_space = x.copy() for i in range(n-1): dx = dx_space[i+1:] dy = dy_space[i+1:] scratch = scratch_space[i+1:] subtract(x[i+1:], x[i], dx) subtract(y[i+1:], y[i], dy) # dx -= box*rint(dx/box) divide(dx, box, scratch) rint(scratch, scratch) scratch *= box dx -= scratch # dy -= box*rint(dy/box) divide(dy, box, scratch) rint(scratch, scratch) scratch *= box dy -= scratch r2 = dx**2 + dy**2 # square of dist. between points Hope that helps: -tim From oliphant.travis at ieee.org Tue Feb 21 11:13:02 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue Feb 21 11:13:02 2006 Subject: [Numpy-discussion] mixing arrays and matrices: squeeze yes, flattened no? In-Reply-To: <43FB0A96.10803@gmx.net> References: <43FB0A96.10803@gmx.net> Message-ID: <43FB6604.6000406@ieee.org> Sven Schreiber wrote: >Hi, sometimes I'm still struggling with peculiarities of numpy-arrays >vs. numpy-matrices; my latest story goes like this: > >I first slice out a column of a 2d-numpy-array (a = somearray[:,1]). I >can just manage to understand the resulting shape ( == (112,) ). > >Then I slice a column from a numpy-matrix b = somematrix[:,1] and get >the expected (112,1) shape. > >Then I do what I thought was the easiest thing in the world, I subtract >the two vectors: c = a - b >I was very surprised by the bug that showed up due to the fact that >c.shape == (112,112) !! > > As you know this isn't a bug, but very expected behavior. I don't see this changing any time soon. Arrays are different than matrices. Matrices are always 2-d arrays while arrays can have any number of dimensions. The default relationship between arrays and matrices is that 1-d arrays get converted to row-matrices (1,N). Regardless of which convention is chosen somebody will be bitten by that conversion if they think in terms of the other default. I don't see a way around that except to be careful when you mix arrays and matrices. >Next, I try to workaround by b.squeeze(). That seems to work, but why is >b.squeeze().shape == (1, 112) instead of (112,)? > > Again the same reason as before. A matrix is returned from b.squeeze() and there are no 1-d matrices. Thus, you get a row-vector. Use .T if you want a column vector. >Then I thought maybe b.flattened() does the job, but then I get an error >(matrix has no attr flattened). Again, I'm baffled. > > The correct spelling is b.flatten() And again you are going to get a (1,N) matrix out because of how 1d arrays are interpreted as matrices. In short, there is no way to get a 1-d matrix because that doesn't make sense. You can get a 1-d array using b.A.squeeze() >Second (preliminary) conclusion: I will paranoically use even more >asmatrix()-conversions in my code to avoid dealing with those >array-beasts ;-) and get column vectors I can trust... 
> > > >Is there a better general advice than to say: "numpy-matrices and >numpy-arrays are best kept in separated worlds" ? > > You can mix arrays and matrices just fine if you remember that 1d arrays are equivalent to row-vectors. -Travis From skip at pobox.com Tue Feb 21 12:20:05 2006 From: skip at pobox.com (skip at pobox.com) Date: Tue Feb 21 12:20:05 2006 Subject: [Numpy-discussion] Problems building numpy w/ ATLAS on Solaris 8 Message-ID: <17403.30180.676655.892180@montanaro.dyndns.org> After a brief hiatus I'm back to trying to build numpy. Last time I checked in (on the scipy list) I had successfully built ATLAS and created this simple site.cfg file in .../numpy/distutils/site.cfg: [atlas] library_dirs = /home/titan/skipm/src/ATLAS/lib/SunOS_Babe include_dirs = /home/titan/skipm/src/ATLAS/include/SunOS_Babe # for overriding the names of the atlas libraries atlas_libs = lapack, f77blas, cblas, atlas I svn up'd (now at rev 2138), zapped my build directory, then executed "python setup.py build". Just in case it matters, I'm using Python 2.4.2 built with GCC 3.4.1 on Solaris 8. Here's the output of my build attempt: Running from numpy source directory. No module named __svn_version__ F2PY Version 2_2138 blas_opt_info: blas_mkl_info: /home/ink/skipm/src/numpy/numpy/distutils/system_info.py:531: UserWarning: Library error: libs=['mkl', 'vml', 'guide'] found_libs=[] warnings.warn("Library error: libs=%s found_libs=%s" % \ NOT AVAILABLE atlas_blas_threads_info: Setting PTATLAS=ATLAS /home/ink/skipm/src/numpy/numpy/distutils/system_info.py:531: UserWarning: Library error: libs=['lapack', 'f77blas', 'cblas', 'atlas'] found_libs=[] warnings.warn("Library error: libs=%s found_libs=%s" % \ Setting PTATLAS=ATLAS Setting PTATLAS=ATLAS FOUND: libraries = ['lapack', 'f77blas', 'cblas', 'atlas'] library_dirs = ['/home/titan/skipm/src/ATLAS/lib/SunOS_Babe'] language = c include_dirs = ['/opt/include'] ... See my site.cfg file? Why does it affect library_dirs but not include_dirs? running build_src building extension "atlas_version" sources adding 'build/src/atlas_version_0x33c6fa32.c' to sources. running build_ext customize UnixCCompiler customize UnixCCompiler using build_ext building 'atlas_version' extension compiling C sources gcc options: '-fno-strict-aliasing -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC' compile options: '-I/opt/include -Inumpy/core/include -I/opt/app/g++lib6/python-2.4/include/python2.4 -c' /opt/lang/gcc-3.4/bin/gcc -shared build/temp.solaris-2.8-i86pc-2.4/build/src/atlas_version_0x33c6fa32.o -L/home/titan/skipm/src/ATLAS/lib/SunOS_Babe -llapack -lf77blas -lcblas -latlas -o build/temp.solaris-2.8-i86pc-2.4/atlas_version.so Text relocation remains referenced against symbol offset in file 0x7 /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o) 0xc /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o) ... bunch of missing s elided ... printf 0x1b /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o) printf 0x2d /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o) printf 0x3f /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o) printf 0x51 /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o) ... what's this? can't find printf??? ... 
ld: fatal: relocations remain against allocatable but non-writable sections collect2: ld returned 1 exit status Text relocation remains referenced against symbol offset in file 0x7 /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o) ... more eliding ... printf 0x108 /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o) ld: fatal: relocations remain against allocatable but non-writable sections collect2: ld returned 1 exit status ##### msg: error: Command "/opt/lang/gcc-3.4/bin/gcc -shared build/temp.solaris-2.8-i86pc-2.4/build/src/atlas_version_0x33c6fa32.o -L/home/titan/skipm/src/ATLAS/lib/SunOS_Babe -llapack -lf77blas -lcblas -latlas -o build/temp.solaris-2.8-i86pc-2.4/atlas_version.so" failed with exit status 1 error: Command "/opt/lang/gcc-3.4/bin/gcc -shared build/temp.solaris-2.8-i86pc-2.4/build/src/atlas_version_0x33c6fa32.o -L/home/titan/skipm/src/ATLAS/lib/SunOS_Babe -llapack -lf77blas -lcblas -latlas -o build/temp.solaris-2.8-i86pc-2.4/atlas_version.so" failed with exit status 1 FOUND: libraries = ['lapack', 'f77blas', 'cblas', 'atlas'] library_dirs = ['/home/titan/skipm/src/ATLAS/lib/SunOS_Babe'] language = c define_macros = [('NO_ATLAS_INFO', 2)] include_dirs = ['/opt/include'] Warning: distutils distribution has been initialized, it may be too late to add an extension _dotblas ... How can I initialize things earlier? Does it matter? Traceback (most recent call last): File "setup.py", line 76, in ? setup_package() File "setup.py", line 63, in setup_package config.add_subpackage('numpy') File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 592, in add_subpackage config_list = self.get_subpackage(subpackage_name,subpackage_path) File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 582, in get_subpackage subpackage_path) File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 539, in _get_configuration_from_setup_py config = setup_module.configuration(*args) File "/home/ink/skipm/src/numpy/numpy/setup.py", line 10, in configuration config.add_subpackage('core') File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 592, in add_subpackage config_list = self.get_subpackage(subpackage_name,subpackage_path) File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 582, in get_subpackage subpackage_path) File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 539, in _get_configuration_from_setup_py config = setup_module.configuration(*args) File "numpy/core/setup.py", line 215, in configuration config.add_data_dir('tests') File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 636, in add_data_dir self.add_data_files((ds,filenames)) File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 702, in add_data_files dist.data_files.extend(data_dict.items()) AttributeError: 'NoneType' object has no attribute 'extend' And finally, a traceback. What's up with that? In parallel with trying to build with ATLAS I'm also trying Travis's suggestion of explicitly setting PTATLAS, ATLAS and BLAS to "None". Numpy builds when I do that. -- Skip Montanaro - skip at pobox.com "The values to which people cling most stubbornly under inappropriate conditions are those values that were previously the source of their greatest triumphs over adversity." 
-- Jared Diamond in "Collapse" From Chris.Barker at noaa.gov Tue Feb 21 12:49:02 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue Feb 21 12:49:02 2006 Subject: [Numpy-discussion] mixing arrays and matrices: squeeze yes, flattened no? In-Reply-To: <43FB6604.6000406@ieee.org> References: <43FB0A96.10803@gmx.net> <43FB6604.6000406@ieee.org> Message-ID: <43FB7C87.1010007@noaa.gov> Travis Oliphant wrote: > You can mix arrays and matrices just fine if you remember that 1d arrays > are equivalent to row-vectors. and you can easily get a column vector out of an array, if you remember that you want to keep it 2-d. i.e. use a slice rather than an index: >>> import numpy as N >>> a = N.ones((5,10)) >>> a[:,1].shape # an index: it reduces the rank (5,) >>> a[:,1:2].shape # a slice: it keeps the rank (5, 1) -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From mpi at osc.kiku.dk Tue Feb 21 13:05:05 2006 From: mpi at osc.kiku.dk (Mads Ipsen) Date: Tue Feb 21 13:05:05 2006 Subject: [Numpy-discussion] Rookie problems - Why is C-code much faster? In-Reply-To: <43FB65D1.5080707@cox.net> References: <633BB073-0A46-416A-96DF-080CF4A6DBDF@stanford.edu> <43FB65D1.5080707@cox.net> Message-ID: On Tue, 21 Feb 2006, Tim Hochberg wrote: > Can you explain a little more about what you are trying to calculate? > The bit about subtracting off box*rint(dx/box) is a little odd. It > almost seems like you should be able to do something with fmod, but I > admit that I'm not sure how. > > If I had to guess as to source of the relative slowness I'd say it's > because you are creating a lot of temporary matrices. There are ways to > avoid this, but when taken to the extreme, they make your code look > ugly. You might try the following, untested, code or some variation and > see if it speeds things up. This makes extensive use of the little known > optional destination argument for ufuncs. I only tend to do this sort of > stuff where it's very critical since, as you can see, it makes things > quite ugly. > > dx_space = x.copy() > dy_space = y.copy() > scratch_space = x.copy() > for i in range(n-1): > dx = dx_space[i+1:] > dy = dy_space[i+1:] > scratch = scratch_space[i+1:] > subtract(x[i+1:], x[i], dx) > subtract(y[i+1:], y[i], dy) > # dx -= box*rint(dx/box) > divide(dx, box, scratch) > rint(scratch, scratch) > scratch *= box > dx -= scratch > # dy -= box*rint(dy/box) > divide(dy, box, scratch) > rint(scratch, scratch) > scratch *= box > dy -= scratch > r2 = dx**2 + dy**2 # square of dist. between points > > > > Hope that helps: > > -tim Here's what I am trying to do: My system consists of N particles, whose coordinates in the xy-plane is given by the two vectors x and y. I need to calculate the distance between all particle pairs, which goes like this: I pick particle 1 and calculate its distance to the N-1 other points. Then I pick particle 2. Since its distance to particle 1 was found in the previuos step, I only have to find its distance to the N-2 remaining points. In the i'th step, I therefore only have to consider particle i+1 up to particle N. That explains the loop structure, where dx = x[i+1:] - x[i] dy = y[i+1:] - y[i] the resulting vectors dx and dy will contain the x-distances from x[i] to the proceeding points from i+1 up to N. 
The square of the distance r2 is then given by

  r2 = dx**2 + dy**2

Another approach would be to use

  dx = subtract.outer(x,x)
  dy = subtract.outer(y,y)

but that would be overkill, since all distances are counted twice, and also
the storage requirements grow rapidly if you have more than 1000 particles
(approx. 10^6 particle pairs).

Thanks for your code feedback, which I'll have a closer look at. But I'd
like to believe that numpy/Numeric/Python was invented with the very purpose
of avoiding coding like this - I think this is also a point you already
made. But thanks again.

// Mads

From aisaac at american.edu Tue Feb 21 13:43:04 2006
From: aisaac at american.edu (Alan G Isaac)
Date: Tue Feb 21 13:43:04 2006
Subject: [Numpy-discussion] Rookie problems - Why is C-code much faster?
In-Reply-To: 
References: <633BB073-0A46-416A-96DF-080CF4A6DBDF@stanford.edu>
 <43FB65D1.5080707@cox.net>
Message-ID: 

On Tue, 21 Feb 2006, (CET) Mads Ipsen apparently wrote:
> My system consists of N particles, whose coordinates in
> the xy-plane is given by the two vectors x and y. I need
> to calculate the distance between all particle pairs

Of possible interest?
http://www.cs.umd.edu/~mount/ANN/

Cheers,
Alan Isaac

From robert.kern at gmail.com Tue Feb 21 14:10:12 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Tue Feb 21 14:10:12 2006
Subject: [Numpy-discussion] Re: Problems building numpy w/ ATLAS on Solaris 8
In-Reply-To: <17403.30180.676655.892180@montanaro.dyndns.org>
References: <17403.30180.676655.892180@montanaro.dyndns.org>
Message-ID: 

skip at pobox.com wrote:

> After a brief hiatus I'm back to trying to build numpy. Last time I checked
> in (on the scipy list) I had successfully built ATLAS and created this
> simple site.cfg file in .../numpy/distutils/site.cfg:
>
> [atlas]
> library_dirs = /home/titan/skipm/src/ATLAS/lib/SunOS_Babe
> include_dirs = /home/titan/skipm/src/ATLAS/include/SunOS_Babe
> # for overriding the names of the atlas libraries
> atlas_libs = lapack, f77blas, cblas, atlas
>
> I svn up'd (now at rev 2138), zapped my build directory, then executed
> "python setup.py build". Just in case it matters, I'm using Python 2.4.2
> built with GCC 3.4.1 on Solaris 8. Here's the output of my build attempt:
>
> Running from numpy source directory.
> No module named __svn_version__
> F2PY Version 2_2138
> blas_opt_info:
> blas_mkl_info:
> /home/ink/skipm/src/numpy/numpy/distutils/system_info.py:531: UserWarning: Library error: libs=['mkl', 'vml', 'guide'] found_libs=[]
> warnings.warn("Library error: libs=%s found_libs=%s" % \
> NOT AVAILABLE
>
> atlas_blas_threads_info:
> Setting PTATLAS=ATLAS
> /home/ink/skipm/src/numpy/numpy/distutils/system_info.py:531: UserWarning: Library error: libs=['lapack', 'f77blas', 'cblas', 'atlas'] found_libs=[]
> warnings.warn("Library error: libs=%s found_libs=%s" % \
> Setting PTATLAS=ATLAS
> Setting PTATLAS=ATLAS
> FOUND:
> libraries = ['lapack', 'f77blas', 'cblas', 'atlas']
> library_dirs = ['/home/titan/skipm/src/ATLAS/lib/SunOS_Babe']
> language = c
> include_dirs = ['/opt/include']
> ...
>
> See my site.cfg file? Why does it affect library_dirs but not include_dirs?

Probably a bug, but I don't see exactly where at the moment. It shouldn't
really affect anything, I don't think. The only header file that comes with
ATLAS is cblas.h, and I'm pretty sure that numpy itself doesn't need it. In
fact, we provide our own copy in numpy/core/blasdot/ for the parts that can
use it.
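[For what it's worth, one way to see what the build machinery actually
resolved from site.cfg is to query numpy.distutils directly. A sketch; the
dictionary keys shown in the comments are illustrative:]

    # query the same resolver the build uses
    from numpy.distutils.system_info import get_info

    info = get_info('atlas')
    print(info.get('library_dirs'))   # should echo the site.cfg [atlas] entry
    print(info.get('include_dirs'))   # ...and reveal whether include_dirs took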
> running build_src > building extension "atlas_version" sources > adding 'build/src/atlas_version_0x33c6fa32.c' to sources. > running build_ext > customize UnixCCompiler > customize UnixCCompiler using build_ext > building 'atlas_version' extension > compiling C sources > gcc options: '-fno-strict-aliasing -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC' > compile options: '-I/opt/include -Inumpy/core/include -I/opt/app/g++lib6/python-2.4/include/python2.4 -c' > /opt/lang/gcc-3.4/bin/gcc -shared build/temp.solaris-2.8-i86pc-2.4/build/src/atlas_version_0x33c6fa32.o -L/home/titan/skipm/src/ATLAS/lib/SunOS_Babe -llapack -lf77blas -lcblas -latlas -o build/temp.solaris-2.8-i86pc-2.4/atlas_version.so > Text relocation remains referenced > against symbol offset in file > 0x7 /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o) > 0xc /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o) > ... bunch of missing s elided ... > printf 0x1b /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o) > printf 0x2d /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o) > printf 0x3f /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o) > printf 0x51 /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o) > ... what's this? can't find printf??? ... > ld: fatal: relocations remain against allocatable but non-writable sections > collect2: ld returned 1 exit status Hmm. Was ATLAS compiled -fPIC? I'm afraid I'm a little out of my depth when it comes to linking shared objects on Solaris. > Text relocation remains referenced > against symbol offset in file > 0x7 /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o) > ... more eliding ... > printf 0x108 /home/titan/skipm/src/ATLAS/lib/SunOS_Babe/libatlas.a(ATL_buildinfo.o) > ld: fatal: relocations remain against allocatable but non-writable sections > collect2: ld returned 1 exit status > ##### msg: error: Command "/opt/lang/gcc-3.4/bin/gcc -shared build/temp.solaris-2.8-i86pc-2.4/build/src/atlas_version_0x33c6fa32.o -L/home/titan/skipm/src/ATLAS/lib/SunOS_Babe -llapack -lf77blas -lcblas -latlas -o build/temp.solaris-2.8-i86pc-2.4/atlas_version.so" failed with exit status 1 > error: Command "/opt/lang/gcc-3.4/bin/gcc -shared build/temp.solaris-2.8-i86pc-2.4/build/src/atlas_version_0x33c6fa32.o -L/home/titan/skipm/src/ATLAS/lib/SunOS_Babe -llapack -lf77blas -lcblas -latlas -o build/temp.solaris-2.8-i86pc-2.4/atlas_version.so" failed with exit status 1 > FOUND: > libraries = ['lapack', 'f77blas', 'cblas', 'atlas'] > library_dirs = ['/home/titan/skipm/src/ATLAS/lib/SunOS_Babe'] > language = c > define_macros = [('NO_ATLAS_INFO', 2)] > include_dirs = ['/opt/include'] > > Warning: distutils distribution has been initialized, it may be too late to add an extension _dotblas > ... > > How can I initialize things earlier? Does it matter? You get messages like this when something previous goes wrong. There's nothing you can do to initialize things earlier except to make sure that the previous steps don't fail. It's not the most informative error message, I know. > Traceback (most recent call last): > File "setup.py", line 76, in ? 
> setup_package() > File "setup.py", line 63, in setup_package > config.add_subpackage('numpy') > File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 592, in add_subpackage > config_list = self.get_subpackage(subpackage_name,subpackage_path) > File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 582, in get_subpackage > subpackage_path) > File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 539, in _get_configuration_from_setup_py > config = setup_module.configuration(*args) > File "/home/ink/skipm/src/numpy/numpy/setup.py", line 10, in configuration > config.add_subpackage('core') > File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 592, in add_subpackage > config_list = self.get_subpackage(subpackage_name,subpackage_path) > File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 582, in get_subpackage > subpackage_path) > File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 539, in _get_configuration_from_setup_py > config = setup_module.configuration(*args) > File "numpy/core/setup.py", line 215, in configuration > config.add_data_dir('tests') > File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 636, in add_data_dir > self.add_data_files((ds,filenames)) > File "/home/ink/skipm/src/numpy/numpy/distutils/misc_util.py", line 702, in add_data_files > dist.data_files.extend(data_dict.items()) > AttributeError: 'NoneType' object has no attribute 'extend' > > And finally, a traceback. What's up with that? Essentially, the same issue here. Since an earlier step failed, dist.data_files is still None. -- Robert Kern robert.kern at gmail.com "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From alexander.belopolsky at gmail.com Tue Feb 21 14:24:05 2006 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue Feb 21 14:24:05 2006 Subject: [Numpy-discussion] Rookie problems - Why is C-code much faster? In-Reply-To: References: Message-ID: It turns out that around (or round_) is implemented in python: def round_(a, decimals=0): """Round 'a' to the given number of decimal places. Rounding behaviour is equivalent to Python. Return 'a' if the array is not floating point. Round both the real and imaginary parts separately if the array is complex. """ a = asarray(a) if not issubclass(a.dtype.type, _nx.inexact): return a if issubclass(a.dtype.type, _nx.complexfloating): return round_(a.real, decimals) + 1j*round_(a.imag, decimals) if decimals is not 0: decimals = asarray(decimals) s = sign(a) if decimals is not 0: a = absolute(multiply(a, 10.**decimals)) else: a = absolute(a) rem = a-asarray(a).astype(_nx.intp) a = _nx.where(_nx.less(rem, 0.5), _nx.floor(a), _nx.ceil(a)) # convert back if decimals is not 0: return multiply(a, s/(10.**decimals)) else: return multiply(a, s) I see many ways to improve the performance here. First, there is no need to check for "decimals is not 0" three times. This can be done once, maybe at the expense of some code duplication. Second, _nx.where(_nx.less(rem, 0.5), _nx.floor(a), _nx.ceil(a)) seems to be equivalent to _nx.floor(a+0.5). Finally, if rint is implemented as a ufunc as Mads originally suggested, "decimals is 0" branch can just call that. It is tempting to rewrite the whole thing in C, but before I do that I have a few questions about current implementation. 1. It is implemented in oldnumeric.py . Does this mean it is deprecated. If so, what is the recommended replacement? 2. 
Was it intended to support array and fractional values for decimals or is it an implementation artifact. Currently: >>> around(array([1.2345]*5),[1,2,3,4,5]) array([ 1.2 , 1.23 , 1.235 , 1.2345, 1.2345]) >>> around(1.2345,2.5) array(1.2332882874656679) 3. It does nothing to exact types, even if decimals<0 >>> around(1234, -2) array(1234) Is this a bug? Consider that >>> round(1234, -2) 1200.0 and >>> around(1234., -2) array(1200.0) Docstring is self-contradictory: "Rounding behaviour is equivalent to Python" is not consistent with "Return 'a' if the array is not floating point." I propose to deprecate around and implement a new "round" member function in C that will only accept scalar "decimals" and will behave like a properly vectorized builtin round. I will do the coding if there is interest. In any case, something has to be done here. I don't think the following timings are acceptable: > python -m timeit -s "from numpy import array; x = array([1.5]*1000)" "(x+0.5).astype(int).astype(float)" 100000 loops, best of 3: 18.8 usec per loop > python -m timeit -s "from numpy import array, around; x = array([1.5]*1000)" "around(x)" 10000 loops, best of 3: 155 usec per loop On 2/21/06, Sasha wrote: > On the second thought, the difference between around and astype is not > surprising because around operates in terms of decimals. Rather than > adding rint, I would suggest to make a special case decimals=0 use C > rint. > > > On 2/21/06, Mads Ipsen wrote: > > > I suggest that rint() is added as a ufunc or is there any concerns > > > here that I am not aware of? > > > > You might want to use astype(int). On my system it is much faster than around: > > > > > python -m timeit -s "from numpy import array, around; x = array([1.5]*1000)" "around(x)" > > 10000 loops, best of 3: 176 usec per loop > > > python -m timeit -s "from numpy import array, around; x = array([1.5]*1000)" "x.astype(int)" > > 100000 loops, best of 3: 3.2 usec per loop > > > > the difference is too big to be explained by the fact that around > > allocates twice as much memory for the result. In fact the following > > equivalent of rint is still very fast: > > > > > python -m timeit -s "from numpy import array, around; x = array([1.5]*1000)" "x.astype(int).astype(float)" > > 100000 loops, best of 3: 6.48 usec per loop > > > From tim.hochberg at cox.net Tue Feb 21 14:24:07 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Tue Feb 21 14:24:07 2006 Subject: [Numpy-discussion] Rookie problems - Why is C-code much faster? In-Reply-To: References: <633BB073-0A46-416A-96DF-080CF4A6DBDF@stanford.edu> <43FB65D1.5080707@cox.net> <43FB8933.1080400@cox.net> Message-ID: <43FB92D1.4020802@cox.net> Mads Ipsen wrote: >On Tue, 21 Feb 2006, Tim Hochberg wrote: > > > >>This all makes perfect sense, but what happended to box? In your >>original code there was a step where you did some mumbo jumbo and box >>and rint. Namely: >> >> > >It's a minor detail, but the reason for this is the following > >Suppose you have a line with length of box = 10 with periodic boundary >conditions (basically this is a circle). Now consider two points x0 = 1 >and x1 = 9 on this line. The shortest distance dx between the points x0 >and x1 is dx = -2 and not 8. The calculation > > dx = x1 - x0 ( = +8) > dx -= box*rint(dx/box) ( = -2) > >will give you the desired result, namely dx = -2. Hope this makes better >sense. Note that fmod() won't work since > > fmod(dx,box) = 8 > > I think you could use some variation like "fmod(dx+box/2, box) - box/2" but rint seems better. 
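[A quick numerical check of the two mappings discussed above -- my own
sketch, not from the thread. The fmod form only agrees with the rint form
when dx + box/2 is non-negative, which is presumably why rint "seems
better":]

    import numpy as np

    box = 10.0
    dx = np.array([8.0, -8.0, 3.0, -3.0])

    # minimum-image convention via rint: maps separations into [-box/2, box/2]
    dx_mi = dx - box * np.rint(dx / box)
    # dx_mi == [-2.,  2.,  3., -3.]

    # fmod variant: fmod keeps the sign of its first argument, so this only
    # matches the rint form when dx + box/2 >= 0
    dx_fmod = np.fmod(dx + box / 2, box) - box / 2
    # dx_fmod == [-2., -8.,  3., -3.]   note the -8 mismatch for dx = -8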
>Part of my original post was concerned with the fact, that I initially was >using around() from numpy for this step. This was terribly slow, so I made >some custom changes and added rint() from the C-math library to the numpy >module, giving a speedup factor about 4 for this particular line in the >code. > >Best regards // Mads > > > OK, that all makes sense. You might want to try the following, which factors out all the divisions and half the multiplies by box and produces several fewer temporaries. Note I replaced x**2 with x*x, which for the moment is much faster (I don't know if you've been following the endless yacking about optimizing x**n, but x**2 will get fast eventually). Depending on what you're doing with r2, you may be able to avoid the last multiple by box as well. # Loop over all particles xbox = x/box ybox = y/box for i in range(n-1): dx = xbox[i+1:] - xbox[i] dy = ybox[i+1:] - ybox[i] dx -= rint(dx) dy -= rint(dy) r2 = (dx*dx + dy*dy) r2 *= box Regards, -tim From gruben at bigpond.net.au Tue Feb 21 15:18:02 2006 From: gruben at bigpond.net.au (Gary Ruben) Date: Tue Feb 21 15:18:02 2006 Subject: [Numpy-discussion] wiki page for record arrays In-Reply-To: <20060221155513.GC14470@alpha> References: <20060221122737.GA14470@alpha> <43FB2298.2080003@bigpond.net.au> <20060221155513.GC14470@alpha> Message-ID: <43FB9F90.90606@bigpond.net.au> Thanks St?fan, I find this much better now. However, I'd like to hear suggestions from others if they can think of ways of further improving the style since I see this as a template for future tutorials. I'll just note that ipython on my windows systems doesn't do the syntax colouring the same, so if I was to make a similarly styled tutorial, there would be some variation in colouring. I also think others would be likely to use the default >>> Python prompt. I don't think this minor variation in styles would detract from getting the information across, so I wouldn't advocate trying to lock authors into any particular style. Good work, Gary Stefan van der Walt wrote: > Hi Gary > > Thanks for your suggestions. I incorporated them. > > St?fan From gruben at bigpond.net.au Tue Feb 21 15:34:01 2006 From: gruben at bigpond.net.au (Gary Ruben) Date: Tue Feb 21 15:34:01 2006 Subject: [Numpy-discussion] Rookie problems - Why is C-code much faster? In-Reply-To: References: <633BB073-0A46-416A-96DF-080CF4A6DBDF@stanford.edu> <43FB65D1.5080707@cox.net> Message-ID: <43FBA352.3030400@bigpond.net.au> Something like this would be great to see in scipy. Pity about the licence. Gary R. Alan G Isaac wrote: > On Tue, 21 Feb 2006, (CET) Mads Ipsen apparently wrote: >> My system consists of N particles, whose coordinates in >> the xy-plane is given by the two vectors x and y. I need >> to calculate the distance between all particle pairs > > Of possible interest? > http://www.cs.umd.edu/~mount/ANN/ > > Cheers, > Alan Isaac From cookedm at physics.mcmaster.ca Tue Feb 21 15:43:04 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Tue Feb 21 15:43:04 2006 Subject: [Numpy-discussion] optimizing power() for complex and real cases In-Reply-To: <43F938FA.80200@cox.net> (Tim Hochberg's message of "Sun, 19 Feb 2006 20:35:22 -0700") References: <43F20BFE.5030100@cox.net> <43F36CB6.5050004@cox.net> <43F3B8A9.3000507@bigpond.net.au> <43F3CA12.4000907@cox.net> <43F62069.80209@cox.net> <43F7C73B.2000806@cox.net> <20060220003714.GB15783@arbutus.physics.mcmaster.ca> <43F938FA.80200@cox.net> Message-ID: Tim Hochberg writes: > David M. 
Cooke wrote: > >>On Sat, Feb 18, 2006 at 06:17:47PM -0700, Tim Hochberg wrote: >> >> >>>OK, I now have a faily clean implementation in C of: >>> >>>def __pow__(self, p): >>> if p is not a scalar: >>> return power(self, p) >>> elif p == 1: >>> return p >>> elif p == 2: >>> return square(self) >>># elif p == 3: >>># return cube(self) >>># elif p == 4: >>># return power_4(self) >>># elif p == 0: >>># return ones(self.shape, dtype=self.dtype) >>># elif p == -1: >>># return 1.0/self >>> elif p == 0.5: >>> return sqrt(self) I've gone through your code you checked in, and fixed it up. Looks good. One side effect is that def zl(x): a = ones_like(x) a[:] = 0 return a is now faster than zeros_like(x) :-) One problem I had is that in PyArray_SetNumericOps, the "copy" method wasn't picked up on. It may be due to the order of initialization of the ndarray type, or something (since "copy" isn't a ufunc, it's initialized in a different place). I couldn't figure out how to fiddle that, so I replaced the x.copy() call with a call to PyArray_Copy(). >>Yes; because it's the implementation of __pow__, the second argument can >>be anything. >> >> > No, you misunderstand.. What I was talking about was that the *first* > argument can also be something that's not a PyArrayObject, despite the > functions signature. Ah, I suppose that's because the power slot in the number protocol also handles __rpow__. >>> On the other hand, real powers are fast enough that doing anything >>> at the single element level is unlikely to help. So in that case >>> we're left with either optimizing the cases where the dimension is >>> zero as David has done, or optimizing at the __pow__ (AKA >>> array_power) level as I've done now based on David's original >>> suggestion. This second approach is faster because it avoids the >>> mysterious python scalar -> zero-D array conversion overhead. >>> However, it suffers if we want to optimize lots of different powers >>> since one needs a ufunc for each one. So the question becomes, >>> which powers should we optimize? >> >>Hmm, ufuncs are passed a void* argument for passing info to them. Now, >>what that argument is defined when the ufunc is created, but maybe >>there's a way to piggy-back on it. >> >> > Yeah, I really felt like I was fighting the ufuncs when I was playing > with this. On the one hand, you really want to use the ufunc > machinery. On the other hand that forces you into using the same types > for both arguments. That really wouldn't be a problem, since we could > just define an integer_power that took doubles, but did integer > powers, except for the conversion overhead of Python_Integers into > arrays. It looks like you started down this road and I played with > this as well. I can think a of at least one (horrible) way around > the matrix overhead, but the real fix would be to dig into > PyArray_EnsureArray and see why it's slow for Python_Ints. It is much > faster for numarray scalars. Right; that needs to be looked at. > Another approach is to actually compute (x*x)*(x*x) for pow(x,4) at > the level of array_power. I think I could make this work. It would > probably work well for medium size arrays, but might well make things > worse for large arrays that are limited by memory bandwidth since it > would need to move the array from memory into the cache multiple times. I don't like that; I think it would be better memory-wise to do it elementwise. Not just speed, but size of intermediate arrays. 
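[To make the temporaries point concrete, here is a minimal sketch (mine, in
modern numpy spelling) contrasting whole-array repeated squaring with an
in-place version that reuses one buffer via the ufunc output argument Tim
mentioned earlier:]

    import numpy as np

    x = np.random.rand(1000000)

    # whole-array repeated squaring: two full-size temporaries live at once
    t = x * x
    x4 = t * t

    # reusing one buffer via the ufunc output argument keeps a single
    # intermediate array alive instead
    out = np.multiply(x, x)
    np.multiply(out, out, out)      # out is now x**4, computed in place
    assert np.allclose(out, x4)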
>>> My latest thinking on this is that we should optimize only those
>>> cases where the optimized result is no less accurate than that
>>> produced by pow. I'm going to assume that all C operations are
>>> equivalently accurate, so pow(x,2) has roughly the same amount of
>>> error as x*x. (Something on the order of 0.5 ULP I'd guess). In
>>> that case:
>>>    pow(x, -1)  -> 1 / x
>>>    pow(x, 0)   -> 1
>>>    pow(x, 0.5) -> sqrt(x)
>>>    pow(x, 1)   -> x
>>>    pow(x, 2)   -> x*x
>>> can all be implemented in terms of multiply or divide with the same
>>> accuracy as the original power methods. Once we get beyond these,
>>> the error will go up progressively.
>>>
>>> The minimal set described above seems like it should be relatively
>>> uncontroversial and it's what I favor. Once we get beyond this
>>> basic set, we would need to reach some sort of consensus on how
>>> much additional error we are willing to tolerate for optimizing
>>> these extra cases. You'll notice that I've changed my mind, yet
>>> again, over whether to optimize A**0.5. Since the set of additional
>>> ufuncs needed in this case is relatively small, just square and
>>> inverse (==1/x), this minimal set works well if optimizing in pow
>>> as I've done.
>>>
>
> Just to add a little more confusion to the mix. I did a little testing
> to see how close pow(x,n) and x*x*... actually are. They are slightly
> less close for small values of N and slightly closer for large values
> of N than I would have expected. The upshot of this is that integer
> powers between -2 and +4 all seem to vary by the same amount when
> computed using pow(x,n) versus multiplies. I'm including the test code
> at the end. Assuming that this result is not a fluke, that expands the
> noncontroversial set by at least 3 more values. That's starting to
> strain the ufunc approach, so perhaps optimizing in @TYP@_power is the
> way to go after all. Or, more likely, adding @TYP@_int_power or maybe
> @TYP@_fast_power (so as to be able to include some half integer
> powers) and dispatching appropriately from array_power.

'int_power' we could do; that would be the next step, I think. The
half-integer powers we could maybe leave; if you want x**(-3/2), for
instance, you could do y = x**(-1)/sqrt(x) (or y = sqrt(x); y *= x;
divide(1, y, y) if you're worried about temporaries).

Or, 'fast_power' could be documented as doing the optimizations for integer
and half-integer _scalar_ exponents (up to a certain size, like 100), and
falling back on pow() if necessary. I think we could do a precomputation
step to split the exponent into appropriate squarings and such that'll
make the elementwise loop faster. Half-integer exponents are exactly
representable as doubles (up to some number, of course), so there's no
chance of decimal-to-binary conversions making things look different.

That might work out ok. Although, at that point I'd suggest we make it
'power', and have 'rawpower' (or ????) as the version that just uses pow().

Another point is to look at __div__, and use reciprocal if the dividend
is 1.

> The problem here, of course, is the overhead that PyArray_EnsureArray
> runs into. I'm not sure if the ufuncs actually call that, but I was
> using that to convert things to arrays at one point and I saw the
> slowdown, so I suspect that the slowdown is in something
> PyArray_EnsureArray calls if not in that routine itself. I'm afraid to
> dig into that stuff though. On the other hand, it would probably
> speed up all kinds of stuff if that was sped up.
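[David's "precomputation step to split the exponent into squarings" is
essentially binary exponentiation. A rough Python sketch of the control
flow -- a hypothetical helper of mine, not numpy's actual implementation:]

    import numpy as np

    def int_power(x, n):
        """Elementwise x**n for a non-negative integer scalar n, by
        square-and-multiply: O(log n) array multiplies."""
        result = np.ones_like(x)
        base = x.copy()
        while n > 0:
            if n & 1:               # this bit of the exponent is set
                result = result * base
            n >>= 1
            if n:                   # square the base for the next bit
                base = base * base
        return result

    x = np.array([1.5, 2.0, 3.0])
    assert np.allclose(int_power(x, 7), x**7)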
I've added a page to the developer's wiki at http://projects.scipy.org/scipy/numpy/wiki/PossibleOptimizationAreas to keep a list of areas like that to look into if someone has time :-) -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From svetosch at gmx.net Tue Feb 21 15:49:02 2006 From: svetosch at gmx.net (Sven Schreiber) Date: Tue Feb 21 15:49:02 2006 Subject: [Numpy-discussion] mixing arrays and matrices: squeeze yes, flattened no? In-Reply-To: <43FB6604.6000406@ieee.org> References: <43FB0A96.10803@gmx.net> <43FB6604.6000406@ieee.org> Message-ID: <43FBA6C6.6040507@gmx.net> Travis Oliphant schrieb: > Sven Schreiber wrote: > >> Next, I try to workaround by b.squeeze(). That seems to work, but why is >> b.squeeze().shape == (1, 112) instead of (112,)? >> >> > Again the same reason as before. A matrix is returned from b.squeeze() > and there are no 1-d matrices. Thus, you get a row-vector. Use .T if > you want a column vector. Well if squeeze can't really squeeze matrix-vectors (doing a de-facto transpose instead), wouldn't it make more sense to disable the squeeze method for matrices altogether? > >> Then I thought maybe b.flattened() does the job, but then I get an error >> (matrix has no attr flattened). Again, I'm baffled. >> >> > The correct spelling is b.flatten() Ok, but I copied .flattened() from p. 48 of your book, must be a typo then. > You can mix arrays and matrices just fine if you remember that 1d arrays > are equivalent to row-vectors. > -Travis > Ok, thanks. Btw, did the recent numpy release change anything in terms of preserving matrix types when passing to decompositions etc? I checked the release notes but maybe they're just not verbose enough. -Sven From svetosch at gmx.net Tue Feb 21 15:57:00 2006 From: svetosch at gmx.net (Sven Schreiber) Date: Tue Feb 21 15:57:00 2006 Subject: [Numpy-discussion] mixing arrays and matrices: squeeze yes, flattened no? In-Reply-To: <43FB7C87.1010007@noaa.gov> References: <43FB0A96.10803@gmx.net> <43FB6604.6000406@ieee.org> <43FB7C87.1010007@noaa.gov> Message-ID: <43FBA8B2.8010708@gmx.net> Christopher Barker schrieb: > > and you can easily get a column vector out of an array, if you remember > that you want to keep it 2-d. i.e. use a slice rather than an index: > >>>> import numpy as N >>>> a = N.ones((5,10)) >>>> a[:,1].shape # an index: it reduces the rank > (5,) >>>> a[:,1:2].shape # a slice: it keeps the rank > (5, 1) > That's very interesting, thanks. But I find it a little unintuitive/surprising, so I'm not sure if I will use it. I fear that I wouldn't understand my own code after a while of not working on it. I guess I'd rather follow the advice and just remember to treat 1d as a row. But thanks alot, sven From Chris.Barker at noaa.gov Tue Feb 21 16:47:01 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue Feb 21 16:47:01 2006 Subject: [Numpy-discussion] Rookie problems - Why is C-code much faster? In-Reply-To: <43FB92D1.4020802@cox.net> References: <633BB073-0A46-416A-96DF-080CF4A6DBDF@stanford.edu> <43FB65D1.5080707@cox.net> <43FB8933.1080400@cox.net> <43FB92D1.4020802@cox.net> Message-ID: <43FBB444.1040504@noaa.gov> > r2 = (dx*dx + dy*dy) Might numpy.hypot() help here? -Chris -- Christopher Barker, Ph.D. 
Oceanographer
NOAA/OR&R/HAZMAT           (206) 526-6959 voice
7600 Sand Point Way NE     (206) 526-6329 fax
Seattle, WA 98115          (206) 526-6317 main reception
Chris.Barker at noaa.gov

From Chris.Barker at noaa.gov Tue Feb 21 16:56:01 2006
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Tue Feb 21 16:56:01 2006
Subject: [Numpy-discussion] mixing arrays and matrices: squeeze yes, flattened no?
In-Reply-To: <43FBA8B2.8010708@gmx.net>
References: <43FB0A96.10803@gmx.net> <43FB6604.6000406@ieee.org>
 <43FB7C87.1010007@noaa.gov> <43FBA8B2.8010708@gmx.net>
Message-ID: <43FBB65C.4040001@noaa.gov>

Sven Schreiber wrote:
>>>>> a = N.ones((5,10))
>>>>> a[:,1].shape # an index: it reduces the rank
>> (5,)
>>>>> a[:,1:2].shape # a slice: it keeps the rank
>> (5, 1)
>>
> That's very interesting, thanks. But I find it a little
> unintuitive/surprising, so I'm not sure if I will use it. I fear that I
> wouldn't understand my own code after a while of not working on it.

Well, what's surprising to different people is different. However....

> I guess I'd rather follow the advice and just remember to treat 1d as a row.

Except that it's not, universally. For instance, it won't transpose:

>>> a = N.ones((5,))
>>> a.transpose()
array([1, 1, 1, 1, 1])
>>> a.shape = (1,-1)
>>> a
array([[1, 1, 1, 1, 1]])
>>> a.transpose()
array([[1],
       [1],
       [1],
       [1],
       [1]])

so while a rank-1 array is often treated like a row vector, it really isn't
the same. The concept of a row vs a column vector is a rank-2 array concept
-- so keep your arrays rank-2. It's very helpful to remember that indexing
reduces rank, and slicing keeps the rank the same. It will serve you well
to use that in the future anyway.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer
NOAA/OR&R/HAZMAT           (206) 526-6959 voice
7600 Sand Point Way NE     (206) 526-6329 fax
Seattle, WA 98115          (206) 526-6317 main reception
Chris.Barker at noaa.gov

From ndarray at mac.com Tue Feb 21 17:06:08 2006
From: ndarray at mac.com (Sasha)
Date: Tue Feb 21 17:06:08 2006
Subject: [Numpy-discussion] A proposal to implement round in C Was: Rookie problems - Why is C-code much faster?
Message-ID: 

[I am reposting this under a different subject because my original post got
buried in a long thread that went on to discussing unrelated topics. Sorry
if you had to read this post twice.]

I propose to deprecate around and implement a new "round" member function
in C that will only accept scalar "decimals" and will behave like a
properly vectorized builtin round. I will do the coding if there is
interest.

In any case, something has to be done here. I don't think the following
timings are acceptable:

> python -m timeit -s "from numpy import array; x = array([1.5]*1000)" "(x+0.5).astype(int).astype(float)"
100000 loops, best of 3: 18.8 usec per loop
> python -m timeit -s "from numpy import array, around; x = array([1.5]*1000)" "around(x)"
10000 loops, best of 3: 155 usec per loop

From skip at pobox.com Tue Feb 21 18:19:01 2006
From: skip at pobox.com (skip at pobox.com)
Date: Tue Feb 21 18:19:01 2006
Subject: [Numpy-discussion] Re: Problems building numpy w/ ATLAS on Solaris 8
In-Reply-To: 
References: <17403.30180.676655.892180@montanaro.dyndns.org>
Message-ID: <17403.51674.348469.279480@montanaro.dyndns.org>

    Robert> Hmm. Was ATLAS compiled -fPIC?

I'm not certain, but I doubt it should matter since only .a files were
generated. There's nothing to relocate:

    $ ls -ltr
    total 9190
    lrwxrwxrwx   1 skipm    develop       41 Feb  9 14:51 Make.inc -> /home/ink/skipm/src/ATLAS/Make.SunOS_Babe
    -rw-r--r--   1 skipm    develop     1529 Feb  9 14:51 Makefile
    -rw-r--r--   1 skipm    develop   236004 Feb  9 14:57 libtstatlas.a
    -rw-r--r--   1 skipm    develop   241352 Feb  9 16:28 libcblas.a
    -rw-r--r--   1 skipm    develop   280464 Feb  9 16:33 libf77blas.a
    -rw-r--r--   1 skipm    develop   278616 Feb  9 16:34 liblapack.a
    -rw-r--r--   1 skipm    develop  3603644 Feb  9 16:36 libatlas.a

    Robert> You get messages like this when something previous goes
    Robert> wrong.

Thanks. Now I know to focus on only the first problem...

Skip

From zpincus at stanford.edu Tue Feb 21 19:15:02 2006
From: zpincus at stanford.edu (Zachary Pincus)
Date: Tue Feb 21 19:15:02 2006
Subject: [Numpy-discussion] Shouldn't singular_value_decomposition respect full_matrices?
Message-ID: <29AE3219-9E29-45EC-BA94-E9487E983A2D@stanford.edu>

numpy.linalg.singular_value_decomposition is defined as follows:

def singular_value_decomposition(A, full_matrices=0):
    return svd(A, 0)

Shouldn't that last line be

    return svd(A, full_matrices)

Zach

From robert.kern at gmail.com Tue Feb 21 19:19:01 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Tue Feb 21 19:19:01 2006
Subject: [Numpy-discussion] Re: Problems building numpy w/ ATLAS on Solaris 8
In-Reply-To: <17403.51674.348469.279480@montanaro.dyndns.org>
References: <17403.30180.676655.892180@montanaro.dyndns.org>
 <17403.51674.348469.279480@montanaro.dyndns.org>
Message-ID: 

skip at pobox.com wrote:

> Robert> Hmm.
Was ATLAS compiled -fPIC? > > I'm not certain, but I doubt it should matter since only .a files were > generated. There's nothing to relocate: > > $ ls -ltr > total 9190 > lrwxrwxrwx 1 skipm develop 41 Feb 9 14:51 Make.inc -> /home/ink/skipm/src/ATLAS/Make.SunOS_Babe > -rw-r--r-- 1 skipm develop 1529 Feb 9 14:51 Makefile > -rw-r--r-- 1 skipm develop 236004 Feb 9 14:57 libtstatlas.a > -rw-r--r-- 1 skipm develop 241352 Feb 9 16:28 libcblas.a > -rw-r--r-- 1 skipm develop 280464 Feb 9 16:33 libf77blas.a > -rw-r--r-- 1 skipm develop 278616 Feb 9 16:34 liblapack.a > -rw-r--r-- 1 skipm develop 3603644 Feb 9 16:36 libatlas.a Google suggests that it does matter. E.g. http://mail.python.org/pipermail/python-dev/2001-March/013510.html http://bugs.mysql.com/bug.php?id=14202 http://mail.python.org/pipermail/image-sig/2002-June/001884.html -- Robert Kern robert.kern at gmail.com "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From skip at pobox.com Tue Feb 21 19:59:02 2006 From: skip at pobox.com (skip at pobox.com) Date: Tue Feb 21 19:59:02 2006 Subject: [Numpy-discussion] Re: Problems building numpy w/ ATLAS on Solaris 8 In-Reply-To: References: <17403.30180.676655.892180@montanaro.dyndns.org> <17403.51674.348469.279480@montanaro.dyndns.org> Message-ID: <17403.57676.925970.142021@montanaro.dyndns.org> Robert> Google suggests that it does matter. E.g. Robert> http://mail.python.org/pipermail/python-dev/2001-March/013510.html Robert> http://bugs.mysql.com/bug.php?id=14202 Robert> http://mail.python.org/pipermail/image-sig/2002-June/001884.html *sigh* Thanks. You'd think that Solaris was a common enough platform that the ATLAS folks would get this right... Skip From nadavh at visionsense.com Tue Feb 21 22:59:02 2006 From: nadavh at visionsense.com (Nadav Horesh) Date: Tue Feb 21 22:59:02 2006 Subject: [Numpy-discussion] algorithm, optimization, or other problem? Message-ID: <07C6A61102C94148B8104D42DE95F7E8C8EF39@exchange2k.envision.co.il> You may get a significant boost by replacing the line: w=w+ eta * (y*x - y**2*w) with w *= 1.0 - eta*y*y w += eta*y*x I ran a test on a similar expression and got 5 fold speed increase. The dot() function runs faster if you compile with dotblas. Nadav. -----Original Message----- From: numpy-discussion-admin at lists.sourceforge.net on behalf of Bruce Southey Sent: Tue 21-Feb-06 17:15 To: Brian Blais Cc: python-list at python.org; numpy-discussion at lists.sourceforge.net; scipy-user at scipy.net Subject: Re: [Numpy-discussion] algorithm, optimization, or other problem? Hi, In the current version, note that Y is scalar so replace the squaring (Y**2) with Y*Y as you do in the dohebb function. On my system without blas etc removing the squaring removes a few seconds (16.28 to 12.4). It did not seem to help factorizing Y. Also, eta and tau are constants so define them only once as scalars outside the loops and do the division outside the loop. It only saves about 0.2 seconds but these add up. The inner loop probably can be vectorized because it is just vector operations on a matrix. You are just computing over the ith dimension of X. I think that you could be able to find the matrix version on the net. Regards Bruce On 2/21/06, Brian Blais wrote: > Hello, > > I am trying to translate some Matlab/mex code to Python, for doing neural > simulations. This application is definitely computing-time limited, and I need to > optimize at least one inner loop of the code, or perhaps even rethink the algorithm. 
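[To spell out Nadav's in-place rewrite at the top of this message: since y
is a scalar, w + eta*(y*x - y**2*w) factors as w*(1 - eta*y*y) + eta*y*x,
so the update can be done with two in-place operations and no temporaries
the size of w. A small self-contained check of mine; shapes and values are
illustrative:]

    import numpy as np

    eta = 0.001
    x = np.random.rand(100, 1)
    w = np.random.rand(100, 1)
    y = (x.T @ w).item()            # scalar output of the unit

    ref = w + eta * (y * x - y**2 * w)   # original update, with temporaries

    w *= 1.0 - eta * y * y               # in-place: w *= (1 - eta*y^2)
    w += eta * y * x                     #           w += eta*y*x
    assert np.allclose(w, ref)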
> The procedure is very simple, after initializing any variables: > > 1) select a random input vector, which I will call "x". right now I have it as an > array, and I choose columns from that array randomly. in other cases, I may need to > take an image, select a patch, and then make that a column vector. > > 2) calculate an output value, which is the dot product of the "x" and a weight > vector, "w", so > > y=dot(x,w) > > 3) modify the weight vector based on a matrix equation, like: > > w=w+ eta * (y*x - y**2*w) > ^ > | > +---- learning rate constant > > 4) repeat steps 1-3 many times > > I've organized it like: > > for e in 100: # outer loop > for i in 1000: # inner loop > (steps 1-3) > > display things. > > so that the bulk of the computation is in the inner loop, and is amenable to > converting to a faster language. This is my issue: > > straight python, in the example posted below for 250000 inner-loop steps, takes 20 > seconds for each outer-loop step. I tried Pyrex, which should work very fast on such > a problem, takes about 8.5 seconds per outer-loop step. The same code as a C-mex > file in matlab takes 1.5 seconds per outer-loop step. > > Given the huge difference between the Pyrex and the Mex, I feel that there is > something I am doing wrong, because the C-code for both should run comparably. > Perhaps the approach is wrong? I'm willing to take any suggestions! I don't mind > coding some in C, but the Python API seemed a bit challenging to me. > > One note: I am using the Numeric package, not numpy, only because I want to be able > to use the Enthought version for Windows. I develop on Linux, and haven't had a > chance to see if I can compile numpy using the Enthought Python for Windows. > > If there is anything else anyone needs to know, I'll post it. I put the main script, > and a dohebb.pyx code below. > > > thanks! 
>
> Brian Blais
>
> --
> -----------------
>
> bblais at bryant.edu
> http://web.bryant.edu/~bblais
>
> # Main script:
>
> from dohebb import *
> import pylab as p
> from Numeric import *
> from RandomArray import *
> import time
>
> x=random((100,1000))    # 1000 input vectors
>
> numpats=x.shape[0]
> w=random((numpats,1));
>
> th=random((1,1))
>
> params={}
> params['eta']=0.001;
> params['tau']=100.0;
> old_mx=0;
> for e in range(100):
>
>     rnd=randint(0,numpats,250000)
>     t1=time.time()
>     if 0:  # straight python
>         for i in range(len(rnd)):
>             pat=rnd[i]
>             xx=reshape(x[:,pat],(1,-1))
>             y=matrixmultiply(xx,w)
>             w=w+params['eta']*(y*transpose(xx)-y**2*w);
>             th=th+(1.0/params['tau'])*(y**2-th);
>     else:  # pyrex
>         dohebb(params,w,th,x,rnd)
>     print time.time()-t1
>
> p.plot(w,'o-')
> p.xlabel('weights')
> p.show()
>
> #=============================================
>
> # dohebb.pyx
>
> cdef extern from "Numeric/arrayobject.h":
>
>     struct PyArray_Descr:
>         int type_num, elsize
>         char type
>
>     ctypedef class Numeric.ArrayType [object PyArrayObject]:
>         cdef char *data
>         cdef int nd
>         cdef int *dimensions, *strides
>         cdef object base
>         cdef PyArray_Descr *descr
>         cdef int flags
>
> def dohebb(params,ArrayType w,ArrayType th,ArrayType X,ArrayType rnd):
>
>     cdef int num_iterations
>     cdef int num_inputs
>     cdef int offset
>     cdef double *wp,*xp,*thp
>     cdef int *rndp
>     cdef double eta,tau
>
>     eta=params['eta']    # learning rate
>     tau=params['tau']    # used for variance estimate
>
>     cdef double y
>     num_iterations=rnd.dimensions[0]
>     num_inputs=w.dimensions[0]
>
>     # get the pointers
>     wp=<double *>w.data
>     xp=<double *>X.data
>     rndp=<int *>rnd.data
>     thp=<double *>th.data
>
>     for it from 0 <= it < num_iterations:
>
>         offset=rndp[it]*num_inputs
>
>         # calculate the output
>         y=0.0
>         for i from 0 <= i < num_inputs:
>             y=y+wp[i]*xp[i+offset]
>
>         # change in the weights
>         for i from 0 <= i < num_inputs:
>             wp[i]=wp[i]+eta*(y*xp[i+offset] - y*y*wp[i])
>
>         # estimate the variance
>         thp[0]=thp[0]+(1.0/tau)*(y**2-thp[0])

From zpincus at stanford.edu Wed Feb 22 00:50:04 2006
From: zpincus at stanford.edu (Zachary Pincus)
Date: Wed Feb 22 00:50:04 2006
Subject: [Numpy-discussion] Method to shift elements in an array?
Message-ID: <1AC06527-8B7F-4FC4-B83E-462F31D3A431@stanford.edu>

Hello folks,

Does numpy have a built-in mechanism to shift elements along some axis in
an array?
to "roll" [0,1,2,3] by some offset, here 2, to make [2,3,0,1]) If not,
what would be the fastest way to implement this in python? Using take?
Using slicing and concatenation?

Zach

From mpi at osc.kiku.dk Wed Feb 22 02:27:02 2006
From: mpi at osc.kiku.dk (Mads Ipsen)
Date: Wed Feb 22 02:27:02 2006
Subject: [Numpy-discussion] A proposal to implement round in C Was: Rookie problems - Why is C-code much faster?
In-Reply-To: 
References: 
Message-ID: 

On Tue, 21 Feb 2006, Sasha wrote:

> > python -m timeit -s "from numpy import array; x = array([1.5]*1000)" "(x+0.5).astype(int).astype(float)"
> 100000 loops, best of 3: 18.8 usec per loop
>
> python -m timeit -s

I just want to point out that the function

  foo(x) = (x+0.5).astype(int).astype(float)

is different from around. For

  x = array([1.2, 1.8])

it works, but for

  x = array([-1.2, -1.8])

you get

  around(x) = array([-1., -2.])

whereas foo(x) gives

  foo(x) = array([0., -1.])

Using

  foo(x) = where(greater(x,0),x+0.5,x-0.5).astype(int).astype(float)

will work.

// Mads

From schofield at ftw.at Wed Feb 22 02:48:06 2006
From: schofield at ftw.at (Ed Schofield)
Date: Wed Feb 22 02:48:06 2006
Subject: [Numpy-discussion] A proposal to implement round in C
In-Reply-To: 
References: 
Message-ID: <43FC412D.2050402@ftw.at>

Sasha wrote:

>I propose to deprecate around and implement a new "round" member
>function in C that will only accept scalar "decimals" and will behave
>like a properly vectorized builtin round.  I will do the coding if
>there is interest.
>
>In any case, something has to be done here.  I don't think the
>following timings are acceptable:
>
This sounds great to me :)

-- Ed

From mpi at osc.kiku.dk Wed Feb 22 03:56:04 2006
From: mpi at osc.kiku.dk (Mads Ipsen)
Date: Wed Feb 22 03:56:04 2006
Subject: [Numpy-discussion] Rookie problems - Why is C-code much faster?
In-Reply-To: <43FB92D1.4020802@cox.net>
References: <633BB073-0A46-416A-96DF-080CF4A6DBDF@stanford.edu> <43FB65D1.5080707@cox.net> <43FB8933.1080400@cox.net> <43FB92D1.4020802@cox.net>
Message-ID: 

On Tue, 21 Feb 2006, Tim Hochberg wrote:

> Mads Ipsen wrote:
>
> >On Tue, 21 Feb 2006, Tim Hochberg wrote:
> >
> >>This all makes perfect sense, but what happened to box? In your
> >>original code there was a step where you did some mumbo jumbo with box
> >>and rint. Namely:
> >>
> >
> >It's a minor detail, but the reason for this is the following.
> >
> >Suppose you have a line with length of box = 10 with periodic boundary
> >conditions (basically this is a circle). Now consider two points x0 = 1
> >and x1 = 9 on this line. The shortest distance dx between the points x0
> >and x1 is dx = -2 and not 8. The calculation
> >
> >  dx  = x1 - x0            ( = +8)
> >  dx -= box*rint(dx/box)   ( = -2)
> >
> >will give you the desired result, namely dx = -2. Hope this makes better
> >sense. Note that fmod() won't work, since
> >
> >  fmod(dx,box) = 8
> >
> I think you could use some variation like "fmod(dx+box/2, box) - box/2"
> but rint seems better.
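--------------------------------------------------------------------
To make the minimum-image trick above concrete, here is a small,
self-contained numpy sketch (an illustration only, not code from the
thread; it assumes a rint ufunc is available, which Mads notes he had
to add to his own numpy build at the time):

import numpy

def min_image(x0, x1, box):
    # Shortest signed separation between x0 and x1 on a periodic
    # line of length box (the minimum-image convention).
    dx = x1 - x0
    # rint rounds to the nearest integer, which maps dx into
    # the interval [-box/2, box/2].
    return dx - box * numpy.rint(dx / box)

# The example from the post: x0 = 1 and x1 = 9 on a circle of
# length 10 gives -2, not +8.
print min_image(1.0, 9.0, 10.0)    # -2.0
--------------------------------------------------------------------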
> >Part of my original post was concerned with the fact that I initially was
> >using around() from numpy for this step. This was terribly slow, so I made
> >some custom changes and added rint() from the C-math library to the numpy
> >module, giving a speedup factor of about 4 for this particular line in the
> >code.
> >
> >Best regards // Mads
> >
> OK, that all makes sense. You might want to try the following, which
> factors out all the divisions and half the multiplies by box and
> produces several fewer temporaries. Note I replaced x**2 with x*x,
> which for the moment is much faster (I don't know if you've been
> following the endless yacking about optimizing x**n, but x**2 will get
> fast eventually). Depending on what you're doing with r2, you may be
> able to avoid the last multiply by box as well.
>
> # Loop over all particles
> xbox = x/box
> ybox = y/box
> for i in range(n-1):
>     dx = xbox[i+1:] - xbox[i]
>     dy = ybox[i+1:] - ybox[i]
>     dx -= rint(dx)
>     dy -= rint(dy)
>     r2 = (dx*dx + dy*dy)
>     r2 *= box*box    # box**2, since dx and dy are in units of box here
>
> Regards,
>
> -tim
>
Thanks Tim,

I am only a factor 2.5 slower than the C loop now, thanks to your
suggestions.

// Mads

From mfmorss at aep.com Wed Feb 22 06:07:32 2006
From: mfmorss at aep.com (mfmorss at aep.com)
Date: Wed Feb 22 06:07:32 2006
Subject: [Numpy-discussion] Trouble installing Numpy on AIX 5.2.
Message-ID: 

I built Python successfully on our AIX 5.2 server using "./configure
--without-cxx --disable-ipv6".  (This uses the native IBM C compiler,
invoking it as "cc_r".  We have no C++ compiler.)

But I have been unable to install Numpy-0.9.5 using the same compiler.
After "python setup.py install," the relevant section of the output was:

compile options: '-Ibuild/src/numpy/core/src -Inumpy/core/include
-Ibuild/src/numpy/core -Inumpy/core/src -Inumpy/core/include
-I/pydirectory/include/python2.4 -c'
cc_r: build/src/numpy/core/src/umathmodule.c
"build/src/numpy/core/src/umathmodule.c", line 2566.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW.
"build/src/numpy/core/src/umathmodule.c", line 2584.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW.
"build/src/numpy/core/src/umathmodule.c", line 2602.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW.
"build/src/numpy/core/src/umathmodule.c", line 2620.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW.
"build/src/numpy/core/src/umathmodule.c", line 2638.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW.
"build/src/numpy/core/src/umathmodule.c", line 2654.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW.
"build/src/numpy/core/src/umathmodule.c", line 2674.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW.
"build/src/numpy/core/src/umathmodule.c", line 2694.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW.
"build/src/numpy/core/src/umathmodule.c", line 2714.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW.
"build/src/numpy/core/src/umathmodule.c", line 2734.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW.
"build/src/numpy/core/src/umathmodule.c", line 9307.32: 1506-280 (W) Function argument assignment between types "long double*" and "double*" is not allowed.
"build/src/numpy/core/src/umathmodule.c", line 2566.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW.
"build/src/numpy/core/src/umathmodule.c", line 2584.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW.
"build/src/numpy/core/src/umathmodule.c", line 2602.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW.
"build/src/numpy/core/src/umathmodule.c", line 2620.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW.
"build/src/numpy/core/src/umathmodule.c", line 2638.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2654.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2674.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2694.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2714.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 2734.25: 1506-045 (S) Undeclared identifier FE_OVERFLOW. "build/src/numpy/core/src/umathmodule.c", line 9307.32: 1506-280 (W) Function argument assignment between types "long double*" and "double*" is not allowed. error: Command "cc_r -DNDEBUG -O -Ibuild/src/numpy/core/src -Inumpy/core/include -Ibuild/src/numpy/core -Inumpy/core/src -Inumpy/core/include -I/app/sandbox/s625662/installed/include/python2.4 -c build/src/numpy/core/src/umathmodule.c -o build/temp.aix-5.2-2.4 /build/src/numpy/core/src/umathmodule.o" failed with exit status 1 A closely related question is, how can I modify the Numpy setup.py and/or distutils files to enable me to control the options with which cc_r is invoked? I inspected these files, but not being very expert in Python, I could not figure this out. Mark F. Morss Principal Analyst, Market Risk American Electric Power From bsouthey at gmail.com Wed Feb 22 06:25:05 2006 From: bsouthey at gmail.com (Bruce Southey) Date: Wed Feb 22 06:25:05 2006 Subject: [Numpy-discussion] algorithm, optimization, or other problem? In-Reply-To: <07C6A61102C94148B8104D42DE95F7E8C8EF39@exchange2k.envision.co.il> References: <07C6A61102C94148B8104D42DE95F7E8C8EF39@exchange2k.envision.co.il> Message-ID: Hi, Actually it makes it slightly worse - given the responses on another thread it is probably due to not pushing enough into C code. Obviously use of blas etc will be faster but it doesn't change the fact that removing the inner loop would be faster still. Bruce On 2/22/06, Nadav Horesh wrote: > You may get a significant boost by replacing the line: > w=w+ eta * (y*x - y**2*w) > with > w *= 1.0 - eta*y*y > w += eta*y*x > > I ran a test on a similar expression and got 5 fold speed increase. > The dot() function runs faster if you compile with dotblas. > > Nadav. > > > -----Original Message----- > From: numpy-discussion-admin at lists.sourceforge.net on behalf of Bruce Southey > Sent: Tue 21-Feb-06 17:15 > To: Brian Blais > Cc: python-list at python.org; numpy-discussion at lists.sourceforge.net; scipy-user at scipy.net > Subject: Re: [Numpy-discussion] algorithm, optimization, or other problem? > Hi, > In the current version, note that Y is scalar so replace the squaring > (Y**2) with Y*Y as you do in the dohebb function. On my system > without blas etc removing the squaring removes a few seconds (16.28 to > 12.4). It did not seem to help factorizing Y. > > Also, eta and tau are constants so define them only once as scalars > outside the loops and do the division outside the loop. It only saves > about 0.2 seconds but these add up. > > The inner loop probably can be vectorized because it is just vector > operations on a matrix. You are just computing over the ith dimension > of X. I think that you could be able to find the matrix version on > the net. > > Regards > Bruce > > > > On 2/21/06, Brian Blais wrote: > > Hello, > > > > I am trying to translate some Matlab/mex code to Python, for doing neural > > simulations. 
This application is definitely computing-time limited, and I need to
> > optimize at least one inner loop of the code, or perhaps even rethink the algorithm.
> >
> > [snip]
> >
> > thanks!
> >
> > Brian Blais
> >
> > bblais at bryant.edu
> > http://web.bryant.edu/~bblais
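--------------------------------------------------------------------
As a concrete illustration of the in-place rewrite Nadav suggests
above, here is a small pure-numpy sketch of the inner loop (shapes and
constants are made up, and modern numpy names are used rather than the
Numeric ones in the thread):

import numpy

numpats = 100
x = numpy.random.random((numpats, 1000))   # input vectors, one per column
w = numpy.random.random((numpats, 1))
eta = 0.001

for pat in numpy.random.randint(0, x.shape[1], 1000):
    xx = x[:, pat:pat+1]                   # keep a (numpats, 1) column slice
    y = numpy.dot(xx[:, 0], w[:, 0])       # scalar output y = dot(x, w)
    # two in-place updates instead of w = w + eta*(y*xx - y*y*w),
    # which avoids allocating the temporaries of the original expression
    w *= 1.0 - eta * y * y
    w += eta * y * xx
--------------------------------------------------------------------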
From svetosch at gmx.net Wed Feb 22 06:48:09 2006
From: svetosch at gmx.net (Sven Schreiber)
Date: Wed Feb 22 06:48:09 2006
Subject: [Numpy-discussion] mixing arrays and matrices: squeeze yes, flattened no?
In-Reply-To: <43FBB65C.4040001@noaa.gov>
References: <43FB0A96.10803@gmx.net> <43FB6604.6000406@ieee.org> <43FB7C87.1010007@noaa.gov> <43FBA8B2.8010708@gmx.net> <43FBB65C.4040001@noaa.gov>
Message-ID: <43FC799B.6040505@gmx.net>

Christopher Barker schrieb:
> Sven Schreiber wrote:
>> I guess I'd rather follow the advice and just remember to treat 1d as
>> a row.
>
> Except that it's not, universally. For instance, it won't transpose:
>
> It's very helpful to remember that indexing reduces rank, and slicing
> keeps the rank the same. It will serve you well to use that in the
> future anyway.
>
Anyway, the problem is really about interaction with pylab/matplotlib
(so slightly OT here, sorry); when getting data from a text file with
pylab.load you can't be sure if the result is 1d or 2d. This means that:

- If I have >1 variable then everything is fine (provided I use your
advice of slicing instead of indexing afterwards) and the variables are
in the _columns_ of the 2d-array.

- But if there's just one data _column_ in the file, then pylab/numpy
gives me a 1d-array that sometimes works as a _row_ (and as you noted,
sometimes not), but never works as a column.

Imho that's bad, because as a consequence I must use overhead code to
distinguish between these cases. To me it seems more like pylab's bug
instead of numpy's, so please excuse this OT twist, but since there
seems to be overlap between the pylab/matplotlib and numpy folks, maybe
it's not so bad.

Thanks for your patience and helpful input,
Sven

From cjw at sympatico.ca Wed Feb 22 07:29:05 2006
From: cjw at sympatico.ca (Colin J. Williams)
Date: Wed Feb 22 07:29:05 2006
Subject: [Numpy-discussion] dtype
Message-ID: <43FC8312.3040402@sympatico.ca>

I've been trying to gain some understanding of dtype from the builtin
documentation and would appreciate advice.

I don't find anything in http://projects.scipy.org/scipy/numpy or
http://wiki.python.org/moin/NumPy

Chapter 2.1 of the book has a good overview, but little reference material.

In the following, dt = numpy.dtype

Some specific problems are flagged ** below.

Colin W.

[Dbg]>>> h(dt)
Help on class dtype in module numpy:

class dtype(__builtin__.object)
 |  Methods defined here:
 |
 |  __cmp__(...)
 |      x.__cmp__(y) <==> cmp(x,y)
 |
 |  __getitem__(...)
 |      x.__getitem__(y) <==> x[y]
 |
 |  __len__(...)
 |      x.__len__() <==> len(x)
 |
 |  __reduce__(...)
 |      self.__reduce__() for pickling.
 |
 |  __repr__(...)
 |      x.__repr__() <==> repr(x)
 |
 |  __setstate__(...)
 |      self.__setstate__() for pickling.
 |
 |  __str__(...)
 |      x.__str__() <==> str(x)
 |
 |  newbyteorder(...)
 |      self.newbyteorder(<endian>) returns a copy of the dtype object
 |      with altered byteorders.  If <endian> is not given all byteorders
 |      are swapped.  Otherwise endian can be '>', '<', or '=' to force
 |      a byteorder.  Descriptors in all fields are also updated in the
 |      new dtype object.
 |
 |  ----------------------------------------------------------------------
 |  Data and other attributes defined here:
 |
 |  __new__ = <built-in method __new__ of type object>
 |      T.__new__(S, ...) -> a new object with type S, a subtype of T
        ** What are the parameters?
In other words, what does ... stand for? **

 |  alignment = <attribute 'alignment' of 'numpy.dtype' objects>
 |
 |  base = <attribute 'base' of 'numpy.dtype' objects>
 |      The base data-type or self if no subdtype
 |
 |  byteorder = <attribute 'byteorder' of 'numpy.dtype' objects>
 |
 |  char = <attribute 'char' of 'numpy.dtype' objects>
 |
 |  descr = <attribute 'descr' of 'numpy.dtype' objects>
 |      The array_protocol type descriptor.
 |
 |  fields = <attribute 'fields' of 'numpy.dtype' objects>
 |
 |  hasobject = <attribute 'hasobject' of 'numpy.dtype' objects>
 |
 |  isbuiltin = <attribute 'isbuiltin' of 'numpy.dtype' objects>
 |      Is this a buillt-in data-type descriptor?
 |
 |  isnative = <attribute 'isnative' of 'numpy.dtype' objects>
 |      Is the byte-order of this descriptor native?
 |
 |  itemsize = <attribute 'itemsize' of 'numpy.dtype' objects>
 |
 |  kind = <attribute 'kind' of 'numpy.dtype' objects>
 |
 |  name = <attribute 'name' of 'numpy.dtype' objects>
 |      The name of the true data-type
 |
 |  num = <member 'num' of 'numpy.dtype' objects>
 |
 |  shape = <attribute 'shape' of 'numpy.dtype' objects>
 |      The shape of the subdtype or (1,)
 |
 |  str = <attribute 'str' of 'numpy.dtype' objects>
 |      The array_protocol typestring.
 |
 |  subdtype = <attribute 'subdtype' of 'numpy.dtype' objects>
 |      A tuple of (descr, shape) or None.
 |
 |  type = <attribute 'type' of 'numpy.dtype' objects>

[Dbg]>>> dt.num.__doc__
** no doc string **

[Dbg]>>> help(dt.num)
Help on member_descriptor object:

num = class member_descriptor(object)
 |  Methods defined here:
 |
 |  __delete__(...)
 |      descr.__delete__(obj)
 |
 |  __get__(...)
 |      descr.__get__(obj[, type]) -> value
 |
 |  __getattribute__(...)
 |      x.__getattribute__('name') <==> x.name
 |
 |  __repr__(...)
 |      x.__repr__() <==> repr(x)
 |
 |  __set__(...)
 |      descr.__set__(obj, value)
 |
 |  ----------------------------------------------------------------------
 |  Data and other attributes defined here:
 |
 |  __objclass__ = <type 'numpy.dtype'>

[Dbg]>>> help(dt.num.__objclass__)
Help on class dtype in module numpy:

class dtype(__builtin__.object)
 |  [snip -- the same listing as above, apart from these two entries:]
 |
 |  name = <attribute 'name' of 'numpy.dtype' objects>
 |      The name of the true data-type
 |      ** How does this differ from what, in common Python usage, is a
 |      class.__name__? **
 |
 |  num = <member 'num' of 'numpy.dtype' objects>
 |      ** What does this mean? **
 |
 |  [snip]

[Dbg]>>>

** There is no __module__ attribute.  How does one identify the modules
holding the code? **
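--------------------------------------------------------------------
A short interactive sketch of the attributes in question, run against
a much later numpy on a little-endian 64-bit build, so the exact reprs
(and the platform-dependent num value in particular) may differ:

>>> import numpy
>>> dt = numpy.dtype(numpy.int32)
>>> dt.name                # spelled-out name of the true data-type
'int32'
>>> dt.kind, dt.char       # one-letter kind and type-character codes
('i', 'i')
>>> dt.itemsize            # element size in bytes
4
>>> dt.str                 # array-protocol typestring: byteorder, kind, size
'<i4'
>>> dt.num                 # index into numpy's internal C table of built-in
5                          # types; the value depends on the platform
--------------------------------------------------------------------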
From mfmorss at aep.com Wed Feb 22 08:16:15 2006
From: mfmorss at aep.com (mfmorss at aep.com)
Date: Wed Feb 22 08:16:15 2006
Subject: [Numpy-discussion] Trouble installing Numpy on AIX 5.2.
In-Reply-To: 
Message-ID: 

This problem was solved by adding "#include <fenv.h>" to
...numpy-0.9.5/numpy/core/src/umathmodule.c.src

Mark F. Morss
Principal Analyst, Market Risk
American Electric Power

[snip]

From oliphant.travis at ieee.org Wed Feb 22 08:20:09 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Wed Feb 22 08:20:09 2006
Subject: [Numpy-discussion] Trouble installing Numpy on AIX 5.2.
In-Reply-To: 
References: 
Message-ID: <43FC8F0D.5040409@ieee.org>

mfmorss at aep.com wrote:

>I built Python successfully on our AIX 5.2 server using "./configure
>--without-cxx --disable-ipv6".  (This uses the native IBM C compiler,
>invoking it as "cc_r".  We have no C++ compiler.)
>
>But I have been unable to install Numpy-0.9.5 using the same compiler.
>After "python setup.py install," the relevant section of the output was:
>
>compile options: '-Ibuild/src/numpy/core/src -Inumpy/core/include
>-Ibuild/src/numpy/core -Inumpy/core/src -Inumpy/core/include
>-I/pydirectory/include/python2.4 -c'
>cc_r: build/src/numpy/core/src/umathmodule.c
>"build/src/numpy/core/src/umathmodule.c", line 2734.25: 1506-045 (S)
>Undeclared identifier FE_OVERFLOW.
>
Thanks for this check.  This is an error in the _AIX section of the
header.  Change line 304 in ufuncobject.h from FE_OVERFLOW to FP_OVERFLOW.

>"build/src/numpy/core/src/umathmodule.c", line 9307.32: 1506-280 (W)
>Function argument assignment between types "long double*" and "double*" is
>not allowed.
>
I'm not sure where this error comes from.  It seems to appear when modfl
is used.  What is the content of config.h (in your
/numpy/core/include/numpy directory)?  Can you find out if modfl is
defined on your platform already?

>A closely related question is, how can I modify the Numpy setup.py and/or
>distutils files to enable me to control the options with which cc_r is
>invoked?  I inspected these files, but not being very expert in Python, I
>could not figure this out.
>
The default CFLAGS are those you used to build Python with.  I think you
can set the CFLAGS environment variable in order to change this.

Thank you for your test.  I don't have access to _AIX platform and so I
appreciate your feedback.

-Travis

From oliphant.travis at ieee.org Wed Feb 22 08:31:01 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Wed Feb 22 08:31:01 2006
Subject: [Numpy-discussion] Trouble installing Numpy on AIX 5.2.
In-Reply-To: 
References: 
Message-ID: <43FC917B.3040900@ieee.org>

mfmorss at aep.com wrote:

>This problem was solved by adding "#include <fenv.h>" to
>...numpy-0.9.5/numpy/core/src/umathmodule.c.src
>
I suspect this allowed compilation, but I'm not sure if it "solved the
problem."  It depends on whether or not the FE_OVERFLOW defined in
fenv.h is the same as FP_OVERFLOW on the _AIX (it might be...).  The
better solution is to change the constant to what it should be...

Did the long double *, double * problem also resolve itself?  This
seems to be an error with the modfl function you are picking up, since
the AIX docs say that modfl should take and receive long double arguments.

Best,

-Travis

From mfmorss at aep.com Wed Feb 22 08:34:03 2006
From: mfmorss at aep.com (mfmorss at aep.com)
Date: Wed Feb 22 08:34:03 2006
Subject: [Numpy-discussion] Trouble installing Numpy on AIX 5.2.
In-Reply-To: <43FC917B.3040900@ieee.org>
Message-ID: 

Thanks for this observation.  I will modify ufuncobject.h as you
suggested, instead.  The other problem still results in a complaint, but
not an error; it does not prevent compilation.  I have another little
problem but I expect to be able to solve it.  I will report when and if
I have Numpy installed.

Mark F. Morss
Principal Analyst, Market Risk
American Electric Power

[snip]

From robert.kern at gmail.com Wed Feb 22 09:59:12 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Wed Feb 22 09:59:12 2006
Subject: [Numpy-discussion] Re: dtype
In-Reply-To: <43FC8312.3040402@sympatico.ca>
References: <43FC8312.3040402@sympatico.ca>
Message-ID: 

Colin J. Williams wrote:

> I've been trying to gain some understanding of dtype from the builtin
> documentation and would appreciate advice.
>
> [snip]
>
> | __new__ = <built-in method __new__ of type object>
> |     T.__new__(S, ...) -> a new object with type S, a subtype of T
>       ** What are the parameters? In other words, what does ... stand for? **

http://www.python.org/2.2.3/descrintro.html#__new__

"""Recall that you create class instances by calling the class. When the class
is a new-style class, the following happens when it is called. First, the
class's __new__ method is called, passing the class itself as first argument,
followed by any (positional as well as keyword) arguments received by the
original call. This returns a new instance. Then that instance's __init__ method
is called to further initialize it. (This is all controlled by the __call__
method of the metaclass, by the way.)
"""

> ** There is no __module__ attribute.  How does one identify the modules
> holding the code? **

It's an extension type PyArray_Descr* in numpy/core/src/arrayobject.c .

-- 
Robert Kern
robert.kern at gmail.com

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
  -- Richard Harter

From aisaac at american.edu Wed Feb 22 11:09:04 2006
From: aisaac at american.edu (Alan G Isaac)
Date: Wed Feb 22 11:09:04 2006
Subject: [Numpy-discussion] Method to shift elements in an array?
In-Reply-To: <1AC06527-8B7F-4FC4-B83E-462F31D3A431@stanford.edu>
References: <1AC06527-8B7F-4FC4-B83E-462F31D3A431@stanford.edu>
Message-ID: 

On Wed, 22 Feb 2006, Zachary Pincus apparently wrote:
> Does numpy have a built-in mechanism to shift elements along some
> axis in an array? (e.g. to "roll" [0,1,2,3] by some offset, here 2,
> to make [2,3,0,1])

This sounds like the rotater command in GAUSS.
As far as I know there is no equivalent in numpy.
Please post your ultimate solution.

Cheers,
Alan Isaac

From tim.hochberg at cox.net Wed Feb 22 11:30:17 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Wed Feb 22 11:30:17 2006
Subject: [Numpy-discussion] Method to shift elements in an array?
In-Reply-To: 
References: <1AC06527-8B7F-4FC4-B83E-462F31D3A431@stanford.edu>
Message-ID: <43FCBB6F.50401@cox.net>

Alan G Isaac wrote:

>On Wed, 22 Feb 2006, Zachary Pincus apparently wrote:
>
>>Does numpy have a built-in mechanism to shift elements along some
>>axis in an array? (e.g. to "roll" [0,1,2,3] by some offset, here 2,
>>to make [2,3,0,1])
>
>This sounds like the rotater command in GAUSS.
>As far as I know there is no equivalent in numpy.
>Please post your ultimate solution.
>
If you need to roll just a few elements the following should work fairly
efficiently. If you don't want to roll in place, you could instead copy
A on the way in and return the modified copy. However, in that case,
concatenating slices might be better.

--------------------------------------------------------------------
import numpy

def roll(A, n):
    "Roll the array A in place. Positive n -> roll right, negative n -> roll left"
    if n > 0:
        n = abs(n)
        temp = A[-n:]
        A[n:] = A[:-n]
        A[:n] = temp
    elif n < 0:
        n = abs(n)
        temp = A[:n]
        A[:-n] = A[n:]
        A[-n:] = temp
    else:
        pass

A = numpy.arange(10)
print A
roll(A, 3)
print A
roll(A, -3)
print A

From mpi at osc.kiku.dk Wed Feb 22 11:41:18 2006
From: mpi at osc.kiku.dk (Mads Ipsen)
Date: Wed Feb 22 11:41:18 2006
Subject: [Numpy-discussion] Method to shift elements in an array?
In-Reply-To: 
References: <1AC06527-8B7F-4FC4-B83E-462F31D3A431@stanford.edu>
Message-ID: 

On Wed, 22 Feb 2006, Alan G Isaac wrote:

> On Wed, 22 Feb 2006, Zachary Pincus apparently wrote:
> > Does numpy have a built-in mechanism to shift elements along some
> > axis in an array? (e.g. to "roll" [0,1,2,3] by some offset, here 2,
> > to make [2,3,0,1])
>
> This sounds like the rotater command in GAUSS.
> As far as I know there is no equivalent in numpy.
> Please post your ultimate solution.
>
> Cheers,
> Alan Isaac

Similar to cshift() (cyclic shift) in F90. Very nice for calculating
finite differences, such as

  x' = ( cshift(x,+1) - cshift(x,-1) ) / dx

This would be a very handy feature indeed.

// Mads

From cwmoad at gmail.com Wed Feb 22 12:02:04 2006
From: cwmoad at gmail.com (Charlie Moad)
Date: Wed Feb 22 12:02:04 2006
Subject: [Numpy-discussion] Multiple inheritance from ndarray
In-Reply-To: <5809AC56-B2DF-4403-B7BC-9AEEAAC78505@astro.princeton.edu>
References: <5809AC56-B2DF-4403-B7BC-9AEEAAC78505@astro.princeton.edu>
Message-ID: <6382066a0602221201l37495d0fxb5a3fb78e1b28b8e@mail.gmail.com>

Since no one has answered this, I am going to take a whack at it.
Experts feel free to shoot me down.

Here is a sample showing multiple inheritance with a mix of old style
and new style classes. I don't claim there is any logic to the code,
but it is just for demo purposes.

--------------------------------------
from numpy import *

class actImage:
    def __init__(self, colorOrder='RGBA'):
        self.colorOrder = colorOrder

class Image(actImage, ndarray):
    def __new__(cls, shape=(1024,768), dtype=float32):
        return ndarray.__new__(cls, shape=shape, dtype=dtype)

x = Image()
assert isinstance(x[0,1], float32)
assert x.colorOrder == 'RGBA'
--------------------------------------

Running "help(ndarray)" has some useful info as well.

- Charlie

On 2/19/06, Robert Lupton wrote:
> I have a swig extension that defines a class that inherits from
> both a personal C-coded image struct (actImage), and also from
> Numeric's UserArray.  This works very nicely, but I thought that
> it was about time to upgrade to numpy.
>
> The code looks like:
>
> from UserArray import *
>
> class Image(UserArray, actImage):
>     def __init__(self, *args):
>         actImage.__init__(self, *args)
>         UserArray.__init__(self, self.getArray(), 'd', copy=False,
>                            savespace=False)
>
> I can't figure out how to convert this to use ndarray, as ndarray
> doesn't seem to have an __init__ method, merely a __new__.
>
> So what's the approved numpy way to handle multiple inheritance?
> I've a nasty idea that this is a python question that I should know
> the answer to, but I'm afraid that I don't...
>
> R
>

From zpincus at stanford.edu Wed Feb 22 12:26:04 2006
From: zpincus at stanford.edu (Zachary Pincus)
Date: Wed Feb 22 12:26:04 2006
Subject: [Numpy-discussion] Method to shift elements in an array?
In-Reply-To: 
References: <1AC06527-8B7F-4FC4-B83E-462F31D3A431@stanford.edu>
Message-ID: 

Here is my eventual solution.
I'm not sure it's speed-optimal for even a python implementation, but
it is terse. I agree that it might be nice to have this fast, and/or in
C (I'm using it for finite differences and related things).

def cshift(l, offset):
    offset %= len(l)
    return numpy.concatenate((l[-offset:], l[:-offset]))

Zach

On Feb 22, 2006, at 11:40 AM, Mads Ipsen wrote:

> On Wed, 22 Feb 2006, Alan G Isaac wrote:
>
>> On Wed, 22 Feb 2006, Zachary Pincus apparently wrote:
>>> Does numpy have a built-in mechanism to shift elements along some
>>> axis in an array? (e.g. to "roll" [0,1,2,3] by some offset, here 2,
>>> to make [2,3,0,1])
>>
>> This sounds like the rotater command in GAUSS.
>> As far as I know there is no equivalent in numpy.
>> Please post your ultimate solution.
>>
>> Cheers,
>> Alan Isaac
>
> Similar to cshift() (cyclic shift) in F90. Very nice for calculating
> finite differences, such as
>
>   x' = ( cshift(x,+1) - cshift(x,-1) ) / dx
>
> This would be a very handy feature indeed.
>
> // Mads

From Chris.Barker at noaa.gov Wed Feb 22 12:27:11 2006
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Wed Feb 22 12:27:11 2006
Subject: [Numpy-discussion] mixing arrays and matrices: squeeze yes, flattened no?
In-Reply-To: <43FC799B.6040505@gmx.net>
References: <43FB0A96.10803@gmx.net> <43FB6604.6000406@ieee.org> <43FB7C87.1010007@noaa.gov> <43FBA8B2.8010708@gmx.net> <43FBB65C.4040001@noaa.gov> <43FC799B.6040505@gmx.net>
Message-ID: <43FCC8D9.6020508@noaa.gov>

Sven Schreiber wrote:
> - If I have >1 variable then everything is fine (provided I use your
> advice of slicing instead of indexing afterwards) and the variables are
> in the _columns_ of the 2d-array.
> - But if there's just one data _column_ in the file, then pylab/numpy
> gives me a 1d-array that sometimes works as a _row_ (and as you noted,
> sometimes not), but never works as a column.
>
> Imho that's bad, because as a consequence I must use overhead code to
> distinguish between these cases.

I'd do that on load. You must have a way of knowing how many variables
you're loading, so when it is one you can add this line:

a.shape = (1,-1)

and then proceed the same way after that.

-chris

-- 
Christopher Barker, Ph.D.
Oceanographer

NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From cjw at sympatico.ca Wed Feb 22 13:19:05 2006
From: cjw at sympatico.ca (Colin J. Williams)
Date: Wed Feb 22 13:19:05 2006
Subject: [Numpy-discussion] Constructor parameters - was Re: dtype
In-Reply-To: 
References: <43FC8312.3040402@sympatico.ca>
Message-ID: <43FCD521.2020007@sympatico.ca>

Robert Kern wrote:

>Colin J. Williams wrote:
>
>>I've been trying to gain some understanding of dtype from the builtin
>>documentation and would appreciate advice.
>>
>>I don't find anything in http://projects.scipy.org/scipy/numpy or
>>http://wiki.python.org/moin/NumPy
>>
>>Chapter 2.1 of the book has a good overview, but little reference material.
>>
>>In the following, dt = numpy.dtype
>>
>>Some specific problems are flagged ** below.
>>
>>Colin W.
>>[snip]
>>
>>| ----------------------------------------------------------------------
>>| Data and other attributes defined here:
>>
>>| __new__ = <built-in method __new__ of type object>
>>|     T.__new__(S, ...) -> a new object with type S, a subtype of T
>>      ** What are the parameters? In other words, what does ... stand for? **
>>
>http://www.python.org/2.2.3/descrintro.html#__new__
>
>"""Recall that you create class instances by calling the class. When the class
>is a new-style class, the following happens when it is called. First, the
>class's __new__ method is called, passing the class itself as first argument,
>followed by any (positional as well as keyword) arguments received by the
>original call. This returns a new instance. Then that instance's __init__ method
>is called to further initialize it. (This is all controlled by the __call__
>method of the metaclass, by the way.)
>"""
>
>>** There is no __module__ attribute.  How does one identify the modules
>>holding the code? **
>>
>It's an extension type PyArray_Descr* in numpy/core/src/arrayobject.c .
>
Robert,

Many thanks for this. You have described the standard Python approach to
constructing an instance. As I understand it, numpy uses the __new__
method, but not __init__, in most cases.

My interest is in "any (positional as well as keyword) arguments". What
should the user feed the constructor? This isn't clear from the online
documentation.

From a Python user's point of view, the module holding the dtype class
appears to be multiarray.

The standard Python approach is to put the information in a __module__
attribute so that one doesn't have to go hunting around. Please see below.

While on the subject of the standard Python approach, class names usually
start with an upper case letter and the builtins have their own style,
ListType etc. numpy equates ArrayType to ndarray, but ArrayType is
deprecated.

Colin W.

C:\>python
Python 2.4.2 (#67, Sep 28 2005, 12:41:11) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy.core.multiarray as mu
>>> dir(mu)
['_ARRAY_API', '__doc__', '__file__', '__name__', '__version__',
'_fastCopyAndTranspose', '_flagdict', '_get_ndarray_c_version', 'arange',
'array', 'bigndarray', 'broadcast', 'can_cast', 'concatenate', 'correlate',
'dot', 'dtype', 'empty', 'error', 'flatiter', 'frombuffer', 'fromfile',
'fromstring', 'getbuffer', 'inner', 'lexsort', 'ndarray', 'newbuffer',
'register_dtype', 'scalar', 'set_numeric_ops', 'set_string_function',
'set_typeDict', 'typeinfo', 'where', 'zeros']
>>>

From robert.kern at gmail.com Wed Feb 22 14:11:05 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Wed Feb 22 14:11:05 2006
Subject: [Numpy-discussion] Re: Constructor parameters - was Re: dtype
In-Reply-To: <43FCD521.2020007@sympatico.ca>
References: <43FC8312.3040402@sympatico.ca> <43FCD521.2020007@sympatico.ca>
Message-ID: 

Colin J. Williams wrote:

> Robert,
>
> Many thanks for this. You have described the standard Python approach to
> constructing an instance. As I understand it, numpy uses the __new__
> method, but not __init__, in most cases.
>
> My interest is in "any (positional as well as keyword) arguments".
> What should the user feed the constructor?
This isn't clear from the > online documentation. Look in the code. The PyArrayDescr_Type method table gives arraydescr_new() as the implementation of the tp_new slot (the C name for __new__). You can read the implementation for information. Patches for documentation will be gratefully accepted. That said: In [16]: a = arange(10) In [17]: a.dtype Out[17]: dtype('>i4') In [18]: dtype('>i4') Out[18]: dtype('>i4') If you want complete documentation on data-type descriptors, it's in Chapter 7 of Travis's book. > From a Python user's point of view, the module holding the dtype class > appears to be multiarray. > > The standard Python approach is to put the information in a __module__ > attribute so that one doesn't have to go hunting around. Please see below. dtype.__module__ (== 'numpy') tells you the canonical place to access it from Python code. It will never be able to tell you what C source file to look in. You'll have to break out grep no matter what. > While on the subject of the Standand Python aproach, class names usually > start with an upper case letter and the builtins have their own style, > ListType etc. numpy equates ArrayType to ndarray but ArrayType is > deprecated. ListType, TupleType et al. are also deprecated in favor of list and tuple, etc. But yes, we do use all lower-case names for classes. This is a conscious decision. It's just a style convention, just like PEP-8 is just a style convention for the standard library. -- Robert Kern robert.kern at gmail.com "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From zpincus at stanford.edu Wed Feb 22 19:00:06 2006 From: zpincus at stanford.edu (Zachary Pincus) Date: Wed Feb 22 19:00:06 2006 Subject: [Numpy-discussion] simple subclassing of ndarray Message-ID: Hello folks, I'm interested in creating a simple subclass of ndarray that just has a few additional methods. I've stared at defmatrix.py, but I'm not sure what is necessary to do. Specifically, I'm not sure how to get new instances of my subclass created properly. e.g.: numpy.matrix([1,2,3]) Out: matrix([[1, 2, 3]]) class m(numpy.ndarray): pass m([1,2,3]) Out: m([[[ 13691, 0, 0], [ 196608, 296292267, 296303312]]]) So clearly I need something else. Looking at the matrix class, it looks like I need a custom __new__ operator. However, looking at matrix's __new__ operator, I see a lot of complexity that I just don't understand. What's the minimum set of things I need in __new__ to get a proper constructor? Or perhaps there's a different and better way to construct instances of my subclass? Something akin to the 'array' function would be perfect. Now, how do I go about creating such a function (or getting 'array' to do it)? Can anyone give me any pointers here? Thanks, Zach Pincus Program in Biomedical Informatics and Department of Biochemistry Stanford University School of Medicine From oliphant.travis at ieee.org Wed Feb 22 19:28:02 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Wed Feb 22 19:28:02 2006 Subject: [Numpy-discussion] simple subclassing of ndarray In-Reply-To: References: Message-ID: <43FD2B85.8080407@ieee.org> Zachary Pincus wrote: > Hello folks, > > I'm interested in creating a simple subclass of ndarray that just has > a few additional methods. I've stared at defmatrix.py, but I'm not > sure what is necessary to do. > > Specifically, I'm not sure how to get new instances of my subclass > created properly. 
> e.g.:
> numpy.matrix([1,2,3])
> Out: matrix([[1, 2, 3]])
>
> class m(numpy.ndarray):
>     pass

This is enough to define your own sub-class. Now, you need to determine
what you want to do. You need to understand that m() is now analogous
to numpy.ndarray() so you should look at the numpy.ndarray() docstring
for the default arguments. The array() constructor is not the same thing
as ndarray.__new__. Look at the ndarray docstring.

help(ndarray)

You need to define the __new__ method *not* the __init__ method. You
could, of course, define an __init__ method if you want to, it's just
not necessary.

> Out: m([[[ 13691, 0, 0],
> [ 196608, 296292267, 296303312]]])

You just created an empty array of shape (1,2,3). The first argument to
the default constructor is the shape.

> So clearly I need something else. Looking at the matrix class, it
> looks like I need a custom __new__ operator.

Yes, that is exactly right.

> Or perhaps there's a different and better way to construct instances
> of my subclass? Something akin to the 'array' function would be
> perfect. Now, how do I go about creating such a function (or getting
> 'array' to do it)?

You could do array(obj).view(m) to get instances of your subclass. This
will not call __new__ or __init__, but it will call
__array_finalize__(self, obj) where obj is the ndarray constructed from
[1,2,3]. Actually __array_finalize__ is called every time a sub-class is
constructed and so it could be used to pass along meta-data (or enforce
rank-2 as it does in the matrix class).

-Travis

From ndarray at mac.com Wed Feb 22 19:52:03 2006
From: ndarray at mac.com (Sasha)
Date: Wed Feb 22 19:52:03 2006
Subject: [Numpy-discussion] Why floor and ceil change the type of the array?
Message-ID: 

I was looking for a model to implement "round" in C and discovered
that floor and ceil functions change the type of their arguments:

>>> floor(array([1,2,3],dtype='i2')).dtype
dtype('<f8')
>>> floor(array([1,2,3],dtype='i4')).dtype
dtype('<f8')

I know that this is the same behavior as in Numeric, but wouldn't it
be more natural if floor and ceil return the argument unchanged (maybe
a copy) if it is already integer?

References: <43F20BFE.5030100@cox.net> <43F36CB6.5050004@cox.net> <43F3B8A9.3000507@bigpond.net.au> <43F3CA12.4000907@cox.net> <43F62069.80209@cox.net> <43F7C73B.2000806@cox.net> <20060220003714.GB15783@arbutus.physics.mcmaster.ca> <43F938FA.80200@cox.net>
Message-ID: <43FD31FA.6030802@cox.net>

David M. Cooke wrote:

>Tim Hochberg writes:
>
>>David M. Cooke wrote:
>>
[SNIP]

>I've gone through the code you checked in, and fixed it up. Looks
>good. One side effect is that
>
>def zl(x):
>    a = ones_like(x)
>    a[:] = 0
>    return a
>
>is now faster than zeros_like(x) :-)
>
I noticed that ones_like was faster than zeros_like, but I didn't think
to try that. That's pretty impressive considering how ridiculously easy
it was to write.

>One problem I had is that in PyArray_SetNumericOps, the "copy" method
>wasn't picked up on. It may be due to the order of initialization of
>the ndarray type, or something (since "copy" isn't a ufunc, it's
>initialized in a different place). I couldn't figure out how to fiddle
>that, so I replaced the x.copy() call with a call to PyArray_Copy().
>
Interesting. It worked fine here.

>>>Yes; because it's the implementation of __pow__, the second argument can
>>>be anything.
>>>
>>No, you misunderstand. What I was talking about was that the *first*
>>argument can also be something that's not a PyArrayObject, despite the
>>function's signature.
>
>Ah, I suppose that's because the power slot in the number protocol
>also handles __rpow__.
>
That makes sense. It was giving me fits whatever the cause.
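--------------------------------------------------------------------
Pulling together the advice in Travis's subclassing reply above, a
minimal working version of Zach's class m might look like the
following sketch (illustrative only; the 'info' metadata attribute is
invented, and this is written against the modern numpy API):

import numpy

class m(numpy.ndarray):
    def __new__(subtype, data):
        # build a plain ndarray, then re-view the same memory as our
        # subclass; this is the array(obj).view(m) pattern Travis
        # describes
        self = numpy.array(data).view(subtype)
        return self

    def __array_finalize__(self, obj):
        # called for every new instance, including views and slices,
        # so metadata survives operations like a[1:]
        self.info = getattr(obj, 'info', None)

a = m([1, 2, 3])
a.info = 'labelled'
print a            # [1 2 3] -- data, not uninitialized shape-(1,2,3) memory
print a[1:].info   # 'labelled' -- carried along by __array_finalize__
--------------------------------------------------------------------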
[SNIP]

>>but the real fix would be to dig into
>>PyArray_EnsureArray and see why it's slow for Python_Ints. It is much
>>faster for numarray scalars.
>>
>Right; that needs to be looked at.
>
It doesn't look too bad. But I haven't had a chance to try to do
anything about it yet.

>>Another approach is to actually compute (x*x)*(x*x) for pow(x,4) at
>>the level of array_power. I think I could make this work. It would
>>probably work well for medium size arrays, but might well make things
>>worse for large arrays that are limited by memory bandwidth since it
>>would need to move the array from memory into the cache multiple times.
>>
>I don't like that; I think it would be better memory-wise to do it
>elementwise. Not just speed, but size of intermediate arrays.
>
Yeah, for a while I was real hot on the idea since I could do everything
without messing with ufuncs. But then I decided not to pursue it because
I thought it would be slow because of memory usage -- it would be
pulling data into the cache over and over again and I think that would
slow things down a lot.

[SNIP]

>'int_power' we could do; that would be the next step I think. The half
>integer powers we could maybe leave; if you want x**(-3/2), for
>instance, you could do y = x**(-1)*sqrt(x) (or do y = x**(-1);
>sqrt(y,y) if you're worried about temporaries).
>
>Or, 'fast_power' could be documented as doing the optimizations for
>integer and half-integer _scalar_ exponents, up to a certain size
>(like 100), and falling back on pow() if necessary. I think we could do
>a precomputation step to split the exponent into appropriate squarings
>and such that'll make the elementwise loop faster.
>
There's a clever implementation of this in complexobject.c. Speaking of
complexobject.c, I did implement fast integer powers for complex objects
at the nc_pow level. For small powers at least, it's over 10 times as
fast. And, since it's at the nc_pow level it works for matrix-matrix
powers as well. My implementation is arguably slightly faster than
what's in complexobject, but I won't have a chance to check it in till
next week -- I'm off for some snowboarding tomorrow.

I kind of like power and scalar_power. Then ** could be advertised as
calling scalar_power for scalars and power for arrays. Scalar power
would do optimizations on integer and half-integer powers. Of course
there's no real way to enforce that scalar_power is passed scalars,
since presumably it would be a ufunc, short of making _scalar_power a
ufunc instead and doing something like:

def scalar_power(x, y):
    """compute x**y, where y is a scalar, optimizing integer and half
    integer powers, possibly at some minor loss of accuracy"""
    if not is_scalar(y):
        raise ValueError("Naughty!!")
    return _scalar_power(x,y)

>Half-integer
>exponents are exactly representable as doubles (up to some number, of
>course), so there's no chance of decimal-to-binary conversions making
>things look different. That might work out ok. Although, at that point
>I'd suggest we make it 'power', and have 'rawpower' (or ????) as the
>version that just uses pow().
>
>Another point is to look at __div__, and use reciprocal if the
>dividend is 1.
>
That would be easy, but wouldn't it be just as easy to optimize __div__
for scalar divisions? Should probably check whether that's just as fast,
since it would be a lot more general.
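--------------------------------------------------------------------
For reference, the repeated-squaring strategy behind the proposed
'int_power' can be sketched in a few lines of Python (a sketch of the
algorithm only, not the C ufunc implementation under discussion; it
also works elementwise on arrays because it only uses *):

def int_power(x, n):
    # Binary exponentiation: x**4 = (x*x)*(x*x) and
    # x**13 = x**8 * x**4 * x, so only O(log n) multiplies are
    # needed instead of n-1. Assumes an integer n >= 1.
    result = None
    square = x
    while n > 0:
        if n & 1:                    # this bit of the exponent is set
            if result is None:
                result = square
            else:
                result = result * square
        square = square * square     # next power-of-two factor
        n >>= 1
    return result
--------------------------------------------------------------------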
Speaking of complexobject.c, I did implement fast integer powers for complex objects at the nc_pow level. For small powers at least, it's over 10 times as fast. And, since it's at the nc_pow level, it works for matrix-matrix powers as well. My implementation is arguably slightly faster than what's in complexobject, but I won't have a chance to check it in till next week -- I'm off for some snowboarding tomorrow.

I kind of like power and scalar_power. Then ** could be advertised as calling scalar_power for scalars and power for arrays. Scalar power would do optimizations on integer and half-integer powers. Of course there's no real way to enforce that scalar_power is passed scalars, since presumably it would be a ufunc, short of making _scalar_power a ufunc instead and doing something like:

def scalar_power(x, y):
    "compute x**y, where y is a scalar, optimizing integer and half
    integer powers, possibly at some minor loss of accuracy"
    if not is_scalar(y):
        raise ValueError("Naughty!!")
    return _scalar_power(x, y)

> Half-integer
>exponents are exactly representable as doubles (up to some number of
>course), so there's no chance of decimal-to-binary conversions making
>things look different. That might work out ok. Although, at that point
>I'd suggest we make it 'power', and have 'rawpower' (or ????) as the
>version that just uses pow().
>
>Another point is to look at __div__, and use reciprocal if the
>dividend is 1.

That would be easy, but wouldn't it be just as easy to optimize __div__ for scalar divisions? Should probably check that this isn't just as fast, since it would be a lot more general.

>I've added a page to the developer's wiki at
>http://projects.scipy.org/scipy/numpy/wiki/PossibleOptimizationAreas
>to keep a list of areas like that to look into if someone has time :-)

Ah, good plan.

-tim

From oliphant.travis at ieee.org Wed Feb 22 19:59:04 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Wed Feb 22 19:59:04 2006
Subject: [Numpy-discussion] Multiple inheritance from ndarray
In-Reply-To: <5809AC56-B2DF-4403-B7BC-9AEEAAC78505@astro.princeton.edu>
References: <5809AC56-B2DF-4403-B7BC-9AEEAAC78505@astro.princeton.edu>
Message-ID: <43FD32E4.10600@ieee.org>

Robert Lupton wrote:

> I have a swig extension that defines a class that inherits from
> both a personal C-coded image struct (actImage), and also from
> Numeric's UserArray. This works very nicely, but I thought that
> it was about time to upgrade to numpy.
>
> The code looks like:
>
> from UserArray import *
>
> class Image(UserArray, actImage):
>     def __init__(self, *args):
>         actImage.__init__(self, *args)
>         UserArray.__init__(self, self.getArray(), 'd', copy=False,
>                            savespace=False)
>
> I can't figure out how to convert this to use ndarray, as ndarray
> doesn't seem to have an __init__ method, merely a __new__.

Yes, the ndarray type doesn't have an __init__ method (so you don't have to call it). What you need to do is write a __new__ method for your class. However, with multiple inheritance the details matter. You may actually want to have your C-coded actImage class inherit (in C) from the ndarray. If you would like help on that approach, let me know (I'll need to understand your actImage a bit better).

But, this can all be done in Python, too, though it is a bit of effort to make sure things get created correctly. Perhaps it might make sense to actually include a slightly modified form of the UserArray in NumPy as a standard "container class" (instead of a sub-class) of the ndarray. In reality, a container class like UserArray and a sub-class are different things.

Here's an outline of what you need to do. This is, of course, untested.... For example, I don't really know what actImage is.

from numpy import ndarray, array

class Image(ndarray, actImage):
    def __new__(subtype, *args):
        act1 = actImage.__new__(actImage, *args)
        actImage.__init__(act1, *args)
        arr = array(act1.getArray(), 'd', copy=False)
        self = arr.view(subtype)
        # you might need to copy attributes from act1 over to self here...
        return self

The problem here is that apparently you are creating the array first in actImage.__init__ and then passing it to UserArray. The ndarray constructor wants to either create the array itself or use a buffer-exposing object to use as the memory.

Keep us posted, as your example is a good one that can help us all learn.

-Travis

From robert.kern at gmail.com Wed Feb 22 20:08:04 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Wed Feb 22 20:08:04 2006
Subject: [Numpy-discussion] Re: Why floor and ceil change the type of the array?
In-Reply-To:
References:
Message-ID:

Sasha wrote:

> I was looking for a model to implement "round" in C and discovered
> that floor and ceil functions change the type of their arguments:
>
>>>> floor(array([1,2,3],dtype='i2')).dtype
> dtype('<f4')
>
>>>> floor(array([1,2,3],dtype='i4')).dtype
> dtype('<f8')
>
> I know that this is the same behavior as in Numeric, but wouldn't it
> be more natural if floor and ceil return the argument unchanged (maybe
> a copy) if it is already integer?

Only if floor() and ceil() returned integer arrays when given floats as input.
I presume there are good reasons for this, since it's the same behavior as the standard C functions.

--
Robert Kern
robert.kern at gmail.com

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
  -- Richard Harter

From ndarray at mac.com Wed Feb 22 20:49:08 2006
From: ndarray at mac.com (Sasha)
Date: Wed Feb 22 20:49:08 2006
Subject: [Numpy-discussion] Re: Why floor and ceil change the type of the array?
In-Reply-To:
References:
Message-ID:

On 2/22/06, Robert Kern wrote:
> Sasha wrote:
> > ... wouldn't it
> > be more natural if floor and ceil return the argument unchanged (maybe
> > a copy) if it is already integer?
>
> Only if floor() and ceil() returned integer arrays when given floats as
> input. I presume there are good reasons for this, since it's the same
> behavior as the standard C functions.

C does not have ceil(int). It has

double ceil(double x);
float ceilf(float x);
long double ceill(long double x);

and neither of these functions change the type of the argument. Numpy's "around" is a noop on integers (even for decimals<0, but that's a different story).

I cannot really think of any reason for the current numpy behaviour other than the consistency with transcendental functions. Speaking of which, can someone explain this:

>>> sin(array(1,'h')).dtype
dtype('<f4')

>>> sin(array(1,'i')).dtype
dtype('<f8')

From robert.kern at gmail.com Wed Feb 22 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Wed Feb 22 2006
Subject: [Numpy-discussion] Re: Why floor and ceil change the type of the array?
In-Reply-To:
References:
Message-ID:

Sasha wrote:

> On 2/22/06, Robert Kern wrote:
>> Sasha wrote:
>>> ... wouldn't it
>>> be more natural if floor and ceil return the argument unchanged (maybe
>>> a copy) if it is already integer?
>>
>> Only if floor() and ceil() returned integer arrays when given floats as
>> input. I presume there are good reasons for this, since it's the same
>> behavior as the standard C functions.
>
> C does not have ceil(int). It has
>
> double ceil(double x);
> float ceilf(float x);
> long double ceill(long double x);
>
> and neither of these functions change the type of the argument.

That's exactly what I meant. These functions only apply to integers through casting to the appropriate float type. That's precisely what numpy.floor() and numpy.ceil() do.

Actually, I think the reasoning for having the float versions return floats instead of integers is that an integer-valued double is possibly out of range for an int or long on some platforms, so it's kept as a float. Since this obviously isn't a problem if the input is already an integer type, I don't have any particular objection to making floor() and ceil() return integers if their inputs are integers.
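For what it's worth, the current behaviour keeps the values and only changes the type -- a quick illustration (the '<f8' spelling assumes a little-endian build):

>>> from numpy import array, floor
>>> a = floor(array([1, 2, 3], dtype='i4'))
>>> a
array([ 1.,  2.,  3.])
>>> a.dtype
dtype('<f8')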
> Numpy's "around" is a noop on integers (even for decimals<0, but
> that's a different story).

It's also a function and not a ufunc.

> I cannot really think of any reason for the current numpy behaviour
> other than the consistency with transcendental functions.

It's simply the easiest thing to do with the ufunc machinery.

> Speaking of
> which, can someone explain this:
>
>>>> sin(array(1,'h')).dtype
> dtype('<f4')
>
>>>> sin(array(1,'i')).dtype
> dtype('<f8')

AFAICT, the story goes like this: sin() has two implementations, one for single-precision floats and one for doubles. The ufunc machinery sees the int16 and picks single-precision as the smallest type of the two that can fit an int16 without losing precision. Naturally, you probably want the function to operate in higher precision, but that's not really information that the ufunc machinery knows about.

--
Robert Kern
robert.kern at gmail.com

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
  -- Richard Harter

From ndarray at mac.com Wed Feb 22 2006
From: ndarray at mac.com (Sasha)
Date: Wed Feb 22 2006
Subject: [Numpy-discussion] Timings for various round functions
Message-ID:

C99 defines three functions round, rint and nearbyint that are nearly identical. The only difference is in setting the inexact flag and respecting the rounding mode. Nevertheless, these functions differ significantly in their performance.

I've wrapped these functions into ufuncs and got the following timings:

> python -m timeit -s "from numpy import array, round, rint, nearbyint; x = array([1.5]*10000)" "round(x)"
1000 loops, best of 3: 257 usec per loop
> python -m timeit -s "from numpy import array, round, rint, nearbyint; x = array([1.5]*10000)" "nearbyint(x)"
1000 loops, best of 3: 654 usec per loop
> python -m timeit -s "from numpy import array, round, rint, nearbyint; x = array([1.5]*10000)" "rint(x)"
10000 loops, best of 3: 103 usec per loop

Similarly for single precision:

> python -m timeit -s "from numpy import array, round, rint, nearbyint; x = array([1.5]*10000,dtype='f')" "round(x)"
10000 loops, best of 3: 182 usec per loop
> python -m timeit -s "from numpy import array, round, rint, nearbyint; x = array([1.5]*10000,dtype='f')" "nearbyint(x)"
1000 loops, best of 3: 606 usec per loop
> python -m timeit -s "from numpy import array, round, rint, nearbyint; x = array([1.5]*10000,dtype='f')" "rint(x)"
10000 loops, best of 3: 85.5 usec per loop

Obviously, I will use rint in my ndarray.round implementation; however, it may be useful to provide all three as ufuncs. The only question is what name to use for round?

1) round (may lead to confusion with ndarray.round or built-in round)
2) roundint (too similar to rint)
3) round0 (ugly)

Any suggestions?

Another C99 function that may be worth including is "trunc". Any objections to adding it as a ufunc?

From oliphant.travis at ieee.org Wed Feb 22 22:11:01 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Wed Feb 22 22:11:01 2006
Subject: [Numpy-discussion] Timings for various round functions
In-Reply-To:
References:
Message-ID: <43FD51DE.4000400@ieee.org>

Sasha wrote:

>C99 defines three functions round, rint and nearbyint that are nearly
>identical. The only difference is in setting the inexact flag and
>respecting the rounding mode. Nevertheless, these functions differ
>significantly in their performance. I've wrapped these functions into
>ufuncs and got the following timings:
>
>Obviously, I will use rint in my ndarray.round implementation;
>however, it may be useful to provide all three as ufuncs.
>
>The only question is what name to use for round?
>
>1) round (may lead to confusion with ndarray.round or built-in round)
>2) roundint (too similar to rint)
>3) round0 (ugly)
>
>Any suggestions?
>
>Another C99 function that may be worth including is "trunc". Any
>objections to adding it as a ufunc?

I think we have agreed that C99 functions are good candidates to become ufuncs. The only problem is figuring out what to do on platforms that don't define them.

For example, we could define a separate module of C99 functions that is only available on certain platforms.

-Travis

From oliphant.travis at ieee.org Wed Feb 22 22:42:06 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Wed Feb 22 22:42:06 2006
Subject: [Numpy-discussion] Thoughts on an ndarray super-class
Message-ID: <43FD5914.4060506@ieee.org>

The bigndarray class is going to disappear (probably in the next release of NumPy). It was a stop-gap measure as the future of 64-bit fixes in Python was unclear. Python 2.5 will have removed the 64-bit limitations that led to the bigndarray, and so it will be removed.

I have been thinking, however, of replacing it with a super-class that does not define the dimensions or strides.

In other words, the default array would be just a block of memory. The standard array would inherit from the default and add dimension and strides pointers.
I was thinking that this might make it easier for sub-classes using fixed-sized dimensions and strides. I'm not sure if that would actually be useful, but since I was thinking about the disappearance of the bigndarray, I thought I would ask for comments.

-Travis

From ndarray at mac.com Wed Feb 22 22:47:04 2006
From: ndarray at mac.com (Sasha)
Date: Wed Feb 22 22:47:04 2006
Subject: [Numpy-discussion] Re: Why floor and ceil change the type of the array?
In-Reply-To:
References:
Message-ID:

On 2/23/06, Robert Kern wrote:
> > I cannot really think of any reason for the current numpy behaviour
> > other than the consistency with transcendental functions.
>
> It's simply the easiest thing to do with the ufunc machinery.

That's what I had in mind: with the current rule, the same code can be used for ceil as for sin. However, easiest to implement is not necessarily right.

> > Speaking of
> > which, can someone explain this:
> >
> >>>> sin(array(1,'h')).dtype
> > dtype('<f4')
> >
> >>>> sin(array(1,'i')).dtype
> > dtype('<f8')
>
> AFAICT, the story goes like this: sin() has two implementations, one for
> single-precision floats and one for doubles. The ufunc machinery sees the int16
> and picks single-precision as the smallest type of the two that can fit an int16
> without losing precision. Naturally, you probably want the function to operate
> in higher precision, but that's not really information that the ufunc machinery
> knows about.

According to your theory long (i8) integers should cast to long doubles, but

>>> sin(array(0,'i8')).dtype
dtype('<f8')

Given that python's floating point object is a double, I think it would be natural to cast integer arguments to double for all sizes. I would also think that in choosing the precision for a function it is also important that the output fits into the data type. I find the following unfortunate:

>>> exp(400)
5.2214696897641443e+173

>>> exp(array(400,'h'))
inf

From robert.kern at gmail.com Wed Feb 22 23:06:14 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Wed Feb 22 23:06:14 2006
Subject: [Numpy-discussion] Re: Why floor and ceil change the type of the array?
In-Reply-To:
References:
Message-ID:

Sasha wrote:

> On 2/23/06, Robert Kern wrote:
>
>> AFAICT, the story goes like this: sin() has two implementations, one for
>> single-precision floats and one for doubles. The ufunc machinery sees the int16
>> and picks single-precision as the smallest type of the two that can fit an int16
>> without losing precision. Naturally, you probably want the function to operate
>> in higher precision, but that's not really information that the ufunc machinery
>> knows about.
>
> According to your theory long (i8) integers should cast to long doubles, but
>
>>>> sin(array(0,'i8')).dtype
> dtype('<f8')
>
> Given that python's floating point object is a double, I think it
> would be natural to cast integer arguments to double for all sizes. I
> would also think that in choosing the precision for a function it is
> also important that the output fits into the data type. I find the
> following unfortunate:
>
>>>> exp(400)
> 5.2214696897641443e+173
>
>>>> exp(array(400,'h'))
> inf

I prefer consistent, predictable rules that are dependent on the input, not the output. If I want my outputs to be double precision, I will cast appropriately.

--
Robert Kern
robert.kern at gmail.com

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
  -- Richard Harter

From ndarray at mac.com Wed Feb 22 23:21:03 2006
From: ndarray at mac.com (Sasha)
Date: Wed Feb 22 23:21:03 2006
Subject: [Numpy-discussion] Thoughts on an ndarray super-class
In-Reply-To: <43FD5914.4060506@ieee.org>
References: <43FD5914.4060506@ieee.org>
Message-ID:

On 2/23/06, Travis Oliphant wrote:
> ...
> I have been thinking, however, of replacing it with a super-class that
> does not define the dimensions or strides.

Having a simple 1-d array in numpy would be great. In an ideal world I would rather see a 1-d array implemented in C together with a set of array operations that is rich enough to allow trivial implementation of ndarray in pure python.

When you say "does not define the dimensions or strides", do you refer to the python interface or to the C struct? I thought python did not allow adding data members to object structs in subclasses.

> In other words, the default array would be just a block of memory. The
> standard array would inherit from the default and add dimension and
> strides pointers.

If python lets you do it, how will that block of memory know its size?

From oliphant.travis at ieee.org Wed Feb 22 23:23:06 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Wed Feb 22 23:23:06 2006
Subject: [Numpy-discussion] Re: Why floor and ceil change the type of the array?
In-Reply-To:
References:
Message-ID: <43FD62C2.7070101@ieee.org>

Sasha wrote:

>On 2/23/06, Robert Kern wrote:
>
>>>I cannot really think of any reason for the current numpy behaviour
>>>other than the consistency with transcendental functions.
>>
>>It's simply the easiest thing to do with the ufunc machinery.
>>
>>AFAICT, the story goes like this: sin() has two implementations, one for
>>single-precision floats and one for doubles. The ufunc machinery sees the int16
>>and picks single-precision as the smallest type of the two that can fit an int16
>>without losing precision. Naturally, you probably want the function to operate
>>in higher precision, but that's not really information that the ufunc machinery
>>knows about.
>
>According to your theory long (i8) integers should cast to long doubles, but

Robert is basically right, except there is a special case for long integers, because long doubles are not cross-platform. The relevant code is PyArray_CanCastSafely. This is basically the coercion-rule table. You will notice the special checks for long double placed there after it was noticed that on 64-bit platforms long doubles were cropping up an awful lot, and it was decided that because long doubles are not very ubiquitous (for example, many platforms don't distinguish between long double and double), we should special-case the 64-bit integer rule. You can read about it in the archives if you want.

>dtype('<f8')
>
>Given that python's floating point object is a double, I think it
>would be natural to cast integer arguments to double for all sizes.

Perhaps, but that is not what is done. I don't think it's that big a deal, because to get "different size" integers you have to ask for them, and then you should know that conversion to floating point is not necessarily a double.

I think the only acceptable direction to pursue is to raise an error and not do automatic upcasting if a ufunc does not have a definition for any of the given types. But this is an old behavior from Numeric, and I would think such changes now would rightly be considered as gratuitous breakage.

>I
>would also think that in choosing the precision for a function it is
>also important that the output fits into the data type.

How do you propose to determine if the output fits into the data-type? Are you proposing to have different output rules for different functions? Sheer madness... The rules now are (relatively) simple and easy to program to.
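And when you do want doubles out of a small-integer array today, the explicit spelling is cheap enough -- a quick sketch (the '<' in the repr assumes a little-endian build):

>>> from numpy import array, sin
>>> a = array([1, 2, 3], dtype='i2')
>>> sin(a).dtype
dtype('<f4')
>>> sin(a.astype('f8')).dtype
dtype('<f8')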
>I find the
>following unfortunate:
>
>>>> exp(400)
>5.2214696897641443e+173
>
>>>> exp(array(400,'h'))
>inf

Hardly a good example. Are you also concerned about the following?

>>> exp(1000)
inf

>>> exp(array(1000,'g'))
1.97007111401704699387e+434

From ndarray at mac.com Wed Feb 22 23:39:03 2006
From: ndarray at mac.com (Sasha)
Date: Wed Feb 22 23:39:03 2006
Subject: [Numpy-discussion] Timings for various round functions
In-Reply-To: <43FD51DE.4000400@ieee.org>
References: <43FD51DE.4000400@ieee.org>
Message-ID:

On 2/23/06, Travis Oliphant wrote:
> ...
> I think we have agreed that C99 functions are good candidates to become
> ufuncs. The only problem is figuring out what to do on platforms that
> don't define them.

I was going to ask this question myself, but then realized that the answer is in the source code: for functions missing on a platform, numpy provides its own implementations. (See for example a comment in umathmodule: "if C99 extensions not available then define dummy functions...") I was going to just use rint instead of round and nearbyint on platforms that don't have them.
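One subtlety to keep in mind for the ndarray.round implementation: under the default rounding mode, C99 rint rounds halves to the nearest even value, while Python's built-in round goes away from zero. A quick sketch of the builtin side of that difference:

>>> round(0.5), round(1.5), round(2.5)
(1.0, 2.0, 3.0)

With rint, those would come out as 0.0, 2.0 and 2.0 instead.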
> For example, we could define a separate module of C99 functions that is
> only available on certain platforms.

This is certainly the easiest option to implement, but we don't want to make numpy users worry about the portability of their code.

From oliphant.travis at ieee.org Wed Feb 22 23:46:04 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Wed Feb 22 23:46:04 2006
Subject: [Numpy-discussion] Thoughts on an ndarray super-class
In-Reply-To: <43FD5914.4060506@ieee.org>
References: <43FD5914.4060506@ieee.org>
Message-ID: <43FD680F.8040708@ieee.org>

Sasha wrote:

>On 2/23/06, Travis Oliphant wrote:
>
>>...
>>I have been thinking, however, of replacing it with a super-class that
>>does not define the dimensions or strides.
>
>Having a simple 1-d array in numpy would be great. In an ideal world
>I would rather see a 1-d array implemented in C together with a set of
>array operations that is rich enough to allow trivial implementation
>of ndarray in pure python.

You do realize that this is essentially numarray, right? And your dream of *rich enough* 1-d operations to allow *trivial* implementation may be a bit far-fetched, but I'm all for dreaming.

>When you say "does not define the dimensions or strides", do you refer
>to the python interface or to the C struct? I thought python did not
>allow adding data members to object structs in subclasses.

The C-struct. Yes, you can add data members to object structs in sub-classes. Every single Python object does it. The standard Python object just defines PyObject_HEAD or PyObject_VAR_HEAD. This is actually the essence of inheritance in C, and it is why subclasses written in C must have compatible memory layouts. The first part of the C-structure must be identical, but you can add to it all you want.

It all comes down to: can I cast to the base-type C-struct and have everything still work out when I dereference a particular field? This will be true if PyArrayObject is

struct {
    PyBaseArrayObject
    int nd
    intp *dimensions
    intp *strides
}

I suppose we could change the int nd to intp nd and place it in the PyBaseArrayObject where it would be used as a length. But, I don't really like that...

>>In other words, the default array would be just a block of memory. The
>>standard array would inherit from the default and add dimension and
>>strides pointers.
>
>If python lets you do it, how will that block of memory know its size?

It won't, of course, by itself, unless you add an additional size field. Thus, I'm not really sure whether it's a good idea or not. I don't like the idea of adding more and more fields to the basic C-struct that has been around for 10 years unless we have a good reason.

The other issue is that the data pointer doesn't always refer to memory that the ndarray has allocated, so it's actually incorrect to think of the ndarray as both the block of memory and the dimensioned indexing. The memory pointer is just that (a memory pointer). We are currently allowing ndarrays to create their own memory, but that could easily change so that they always use some other object to allocate memory. In short, I don't see how to really do it so that the base object is actually useable.

From ndarray at mac.com Thu Feb 23 00:42:05 2006
From: ndarray at mac.com (Sasha)
Date: Thu Feb 23 00:42:05 2006
Subject: [Numpy-discussion] Re: Why floor and ceil change the type of the array?
In-Reply-To: <43FD62C2.7070101@ieee.org>
References: <43FD62C2.7070101@ieee.org>
Message-ID:

On 2/23/06, Travis Oliphant wrote:
> How do you propose to determine if the output fits into the data-type?
> Are you proposing to have different output rules for different
> functions? Sheer madness... The rules now are (relatively) simple and
> easy to program to.

I did not propose that. I just mentioned the output to argue that the rule to use the minimal floating point type that can represent the input is an arbitrary one, and no better than casting all integers to doubles. "Sheer madness...", however, is too strong a characterization. Note that python (so far) changes the type of the result depending on its value in some cases:

>>> type(2**30)
<type 'int'>
>>> type(2**32)
<type 'long'>

This is probably unacceptable to numpy for performance reasons, but it is not madness. Try to explain the following to someone who is used to python arithmetic:

>>> 2*array(2**62)+array(2*2**62)
0L

> Hardly a good example. Are you also concerned about the following?
>
> >>> exp(1000)
> inf
>
> >>> exp(array(1000,'g'))
> 1.97007111401704699387e+434

No, but I think it is because I am conditioned by C. To me exp() is a double-valued function that happened to work for ints with the help of an implicit cast. You may object that this is so because C does not allow function overloading, but C++ does overload exp, so that exp((float)1) is float and exp((long double)1) is long double, while exp((short)1), exp((char)1) and exp((long long)1) are all double. Both numpy and C++ made an arbitrary design choice. I find the C++ choice simpler and more natural, but I can live with the numpy choice once I've learned what it is.

From josegomez at gmx.net Thu Feb 23 01:05:08 2006
From: josegomez at gmx.net (Jose Gomez-Dans)
Date: Thu Feb 23 01:05:08 2006
Subject: [Numpy-discussion] Success with SVN on Cygwin
Message-ID:

Hi!
A few days ago, I asked about compiling NumPy on Cygwin. Travis carried out some modifications, and with last night's SVN, I can happily report that it now compiles and works. The tests produced no errors, so it's all good :)

Many thanks to all, and to Travis especially, for his really fast response. Many thanks!

Jose
From stefan at sun.ac.za Thu Feb 23 02:47:04 2006
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Thu Feb 23 02:47:04 2006
Subject: [Numpy-discussion] Multiple inheritance from ndarray
In-Reply-To: <43FD32E4.10600@ieee.org>
References: <5809AC56-B2DF-4403-B7BC-9AEEAAC78505@astro.princeton.edu> <43FD32E4.10600@ieee.org>
Message-ID: <20060223104601.GB26706@alpha>

On Wed, Feb 22, 2006 at 08:58:28PM -0700, Travis Oliphant wrote:
> Here's an outline of what you need to do. This is, of course,
> untested.... For example, I don't really know what actImage is.
>
> from numpy import ndarray, array
>
> class Image(ndarray, actImage):
>     def __new__(subtype, *args):
>         act1 = actImage.__new__(actImage, *args)
>         actImage.__init__(act1, *args)
>         arr = array(act1.getArray(), 'd', copy=False)
>         self = arr.view(subtype)
>         # you might need to copy attributes from act1 over to self here...
>         return self

This is probably the right place to use super, i.e.:

def __new__(subtype, *args):
    act1 = super(Image, subtype).__new__(subtype, *args)
    ...

def __init__(self, *args):
    super(Image, self).__init__(*args)

The attached script shows how multiple inheritance runs through different classes.

Stéfan

-------------- next part --------------
A non-text attachment was scrubbed...
Name: inh.py
Type: text/x-python
Size: 855 bytes
Desc: not available
URL:

From fullung at gmail.com Thu Feb 23 03:32:03 2006
From: fullung at gmail.com (Albert Strasheim)
Date: Thu Feb 23 03:32:03 2006
Subject: [Numpy-discussion] repmat equivalent?
Message-ID: <5eec5f300602230331sefc7640r8298011705c424f2@mail.gmail.com>

Hello all

I recently started using NumPy and one function that I am really missing from MATLAB/Octave is repmat. This function is very useful for implementing algorithms as matrix multiplications instead of for loops.

Here's my first attempt at repmat for 1d and 2d (with some optimization by Stefan van der Walt):

def repmat(a, m, n):
    if a.ndim == 1:
        a = array([a])
    (origrows, origcols) = a.shape
    rows = origrows * m
    cols = origcols * n
    b = a.reshape(1,a.size).repeat(m, 0).reshape(rows, origcols).repeat(n, 0)
    return b.reshape(rows, cols)

print repmat(array([[1,2],[3,4]]), 2, 3)

produces:

[[1 2 1 2 1 2]
 [3 4 3 4 3 4]
 [1 2 1 2 1 2]
 [3 4 3 4 3 4]]

which is the same as in MATLAB. There are various issues with my function that I don't quite know how to solve:

- How to handle scalar inputs (probably need asarray here)
- How to handle more than 2 dimensions

More than 2 dimensions is tricky, since NumPy and MATLAB don't seem to agree on how more-dimensional data is organised. As such, I don't know what a NumPy user would expect repmat to do with more than 2 dimensions.

Here are some test cases that the current repmat should pass, but doesn't:

a = repmat(1, 1, 1)
assert_equal(a, 1)
a = repmat(array([1]), 1, 1)
assert_array_equal(a, array([1]))
a = repmat(array([1,2]), 2, 3)
assert_array_equal(a, array([[1,2,1,2,1,2], [1,2,1,2,1,2]]))
a = repmat(array([[1,2],[3,4]]), 2, 3)
assert_array_equal(a, array([[1,2,1,2,1,2], [3,4,3,4,3,4],
                             [1,2,1,2,1,2], [3,4,3,4,3,4]]))

Any suggestions on how to do repmat in NumPy would be appreciated.

Regards

Albert

From cimrman3 at ntc.zcu.cz Thu Feb 23 03:47:04 2006
From: cimrman3 at ntc.zcu.cz (Robert Cimrman)
Date: Thu Feb 23 03:47:04 2006
Subject: [Numpy-discussion] system_info problem
Message-ID: <43FDA0A8.6020105@ntc.zcu.cz>

Hi,

I am trying to (finally) move UMFPACK out of the sandbox into scipy proper, and so I need checking for its libraries via system_info.py of numpy.distutils.
I have added a new section to site.cfg:

[umfpack]
library_dirs = /home/share/software/packages/UMFPACK/UMFPACK/Lib:/home/share/software/packages/UMFPACK/AMD/Lib
include_dirs = /home/share/software/packages/UMFPACK/UMFPACK/Include
umfpack_libs = umfpack, amd

The names of the libraries are libumfpack.a and libamd.a -- they are correctly found in 'system_info._check_libs()' by 'self._lib_list(lib_dir, libs, exts)', but then the function fails in the check 'len(found_libs) == len(libs)', which is wrong. Can some numpy.distutils expert help me? Below is the new umfpack_info class I have written using the blas_info class as a template.

yours clueless,
r.

PS: I do this because I prefer having the umfpack installed separately. It will be used, if present, to replace the default superLU-based sparse solver. Moving its sources under scipy/Lib/sparse would solve this issue, but Tim Davis recently changed the license of UMFPACK to GPL, and so the last version available for direct inclusion is 4.4 (4.6 is the current one). Opinions are of course welcome.

--
class umfpack_info(system_info):
    section = 'umfpack'
    dir_env_var = 'UMFPACK'
    _lib_names = ['umfpack', 'amd']
    includes = 'umfpack.h'
    notfounderror = UmfpackNotFoundError

    def calc_info(self):
        info = {}
        lib_dirs = self.get_lib_dirs()
        print lib_dirs
        umfpack_libs = self.get_libs('umfpack_libs', self._lib_names)
        print umfpack_libs
        for d in lib_dirs:
            libs = self.check_libs(d, umfpack_libs, [])
            print d, libs
            if libs is not None:
                dict_append(info, libraries=libs)
                break
        else:
            return

        include_dirs = self.get_include_dirs()
        print include_dirs
        h = (self.combine_paths(lib_dirs+include_dirs, includes) or [None])[0]
        if h:
            h = os.path.dirname(h)
            dict_append(info, include_dirs=[h])
        print info
        self.set_info(**info)

From stefan at sun.ac.za Thu Feb 23 03:58:03 2006
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Thu Feb 23 03:58:03 2006
Subject: [Numpy-discussion] repmat equivalent?
In-Reply-To: <5eec5f300602230331sefc7640r8298011705c424f2@mail.gmail.com>
References: <5eec5f300602230331sefc7640r8298011705c424f2@mail.gmail.com>
Message-ID: <20060223115630.GB28483@alpha>

On Thu, Feb 23, 2006 at 01:31:47PM +0200, Albert Strasheim wrote:
> More than 2 dimensions is tricky, since NumPy and MATLAB don't seem to
> agree on how more-dimensional data is organised. As such, I don't know
> what a NumPy user would expect repmat to do with more than 2
> dimensions.

To expand on this, here is what I see when I create (M,N,3) matrices in both octave and numpy. I expect to see an MxN matrix stacked 3 high:

octave
------
octave:1> zeros(2,2,3)
ans =

ans(:,:,1) =

  0  0
  0  0

ans(:,:,2) =

  0  0
  0  0

ans(:,:,3) =

  0  0
  0  0

numpy
-----
In [19]: zeros((2,3,3))
Out[19]:
array([[[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]],

       [[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]]])

There is nothing wrong with numpy's array -- but the output generated seems counter-intuitive.

Stéfan

From nadavh at visionsense.com Thu Feb 23 04:12:06 2006
From: nadavh at visionsense.com (Nadav Horesh)
Date: Thu Feb 23 04:12:06 2006
Subject: [Numpy-discussion] repmat equivalent?
Message-ID: <07C6A61102C94148B8104D42DE95F7E8C8EF3F@exchange2k.envision.co.il>

You should really use the "repeat" function.

Nadav.

-----Original Message-----
From: numpy-discussion-admin at lists.sourceforge.net on behalf of Albert Strasheim
Sent: Thu 23-Feb-06 13:31
To: numpy-discussion at lists.sourceforge.net
Cc:
Subject: [Numpy-discussion] repmat equivalent?

Hello all

I recently started using NumPy and one function that I am really missing from MATLAB/Octave is repmat.
This function is very useful for implementing algorithms as matrix multiplications instead of for loops.

[SNIP]

From fullung at gmail.com Thu Feb 23 04:37:04 2006
From: fullung at gmail.com (Albert Strasheim)
Date: Thu Feb 23 04:37:04 2006
Subject: [Numpy-discussion] repmat equivalent?
In-Reply-To: <07C6A61102C94148B8104D42DE95F7E8C8EF3F@exchange2k.envision.co.il>
References: <07C6A61102C94148B8104D42DE95F7E8C8EF3F@exchange2k.envision.co.il>
Message-ID: <5eec5f300602230436h7655781dq8b34fa926e7e613f@mail.gmail.com>

Hello

The problem is that repeat is not the same as repmat. For example:

>> repmat([1 2; 3 4], 2, 1)
ans =

     1     2
     3     4
     1     2
     3     4

In [12]: repeat(array([[1, 2],[3,4]]), 2)
Out[12]:
array([[1, 2],
       [1, 2],
       [3, 4],
       [3, 4]])

How can I use the repeat function as is to accomplish this? Thanks.

Regards

Albert

On 2/23/06, Nadav Horesh wrote:
> You should really use the "repeat" function.
>
> Nadav.

From fullung at gmail.com Thu Feb 23 04:41:13 2006
From: fullung at gmail.com (Albert Strasheim)
Date: Thu Feb 23 04:41:13 2006
Subject: [Numpy-discussion] Re: repmat equivalent?
In-Reply-To: <5eec5f300602230331sefc7640r8298011705c424f2@mail.gmail.com> References: <5eec5f300602230331sefc7640r8298011705c424f2@mail.gmail.com> Message-ID: <5eec5f300602230440p75c9e542p4fc54864bd78d408@mail.gmail.com> Hello all Just to clear up any confusion: On 2/23/06, Albert Strasheim wrote: > Here are some test cases that the current repmat should pass, but doesn't: > > a = repmat(1, 1, 1) > assert_equal(a, 1) > a = repmat(array([1]), 1, 1) > assert_array_equal(a, array([1])) > a = repmat(array([1,2]), 2, 3) > assert_array_equal(a, array([[1,2,1,2,1,2], [1,2,1,2,1,2]])) > a = repmat(array([[1,2],[3,4]]), 2, 3) > assert_array_equal(a, array([[1,2,1,2,1,2], [3,4,3,4,3,4], > [1,2,1,2,1,2], [3,4,3,4,3,4]])) Only the first two tests fail. The other two pass. Presumably any test that uses a matrix with more than 2 dimensions will also fail. Regards Albert From fullung at gmail.com Thu Feb 23 04:46:01 2006 From: fullung at gmail.com (Albert Strasheim) Date: Thu Feb 23 04:46:01 2006 Subject: [Numpy-discussion] repmat equivalent? In-Reply-To: <20060223115630.GB28483@alpha> References: <5eec5f300602230331sefc7640r8298011705c424f2@mail.gmail.com> <20060223115630.GB28483@alpha> Message-ID: <5eec5f300602230445m4acae431u6ec3cc19ed8d20b0@mail.gmail.com> Hello all On 2/23/06, Stefan van der Walt wrote: > On Thu, Feb 23, 2006 at 01:31:47PM +0200, Albert Strasheim wrote: > > More than 2 dimensions is tricky, since NumPy and MATLAB don't seem to > > agree on how more-dimensional data is organised? As such, I don't know > > what a NumPy user would expect repmat to do with more than 2 > > dimensions. > > To expand on this, here is what I see when I create (M,N,3) matrices > in both octave and numpy. I expect to see an MxN matrix stacked 3 > high: There are other (unexpected, for me at least) differences between MATLAB/Octave and NumPy too. For a 3D array in MATLAB, only indexing on the last dimension yields a 2D array, where NumPy always returns a 2D array. I put some examples for the 3D case at: http://students.ee.sun.ac.za/~albert/numscipy.html Regards Albert From wbaxter at gmail.com Thu Feb 23 05:07:03 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Thu Feb 23 05:07:03 2006 Subject: [Numpy-discussion] Timings for various round functions In-Reply-To: <43FD51DE.4000400@ieee.org> References: <43FD51DE.4000400@ieee.org> Message-ID: On 2/23/06, Travis Oliphant wrote: >Sasha wrote: >I think we have agreed that C99 functions are good candidates to become >ufuncs. The only problem is figuring out what to do on platforms that >don't define them. >For example, we could define a separate module of C99 functions that is >only available on certain platforms. Might this be some help? http://mega-nerd.com/FPcast/ --Bill -------------- next part -------------- An HTML attachment was scrubbed... URL: From NadavH at VisionSense.com Thu Feb 23 08:15:18 2006 From: NadavH at VisionSense.com (Nadav Horesh) Date: Thu Feb 23 08:15:18 2006 Subject: [Numpy-discussion] algorithm, optimization, or other problem? In-Reply-To: References: <07C6A61102C94148B8104D42DE95F7E8C8EF39@exchange2k.envision.co.il> Message-ID: <43FDFB80.9090106@VisionSense.com> An HTML attachment was scrubbed... 
URL:

From tim.hochberg at cox.net Thu Feb 23 08:29:06 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Thu Feb 23 08:29:06 2006
Subject: [Numpy-discussion] optimizing power() for complex and real cases
In-Reply-To: <43FD31FA.6030802@cox.net>
References: <43F20BFE.5030100@cox.net> <43F36CB6.5050004@cox.net> <43F3B8A9.3000507@bigpond.net.au> <43F3CA12.4000907@cox.net> <43F62069.80209@cox.net> <43F7C73B.2000806@cox.net> <20060220003714.GB15783@arbutus.physics.mcmaster.ca> <43F938FA.80200@cox.net> <43FD31FA.6030802@cox.net>
Message-ID: <43FDE27C.9040606@cox.net>

I had some free time this morning, so I merged the nc_pow optimizations for integral powers in and committed them to the power_optimization branch. They could probably use more testing, but I thought someone might like to take a look while I'm out of town. Also, if you're looking for a way to do powers as successive multiplies for fast_power or scalar_power or whatnot, starting with the algorithm for nc_pow would probably be a good place to start.

Hmm. Looking at this now, I realize I'm shadowing the input complex number 'a' with a local 'a'. Too much mindless copying from complexobject. That should be fixed, but I can't do it right now.

Enjoy,

-tim

From faltet at carabos.com Thu Feb 23 09:04:03 2006
From: faltet at carabos.com (Francesc Altet)
Date: Thu Feb 23 09:04:03 2006
Subject: [Numpy-discussion] A case for rank-0 arrays
In-Reply-To:
References:
Message-ID: <200602231803.27417.faltet@carabos.com>

Hi Sasha,

On Saturday 18 February 2006 20:49, Sasha wrote:
> I have reviewed mailing list discussions of rank-0 arrays vs. scalars
> and I concluded that the current implementation that contains both is
> (almost) correct. I will address the "almost" part with a concrete
> proposal at the end of this post (search for PROPOSALS if you are only
> interested in the practical part).
>
> The main criticism of supporting both scalars and rank-0 arrays is
> that it is "unpythonic" in the sense that it provides two almost
> equivalent ways to achieve the same result. However, I am now
> convinced that this is the case where practicality beats purity.

It's a bit late, but I want to support your proposal (most of it). I've also come to the conclusion that scalars and rank-0 arrays should coexist. This is something that appears as a natural fact when you have to deal regularly with general algorithms for treating objects with different shapes. And I think you have put this very well.

Thanks,

--
>0,0<   Francesc Altet     http://www.carabos.com/
V V     Cárabos Coop. V.   Enjoy Data
 "-"

From Chris.Barker at noaa.gov Thu Feb 23 09:20:05 2006
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Thu Feb 23 09:20:05 2006
Subject: [Numpy-discussion] repmat equivalent?
In-Reply-To: <5eec5f300602230445m4acae431u6ec3cc19ed8d20b0@mail.gmail.com>
References: <5eec5f300602230331sefc7640r8298011705c424f2@mail.gmail.com> <20060223115630.GB28483@alpha> <5eec5f300602230445m4acae431u6ec3cc19ed8d20b0@mail.gmail.com>
Message-ID: <43FDEEA8.9060901@noaa.gov>

Albert Strasheim wrote:
> There are other (unexpected, for me at least) differences between
> MATLAB/Octave and NumPy too.

First: numpy is not, and was never intended to be, a MATLAB clone, work-alike, whatever. You should *expect* there to be differences.

> For a 3D array in MATLAB, only indexing
> on the last dimension yields a 2D array, where NumPy always returns a
> 2D array.

I think the key here is that MATLAB's core data type is a matrix, which is 2-d.
The ability to do 3-d arrays was added later, and it looks like they are still preserving the core matrix concept, so that a 3-d array is not really a 3-d array; it is, as someone on this thread mentioned, a "stack" of matrices. In numpy, the core data type is an n-d array. That means that there is nothing special about 2-d vs 4-d vs whatever, except 0-d (scalars). So a 3-d array is a cube shape, that you might want to pull a 2-d array out of it in any orientation. There's nothing special about which axis you're indexing. For that reason, it's very important that indexing any axis will give you the same rank array. Here's the rule: -- indexing reduces the rank by 1, regardless of which axis is being indexed. >>> import numpy as N >>> a = N.zeros((2,3,4)) >>> a array([[[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]], [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]]) >>> a[0,:,:] array([[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]) >>> a[:,0,:] array([[0, 0, 0, 0], [0, 0, 0, 0]]) >>> a[:,:,0] array([[0, 0, 0], [0, 0, 0]]) -- slicing does not reduce the rank: >>> a[:,0:1,:] array([[[0, 0, 0, 0]], [[0, 0, 0, 0]]]) >>> a[:,0:1,:].shape (2, 1, 4) It's actually very clean, logical, and useful. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From bblais at bryant.edu Thu Feb 23 09:35:11 2006 From: bblais at bryant.edu (Brian Blais) Date: Thu Feb 23 09:35:11 2006 Subject: [Numpy-discussion] algorithm, optimization, or other problem? In-Reply-To: <43FDFB80.9090106@VisionSense.com> References: <07C6A61102C94148B8104D42DE95F7E8C8EF39@exchange2k.envision.co.il> <43FDFB80.9090106@VisionSense.com> Message-ID: <43FDF1B8.3070608@bryant.edu> Nadav Horesh wrote: > It is slower. > > I did a little study on this issue since I got into the issue of > algorithms that can not be easily vectorized (like this one). > On my PC an outer loop step took initially 17.3 seconds, and some > optimization brought it down to ~11 seconds. The dot product consumed > about 1/3 of the time. I estimate that objects creation/destruction > consumes most of the cpu time. It seems that this way comes nowhere near > cmex speed. I suspect that maybe blitz/boost may bridge the gap. > yeah, I realized that pure python would be too slow, because I ran into the exact same problem with matlab scripting. these time-dependent loops are really a mess when it comes to speed optimization. After posting to the Pyrex list, someone pointed out that my loop variables had not been declared as c datatypes. so, I had loops like: for it from 0 <= it <= 1000: for i from 0 <= i <= 100: (stuff) and the "it" and "i" were being treated, due to my oversight, as python variables. for speed, you need to have all the variables in the loop as c datatypes. just putting a line in like: cdef int it,i increases the speed from 8 seconds per block to 0.2 seconds per block, which is comparable to the mex. I learned that I have to be a bit more careful! :) thanks, Brian Blais -- ----------------- bblais at bryant.edu http://web.bryant.edu/~bblais From oliphant.travis at ieee.org Thu Feb 23 10:27:02 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Feb 23 10:27:02 2006 Subject: [Numpy-discussion] Success with SVN on Cygwin In-Reply-To: References: Message-ID: <43FDFE3C.8070700@ieee.org> Jose Gomez-Dans wrote: >Hi! >A few days ago, I asked about compiling NumPy on Cygwin. 
Travis carried out some
>modifications, and with last night's SVN, I can happily report that it now
>compiles and works. The tests produced no errors, so it's all good :)

That's good news. I wish our unit test coverage was wide enough that this actually meant that all is good :-) But, it's a good start.

Thanks are really due to the cygwin ports people, who already had a patch (though they didn't let us know about it --- I found it using Google). I just incorporated their patch into the build tree. I'm glad to know that it works.

-Travis

From cjw at sympatico.ca Thu Feb 23 11:20:04 2006
From: cjw at sympatico.ca (Colin J. Williams)
Date: Thu Feb 23 11:20:04 2006
Subject: [Numpy-discussion] A question
Message-ID: <43FE0A87.8050907@sympatico.ca>

Bar is a sub-class of ArrayType (ndarray) and bar is an instance of Bar:

[Dbg]>>> self.bar
Bar([ 5,  1,  1, 11,  3,  7, 14,  0,  5,  2,  4, 12,  9, 10,  4, 12], dtype=uint16)
[Dbg]>>> z= self.bar
[Dbg]>>> z
1                 << Is this expected?
[Dbg]>>> type(self.bar)

[Dbg]>>> self.bar.__class__.base

[Dbg]>>>

Colin W.

From vidar+list at 37mm.no Thu Feb 23 12:02:02 2006
From: vidar+list at 37mm.no (Vidar Gundersen)
Date: Thu Feb 23 12:02:02 2006
Subject: [Numpy-discussion] inconsistent use of axis= keyword argument?
Message-ID:

(i've been updating the cross reference of MATLAB synonymous commands in Numeric Python to NumPy. I've kept Numeric/numarray alternatives in the source XML, but omitted it in the PDF outputs. see http://37mm.no/download/matlab-python-xref.pdf. feedback is highly appreciated.)

as i was working on this, i started wondering why

a.max(0), a.min(0), a.ptp(0), a.flatten(0), ...

does not allow the axis=0 keyword argument used with the exact same meaning for:

m.mean(axis=0), m.sum(axis=0), ...

and i also wonder why concatenate can't be used to stack 1-d arrays on top of each other, returning a 2-d array? axis relates to the number of axes in the original array(s)?

In [3]: v = arange(9)

In [7]: concatenate((v,v))
Out[7]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 0, 1, 2, 3, 4, 5, 6, 7, 8])

In [8]: concatenate((v,v),axis=0)
Out[8]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 0, 1, 2, 3, 4, 5, 6, 7, 8])

In [15]: concatenate((v,v)).reshape(2,-1)
Out[15]:
array([[0, 1, 2, 3, 4, 5, 6, 7, 8],
       [0, 1, 2, 3, 4, 5, 6, 7, 8]])

In [5]: m = v.reshape(3,-1)

In [10]: concatenate((m,m))
Out[10]:
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8],
       [0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

In [11]: concatenate((m,m), axis=0)
Out[11]:
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8],
       [0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

In [12]: concatenate((m,m), axis=1)
Out[12]:
array([[0, 1, 2, 0, 1, 2],
       [3, 4, 5, 3, 4, 5],
       [6, 7, 8, 6, 7, 8]])

From fullung at gmail.com Thu Feb 23 12:12:06 2006
From: fullung at gmail.com (Albert Strasheim)
Date: Thu Feb 23 12:12:06 2006
Subject: [Numpy-discussion] repmat equivalent?
In-Reply-To: <43FDEEA8.9060901@noaa.gov>
References: <5eec5f300602230331sefc7640r8298011705c424f2@mail.gmail.com> <20060223115630.GB28483@alpha> <5eec5f300602230445m4acae431u6ec3cc19ed8d20b0@mail.gmail.com> <43FDEEA8.9060901@noaa.gov>
Message-ID: <5eec5f300602231211i23f1f199of0dd76105f6a2666@mail.gmail.com>

Hello all

On 2/23/06, Christopher Barker wrote:
> Albert Strasheim wrote:
> > There are other (unexpected, for me at least) differences between
> > MATLAB/Octave and NumPy too.
>
> First: numpy is not, and was never intended to be, a MATLAB clone,
> work-alike, whatever. You should *expect* there to be differences.

I understand this. As a new user, I'm trying to understand these differences.
> > For a 3D array in MATLAB, only indexing > > on the last dimension yields a 2D array, where NumPy always returns a > > 2D array. > > I think the key here is that MATLAB's core data type is a matrix, which > is 2-d. The ability to do 3-d arrays was added later, and it looks like > they are still preserving the core matrix concept, so that a 3-d array > is not really a 3-d array; it is, as someone on this thread mentioned, a > "stack" of matrices. > > In numpy, the core data type is an n-d array. That means that there is > nothing special about 2-d vs 4-d vs whatever, except 0-d (scalars). So a > 3-d array is a cube shape, that you might want to pull a 2-d array out > of it in any orientation. There's nothing special about which axis > you're indexing. For that reason, it's very important that indexing any > axis will give you the same rank array. > > Here's the rule: > > -- indexing reduces the rank by 1, regardless of which axis is being > indexed. Thanks for your comments. These cleared up a few questions I had about NumPy's design. However, I'm still wondering how the average NumPy user would expect repmat implemented for NumPy to behave with arrays with more than 2 dimensions. I would like to clear this up, since I think that a good repmat function is an essential tool for implementing algorithms that use matrix multiplication instead of for loops to perform operations (hopefully with a significant speed increase). If there is another way of accomplishing this, I would love to know. Regards Albert From oliphant.travis at ieee.org Thu Feb 23 12:26:06 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Feb 23 12:26:06 2006 Subject: [Numpy-discussion] inconsistent use of axis= keyword argument? In-Reply-To: References: Message-ID: <43FE1A38.8060101@ieee.org> Vidar Gundersen wrote: >(i've been updating the cross reference of MATLAB synonymous >commands in Numeric Python to NumPy. I've kept Numeric/numarray >alternatives in the source XML, but omitted it in the PDF outputs. >see, http://37mm.no/download/matlab-python-xref.pdf. >feedback is highly appreciated.) > > >as i was working on this, i started wondering why > >a.max(0), a.min(0), a.ptp(0), a.flatten(0), ... > >does not allow the axis=0 keyword argument used with >the exact same meaning for: > > It's actually consistent. These only have a single argument and so don't use keywords. But, I can see now that it might be nice to have keywords even if there is only one argument. Feel free to submit a patch. >m.mean(axis=0), m.sum(axis=0), ... > > These have multiple arguments, so the keywords are important. > >and i also wonder why concatenate can't be used to stack 1-d >arrays on top of each other, returning a 2-d array? >axis relates to the number of axes in the original array(s)? > > Because it's ambiguous what you mean to do. 1-d arrays only have a single axis. How do you propose to tell concatenate to alter the shape of the output array and in what direction? We've left it to the user to do that, like you do in the second example. When you have more than one dimension on input, then it is clear what you mean by "stack" along an axis. With only one-dimension, it isn't clear what is meant. From oliphant.travis at ieee.org Thu Feb 23 12:34:02 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Feb 23 12:34:02 2006 Subject: [Numpy-discussion] repmat equivalent? 
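To make the current split concrete, both of these spellings already work -- a quick sketch:

>>> from numpy import arange
>>> a = arange(6).reshape((2,3))
>>> a.max(0)            # single-argument methods: positional axis
array([3, 4, 5])
>>> a.sum(axis=0)       # methods with more arguments take the keyword
array([3, 5, 7])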
In-Reply-To: <5eec5f300602230331sefc7640r8298011705c424f2@mail.gmail.com> References: <5eec5f300602230331sefc7640r8298011705c424f2@mail.gmail.com> Message-ID: <43FE1BFA.9070709@ieee.org> Albert Strasheim wrote: >Hello all > >I recently started using NumPy and one function that I am really >missing from MATLAB/Octave is repmat. This function is very useful for >implementing algorithms as matrix multiplications instead of for >loops. > > There is a function in scipy.linalg called kron that could be brought over which can do a repmat. In file: /usr/lib/python2.4/site-packages/scipy/linalg/basic.py def kron(a,b): """kronecker product of a and b Kronecker product of two matrices is block matrix [[ a[ 0 ,0]*b, a[ 0 ,1]*b, ... , a[ 0 ,n-1]*b ], [ ... ... ], [ a[m-1,0]*b, a[m-1,1]*b, ... , a[m-1,n-1]*b ]] """ if not a.flags['CONTIGUOUS']: a = reshape(a, a.shape) if not b.flags['CONTIGUOUS']: b = reshape(b, b.shape) o = outerproduct(a,b) o=o.reshape(a.shape + b.shape) return concatenate(concatenate(o, axis=1), axis=1) Thus, kron(ones((2,3)), arr) >>> sl.kron(ones((2,3)),arr) array([[1, 2, 1, 2, 1, 2], [3, 4, 3, 4, 3, 4], [1, 2, 1, 2, 1, 2], [3, 4, 3, 4, 3, 4]]) gives you the equivalent of repmat(arr, 2,3) We could bring this over from scipy into numpy as it is simple enough. It has a multidimensional extension (i.e. you can pass in a and b as higher dimensional arrays), But, don't ask me to explain it to you because I can't without further study.... -Travis From robert.kern at gmail.com Thu Feb 23 12:44:04 2006 From: robert.kern at gmail.com (Robert Kern) Date: Thu Feb 23 12:44:04 2006 Subject: [Numpy-discussion] Re: inconsistent use of axis= keyword argument? In-Reply-To: References: Message-ID: Vidar Gundersen wrote: > and i also wonder why concatenate can't be used to stack 1-d > arrays on top of each other, returning a 2-d array? Use vstack() for that. Also note its companions, hstack() and column_stack(). -- Robert Kern robert.kern at gmail.com "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From Norbert.Nemec.list at gmx.de Thu Feb 23 12:44:05 2006 From: Norbert.Nemec.list at gmx.de (Norbert Nemec) Date: Thu Feb 23 12:44:05 2006 Subject: [Numpy-discussion] Up-to-date bugtracker for NumPy? Message-ID: <43FE1E73.8020103@gmx.de> Hi there, is there any bugtracker for NumPy actually in active use? Sourceforge has one for numarray and one for "numpy", but the latter one contains only old bugs (probably for Numeric?) Greetings, Norbert From Norbert.Nemec.list at gmx.de Thu Feb 23 12:54:03 2006 From: Norbert.Nemec.list at gmx.de (Norbert Nemec) Date: Thu Feb 23 12:54:03 2006 Subject: [Numpy-discussion] Up-to-date bugtracker for NumPy? In-Reply-To: <43FE1E73.8020103@gmx.de> References: <43FE1E73.8020103@gmx.de> Message-ID: <43FE20C7.90001@gmx.de> Guess the question was answered before I even asked it: The bug that I reported two hours ago has already been fixed and closed by Travis. Amazing reaction time! Norbert Nemec wrote: >Hi there, > >is there any bugtracker for NumPy actually in active use? Sourceforge >has one for numarray and one for "numpy", but the latter one contains >only old bugs (probably for Numeric?) > >Greetings, >Norbert > > >------------------------------------------------------- >This SF.Net email is sponsored by xPML, a groundbreaking scripting language >that extends applications into web and mobile media. Attend the live webcast >and join the prime developer group breaking into this new coding territory! 
--
Robert Kern
robert.kern at gmail.com

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
  -- Richard Harter

From Norbert.Nemec.list at gmx.de Thu Feb 23 12:44:05 2006
From: Norbert.Nemec.list at gmx.de (Norbert Nemec)
Date: Thu Feb 23 12:44:05 2006
Subject: [Numpy-discussion] Up-to-date bugtracker for NumPy?
Message-ID: <43FE1E73.8020103@gmx.de>

Hi there,

is there any bugtracker for NumPy actually in active use? Sourceforge has one for numarray and one for "numpy", but the latter one contains only old bugs (probably for Numeric?)

Greetings,
Norbert

From Norbert.Nemec.list at gmx.de Thu Feb 23 12:54:03 2006
From: Norbert.Nemec.list at gmx.de (Norbert Nemec)
Date: Thu Feb 23 12:54:03 2006
Subject: [Numpy-discussion] Up-to-date bugtracker for NumPy?
In-Reply-To: <43FE1E73.8020103@gmx.de>
References: <43FE1E73.8020103@gmx.de>
Message-ID: <43FE20C7.90001@gmx.de>

Guess the question was answered before I even asked it: the bug that I reported two hours ago has already been fixed and closed by Travis. Amazing reaction time!

Norbert Nemec wrote:

>Hi there,
>
>is there any bugtracker for NumPy actually in active use? Sourceforge
>has one for numarray and one for "numpy", but the latter one contains
>only old bugs (probably for Numeric?)
>
>Greetings,
>Norbert

From robert.kern at gmail.com Thu Feb 23 13:17:06 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Thu Feb 23 13:17:06 2006
Subject: [Numpy-discussion] Re: Up-to-date bugtracker for NumPy?
In-Reply-To: <43FE1E73.8020103@gmx.de>
References: <43FE1E73.8020103@gmx.de>
Message-ID:

Norbert Nemec wrote:
> Hi there,
>
> is there any bugtracker for NumPy actually in active use? Sourceforge
> has one for numarray and one for "numpy", but the latter one contains
> only old bugs (probably for Numeric?)

http://projects.scipy.org/scipy/numpy

Click "New Ticket" up at the top to enter a new bug.

--
Robert Kern
robert.kern at gmail.com

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
  -- Richard Harter

From fullung at gmail.com Thu Feb 23 13:22:01 2006
From: fullung at gmail.com (Albert Strasheim)
Date: Thu Feb 23 13:22:01 2006
Subject: [Numpy-discussion] repmat equivalent?
In-Reply-To: <43FE1BFA.9070709@ieee.org>
References: <5eec5f300602230331sefc7640r8298011705c424f2@mail.gmail.com> <43FE1BFA.9070709@ieee.org>
Message-ID: <5eec5f300602231321x12da7721m8d7fa2aa7fc6edf0@mail.gmail.com>

Hello all

On 2/23/06, Travis Oliphant wrote:
> There is a function in scipy.linalg called kron that could be brought
> over which can do a repmat.
>
> Thus,
>
> kron(ones((2,3)), arr)
>
> >>> sl.kron(ones((2,3)),arr)
> array([[1, 2, 1, 2, 1, 2],
>        [3, 4, 3, 4, 3, 4],
>        [1, 2, 1, 2, 1, 2],
>        [3, 4, 3, 4, 3, 4]])
>
> gives you the equivalent of
>
> repmat(arr, 2, 3)

Thanks! Merging this into numpy would be much appreciated. Stefan van der Walt did some benchmarks, and this approach seems faster than anything we managed for 2D arrays.

However, I'm a bit concerned about the ones((m, n)) array that is needed by this implementation. It seems to me that this would double the memory requirements of this repmat function, which is fine when working with small matrices, but could be a problem with larger ones. Any thoughts?

Regards

Albert

From fullung at gmail.com Thu Feb 23 13:23:01 2006
From: fullung at gmail.com (Albert Strasheim)
Date: Thu Feb 23 13:23:01 2006
Subject: [Numpy-discussion] repmat equivalent?
In-Reply-To: <43FE1BFA.9070709@ieee.org>
References: <5eec5f300602230331sefc7640r8298011705c424f2@mail.gmail.com> <43FE1BFA.9070709@ieee.org>
Message-ID: <5eec5f300602231322r2793cdd9r9d3d8310a8fec559@mail.gmail.com>

Hello

On 2/23/06, Travis Oliphant wrote:
> Albert Strasheim wrote:
>
> >Hello all
> >
> >I recently started using NumPy and one function that I am really
> >missing from MATLAB/Octave is repmat. This function is very useful for
> >implementing algorithms as matrix multiplications instead of for
> >loops.
>
> There is a function in scipy.linalg called kron that could be brought
> over which can do a repmat.
I quickly tried a few of my test cases with the following implementation of repmat:

from numpy import asarray, ones
from scipy.linalg import kron

def repmat(a, m, n):
    a = asarray(a)
    return kron(ones((m, n)), a)

This test:

a = repmat(1, 1, 1)
assert_equal(a, 1)

fails with:

ValueError: 0-d arrays can't be concatenated

and this test:

a = repmat(array([1,2]), 2, 3)
assert_array_equal(a, array([[1,2,1,2,1,2],
                             [1,2,1,2,1,2]]))

fails with:

AssertionError: Arrays are not equal (shapes (12,), (2, 6) mismatch)

Regards Albert

From alex.liberzon at gmail.com Thu Feb 23 15:46:08 2006 From: alex.liberzon at gmail.com (Alex Liberzon) Date: Thu Feb 23 15:46:08 2006 Subject: [Numpy-discussion] repmat equivalent Message-ID: <775f17a80602231545i21053caat78d9d87189d8f1c4@mail.gmail.com>

I am also mostly a Matlab user and I like repmat() a lot. I just realized that in SciPy (I am confused about NumPy/SciPy, but it is possible in both :-)) it is much, much easier. Just use r_[a,a] and c_[a,a] and you get a concatenation like repmat() does. If you need 'm' times row concatenation of a matrix, you can use (sorry for the ugly way, Pythoners):

eval('r_['+m*'a,'+']')

then the repmat is just (cute, isn't it):

def repmat(a,m,n):
    from scipy import r_, c_
    a = eval('r_['+m*'a,'+']')
    return eval('c_['+n*'a,'+']')

the test is:

>>> from scipy import *
numerix Numeric 24.2
>>> a = array([[0,1],[2,3]])
>>> a
array([[0, 1],
       [2, 3]])
>>> repmat(a,2,3)
array([[0, 1, 0, 1, 0, 1],
       [2, 3, 2, 3, 2, 3],
       [0, 1, 0, 1, 0, 1],
       [2, 3, 2, 3, 2, 3]])

Best, Alex Liberzon

From gruben at bigpond.net.au Thu Feb 23 15:54:02 2006 From: gruben at bigpond.net.au (Gary Ruben) Date: Thu Feb 23 15:54:02 2006 Subject: [Numpy-discussion] inconsistent use of axis= keyword argument? In-Reply-To: References: Message-ID: <43FE4B00.7070306@bigpond.net.au>

Hi Vidar, the pdf file appears to be broken. I get an error when I try to open it. Have you thought of a nice way to generate html from the xml source to incorporate this into the scipy website? I don't think it should be part of the wiki. We'd need a way of making the xml editable via a wiki interface and automatically generating multiple views or something.

regards, Gary

Vidar Gundersen wrote:
> (i've been updating the cross reference of MATLAB synonymous
> commands in Numeric Python to NumPy. I've kept Numeric/numarray
> alternatives in the source XML, but omitted it in the PDF outputs.
> see, http://37mm.no/download/matlab-python-xref.pdf.
> feedback is highly appreciated.)

From cookedm at physics.mcmaster.ca Thu Feb 23 16:15:01 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Thu Feb 23 16:15:01 2006 Subject: [Numpy-discussion] optimizing power() for complex and real cases In-Reply-To: <43FD31FA.6030802@cox.net> (Tim Hochberg's message of "Wed, 22 Feb 2006 20:54:34 -0700") References: <43F20BFE.5030100@cox.net> <43F36CB6.5050004@cox.net> <43F3B8A9.3000507@bigpond.net.au> <43F3CA12.4000907@cox.net> <43F62069.80209@cox.net> <43F7C73B.2000806@cox.net> <20060220003714.GB15783@arbutus.physics.mcmaster.ca> <43F938FA.80200@cox.net> <43FD31FA.6030802@cox.net> Message-ID:

Tim Hochberg writes:
> David M. Cooke wrote:
> [SNIP]
>
>>I've gone through the code you checked in, and fixed it up. Looks
>>good. One side effect is that
>>
>>def zl(x):
>>    a = ones_like(x)
>>    a[:] = 0
>>    return a
>>
>>is now faster than zeros_like(x) :-)
>>
> I noticed that ones_like was faster than zeros_like, but I didn't
> think to try that.
> That's pretty impressive considering how ridiculously easy it was
> to write.

Might be useful to move zeros_like and empty_like to ufuncs :-) (Well, they'd be better written as regular C functions, though.)

>>One problem I had is that in PyArray_SetNumericOps, the "copy" method
>>wasn't picked up on. It may be due to the order of initialization of
>>the ndarray type, or something (since "copy" isn't a ufunc, it's
>>initialized in a different place). I couldn't figure out how to fiddle
>>that, so I replaced the x.copy() call with a call to PyArray_Copy().
>>
> Interesting. It worked fine here.

Actually, it works fine in the sense that it works. However, if you time it, it was obvious that it wasn't using an optimized version (x**1 was as slow as x**1.1).

> I kind of like power and scalar_power. Then ** could be advertised as
> calling scalar_power for scalars and power for arrays. Scalar power
> would do optimizations on integer and half_integer powers. Of course
> there's no real way to enforce that scalar power is passed scalars,
> since presumably it would be a ufunc, short of making _scalar_power a
> ufunc instead and doing something like:
>
> def scalar_power(x, y):
>     "compute x**y, where y is a scalar, optimizing integer and half
>     integer powers, possibly at some minor loss of accuracy"
>     if not is_scalar(y): raise ValueError("Naughty!!")
>     return _scalar_power(x,y)

I'm tempted to make it have the same signature as power, but call power if passed an array (or, at the ufunc level, if the stride for the second argument is non-zero).

>>Another point is to look at __div__, and use reciprocal if the
>>dividend is 1.
>>
> That would be easy, but wouldn't it be just as easy to optimize
> __div__ for scalar divisions? Should probably check that this isn't
> just as fast since it would be a lot more general.

Hmm, scalar division and multiplication could both be sped up:

In [36]: a = arange(10000, dtype=float)
In [37]: %time for i in xrange(100000): a * 1.0
CPU times: user 3.30 s, sys: 0.00 s, total: 3.30 s
Wall time: 3.39
In [38]: b = array([1.])
In [39]: %time for i in xrange(100000): a * b
CPU times: user 2.63 s, sys: 0.00 s, total: 2.63 s
Wall time: 2.71

The differences in times are probably due to creating an array to hold 1.0. When I have time, I'll look at the ufunc machinery. Since ufuncs are just passed pointers to data and strides, there's no reason (besides increasing complexity ;-) to build an ndarray object for scalars. Alternatively, allow passing scalars to ufuncs: you could define a ufunc (like our scalar_power) to take an array argument and a scalar argument. Or, power could be defined to take (array, array) or (array, scalar), and the ufunc machinery would choose the appropriate one.

-- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca

From vidar+list at 37mm.no Thu Feb 23 16:18:00 2006 From: vidar+list at 37mm.no (Vidar Gundersen) Date: Thu Feb 23 16:18:00 2006 Subject: [Numpy-discussion] inconsistent use of axis= keyword argument? In-Reply-To: <43FE4B00.7070306@bigpond.net.au> (Gary Ruben's message of "Fri, 24 Feb 2006 10:53:36 +1100") References: <43FE4B00.7070306@bigpond.net.au> Message-ID:

===== Original message from Gary Ruben | 24 Feb 2006:
> the pdf file appears to be broken. I get an error when I try to open it.

sorry, i've uploaded the files again, and tested it, so hopefully it will work for you now.
> Have you thought of a nice way to generate html from the xml source to
> incorporate this into the scipy website?

that's easy, i'll do it tomorrow.

> We'd need a way of making the xml editable via a wiki interface
> and automatically generating multiple views or something.

hmmm... :)

From ndarray at mac.com Thu Feb 23 16:45:04 2006 From: ndarray at mac.com (Sasha) Date: Thu Feb 23 16:45:04 2006 Subject: [Numpy-discussion] A proposal to implement round in C Was: Rookie problems - Why is C-code much faster? In-Reply-To: References: Message-ID:

The rint ufunc and the ndarray method "round" are in svn. x.round(...) is about 20x faster than around(x) for decimals=0 and about 6x faster for decimals>0. The case decimals<0 is slower on integers, but it actually does something :-)

From alex.liberzon at gmail.com Thu Feb 23 16:48:03 2006 From: alex.liberzon at gmail.com (Alex Liberzon) Date: Thu Feb 23 16:48:03 2006 Subject: [Numpy-discussion] RE: repmat equivalent Message-ID: <775f17a80602231647g1b26ac7vef5a9ec22b99d86d@mail.gmail.com>

Another thought:

def repmat(a,m,n):
    from scipy import hstack, vstack
    a = eval('hstack(('+n*'a,'+'))')
    return eval('vstack(('+m*'a,'+'))')

might hstack() and vstack() be better for 1d arrays?

>>> from scipy import *
numerix Numeric 24.2
>>> a = array([1,2])
>>> repmat(a,2,3)
array([[1, 2, 1, 2, 1, 2],
       [1, 2, 1, 2, 1, 2]])
>>> equal(repmat(1,1,1),1)
array([ [1]],'b')

of course, scipy.linalg.kron(ones((m,n,p,...)),a) is more robust and works for higher dimensions. probably it's the best.

From gruben at bigpond.net.au Thu Feb 23 17:00:04 2006 From: gruben at bigpond.net.au (Gary Ruben) Date: Thu Feb 23 17:00:04 2006 Subject: [Numpy-discussion] inconsistent use of axis= keyword argument? In-Reply-To: References: <43FE4B00.7070306@bigpond.net.au> Message-ID: <43FE5A61.2080004@bigpond.net.au>

>> Have you thought of a nice way to generate html from the xml source to
>> incorporate this into the scipy website?
>
> that's easy, i'll do it tomorrow.

Hi Vidar, I can open it now. Excellent. I can still see a couple of references to numarray and Numeric in there which should be removed if possible, including in the title.

Regarding an html version: It would be nice to generate as much cross-reference material as possible out of the XML source. I don't think there will be space for all the alternatives on a single html page, so maybe you could generate separate html files for numpy versus Matlab/Octave, numpy versus IDL, and numpy versus R. Have you kept the Numeric version in the source? If so, maybe you could also generate numpy versus Numeric. Another idea is to put the numpy column in one frame and all the others together in a frame next to it which can be scrolled sideways to reveal the other environment of choice. Another more difficult idea is to put some javascript in it to allow selection, but this is probably not worth the effort.

>> We'd need a way of making the xml editable via a wiki interface
>> and automatically generating multiple views or something.
>
> hmmm... :)

Yes; hmmm is right. Actually, I'm sure that's a bad idea. It's better if you maintain control of the original, otherwise it will lose utility for you as a general purpose cross reference which we, the lucky numpy users, get a side benefit from.

Gary R.

From oliphant.travis at ieee.org Thu Feb 23 17:28:06 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Feb 23 17:28:06 2006 Subject: [Numpy-discussion] A proposal to implement round in C Was: Rookie problems - Why is C-code much faster?
In-Reply-To: References: Message-ID: <43FE60E1.5040701@ieee.org> Sasha wrote: >Rint ufunc and ndarray metod "round" are in svn. x.round(...) is >about 20x faster than around(x) for decimals=0 and about 6x faster for >decimals>0. The case decimals<0 is slower on integers, but it >actually does something :-) > > Great job. Thanks for adding this, Sasha... I think many will enjoy using it. Regarding portability: On my system rint says it conforms to BSD 4.3. How portable is that? Can anyone try it out on say the MSVC compiler for windows? -Travis From wbaxter at gmail.com Thu Feb 23 17:30:19 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Thu Feb 23 17:30:19 2006 Subject: [Numpy-discussion] help(xxx) vs print(xxx.__doc__) Message-ID: Can someone explain why help(numpy.r_) doesn't contain all the information in print(numpy.r_.__doc__)? Namely you don't get the helpful example showing usage with 'help' that you get with '.__doc__'. I'd rather be able to use 'help' as the one-stop shop for built-in documentation. It's less typing and just looks nicer. Thanks, --Bill -------------- next part -------------- An HTML attachment was scrubbed... URL: From wbaxter at gmail.com Thu Feb 23 17:36:09 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Thu Feb 23 17:36:09 2006 Subject: [Numpy-discussion] A proposal to implement round in C Was: Rookie problems - Why is C-code much faster? In-Reply-To: <43FE60E1.5040701@ieee.org> References: <43FE60E1.5040701@ieee.org> Message-ID: There doesn't seem to be any rint() round() or nearestint() defined in MSVC 7.1. Can't find it in an MSDN search either. I think that's why a lot of people in the game biz, at least, use that lrintf function written using intrinsics that I posted a link to earlier. I first heard about that on the gd-algorithms mailing list. --Bill On 2/24/06, Travis Oliphant wrote: > > Sasha wrote: > > >Rint ufunc and ndarray metod "round" are in svn. x.round(...) is > >about 20x faster than around(x) for decimals=0 and about 6x faster for > >decimals>0. The case decimals<0 is slower on integers, but it > >actually does something :-) > > > > > Great job. Thanks for adding this, Sasha... > > I think many will enjoy using it. > > Regarding portability: On my system rint says it conforms to BSD 4.3. > How portable is that? > > Can anyone try it out on say the MSVC compiler for windows? > > -Travis > > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting > language > that extends applications into web and mobile media. Attend the live > webcast > and join the prime developer group breaking into this new coding > territory! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > -- William V. Baxter III OLM Digital Kono Dens Building Rm 302 1-8-8 Wakabayashi Setagaya-ku Tokyo, Japan 154-0023 +81 (3) 3422-3380 -------------- next part -------------- An HTML attachment was scrubbed... URL: From wbaxter at gmail.com Thu Feb 23 17:47:15 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Thu Feb 23 17:47:15 2006 Subject: [Numpy-discussion] A proposal to implement round in C Was: Rookie problems - Why is C-code much faster? 
In-Reply-To: References: <43FE60E1.5040701@ieee.org> Message-ID: Generally, C99 support in MSVC.NET is pretty much nil, except for maybe support for "inline" (which MS had already added prior to the C99 standard). This wikipedia article links to a quote from the Visual C++ program manager at Microsoft saying "In general we don't see a lot of demand for C99 features": http://en.wikipedia.org/wiki/C_programming_language#C99 So it's not clear the situtation will change any time soon. I don't know if VC8 is any better in its C99 support. I doubt it. Wikipedia says Borland is dragging their feet too. --Bill On 2/24/06, Bill Baxter wrote: > > There doesn't seem to be any rint() round() or nearestint() defined in > MSVC 7.1. Can't find it in an MSDN search either. I think that's why a > lot of people in the game biz, at least, use that lrintf function written > using intrinsics that I posted a link to earlier. I first heard about that > on the gd-algorithms mailing list. > > --Bill > > On 2/24/06, Travis Oliphant wrote: > > > > Sasha wrote: > > > > >Rint ufunc and ndarray metod "round" are in svn. x.round(...) is > > >about 20x faster than around(x) for decimals=0 and about 6x faster for > > >decimals>0. The case decimals<0 is slower on integers, but it > > >actually does something :-) > > > > > > > > Great job. Thanks for adding this, Sasha... > > > > I think many will enjoy using it. > > > > Regarding portability: On my system rint says it conforms to BSD 4.3. > > How portable is that? > > > > Can anyone try it out on say the MSVC compiler for windows? > > > > -Travis > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ryanlists at gmail.com Thu Feb 23 18:09:00 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Thu Feb 23 18:09:00 2006 Subject: [Numpy-discussion] array.argmax() question Message-ID: Is this the expected behavior for array.argmax(): ipdb> abs(real(disps)).max() Out[38]: 1.7373584411866401e-05 ipdb> abs(real(disps)).argmax() Out[38]: 32 ipdb> shape(disps) Out[38]: (11, 3) ipdb> disps[11,1] *** IndexError: invalid index ipdb> disps[10,1] Out[38]: 0j ipdb> disps[10,2] Out[38]: (-1.7373584411866401e-05+5.2046737124258386e-21j) Basically, I want to find the element with the largest absolute value in a matrix and use it to scale by. But I need to correct for the possibility that the largest abs value may be from a negative number. So, I need to get the corresponding element itself. My array is shape (11,3) and argmax without an axis argument returns 32, which would be the index if the matrix was reshaped into a (33,) vector. Is there a clean way to extract the element based on the output of argmax? (and in my case it is actually using the output of argmax to extract the element from the matrix without the abs). Or do I need to reshape the matrix into a vector first? Thanks, Ryan From ryanlists at gmail.com Thu Feb 23 18:11:06 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Thu Feb 23 18:11:06 2006 Subject: [Numpy-discussion] Re: array.argmax() question In-Reply-To: References: Message-ID: Is the right answer to just use flatten()? i.e. 
ind=abs(mymat).argmax() maxelem=mymat.flatten()[ind] On 2/23/06, Ryan Krauss wrote: > Is this the expected behavior for array.argmax(): > > ipdb> abs(real(disps)).max() > Out[38]: 1.7373584411866401e-05 > ipdb> abs(real(disps)).argmax() > Out[38]: 32 > ipdb> shape(disps) > Out[38]: (11, 3) > ipdb> disps[11,1] > *** IndexError: invalid index > ipdb> disps[10,1] > Out[38]: 0j > ipdb> disps[10,2] > Out[38]: (-1.7373584411866401e-05+5.2046737124258386e-21j) > > Basically, I want to find the element with the largest absolute value > in a matrix and use it to scale by. But I need to correct for the > possibility that the largest abs value may be from a negative number. > So, I need to get the corresponding element itself. > > My array is shape (11,3) and argmax without an axis argument returns > 32, which would be the index if the matrix was reshaped into a (33,) > vector. Is there a clean way to extract the element based on the > output of argmax? (and in my case it is actually using the output of > argmax to extract the element from the matrix without the abs). Or do > I need to reshape the matrix into a vector first? > > Thanks, > > Ryan > From cookedm at physics.mcmaster.ca Thu Feb 23 18:28:00 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Thu Feb 23 18:28:00 2006 Subject: [Numpy-discussion] help(xxx) vs print(xxx.__doc__) In-Reply-To: (Bill Baxter's message of "Fri, 24 Feb 2006 10:29:20 +0900") References: Message-ID: "Bill Baxter" writes: > Can someone explain why help(numpy.r_) doesn't contain all the information in > print(numpy.r_.__doc__)? > > Namely you don't get the helpful example showing usage with 'help' that you get > with '.__doc__'. > > I'd rather be able to use 'help' as the one-stop shop for built-in > documentation. It's less typing and just looks nicer. Huh, odd. Note that in IPython, numpy.r_? and numpy.r_.__doc__ give the same results. And I thought I was being clever when I rewrote numpy.r_ :-) Looks like help() looks at the class __doc__ first, while IPython looks at the object's __doc__ first. I've fixed this in svn. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From steve at shrogers.com Thu Feb 23 18:34:04 2006 From: steve at shrogers.com (Steven H. Rogers) Date: Thu Feb 23 18:34:04 2006 Subject: [Numpy-discussion] Thoughts on an ndarray super-class In-Reply-To: <43FD5914.4060506@ieee.org> References: <43FD5914.4060506@ieee.org> Message-ID: <43FE709C.8040701@shrogers.com> I don't have an immediate use for this, but if available, I expect that it would be used. Steve //////////////////////// Travis Oliphant wrote: > > The bigndarray class is going to disappear (probably in the next release > of NumPy). It was a stop-gap measure as the future of 64-bit fixes in > Python was unclear. Python 2.5 will have removed the 64-bit limitations > that led to the bigndarray and so it will be removed. > I have been thinking, however, of replacing it with a super-class that > does not define the dimensions or strides. > In other words, the default array would be just a block of memory. The > standard array would inherit from the default and add dimension and > strides pointers. > > I was thinking that this might make it easier for sub-classes using > fixed-sized dimensions and strides. I'm not sure if that would actually > be useful, but since I was thinking about the disappearance of the > bigndarray, I thought I would ask for comments. 
> > -Travis

-- Steven H. Rogers, Ph.D., steve at shrogers.com Weblog: http://shrogers.com/weblog "He who refuses to do arithmetic is doomed to talk nonsense." -- John McCarthy

From ndarray at mac.com Thu Feb 23 19:28:03 2006 From: ndarray at mac.com (Sasha) Date: Thu Feb 23 19:28:03 2006 Subject: [Numpy-discussion] A case for rank-0 arrays In-Reply-To: <200602231803.27417.faltet@carabos.com> References: <200602231803.27417.faltet@carabos.com> Message-ID:

On 2/23/06, Francesc Altet wrote:
> It's a bit late, but I want to support your proposal (most of it).

You are not late -- you are the first to reply! When you say "most of it," is there anything in particular that you don't like?

> I've also come to the conclusion that scalars and rank-0 arrays should
> coexist. This is something that appears as a natural fact when you
> have to deal regularly with general algorithms for treating objects with
> different shapes.

And I think you have put this very well. Thanks for your kind words. If we agree to legitimize rank-0 arrays, maybe we should start by removing conversion to scalars from ufuncs. Currently:

>>> type(array(2)*2)

I believe it should result in a rank-0 array instead. I've recently written the ndarray round function and that code illustrates the problem of implicit scalar conversion:

ret = PyNumber_Multiply((PyObject *)a, f);
if (ret==NULL) {Py_DECREF(f); return NULL;}
if (PyArray_IsScalar(ret, Generic)) {
    /* array scalars cannot be modified inplace */
    PyObject *tmp;
    tmp = PyObject_CallFunction(n_ops.rint, "O", ret);
    Py_DECREF(ret);
    ret = PyObject_CallFunction(n_ops.divide, "OO", tmp, f);
    Py_DECREF(tmp);
} else {
    PyObject_CallFunction(n_ops.rint, "OO", ret, ret);
    PyObject_CallFunction(n_ops.divide, "OOO", ret, f, ret);
}

From oliphant.travis at ieee.org Thu Feb 23 21:08:01 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Feb 23 21:08:01 2006 Subject: [Numpy-discussion] A case for rank-0 arrays In-Reply-To: References: <200602231803.27417.faltet@carabos.com> Message-ID: <43FE9478.9040309@ieee.org>

Sasha wrote:
>On 2/23/06, Francesc Altet wrote:
>>It's a bit late, but I want to support your proposal (most of it).
>
>You are not late -- you are the first to reply! When you say "most of
>it," is there anything in particular that you don't like?

Usually nobody has a strong opinion on these issues until they encounter something they don't like. I think many are still trying to understand what a rank-0 array is.

>>>>type(array(2)*2)
>
>I believe it should result in a rank-0 array instead.

Can you be more precise about when a rank-0 array should be returned and when scalars should be?

>I've recently written the ndarray round function and that code illustrates
>the problem of implicit scalar conversion:

I think we will have issues no matter what because rank-0 arrays and scalars have always been with us. We just need to nail down some rules for when they will show up and live by them.
Right now the rule is basically: rank-0 arrays become array-scalars all the time. The exceptions are rank0.copy() rank0.view() array(5) scalar.__array__() one_el.shape=() If you can come up with a clear set of rules for when rank-0 arrays should show up and when scalars should show up, then we will understand better what you want to do. -Travis From oliphant.travis at ieee.org Thu Feb 23 21:34:01 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Feb 23 21:34:01 2006 Subject: [Numpy-discussion] A case for rank-0 arrays In-Reply-To: References: Message-ID: <43FE9A87.3050104@ieee.org> Sasha wrote: >The main criticism of supporting both scalars and rank-0 arrays is >that it is "unpythonic" in the sense that it provides two almost >equivalent ways to achieve the same result. However, I am now >convinced that this is the case where practicality beats purity. > > I think most of us agree that both will be with us for the indefinite future. >The situation with ndarrays is somewhat similar. A rank-N array is >very similar to a function with N arguments, where each argument has a >finite domain (i-th domain of a is range(a.shape[i])). A rank-0 array >is just a function with no arguments and as such it is quite different >from a scalar. > I can buy this view. Nicely done. >Just as a function with no arguments cannot be >replaced by a constant in the case when a value returned may change >during the run of the program, rank-0 array cannot be replaced by an >array scalar because it is mutable. (See >http://projects.scipy.org/scipy/numpy/wiki/ZeroRankArray for use >cases). > >Rather than trying to hide rank-0 arrays from the end-user and treat >it as an implementation artifact, I believe numpy should emphasize the >difference between rank-0 arrays and scalars and have clear rules on >when to use what. > > I agree. The problem is what should the rules be. Right now, there are no clear rules other than rank-0 arrays --- DONT. You make a case that we should not be so hard on rank-0 arrays. >PROPOSALS >========== > >Here are three suggestions: > >1. Probably the most controversial question is what getitem should >return. I believe that most of the confusion comes from the fact that >the same syntax implements two different operations: indexing and >projection (for the lack of better name). Using the analogy between >ndarrays and functions, indexing is just the application of the >function to its arguments and projection is the function projection >((f, x) -> lambda (*args): f(x, *args)). > >The problem is that the same syntax results in different operations >depending on the rank of the array. > >Let > > >>>>x = ones((2,2)) >>>>y = ones(2) >>>> >>>> > >then x[1] is projection and type(x[1]) is ndarray, but y[1] is >indexing and type(y[1]) is int32. Similarly, y[1,...] is indexing, >while x[1,...] is projection. > >I propose to change numpy rules so that if ellipsis is present inside >[], the operation is always projection and both y[1,...] and >x[1,1,...] return zero-rank arrays. Note that I have previously >rejected Francesc's idea that x[...] and x[()] should have different >meaning for zero-rank arrays. I was wrong. > > I think this is a good and clear rule. And it seems like we may be "almost" there. Anybody want to implement it? >2. Another source of ambiguity is the various "reduce" operations such >as sum or max. Using the previous example, type(x.sum(axis=0)) is >ndarray, but type(y.sum(axis=0)) is int32. I propose two changes: > > a. 
Make x.sum(axis) return ndarray unless axis is None, making >type(y.sum(axis=0)) is ndarray true in the example. > > > Hmm... I'm not sure. y.sum(axis=0) is the default spelling of sum(y). Thus, this would cause all old code to return a rank-0 array. Most people who write sum(y) want a scalar, not a "function with 0 arguments" > b. Allow axis to be a sequence of ints and make >x.sum(axis=range(rank(x))) return rank-0 array according to the rule >2.a above. > > So, this would sum over multiple axes? I guess I'm not opposed to something like that, but I'm not really excited about it either. Would that make sense for all methods that take the axis= argument? > c. Make x.sum() raise an error for rank-0 arrays and scalars, but >allow x.sum(axis=()) to return x. This will make numpy sum consistent >with the built-in sum that does not work on scalars. > > > I don't think I like this at all. This proposal has more far-reaching implications (and would require more code changes --- though the axis= arguments do have a converter function and so would not be as painful as one might imagine). In short, I don't feel as enthused about portion 2 of your proposal. >3. This is a really small change currently > > >>>>empty(()) >>>> >>>> >array(0) > >but > > > > >I propose to make shape=() valid in ndarray constructor. > > +1 I think we need more thinking about rank-0 arrays before doing something like proposal 2. However, 1 and 3 seem simple enough to move forward with... -Travis From cookedm at physics.mcmaster.ca Thu Feb 23 22:03:02 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Thu Feb 23 22:03:02 2006 Subject: [Numpy-discussion] optimizing power() for complex and real cases In-Reply-To: (David M. Cooke's message of "Thu, 23 Feb 2006 19:14:11 -0500") References: <43F20BFE.5030100@cox.net> <43F36CB6.5050004@cox.net> <43F3B8A9.3000507@bigpond.net.au> <43F3CA12.4000907@cox.net> <43F62069.80209@cox.net> <43F7C73B.2000806@cox.net> <20060220003714.GB15783@arbutus.physics.mcmaster.ca> <43F938FA.80200@cox.net> <43FD31FA.6030802@cox.net> Message-ID: cookedm at physics.mcmaster.ca (David M. Cooke) writes: > Hmm, scalar division and multiplication could both be speed up: > > In [36]: a = arange(10000, dtype=float) > In [37]: %time for i in xrange(100000): a * 1.0 > CPU times: user 3.30 s, sys: 0.00 s, total: 3.30 s > Wall time: 3.39 > In [38]: b = array([1.]) > In [39]: %time for i in xrange(100000): a * b > CPU times: user 2.63 s, sys: 0.00 s, total: 2.63 s > Wall time: 2.71 > > The differences in times are probably due to creating an array to hold > 1.0. > > When I have time, I'll look at the ufunc machinery. Since ufuncs are > just passed pointers to data and strides, there's no reason (besides > increasing complexity ;-) to build an ndarray object for scalars. I've had a look: basically, if you pass 1.0, say, to a ufunc, it ends up going through PyArray_FromAny. This did checks for the array interface first (looking for attributes __array_shape__, __array_typestr__, __array_struct__, __array__, etc.). These would always fail for Python scalar types. I special-cased Python scalar types (bool, int, long, float, complex) in PyArray_FromAny so they are checked for first. This *does* have the side effect that if you have a subclass of one of these that does define the array interface, that interface is not used. If anybody's worried about that, well...tough :-) Give me a reasonable test case for subclassing a Python scalar and adding the array interface. 
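(One such test case might look like the following sketch -- the class name is invented here, and exactly what asarray returns depends on which conversion path wins:)

    import numpy

    class ArrayishFloat(float):
        # a Python float subclass that also advertises an array
        # conversion hook via __array__
        def __array__(self):
            # pretend this scalar expands to a length-3 vector
            return numpy.array([float(self)] * 3)

    x = ArrayishFloat(2.0)
    # with plain floats special-cased first, the __array__ hook may be
    # skipped and a zero-dimensional array returned instead
    print numpy.asarray(x)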
This gives me the times In [1]: a = arange(10000, dtype=float) In [2]: %time for i in xrange(100000): a * 1.0 CPU times: user 2.76 s, sys: 0.00 s, total: 2.76 s Wall time: 2.85 In [3]: b = array([1.]) In [4]: %time for i in xrange(100000): a * b CPU times: user 2.69 s, sys: 0.00 s, total: 2.69 s Wall time: 2.76 The overhead of a * 1.0 is 3% compared to a * b here, as opposed to 25% in my last set of numbers. [for those jumping in, this is all still in the power_optimization branch] -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From stefan at sun.ac.za Fri Feb 24 00:13:01 2006 From: stefan at sun.ac.za (Stefan van der Walt) Date: Fri Feb 24 00:13:01 2006 Subject: [Numpy-discussion] repmat equivalent? In-Reply-To: <5eec5f300602231321x12da7721m8d7fa2aa7fc6edf0@mail.gmail.com> References: <5eec5f300602230331sefc7640r8298011705c424f2@mail.gmail.com> <43FE1BFA.9070709@ieee.org> <5eec5f300602231321x12da7721m8d7fa2aa7fc6edf0@mail.gmail.com> Message-ID: <20060224081132.GB13312@alpha> On Thu, Feb 23, 2006 at 11:21:39PM +0200, Albert Strasheim wrote: > > Thus, > > > > kron(ones((2,3)), arr) > > > > >>> sl.kron(ones((2,3)),arr) > > array([[1, 2, 1, 2, 1, 2], > > [3, 4, 3, 4, 3, 4], > > [1, 2, 1, 2, 1, 2], > > [3, 4, 3, 4, 3, 4]]) > > > > gives you the equivalent of > > > > repmat(arr, 2,3) > > Thanks! Merging this into numpy would be much appreciated. Stefan van > der Walt did some benchmarks and this approach seems faster than > anything we managed for 2D arrays. My benchmark was wrong -- this function is not as fast as the version Albert previously proposed. Below follows the benchmark of seven possible repmat functions: --------------------------------------------------------------------------- 0 : 1.09316706657 (Albert) 1 : 6.15612506866 (Stefan) 2 : 5.21671295166 (Stefan) 3 : 2.78160500526 (Stefan) 4 : 1.20426011086 (Albert Optimised) 5 : 11.0923781395 (Travis) 6 : 3.47499799728 (Alex) --------------------------------------------------------------------------- 0 : 1.17543005943 1 : 6.03165698051 2 : 5.7597899437 3 : 2.40381717682 4 : 1.09497308731 5 : 11.6657807827 6 : 7.11567497253 --------------------------------------------------------------------------- 0 : 2.03999996185 1 : 9.87535595894 2 : 8.86893296242 3 : 4.56993699074 4 : 2.02298903465 5 : 22.8858327866 6 : 10.7882151604 --------------------------------------------------------------------------- I attach the code. St?fan -------------- next part -------------- A non-text attachment was scrubbed... Name: repmat.py Type: text/x-python Size: 1437 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: repmat_bench.py Type: text/x-python Size: 682 bytes Desc: not available URL: From faltet at carabos.com Fri Feb 24 00:53:02 2006 From: faltet at carabos.com (Francesc Altet) Date: Fri Feb 24 00:53:02 2006 Subject: [Numpy-discussion] A case for rank-0 arrays In-Reply-To: <43FE9478.9040309@ieee.org> References: <43FE9478.9040309@ieee.org> Message-ID: <200602240952.42091.faltet@carabos.com> A Divendres 24 Febrer 2006 06:07, Travis Oliphant va escriure: > Sasha wrote: > >>>>type(array(2)*2) > > > > > > > >I believe it should result in a rank-0 array instead. Yes, this sounds reasonable, IMHO. > Right now the rule is basically: > > rank-0 arrays become array-scalars all the time. 
> > The exceptions are
> >
> >     rank0.copy()
> >     rank0.view()
> >     array(5)
> >     scalar.__array__()
> >     one_el.shape=()
>
> If you can come up with a clear set of rules for when rank-0 arrays
> should show up and when scalars should show up, then we will understand
> better what you want to do.

Yeah. I think Travis is right. A set of rules clearly stating the situations where rank-0 arrays have to become scalars may be worth the effort. Travis already has the above list and now it is just a matter of discovering other exceptions that should go there. So, please, if anybody comes up with more exceptions, go ahead and propose them.

Cheers,

-- >0,0< Francesc Altet http://www.carabos.com/ V V Cárabos Coop. V. Enjoy Data "-"

From faltet at carabos.com Fri Feb 24 01:01:02 2006 From: faltet at carabos.com (Francesc Altet) Date: Fri Feb 24 01:01:02 2006 Subject: [Numpy-discussion] A case for rank-0 arrays In-Reply-To: References: Message-ID: <200602241000.25090.faltet@carabos.com>

On Saturday 18 February 2006 20:49, Sasha wrote:
> PROPOSALS
> ==========
>
> Here are three suggestions:
>
> 1. Probably the most controversial question is what getitem should
> return. I believe that most of the confusion comes from the fact that
> the same syntax implements two different operations: indexing and
> projection (for the lack of better name). Using the analogy between
> ndarrays and functions, indexing is just the application of the
> function to its arguments and projection is the function projection
> ((f, x) -> lambda (*args): f(x, *args)).
>
> The problem is that the same syntax results in different operations
> depending on the rank of the array.
>
> Let
>
> >>> x = ones((2,2))
> >>> y = ones(2)
>
> then x[1] is projection and type(x[1]) is ndarray, but y[1] is
> indexing and type(y[1]) is int32. Similarly, y[1,...] is indexing,
> while x[1,...] is projection.
>
> I propose to change numpy rules so that if ellipsis is present inside
> [], the operation is always projection and both y[1,...] and
> x[1,1,...] return zero-rank arrays. Note that I have previously
> rejected Francesc's idea that x[...] and x[()] should have different
> meaning for zero-rank arrays. I was wrong.

+1 (if I want to be consistent ;-)

I guess that this would imply that:

In [19]: z=numpy.array(1)
In [20]: type(z[()])
Out[20]:
In [21]: type(z[...])
Out[21]:

isn't it?

> 2. Another source of ambiguity is the various "reduce" operations such
> as sum or max. Using the previous example, type(x.sum(axis=0)) is
> ndarray, but type(y.sum(axis=0)) is int32. I propose two changes:
>
> a. Make x.sum(axis) return ndarray unless axis is None, making
> type(y.sum(axis=0)) is ndarray true in the example.
>
> b. Allow axis to be a sequence of ints and make
> x.sum(axis=range(rank(x))) return rank-0 array according to the rule
> 2.a above.
>
> c. Make x.sum() raise an error for rank-0 arrays and scalars, but
> allow x.sum(axis=()) to return x. This will make numpy sum consistent
> with the built-in sum that does not work on scalars.

Well, to tell the truth, I don't have a strong opinion on this one (this is why I "mostly" supported your proposal ;-), but I think that if Travis has reasons to oppose it, we should listen to him.

> 3. This is a really small change currently
> I propose to make shape=() valid in ndarray constructor.

+1

Cheers,

-- >0,0< Francesc Altet http://www.carabos.com/ V V Cárabos Coop. V.
Enjoy Data "-"

From wbaxter at gmail.com Fri Feb 24 01:04:03 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Fri Feb 24 01:04:03 2006 Subject: [Numpy-discussion] Matrix times scalar is wacky Message-ID:

Multiplying a matrix times a scalar seems to return junk for some reason:

>>> A = numpy.asmatrix(numpy.rand(1,2))
>>> A
matrix([[ 0.30604211,  0.98475225]])
>>> A * 0.2
matrix([[  6.12084210e-002,   7.18482614e-290]])
>>> 0.2 * A
matrix([[  6.12084210e-002,   7.18482614e-290]])
>>> numpy.__version__
'0.9.5'

--billyb

From faltet at carabos.com Fri Feb 24 01:22:07 2006 From: faltet at carabos.com (Francesc Altet) Date: Fri Feb 24 01:22:07 2006 Subject: [Numpy-discussion] A case for rank-0 arrays In-Reply-To: <43FE6595.7050309@sympatico.ca> References: <200602231803.27417.faltet@carabos.com> <43FE6595.7050309@sympatico.ca> Message-ID: <200602241021.27150.faltet@carabos.com>

On Friday 24 February 2006 02:47, you wrote:
> Could these be considered as dimensionless, to avoid having to explain
> to people that the word rank doesn't have the same meaning as the matrix
> rank?

Colin Williams was proposing calling arrays coming from array(5) 'dimensionless'. So, for the moment, we have three ways to name such beasts:

- 'rank-0'
- '0-dimensional' or '0-dim' for short
- 'dimensionless'

Perhaps the time has come to choose a name for them (if we have to live with such arrays for a long time, as it seems to be the case).

However, the more I think about this the more I'm convinced that, similarly to their higher dimension counterparts, we will not arrive at any definite agreement and people will continue to call them whatever they are most comfortable with. IMO, this should not be a problem at all because all three words express a 'lack of dimensionality'.

Cheers,

-- >0,0< Francesc Altet http://www.carabos.com/ V V Cárabos Coop. V. Enjoy Data "-"

From oliphant.travis at ieee.org Fri Feb 24 01:59:02 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Fri Feb 24 01:59:02 2006 Subject: [Numpy-discussion] Matrix times scalar is wacky In-Reply-To: References: Message-ID: <43FED896.3060301@ieee.org>

Bill Baxter wrote:
> Multiplying a matrix times a scalar seems to return junk for some reason:
>
> >>> A = numpy.asmatrix(numpy.rand(1,2))
> >>> A
> matrix([[ 0.30604211,  0.98475225]])
> >>> A * 0.2
> matrix([[  6.12084210e-002,   7.18482614e-290]])
> >>> 0.2 * A
> matrix([[  6.12084210e-002,   7.18482614e-290]])
> >>> numpy.__version__

Unfortunately, there are still some bugs in the scalar multiplication section of _dotblas.c stemming from a re-write that allows discontiguous matrices. We are still tracking down the problems. Hopefully this should be fixed soon.

-Travis

From svetosch at gmx.net Fri Feb 24 02:25:04 2006 From: svetosch at gmx.net (Sven Schreiber) Date: Fri Feb 24 02:25:04 2006 Subject: [Numpy-discussion] repmat equivalent? In-Reply-To: <43FE1BFA.9070709@ieee.org> References: <5eec5f300602230331sefc7640r8298011705c424f2@mail.gmail.com> <43FE1BFA.9070709@ieee.org> Message-ID: <43FEDEC7.5080901@gmx.net>

Travis Oliphant wrote:
> There is a function in scipy.linalg called kron that could be brought
> over which can do a repmat.
>
> def kron(a,b):
>     """kronecker product of a and b

That would be great, I have been missing it in numpy already! (Because scipy is rather big, I'd like to avoid depending on it for such things.) So please do bring it over.
Thanks, Sven From oliphant.travis at ieee.org Fri Feb 24 02:53:01 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Fri Feb 24 02:53:01 2006 Subject: [Numpy-discussion] Matrix times scalar is wacky In-Reply-To: References: Message-ID: <43FEE539.9010606@ieee.org> Bill Baxter wrote: > Multiplying a matrix times a scalar seems to return junk for some reason: > > >>> A = numpy.asmatrix(numpy.rand(1,2)) > >>> A > matrix([[ 0.30604211, 0.98475225]]) > >>> A * 0.2 > matrix([[ 6.12084210e-002, 7.18482614e-290]]) > >>> 0.2 * A > matrix([[ 6.12084210e-002, 7.18482614e-290]]) > >>> numpy.__version__ > '0.9.5' > This should be fixed in SVN. -Travis From cjw at sympatico.ca Fri Feb 24 04:18:09 2006 From: cjw at sympatico.ca (Colin J. Williams) Date: Fri Feb 24 04:18:09 2006 Subject: [Numpy-discussion] A case for rank-0 arrays In-Reply-To: <200602241021.27150.faltet@carabos.com> References: <200602231803.27417.faltet@carabos.com> <43FE6595.7050309@sympatico.ca> <200602241021.27150.faltet@carabos.com> Message-ID: <43FEF953.1080906@sympatico.ca> Francesc Altet wrote: >A Divendres 24 Febrer 2006 02:47, v?reu escriure: > > >>Could these be considered as dimensionless, to avoid having to explain >>to people that the word rank doesn't have the same meaning as the matrix >>rank? >> >> > >Colin Williams was proposing calling arrays coming from array(5) as >'dimensionless'. So, for the moment, we have three ways to name such a >beasts: > >- 'rank-0' >- '0-dimensional' or '0-dim' for short >- 'dimensionless' > > > My suggestion was based on the usage of rank, with a different meaning, in matrices. n-dim, with n= 0 .. ? seems a neat way around. Colin W. >Perhaps the time has come to choose a number for them (if we have to >live with such array for a long time, as it seems to be the case). > >However, as more I think on this the more I'm convinced that, >similarly to their higher dimension counterparts, we will not arrive >to any definite agreement and people will continue to call them in any >way he would be more comfortable with. IMO, this should be not a >problem at all because all three words express a 'lack of >dimensionality'. > >Cheers, > > > From magnus at hetland.org Fri Feb 24 04:27:05 2006 From: magnus at hetland.org (Magnus Lie Hetland) Date: Fri Feb 24 04:27:05 2006 Subject: [Numpy-discussion] Simple NumPy-compatible vector w/C++ & SWIG? Message-ID: <9C3EBDD9-2C89-4587-9795-29906E150120@hetland.org> Hi! I'm working on a data structure library where one of the element types most likely will be a vector type (i.e., points in a multidimensional space, with the dimensionality set by the user). In the data structure (which is disk-based) I have work with raw bytes that I'd like to copy around as little as possible. The library itself is (being) written in C++, but I'm wrapping it with SWIG so I can drive and test it with Python. It seems to me that something NumPy-compatible might be the best choice for the vector type, but I'm not sure how I should do that. I've been thinking about simply implementing a minimal compatibility layer for the NumPy Array Interface; is it then possible to construct a NumPy array using this custom array, and get full support for the various array operations without actually copying the data? And: Any ideas on what to do on the C++ side? Is there any code/ library out there for a vector-thing that works well in C++ *and* that has wrapping code for NumPy? 
(I know the STL vector is wrapped to a Python list by default -- I'm just thinking that including those things in the equation would lead to lots of copied data...) -- Magnus Lie Hetland http://hetland.org From oliphant.travis at ieee.org Fri Feb 24 04:39:04 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Fri Feb 24 04:39:04 2006 Subject: [Numpy-discussion] Matrix times scalar is wacky In-Reply-To: References: <43FEE539.9010606@ieee.org> Message-ID: <43FEFE3C.9080309@ieee.org> Bill Baxter wrote: > Excellent. > Is working on numpy your full time job? You certainly seem to be > putting a full-time effort into it at any rate. It is appreciated. No :-) But, success of NumPy is critical for me. This rate of effort will have to significantly decrease very soon. I'm very anxious to get NumPy stable, though, and so I try to respond quickly to serious errors when I can --- I can be somewhat of a perfectionist so that problems eat at me until they are fixed --- perhaps there is medicine I can take ;-) Fortunately, several people are becoming familar with the internals of the code which is essential so that it can carry forward when my time to spend on it wanes. Thanks for the appreciation. -Travis From fullung at gmail.com Fri Feb 24 05:43:01 2006 From: fullung at gmail.com (Albert Strasheim) Date: Fri Feb 24 05:43:01 2006 Subject: [Numpy-discussion] Visual Studio build broken due to array types changes Message-ID: <004701c63948$17d484f0$6363630a@dsp.sun.ac.za> Hello all Recent changes to multiarraymodule.c has broken the build with Visual Studio for revision 2164 of numpy. I'd fix the problem, but I'm still a bit new to the NumPy sources. I've attached the build log in case someone can come up with a quick fix. Meanwhile, the SciPy build is also broken. Is Visual Studio considered to be a supported compiler, or is Mingw's GCC the only supported compiler for Windows builds? Regards Albert -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: numpy-msvc-rev2164.txt URL: From aisaac at american.edu Fri Feb 24 06:07:01 2006 From: aisaac at american.edu (Alan G Isaac) Date: Fri Feb 24 06:07:01 2006 Subject: [Numpy-discussion] repmat equivalent? In-Reply-To: <20060224081132.GB13312@alpha> References: <5eec5f300602230331sefc7640r8298011705c424f2@mail.gmail.com> <43FE1BFA.9070709@ieee.org> <5eec5f300602231321x12da7721m8d7fa2aa7fc6edf0@mail.gmail.com><20060224081132.GB13312@alpha> Message-ID: On Fri, 24 Feb 2006, Stefan van der Walt apparently wrote: > Below follows the benchmark of seven possible repmat > functions: Probably I a misunderstanding something here. But I thought the idea of repmat was that it used a single copy of the data to represent multiple copies in a matrix, and all these functions seem ultimately to use multiple copies of the data. If that is right, then a repmat should be a subclass of matrix. I think ... Cheers, Alan Isaac From dalcinl at gmail.com Fri Feb 24 06:14:08 2006 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri Feb 24 06:14:08 2006 Subject: [Numpy-discussion] Simple NumPy-compatible vector w/C++ & SWIG? In-Reply-To: <9C3EBDD9-2C89-4587-9795-29906E150120@hetland.org> References: <9C3EBDD9-2C89-4587-9795-29906E150120@hetland.org> Message-ID: Magnus, I send you attached a SWIG file I use to interface PETSc libraries and NumPy. It is a series of macros (perhaps a bit nested), but I think this can help you to quickly define IN/OUT/INOUT typemaps for arguments like this: (int size, double* data). 
This approach was always enough for me. If your use case is more elaborate, please feel free to ask for other alternatives.

On 2/24/06, Magnus Lie Hetland wrote:
> Hi!
>
> I'm working on a data structure library where one of the element
> types most likely will be a vector type (i.e., points in a
> multidimensional space, with the dimensionality set by the user). In
> the data structure (which is disk-based) I have to work with raw bytes
> that I'd like to copy around as little as possible.
>
> The library itself is (being) written in C++, but I'm wrapping it
> with SWIG so I can drive and test it with Python. It seems to me that
> something NumPy-compatible might be the best choice for the vector
> type, but I'm not sure how I should do that.
>
> I've been thinking about simply implementing a minimal compatibility
> layer for the NumPy Array Interface; is it then possible to construct
> a NumPy array using this custom array, and get full support for the
> various array operations without actually copying the data?
>
> And: Any ideas on what to do on the C++ side? Is there any code/
> library out there for a vector-thing that works well in C++ *and*
> that has wrapping code for NumPy? (I know the STL vector is wrapped
> to a Python list by default -- I'm just thinking that including those
> things in the equation would lead to lots of copied data...)
>
> --
> Magnus Lie Hetland
> http://hetland.org

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

[attachment scrubbed: numpy.i, application/octet-stream, 21182 bytes]

From aisaac at american.edu Fri Feb 24 06:15:12 2006 From: aisaac at american.edu (Alan G Isaac) Date: Fri Feb 24 06:15:12 2006 Subject: [Numpy-discussion] Matrix times scalar is wacky In-Reply-To: References: Message-ID:

On Fri, 24 Feb 2006, Bill Baxter apparently wrote:
> Multiplying a matrix times a scalar seems to return junk for some reason:

Confirmed.
Alan Isaac

Python 2.4.1 (#65, Mar 30 2005, 09:13:57) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as N
>>> N.__version__
'0.9.5'
>>> a=N.array([[1,2],[3,4]],'f')
>>> b=N.mat(a)
>>> a*0.2
array([[ 0.2       ,  0.40000001],
       [ 0.60000002,  0.80000001]], dtype=float32)
# this works ok
>>> b*0.2
matrix([[ 0.2,  0.4],
        [ 0.6,  0.8]])
>>> a = N.rand(1,2)
>>> b = N.mat(a)
>>> a*0.2
array([[ 0.01992175,  0.09690914]])
# this does not work ok
>>> b*0.2
matrix([[  1.99217540e-002,   2.22617305e-309]])
>>>

From ndarray at mac.com Fri Feb 24 06:25:03 2006 From: ndarray at mac.com (Sasha) Date: Fri Feb 24 06:25:03 2006 Subject: [Numpy-discussion] A case for rank-0 arrays In-Reply-To: <43FE9478.9040309@ieee.org> References: <200602231803.27417.faltet@carabos.com> <43FE9478.9040309@ieee.org> Message-ID:

On 2/24/06, Travis Oliphant wrote:
> Sasha wrote:
> ...
> >>>>type(array(2)*2)
> >
> >I believe it should result in a rank-0 array instead.
>
> Can you be more precise about when rank-0 array should be returned and
> when scalars should be?

A simple rule could be that unary functions that don't change the rank when operating on higher dimensional arrays should not change the type (scalar vs. array) of the dimensionless objects. For binary operations such as in the example above, the situation is less clear, but in this case an analogy with functions helps. Multiplication between a function f(...) and a scalar 2 is naturally defined as a function (2*f)(...) = 2*f(...), where ... stands for any number of arguments including zero.

The scalars should be returned when an operation involves extracting an element, or evaluation of a function. This includes indexing with a complete set of indices (and no ellipsis) and reduce operations over all elements (more on that later.)

From fullung at gmail.com Fri Feb 24 06:30:02 2006 From: fullung at gmail.com (Albert Strasheim) Date: Fri Feb 24 06:30:02 2006 Subject: [Numpy-discussion] Visual Studio build broken due to array types changes In-Reply-To: <004701c63948$17d484f0$6363630a@dsp.sun.ac.za> Message-ID: <006a01c6394e$a98e85c0$6363630a@dsp.sun.ac.za>

Hello all

Seems the build is also broken with MinGW GCC 3.4.2. I've attached both build logs to this ticket:

http://projects.scipy.org/scipy/numpy/ticket/13

Regards Albert

From cjw at sympatico.ca Fri Feb 24 07:56:12 2006 From: cjw at sympatico.ca (Colin J. Williams) Date: Fri Feb 24 07:56:12 2006 Subject: [Numpy-discussion] subclassing ndarray Message-ID: <43FF2C92.3060304@sympatico.ca>

I have a subclass Bar, a 1-dim array which has some methods and some attributes. One of the attributes is a view of the Bar to permit different shaping.

Suppose that 'a' is an instance of 'Bar', which has a method 'show' and a view attribute 'v'.

a ^ 15 returns a Bar instance, with its methods but without the attributes.

I am attempting to change this; Bar has a method __xor__, see below:

def __xor__(self, other):
    ''' Exclusive or: __xor__(x, y) => x ^ y . '''
    z= 1    << this loops to the recursion limit
    result= ArrayType.__xor__(self, other)
    n= self.n
    result.n= n
    result.rowSize= self.rowSize
    result.show= self.show
    result.v= _n.reshape(result.view(), (n*n, n*n))
    return result

Could anyone suggest a workaround please?

Colin W.
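(One possible workaround, sketched under the assumption that the __array_finalize__ hook present in NumPy builds of this era behaves as documented: set the extra attributes there, once, instead of in every operator. The attribute names follow Colin's example; the defaults are invented.)

    import numpy as N

    class Bar(N.ndarray):
        def __array_finalize__(self, obj):
            # called for views and for the results of operations,
            # so custom attributes survive things like a ^ 15
            self.n = getattr(obj, 'n', None)
            self.rowSize = getattr(obj, 'rowSize', None)

    a = N.arange(4).view(Bar)
    a.n = 2
    a.rowSize = 2
    b = a ^ 15
    print type(b), b.n, b.rowSize

Methods such as show defined on the class carry over automatically; only the instance attributes need the hook, and a view attribute like v could be rebuilt on demand the same way.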
From fullung at gmail.com Fri Feb 24 08:19:01 2006 From: fullung at gmail.com (Albert Strasheim) Date: Fri Feb 24 08:19:01 2006 Subject: [Numpy-discussion] Shapes and sizes Message-ID: <031a01c6395d$e69b3170$6363630a@dsp.sun.ac.za>

Hello all

I'm trying to write a function that takes a scalar, 0-d, 1-d or 2-d array and returns a scalar or an array obtained by performing some computations on this input argument. When the output of such a function is the same size as the input, one can do the following to preallocate the output array:

def f(a):
    arra = asarray(a)
    if arra.ndim > 2:
        raise ValueError('invalid dimensions')
    b = empty_like(arra)
    # do some operations on arra and put the values in b
    return b

However, when the output array only depends on the size of a, but isn't exactly the same, things seem to get more complicated. Consider a function that operates on the rows of a (or the "row" if a is 1d or 0d or a scalar). For every row of length n, the function might return a row of length (n/2 + 1) if n is even or a row of length (n + 1)/2 if n is odd. Thus, depending on the shape of a, empty must be called with one of three different shapes.

def outsize(n):
    if n % 2 == 0: return n/2+1
    return (n+1)/2

if arra.ndim == 0:
    b = empty(())
elif arra.ndim == 1:
    b = empty((outsize(arra.shape[0]),))
else:
    b = empty((arra.shape[0], outsize(arra.shape[1])))

To me this seems like a lot of code that could be simpler if there was a function to get the size of an array that returns a useful value even if a particular dimension doesn't exist, much like MATLAB's size, where one can write:

b = zeros(size(a,1), outsize(size(a,2)));

function [m]=outsize(n)
if mod(n, 2) == 0; m = n/2+1;
else m = (n+1)/2;
end

and still have it work with scalars, 1d arrays and 2d arrays. Even if there were such a function for NumPy, this still leaves the problem that the output is going to have the wrong shape for 0d and 1d arrays, specifically (1,1) and (outsize(n),1) instead of () and (outsize(n),). This problem is solved in MATLAB where there is no distinction between 1 and [1].

How do you guys deal with this problem in your functions?

Regards Albert

From ndarray at mac.com Fri Feb 24 09:16:04 2006 From: ndarray at mac.com (Sasha) Date: Fri Feb 24 09:16:04 2006 Subject: [Numpy-discussion] A case for rank-0 arrays In-Reply-To: <43FE9A87.3050104@ieee.org> References: <43FE9A87.3050104@ieee.org> Message-ID:

On 2/24/06, Travis Oliphant wrote:
> Sasha wrote:
> ...
> >I propose to change numpy rules so that if ellipsis is present inside
> >[], the operation is always projection and both y[1,...] and
> >x[1,1,...] return zero-rank arrays. Note that I have previously
> >rejected Francesc's idea that x[...] and x[()] should have different
> >meaning for zero-rank arrays. I was wrong.
>
> I think this is a good and clear rule. And it seems like we may be
> "almost" there.
> Anybody want to implement it?

I'll implement it. I think I am well prepared to handle this after I implemented [] for the rank-0 case.

> >2. Another source of ambiguity is the various "reduce" operations such
> >as sum or max. Using the previous example, type(x.sum(axis=0)) is
> >ndarray, but type(y.sum(axis=0)) is int32. I propose two changes:
> >
> > a. Make x.sum(axis) return ndarray unless axis is None, making
> >type(y.sum(axis=0)) is ndarray true in the example.
>
> Hmm... I'm not sure. y.sum(axis=0) is the default spelling of sum(y).
> Thus, this would cause all old code to return a rank-0 array.
From ndarray at mac.com Fri Feb 24 09:16:04 2006
From: ndarray at mac.com (Sasha)
Date: Fri Feb 24 09:16:04 2006
Subject: [Numpy-discussion] A case for rank-0 arrays
In-Reply-To: <43FE9A87.3050104@ieee.org>
References: <43FE9A87.3050104@ieee.org>
Message-ID:

On 2/24/06, Travis Oliphant wrote:
> Sasha wrote:
> >I propose to change numpy rules so that if ellipsis is present inside
> >[], the operation is always projection and both y[1,...] and
> >x[1,1,...] return zero-rank arrays.  Note that I have previously
> >rejected Francesc's idea that x[...] and x[()] should have different
> >meaning for zero-rank arrays.  I was wrong.
> >
> I think this is a good and clear rule.  And it seems like we may be
> "almost" there.
> Anybody want to implement it?
>
I'll implement it.  I think I am well prepared to handle this after I
implemented [] for the rank-0 case.

> >2. Another source of ambiguity is the various "reduce" operations such
> >as sum or max.  Using the previous example, type(x.sum(axis=0)) is
> >ndarray, but type(y.sum(axis=0)) is int32.  I propose two changes:
> >
> >  a. Make x.sum(axis) return ndarray unless axis is None, making
> >type(y.sum(axis=0)) is ndarray true in the example.
> >
> Hmm... I'm not sure.  y.sum(axis=0) is the default spelling of sum(y).
> Thus, this would cause all old code to return a rank-0 array.
>
> Most people who write sum(y) want a scalar, not a "function with 0
> arguments"
>
That's a valid concern.  Maybe we can first agree that it will be
helpful to have some way of implementing the sum operation that always
returns ndarray even in the dimensionless case.  Once we agree on this
goal we can choose a spelling for such an operation.  One possibility,
if we implement (b), is to keep the old behavior for y.sum(axis=0), but
make y.sum(axis=(0,)) return an ndarray in all cases.  The ugliness of
that spelling may be an advantage because it conveys a "you know what
you are doing" message.

> >  b. Allow axis to be a sequence of ints and make
> >x.sum(axis=range(rank(x))) return rank-0 array according to the rule
> >2.a above.
> >
> So, this would sum over multiple axes?  I guess I'm not opposed to
> something like that, but I'm not really excited about it either.

It looks like this is the kind of proposal that has a better chance of
being adopted once someone implements it.  I will definitely implement
it if it becomes a requirement for (a), because I do need some way to
spell sum that does not change the type in the dimensionless case.

> Would that make sense for all methods that take the axis= argument?
>
I think so, but I did not review all the cases.

> >  c. Make x.sum() raise an error for rank-0 arrays and scalars, but
> >allow x.sum(axis=()) to return x.  This will make numpy sum consistent
> >with the built-in sum that does not work on scalars.
> >
> I don't think I like this at all.
>
Can you be more specific about what you don't like?  Why should numpy's
sum be different from the built-in sum?  Numpy made dimensionless
arrays non-iterable; isn't it logical to make them non-summable as
well?

Note that in the dimensionful case providing a non-existent axis is an
error:

>>> array([1]).sum(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: axis(=1) out of bounds

Why shouldn't this be an error in the dimensionless case?  Current
behavior is rather odd:

>>> array(1).sum(axis=0)
1
>>> array(1).sum(axis=1)
1

> >I propose to make shape=() valid in ndarray constructor.
> >
> +1

Will do.

> I think we need more thinking about rank-0 arrays before doing something
> like proposal 2.  However, 1 and 3 seem simple enough to move forward
> with...

Sounds like a plan!
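In the meantime, proposal (b) is easy to prototype in pure Python on top
of the existing single-axis sum -- a rough, untested sketch rather than
the real C implementation (and it still returns a scalar instead of a
rank-0 array once every axis has been reduced, which is exactly the point
under debate):

import numpy

def sum_axes(a, axes):
    # reduce over each listed axis; take the highest axis first so the
    # remaining axis numbers stay valid as dimensions disappear
    for ax in sorted(axes, reverse=True):
        a = a.sum(axis=ax)
    return a

x = numpy.ones((2, 3, 4))
print sum_axes(x, (0, 2)).shape   # -> (3,)
print sum_axes(x, ()) is x        # -> True, matching x.sum(axis=()) in (c)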
From ndarray at mac.com Fri Feb 24 10:45:01 2006
From: ndarray at mac.com (Sasha)
Date: Fri Feb 24 10:45:01 2006
Subject: [Numpy-discussion] Shapes and sizes
In-Reply-To: <031a01c6395d$e69b3170$6363630a@dsp.sun.ac.za>
References: <031a01c6395d$e69b3170$6363630a@dsp.sun.ac.za>
Message-ID:

On 2/24/06, Albert Strasheim wrote:
> def outsize(n):
>     if n % 2 == 0: return n/2+1
>     return (n+1)/2
>
> if arra.ndim == 0:
>     b = empty(())
> elif arra.ndim == 1:
>     b = empty((outsize(arra.shape[0]),))
> else:
>     b = empty((arra.shape[0],outsize(arra.shape[1])))

Try

>>> empty(map(outsize, arra.shape))

> This problem is solved in MATLAB where there is no distinction between 1 and [1].
>
> How do you guys deal with this problem in your functions?

You might want to take a look at the parallel thread under "A case for
rank-0 arrays."

From ndarray at mac.com Fri Feb 24 10:54:03 2006
From: ndarray at mac.com (Sasha)
Date: Fri Feb 24 10:54:03 2006
Subject: [Numpy-discussion] Shapes and sizes
In-Reply-To:
References: <031a01c6395d$e69b3170$6363630a@dsp.sun.ac.za>
Message-ID:

On 2/24/06, Sasha wrote:
> Try
> >>> empty(map(outsize, arra.shape))

Oops.  I did not realize that you want to apply outsize to the last
dimension only.  For ndim>1, you can do

>>> empty(arra.shape[:-1]+(outsize(arra.shape[-1]),))

That will not work for scalars though, but you might want to rethink
whether your function makes sense for scalars.  Remember, 1 is not the
same as [1] in Python; you may be trying to copy the MATLAB design too
literally.

From fullung at gmail.com Fri Feb 24 11:12:13 2006
From: fullung at gmail.com (Albert Strasheim)
Date: Fri Feb 24 11:12:13 2006
Subject: [Numpy-discussion] Visual Studio build broken due to array types changes
In-Reply-To: <006a01c6394e$a98e85c0$6363630a@dsp.sun.ac.za>
Message-ID: <036201c63976$29ae22c0$6363630a@dsp.sun.ac.za>

Hello all

The build breaking changes (when building with Visual Studio .NET 2003)
were introduced in revision 2150 with log message "Make an enumerated
type out of the scalar defines.". The following files were changed:

numpy\core\include\numpy\arrayobject.h
numpy\core\src\multiarraymodule.c
numpy\core\src\scalartypes.inc.src
numpy\core\src\ufuncobject.c

The build with MinGW GCC 3.4.2 seems to have been broken at least since
revision 2146.

I would be willing to set up a machine to perform Windows builds using
Visual Studio .NET 2003, Visual C++ 2005 Express Edition and MinGW GCC
so that we can avoid these problems in future. Anybody interested in
this?

Regards

Albert

From stefan at sun.ac.za Fri Feb 24 11:16:03 2006
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Fri Feb 24 11:16:03 2006
Subject: [Numpy-discussion] subclassing ndaray
In-Reply-To: <43FF2C92.3060304@sympatico.ca>
References: <43FF2C92.3060304@sympatico.ca>
Message-ID: <20060224191426.GD21117@alpha>

I see the same strange result.  Here is a minimal code example to
demonstrate:

import numpy as N

class Bar(N.ndarray):
    v = 0.

    def __new__(cls, *args, **kwargs):
        print "running new"
        return super(Bar, cls).__new__(cls, *args)

    def __init__(self, *args, **kwargs):
        print "running init"
        self[:] = 0
        self.v = 3

In [88]: b = Bar(3)
running new
running init

In [89]: b
Out[89]: Bar([0, 0, 0])

In [90]: b.v
Out[90]: 3

In [91]: c = b+1

In [92]: c.v
Out[92]: 0.0

However, if I do b[:] = 1, everything works fine.

Stéfan

On Fri, Feb 24, 2006 at 10:56:02AM -0500, Colin J. Williams wrote:
> I have a subclass Bar, a 1-dim array which has some methods and some
> attributes.  One of the attributes is a view of the Bar to permit
> different shaping.
>
> Suppose that 'a' is an instance of 'Bar', which has a method 'show' and
> a view attribute 'v'.
>
> a ^ 15 returns a Bar instance, with its methods but without the attributes.
>
> I am attempting to change this; Bar has a method __xor__, see below:
>
>     def __xor__(self, other):
>         ''' Exclusive or: __xor__(x, y) => x ^ y . '''
>         z= 1                  << this loops to the recursion limit
>         result= ArrayType.__xor__(self, other)
>         n= self.n
>         result.n= n
>         result.rowSize= self.rowSize
>         result.show= self.show
>         result.v= _n.reshape(result.view(), (n*n, n*n))
>         return result
>
> Could anyone suggest a workaround please?
>
> Colin W.

From strawman at astraw.com Fri Feb 24 11:21:09 2006
From: strawman at astraw.com (Andrew Straw)
Date: Fri Feb 24 11:21:09 2006
Subject: [Numpy-discussion] algorithm, optimization, or other problem?
In-Reply-To: <43FDF1B8.3070608@bryant.edu>
References: <07C6A61102C94148B8104D42DE95F7E8C8EF39@exchange2k.envision.co.il> <43FDFB80.9090106@VisionSense.com> <43FDF1B8.3070608@bryant.edu>
Message-ID: <43FF5C3D.3070006@astraw.com>

> > cdef int it,i
> >
> > increases the speed from 8 seconds per block to 0.2 seconds per block,
> > which is comparable to the mex.
> > I learned that I have to be a bit more careful! :) Yes, it's always good to double-check the autogenerated C code that Pyrex makes. (This becomes especially important if you release the GIL from your Pyrex code -- I once spent days tracking a weird race condition in threaded code due to this simple oversight.) I'm glad Pyrex is working to get comparable speeds to pure C now. Cheers! Andrew From fullung at gmail.com Fri Feb 24 11:44:05 2006 From: fullung at gmail.com (Albert Strasheim) Date: Fri Feb 24 11:44:05 2006 Subject: [Numpy-discussion] Shapes and sizes In-Reply-To: References: <031a01c6395d$e69b3170$6363630a@dsp.sun.ac.za> Message-ID: <20060224194323.GA12443@dogbert.sdsl.sun.ac.za> Hello On Fri, 24 Feb 2006, Sasha wrote: > On 2/24/06, Sasha wrote: > > Try > > >>> empty(map(outsize, arra.shape)) > Oops. I did not realize that you want to apply outsize to the last > dimension only. For ndim>1, you can do > > >>> empty(arra.shape[:-1]+(outsize(arra.shape[-1]),)) Thanks, this works nicely. My code: outsize = lambda n: (n/2+1, (n+1)/2)[(n%2)%2] b = empty(a.shape[:-1]+(outsize(a.shape[-1]),)) (One line less than MATLAB ;-)) > That will not work work for scalars though, but you might want to > rethink whether your function makes sense for scalars. Remember, 1 is > not the same as [1] in python, you maybe trying to copy MATLAP design > too literally. Personally, I think asarray(scalar) should return something that can actually be used as an array (i.e. has a proper shape and ndim), but if all NumPy functions operate only on arrays to begin with, I could live with that too. Regards Albert From cookedm at physics.mcmaster.ca Fri Feb 24 11:51:04 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Fri Feb 24 11:51:04 2006 Subject: [Numpy-discussion] Simple NumPy-compatible vector w/C++ & SWIG? In-Reply-To: <9C3EBDD9-2C89-4587-9795-29906E150120@hetland.org> References: <9C3EBDD9-2C89-4587-9795-29906E150120@hetland.org> Message-ID: <20060224194944.GA17207@arbutus.physics.mcmaster.ca> On Fri, Feb 24, 2006 at 01:26:16PM +0100, Magnus Lie Hetland wrote: > Hi! > > I'm working on a data structure library where one of the element > types most likely will be a vector type (i.e., points in a > multidimensional space, with the dimensionality set by the user). In > the data structure (which is disk-based) I have work with raw bytes > that I'd like to copy around as little as possible. > > The library itself is (being) written in C++, but I'm wrapping it > with SWIG so I can drive and test it with Python. It seems to me that > something NumPy-compatible might be the best choice for the vector > type, but I'm not sure how I should do that. > > I've been thinking about simply implementing a minimal compatibility > layer for the NumPy Array Interface; is it then possible to construct > a NumPy array using this custom array, and get full support for the > various array operations without actually copying the data? I assume you've looked at the array interface at http://numeric.scipy.org/array_interface.html ? If you implement that (if you're working with C or C++, adding just __array_struct__ is probably the easiest), then numpy can use your vectors without copying data. Call numpy.asarray(v), and you have a numpy array with all the numpy methods. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. 
Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From fullung at gmail.com Fri Feb 24 12:21:05 2006 From: fullung at gmail.com (Albert Strasheim) Date: Fri Feb 24 12:21:05 2006 Subject: [Numpy-discussion] Visual Studio build broken due to array types changes In-Reply-To: <43FF6114.9060008@ieee.org> References: <036201c63976$29ae22c0$6363630a@dsp.sun.ac.za> <43FF6114.9060008@ieee.org> Message-ID: <5eec5f300602241220m292c2a10wd704d6bd699f782f@mail.gmail.com> Hello Thanks! Looks like this solves the problems with Visual Studio. I did a python setup.py clean between builds. Shouldn't this clean up the source tree sufficiently? Regards Albert On 2/24/06, Travis Oliphant wrote: > Albert Strasheim wrote: > > >Hello all > > > > > Please remove the build directory and try with a fresh build. > > It seems many of your errors come from a new version of the C-API. > > Thanks, > > -Travis From ndarray at mac.com Fri Feb 24 12:57:04 2006 From: ndarray at mac.com (Sasha) Date: Fri Feb 24 12:57:04 2006 Subject: [Numpy-discussion] Tests fail in SVN r2165 Message-ID: Usin python 2.4.2 and latest SVN version I get the following: > python Python 2.4.2 (#3, Jan 13 2006, 13:52:39) [GCC 3.4.4 20050721 (Red Hat 3.4.4-2)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy >>> numpy.test() Found 3 tests for numpy.lib.getlimits Found 30 tests for numpy.core.numerictypes Found 11 tests for numpy.core.umath Found 8 tests for numpy.lib.arraysetops Found 42 tests for numpy.lib.type_check Found 83 tests for numpy.core.multiarray Found 3 tests for numpy.dft.helper Found 27 tests for numpy.core.ma Found 1 tests for numpy.core.oldnumeric Traceback (most recent call last): File "", line 1, in ? File "/.../lib/python2.4/site-packages/numpy/__init__.py", line 46, in test return NumpyTest('numpy').test(level, verbosity) File "/.../lib/python2.4/site-packages/numpy/testing/numpytest.py", line 422, in test suites.extend(self._get_module_tests(module, abs(level), verbosity)) File "/.../lib/python2.4/site-packages/numpy/testing/numpytest.py", line 355, in _get_module_tests self.warn('FAILURE importing tests for %s' % (mstr(module))) File "/.../lib/python2.4/site-packages/numpy/testing/numpytest.py", line 469, in warn from numpy.distutils.misc_util import yellow_text File "/.../lib/python2.4/site-packages/numpy/distutils/__init__.py", line 5, in ? import ccompiler File "/.../lib/python2.4/site-packages/numpy/distutils/ccompiler.py", line 12, in ? from exec_command import exec_command File "/.../lib/python2.4/site-packages/numpy/distutils/exec_command.py", line 54, in ? import tempfile File "/.../lib/python2.4/tempfile.py", line 33, in ? from random import Random as _Random ImportError: cannot import name Random Does anyone see the same? It looks like it fails while loading test_misc_util, but running that alone works: > python lib/python2.4/site-packages/numpy/distutils/tests/test_misc_util.py Found 4 tests for __main__ .... ---------------------------------------------------------------------- Ran 4 tests in 0.001s From ndarray at mac.com Fri Feb 24 13:06:05 2006 From: ndarray at mac.com (Sasha) Date: Fri Feb 24 13:06:05 2006 Subject: [Numpy-discussion] Who is maintaining the SVN repository? Message-ID: Where should I report problems with the SVN repository? 
I've tried to edit the log entry of my commit and got this: > svn propedit -r 2166 --revprop svn:log svn: DAV request failed; it's possible that the repository's pre-revprop-change hook either failed or is non-existent svn: At least one property change failed; repository is unchanged Is that right? From ndarray at mac.com Fri Feb 24 13:30:03 2006 From: ndarray at mac.com (Sasha) Date: Fri Feb 24 13:30:03 2006 Subject: [Numpy-discussion] Why multiple ellipses? Message-ID: Numpy allows multiple ellipses in indexing expressions, but I am not sure if that is useful. AFAIK, ellipsis stands for "as many :'s as needed", but if there is more than one, how do I know how many :'s each of them represents: >>> x = arange(8) >>> x.shape=(2,2,2) >>> x[0,...,0,...] array([0, 1]) >>> x[0,0,:] array([0, 1]) >>> x[0,:,0] array([0, 2]) In the example above, the first ellipsis represents no :'s and the last one represents one. Is that the current rule that the last ellipsis represents all the needed :'s? What is the possible use for that? From ndarray at mac.com Fri Feb 24 14:43:08 2006 From: ndarray at mac.com (Sasha) Date: Fri Feb 24 14:43:08 2006 Subject: [Numpy-discussion] Tests fail in SVN r2165 In-Reply-To: References: <038b01c6398a$795b3b50$6363630a@dsp.sun.ac.za> Message-ID: I've located the culprit. Once I revert numpy/core/__init__.py to 2164, tests pass. The change appears to postpone the import of NumpyTest: > svn diff -r 2164:2165 numpy/core/__init__.py Index: numpy/core/__init__.py =================================================================== --- numpy/core/__init__.py (revision 2164) +++ numpy/core/__init__.py (revision 2165) @@ -25,5 +25,6 @@ __all__ += rec.__all__ __all__ += char.__all__ -from numpy.testing import ScipyTest -test = ScipyTest().test +def test(level=1, verbosity=1): + from numpy.testing import NumpyTest + return NumpyTest().test(level, verbosity) I don't understand why this causes problems on my system, but it does. Pearu? On 2/24/06, Sasha wrote: > I've reverted to r2164 and everything is back to normal. This > suggests that r2165 changes are at fault and not my setup. That > change deals with imports and the failure that I see happens during > import. > > I've tried NUMPY_IMPORT_DEBUG=1 and got this: > > Executing 'import testing' ... ok > Executing 'from testing import ScipyTest' ... ok > Executing 'from testing import NumpyTest' ... ok > Executing 'import core' ... ok > Executing 'from core import *' ... ok > Executing 'import random' ... ok > Executing 'from random import rand' ... ok > Executing 'from random import randn' ... ok > Executing 'import lib' ... ok > Executing 'from lib import *' ... ok > Executing 'import linalg' ... ok > Executing 'import dft' ... ok > Executing 'from dft import fft' ... ok > Executing 'from dft import ifft' ... ok > Found 4 tests for numpy.lib.getlimits > Found 30 tests for numpy.core.numerictypes > Found 11 tests for numpy.core.umath > Found 8 tests for numpy.lib.arraysetops > Found 42 tests for numpy.lib.type_check > Found 83 tests for numpy.core.multiarray > Found 3 tests for numpy.dft.helper > Found 27 tests for numpy.core.ma > Traceback (most recent call last): > ... > > Any suggestions for further diagnostic? > > > On 2/24/06, Albert Strasheim wrote: > > Works for me. Just built revision 2165 on Windows with Visual Studio. 
> > > > Python 2.4.2 (#67, Sep 28 2005, 12:41:11) [MSC v.1310 32 bit (Intel)] > > > > In [1]: import numpy > > > > In [2]: numpy.test() > > Found 83 tests for numpy.core.multiarray > > Found 3 tests for numpy.lib.getlimits > > Found 11 tests for numpy.core.umath > > Found 8 tests for numpy.lib.arraysetops > > Found 42 tests for numpy.lib.type_check > > Found 4 tests for numpy.lib.index_tricks > > Found 30 tests for numpy.core.numerictypes > > Found 27 tests for numpy.core.ma > > Found 1 tests for numpy.core.oldnumeric > > Found 9 tests for numpy.lib.twodim_base > > Found 8 tests for numpy.core.defmatrix > > Found 1 tests for numpy.lib.ufunclike > > Found 33 tests for numpy.lib.function_base > > Found 3 tests for numpy.dft.helper > > Found 1 tests for numpy.lib.polynomial > > Found 6 tests for numpy.core.records > > Found 14 tests for numpy.core.numeric > > Found 44 tests for numpy.lib.shape_base > > Found 0 tests for __main__ > > ............................................................................ > > ........................ > > ............................................................................ > > ........................ > > ............................................................................ > > ........................ > > ............................ > > ---------------------------------------------------------------------- > > Ran 328 tests in 0.781s > > > > OK > > Out[2]: > > > > In [3]: numpy.__version__ > > Out[3]: '0.9.6.2165' > > > > Cheers > > > > Albert > > > > > From Chris.Barker at noaa.gov Fri Feb 24 15:19:05 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Fri Feb 24 15:19:05 2006 Subject: [Numpy-discussion] Shapes and sizes In-Reply-To: <20060224194323.GA12443@dogbert.sdsl.sun.ac.za> References: <031a01c6395d$e69b3170$6363630a@dsp.sun.ac.za> <20060224194323.GA12443@dogbert.sdsl.sun.ac.za> Message-ID: <43FF943D.5080100@noaa.gov> Albert Strasheim wrote: > Personally, I think asarray(scalar) should return something that can > actually be used as an array (i.e. has a proper shape and ndim), What is the proper shape and numdim? Only your app knows. What I generally do with a function i want to take an array or "something that can be turned into an array" is to use asarray, then set the shape to what I am expecting: >>> import numarray as N >>> >>> def rank1(input): ... A = N.asarray(input) ... A.shape = (-1) ... print repr(A) ... >>> rank1((5,6,7,8)) array([5, 6, 7, 8]) >>> rank1(5) array([5]) >>> >>> def rank2(input): ... A = N.asarray(input) ... A.shape = (-1, 2) ... print repr(A) ... >>> rank2((2,3)) array([[2, 3]]) >>> rank2(((2,3), (4,5), (6,7))) array([[2, 3], [4, 5], [6, 7]]) -Chris -- Christopher Barker, Ph.D. 
Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From borreguero at gmail.com Fri Feb 24 15:23:02 2006 From: borreguero at gmail.com (Jose Borreguero) Date: Fri Feb 24 15:23:02 2006 Subject: [Numpy-discussion] unsuccessful install on 64bit machine Message-ID: <7cced4ed0602241522x6bc5d501g6d21f0aea7a141d6@mail.gmail.com> While installing numpy under a non-conventional directory: *python setup.py install --prefix==/gpfs1/active/jose/code/python *I get two classes of warnings: (1) blas_mkl_info: /tmp/numpy-0.9.5/numpy/distutils/system_info.py:531: UserWarning: Library error: libs=['mkl', 'vml', 'guide'] found_libs=[] warnings.warn("Library error: libs=%s found_libs=%s" % \ NOT AVAILABLE (and others warnings similar to this) (2)numpy.core - nothing done with h_files= ['build/src/numpy/core/src/scalartypes.inc', 'build/src/numpy/core/src/arraytypes.inc', 'build/src/numpy/core/config.h', 'build/src/numpy/core/__multiarray_api.h'] (and others warnings similar to this) I created a very simple site.cfg file under numpy/distutils, which goes like this [blas] library_dirs = /usr/lib64 [lapack] library_dirs = /usr/lib64 I have no Atlas library installed in the system. My compiler: gcc version 3.4.4 20050721 (Red Hat 3.4.4-2) (I also have pg compilers) After setup.py has run, I have no */gpfs1/active/jose/code/python/lib64/python2.3/site-packages/numpy *created. So numpy is not installed. Any ideas, please? -- Jose M. Borreguero jmborr at gatech.edu, www.borreguero.com phone: 404 407 8980 Fax: 404 385 7478 Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, 250 14St NW, Atlanta GA 30318 -------------- next part -------------- An HTML attachment was scrubbed... URL: From pearu at scipy.org Fri Feb 24 15:23:03 2006 From: pearu at scipy.org (Pearu Peterson) Date: Fri Feb 24 15:23:03 2006 Subject: [Numpy-discussion] Tests fail in SVN r2165 In-Reply-To: References: <038b01c6398a$795b3b50$6363630a@dsp.sun.ac.za> Message-ID: On Fri, 24 Feb 2006, Sasha wrote: > I've located the culprit. Once I revert numpy/core/__init__.py to > 2164, tests pass. > > The change appears to postpone the import of NumpyTest: >> svn diff -r 2164:2165 numpy/core/__init__.py > Index: numpy/core/__init__.py > =================================================================== > --- numpy/core/__init__.py (revision 2164) > +++ numpy/core/__init__.py (revision 2165) > @@ -25,5 +25,6 @@ > __all__ += rec.__all__ > __all__ += char.__all__ > > -from numpy.testing import ScipyTest > -test = ScipyTest().test > +def test(level=1, verbosity=1): > + from numpy.testing import NumpyTest > + return NumpyTest().test(level, verbosity) > > > > I don't understand why this causes problems on my system, but it does. Pearu? Could you try svn update and reinstall of numpy now? This error could be due to the fact that numpy.distutils.__init__ imported numpy.testing while importing numpy.__init__ was not finished yet. Pearu From fullung at gmail.com Fri Feb 24 15:36:07 2006 From: fullung at gmail.com (Albert Strasheim) Date: Fri Feb 24 15:36:07 2006 Subject: [Numpy-discussion] Build on Windows: system_info and site.cfg Message-ID: <001501c6399b$0db5cc10$6363630a@dsp.sun.ac.za> Hello all I'm trying to build NumPy on Windows using optimized ATLAS and CLAPACK libraries. 
The system_info functions are currently doing things like: p = self.combine_paths(lib_dir, 'lib'+l+ext) and if self.search_static_first: exts = ['.a',so_ext] else: exts = [so_ext,'.a'] if sys.platform=='cygwin': exts.append('.dll.a') which generally isn't going work on Windows, where library names don't always start with 'lib' and always end in '.lib', for static libraries and DLL import libraries. It might be worth it to have users explicitly specify their build settings (compiler, flags, BLAS libraries, LAPACK libraries, etc.) in a site.cfg instead of trying to detect all the possible combinations of C and/or FORTRAN BLAS, LAPACK, FFTW and whatnot. A few default configurations could be provided for common configurations. Any thoughts? Is anyone interested in fixing the build on Windows? Thanks! Albert From pearu at scipy.org Fri Feb 24 15:40:02 2006 From: pearu at scipy.org (Pearu Peterson) Date: Fri Feb 24 15:40:02 2006 Subject: [Numpy-discussion] unsuccessful install on 64bit machine In-Reply-To: <7cced4ed0602241522x6bc5d501g6d21f0aea7a141d6@mail.gmail.com> References: <7cced4ed0602241522x6bc5d501g6d21f0aea7a141d6@mail.gmail.com> Message-ID: On Fri, 24 Feb 2006, Jose Borreguero wrote: > While installing numpy under a non-conventional directory: > *python setup.py install --prefix==/gpfs1/active/jose/code/python > *I get two classes of warnings: > > (1) blas_mkl_info: > /tmp/numpy-0.9.5/numpy/distutils/system_info.py:531: UserWarning: Library > error: libs=['mkl', 'vml', 'guide'] found_libs=[] > warnings.warn("Library error: libs=%s found_libs=%s" % \ > NOT AVAILABLE > (and others warnings similar to this) > > (2)numpy.core - nothing done with h_files= > ['build/src/numpy/core/src/scalartypes.inc', > 'build/src/numpy/core/src/arraytypes.inc', 'build/src/numpy/core/config.h', > 'build/src/numpy/core/__multiarray_api.h'] > (and others warnings similar to this) This warnings can be ignored. > I created a very simple site.cfg file under numpy/distutils, which goes like > this > [blas] > library_dirs = /usr/lib64 > [lapack] > library_dirs = /usr/lib64 To avoid such hooks and all the troubles on 64-bit platforms, numpy/distutils/system_info.py needs a 64-bit support in setting up default_* directory lists. I can write the patch if someone could provide information how to determine if one is running 64-bit or 32-bit applications. numpy.distutils.cpuinfo can give is_64bit()->True but that does not guarantee that used software is 64-bit. What are the values of sys.platform os.name in your system? > I have no Atlas library installed in the system. > My compiler: gcc version 3.4.4 20050721 (Red Hat 3.4.4-2) (I also have pg > compilers) > > After setup.py has run, I have no > */gpfs1/active/jose/code/python/lib64/python2.3/site-packages/numpy > *created. So numpy is not installed. What was the output when you run setup.py? Pearu From oliphant at ee.byu.edu Fri Feb 24 16:09:01 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Fri Feb 24 16:09:01 2006 Subject: [Numpy-discussion] Why multiple ellipses? In-Reply-To: References: Message-ID: <43FF9FD8.8090704@ee.byu.edu> Sasha wrote: >Numpy allows multiple ellipses in indexing expressions, but I am not >sure if that is useful. AFAIK, ellipsis stands for "as many :'s as >needed", but if there is more than one, how do I know how many :'s >each of them represents: > > It should be that the first ellipsis is interpreted as an ellipsis. Any others are silently converted to ':' characters. > > >>>>x = arange(8) >>>>x.shape=(2,2,2) >>>>x[0,...,0,...] 
>array([0, 1])
>
This is equivalent to x[0,...,0,:] which is equivalent to x[0,0,:]
(because the ellipsis is interpreted as nothing).

>>>>x[0,0,:]
>array([0, 1])
>
>>>>x[0,:,0]
>array([0, 2])
>
>In the example above, the first ellipsis represents no :'s and the
>last one represents one.  Is that the current rule that the last
>ellipsis represents all the needed :'s?  What is the possible use for
>that?
>
The rule is that only the first ellipsis (from left to right) is used
and any others are just another spelling of ':'.

This is the rule that Numeric implemented and so it's what we've kept.
I have no idea what the use might be, but I saw changing the rule as
gratuitous breakage.

Thus, only one ellipsis is actually treated like an ellipsis.
Everything else is treated as ':'

-Travis

From oliphant.travis at ieee.org Fri Feb 24 17:41:14 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Fri Feb 24 17:41:14 2006
Subject: [Numpy-discussion] subclassing ndaray
In-Reply-To: <20060224191426.GD21117@alpha>
References: <43FF2C92.3060304@sympatico.ca> <20060224191426.GD21117@alpha>
Message-ID: <43FFB580.80307@ieee.org>

Stefan van der Walt wrote:

>I see the same strange result.  Here is a minimal code example to
>demonstrate:
>
>import numpy as N
>
>class Bar(N.ndarray):
>    v = 0.
>
>    def __new__(cls, *args, **kwargs):
>        print "running new"
>        return super(Bar, cls).__new__(cls, *args)
>
>    def __init__(self, *args, **kwargs):
>        print "running init"
>        self[:] = 0
>        self.v = 3
>
It's only strange if you have assumptions you're not revealing.  Here's
the deal.

Neither the __init__ method nor the __new__ method is called for c = b+1.

So, you're wondering how the Bar object got created then, right?  Well,
it got created as a subclass of ndarray in PyArray_NewFromDescr.

The __init__ and __new__ methods are not called because they may have
arbitrary signatures.  Instead, the __array_finalize__ method is always
called.  So, you should use that instead of __init__.

The __array_finalize__ method always receives the argument of the
"parent" object.

Thus, in your case

def __array_finalize__(self, parent):
    self.v = 3

would do what you want.

-Travis

>In [88]: b = Bar(3)
>running new
>running init
>
>In [89]: b
>Out[89]: Bar([0, 0, 0])
>
>In [90]: b.v
>Out[90]: 3
>
>In [91]: c = b+1
>
>In [92]: c.v
>Out[92]: 0.0
>
>However, if I do b[:] = 1, everything works fine.
>
>Stéfan
>
>On Fri, Feb 24, 2006 at 10:56:02AM -0500, Colin J. Williams wrote:
>
>>I have a subclass Bar, a 1-dim array which has some methods and some
>>attributes.  One of the attributes is a view of the Bar to permit
>>different shaping.
>>
>>Suppose that 'a' is an instance of 'Bar', which has a method 'show' and
>>a view attribute 'v'.
>>
>>a ^ 15 returns a Bar instance, with its methods but without the attributes.
>>
>>I am attempting to change this; Bar has a method __xor__, see below:
>>
>>    def __xor__(self, other):
>>        ''' Exclusive or: __xor__(x, y) => x ^ y . '''
>>        z= 1                  << this loops to the recursion limit
>>        result= ArrayType.__xor__(self, other)
>>        n= self.n
>>        result.n= n
>>        result.rowSize= self.rowSize
>>        result.show= self.show
>>        result.v= _n.reshape(result.view(), (n*n, n*n))
>>        return result
>>
>>Could anyone suggest a workaround please?
>>
>>Colin W.

From oliphant.travis at ieee.org Fri Feb 24 17:59:03 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Fri Feb 24 17:59:03 2006
Subject: [Numpy-discussion] subclassing ndaray
In-Reply-To: <43FF2C92.3060304@sympatico.ca>
References: <43FF2C92.3060304@sympatico.ca>
Message-ID: <43FFB9C2.7080905@ieee.org>

Colin J. Williams wrote:

> I have a subclass Bar, a 1-dim array which has some methods and some
> attributes.  One of the attributes is a view of the Bar to permit
> different shaping.

The ndarray handles sub-classing a little bit differently.  All array
constructors go through the same C-code call which can create
sub-classes as well.  (it's what gets called by ndarray.__new__).

If there is a parent object, then additionally, the

__array_finalize__(self, parent)

method is called right after creation of the sub-class.  This is where
attributes should be finalized.  But, care must be taken in this code
so that a recursion is not set up.

If this mechanism is not sufficient for you, then you need to use a
container class (for this reason UserArray has been resurrected to
serve as a default container class model---it needs more testing,
however).

The problem __array_finalize__ helps fix is how to get subclasses to
work well without having to re-define every single special method like
UserArray does.

For the most part it seems to work, but I suppose it creates a few
surprises if you are not aware of what is going on.

The most important thing to remember is that attributes are not
automatically carried over to new instances, because new instances can
be created without ever calling __new__ or __init__.

I'm sure this mechanism can be improved upon and I welcome suggestions.

> Suppose that 'a' is an instance of 'Bar', which has a method 'show'
> and a view attribute 'v'.
> a ^ 15 returns a Bar instance, with its methods but without the
> attributes.
>
> I am attempting to change this; Bar has a method __xor__, see below:
>
>     def __xor__(self, other):
>         ''' Exclusive or: __xor__(x, y) => x ^ y . '''
>         z= 1                  << this loops to the recursion limit
>         result= ArrayType.__xor__(self, other)
>         n= self.n
>         result.n= n
>         result.rowSize= self.rowSize
>         result.show= self.show
>         result.v= _n.reshape(result.view(), (n*n, n*n))
>         return result

Look at the __array_finalize__ method in defmatrix.py for ideas about
how it can be used.

-Travis

From aisaac at american.edu Fri Feb 24 21:44:02 2006
From: aisaac at american.edu (Alan G Isaac)
Date: Fri Feb 24 21:44:02 2006
Subject: [Numpy-discussion] Re: Method to shift elements in an array?
Message-ID:

Tim wrote:
> import numpy
> def roll(A, n):
>     "Roll the array A in place. Positive n -> roll right, negative n ->
>     roll left"
>     if n > 0:
>         n = abs(n)
>         temp = A[-n:]
>         A[n:] = A[:-n]
>         A[:n] = temp
>     elif n < 0:
>         n = abs(n)
>         temp = A[:n]
>         A[:-n] = A[n:]
>         A[-n:] = temp
>     else:
>         pass

This probably counts as a gotcha:

>>> a=N.arange(10)
>>> temp=a[-6:]
>>> a[6:]=a[:-6]
>>> a[:6]=temp
>>> a
array([4, 5, 0, 1, 2, 3, 0, 1, 2, 3])

Cheers,
Alan Isaac
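The gotcha arises because temp = a[-6:] is a view into a, not a copy, so
the a[6:] assignment overwrites part of it before it is read back.  One
possible fix is to copy the wrapped-around block explicitly -- a minimal,
untested sketch of the roll above:

import numpy

def roll(A, n):
    "Roll the 1-d array A in place: positive n rolls right, negative left."
    n = n % len(A)          # a left roll is the complementary right roll
    if n == 0:
        return
    temp = A[-n:].copy()    # copy, so the block survives the shift below
    A[n:] = A[:-n]
    A[:n] = temp

PS Here's something close to the rotater functionality.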
#rotater: rotate row elements
# Format:  y = rotater(x, rotateby, inplace)
# Input:   x          RxC array
#          rotateby   size R integer array, or integer (rotation amounts)
#          inplace    boolean (default is False -> copies data)
# Output:  y          RxC array:
#                       rows rotated by rotateby
#                       or None (if inplace=True)
# Remarks: Intended for use with 2D arrays.
#          rotateby values are positive for rightward rotation,
#          negative for leftward rotation
# :author: Alan G Isaac (aisaac AT american DOT edu)
# :date: 24 Feb 2006
def rotater(x, rotateby, inplace=False):
    assert(len(x.shape)==2), "For 2-d arrays only."
    xrotate = numpy.array(x, copy=(not inplace))
    xrows = xrotate.shape[0]
    #make an iterator of row shifts
    if isinstance(rotateby, int):
        from itertools import repeat
        rowshifts = repeat(rotateby, xrows)
    else:
        rowshifts = numpy.asarray(rotateby)
        assert(rowshifts.size==xrows)
        rowshifts = rowshifts.flat
    #perform rotation on each row
    for row in xrange(xrows):
        rs = rowshifts.next()
        #do nothing if rs==0
        if rs > 0:
            xrotate[row] = numpy.concatenate(
                [xrotate[row][-rs:], xrotate[row][:-rs]])
        elif rs < 0:
            #-rs is positive here, so this rotates leftward
            xrotate[row] = numpy.concatenate(
                [xrotate[row][-rs:], xrotate[row][:-rs]])
    if inplace:
        return None
    else:
        return xrotate

From ndarray at mac.com Fri Feb 24 22:32:04 2006
From: ndarray at mac.com (Sasha)
Date: Fri Feb 24 22:32:04 2006
Subject: [Numpy-discussion] Tests fail in SVN r2165
In-Reply-To:
References: <038b01c6398a$795b3b50$6363630a@dsp.sun.ac.za>
Message-ID:

On 2/24/06, Pearu Peterson wrote:
> Could you try svn update and reinstall of numpy now? This error could be
> due to the fact that numpy.distutils.__init__ imported numpy.testing while
> importing numpy.__init__ was not finished yet.

r2168 works fine.  Thanks a lot for a quick fix.

From nwagner at mecha.uni-stuttgart.de Fri Feb 24 23:49:01 2006
From: nwagner at mecha.uni-stuttgart.de (Nils Wagner)
Date: Fri Feb 24 23:49:01 2006
Subject: Fwd: [Numpy-discussion] Re: inconsistent use of axis= keyword argument?
Message-ID:

--- the forwarded message follows ---

From robert.kern at gmail.com Fri Feb 24 18:33:41 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 24 Feb 2006 17:33:41 -0600
Subject: [Numpy-discussion] Re: inconsistent use of axis= keyword argument?
In-Reply-To: <43FECB4B.8080900@mecha.uni-stuttgart.de>
References: <43FECB4B.8080900@mecha.uni-stuttgart.de>
Message-ID: <43FF97D5.7050300@gmail.com>

Nils Wagner wrote:
> Robert Kern wrote:
>
>>Vidar Gundersen wrote:
>>
>>>and i also wonder why concatenate can't be used to stack 1-d
>>>arrays on top of each other, returning a 2-d array?
>>
>>Use vstack() for that. Also note its companions, hstack() and column_stack().
>
> Hi Robert,
>
> Is it possible to use aliases for these operators ?
>
> In textbooks (Matrix Algebra by Abadir and Magnus, Cambridge University
> Press (2005)) you will find
>
> The vec-operator transforms a matrix into a vector by stacking its
> columns one underneath the other.

Ask on the list, not private email.

--
Robert Kern
robert.kern at gmail.com

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter --_===36437881====uni-stuttgart.de===_-- From stefan at sun.ac.za Sat Feb 25 00:41:01 2006 From: stefan at sun.ac.za (Stefan van der Walt) Date: Sat Feb 25 00:41:01 2006 Subject: [Numpy-discussion] subclassing ndaray In-Reply-To: <43FFB580.80307@ieee.org> References: <43FF2C92.3060304@sympatico.ca> <20060224191426.GD21117@alpha> <43FFB580.80307@ieee.org> Message-ID: <20060225083924.GF21117@alpha> On Fri, Feb 24, 2006 at 06:40:16PM -0700, Travis Oliphant wrote: > Stefan van der Walt wrote: > > >I see the same strange result. Here is a minimal code example to > >demonstrate: > > > >import numpy as N > > > >class Bar(N.ndarray): > > v = 0. > > > > def __new__(cls, *args, **kwargs): > > print "running new" > > return super(Bar, cls).__new__(cls, *args) > > > > def __init__(self, *args, **kwargs): > > print "running init" > > self[:] = 0 > > self.v = 3 > > > > > It's only strange if you have assumptions your not revealing. Here's > the deal. > > Neither the __init__ method nor the __new__ method are called for c = b+1. > > So, your wondering how the Bar object got created then right? Well, it > got created as a subclass of ndarray in PyArray_NewFromDescr. > > The __init__ and __new__ methods are not called because they may have > arbitrary signatures. Instead, the __array_finalize__ method is always > called. So, you should use that instead of __init__. > > The __array_finalize__ method always receives the argument of the > "parent" object. > > Thus in your case. > > def __array_finalize__(self, parent): > self.v = 3 > > would do what you want. That doesn't seem to work. __array_finalize__ isn't called when the object is initially constructed: In [14]: b = Bar(2) running new In [15]: b.v Out[15]: 0.0 In [16]: b=b+1 In [17]: b.v Out[17]: 3 Should a person then call __array_finalize__ from __init__? St?fan From cjw at sympatico.ca Sat Feb 25 05:18:07 2006 From: cjw at sympatico.ca (Colin J. Williams) Date: Sat Feb 25 05:18:07 2006 Subject: [Numpy-discussion] subclassing ndaray In-Reply-To: <43FFB9C2.7080905@ieee.org> References: <43FF2C92.3060304@sympatico.ca> <43FFB9C2.7080905@ieee.org> Message-ID: <440058E0.1040004@sympatico.ca> Travis Oliphant wrote: > Colin J. Williams wrote: > >> I have a subclass Bar, a 1-dim array which has some methods and some >> attributes. One of the attributes is a view of the Bar to permit >> different shaping. > > > The ndarray handles sub-classing a little-bit differently. All array > constructors go through the same C-code call which can create > sub-classes as well. (it's what gets called by ndarray.__new__). > > If there is a parent object, then additionally, the > > __array_finalize__(self, parent) > > method is called right-after creation of the sub-class. This is > where attributes should be finalized. But, care must be taken in this > code so that a recursion is not setup > > If this mechanism is not sufficient for you, then you need to use a > container class (for this reason UserArray has been resurrected to > serve as a default container class model---it needs more testing, > however). > The problem __array_finalize__ helps fix is how to get subclasses to > work well without having to re-define every single special method like > UserArray does. > > For the most part it seems to work, but I suppose it creates a few > surprises if you are not aware of what is going on. 
> > The most important thing to remember is that attributes are not > automatically carried over to new instances because new instances can > be created without every calling __new__ or __init__. > > I'm sure this mechanism can be improved upon and I welcome suggestions. Thanks for this. Does this mean that whenever we subclass ndarray, we should use __array_finalize__ (with its additional 'parent' parameter) instead of Python's usual __init__? It would help if you could clarify the role of 'parent'. [Dbg]>>> h(self.__array_finalize__) Help on built-in function __array_finalize__: __array_finalize__(...) Is parent the next type up in the type hierarchy? If so, can this not be determined from self.__class__? I've tried a similar operation with the Python library's sets.Set. There, __init__ is called, ensuring that the expression is of the appropriate sub-type. > >> Suppose that 'a' is an instance of 'Bar', which has a method 'show' >> and a view attribute 'v'. >> a ^ 15 returns a Bar instance, with its methods but without the >> attributes. >> >> I am attempt to change this, Bar has a method __xor__, see below: >> >> def __xor__(self, other): >> ''' Exclusive or: __xor__(x, y) => x ^ y . ''' >> z= >> >> 1 >> << this loops to the recursion limit >> result= ArrayType.__xor__(self, other) >> n= self.n >> result.n= n >> result.rowSize= self.rowSize >> result.show= self.show >> result.v= _n.reshape(result.view(), (n*n, n*n)) >> return result > > > > Look at the __array_finalize__ method in defmatrix.py for ideas about > how it can be used. def __array_finalize__(self, obj): ndim = self.ndim if ndim == 0: self.shape = (1,1) elif ndim == 1: self.shape = (1,self.shape[0]) return These are functions for which one would use __init__ in numarray. This doesn't describe or illustrate the role or purpose of the parent object. Colin W. From cjw at sympatico.ca Sat Feb 25 05:31:08 2006 From: cjw at sympatico.ca (Colin J. Williams) Date: Sat Feb 25 05:31:08 2006 Subject: [Numpy-discussion] ndarray - reshaping a sub-class Message-ID: <44005BF3.7080906@sympatico.ca> The function reshape, when applied to an instance of a sub-class, returns an array instance. The method reshape returns and instance of the sub-class. It seems desirable that both be treated in the same way. Colin W. From fullung at gmail.com Sat Feb 25 07:03:01 2006 From: fullung at gmail.com (Albert Strasheim) Date: Sat Feb 25 07:03:01 2006 Subject: [Numpy-discussion] Windows Build with optimized libraries Message-ID: <002301c63a1c$80b0e610$6363630a@dsp.sun.ac.za> Hello all I've got the latest NumPy from SVN building with optimized ATLAS 3.7.11 and CLAPACK on Windows. I've also replaced the CLAPACK functions that are provided by ATLAS with the ATLAS ones. The current page on building with Windows: http://www.scipy.org/Installing_SciPy/Windows doesn't have instructions to do this, so I'd like to add info on building with MinGW, Visual Studio .NET 2003, Visual C++ Toolkit 2003 and Visual C++ 2005 Express Edition (if I can figure out how to make distutils detect Visual C++ 2005). I had to change the build scripts in a few places to get things to work. I've attached the patch if someone is interested in committing it to SVN. Briefly, I did the following: 1. Built ATLAS 3.7.11 with Cygwin. I copied libatlas.a as atlas.lib and libcblas.a as cblas.lib to some directory, say c:\tmp\numpylibs. 2. Built CLAPACK 3.0 for Windows with Visual Studio .NET 2003. 
I added cblaswr.c to clapack.lib and disabled building of the other projects, except for libI77 and libF77. I also changed the project properties of these three projects to use SSE2 instructions (under C/C++ | Code Generation | Enable Enhanced Instruction Set). I don't know if this makes much difference though (anybody have some benchmarks?). 3. I then took release builds of clapack.lib, libF77.lib and libI77.lib and rolled them together with ATLAS's liblapack.a: cp clapack.lib lapack.lib ar x liblapack.a mkdir Release ar x libI77.lib ar x libF77.lib ar r lapack.lib Release/*.obj *.o This adds the symbols from libI77 and libF77 to the library and replaces any existing symbols with the symbols from the ATLAS LAPACK library. I copied this lapack.lib to c:\tmp\numpylibs. 4. I created the file numpy\numpy\distutils\site.cfg with contents: [atlas] library_dirs = c:\tmp\numpylibs atlas_libs = cblas,atlas [lapack] library_dirs = c:\tmp\numpylibs lapack_libs = lapack 5.1. Visual Studio: python setup.py bdist_wininst 5.2. MinGW: python setup.py config --compiler=mingw32 build --compiler=mingw32 bdist_wininst That's it. The build generated a shiny numpy-0.9.6.2168.win32-py2.4.exe. A quick question: it seems that NumPy can also use FFTW 2.1.5 to speed up its FFT functions. Is this the case? If so, I'll take a look at building FFTW 2.1.5 on Windows too. fftw.org's links to solution files for 2.1.3 are broken, so I'll probably have to make new ones. Hope this helps. Regards Albert -------------- next part -------------- A non-text attachment was scrubbed... Name: numpy-windowsbuild.diff Type: application/octet-stream Size: 2117 bytes Desc: not available URL: From ndarray at mac.com Sat Feb 25 08:17:01 2006 From: ndarray at mac.com (Sasha) Date: Sat Feb 25 08:17:01 2006 Subject: [Numpy-discussion] Re: inconsistent use of axis= keyword argument? In-Reply-To: References: Message-ID: > >>Vidar Gundersen wrote: > > The vec-operator transforms a matrix into a vector by stacking its > > columns one underneath the other. >>> x matrix([[1, 2], [3, 4]]) >>> matrix([[1,2],[3,4]]).T.ravel() matrix([[1, 3, 2, 4]]) From robert.kern at gmail.com Sat Feb 25 09:52:04 2006 From: robert.kern at gmail.com (Robert Kern) Date: Sat Feb 25 09:52:04 2006 Subject: [Numpy-discussion] Re: Fwd: Re: inconsistent use of axis= keyword argument? In-Reply-To: References: Message-ID: Nils Wagner wrote: >> Hi Robert, >> >>Is it possible to use aliases for these operators ? >> >>In textbooks (Matrix Algebra by Abadir and Magnus, Cambridge University >>Press (2005)) you will find >> >>The vec-operator transforms a matrix into a vector by stacking its >>columns one underneath the other. It's possible to add some aliases, sure. No, we're not going to do it. There are already too many different names for the same thing in numpy because of backwards compatibility. We should not exacerbate the problem. -- Robert Kern robert.kern at gmail.com "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From pebarrett at gmail.com Sat Feb 25 09:57:02 2006 From: pebarrett at gmail.com (Paul Barrett) Date: Sat Feb 25 09:57:02 2006 Subject: [Numpy-discussion] Why multiple ellipses? In-Reply-To: <43FF9FD8.8090704@ee.byu.edu> References: <43FF9FD8.8090704@ee.byu.edu> Message-ID: <40e64fa20602250956r74e69ac4q33548b8a6ef763c0@mail.gmail.com> On 2/24/06, Travis Oliphant wrote: > > Sasha wrote: > > >Numpy allows multiple ellipses in indexing expressions, but I am not > >sure if that is useful. 
AFAIK, ellipsis stands for "as many :'s as > >needed", but if there is more than one, how do I know how many :'s > >each of them represents: > > > > > It should be that the first ellipsis is interpreted as an ellipsis. Any > others are silently converted to ':' characters. > > > > > > >>>>x = arange(8) > >>>>x.shape=(2,2,2) > >>>>x[0,...,0,...] > >>>> > >>>> > >array([0, 1]) > > > > > This is equivalent to > > x[0,...,0,:] > > which is equivalent to > > x[0,0,:] (because the ellipsis is interpreted as nothing). > > >>>>x[0,0,:] > >>>> > >>>> > >array([0, 1]) > > > > > >>>>x[0,:,0] > >>>> > >>>> > >array([0, 2]) > > > >In the example above, the first ellipsis represents no :'s and the > >last one represents one. Is that the current rule that the last > >ellipsis represents all the needed :'s? What is the possible use for > >that? > > > > > > > The rule is that only the first ellipsis (from left to right) is used > and any others are just another spelling of ':'. > > This is the rule that Numeric implemented and so it's what we've kept. > I have no idea what the use might be, but I saw changing the rule as > gratuitous breakage. This might be a good time to change this behavior, since I've yet to find a good reason for keeping it. Maybe we can depricate it until version 1.0. -- Paul Thus, only one ellipsis is actually treated like an ellipse. Everything > else is treated as ':' > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndarray at mac.com Sat Feb 25 10:14:05 2006 From: ndarray at mac.com (Sasha) Date: Sat Feb 25 10:14:05 2006 Subject: [Numpy-discussion] Why multiple ellipses? In-Reply-To: <40e64fa20602250956r74e69ac4q33548b8a6ef763c0@mail.gmail.com> References: <43FF9FD8.8090704@ee.byu.edu> <40e64fa20602250956r74e69ac4q33548b8a6ef763c0@mail.gmail.com> Message-ID: On 2/25/06, Paul Barrett wrote: > ... > On 2/24/06, Travis Oliphant wrote: > > ... > > The rule is that only the first ellipsis (from left to right) is used > > and any others are just another spelling of ':'. > > ... > > This might be a good time to change this behavior, since I've yet to find a > good reason for keeping it. Maybe we can depricate it until version 1.0. > I am very much supporting deprecation. The distinction between '...' and ':' is hard enough to explain without '...' treated as ':' in some cases. I would suggest to allow it in 1.0, but issue python deprecation warning with a text message "repeated ellipses replaced by :'s". From oliphant.travis at ieee.org Sat Feb 25 11:55:10 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Sat Feb 25 11:55:10 2006 Subject: [Numpy-discussion] Why multiple ellipses? In-Reply-To: References: <43FF9FD8.8090704@ee.byu.edu> <40e64fa20602250956r74e69ac4q33548b8a6ef763c0@mail.gmail.com> Message-ID: <4400B5EB.10808@ieee.org> Sasha wrote: >I am very much supporting deprecation. The distinction between '...' >and ':' is hard enough to explain without '...' treated as ':' in some >cases. I would suggest to allow it in 1.0, but issue python >deprecation warning with a text message "repeated ellipses replaced by >:'s". > > I'm fine with that. -Travis From cjw at sympatico.ca Sat Feb 25 12:00:02 2006 From: cjw at sympatico.ca (Colin J. Williams) Date: Sat Feb 25 12:00:02 2006 Subject: [Numpy-discussion] Re: Fwd: Re: inconsistent use of axis= keyword argument? 
In-Reply-To: References: Message-ID: <4400B722.6080109@sympatico.ca> Robert Kern wrote: >Nils Wagner wrote: > > > >>>Hi Robert, >>> >>>Is it possible to use aliases for these operators ? >>> >>>In textbooks (Matrix Algebra by Abadir and Magnus, Cambridge University >>>Press (2005)) you will find >>> >>>The vec-operator transforms a matrix into a vector by stacking its >>>columns one underneath the other. >>> >>> > >It's possible to add some aliases, sure. No, we're not going to do it. There are >already too many different names for the same thing in numpy because of >backwards compatibility. We should not exacerbate the problem. > > > Could these be put into a separate modules and only included when Numeric compatibility is desired? This would help to reduce the clutter. Colin W. Colin W From aisaac at american.edu Sat Feb 25 12:07:29 2006 From: aisaac at american.edu (Alan G Isaac) Date: Sat Feb 25 12:07:29 2006 Subject: [Numpy-discussion] Re: Fwd: Re: inconsistent use of axis= keyword argument? In-Reply-To: <4400B722.6080109@sympatico.ca> References: <4400B722.6080109@sympatico.ca> Message-ID: > Robert Kern wrote: >> It's possible to add some aliases, sure. No, we're not going to do it. There are >> already too many different names for the same thing in numpy because of >> backwards compatibility. We should not exacerbate the problem. On Sat, 25 Feb 2006, "Colin J. Williams" apparently wrote: > Could these be put into a separate modules and only > included when Numeric compatibility is desired? > This would help to reduce the clutter. Beyond this? Cheers, Alan Isaac >>> help(numpy.core.oldnumeric) Help on module numpy.core.oldnumeric in numpy.core: NAME numpy.core.oldnumeric - # Compatibility module containing deprecated names FILE c:\python24\lib\site-packages\numpy\core\oldnumeric.py From robert.kern at gmail.com Sat Feb 25 12:08:05 2006 From: robert.kern at gmail.com (Robert Kern) Date: Sat Feb 25 12:08:05 2006 Subject: [Numpy-discussion] Re: Fwd: Re: inconsistent use of axis= keyword argument? In-Reply-To: <4400B722.6080109@sympatico.ca> References: <4400B722.6080109@sympatico.ca> Message-ID: Colin J. Williams wrote: > Robert Kern wrote: >> It's possible to add some aliases, sure. No, we're not going to do it. >> There are >> already too many different names for the same thing in numpy because of >> backwards compatibility. We should not exacerbate the problem. >> > Could these be put into a separate modules and only included when > Numeric compatibility is desired? Most, if not all, of the core aliases are already isolated in numpy.core.oldnumeric. Some of the other packages also have some aliases _in situ_, too. I would personally like it if the core aliases weren't imported by default, but I think that's a decision that should have been made (one way or the other) some months ago when the first wave of code conversion was going on. I don't want to trigger a second wave of code conversion just for asthetics. -- Robert Kern robert.kern at gmail.com "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From aisaac at american.edu Sat Feb 25 13:25:06 2006 From: aisaac at american.edu (Alan G Isaac) Date: Sat Feb 25 13:25:06 2006 Subject: Fwd: [Numpy-discussion] Re: inconsistent use of axis= keyword argument? In-Reply-To: References: Message-ID: On Sat, 25 Feb 2006, Nils Wagner apparently wrote: > Is it possible to use aliases for these operators ? 
> In textbooks (Matrix Algebra by Abadir and Magnus, > Cambridge University Press (2005)) you will find > The vec-operator transforms a matrix into a vector by > stacking its columns one underneath the other. At http://www.american.edu/econ/pytrix/pyGAUSS.py you will find these vec operations (as GAUSS "look alikes"). hth, Alan Isaac #vec: vectorize columns of 2-D array # Format: y = vec(x) # Input: x RxK 2-D array (or matrix) # Output: y (RK)x1 2-D array: # stacked columns of x # Remarks: ravel OK for non-contiguous arrays # Author: Alan G Isaac (aisaac AT american DOT edu) # Date: 20050420 def vec(x): assert(len(x.shape)==2), "For 2-d arrays only." return x.transpose().ravel().reshape((x.size,-1)) #vecr: vectorize rows of 2-D array # Format: y = vecr(x) # Input: x RxK 2-D array (or matrix) # Output: y (RK)x1 2-D array: # stacked rows of x # Remarks: ravel OK for non-contiguous arrays # Author: Alan G Isaac (aisaac AT american DOT edu) # Date: 20050420 def vecr(x): assert(len(x.shape)==2), "For 2-d arrays only." return x.ravel().reshape((x.size,-1)) From fullung at gmail.com Sun Feb 26 04:52:18 2006 From: fullung at gmail.com (Albert Strasheim) Date: Sun Feb 26 04:52:18 2006 Subject: [Numpy-discussion] Triangular window function Message-ID: <003d01c63ad3$5d4b6350$6363630a@dsp.sun.ac.za> Hello all NumPy already has Hamming, Hanning and some other window functions. It would be useful if the triangular window in SciPy could also be brought over. Any thoughts? Regards Albert From ndarray at mac.com Sun Feb 26 11:31:57 2006 From: ndarray at mac.com (Sasha) Date: Sun Feb 26 11:31:57 2006 Subject: [Numpy-discussion] [SciPy-user] Messing with missing values Message-ID: I am replying on "numpy-discussion" because this is really a numpy rather than scipy topic. > Unfortunately, most of the numpy/scipy functions don't handle missing values > nicely. Can you specify which *numpy* functions are giving you trouble? That should be fixed. > How could I mask the values corresponding to > MA.masked in the final list, without having to check every single element? Latest ma allows you to pass masked arrays directly to ufuncs. In order for this to work a ufunc should be registered in the "domains" and "fills" dictionaries. Not much documentation on this feature exists yet, so you will have to read the code in ma.py to figure this out. > Date: Sat, 25 Feb 2006 18:36:19 -0500 > From: pgmdevlist at mailcan.com > Subject: [SciPy-user] Messing with missing values > To: scipy-user at scipy.net > Message-ID: <200602251836.20406.pgmdevlist at mailcan.com> > Content-Type: text/plain; charset="us-ascii" > > Folks, > Most of the data I work with have missing values, and I rely on 'ma' a lot. > Unfortunately, most of the numpy/scipy functions don't handle missing values > nicely. Not a problem I thought, I just have to adapt the functions I need. > I have 2 options: wrapping the initial functions in tests, or recode the > initial functions. > * According to your experience, which is the most efficient way to go ? > * I have a function that outputs either a float or MA.masked. I call this > function recursively, appending the result to a list, and then trying to > process the list as an array. How could I mask the values corresponding to > MA.masked in the final list, without having to check every single element? 
> > Thanks for your ideas From robert.kern at gmail.com Sun Feb 26 12:28:08 2006 From: robert.kern at gmail.com (Robert Kern) Date: Sun Feb 26 12:28:08 2006 Subject: [Numpy-discussion] Re: Triangular window function In-Reply-To: <003d01c63ad3$5d4b6350$6363630a@dsp.sun.ac.za> References: <003d01c63ad3$5d4b6350$6363630a@dsp.sun.ac.za> Message-ID: Albert Strasheim wrote: > Hello all > > NumPy already has Hamming, Hanning and some other window functions. It would > be useful if the triangular window in SciPy could also be brought over. > > Any thoughts? Please, no more. Of course it would be useful if a function in scipy were brought over to numpy. *Any* function in scipy. Without a more compelling reason, like special functions that are provided by C99, I'm -1 on moving anything else from scipy to numpy. -- Robert Kern robert.kern at gmail.com "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From pgmdevlist at mailcan.com Sun Feb 26 21:20:03 2006 From: pgmdevlist at mailcan.com (pgmdevlist at mailcan.com) Date: Sun Feb 26 21:20:03 2006 Subject: [Numpy-discussion] Re: [SciPy-user] Messing with missing values In-Reply-To: References: Message-ID: <200602270019.24151.pgmdevlist@mailcan.com> On Sunday 26 February 2006 14:19, Sasha wrote: > I am replying on "numpy-discussion" because this is really a numpy > rather than scipy topic. My bad, sorry for that. > > Unfortunately, most of the numpy/scipy functions don't handle missing > > values nicely. > > Can you specify which *numpy* functions are giving you trouble? > That should be fixed. Typical examples: median, stdev, diff... `stdev` is obvious, `median` is straightforward for 1d arrays (and I'm still looking for an optimal method for higher dimensions). The couple of `shape_base` functions I tried (`hstack`, `column_stack`..) required filling the array beforehand and superimposing the corresponding mask. Or even some methods such as `ndim` (more for convenience than anything, a `len(x.shape)` does the trick for both masked & unmasked versions), or r_[]. I remember a message a couple of weeks ago wondering whether ma should be kept up to date with the rest of numpy (and of course, I can't find the reference right now). What's the status on ma? > > How could I mask the values corresponding to > > MA.masked in the final list, without having to check every single > > element? > > Latest ma allows you to pass masked arrays directly to ufuncs. In > order for this to work a ufunc should be registered in the "domains" > and "fills" dictionaries. Not much documentation on this feature > exists yet, so you will have to read the code in ma.py to figure this > out. Let's take the `median` example for 2D arrays. I end up with something like: --- med = [] for x_i in x: med.append(median1d(x_i.compressed())) --- with `median1d` a slightly modified version of the basic numpy `median`, outputting `MA.masked` if `x_i.compressed()` is `None`. I need the `med` list to be a masked_array. Paul Dubois suggests: --- return ma.array(med, mask=[x is ma.masked for x in med]) --- I guess that's more efficient than the --- return MA.masked_values(med.filled(nodata),nodata) --- I had come up with. AAMOF, it seems even faster to hardcode the `median1d` part in the loop. But yes, I'm gonna check the sources for the ufunc. Thanks again.
-- Pierre GM From zpincus at stanford.edu Sun Feb 26 21:35:01 2006 From: zpincus at stanford.edu (Zachary Pincus) Date: Sun Feb 26 21:35:01 2006 Subject: [Numpy-discussion] Simplest ndarray subclass __new__ possible? Message-ID: <39FC8FA0-48DE-408D-8AD8-DDF72474F1C1@stanford.edu> Hi folks, I'm trying to write an ndarray subclass with a constructor like the matrix constructor -- one which can take matrix objects, array objects, or things that can be turned into array objects. I've copied the __new__ method from matrix (and tried to eliminate the matrix-specific stuff), but there's a lot of code there. So I'm trying to figure out what the absolute minimum I need is for correct behavior. (This would be a useful wiki entry somewhere. In fact, a whole page about subclassing ndarray would be good.) What follows is what I have so far. Have I missed anything, or can anything else be removed? Zach class contour(numpy.ndarray): def __new__(subtype, data, dtype=None, copy=True): ##### Do I need this first if block? ##### Wouldn't the second block would do fine on its own? if isinstance(data, contour): dtype2 = data.dtype if (dtype is None): dtype = dtype2 if (dtype2 == dtype) and (not copy): return data return data.astype(dtype) if isinstance(data, numpy.ndarray): if dtype is None: intype = data.dtype else: intype = numpy.dtype(dtype) new = data.view(contour) if intype != data.dtype: return new.astype(intype) if copy: return new.copy() else: return new # now convert data to an array arr = numpy.array(data, dtype=dtype, copy=copy) ##### Do I need this if block? if not (arr.flags.fortran or arr.flags.contiguous): arr = arr.copy() ##### Do I need the fortran flag? ret = numpy.ndarray.__new__(subtype, arr.shape, arr.dtype, buffer=arr, fortran=arr.flags.fortran) return ret From schofield at ftw.at Mon Feb 27 02:57:03 2006 From: schofield at ftw.at (Ed Schofield) Date: Mon Feb 27 02:57:03 2006 Subject: [Numpy-discussion] Sparse matrix hooks Message-ID: <4402DAA8.3030501@ftw.at> I'm trying to improve integration between SciPy's sparse matrices and NumPy's dense array/matrix objects. One problem I'm facing is that NumPy matrices redefine the * operator to call NumPy's dot() function. Since dot() has no awareness of SciPy's sparse matrix objects, this doesn't work for the operation 'dense * sparse'. (It does work for sparse * dense, which calls sparse.__mul__ first.) I'd like to propose the addition of a basic sparse matrix object to NumPy. This wouldn't need to provide any actual functionality, but could instead provide a skeletal interface or base class from which sparse matrices in other packages (SciPy and potentially PySparse) could derive. The spmatrix object in SciPy would be a good starting point. The benefit of this would be that hooks for proper handling of sparse matrices would be easy to provide for functions like dot(), where(), and var(). There may be other ways to make 'dense * sparse' work in SciPy, but I haven't been able to come up with any. This solution would at least be quite flexible and quite straightforward. -- Ed From pearu at scipy.org Mon Feb 27 03:09:04 2006 From: pearu at scipy.org (Pearu Peterson) Date: Mon Feb 27 03:09:04 2006 Subject: [Numpy-discussion] Sparse matrix hooks In-Reply-To: <4402DAA8.3030501@ftw.at> References: <4402DAA8.3030501@ftw.at> Message-ID: On Mon, 27 Feb 2006, Ed Schofield wrote: > > I'm trying to improve integration between SciPy's sparse matrices and > NumPy's dense array/matrix objects. 
One problem I'm facing is that > NumPy matrices redefine the * operator to call NumPy's dot() function. > Since dot() has no awareness of SciPy's sparse matrix objects, this > doesn't work for the operation 'dense * sparse'. (It does work for > sparse * dense, which calls sparse.__mul__ first.) Have you tried defining sparse.__rmul__? dense.__mul__ should raise an exception when it does not know about the rhs operant and then Python calls .__rmul__. Pearu From schofield at ftw.at Mon Feb 27 04:38:01 2006 From: schofield at ftw.at (Ed Schofield) Date: Mon Feb 27 04:38:01 2006 Subject: [Numpy-discussion] Sparse matrix hooks In-Reply-To: References: <4402DAA8.3030501@ftw.at> Message-ID: <4402F293.8050606@ftw.at> Pearu Peterson wrote: > On Mon, 27 Feb 2006, Ed Schofield wrote: > >> I'm trying to improve integration between SciPy's sparse matrices and >> NumPy's dense array/matrix objects. One problem I'm facing is that >> NumPy matrices redefine the * operator to call NumPy's dot() function. >> Since dot() has no awareness of SciPy's sparse matrix objects, this >> doesn't work for the operation 'dense * sparse'. (It does work for >> sparse * dense, which calls sparse.__mul__ first.) > > Have you tried defining sparse.__rmul__? dense.__mul__ should raise > an exception when it does not know about the rhs operant and then > Python calls .__rmul__. Yes, we've defined __rmul__, and this works fine for dense arrays, whose __mul__ raises an exception. The problem is that matrix.__mul__ calls dot(), which doesn't raise an exception, but rather creates an oddball object array: matrix([[ (1, 0) 0.0 (2, 1) 0.0 (3, 0) 0.0, (1, 0) 0.0 (2, 1) 0.0 (3, 0) 0.0, (1, 0) 0.0 (2, 1) 0.0 (3, 0) 0.0]], dtype=object) We could potentially modify the __mul__ function of numpy's matrix objects to make a guess about whether an array constructed out of the argument will somehow be sane or whether, like here, it should raise an exception. But this would be difficult to get right, since the sparse matrix formats are quite varied (some supporting the map/sequence protocols, some not, etc.). But being able to test isinstance(arg, spmatrix) would make this easy. -- Ed From cjw at sympatico.ca Mon Feb 27 05:07:51 2006 From: cjw at sympatico.ca (Colin J. Williams) Date: Mon Feb 27 05:07:51 2006 Subject: [Numpy-discussion] Simplest ndarray subclass __new__ possible? In-Reply-To: <39FC8FA0-48DE-408D-8AD8-DDF72474F1C1@stanford.edu> References: <39FC8FA0-48DE-408D-8AD8-DDF72474F1C1@stanford.edu> Message-ID: <4402F958.7010902@sympatico.ca> Zachary Pincus wrote: > Hi folks, > > I'm trying to write an ndarray subclass with a constructor like the > matrix constructor -- one which can take matrix objects, array > objects, or things that can be turned into array objects. > > I've copied the __new__ method from matrix (and tried to eliminate > the matrix-specific stuff), but there's a lot of code there. So I'm > trying to figure out what the absolute minimum I need is for correct > behavior. (This would be a useful wiki entry somewhere. In fact, a > whole page about subclassing ndarray would be good.) > > What follows is what I have so far. Have I missed anything, or can > anything else be removed? > > Zach > > class contour(numpy.ndarray): > def __new__(subtype, data, dtype=None, copy=True): > > ##### Do I need this first if block? > ##### Wouldn't the second block would do fine on its own? 
> if isinstance(data, contour): > dtype2 = data.dtype > if (dtype is None): > dtype = dtype2 > if (dtype2 == dtype) and (not copy): > return data > return data.astype(dtype) > > if isinstance(data, numpy.ndarray): > if dtype is None: > intype = data.dtype > else: > intype = numpy.dtype(dtype) > new = data.view(contour) > if intype != data.dtype: > return new.astype(intype) > if copy: return new.copy() > else: return new > > # now convert data to an array > arr = numpy.array(data, dtype=dtype, copy=copy) > > ##### Do I need this if block? > if not (arr.flags.fortran or arr.flags.contiguous): > arr = arr.copy() > > ##### Do I need the fortran flag? > ret = numpy.ndarray.__new__(subtype, arr.shape, arr.dtype, > buffer=arr, fortran=arr.flags.fortran) > return ret > > Would there be any merit in breaking this into two parts, __new__ to allocate space and __init__ to initialize the data? Colin W. From pearu at scipy.org Mon Feb 27 05:10:03 2006 From: pearu at scipy.org (Pearu Peterson) Date: Mon Feb 27 05:10:03 2006 Subject: [Numpy-discussion] Sparse matrix hooks In-Reply-To: <4402F293.8050606@ftw.at> References: <4402DAA8.3030501@ftw.at> <4402F293.8050606@ftw.at> Message-ID: On Mon, 27 Feb 2006, Ed Schofield wrote: > Pearu Peterson wrote: >> On Mon, 27 Feb 2006, Ed Schofield wrote: >> >>> I'm trying to improve integration between SciPy's sparse matrices and >>> NumPy's dense array/matrix objects. One problem I'm facing is that >>> NumPy matrices redefine the * operator to call NumPy's dot() function. >>> Since dot() has no awareness of SciPy's sparse matrix objects, this >>> doesn't work for the operation 'dense * sparse'. (It does work for >>> sparse * dense, which calls sparse.__mul__ first.) >> >> Have you tried defining sparse.__rmul__? dense.__mul__ should raise >> an exception when it does not know about the rhs operant and then >> Python calls .__rmul__. > > Yes, we've defined __rmul__, and this works fine for dense arrays, whose > __mul__ raises an exception. The problem is that matrix.__mul__ calls > dot(), which doesn't raise an exception, but rather creates an oddball > object array: > > matrix([[ (1, 0) 0.0 > (2, 1) 0.0 > (3, 0) 0.0, > (1, 0) 0.0 > (2, 1) 0.0 > (3, 0) 0.0, > (1, 0) 0.0 > (2, 1) 0.0 > (3, 0) 0.0]], dtype=object) > > > We could potentially modify the __mul__ function of numpy's matrix > objects to make a guess about whether an array constructed out of the > argument will somehow be sane or whether, like here, it should raise an > exception. But this would be difficult to get right, since the sparse > matrix formats are quite varied (some supporting the map/sequence > protocols, some not, etc.). But being able to test isinstance(arg, > spmatrix) would make this easy. Sure, isinstance(arg,spmatrix) would work but it is not a general solution for performing binary operations with matrices and such user defined objects that numpy is not aware of. But these objects may be aware of numpy matrices or arrays. Sparse matrix is one example. Other example is defining a symbolic matrix. Etc. So, IMHO matrix.__mul__ (or dot) should be fixed instead. Pearu From martin.wiechert at gmx.de Mon Feb 27 05:47:00 2006 From: martin.wiechert at gmx.de (Martin Wiechert) Date: Mon Feb 27 05:47:00 2006 Subject: [Numpy-discussion] object members Message-ID: <200602271436.12659.martin.wiechert@gmx.de> Hi developers, any plans on when object members will be back? a = ndarray (shape = (10,), dtype = {'names': ['x'], 'formats': ['|O4']}) TypeError: fields with object members not yet supported. 
in numpy 0.9.5. Of course, this is better than the segfault in 0.9.4, but it would be quite inconvenient for my project to not have object members. My C code still produces dtypes with object members. Can I safely use them, as long as I make sure new arrays are properly initialised? Thanks, Martin. From schofield at ftw.at Mon Feb 27 06:41:08 2006 From: schofield at ftw.at (Ed Schofield) Date: Mon Feb 27 06:41:08 2006 Subject: [Numpy-discussion] Sparse matrix hooks In-Reply-To: References: <4402DAA8.3030501@ftw.at> <4402F293.8050606@ftw.at> Message-ID: <44030F74.1060508@ftw.at> Pearu Peterson wrote: > On Mon, 27 Feb 2006, Ed Schofield wrote: >> Yes, we've defined __rmul__, and this works fine for dense arrays, whose >> __mul__ raises an exception. The problem is that matrix.__mul__ calls >> dot(), which doesn't raise an exception, but rather creates an oddball >> object array: >> >> matrix([[ (1, 0) 0.0 >> (2, 1) 0.0 >> (3, 0) 0.0, >> (1, 0) 0.0 >> (2, 1) 0.0 >> (3, 0) 0.0, >> (1, 0) 0.0 >> (2, 1) 0.0 >> (3, 0) 0.0]], dtype=object) >> >> >> We could potentially modify the __mul__ function of numpy's matrix >> objects to make a guess about whether an array constructed out of the >> argument will somehow be sane or whether, like here, it should raise an >> exception. But this would be difficult to get right, since the sparse >> matrix formats are quite varied (some supporting the map/sequence >> protocols, some not, etc.). But being able to test isinstance(arg, >> spmatrix) would make this easy. > > Sure, isinstance(arg,spmatrix) would work but it is not a general > solution for performing binary operations with matrices and such user > defined objects that numpy is not aware of. But these objects may be > aware of numpy matrices or arrays. Sparse matrix is one example. Other > example is defining a symbolic matrix. Etc. > So, IMHO matrix.__mul__ (or dot) should be fixed instead. Ah, yes, this could be the simplest solution (at least to the __mul__ problem). We could redefine matrix.__mul__ as def __mul__(self, other): if isinstance(other, N.ndarray) or not hasattr(other, '__rmul__') \ or N.isscalar(other): return N.dot(self, other) else: return NotImplemented This seems to fix multiplication. I may make a case later for sparse matrix hooks for other functions, but I don't see a pressing need right now. ;) -- Ed From ndarray at mac.com Mon Feb 27 08:14:02 2006 From: ndarray at mac.com (Sasha) Date: Mon Feb 27 08:14:02 2006 Subject: [Numpy-discussion] Re: [SciPy-user] Messing with missing values In-Reply-To: <200602270019.24151.pgmdevlist@mailcan.com> References: <200602270019.24151.pgmdevlist@mailcan.com> Message-ID: On 2/27/06, pgmdevlist at mailcan.com wrote: > ... > I remmbr a message a couple of weeks ago wondering whether ma should be kpet > uptodate with the rest of numpy (and of course, I can't find the reference > right now). What's the status on ma ? > Ma will be supported in numpy. See FAQ "Does NumPy support nan ("not a number")?." Ma development page is at http://projects.scipy.org/scipy/numpy/wiki/MaskedArray . Feel free to add contents there. I would welcome a section listing numpy functions that are still not available in ma. > Let's take the `median` example for 2D arrays. Median is one of those examples where Paul's recommendation does not work because missing values should be ignored rather than filled. 
For example, in R median has two modes: to ignore missing values and to return missing value if any value is missing: > median(c(1,NA)) [1] NA > help(median) > median(c(1,NA),na.rm=TRUE) [1] 1 > median(c(1,0)) [1] 0.5 From zpincus at stanford.edu Mon Feb 27 08:50:10 2006 From: zpincus at stanford.edu (Zachary Pincus) Date: Mon Feb 27 08:50:10 2006 Subject: [Numpy-discussion] Simplest ndarray subclass __new__ possible? In-Reply-To: <4402F958.7010902@sympatico.ca> References: <39FC8FA0-48DE-408D-8AD8-DDF72474F1C1@stanford.edu> <4402F958.7010902@sympatico.ca> Message-ID: <44724794-B6E1-4927-8D63-F3BC74F2CD10@stanford.edu> On Feb 27, 2006, at 5:06 AM, Colin J. Williams wrote: > [snip] > Would there be any merit in breaking this into two parts, __new__ > to allocate space and __init__ to initialize the data? What I presented below is exactly the __new__ grabbed from the matrix class definition in defmatrix.py (less matrix-specific stuff). I assume it's overkill for what I need, but it seemed like a good starting place. Figuring out what bits are really necessary and what bits aren't would be step 1, before I can even think about whether some of the bits should really live in __init__ (though I can't actually see any of the below lining in init, because most of the code is given over to deciding whether or not to allocate space). Zach > Zachary Pincus wrote: >> >> What follows is what I have so far. Have I missed anything, or >> can anything else be removed? >> >> Zach >> >> class contour(numpy.ndarray): >> def __new__(subtype, data, dtype=None, copy=True): >> >> ##### Do I need this first if block? >> ##### Wouldn't the second block would do fine on its own? >> if isinstance(data, contour): >> dtype2 = data.dtype >> if (dtype is None): >> dtype = dtype2 >> if (dtype2 == dtype) and (not copy): >> return data >> return data.astype(dtype) >> >> if isinstance(data, numpy.ndarray): >> if dtype is None: >> intype = data.dtype >> else: >> intype = numpy.dtype(dtype) >> new = data.view(contour) >> if intype != data.dtype: >> return new.astype(intype) >> if copy: return new.copy() >> else: return new >> >> # now convert data to an array >> arr = numpy.array(data, dtype=dtype, copy=copy) >> >> ##### Do I need this if block? >> if not (arr.flags.fortran or arr.flags.contiguous): >> arr = arr.copy() >> >> ##### Do I need the fortran flag? >> ret = numpy.ndarray.__new__(subtype, arr.shape, arr.dtype, >> buffer=arr, fortran=arr.flags.fortran) >> return ret >> >> > Would there be any merit in breaking this into two parts, __new__ > to allocate space and __init__ to initialize the data? > > Colin W. > > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting > language > that extends applications into web and mobile media. Attend the > live webcast > and join the prime developer group breaking into this new coding > territory! > http://sel.as-us.falkag.net/sel? 
> cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion From oliphant.travis at ieee.org Mon Feb 27 10:10:02 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Mon Feb 27 10:10:02 2006 Subject: [Numpy-discussion] object members In-Reply-To: <200602271436.12659.martin.wiechert@gmx.de> References: <200602271436.12659.martin.wiechert@gmx.de> Message-ID: <4403403B.1050405@ieee.org> Martin Wiechert wrote: >Hi developers, > >any plans on when object members will be back? > > Probably not for a while unless somebody wants to tackle the issues that are present. >My C code still produces dtypes with object members. Can I safely use them, as >long as I make sure new arrays are properly initialised? > > You can only safely use dtypes that are a single OBJECT array. If a record data-type has object members, it is not accounted for in all of the code that special-checks for object arrays. Basically, all of that code would need to be adjusted to deal with has-object (i.e. void arrays that have a part of their memory layout that is an object). The issue is that reference counts need to be handled correctly for that portion of the data-record. This was not considered as the code was written, so it would involve a bit of work to get right. It absolutely could be done, but it's not on my priority list and my time for working on NumPy has dwindled. -Travis From ndarray at mac.com Mon Feb 27 10:17:16 2006 From: ndarray at mac.com (Sasha) Date: Mon Feb 27 10:17:16 2006 Subject: [Numpy-discussion] Unexpected property of ndarray.fill Message-ID: The function ndarray.fill is documented as taking a scalar value, but in the current implementation it accepts arrays as well and ignores all but the first element. Example 1: >>> x = empty(10) >>> x.fill([1,2]) >>> x array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1]) Example 2: >>> y = empty((2,3)) >>> y.fill([1,2,3]) >>> y array([[1, 1, 1], [1, 1, 1]]) I believe this is somewhat unexpected. I would expect one of the following to apply: 1. Exception in both examples 2. Exception in Example 1 and array([[1,2,3], [1,2,3]]) in Example 2 3. array([1, 2, 1, 2, 1, 2, 1, 2, 1, 2]) in Example 1 and array([[1,2,3], [1,2,3]]) in Example 2 From oliphant at ee.byu.edu Mon Feb 27 11:26:04 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Mon Feb 27 11:26:04 2006 Subject: [Numpy-discussion] subclassing ndaray In-Reply-To: <20060225083924.GF21117@alpha> References: <43FF2C92.3060304@sympatico.ca> <20060224191426.GD21117@alpha> <43FFB580.80307@ieee.org> <20060225083924.GF21117@alpha> Message-ID: <44035227.9010609@ee.byu.edu> Stefan van der Walt wrote: >>The __init__ and __new__ methods are not called because they may have >>arbitrary signatures. Instead, the __array_finalize__ method is always >>called. So, you should use that instead of __init__. >> >> This is now true in SVN. Previously, __array_finalize__ was not called if the "parent" was NULL. However, now, it is still called with None as the value of the first argument. Thus __array_finalize__ will be called whenever ndarray.__new__(<subclass>, ...) is called.
-Travis From oliphant at ee.byu.edu Mon Feb 27 11:38:03 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Mon Feb 27 11:38:03 2006 Subject: [Numpy-discussion] subclassing ndaray In-Reply-To: <20060227190844.GA28750@alpha> References: <43FF2C92.3060304@sympatico.ca> <20060224191426.GD21117@alpha> <43FFB580.80307@ieee.org> <20060225083924.GF21117@alpha> <44033E75.3010200@ieee.org> <20060227190844.GA28750@alpha> Message-ID: <440354DB.5020006@ee.byu.edu> Stefan van der Walt wrote: >If I understand correctly, the __array_finalize__ method should copy >all meta data from the parent object to the child. In other words, I >might have something like: > >def __init__(self, ...): > self.v = 3 > >def __array_finalize__(self, parent): > self.v = parent.v > >I do not want to do something like > >def __array_finalize__(self, parent): > self.v = 3 > >Because then, every time I do an operation on my array, self.v will be >reset. Shouldn't array_finalize look for such methods/properties and copy >them automatically? > You need to set up __array_finalize__ to do that. I did not want to do that for every sub-class because different things are required for different sub classes. So, this seems like a workable compromise. It should work a bit better now that I've changed it so that __array_finalize__ is called on every sub-class creation. If there is no "parent" then parent will be None. -Travis From aisaac at american.edu Mon Feb 27 11:50:08 2006 From: aisaac at american.edu (Alan G Isaac) Date: Mon Feb 27 11:50:08 2006 Subject: [Numpy-discussion] Unexpected property of ndarray.fill In-Reply-To: References: Message-ID: On Mon, 27 Feb 2006, Sasha apparently wrote: > 3. array([1, 2, 1, 2, 1, 2, 1, 2, 1, 2]) in Example 1 and > array([[1,2,3], [1,2,3]]) in Example 2 A user preference for 3. And if the fill array is too long, the rest should be discarded. Cheers, Alan Isaac From chanley at stsci.edu Mon Feb 27 11:57:02 2006 From: chanley at stsci.edu (Christopher Hanley) Date: Mon Feb 27 11:57:02 2006 Subject: [Numpy-discussion] subclassing ndaray In-Reply-To: <440354DB.5020006@ee.byu.edu> References: <43FF2C92.3060304@sympatico.ca> <20060224191426.GD21117@alpha> <43FFB580.80307@ieee.org> <20060225083924.GF21117@alpha> <44033E75.3010200@ieee.org> <20060227190844.GA28750@alpha> <440354DB.5020006@ee.byu.edu> Message-ID: <4403593F.4090201@stsci.edu> Travis Oliphant wrote: > So, this seems like a workable compromise. It should work a bit better > now that I've changed it so that __array_finalize__ is called on every > sub-class creation. If there is no "parent" then parent will be None. 
Hi Travis, The change you made just broke the following bit of code: class FITS_rec(rec.recarray): def __new__(subtype, input): """Construct a FITS record array from a recarray.""" # input should be a record array self = rec.recarray.__new__(subtype, input.shape, input.dtype, buf=input.data, strides=input.strides) self._nfields = len(self.dtype.fields[-1]) self._convert = [None]*len(self.dtype.fields[-1]) self._coldefs = None return self def __array_finalize__(self,obj): self._convert = obj._convert self._coldefs = obj._coldefs self._nfields = obj._nfields In [1]: from pyfits import FITS_rec In [2]: from numpy import rec In [3]: data = FITS_rec(rec.array(None,formats="i4,i4,i4",names="c1,c2,c3",shape=3)) --------------------------------------------------------------------------- exceptions.AttributeError Traceback (most recent call last) /data/sparty1/dev/pyfits-numpy/test/ /data/sparty1/dev/site-packages/lib/python/pyfits.py in __new__(subtype, input) 3136 # self.__setstate__(input.__getstate__()) 3137 self = rec.recarray.__new__(subtype, input.shape, input.dtype, -> 3138 buf=input.data, strides=input.strides) 3139 3140 # _parent is the original (storage) array, /data/sparty1/dev/site-packages/lib/python/numpy/core/records.py in __new__(subtype, shape, formats, names, titles, buf, offset, strides, byteorder, aligned) 153 self = sb.ndarray.__new__(subtype, shape, (record, descr), 154 buffer=buf, offset=offset, --> 155 strides=strides) 156 return self 157 /data/sparty1/dev/site-packages/lib/python/pyfits.py in __array_finalize__(self, obj) 3150 def __array_finalize__(self,obj): 3151 # self._parent = obj._parent -> 3152 self._convert = obj._convert 3153 self._coldefs = obj._coldefs 3154 self._nfields = obj._nfields AttributeError: 'NoneType' object has no attribute '_convert' Given what you have just said I would not have expected this to be broken. Chris From oliphant at ee.byu.edu Mon Feb 27 12:01:06 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Mon Feb 27 12:01:06 2006 Subject: [Numpy-discussion] subclassing ndaray In-Reply-To: <4403593F.4090201@stsci.edu> References: <43FF2C92.3060304@sympatico.ca> <20060224191426.GD21117@alpha> <43FFB580.80307@ieee.org> <20060225083924.GF21117@alpha> <44033E75.3010200@ieee.org> <20060227190844.GA28750@alpha> <440354DB.5020006@ee.byu.edu> <4403593F.4090201@stsci.edu> Message-ID: <44035A5E.2030100@ee.byu.edu> Christopher Hanley wrote: > Travis Oliphant wrote: > >> So, this seems like a workable compromise. It should work a bit >> better now that I've changed it so that __array_finalize__ is called >> on every sub-class creation. If there is no "parent" then parent >> will be None. > > > Hi Travis, > > The change you made just broke the following bit of code: Right. It's because obj can sometimes be None... > > class FITS_rec(rec.recarray): > def __new__(subtype, input): > """Construct a FITS record array from a recarray.""" > # input should be a record array > self = rec.recarray.__new__(subtype, input.shape, input.dtype, > buf=input.data, strides=input.strides) > > self._nfields = len(self.dtype.fields[-1]) > self._convert = [None]*len(self.dtype.fields[-1]) > self._coldefs = None > return self > > def __array_finalize__(self,obj): > self._convert = obj._convert > self._coldefs = obj._coldefs > self._nfields = obj._nfields Add as the first line. if obj is None: return -Travis > > Given what you have just said I would not have expected this to be > broken. 
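With that guard in place, the method above would read roughly as follows (a sketch of the suggested fix, not tested against pyfits itself):

    def __array_finalize__(self, obj):
        if obj is None:   # ndarray.__new__ was called with no parent array
            return
        self._convert = obj._convert
        self._coldefs = obj._coldefs
        self._nfields = obj._nfields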
The change is that now __array_finalize__ is always called (even if there is no "parent" in which case obj is None). This seems easier to explain and a bit more consistent. Feedback encouraged... -Travis From strawman at astraw.com Mon Feb 27 12:16:03 2006 From: strawman at astraw.com (Andrew Straw) Date: Mon Feb 27 12:16:03 2006 Subject: [Numpy-discussion] bug report: nonzero on masked arrays Message-ID: <44035DEB.6030703@astraw.com> I know there's been some discussion along these lines lately, but I haven't followed it closely and thought I could still give a simple bug report. I can post this on Trac if no one can tackle it immediately -- let me know. --Andrew In [1]: import numpy In [2]: numpy.__file__ Out[2]: '/home/astraw/py2.3-linux-x86_64/lib/python2.3/site-packages/numpy-0.9.6.2172-py2.3-linux-x86_64.egg/numpy/__init__.pyc' In [3]: jj= numpy.ma.masked_array( [0,1,2,3,0,4,5,6], mask=[0,0,0,1,1,1,0,0]) In [4]: numpy.ma.nonzero(jj) --------------------------------------------------------------------------- exceptions.NameError Traceback (most recent call last) /home/astraw/src/kookaburra/flydra/analysis/ /home/astraw/py2.3-linux-x86_64/lib/python2.3/site-packages/numpy-0.9.6.2172-py2.3-linux-x86_64.egg/numpy/core/ma.py in __call__(self, a, *args, **kwargs) 317 else: 318 if m.shape != shape: --> 319 m = mask_or(getmaskarray(a), getmaskarray(b)) 320 return masked_array(result, m) 321 NameError: global name 'b' is not defined In [5]: From paul at pfdubois.com Mon Feb 27 12:55:04 2006 From: paul at pfdubois.com (Paul F. Dubois) Date: Mon Feb 27 12:55:04 2006 Subject: [Numpy-discussion] Re: [SciPy-user] Messing with missing values In-Reply-To: References: <200602270019.24151.pgmdevlist@mailcan.com> Message-ID: <440366F2.8090705@pfdubois.com> Some things don't make sense for missing value arrays. For example, FFT's. Median does not make sense if you fill so you have to use compress. I think for each operation you must decide what you mean exactly. If it fits the "easy" wrapper model, you can do that, but otherwise, you have to code it. Sasha wrote: > On 2/27/06, pgmdevlist at mailcan.com wrote: >> ... >> I remmbr a message a couple of weeks ago wondering whether ma should be kpet >> uptodate with the rest of numpy (and of course, I can't find the reference >> right now). What's the status on ma ? >> > > Ma will be supported in numpy. See FAQ > "Does NumPy support nan ("not a > number")?." > > Ma development page is at > http://projects.scipy.org/scipy/numpy/wiki/MaskedArray . > Feel free to add contents there. I would welcome a section listing > numpy functions that are still not available in ma. > > >> Let's take the `median` example for 2D arrays. > > Median is one of those examples where Paul's recommendation does not > work because missing values should be ignored rather than filled. For > example, in R median has two modes: to ignore missing values and to > return missing value if any value is missing: > >> median(c(1,NA)) > [1] NA >> help(median) >> median(c(1,NA),na.rm=TRUE) > [1] 1 >> median(c(1,0)) > [1] 0.5 > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting language > that extends applications into web and mobile media. Attend the live webcast > and join the prime developer group breaking into this new coding territory! 
> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From aisaac at american.edu Mon Feb 27 14:03:15 2006 From: aisaac at american.edu (Alan G Isaac) Date: Mon Feb 27 14:03:15 2006 Subject: [Numpy-discussion] missing array type Message-ID: The recent discussion of Matlab's repmat plus some recent use of grids leads me to ask: should numpy contain a true repeated array object, where one single copy of the data supports a full set of array operations? (So that, e.g., repmat(x,(3,2)) would simply point to the x data or, if desired, make a single copy of it.) And actually this seems a special case of a truly space-saving Kronecker product. Cheers, Alan Isaac From ndarray at mac.com Mon Feb 27 14:27:13 2006 From: ndarray at mac.com (Sasha) Date: Mon Feb 27 14:27:13 2006 Subject: [Numpy-discussion] missing array type In-Reply-To: References: Message-ID: It looks like what you want is a zero-stride array that I proposed some time ago. See "zeros in strides" thread . I've posted a patch to the list, but it was met with a mild opposition from Travis, so I've never committed it to SVN. The final word was: """ I would also like to get more opinions about Sasha's proposal for zero-stride arrays. -Travis """ If you agree that zero-stride array would provide the functionality that you need, it may tip the balance towards accepting that patch. On 2/27/06, Alan G Isaac wrote: > The recent discussion of Matlab's repmat > plus some recent use of grids leads me to ask: > should numpy contain a true repeated array object, > where one single copy of the data supports > a full set of array operations? (So that, > e.g., repmat(x,(3,2)) would simply point > to the x data or, if desired, make a single > copy of it.) > > And actually this seems a special case of > a truly space-saving Kronecker product. > > Cheers, > Alan Isaac > > > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting language > that extends applications into web and mobile media. Attend the live webcast > and join the prime developer group breaking into this new coding territory! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From ndarray at mac.com Mon Feb 27 14:48:03 2006 From: ndarray at mac.com (Sasha) Date: Mon Feb 27 14:48:03 2006 Subject: [Numpy-discussion] Re: [SciPy-user] Messing with missing values In-Reply-To: <200602271658.44855.pgmdevlist@mailcan.com> References: <200602270019.24151.pgmdevlist@mailcan.com> <200602271658.44855.pgmdevlist@mailcan.com> Message-ID: Please reply to the list rather than in private e-mail. Private e-mail is likely to end up in a spam folder. See more below. On 2/27/06, pierregm wrote: > Sasha, > Thanks for your answer. > > Ma development page is at > > http://projects.scipy.org/scipy/numpy/wiki/MaskedArray . > > Feel free to add contents there. I would welcome a section listing > > numpy functions that are still not available in ma. > > OK, I'll work on that > Great! > > Median is one of those examples where Paul's recommendation does not > > work because missing values should be ignored rather than filled.
For > > example, in R median has two modes: to ignore missing values and to > > return missing value if any value is missing: > > OK, good idea. I already implemented a mamedian for 1d and 2d array. Could you > tell me where I could upload it to be double-checked/tested ? I see three logical locations for patches: 1. Attach to the wiki page: . 2. Post to the list: . 3. Upload to sourceforge: . I don't have any preference, but if you choose 1 or 3, please announce the URL on the list. Also it is best to post patches as an output tof "svn diff". From aisaac at american.edu Mon Feb 27 15:43:02 2006 From: aisaac at american.edu (Alan G Isaac) Date: Mon Feb 27 15:43:02 2006 Subject: [Numpy-discussion] missing array type In-Reply-To: References: Message-ID: On Mon, 27 Feb 2006, Sasha apparently wrote: > If you agree that zero-stride array would provide the > functionality that you need, it may tip the ballance > towards accepting that patch. I am out of my technical depth here. Based on your examples, zero-stride arrays seem both logical and desirable. They do seem just right for a simpler (and substantially generalized) nd_grid. But I think they only partially address problems two other that have come up. 1. repmat http://www.mathworks.com/access/helpdesk/help/techdoc/ref/repmat.html#998661 Perhaps you will say that the best representation of a repmat will use 4 dimensions, two with zero strides? I wonder if a more general cycling is needed for a natural repeated matrix. 2. Kronecker product: http://www.mathworks.com/access/helpdesk/help/techdoc/ref/kron.html#998881 This seems a different issue altogether. I suspect the right way to produce kron(x,y) is usually as a class whose data is x and y, with the Kronecker product never actually stored in memory. I do not see zero-stride arrays as helping here. Cheers, Alan Isaac From ndarray at mac.com Mon Feb 27 16:13:07 2006 From: ndarray at mac.com (Sasha) Date: Mon Feb 27 16:13:07 2006 Subject: [Numpy-discussion] Faster fill Message-ID: I''ve just posted a patch at http://projects.scipy.org/scipy/numpy/wiki/PossibleOptimizationAreas that results in a 6-14x speed-up of ndarray.fill for simple datatypes. That's a bigger change than what I am confortable submiting to svn without a review. Also since I am not familiar with record arrays, I am not sure whether or not this change would break anything in that area. Finally, the patch only adresses a single-segment case, "strided" arrays would still use old code. From oliphant at ee.byu.edu Mon Feb 27 16:22:20 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Mon Feb 27 16:22:20 2006 Subject: [Numpy-discussion] missing array type In-Reply-To: References: Message-ID: <440397A1.9020607@ee.byu.edu> Sasha wrote: >It looks like what you want is a zero-stride array that I proposed >some time ago. See "zeros in strides" thread >. > >I've posted a patch to the list, but it was met with a mild opposition >from Travis, so I've never committed it to SVN. The final word was: > >""" >I would also like to get more opinions about Sasha's proposal for >zero-stride arrays. > >-Travis >""" > >If you agree that zero-stride array would provide the functionality >that you need, it may tip the ballance towards accepting that patch. > > Actually, I think it does. I think 0-stride arrays are acceptable (I think you can make them now but you have to provide your own memory, right?) From one perspective, all we are proposing to do is allow numpy to create the memory *and* allow differently strided arrays, right? 
Now, if it creates the memory, the strides must be C-contiguous or fortran-contiguous. We are going to allow user-specified strides, now, even on memory creation. Sasha, your initial patch was pretty good but I was concerned about the speed of array creation being changed for other cases. If you can speed up PyArray_NewFromDescr (probably by only changing the branch that currently raises an error), then I think your proposed changes should be O.K. The check on the provided strides argument needs to be thought through so that we don't accept strides that will allow walking outside the memory that *is* or *will be* allocated. I have not reviewed your code for this, but I'm assuming you've thought that through? -Travis From oliphant at ee.byu.edu Mon Feb 27 16:44:02 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Mon Feb 27 16:44:02 2006 Subject: [Numpy-discussion] Faster fill In-Reply-To: References: Message-ID: <44039CA8.4080102@ee.byu.edu> Sasha wrote: >I've just posted a patch at >http://projects.scipy.org/scipy/numpy/wiki/PossibleOptimizationAreas >that results in a 6-14x speed-up of ndarray.fill for simple >datatypes. That's a bigger change than what I am comfortable >submitting to svn without a review. Also since I am not familiar with >record arrays, I am not sure whether or not this change would break >anything in that area. Finally, the patch only addresses a >single-segment case, "strided" arrays would still use old code. > > This looks like a good procedure. For those less familiar with the code: Sasha added an additional "data-type-function" to the structure of data-type-specific functions. Each builtin data-type has a pointer to a list of functions. A while ago I moved these functions out from the data-type object so that they could grow. This is probably a good fundamental operation for speeding up. I would probably also not define the misaligned case, but just use the default code for that case as well. This is consistent with the idea that misaligned data will have slower operations. To make your code work for record arrays you need to handle the VOID case. I would just not define it for the void case at this point. Several other data-type functions need to be improved to handle record arrays better anyway (look at setitem and getitem for guidance). You also need to add a check so that if the function pointer is NULL, the optimized function is not called. But, generally, this is the right use of the data-type-functions. Good job. -Travis From jh at oobleck.astro.cornell.edu Mon Feb 27 18:24:05 2006 From: jh at oobleck.astro.cornell.edu (Joe Harrington) Date: Mon Feb 27 18:24:05 2006 Subject: [Numpy-discussion] wiki page for record arrays In-Reply-To: <20060221145310.A69A912C2A@sc8-sf-spam2.sourceforge.net> (numpy-discussion-request@lists.sourceforge.net) References: <20060221145310.A69A912C2A@sc8-sf-spam2.sourceforge.net> Message-ID: <200602280223.k1S2NZL3020466@oobleck.astro.cornell.edu> I fixed a small error. I found myself a bit lost. Are cookbook pages supposed to be introductory, or are they aimed at users who already know a fair bit? From the page, I can vaguely get that record arrays allow you access to your data using text strings as partial indexers, but I found myself focusing so much on testing whether this was true and figuring out how they work that I was completely distracted from the example. A paragraph or two at the top explaining what a record array is, why it's useful, and what the basic properties are would be good. Then give the example.
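For what it's worth, an opening example might be as small as this (field names invented here, not the cookbook's own code; output formatting may differ by version):

>>> import numpy
>>> stars = numpy.zeros(3, dtype=[('name', 'S8'), ('mag', 'f4')])
>>> stars['mag'] = [2.5, 0.5, 5.0]
>>> stars['mag']    # a field name pulls out a whole column at once
array([ 2.5,  0.5,  5. ], dtype=float32)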
Also, it's less than intuitive why you are storing RGB values (usually thought of as integers in the range 0-255 or 0x00 - 0xff) in 32-bit floating-point numbers. --jh-- From ndarray at mac.com Mon Feb 27 18:36:02 2006 From: ndarray at mac.com (Sasha) Date: Mon Feb 27 18:36:02 2006 Subject: [Numpy-discussion] missing array type In-Reply-To: <440397A1.9020607@ee.byu.edu> References: <440397A1.9020607@ee.byu.edu> Message-ID: On 2/27/06, Travis Oliphant wrote: > .... I think 0-stride arrays are acceptable (I > think you can make them now but you have to provide your own memory, > right?) > Not really. Ndarray constructor has never allowed zeros in strides. It was possible to set strides to a tuple containing zeros after construction in some cases. I've changed that in r2054 . Currently zero strides are not allowed. > From one perspective, all we are proposing to do is allow numpy to > create the memory *and* allow differently strided arrays, right? Another way to view this is that we are proposing to change the computation of memory requirements to consider strides instead of item size and number of items. Zero strides require only one item to be allocated for any number of items in the array. > Now, if it creates the memory, the strides must be C-contiguous or > fortran-contiguous. We are going to allow user-specified strides, now, > even on memory creation. > Yes. > Sasha, your initial patch was pretty good but I was concerned about the > speed of array creation being changed for other cases. If you can speed > up PyArray_NewFromDescr (probably by only changing the branch that > currently raises an error), then I think your proposed changes should be > O.K. > I will probably not be able to do it right away. Meanwhile I've created a wiki page for this mini-project . > The check on the provided strides argument needs to be thought through > so that we don't accept strides that will allow walking outside the > memory that *is* or *will be* allocated. > > I have not reviewed your code for this, but I'm assuming you've thought > that through? That was the central issue in the patch: how to compute the size of the buffer in the presence of zero strides, so I hope I got it right. In order to make zero stride arrays really useful, they should survive transformation by ufunc. With my patch if x is a zero-stride array of length N, then exp(x) is a regular array and exp is called N times to compute the result. That would be a much bigger project. As a first step, I would just disallow using zero-stride arrays as output to avoid problems with inplace operations. In any case, everyone interested in this feature is invited to edit the wiki page at . From cjw at sympatico.ca Mon Feb 27 19:23:04 2006 From: cjw at sympatico.ca (Colin J. Williams) Date: Mon Feb 27 19:23:04 2006 Subject: [Numpy-discussion] subclassing ndaray In-Reply-To: <44035227.9010609@ee.byu.edu> References: <43FF2C92.3060304@sympatico.ca> <20060224191426.GD21117@alpha> <43FFB580.80307@ieee.org> <20060225083924.GF21117@alpha> <44035227.9010609@ee.byu.edu> Message-ID: <4403C20A.5060603@sympatico.ca> Travis Oliphant wrote: > Stefan van der Walt wrote: > >>> The __init__ and __new__ methods are not called because they may >>> have arbitrary signatures. Instead, the __array_finalize__ method >>> is always called. So, you should use that instead of __init__. >>> >> > This is now true in SVN. Previously, __array_finalize__ was not > called if the "parent" was NULL. However, now, it is still called > with None as the value of the first argument. 
> > Thus __array_finalize__ will be called whenever ndarray.__new__(<subclass>, ...) is called. Why this change in style from the common Python idiom of __new__, __init__, with the same signature to __new__, __array_finalize__ with possibly different signatures? Incidentally, what are the signatures? The doc string is empty: [Dbg]>>> _n.ndarray.__array_finalize__.__doc__ [Dbg]>>> Colin W. From cjw at sympatico.ca Mon Feb 27 19:30:00 2006 From: cjw at sympatico.ca (Colin J. Williams) Date: Mon Feb 27 19:30:00 2006 Subject: [Numpy-discussion] missing array type In-Reply-To: References: Message-ID: <4403C3AB.1040600@sympatico.ca> Sasha wrote: >It looks like what you want is a zero-stride array that I proposed >some time ago. See "zeros in strides" thread >. > >I've posted a patch to the list, but it was met with a mild opposition >from Travis, so I've never committed it to SVN. The final word was: > >""" >I would also like to get more opinions about Sasha's proposal for >zero-stride arrays. > >-Travis >""" > >If you agree that zero-stride array would provide the functionality >that you need, it may tip the balance towards accepting that patch. > > (-1) The proposal adds complexity. I don't see the compensating benefit. Colin W. > >On 2/27/06, Alan G Isaac wrote: > > >>The recent discussion of Matlab's repmat >>plus some recent use of grids leads me to ask: >>should numpy contain a true repeated array object, >>where one single copy of the data supports >>a full set of array operations? (So that, >>e.g., repmat(x,(3,2)) would simply point >>to the x data or, if desired, make a single >>copy of it.) >> >>And actually this seems a special case of >>a truly space-saving Kronecker product. >> >>Cheers, >>Alan Isaac >> >> >> >> >>------------------------------------------------------- >>This SF.Net email is sponsored by xPML, a groundbreaking scripting language >>that extends applications into web and mobile media. Attend the live webcast >>and join the prime developer group breaking into this new coding territory! >>http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 >>_______________________________________________ >>Numpy-discussion mailing list >>Numpy-discussion at lists.sourceforge.net >>https://lists.sourceforge.net/lists/listinfo/numpy-discussion >> >> >> > > >------------------------------------------------------- >This SF.Net email is sponsored by xPML, a groundbreaking scripting language >that extends applications into web and mobile media. Attend the live webcast >and join the prime developer group breaking into this new coding territory! >http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 >_______________________________________________ >Numpy-discussion mailing list >Numpy-discussion at lists.sourceforge.net >https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > > From ndarray at mac.com Mon Feb 27 19:43:05 2006 From: ndarray at mac.com (Sasha) Date: Mon Feb 27 19:43:05 2006 Subject: [Numpy-discussion] missing array type In-Reply-To: <4403C3AB.1040600@sympatico.ca> References: <4403C3AB.1040600@sympatico.ca> Message-ID: On 2/27/06, Colin J. Williams wrote: > ... > (-1) The proposal adds complexity. I don't see the compensating benefit. Did you read the original thread? I thought I clearly explained what the benefit was. Added complexity is minimal because at the C level zero-stride arrays are already possible. All I was proposing was to safely expose existing C level functionality to Python.
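As background for this exchange: an array's strides are simply the byte counts stepped along each axis, so a stride of zero means "stay at the same memory". A quick look (int32 chosen so the byte counts come out the same on any platform):

>>> import numpy
>>> a = numpy.arange(6, dtype=numpy.int32).reshape(2, 3)
>>> a.strides              # bytes to step along each axis
(12, 4)
>>> a.transpose().strides  # a transpose merely swaps the strides
(4, 12)

With a 0 in place of the 12, every row would alias the same three values -- a repeated matrix at the cost of a single row.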
From ndarray at mac.com Mon Feb 27 21:11:06 2006 From: ndarray at mac.com (Sasha) Date: Mon Feb 27 21:11:06 2006 Subject: [Numpy-discussion] Faster fill In-Reply-To: <44039CA8.4080102@ee.byu.edu> References: <44039CA8.4080102@ee.byu.edu> Message-ID: On 2/27/06, Travis Oliphant wrote: > Sasha wrote: > > >I've just posted a patch at > >http://projects.scipy.org/scipy/numpy/wiki/PossibleOptimizationAreas > ... > This looks like a good procedure. On second thought, it may be better to commit experimental changes to a branch in svn and merge to the trunk after review. What do you think? > I would probably also not define the misaligned case, but just use the > default code for that case as well. This is consistent with the idea > that misaligned data will have slower operations. > It turns out my approach does not speed up the misaligned case. I've posted a new patch that incorporates your suggestions at . (Note the change in location.) > > You also need to add a check so that if the function pointer is NULL, > the optimized function is not called. > Done in the new patch. The patch passes numpy.test(10), but I don't think it tests ndarray.fill in any meaningful way. I will probably need to add some tests. From oliphant.travis at ieee.org Mon Feb 27 21:43:02 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Mon Feb 27 21:43:02 2006 Subject: [Numpy-discussion] Faster fill In-Reply-To: References: <44039CA8.4080102@ee.byu.edu> Message-ID: <4403E2BE.1090801@ieee.org> Sasha wrote: >On second thought, it may be better to commit experimental changes >to a branch in svn and merge to the trunk after review. What do you >think? > > This is always possible. It really depends on how significant the changes are. These changes are somewhat isolated to a single bit of functionality (adding a new item to the function structure pointed to by each PyArray_Descr shouldn't change anything else). As long as svn compiles and passes all tests, I think it can be merged directly. I see branches as being needed when a feature requires more testing and it is less clear how invasive the changes will be. In this case, I would say go ahead and apply the feature directly. >The patch passes numpy.test(10), but I don't think it tests >ndarray.fill in any meaningful way. I will probably need to add some >tests. > > Tests are always good. In fact, it's an easy way for someone to contribute, since there are a lot of features I have only tested using examples in my book (the book examples serve as an additional set of tests that I regularly run). -Travis From zpincus at stanford.edu Mon Feb 27 21:53:02 2006 From: zpincus at stanford.edu (Zachary Pincus) Date: Mon Feb 27 21:53:02 2006 Subject: [Numpy-discussion] Simplest ndarray subclass __new__ possible? In-Reply-To: <44724794-B6E1-4927-8D63-F3BC74F2CD10@stanford.edu> References: <39FC8FA0-48DE-408D-8AD8-DDF72474F1C1@stanford.edu> <4402F958.7010902@sympatico.ca> <44724794-B6E1-4927-8D63-F3BC74F2CD10@stanford.edu> Message-ID: <80583703-75EB-4503-90A5-F4EEBE09CCEF@stanford.edu> Hi again. I would like to put together a wiki page about writing ndarray subclasses because this is obviously a difficult topic, and the available documentation (e.g. looking at defmatrix) doesn't cover all -- or even the most common -- uses. As such, I am trying to put together a "skeleton" ndarray subclass which has all the basic features (a __new__ that allows for direct construction of such objects from other data objects, and propagation of simple attributes in __array_finalize__).
Right now I am trying to figure out what the minimum complement of things that need to go into such a __new__ method is. Below is my first effort, derived from defmatrix. Any help identifying parts of this code that are unnecessary, or parts that need to be added, would directly result in a better wiki page once I figure everything out. Zach > What follows is what I have so far. Have I missed anything, or can > anything else be removed? > > Zach > > class contour(numpy.ndarray): > def __new__(subtype, data, dtype=None, copy=True): > > ##### Do I need this first if block? > ##### Wouldn't the second block would do fine on its own? > if isinstance(data, contour): > dtype2 = data.dtype > if (dtype is None): > dtype = dtype2 > if (dtype2 == dtype) and (not copy): > return data > return data.astype(dtype) > > if isinstance(data, numpy.ndarray): > if dtype is None: > intype = data.dtype > else: > intype = numpy.dtype(dtype) > new = data.view(contour) > if intype != data.dtype: > return new.astype(intype) > if copy: return new.copy() > else: return new > > # now convert data to an array > arr = numpy.array(data, dtype=dtype, copy=copy) > > ##### Do I need this if block? > if not (arr.flags.fortran or arr.flags.contiguous): > arr = arr.copy() > > ##### Do I need the fortran flag? > ret = numpy.ndarray.__new__(subtype, arr.shape, arr.dtype, > buffer=arr, fortran=arr.flags.fortran) > return ret > From oliphant.travis at ieee.org Mon Feb 27 22:02:04 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Mon Feb 27 22:02:04 2006 Subject: [Numpy-discussion] missing array type In-Reply-To: References: <440397A1.9020607@ee.byu.edu> Message-ID: <4403E72E.4040007@ieee.org> Sasha wrote: >On 2/27/06, Travis Oliphant wrote: > > >>.... I think 0-stride arrays are acceptable (I >>think you can make them now but you have to provide your own memory, >>right?) >> >> >> >Not really. Ndarray constructor has never allowed zeros in strides. It >was possible to set strides to a tuple containing zeros after >construction in some cases. I've changed that in r2054 >. Currently >zero strides are not allowed. > > Ah, right. It was only possible to do it in C-code. But, it is possible to do it in C-code. Since Colin has expressed some reservations, it's probably a good idea to continue the discussion before doing anything. One issue I have with zero-stride arrays is that essentially is what broadcasting is all about. Recently there has been a discussion about bringing repmat functionality over. The repmat function is used in some array languages largely because there is no such thing as broadcasting and arrays are not ND. Perhaps what is desired instead is rather than play games with indexing on a two dimensional array you simply define the appropriate 4-dimensional array. Currently you can define the size of the new dimensions to be 1 and they will act like 0-strided arrays when you operate with other arrays of any desired shape. Zero-strided arrays are actually quite fundamental to the notion of broadcasting. [Soap Box] I've been annoyed for several years that the idea of linear operators is constrained in most libraries to 2 dimensions. There are many times I want to find an inverse of an operator that is most naturally expressed with 6 dimensions. I have to myself play games with indexing to give the computer a matrix it can understand. Why is that? I think the computer should be doing the work of raveling and unraveling those indices for me. I think we have the opportunity in NumPy/SciPy to be much more general. 
The "index-raveling" that so many people have become conditioned to think is necessary could and should be handled by a tensor class. If you've ever written finite-element code you should know exactly what I mean. [End Soap Box] On the one hand, we could just tell people to try and use broadcasting so that zero-strided arrays show up in Python in definitive ways. On the other hand, we can just expose the power of zero-strided arrays to Python and let people come up with their own rules. I lean toward giving people the capability and letting them show me what it can do. The only controversial thing, I think, is the behavior of ufunc outputs for zero-strided arrays. Currently ufunc outputs always have full strides unless an output array is given. Changing this default behavior would require some justification (not to mention some code tweaking). I'm not immediately inclined to change it even if zero-strided arrays are allowed to be created from Python. >In order to make zero stride arrays really useful, they should survive >transformation by ufunc. With my patch if x is a zero-stride array of >length N, then exp(x) is a regular array and exp is called N times to >compute the result. That would be a much bigger project. As a first >step, I would just disallow using zero-stride arrays as output to >avoid problems with inplace operations. > > Hmm.. Could you show us again what you mean by these problems and the better behavior that could happen if ufuncs were changed? -Travis From oliphant.travis at ieee.org Mon Feb 27 22:13:03 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Mon Feb 27 22:13:03 2006 Subject: [Numpy-discussion] subclassing ndaray In-Reply-To: <4403C20A.5060603@sympatico.ca> References: <43FF2C92.3060304@sympatico.ca> <20060224191426.GD21117@alpha> <43FFB580.80307@ieee.org> <20060225083924.GF21117@alpha> <44035227.9010609@ee.byu.edu> <4403C20A.5060603@sympatico.ca> Message-ID: <4403E9CA.8050506@ieee.org> > Travis Oliphant wrote: > >> Stefan van der Walt wrote: >> >>>> The __init__ and __new__ methods are not called because they may >>>> have arbitrary signatures. Instead, the __array_finalize__ method >>>> is always called. So, you should use that instead of __init__. >>>> >>> >>> >> This is now true in SVN. Previously, __array_finalize__ was not >> called if the "parent" was NULL. However, now, it is still called >> with None as the value of the first argument. >> >> Thus __array_finalize__ will be called whenever ndarray.__new__(<subclass>,...) is called. > > > Why this change in style from the common Python idiom of __new__, > __init__, with the same signature, to __new__, __array_finalize__ with > possibly different signatures? > I don't see it as a change in style but as adding a capability for ndarray subclasses. The problem is that arrays can be created in many ways (slicing, ufuncs, etc). Not all of these ways should go through the __new__/__init__ -- style creation mechanism. Try inheriting from the float builtin and adding attributes. Then add your float to an instance of your new class and see what happens. You will get a float-type on the output. This is the essence of Paul's insight that sub-classing is rarely useful because you end up having to re-define all the operators anyway to return the value that you want. He knows whereof he speaks as well, because he wrote MA and UserArray and had experience with Python sub-classing.
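A minimal sketch of that float experiment (Python 2 syntax; the attribute is invented for illustration):

class myfloat(float):
    pass

f = myfloat(1.5)
f.units = 'm'       # attach an attribute to the subclass instance
g = f + 1.0         # ordinary arithmetic...
print type(g)       # -> <type 'float'>: the subclass (and .units) is gone

This is the failure mode the machinery described next is designed to get around for arrays.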
I wanted a mechanism to make it easier to sub-class arrays and have the operators return your object if possible (including all of its attributes). Thus, __array_priority__ (a floating-point attribute) and __array_finalize__ (a method called on internal construction of the array wrapper) were invented (along with __array_wrap__, which any class can define to have its objects survive ufuncs). It was easy enough to see where to call __array_finalize__ in the C-code, though somewhat difficult to explain (and to get exception handling to work, because of my initial over-thinking). The signature is just __array_finalize__(self, parent): return i.e. any return value is ignored (but exceptions are caught). I've used the feature successfully on at least 3 subclasses (chararray, memmap, and matrix) and so I'm actually pretty happy with it. __new__ and __init__ are still relevant for constructing your brand-new object. The __array_finalize__ function is just what the internal constructor that actually allocates memory will always call to let you set final attributes *every* time your sub-class gets created. -Travis From zpincus at stanford.edu Mon Feb 27 22:31:01 2006 From: zpincus at stanford.edu (Zachary Pincus) Date: Mon Feb 27 22:31:01 2006 Subject: [Numpy-discussion] subclassing ndaray In-Reply-To: <4403E9CA.8050506@ieee.org> References: <43FF2C92.3060304@sympatico.ca> <20060224191426.GD21117@alpha> <43FFB580.80307@ieee.org> <20060225083924.GF21117@alpha> <44035227.9010609@ee.byu.edu> <4403C20A.5060603@sympatico.ca> <4403E9CA.8050506@ieee.org> Message-ID: <8626BCED-0630-47E0-8DB8-C225579EA9C6@stanford.edu> > __array_priority__ (a floating-point attribute) > __array_wrap__ (which any class can define to have their objects > survive ufuncs) What do these do again? That is, how are they used internally? What happens if they're not used? If I (or anyone else) is to put together a wiki page about this (see the other thread I just emailed to, please), getting good concise descriptions of what ndarray subclasses need to do/can do would be very helpful. Zach From oliphant.travis at ieee.org Mon Feb 27 22:33:00 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Mon Feb 27 22:33:00 2006 Subject: [Numpy-discussion] Simplest ndarray subclass __new__ possible? In-Reply-To: <80583703-75EB-4503-90A5-F4EEBE09CCEF@stanford.edu> References: <39FC8FA0-48DE-408D-8AD8-DDF72474F1C1@stanford.edu> <4402F958.7010902@sympatico.ca> <44724794-B6E1-4927-8D63-F3BC74F2CD10@stanford.edu> <80583703-75EB-4503-90A5-F4EEBE09CCEF@stanford.edu> Message-ID: <4403EE5C.709@ieee.org> Zachary Pincus wrote: > Hi again. > > I would like to put together a wiki page about writing ndarray > subclasses because this is obviously a difficult topic, and the > available documentation (e.g. looking at defmatrix) doesn't cover all > -- or even the most common -- uses. Great. Let me see if I can help out with some concepts. First of all, you should probably mention that UserArray might be what people want. UserArray is a standard "container" class, meaning that the array is just one of its attributes and UserArray doesn't inherit from the array. I finally realized that even though you can "directly" inherit from the ndarray, sometimes that is not what you want to do; instead you want a capable container class. I think multiple inheritance with other builtins is a big reason to want this feature, as was pointed out on the list a few days ago.
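To make the __array_finalize__ machinery above concrete, here is a minimal attribute-carrying subclass (a sketch only; the class and attribute names are invented, and it assumes the SVN behavior where __array_finalize__ receives None when there is no parent):

import numpy

class infoarray(numpy.ndarray):
    def __new__(cls, data, info=None):
        arr = numpy.array(data).view(cls)   # convert, then re-type as the subclass
        arr.info = info
        return arr
    def __array_finalize__(self, parent):
        # called on *every* internal construction (slicing, ufunc results, ...);
        # parent is the array we were made from, or None
        self.info = getattr(parent, 'info', None)

a = infoarray([1, 2, 3], info='metres')
b = a[1:]        # slicing never calls __new__, but does call __array_finalize__
print b.info     # -> 'metres'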
The biggest reason to inherit from the ndarray is if you want your object to satisfy the isinstance(obj, ndarray) test... Inheriting from the ndarray is as simple as class myarray(numpy.ndarray): pass Now, your array will be constructed. To get your arrays you can either use the standard array constructor array([1,2,3,4]).view(myarray) or use the new myarray constructor --- which now has the same signature as ndarray.__new__. For several reasons (probably not any good ones, though :-) ), the default ndarray signature is built for "wrapping" around memory (exposed by another object through the buffer protocol in Python) or for creating uninitialized new memory. So, if you override the __new__ constructor for your class and want to call the ndarray.__new__ constructor, you have to realize that you need to think of what you are doing in terms of "wrapping" some other created piece of memory or "initializing your memory". If you want your array to be able to "convert" arbitrary objects to arrays, instead, then your constructor could in fact be as simple as class myarray(numpy.ndarray): def __new__(cls, obj): return numpy.array(obj).view(cls) Then, if you want, you can define an __init__ method to handle setting of attributes --- however, if you set some attributes, then you need to think about what you want to happen when your new array gets "sliced" or added to. Because the internal code will create your new array (without calling new) and then call __array_finalize__(self, parent) where parent could be None (if there is no parent --- i.e. this is a new array). Any attributes you define should also be defined here so they get passed on to all arrays that are created. I hope this helps some. -Travis From zpincus at stanford.edu Tue Feb 28 01:28:04 2006 From: zpincus at stanford.edu (Zachary Pincus) Date: Tue Feb 28 01:28:04 2006 Subject: [Numpy-discussion] Simplest ndarray subclass __new__ possible? In-Reply-To: <4403EE5C.709@ieee.org> References: <39FC8FA0-48DE-408D-8AD8-DDF72474F1C1@stanford.edu> <4402F958.7010902@sympatico.ca> <44724794-B6E1-4927-8D63-F3BC74F2CD10@stanford.edu> <80583703-75EB-4503-90A5-F4EEBE09CCEF@stanford.edu> <4403EE5C.709@ieee.org> Message-ID: <135C7D6D-F504-425D-BD5D-13574184AEBA@stanford.edu> Thanks Travis, I think I'm getting a hold on most of what's going on. The __array_priority__ bit remains a bit opaque (can anyone offer guidance?), and I still have some questions about why the __new__ of the matrix subclass has so much complexity. > So, if you override the __new__ constructor for your class and > want to call the ndarray.__new__ constructor, you have to realize > that you need to think of what you are doing in terms of "wrapping" > some other created piece of memory or "initializing your memory". This makes good sense. > If you want your array to be able to "convert" arbitrary objects to > arrays, instead, then your constructor could in fact be as simple as > > class myarray(numpy.ndarray): > def __new__(cls, obj): > return numpy.array(obj).view(cls) Ok, gotcha. However, the matrix class's __new__ has a lot more complexity. Is *all* of the complexity there in the service of ensuring that matrices are only 2d? It seems like there's more going on there than just that... > Then, if you want, you can define an __init__ method to handle > setting of attributes --- however, if you set some attributes, then > you need to think about what you want to happen when your new array > gets "sliced" or added to.
Because the internal code will create > your new array (without calling new) and then call > > __array_finalize__(self, parent) > > where parent could be None (if there is no parent --- i.e. this is > a new array). > > Any attributes you define should also be defined here so they get > passed on to all arrays that are created.. All of this also makes sense. > I hope this helps some. > > > -Travis > From stefan at sun.ac.za Tue Feb 28 01:57:07 2006 From: stefan at sun.ac.za (Stefan van der Walt) Date: Tue Feb 28 01:57:07 2006 Subject: [Numpy-discussion] wiki page for record arrays In-Reply-To: <200602280223.k1S2NZL3020466@oobleck.astro.cornell.edu> References: <20060221145310.A69A912C2A@sc8-sf-spam2.sourceforge.net> <200602280223.k1S2NZL3020466@oobleck.astro.cornell.edu> Message-ID: <20060228095607.GB7085@sun.ac.za> Hi Joe, On Mon, Feb 27, 2006 at 09:23:35PM -0500, Joe Harrington wrote: > I fixed a small error. I found myself a bit lost. Are cookbook pages > supposed to be introductory, or are they aimed at users who already > know a fair bit? From the page, I can vaguely get that record arrays > allow you access to your data using text strings as partial indexers, > but I found myself focusing so much on testing whether this was true > and figuring out how they work that I was completely distracted from > the example. A paragraph or two at the top explaining what a record > array is, why it's useful, and what the basic properties are would be > good. Then give the example. Also, it's less than intuitive why you > are storing RGB values (usually thought of as integers in the range > 0-255 or 0x00 - 0xff) in 32-bit floating-point numbers. Thanks for the feedback. Of course, any cookbook should aim to be as simple as possible. I wrote this as I figured out record arrays, and proposed it as a starting point -- so feel free to improve it as you see fit. While <= 8-bit images can be stored as integers in [0-255], it is common to use floating point numbers in [0-1] for any depth image. Regards St?fan From ndarray at mac.com Tue Feb 28 04:32:05 2006 From: ndarray at mac.com (Sasha) Date: Tue Feb 28 04:32:05 2006 Subject: [Numpy-discussion] missing array type In-Reply-To: <4403E72E.4040007@ieee.org> References: <440397A1.9020607@ee.byu.edu> <4403E72E.4040007@ieee.org> Message-ID: On 2/28/06, Travis Oliphant wrote: > Hmm.. Could you show us again what you mean by these problems and the > better behavior that could happen if ufuncs were changed? >From my original post : """ 3. Fix augmented assignment operators. Currently: >>> x = zeros(5) >>> x.strides=0 >>> x += 1 >>> x array([5, 5, 5, 5, 5]) >>> x += arange(5) >>> x array([15, 15, 15, 15, 15]) Desired: >>> x = zeros(5) >>> x.strides=0 >>> x += 1 >>> x array([1, 1, 1, 1, 1]) >>> x += arange(5) >>> x array([1, 2, 3, 4, 5]) """ From zpincus at stanford.edu Tue Feb 28 04:36:01 2006 From: zpincus at stanford.edu (Zachary Pincus) Date: Tue Feb 28 04:36:01 2006 Subject: [Numpy-discussion] can't resize ndarray subclass Message-ID: <1DED40B2-E761-4B8E-86DF-6F9C36242D0C@stanford.edu> Thus far, it seems like the way to get instances of ndarray subclasses is to use the 'view(subclass)' method on a proper ndarray, either in the subclass __new__, or after constructing the array. However, subclass instances so created *do not own their own data*, as far as I can tell. They are just "views" on an other array's data. This means that these objects can't be resized, or have other operations performed on them which requires 'owning' the data buffer. 
E.g.: class f(numpy.ndarray): def __new__(cls, *p, **kw): return numpy.array(*p, **kw).view(cls) f([1,2,3]).resize((10,)) ValueError: cannot resize this array: it does not own its data numpy.array([1,2,3]).view(f).resize((10,)) ValueError: cannot resize this array: it does not own its data numpy.array([1,2,3]).resize((10,)) (no problem) Is there another way to create ndarray subclasses which do own their own data? Note that numpy.resize(f([1,2,3]), (10,)) works fine. But this isn't the same as having an object that owns its data. Specifically, there's just no way to have an ndarray subclass resize itself as a result of calling a method if that object doesn't own its data. (Imagine a variable-resolution polygon type that could interpolate or decimate vertices as needed: such a class would need to resize itself.) Zach From Chris.Barker at noaa.gov Tue Feb 28 09:28:07 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue Feb 28 09:28:07 2006 Subject: [Numpy-discussion] missing array type In-Reply-To: <4403E72E.4040007@ieee.org> References: <440397A1.9020607@ee.byu.edu> <4403E72E.4040007@ieee.org> Message-ID: <44048816.5020707@noaa.gov> Travis Oliphant wrote: > [Soap Box] > I've been annoyed for several years that the idea of linear operators is > constrained in most libraries to 2 dimensions. There are many times I > want to find an inverse of an operator that is most naturally expressed > with 6 dimensions. Yes, yes, yes! "numpy is not matlab" One of the things I love most about numpy is that it is an n-d array package, NOT a matrix package. I also love broadcasting. Similar to Travis, I was recently helping out a friend using Matlab for a graduate structural mechanics course. The machinations required to shoehorn the natural tensor math into 2-d matrices were pretty ugly indeed. I'd much rather see numpy encourage the use of higher dimension arrays and broadcasting over traditional 2-d matrix solutions. However.... > I have to play games with indexing myself to give > the computer a matrix it can understand. Why is that? One of the reasons is that we want to use other people's already-optimized code (i.e. LAPACK). They only work with the 2-d data structures. I suppose we could do the translation to the LAPACK data structures under the hood, but that would take some work. However, this makes me wonder.... I'm unclear on the details, but from what I understand of the post that started this thread, one use of repmat is to turn some operations into standard linear algebra operations, for performance reasons. The repmat matrix would therefore need to be in a form usable by LAPACK and friends, and thus would need to be dense anyway ... a zero-stride array would not work, so maybe the potential advantages of the compact storage wouldn't really be realized (until we write our own LAPACK) This also brings me to... Sasha wrote: > Desired: >>>> x = zeros(5) >>>> x.strides=0 >>>> x += 1 >>>> x > array([1, 1, 1, 1, 1]) >>>> x += arange(5) >>>> x > array([1, 2, 3, 4, 5]) So what the heck is a zero-strided array? My understanding was that the whole point was that what looked like multiple values was really a single, shared value. In this case, it shouldn't be possible to in-place add more than one value. I wouldn't say that what Sasha presented as "desired" is desired... an in-place operation shouldn't fundamentally change the nature of the array. That array should ALWAYS remain single-valued. So what should the result of x += arange(5) be?
I say it should raise an exception. Maybe zero-stride arrays are only really useful read-only? This is a complicated can of worms..... -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From ndarray at mac.com Tue Feb 28 09:38:05 2006 From: ndarray at mac.com (Sasha) Date: Tue Feb 28 09:38:05 2006 Subject: [Numpy-discussion] missing array type In-Reply-To: <4403E72E.4040007@ieee.org> References: <440397A1.9020607@ee.byu.edu> <4403E72E.4040007@ieee.org> Message-ID: Travis, I've noticed that you changed the code to allow x.strides = 0 , but it does not look like your changes alows creation of memory-saving zero stride arrays: >>> b = array([1]) >>> ndarray((5,), strides=(0,), buffer=b) Traceback (most recent call last): File "", line 1, in ? TypeError: buffer is too small for requested array I would think memory-saving is the only justification for allowing zero strides. What use does your change enable? From tim.hochberg at cox.net Tue Feb 28 09:53:03 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Tue Feb 28 09:53:03 2006 Subject: [Numpy-discussion] Re: Method to shift elements in an array? In-Reply-To: References: Message-ID: <44048DCE.80408@cox.net> Alan G Isaac wrote: >Tim wrote: > > >>import numpy >>def roll(A, n): >> "Roll the array A in place. Positive n -> roll right, negative n -> >>roll left" >> if n > 0: >> n = abs(n) >> temp = A[-n:] >> A[n:] = A[:-n] >> A[:n] = temp >> elif n < 0: >> n = abs(n) >> temp = A[:n] >> A[:-n] = A[n:] >> A[-n:] = temp >> else: >> pass >> >> > >This probably counts as a gotcha: > > >>>>a=N.arange(10) >>>>temp=a[-6:] >>>>a[6:]=a[:-6] >>>>a[:6]=temp >>>>a >>>> >>>> >array([4, 5, 0, 1, 2, 3, 0, 1, 2, 3]) > > Ack! Right, those temp variables needed to be copies. That's why I added the caveat about only rolling a few elements, since otherwise it gets expensive. Then I forgot to make the copies in the code, doh! -tim >Cheers, >Alan Isaac > >PS Here's something close to the rotater functionality. > >#rotater: rotate row elements ># Format: y = rotater(x,r,copydata) ># Input: x RxC array ># rotateby size R integer array, or integer (rotation amounts) ># inplace boolean (default is False -> copies data) ># Output: y RxC array: ># rows rotated by rotateby ># or None (if inplace=True) ># Remarks: Intended for use with 2D arrays. ># rotateby values are positive for rightward rotation, ># negative for leftward rotation ># :author: Alan G Isaac (aisaac AT american DOT edu) ># :date: 24 Feb 2006 >def rotater(x,rotateby,inplace=False) : > assert(len(x.shape)==2), "For 2-d arrays only." 
> xrotate = numpy.array(x,copy=(not inplace)) > xrows = xrotate.shape[0] > #make an iterator of row shifts > if isinstance(rotateby,int): > from itertools import repeat > rowshifts = repeat(rotateby,xrows) > else: > rowshifts = numpy.asarray(rotateby) > assert(rowshifts.size==xrows) > rowshifts = rowshifts.flat > #perform rotation on each row > for row in xrange(xrows): > rs=rowshifts.next() > #do nothing if rs==0; the same expression handles both directions > if rs!=0: > xrotate[row] = numpy.concatenate([xrotate[row][-rs:],xrotate[row][:-rs]]) > if inplace: > return None > else: > return xrotate > > > > > >------------------------------------------------------- >This SF.Net email is sponsored by xPML, a groundbreaking scripting language >that extends applications into web and mobile media. Attend the live webcast >and join the prime developer group breaking into this new coding territory! >http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 >_______________________________________________ >Numpy-discussion mailing list >Numpy-discussion at lists.sourceforge.net >https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > > From ndarray at mac.com Tue Feb 28 10:38:03 2006 From: ndarray at mac.com (Sasha) Date: Tue Feb 28 10:38:03 2006 Subject: [Numpy-discussion] missing array type In-Reply-To: <44048852.3080701@sympatico.ca> References: <4403C3AB.1040600@sympatico.ca> <440449B2.8060007@sympatico.ca> <4404680F.9020006@sympatico.ca> <44048852.3080701@sympatico.ca> Message-ID: On 2/28/06, Colin J. Williams wrote: > ... > >>Would it not be better to have def zeroes(..., zeroStrides= False): > >>and a= zeros(..., zeroStrides= True)? > >> > >> > > > >This is equivalent to what I proposed: xzeros(shape) and xones(shape) > >functions as a shorthand to ndarray(shape, strides=(0,)*len(shape)) > > > > > Wot, more names to remember? [sorry, I can't give you the graphic to go > along with this. :-) ] Oh, please - you don't have to be so emphatic. Your solution above (adding a zeroStrides parameter) would require a much more arbitrary name to remember. Even worse, someone will write zeros(shape, int, False, True) to mean zeros(shape, dtype=int, fortran=False, zeroStrides=True) and anyone reading that code will have to look up the manual to understand what each boolean means. Boolean parameters are generally considered bad design. For example, it would be much better to have array(..., memory_layout='Fortran') instead of the current array(...,fortran=True). (Well, fortran=True is not that bad, but fortran=False is really puzzling - if it is not fortran - what is it?) Arguably an even better solution would be array(..., strides = fortran_strides(shape)), but that's a different story. The names xzeros and xones are a natural choice because the functionality they provide is very similar to what xrange provides compared to range: a memory-saving way to achieve the same result. From ndarray at mac.com Tue Feb 28 10:48:09 2006 From: ndarray at mac.com (Sasha) Date: Tue Feb 28 10:48:09 2006 Subject: [Numpy-discussion] missing array type In-Reply-To: <44048816.5020707@noaa.gov> References: <440397A1.9020607@ee.byu.edu> <4403E72E.4040007@ieee.org> <44048816.5020707@noaa.gov> Message-ID: On 2/28/06, Christopher Barker wrote: > ... > So what should the result of x += arange(5) be? I say it should raise an > exception. Agree. That's what I was proposing as a feasible (as opposed to ideal) solution.
Ideally, x += [1,1,1,1,1] would be fine, but not x += [1,2,1,2,1]. I quoted too much from an early post. > > Maybe zero-stride arrays are only really useful read-only? > Maybe. It is hard to justify x[1] = 2 changing the result of x[0], but x[:] = 2 may still be ok. > This is a complicated can of worms..... Completely agree. That's why I made x.strides = 0 illegal some time ago. I don't think it is a good idea to bring it back without understanding all the consequences. If we allow it now, it will be harder to change the behavior of the result later. Someone's code will rely on x += ones(5) incrementing x five times for zero-stride x. From zpincus at stanford.edu Tue Feb 28 11:30:04 2006 From: zpincus at stanford.edu (Zachary Pincus) Date: Tue Feb 28 11:30:04 2006 Subject: [Numpy-discussion] can't resize ndarray subclass In-Reply-To: <1DED40B2-E761-4B8E-86DF-6F9C36242D0C@stanford.edu> References: <1DED40B2-E761-4B8E-86DF-6F9C36242D0C@stanford.edu> Message-ID: In answer to my previous question, to get an ndarray subclass that owns its own data, copy() must be called on the new "view" of the subclass. This makes sense and is reasonable. However a new problem has me nearly tearing my hair out. Calling the resize method on an instance of such a subclass works fine. However, calling a method that calls 'self.resize' breaks! And worse, it breaks in such a way that then subsequent calls to resize also break. Check it out: class f(numpy.ndarray): def __new__(cls, obj): return numpy.array(obj).view(cls).copy() def expand(self): self.resize([self.shape[0] + 1, self.shape[1]]) g = f([[1,2],[3,4]]) g.resize([3,2]) # this works, thanks to the '.copy()' above g = f([[1,2],[3,4]]) g.expand() # just internally calls self.resize([3,2]) ValueError: cannot resize an array that has been referenced or is referencing another array in this way. Use the resize function g.resize([3,2]) # this NOW DOES NOT WORK! ValueError: cannot resize an array that has been referenced or is referencing another array in this way. Use the resize function Can anyone help? Please? Zach On Feb 28, 2006, at 4:35 AM, Zachary Pincus wrote: > Thus far, it seems like the way to get instances of ndarray > subclasses is to use the 'view(subclass)' method on a proper > ndarray, either in the subclass __new__, or after constructing the > array. > > However, subclass instances so created *do not own their own data*, > as far as I can tell. They are just "views" on an other array's > data. This means that these objects can't be resized, or have other > operations performed on them which requires 'owning' the data buffer. > > E.g.: > class f(numpy.ndarray): > def __new__(cls, *p, **kw): > return numpy.array(*p, **kw).view(cls) > > f([1,2,3]).resize((10,)) > ValueError: cannot resize this array: it does not own its data > numpy.array([1,2,3]).view(f).resize((10,)) > ValueError: cannot resize this array: it does not own its data > numpy.array([1,2,3]).resize((10,)) > (no problem) > > Is there another way to create ndarray subclasses which do own > their own data? > > Note that numpy.resize(f([1,2,3]), (10,)) works fine. But this > isn't the same as having an object that owns its data. > Specifically, there's just no way to have an ndarray subclass > resize itself as a result of calling a method if that object > doesn't own its data. (Imagine a variable-resolution polygon type > that could interpolate or decimate vertices as needed: such a class > would need to resize itself.) 
> > Zach > > > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting > language > that extends applications into web and mobile media. Attend the > live webcast > and join the prime developer group breaking into this new coding > territory! > http://sel.as-us.falkag.net/sel? > cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion From tim.hochberg at cox.net Tue Feb 28 11:41:15 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Tue Feb 28 11:41:15 2006 Subject: [Numpy-discussion] Numpy and PEP 343 Message-ID: <4404A71B.10600@cox.net> An idea that has popped up from time to time is delaying evaluation of complicated expressions so that the result can be computed more efficiently. For instance, the matrix expression: a = b*c + d*e results in the creation of two, potentially large, temporary matrices and also does a couple more loops at the C level than the equivalent expression implemented in C would. The general idea has been to construct some sort of pseudo-object when the numerical operations are indicated, and then do the actual numerical operations at some later time. This would be very problematic if implemented for all arrays since it would quickly become impossible to figure out what was going on, particularly with view semantics. However, it could result in large performance improvements without becoming incomprehensible if implemented in small enough chunks. A "straightforward" approach would look something like: numpy.begin_defer() # Now all numpy operations (in this thread) are deferred a = b*c + d*e # 'a' is a special object that holds pointers to # 'b', 'c', 'd' and 'e' and knows what ops to perform. numpy.end_defer() # 'a' performs the operations and now looks like an array Since 'a' knows the whole series of operations in advance it can perform them more efficiently than would be possible using the basic numpy machinery. Ideally, the space for 'a' could be allocated up front, and all of the operations could be done in a single loop. In practice the optimization might be somewhat less ambitious, depending on how much energy people put into this. However, this approach has some problems. One is the syntax, which is clunky and a bit unsafe (a missing end_defer in a function could cause stuff to break very far away). The other is that I suspect that this sort of deferred evaluation makes multiple views of an array even more likely to bite the unwary. The syntax issue can be cleanly addressed now that PEP 343 (the 'with' statement) is going into Python 2.5. Thus the above would look like: with numpy.deferral(): a = b*c + d*e Just removing the extra allocation of temporary variables can result in a 30% speedup for this case[1], so the payoff would likely be large. On the down side, it could be quite a can of worms, and would likely require a lot of work to implement. Food for thought anyway.
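For concreteness, here is a toy sketch of the pseudo-object idea (all names invented; this version only builds and replays the expression tree, it does not yet fuse the loops, which is where the real win would come from):

import numpy

class lazy:
    # an unevaluated node: either a plain payload, or op(payload, other)
    def __init__(self, payload, op=None, other=None):
        self.payload, self.op, self.other = payload, op, other
    def __add__(self, other):
        return lazy(self, numpy.add, other)
    def __mul__(self, other):
        return lazy(self, numpy.multiply, other)
    def force(self):
        if self.op is None:
            return self.payload
        return self.op(_force(self.payload), _force(self.other))

def _force(x):
    if isinstance(x, lazy):
        return x.force()
    return x

b = c = d = e = lazy(numpy.arange(100000.))
a = (b*c + d*e).force()   # all the work happens here, in one place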
-tim [1] from timeit import Timer print Timer('a = b*c + d*e', 'from numpy import arange;b=c=d=e=arange(100000.)').timeit(10000) print Timer('a = b*c; multiply(d,e,temp); a+=temp', 'from numpy import arange, zeros, multiply;' 'b=c=d=e=arange(100000.);temp=zeros([100000], dtype=float)').timeit(10000) => 94.8665989672 62.6143562939 From oliphant.travis at ieee.org Tue Feb 28 12:03:09 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue Feb 28 12:03:09 2006 Subject: [Numpy-discussion] can't resize ndarray subclass In-Reply-To: <1DED40B2-E761-4B8E-86DF-6F9C36242D0C@stanford.edu> References: <1DED40B2-E761-4B8E-86DF-6F9C36242D0C@stanford.edu> Message-ID: <4404AC42.9030405@ieee.org> Zachary Pincus wrote: > Thus far, it seems like the way to get instances of ndarray > subclasses is to use the 'view(subclass)' method on a proper ndarray, > either in the subclass __new__, or after constructing the array. > > However, subclass instances so created *do not own their own data*, > as far as I can tell. They are just "views" on an other array's data. > This means that these objects can't be resized, or have other > operations performed on them which requires 'owning' the data buffer. Yes, that is true. To own your own data your subclass would have to create its own memory using ndarray.__new__(mysubclass, shape, dtype), or make a copy as you suggested later. -Travis From oliphant.travis at ieee.org Tue Feb 28 12:11:11 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue Feb 28 12:11:11 2006 Subject: [Numpy-discussion] can't resize ndarray subclass In-Reply-To: References: <1DED40B2-E761-4B8E-86DF-6F9C36242D0C@stanford.edu> Message-ID: <4404AE2E.9030405@ieee.org> Zachary Pincus wrote: > In answer to my previous question, to get an ndarray subclass that > owns its own data, copy() must be called on the new "view" of the > subclass. This makes sense and is reasonable. > > However a new problem has me nearly tearing my hair out. Calling the > resize method on an instance of such a subclass works fine. However, > calling a method that calls 'self.resize' breaks! And worse, it > breaks in such a way that then subsequent calls to resize also break. Yeah, this is one difficult aspect of the resize method. Because memory is being re-allocated, the method has to make sure the memory isn't being shared by another object. Right now, it's checking the reference count. Unfortunately, that isn't a fool-proof mechanism as you've discovered, because a reference can be held onto in somewhat unpredictable ways, by objects that would not actually be bothered by a reallocation of the memory, and this messes up the resize method. What is really needed is some way to determine if any other object is actually pointing to the memory of the ndarray (and not just holding on to the object). But, nobody has figured out a way to do that. It would be possible to let the user "force" the issue, leaving it up to them to make sure they don't share the memory and then reallocate it. In other words, an extra argument to the resize method could be used to bypass the memory check.
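Something like this, say (a sketch; the keyword name is invented here purely for illustration):

import numpy

class f(numpy.ndarray):
    def __new__(cls, obj):
        return numpy.array(obj).view(cls).copy()
    def expand(self):
        # we know we have not shared our buffer, so skip the check
        self.resize([self.shape[0] + 1, self.shape[1]], refcheck=0)

g = f([[1, 2], [3, 4]])
g.expand()      # no spurious "referenced by another array" error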
I'd be willing to do that because I know the check being performed is not foolproof -Travis From oliphant.travis at ieee.org Tue Feb 28 12:19:02 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue Feb 28 12:19:02 2006 Subject: [Numpy-discussion] missing array type In-Reply-To: References: <440397A1.9020607@ee.byu.edu> <4403E72E.4040007@ieee.org> Message-ID: <4404B002.3010407@ieee.org> Sasha wrote: >>>>b = array([1]) >>>>ndarray((5,), strides=(0,), buffer=b) >>>> >>>> >Traceback (most recent call last): > File "", line 1, in ? >TypeError: buffer is too small for requested array > >I would think memory-saving is the only justification for allowing >zero strides. > >What use does your change enable? > > I was just simplifying the PyArray_CheckStrides code. I didn't try to actually enable creating 0-stride arrays in PyArray_NewFromDescr. I don't mind if it is enabled though. I just haven't done it yet. I agree that 0-stride arrays are a "can-of-worms", and I also do not see changing ufunc behavior. The current behavior is understandable and exactly what one would expect with zero-stride and multiple dimensioned arrays. i.e. b = arange(5); b.strides = 0 add(b,1,b) where b has shape (5,) and stride (0,) would add 1 to the first element of b 5 times. Since all elements of the array b are obtained from the first element (that's what stride=0 means), you end up with an array of all 5's. This may not be useful, I agree, but it is understandable and changing it would be too much of an exception. If somebody is creating 0-stride arrays on their own, then they must know what they are doing. -Travis From oliphant.travis at ieee.org Tue Feb 28 13:25:14 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue Feb 28 13:25:14 2006 Subject: [Numpy-discussion] can't resize ndarray subclass In-Reply-To: References: <1DED40B2-E761-4B8E-86DF-6F9C36242D0C@stanford.edu> Message-ID: <4404BF92.5040200@ieee.org> Zachary Pincus wrote: > However a new problem has me nearly tearing my hair out. Calling the > resize method on an instance of such a subclass works fine. However, > calling a method that calls 'self.resize' breaks! And worse, it > breaks in such a way that then subsequent calls to resize also break. In SVN version of numpy, there is a new keyword argument to resize (refcheck). If this keyword argument is 0 (it defaults to 1), the reference-count check is not performed. Thus, if you are sure that your array has not exposed it's memory to another object, then you can set refcheck=0 and the resize will proceed. If you really did expose your memory to another object, this could lead to segfaults in exactly the same way that exposing the memory to a Python array (array module) and then later resizing (which Python currently allows) would cause problems. Be careful... -Travis From cookedm at physics.mcmaster.ca Tue Feb 28 13:48:02 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Tue Feb 28 13:48:02 2006 Subject: [Numpy-discussion] Numpy and PEP 343 In-Reply-To: <4404A71B.10600@cox.net> (Tim Hochberg's message of "Tue, 28 Feb 2006 12:40:11 -0700") References: <4404A71B.10600@cox.net> Message-ID: Tim Hochberg writes: > > > An idea that has popped up from time to time is delaying evalution of > a complicated expressions so that the result can be computed more > efficiently. 
For instance, the matrix expression: > > a = b*c + d*e > > results in the creation of two, potentially large, temporary matrices > and also does a couple of extra loops at the C level than the > equivalent expression implemented in C would. > > The general idea has been to construct some sort of psuedo-object, > when the numerical operations are indicated, then do the actual > numerical operations at some later time. This would be very > problematic if implemented for all arrays since it would quickly > become impossible to figure out what was going on, particularly with > view semantics. However, it could result in large performance > improvements without becoming incomprehensible if implemented in small > enough chunks. > > A "straightforward" approach would look something like: > > numpy.begin_defer() # Now all numpy operations (in this thread) > are deferred > a = b*c + d*e # 'a' is a special object that holds pointers to > # 'b', 'c', 'd' and 'e' and knows what ops to > perform. > numpy.end_defer() # 'a' performs the operations and now looks like > an array > > Since 'a' knows the whole series of operations in advance it can > perform them more efficiently than would be possible using the basic > numpy machinery. Ideally, the space for 'a' could be allocated up > front, and all of the operations could be done in a single loop. In > practice the optimization might be somewhat less ambitious, depending > on how much energy people put into this. However, this approach has > some problems. One is the syntax, which clunky and a bit unsafe (a > missing end_defer in a function could cause stuff to break very far > away). The other is that I suspect that this sort of deferred > evaluation makes multiple views of an array even more likely to bite > the unwary. This is a good idea; probably a bit difficult. I don't like the global defer context though. That could get messy, especially if you start calling functions. > The syntax issue can be cleanly addressed now that PEP 343 (the 'with' > statement) is going into Python 2.5. Thus the above would look like: > > with numpy.deferral(): > a = b*c + d*e > > Just removing the extra allocation of temporary variables can result > in 30% speedup for this case[1], so the payoff would likely be large. > On the down side, it could be quite a can of worms, and would likely > require a lot of work to implement. Alternatively, make some sort of expression type: ex = VirtualExpression() ex.a = ex.b * ex.c + ex.d * ex.e then, compute = ex.compile(a=(shape_of_a, dtype_of_a), etc.....) This could return a function that would look like def compute(b, c, d, e): a = empty(shape_of_a, dtype=dtype_of_a) multiply(b, c, a) # ok, I'm making this one up :-) fused_multiply_add(d, e, a) return a a = compute(b, c, d, e) Or, it could use some sort of numeric-specific bytecode that can be interpreted quickly in C. With some sort of optimizing compiler for that bytecode it could be really fun (it could use BLAS when appropriate, for instance!). or ... use weave :-) -- |>|\/|< /--------------------------------------------------------------------------\ |David M. 
Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From tim.hochberg at cox.net Tue Feb 28 13:57:02 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Tue Feb 28 13:57:02 2006 Subject: [Numpy-discussion] missing array type In-Reply-To: References: <4403C3AB.1040600@sympatico.ca> <440449B2.8060007@sympatico.ca> <4404680F.9020006@sympatico.ca> <44048852.3080701@sympatico.ca> Message-ID: <4404C6F2.1060402@cox.net> Sasha wrote: >On 2/28/06, Colin J. Williams wrote: > > >>... >> >> >>>>Would it not be better to have def zeroes(..., zeroStrides= False): >>>>and a= zeros(..., zeroStrides= True)? >>>> >>>> >>>> >>>> >>>This is equivalent to what I proposed: xzeros(shape) and xones(shape) >>>functions as a shorthand to ndarray(shape, strides=(0,)*len(shape)) >>> >>> >>> >>> >>Wot, more names to remember? [sorry, I can't give you the graphic to go >>aling with this. :-) ] >> >> > >Oh, please - you don't have to be so emphatic. Your solution above >(adding zeroStrides parameter) would require a much more arbitrary >name to remember. Even worse, someone will write zeros(shape, int, >False, True) to mean zeros(shape, dtype=int, fortran=False, >zeroStrides=True) and anyone reading that code will have to look up >the manual to understand what each boolean means. > >Boolean parameters are generally considered bad design. For example, >it would be much better to have array(..., memory_layout='Fortran') >instead of current array(...,fortran=True). (Well, fortran=True is >not that bad, but fortran=False is really puzzling - if it is not >fortran - what is it?) Arguably even better solution would be >array(..., strides = fortran_strides(shape)), but that's a different >story. > > I agree with this. >The names xzeros and xones are a natural choice because the >functionality they provide is very similar to what xrange provides >compared to range: memory saving way to achieve the same result. > > But not this. I don't think using xrange as a template for naming anything is a good idea. If xrange were being added to Python now, it would almost certainly be called irange and live in itertools. I have a strong suspicion that the *name* xrange will go the way of the dodo eventually, although the functionality will survive in some other form. Also, while I can see how there might well be some good uses for zero-stride arrays, I'm having a hard time getting excited by by xzeros and xones. The only applications I can come up with can already be done in more efficient ways without using xones and xzeros. [Let me apologize in advance if I missed a compelling example earlier in this thread -- I just got back from vacation and I may have missed something in my email reading frenzy] -tim From zpincus at stanford.edu Tue Feb 28 13:58:02 2006 From: zpincus at stanford.edu (Zachary Pincus) Date: Tue Feb 28 13:58:02 2006 Subject: [Numpy-discussion] can't resize ndarray subclass In-Reply-To: <4404BF92.5040200@ieee.org> References: <1DED40B2-E761-4B8E-86DF-6F9C36242D0C@stanford.edu> <4404BF92.5040200@ieee.org> Message-ID: <11799E00-E0A1-459D-957E-EEC62910ACDE@stanford.edu> Thanks Travis for the replies and the new functionality in the SVN! I think I have enough (well, maybe once I figure out __array_priority__) to get a decent wiki entry for subclassing ndarray, and maybe some template subclasses that others can use. I presume __array_priority__ determines the resulting type when two different type arrays are ufunc'd together? 
The exact mechanism of which object gets selected as the parent, etc., is still unclear though. Zach On Feb 28, 2006, at 1:24 PM, Travis Oliphant wrote: > Zachary Pincus wrote: > >> However a new problem has me nearly tearing my hair out. Calling >> the resize method on an instance of such a subclass works fine. >> However, calling a method that calls 'self.resize' breaks! And >> worse, it breaks in such a way that then subsequent calls to >> resize also break. > > > In SVN version of numpy, there is a new keyword argument to resize > (refcheck). If this keyword argument is 0 (it defaults to 1), the > reference-count check is not performed. Thus, if you are sure that > your array has not exposed it's memory to another object, then you > can set refcheck=0 and the resize will proceed. > > If you really did expose your memory to another object, this could > lead to segfaults in exactly the same way that exposing the memory > to a Python array (array module) and then later resizing (which > Python currently allows) would cause problems. > > Be careful... > > -Travis > > > > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting > language > that extends applications into web and mobile media. Attend the > live webcast > and join the prime developer group breaking into this new coding > territory! > http://sel.as-us.falkag.net/sel? > cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion From tim.hochberg at cox.net Tue Feb 28 14:06:09 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Tue Feb 28 14:06:09 2006 Subject: [Numpy-discussion] can't resize ndarray subclass In-Reply-To: <4404BF92.5040200@ieee.org> References: <1DED40B2-E761-4B8E-86DF-6F9C36242D0C@stanford.edu> <4404BF92.5040200@ieee.org> Message-ID: <4404C900.4080001@cox.net> Travis Oliphant wrote: > Zachary Pincus wrote: > >> However a new problem has me nearly tearing my hair out. Calling the >> resize method on an instance of such a subclass works fine. However, >> calling a method that calls 'self.resize' breaks! And worse, it >> breaks in such a way that then subsequent calls to resize also break. > > > > In SVN version of numpy, there is a new keyword argument to resize > (refcheck). If this keyword argument is 0 (it defaults to 1), the > reference-count check is not performed. Thus, if you are sure that > your array has not exposed it's memory to another object, then you can > set refcheck=0 and the resize will proceed. I'd suggest that this get exposed as a separate function, for instance A._unchecked_resize(size). It seems much less likely that this will accidentally get called than that somone will mistakenly throw a second boolean argument into resize. -tim > If you really did expose your memory to another object, this could > lead to segfaults in exactly the same way that exposing the memory to > a Python array (array module) and then later resizing (which Python > currently allows) would cause problems. > > Be careful... > > -Travis > > > > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting > language > that extends applications into web and mobile media. Attend the live > webcast > and join the prime developer group breaking into this new coding > territory! 
> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > From oliphant.travis at ieee.org Tue Feb 28 14:15:01 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue Feb 28 14:15:01 2006 Subject: [Numpy-discussion] can't resize ndarray subclass In-Reply-To: <4404C900.4080001@cox.net> References: <1DED40B2-E761-4B8E-86DF-6F9C36242D0C@stanford.edu> <4404BF92.5040200@ieee.org> <4404C900.4080001@cox.net> Message-ID: <4404CB22.9070203@ieee.org> Tim Hochberg wrote: > > I'd suggest that this get exposed as a separate function, for instance > A._unchecked_resize(size). It seems much less likely that this will > accidentally get called than that somone will mistakenly throw a > second boolean argument into resize. > The way the resize method is written, you can't mistakenly throw in another argument. You would have to provide a "refcheck" keyword argument. I can't see that being a mistake. The resize method can be called either with a sequence or with the shape given as separate arguments, i.e. a.resize((3,2)) a.resize(3,2) Both of these are equivalent. To get the refcheck functionality you would have to explicitly provide a keyword argument. a.resize((3,2),refcheck=0) a.resize(3,2,refcheck=0) -Travis From oliphant.travis at ieee.org Tue Feb 28 14:20:00 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue Feb 28 14:20:00 2006 Subject: [Numpy-discussion] can't resize ndarray subclass In-Reply-To: <11799E00-E0A1-459D-957E-EEC62910ACDE@stanford.edu> References: <1DED40B2-E761-4B8E-86DF-6F9C36242D0C@stanford.edu> <4404BF92.5040200@ieee.org> <11799E00-E0A1-459D-957E-EEC62910ACDE@stanford.edu> Message-ID: <4404CC58.8010101@ieee.org> Zachary Pincus wrote: > Thanks Travis for the replies and the new functionality in the SVN! > > I think I have enough (well, maybe once I figure out > __array_priority__) to get a decent wiki entry for subclassing > ndarray, and maybe some template subclasses that others can use. > > I presume __array_priority__ determines the resulting type when two > different type arrays are ufunc'd together? The exact mechanism of > which object gets selected as the parent, etc., is still unclear though. When two different subclasses appear in a ufunc (or in other places in ndarray), the subclass chosen for creation is the one with highest __array_priority__. The "parent" concept is entirely separate: it is the object passed to __array_finalize__ and should be of the same type as self (or None). -Travis From stefan at sun.ac.za Tue Feb 28 14:21:01 2006 From: stefan at sun.ac.za (Stefan van der Walt) Date: Tue Feb 28 14:21:01 2006 Subject: [Numpy-discussion] setting package path Message-ID: <20060228221937.GE10590@alpha> In numpytest.py, set_package_path is provided for handling path changes while doing unit tests. It reads def set_package_path(level=1): """ Prepend package directory to sys.path. set_package_path should be called from a test_file.py that satisfies the following tree structure: <somepath>/<somedir>/test_file.py Then the first existing path name from the following list <somepath>/build/lib.<platform>-<version> <somepath>/.. is prepended to sys.path. ... However, the line that supposedly calculates "somepath/.." is d = os.path.dirname(os.path.dirname(os.path.abspath(testfile))) which calculates "somepath". Which is wrong: the docstring, the code or my interpretation?
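Concretely (an interactive sketch; the path is invented):

>>> import os.path
>>> testfile = '/somepath/somedir/test_file.py'
>>> os.path.dirname(os.path.dirname(os.path.abspath(testfile)))
'/somepath'

i.e. the two dirname calls strip test_file.py and somedir, giving somepath rather than somepath/..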
St?fan From tim.hochberg at cox.net Tue Feb 28 15:15:02 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Tue Feb 28 15:15:02 2006 Subject: [Numpy-discussion] Numpy and PEP 343 In-Reply-To: References: <4404A71B.10600@cox.net> Message-ID: <4404D92D.7000604@cox.net> David M. Cooke wrote: >Tim Hochberg writes: > > > >> >> >>An idea that has popped up from time to time is delaying evalution of >>a complicated expressions so that the result can be computed more >>efficiently. For instance, the matrix expression: >> >>a = b*c + d*e >> >>results in the creation of two, potentially large, temporary matrices >>and also does a couple of extra loops at the C level than the >>equivalent expression implemented in C would. >> >>The general idea has been to construct some sort of psuedo-object, >>when the numerical operations are indicated, then do the actual >>numerical operations at some later time. This would be very >>problematic if implemented for all arrays since it would quickly >>become impossible to figure out what was going on, particularly with >>view semantics. However, it could result in large performance >>improvements without becoming incomprehensible if implemented in small >>enough chunks. >> >>A "straightforward" approach would look something like: >> >> numpy.begin_defer() # Now all numpy operations (in this thread) >> are deferred >> a = b*c + d*e # 'a' is a special object that holds pointers to >> # 'b', 'c', 'd' and 'e' and knows what ops to >> perform. >> numpy.end_defer() # 'a' performs the operations and now looks like >> an array >> >>Since 'a' knows the whole series of operations in advance it can >>perform them more efficiently than would be possible using the basic >>numpy machinery. Ideally, the space for 'a' could be allocated up >>front, and all of the operations could be done in a single loop. In >>practice the optimization might be somewhat less ambitious, depending >>on how much energy people put into this. However, this approach has >>some problems. One is the syntax, which clunky and a bit unsafe (a >>missing end_defer in a function could cause stuff to break very far >>away). The other is that I suspect that this sort of deferred >>evaluation makes multiple views of an array even more likely to bite >>the unwary. >> >> > >This is a good idea; probably a bit difficult. > It's not original. I think this idea comes around periodically, but dies from a combination of it being nontrivial and the resulting syntax being too heavyweight. >I don't like the global >defer context though. That could get messy, especially if you start >calling functions. > > I'm not crazy about it either. You could localize it with an appropriate (ab)use of sys._getframe, but that's another potential can of worms. Something like: class deferral: frames = set() def __enter__(self): self.frame = sys._getframe(1) self.frames.add(self.frame) def __exit__(self, *args): self.frames.discard(self.frame) self.frame = None def should_defer(): return (sys._getframe(1) in deferral.frames) Then: with deferral(): #stuff Should be localized to just 'stuff', even if it calls other functions[1]. The details might be sticky though.... >>The syntax issue can be cleanly addressed now that PEP 343 (the 'with' >>statement) is going into Python 2.5. Thus the above would look like: >> >>with numpy.deferral(): >> a = b*c + d*e >> >>Just removing the extra allocation of temporary variables can result >>in 30% speedup for this case[1], so the payoff would likely be large. 
>>On the down side, it could be quite a can of worms, and would likely >>require a lot of work to implement. >> >> > >Alternatively, make some sort of expression type: > >ex = VirtualExpression() > >ex.a = ex.b * ex.c + ex.d * ex.e > >then, > >compute = ex.compile(a=(shape_of_a, dtype_of_a), etc.....) > >This could return a function that would look like > >def compute(b, c, d, e): > a = empty(shape_of_a, dtype=dtype_of_a) > multiply(b, c, a) > # ok, I'm making this one up :-) > fused_multiply_add(d, e, a) > return a > >a = compute(b, c, d, e) > > The syntax seems too heavy too me. It would be signifigantly lighter if the explicit compile step is optional, allowing: ex = VirtualExpression() ex.a = ex.b * ex.c + ex.d * ex.e a = ex(b=b, c=c, d=d, e=e) 'ex' could then figure out all of the sizes and types itself, create the function, compute the result. The created function would be cached and whenever the input parameters matched it would just be reused, so there shouldn't be too much more overhead than with compiled version you suggest. The syntax is still heavy relative to the 'with' version though. >Or, it could use some sort of numeric-specific bytecode that can be >interpreted quickly in C. With some sort of optimizing compiler for >that bytecode it could be really fun (it could use BLAS when >appropriate, for instance!). > >or ... use weave :-) > > I'll have to look at weave again. Last time I looked at it (quite a while ago) it didn't work for me. I can't recall if it was a licensing issue or it didn't work with my compiler or what, but I decided I couldn't use it. -tim [1] Here's an example that fakes with and tests deferral: import sys class deferral: frames = set() def __enter__(self): self.frame = sys._getframe(1) self.frames.add(self.frame) def __exit__(self, *args): self.frames.discard(self.frame) self.frame = None def should_defer(): return (sys._getframe(1) in deferral.frames) def f(n): if not n: return if n % 4: print "should_defer() =", should_defer(), "for n =", n f(n-1) else: # This is a rough translation of: # with deferral(): # print "should_defer() =", should_defer(), "in f" # g(n-1) d = deferral() d.__enter__() try: print "should_defer() =", should_defer(), "for n =", n f(n-1) finally: d.__exit__(None, None, None) f(10) From tim.hochberg at cox.net Tue Feb 28 15:20:03 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Tue Feb 28 15:20:03 2006 Subject: [Numpy-discussion] Numpy and PEP 343 In-Reply-To: <4404D92D.7000604@cox.net> References: <4404A71B.10600@cox.net> <4404D92D.7000604@cox.net> Message-ID: <4404DA6D.70703@cox.net> Tim Hochberg wrote: [SNIP] Ugh. This last part got mangled somehow. 
> > > [1] Here's an example that fakes with and tests deferral: > > > > import sys > > class deferral: > frames = set() > def __enter__(self): > self.frame = sys._getframe(1) > self.frames.add(self.frame) > def __exit__(self, *args): > self.frames.discard(self.frame) > self.frame = None > def should_defer(): > return (sys._getframe(1) in deferral.frames) > [This got all jammed together, sorry] > def f(n): > if not n: > return > if n % 4: > print "should_defer() =", should_defer(), "for n =", n > f(n-1) > else: > # This is a rough translation of: > # with deferral(): > # print "should_defer() =", should_defer(), "in f" > # g(n-1) > d = deferral() > d.__enter__() > try: > print "should_defer() =", should_defer(), "for n =", n > f(n-1) > finally: > d.__exit__(None, None, None) > > > f(10) > > > > > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting > language > that extends applications into web and mobile media. Attend the live > webcast > and join the prime developer group breaking into this new coding > territory! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > From cookedm at physics.mcmaster.ca Tue Feb 28 15:30:05 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Tue Feb 28 15:30:05 2006 Subject: [Numpy-discussion] missing array type In-Reply-To: <4404C6F2.1060402@cox.net> (Tim Hochberg's message of "Tue, 28 Feb 2006 14:56:02 -0700") References: <4403C3AB.1040600@sympatico.ca> <440449B2.8060007@sympatico.ca> <4404680F.9020006@sympatico.ca> <44048852.3080701@sympatico.ca> <4404C6F2.1060402@cox.net> Message-ID: Tim Hochberg writes: > > I don't think using xrange as a template for naming anything is a good > idea. If xrange were being added to Python now, it would almost > certainly be called irange and live in itertools. I have a strong > suspicion that the *name* xrange will go the way of the dodo > eventually, although the functionality will survive in some other > form. In Python 3.0 it'll be named "range" :-) -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From pgmdevlist at mailcan.com Tue Feb 28 15:56:05 2006 From: pgmdevlist at mailcan.com (pierregm) Date: Tue Feb 28 15:56:05 2006 Subject: [Numpy-discussion] Re: [SciPy-user] Messing with missing values In-Reply-To: References: <200602271658.44855.pgmdevlist@mailcan.com> Message-ID: <200602281855.20239.pgmdevlist@mailcan.com> Folks, Following Sasha's recommendation, I added a short list of features yet missing in MA to the wiki page: http://projects.scipy.org/scipy/numpy/wiki/MaskedArray The list corresponds to the features *I* miss (I doubt I'm the only one) and for which I have/had to find a workaround. I tried to organize the features by potential problems/fixes. I'll add more missing features as I run into them. I also attached an example of implementation of `std` and `median` for masked arrays. It's a bit crude, it's not fully tested, it's def'ny not in a svn diff format (sorry about that, I need to figure this one out), but it does the job I wanted it to. http://projects.scipy.org/scipy/numpy/attachment/wiki/MaskedArray/ma_examples.py for Of course, your feedback is more than welcome. 
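As a taste, here is a rough sketch of the median idea (this is not the attached code; it assumes a masked array with a .compressed() method and ignores the axis keyword):

import numpy

def masked_median(x):
    # drop the masked entries, then take the middle value(s)
    good = numpy.sort(x.compressed())
    n = len(good)
    if n % 2:
        return good[n // 2]
    return 0.5 * (good[n // 2 - 1] + good[n // 2])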
From cjw at sympatico.ca  Tue Feb 28 17:01:13 2006
From: cjw at sympatico.ca (Colin J. Williams)
Date: Tue Feb 28 17:01:13 2006
Subject: [Numpy-discussion] subclassing ndarray
In-Reply-To: <4403E9CA.8050506@ieee.org>
References: <43FF2C92.3060304@sympatico.ca> <20060224191426.GD21117@alpha> <43FFB580.80307@ieee.org> <20060225083924.GF21117@alpha> <44035227.9010609@ee.byu.edu> <4403C20A.5060603@sympatico.ca> <4403E9CA.8050506@ieee.org>
Message-ID: <4404F243.5030707@sympatico.ca>

Travis Oliphant wrote:

>
>> Travis Oliphant wrote:
>>
>>> Stefan van der Walt wrote:
>>>
>>>>> The __init__ and __new__ methods are not called because they may
>>>>> have arbitrary signatures. Instead, the __array_finalize__
>>>>> method is always called. So, you should use that instead of
>>>>> __init__.
>>>>>
>>> This is now true in SVN. Previously, __array_finalize__ was not
>>> called if the "parent" was NULL. However, now, it is still called
>>> with None as the value of the first argument.
>>>
>>> Thus __array_finalize__ will be called whenever
>>> ndarray.__new__(<subclass>, ...) is called.
>>
>> Why this change in style from the common Python idiom of __new__,
>> __init__, with the same signature, to __new__, __array_finalize__ with
>> possibly different signatures?
>>
> I don't see it as a change in style but adding a capability to the
> ndarray subclass. The problem is that arrays can be created in many
> ways (slicing, ufuncs, etc.). Not all of these ways should go through
> the __new__/__init__-style creation mechanism. Try inheriting
> from the float builtin and adding attributes. Then add your float to an
> instance of your new class and see what happens.

Yes, I've tried this with ndarray - it didn't work. Later, I realized
that it wasn't a good thing to try.

Colin W.

> You will get a float-type on the output. This is the essence of
> Paul's insight that sub-classing is rarely useful because you end up
> having to re-define all the operators anyway to return the value that
> you want. He knows whereof he speaks as well, because he wrote MA and
> UserArray and had experience with Python sub-classing.
> I wanted a mechanism to make it easier to sub-class arrays and have
> the operators return your object if possible (including all of its
> attributes).
>
> Thus,
>
>     __array_priority__ (a floating-point attribute)
>     __array_finalize__ (a method called on internal construction of
>                         the array wrapper)
>
> were invented (along with __array_wrap__, which any class can define
> to have its objects survive ufuncs).
> It was easy enough to see where to call __array_finalize__ in the
> C-code, if somewhat difficult to explain (and to get exception handling
> to work, because of my initial over-thinking).
> The signature is just
>
>     __array_finalize__(self, parent):
>         return
>
> i.e. any return value is ignored (but exceptions are caught).
>
> I've used the feature successfully on at least 3 subclasses (chararray,
> memmap, and matrix) and so I'm actually pretty happy with it.
> __new__ and __init__ are still relevant for constructing your
> brand-new object. The __array_finalize__ function is just what the
> internal constructor that actually allocates memory will always call to
> let you set final attributes *every* time your sub-class gets created.
>
> -Travis
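A minimal sketch of the mechanism Travis describes, in the shape later
numpy documentation settled on; the InfoArray name and its info attribute
are illustrative only:

import numpy as np

class InfoArray(np.ndarray):
    """ndarray subclass carrying one extra attribute (a sketch)."""
    def __new__(cls, input_array, info=None):
        # view the input as our subclass; __array_finalize__ runs here
        obj = np.asarray(input_array).view(cls)
        obj.info = info
        return obj
    def __array_finalize__(self, parent):
        # called on explicit construction, view casting and slicing;
        # parent is None only for explicit construction
        if parent is None:
            return
        self.info = getattr(parent, 'info', None)

a = InfoArray(np.arange(5), info='metres')
b = a[1:3]          # slicing goes through __array_finalize__
print(b.info)       # -> 'metres'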
From paul at pfdubois.com  Tue Feb 28 17:20:16 2006
From: paul at pfdubois.com (Paul F. Dubois)
Date: Tue Feb 28 17:20:16 2006
Subject: [Numpy-discussion] Numpy and PEP 343
In-Reply-To: References: <4404A71B.10600@cox.net>
Message-ID: <4404F6B3.9080809@pfdubois.com>

You're reinventing C++ expression templates, although since Python is
dynamically typed you don't need templates. The crucial feature in C++
that lets it all work is that you can override the action for assignment.

a = b*c + d*e

If we could realize we were at the "equals" sign we could evaluate the
RHS, and assign it to a. This is not possible in Python; to make it
possible would require slowing down regular assignment, which is perhaps
a definition of bad.

a[...] = RHS

could be overridden but it is ugly and 'naive' users will forget.

a := RHS

could be added to the language with the semantics that it tries to do
a.__assignment__(RHS), but Guido told me "no" long ago. (:->. Also, you
might forget the : in :=.

a.assign(RHS)

would also work, but then the original statement would produce a strange
object with surprising results.

David M. Cooke wrote:
> Tim Hochberg writes:
>
>> An idea that has popped up from time to time is delaying evaluation of
>> complicated expressions so that the result can be computed more
>> efficiently. For instance, the matrix expression:
>>
>> a = b*c + d*e
>>
>> results in the creation of two, potentially large, temporary matrices
>> and also does a couple more loops at the C level than the
>> equivalent expression implemented in C would.
>>
>> The general idea has been to construct some sort of pseudo-object,
>> when the numerical operations are indicated, then do the actual

From ndarray at mac.com  Tue Feb 28 17:51:20 2006
From: ndarray at mac.com (Sasha)
Date: Tue Feb 28 17:51:20 2006
Subject: [Numpy-discussion] Numpy and PEP 343
In-Reply-To: <4404F6B3.9080809@pfdubois.com>
References: <4404A71B.10600@cox.net> <4404F6B3.9080809@pfdubois.com>
Message-ID:

Lazy evaluation has been part of many array languages since the early days
of APL (which makes this idea almost 50 years old). I was entertaining an
idea of bringing lazy evaluation to Python myself and concluded that there
are two places where it might fit:

1. At the level of the Python optimizer: a * x + y, for example, can be
translated into a call to axpy if a, x and y are known to be arrays. This
approach quickly brings you to the optional static typing idea.

2. Overload arithmetic operators for ufunc objects. This will allow some
form of tacit programming, and you would be able to write

f = multiply + multiply
f(x, y, z, t)

and have it evaluated without temporaries.

Both of these ideas are of the pie-in-the-sky variety.
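One possible reading of the second idea, sketched with invented names: a
callable built from two ufuncs that reuses its first intermediate as the
output buffer, so x*y + z*t allocates one temporary fewer than the plain
expression would:

import numpy as np

class UFuncSum(object):
    """f = UFuncSum(np.multiply, np.multiply) makes f(x, y, z, t)
    behave like x*y + z*t with one temporary fewer (a sketch)."""
    def __init__(self, op1, op2):
        self.op1, self.op2 = op1, op2
    def __call__(self, x, y, z, t):
        out = self.op1(x, y)           # first intermediate
        tmp = self.op2(z, t)           # second intermediate
        return np.add(out, tmp, out)   # add in place, reusing `out`

f = UFuncSum(np.multiply, np.multiply)
x = np.arange(4.)
print(f(x, x, x, x))                   # -> [ 0.  2.  8. 18.]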
On 2/28/06, Paul F. Dubois wrote:
> You're reinventing C++ expression templates, although since Python is
> dynamically typed you don't need templates. The crucial feature in C++
> that lets it all work is that you can override the action for assignment.
>
> a = b*c + d*e
>
> If we could realize we were at the "equals" sign we could evaluate the
> RHS, and assign it to a. This is not possible in Python; to make it
> possible would require slowing down regular assignment, which is perhaps
> a definition of bad.
>
> a[...] = RHS
>
> could be overridden but it is ugly and 'naive' users will forget.
>
> a := RHS
>
> could be added to the language with the semantics that it tries to do
> a.__assignment__(RHS), but Guido told me "no" long ago. (:->. Also, you
> might forget the : in :=.
>
> a.assign(RHS)
>
> would also work, but then the original statement would produce a strange
> object with surprising results.
>
> David M. Cooke wrote:
> > Tim Hochberg writes:
> >
> >> An idea that has popped up from time to time is delaying evaluation of
> >> complicated expressions so that the result can be computed more
> >> efficiently. For instance, the matrix expression:
> >>
> >> a = b*c + d*e
> >>
> >> results in the creation of two, potentially large, temporary matrices
> >> and also does a couple more loops at the C level than the
> >> equivalent expression implemented in C would.
> >>
> >> The general idea has been to construct some sort of pseudo-object,
> >> when the numerical operations are indicated, then do the actual

From charlesr.harris at gmail.com  Tue Feb 28 19:42:01 2006
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue Feb 28 19:42:01 2006
Subject: [Numpy-discussion] Numpy and PEP 343
In-Reply-To: <4404F6B3.9080809@pfdubois.com>
References: <4404A71B.10600@cox.net> <4404F6B3.9080809@pfdubois.com>
Message-ID:

On 2/28/06, Paul F. Dubois wrote:
> You're reinventing C++ expression templates, although since Python is

Yes indeedy, and although they might work well enough, they produce the
most godawful-looking assembly code I have ever looked at. The boost ublas
template library takes this approach and I regard it more as a
meta-compiler research project written in a template language than as an
array library. I think that there are two main users of arrays: those who
want quick and convenient (optimize programmer time) and those who want
super-fast execution (optimize cpu time). Because a human can generally do
a better job and knows more about the intent than the average compiler, I
think that the best bet is to provide the tools needed to write efficient
code if the programmer so desires, but otherwise concentrate on
convenience. When absolute speed is essential it is worth budgeting
programmer time to achieve it, but generally I don't think that is the
case.

Chuck

From oliphant.travis at ieee.org  Tue Feb 28 19:55:04 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Tue Feb 28 19:55:04 2006
Subject: [Numpy-discussion] Numpy and PEP 343
In-Reply-To: References: <4404A71B.10600@cox.net> <4404F6B3.9080809@pfdubois.com>
Message-ID: <44051AC1.9000908@ieee.org>

Charles R Harris wrote:

>Yes indeedy, and although they might work well enough, they produce the
>most godawful-looking assembly code I have ever looked at. The boost
>ublas template library takes this approach and I regard it more as a
>meta-compiler research project written in a template language than as
>an array library. I think that there are two main users of arrays:
>those who want quick and convenient (optimize programmer time) and
>those who want super-fast execution (optimize cpu time).
>Because a human can generally do a better job and knows more about the
>intent than the average compiler, I think that the best bet is to provide
>the tools needed to write efficient code if the programmer so desires,
>but otherwise concentrate on convenience. When absolute speed is
>essential it is worth budgeting programmer time to achieve it, but
>generally I don't think that is the case.
>

I think this is ultimately why nothing has been done except to make it
easier and easier to write compiled code that gets called from Python.
I'm sure most have heard that ctypes will be added to Python 2.5. This
will make it very easy to write a C function to do what you want and just
call it from Python.

Weave can still help with the "auto-compilation" of the specific library
for your type. Ultimately such code will be faster than NumPy can ever be.

-Travis
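As a concrete sketch of the ctypes route Travis mentions: the library
name, path and axpy function below are all invented, and the array
.ctypes helper used here is the one numpy acquired around version 1.0, so
the details postdate this thread.

# The C side, compiled separately to a shared library, might be just:
#
#     /* axpy.c:  gcc -shared -fPIC axpy.c -o libaxpy.so */
#     void axpy(int n, double a, const double *x, double *y) {
#         int i;
#         for (i = 0; i < n; i++) y[i] += a * x[i];
#     }
import ctypes
import numpy as np

lib = ctypes.CDLL('./libaxpy.so')          # hypothetical library name
dbl_p = ctypes.POINTER(ctypes.c_double)

x = np.arange(10.0)
y = np.ones(10)
lib.axpy(ctypes.c_int(x.size), ctypes.c_double(2.0),
         x.ctypes.data_as(dbl_p), y.ctypes.data_as(dbl_p))
# y is modified in place: y[i] == 1 + 2*x[i]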
From eric at enthought.com  Tue Feb 28 21:28:01 2006
From: eric at enthought.com (eric jones)
Date: Tue Feb 28 21:28:01 2006
Subject: [Numpy-discussion] Numpy and PEP 343
In-Reply-To: <44051AC1.9000908@ieee.org>
References: <4404A71B.10600@cox.net> <4404F6B3.9080809@pfdubois.com> <44051AC1.9000908@ieee.org>
Message-ID: <440530AC.2010000@enthought.com>

Travis Oliphant wrote:
> Weave can still help with the "auto-compilation" of the specific
> library for your type. Ultimately such code will be faster than NumPy
> can ever be.

Yes. weave.blitz() can be used to do the equivalent of this lazy
evaluation for you in many cases without much effort. For example:

import weave
from scipy import arange
a = arange(1e7)
b = arange(1e7)
c = 2.0*a + 3.0*b
# or with weave
weave.blitz("c=2.0*a+3.0*b")

As Paul D. mentioned, what Tim outlined is essentially expression
templates in C++. blitz++ (http://www.oonumerics.org/blitz/) is a C++
expression-template library for array operations, and weave.blitz
translates a Numeric expression into C++ blitz code. For the example
above, you get about a factor of 4 speed-up on large arrays. (Notice, the
first time you run the example it will be much slower because of compile
time. Use timings from subsequent runs.)

C:\temp>weave_time.py
Expression: c=2.0*a+3.0*b
Numeric: 0.678311899322
Weave: 0.162177084984
Speed-up: 4.18253848494

This isn't as good as you can do with hand-coded C, but it isn't so bad
for the effort involved...

I have wished for time to write a weave.f90("c=2.0*a+3.0*b") function
because it is very feasible. My guess from simple experimentation is that
it would be about as fast as hand-coded C for this sort of expression.
This might give us another factor of two or three in execution speed, and
the compile times would come down from tens of seconds to tenths of
seconds.

Incidentally, the weave calling overhead is large enough to limit its
benefit on small arrays. Pat Miller pointed out some ways to get rid of
that overhead, and I even wrote some experimental fixes to weave that
helped out a lot. Alas, they never were completed fully. Revisiting these
would make weave.blitz useful for small arrays as well. Fixing these is
probably more work than writing weave.f90().

All this to say, I think weave basically accomplishes what Tim wants with
a different mechanism (letting C++ compilers do the optimization instead
of writing this optimization at the Python level). It does require a
compiler on client machines in its current form (even that can be
fixed...), but I think it might prove faster than re-implementing a
numeric expression compiler at the Python level (though that sounds fun
as well).

see ya,
eric

###############################
# weave_time.py
###############################
import timeit

array_size = 1e7
iterations = 10

setup = """\
import weave
from scipy import arange
a = arange(%f)
b = arange(%f)
c = arange(%f)  # needed by weave test
""" % (array_size, array_size, array_size)

expr = "c=2.0*a+3.0*b"
print "Expression:", expr

numeric_timer = timeit.Timer(expr, setup)
numeric_time = numeric_timer.timeit(number=iterations)
print "Numeric:", numeric_time/iterations

weave_timer = timeit.Timer('weave.blitz("%s")' % expr, setup)
weave_time = weave_timer.timeit(number=iterations)
print "Weave:", weave_time/iterations
print "Speed-up:", numeric_time/weave_time

> -Travis

From pearu at scipy.org  Tue Feb 28 22:35:02 2006
From: pearu at scipy.org (Pearu Peterson)
Date: Tue Feb 28 22:35:02 2006
Subject: [Numpy-discussion] setting package path
In-Reply-To: <20060228221937.GE10590@alpha>
References: <20060228221937.GE10590@alpha>
Message-ID:

On Wed, 1 Mar 2006, Stefan van der Walt wrote:

> In numpytest.py, set_package_path is provided for handling path
> changes while doing unit tests. It reads
>
> def set_package_path(level=1):
>     """ Prepend package directory to sys.path.
>
>     set_package_path should be called from a test_file.py that
>     satisfies the following tree structure:
>
>       <somepath>/<somedir>/test_file.py
>
>     Then the first existing path name from the following list
>
>       <somepath>/build/lib.<platform>-<version>
>       <somepath>/..
>
>     is prepended to sys.path.
>     ...
>
> However, the line that supposedly calculates "somepath/.." is
>
>     d = os.path.dirname(os.path.dirname(os.path.abspath(testfile)))
>
> which calculates "somepath". Which is wrong: the docstring, the code
> or my interpretation?

You have to read also the following code:

    d1 = os.path.join(d,'build','lib.%s-%s'%(get_platform(),sys.version[:3]))
    if not os.path.isdir(d1):
        d1 = os.path.dirname(d)  # <- here we get "somepath/.."
    sys.path.insert(0,d1)

Pearu

From oliphant.travis at ieee.org  Tue Feb 28 23:16:07 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Tue Feb 28 23:16:07 2006
Subject: [Numpy-discussion] Re: [SciPy-user] Table like array
In-Reply-To: <16761e100602282240y5bcf869fme9dd2f42771066c4@mail.gmail.com>
References: <16761e100602282240y5bcf869fme9dd2f42771066c4@mail.gmail.com>
Message-ID: <44054A19.6040202@ieee.org>

Michael Sorich wrote:

> Hi,
>
> I am looking for a table like array. Something like a 'data frame'
> object to those familiar with the statistical languages R and Splus.
> This is mainly to hold and manipulate 2D spreadsheet like data, which
> tends to be of relatively small size (compared to what many people
> seem to use numpy for), heterogenous, have column and row names, and
> often contains missing data.

You could subclass the ndarray to produce one of these fairly easily, I
think. The missing data item could be handled by a mask stored along with
the array (or even in the array itself).
Or you could use a masked array as your core object (though I'm not sure
how it handles the arbitrary (i.e. record-like) data-types yet).

Alternatively, and probably the easiest way to get started, you could
just create your own table-like class and use simple 1-d arrays or 1-d
masked arrays for each of the columns --- this has always been a way to
store record-like tables. It really depends on what you want the
data-frames to be able to do and what you want them to "look like."

> A RecArray seems potentially useful, as it allows different fields to
> have different data types and holds the name of the field. However it
> doesn't seem easy to manipulate the data. Or perhaps I am simply
> having difficulty finding documentation on there features.

Adding a new column/field means basically creating a new array with a new
data-type and copying data over into the already-defined fields.
Data-types always have a fixed number of bytes per item. What those bytes
represent can be quite arbitrary, but it's always fixed. So, it is always
"more work" to insert a new column. You could make that seamless in your
table class so the user doesn't see it, though.

You'll want to thoroughly understand the dtype object, including its
attributes and methods, particularly the fields attribute of the dtype
object.

> eg adding a new column/field (and to a lesser extent a new row/record)
> to the recarray

Adding a new row or record is actually similar, because once an array is
created it is usually resized by creating another array and copying the
old array into it in the right places.

> Changing the field/column names
> make a new table by selecting a subset of fields/columns. (you can
> select a single field/column, but not multiple).

Right. So far you can't select multiple columns. It would be possible to
add this feature with a little bit of effort if there were a strong
demand for it, but it would be much easier to do it in your subclass
and/or container class.

How many people would like to see x['f1','f2','f5'] return a new array
with a new data-type descriptor constructed from the provided fields?

> It would also be nice for the table to be able to deal easily with
> masked data (I have not tried this with recarray yet) and perhaps also
> to be able to give the rows/records unique ids that could be used to
> select the rows/records (in addition to the row/record index), in the
> same way that the fieldnames can select the fields.

Adding fieldnames to the "rows" is definitely something that a subclass
would be needed for. I'm not sure how you would even propose to select
using row names. Would you also use getitem semantics?

> Can anyone comment on this issue? Particularly whether code exists for
> this purpose, and if not ideas about how best to go about developing
> such a Table like array (this would need to be limited to python
> programing as my ability to program in c is very limited).

I don't know of code that already exists for this, but I don't think it
would be too hard to construct your own data-frame object.

I would probably start with an implementation that just used standard
arrays of a particular type to represent the internal columns and then
handle the indexing using your own over-riding of the __getitem__ and
__setitem__ special methods. This would be the easiest to get working, I
think.

-Travis
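A bare-bones sketch of the container Travis outlines, with invented names
and indexing rules, and without the masking and row-label features
discussed above:

import numpy as np

class Table(object):
    """Toy data-frame: named 1-d columns of equal length (a sketch)."""
    def __init__(self, **columns):
        self.columns = dict((k, np.asarray(v)) for k, v in columns.items())
        if len(set(len(c) for c in self.columns.values())) > 1:
            raise ValueError("columns must all have the same length")

    def __getitem__(self, key):
        if isinstance(key, str):                     # one column by name
            return self.columns[key]
        if isinstance(key, tuple):                   # several columns
            return Table(**dict((k, self.columns[k]) for k in key))
        return Table(**dict((k, v[key])              # row index or slice
                            for k, v in self.columns.items()))

    def __setitem__(self, name, values):             # add/replace a column
        self.columns[name] = np.asarray(values)

t = Table(f1=np.arange(3), f2=np.ones(3))
t['f3'] = [10, 20, 30]      # adding a column is just a dict insert here
sub = t['f1', 'f3']         # the multi-column selection discussed above
rows = t[1:]                # slicing rows applies to every column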