From falted at openlc.org Fri Jan 2 07:53:03 2004 From: falted at openlc.org (Francesc Alted) Date: Fri Jan 2 07:53:03 2004 Subject: [Numpy-discussion] UInt64 support in FreeBSD? Message-ID: <200401021653.10042.falted@openlc.org> Hi, Some people wanting to use pytables on FreeBSD would like to have UInt64 support, but numarray lacks support for it on this platform. As FreeBSD uses gcc compiler, I think it's just a matter to add an "freebsd4-i386" entry in generate.py. Todd, may you please add such a support? Regarding to the other parameters, LP64 (long pointer), HAS_FLOAT128 (128 floating point), I'm not sure, but perhaps they maybe similar to the "linux2" platform. Anyone using FreeBSD can give more hints? Happy new year!, -- Francesc Alted From jmiller at stsci.edu Fri Jan 2 08:17:01 2004 From: jmiller at stsci.edu (Todd Miller) Date: Fri Jan 2 08:17:01 2004 Subject: [Numpy-discussion] UInt64 support in FreeBSD? In-Reply-To: <200401021653.10042.falted@openlc.org> References: <200401021653.10042.falted@openlc.org> Message-ID: <1073060150.3451.33.camel@localhost.localdomain> On Fri, 2004-01-02 at 10:53, Francesc Alted wrote: > Hi, > > Some people wanting to use pytables on FreeBSD would like to have UInt64 > support, but numarray lacks support for it on this platform. As FreeBSD uses > gcc compiler, I think it's just a matter to add an "freebsd4-i386" entry in > generate.py. > > Todd, may you please add such a support? Sure, I'll add it, but I have no means to test it... I also changed the default platform to include UInt64, since with the exception of MSVC, it's supported everywhere I've looked. Todd > Regarding to the other parameters, > LP64 (long pointer), HAS_FLOAT128 (128 floating point), I'm not sure, but > perhaps they maybe similar to the "linux2" platform. Anyone using FreeBSD > can give more hints? > > Happy new year!, -- Todd Miller From edcjones at erols.com Fri Jan 2 18:16:01 2004 From: edcjones at erols.com (Edward C. Jones) Date: Fri Jan 2 18:16:01 2004 Subject: [Numpy-discussion] Update for IM. a small image processing system Message-ID: <3FF624C5.7010400@erols.com> IM I have uploaded a new version of my small image processing system IM to "http://members.tripod.com/~edcjones/IM-01.01.04.tar.gz". Most of the code in IM (pronounced "I'm") is inferior to "nd_image" so I will eventually convert it all to "nd_image". Some features are: Wrappers for some useful functions in the numarray API. NA_GetType NA_TypeName NA_GetTypeFromTypeno NA_TypenoFromType SafeCastCheck Standardized parameters Module(arrin), TypeCode(arrin), Width(arrin), Height(arrin), Bands(arrin), Mode(arrin), NatypeOrMode(arrin), and BytesPerItem(arrin) Open and Save ArrayToArrayCast Converts between array types and formats. Out of range values are clipped. Some additions to numarray BlockReduce, MultiReduce, BlockMean, CountNonZero, CountZeros, Stretch (grey level range), Zoom, Shrink, and Saturate. Convert an array to a list of (array[i,j], i, j) or a dictionary with entries d[(i,j)] = array[i,j]. Sliding window operators including MeanX and HaarX which have masking. Only the unmasked pixels are averaged when finding a mean. For MeanX and HaarX, a border is added to the image. The pixels in the border become the masked pixels. THOUGHTS There are many open source image processing systems but most of them get only to the Canny edge operator and then stop. 
A sample of the better ones are: ImageMagick http://www.imagemagick.org/ OpenCV http://www.intel.com/research/mrl/research/opencv/ Xite http://www.ifi.uio.no/forskning/grupper/dsb/Software/Xite/ VXL http://vxl.sourceforge.net/ Gandalf http://sourceforge.net/projects/gandalf-library/ imgSeek http://imgseek.sourceforge.net/ And then there is the huge and hard to use "Image Understanding Environment" (IUE) at "http://www.aai.com/AAI/IUE/IUE.html". Has anyone used this? A good starting point is "The Computer Vision Homepage" at "http://www-2.cs.cmu.edu/~cil/vision.html". At this site there is a list of published software. A well-known example is the Kanade-Lucas-Tomasi Feature Tracker coded by Stan Birchfield at "http://vision.stanford.edu/~birch/klt/". Thanks. Note how short the software list is compared with the size of the computer vision lterature. Why does so little software exists for the more advanced parts of computer vision? I feel this is mostly because academic researchers seldom publish their software. In some cases (for example, face recognition software) there are financial motives. In most cases. I suspect that there is no pressure on the researchers from journals or department chairmen to publish the software. So they avoid the work of making their software presentable by not releasing it. The result are many unreproduced experiments and slow transitions of new algorithms out of academia. A good computer vision system Has an easy to use and widely used scripting language. Python Has powerful array processing capabilities. numarray, nd_image Wraps a variety of other computer vision systems. The wrapping process should be straightforward. SWIG, Pyrex, Psyco, ..., and the Python API. Provides a uniform interface to its components. Is used by many people. From verveer at embl-heidelberg.de Mon Jan 5 04:20:11 2004 From: verveer at embl-heidelberg.de (Peter Verveer) Date: Mon Jan 5 04:20:11 2004 Subject: [Numpy-discussion] Update for IM. a small image processing system In-Reply-To: <3FF624C5.7010400@erols.com> References: <3FF624C5.7010400@erols.com> Message-ID: <200401051243.50605.verveer@embl-heidelberg.de> On Saturday 03 January 2004 03:11, Edward C. Jones wrote: > IM > > I have uploaded a new version of my small image processing system IM to > "http://members.tripod.com/~edcjones/IM-01.01.04.tar.gz". Most of the code > in IM (pronounced "I'm") is inferior to "nd_image" so I will eventually > convert it all to "nd_image". I had a look and I guess that indeed you could use the nd_image package for some low level stuff (I am the author of nd_image). nd_image is however also still being developed and I am looking for directions to further work on. I wondered if there is anything you would like to see in there? > THOUGHTS > > There are many open source image processing systems but most of them get > only to the Canny edge operator and then stop. A sample of the better ones > are: > > ImageMagick http://www.imagemagick.org/ > OpenCV http://www.intel.com/research/mrl/research/opencv/ > Xite > http://www.ifi.uio.no/forskning/grupper/dsb/Software/Xite/ VXL > http://vxl.sourceforge.net/ > Gandalf http://sourceforge.net/projects/gandalf-library/ > imgSeek http://imgseek.sourceforge.net/ I think not all of these are general image processing systems and often a bit limited. One problem that I have with most of these packages is that they stop at processing 8bit or 16bit two-dimensional images. That is a limit for quite a lot of image analysis research, for instance medical imaging. 
That is why numarray is so great, it supports multi-dimensional arrays of arbritrary type. nd_image is designed to support multiple dimensions and any data type. That is not always easy and may prevent some optimizations, but I think it is an important feature. That idea is of course not new, matlab is starting to support multi-dimensional image routines and I am aware of at least one C library that does this, although it is not free software: http://www.ph.tn.tudelft.nl/DIPlib/ > And then there is the huge and hard to use "Image Understanding > Environment" (IUE) at "http://www.aai.com/AAI/IUE/IUE.html". Has anyone > used this? The website appears to updated last in 1999, which is not encouraging. Looks hideously complex too. > A good starting point is "The Computer Vision Homepage" at > "http://www-2.cs.cmu.edu/~cil/vision.html". At this site there is a list of > published software. A well-known example is the Kanade-Lucas-Tomasi Feature > Tracker coded by Stan Birchfield at > "http://vision.stanford.edu/~birch/klt/". Thanks. Note how short the > software list is compared with the size of the computer vision lterature. > > Why does so little software exists for the more advanced parts of computer > vision? I feel this is mostly because academic researchers seldom publish > their software. In some cases (for example, face recognition software) > there are financial motives. In most cases. I suspect that there is no > pressure on the researchers from journals or department chairmen to publish > the software. So they avoid the work of making their software presentable > by not releasing it. The result are many unreproduced experiments and slow > transitions of new algorithms out of academia. This is certainly true. I know from experience that often you simply cannot afford to design and maintain a software package after you came up with something new and published it. So a lot of things never leave the laboratory simply because it is hard to do properly. I hope that having a system around like numarray with packages will help. > A good computer vision system > Has an easy to use and widely used scripting language. > Python > Has powerful array processing capabilities. > numarray, nd_image > Wraps a variety of other computer vision systems. The wrapping process > should be straightforward. > SWIG, Pyrex, Psyco, ..., and the Python API. > Provides a uniform interface to its components. > Is used by many people. I intend to develop nd_image further as a basic component for multidimensional image analysis. It would be great if it would get picked up to be part of a system like to propose. Maybe in the future SciPy could play that role. What I would like to hear from people that use this type of software is what kind of basic operations you would like to see become part of nd_image. That will help me to further develop the package. Contributed code is obviously also welcome. Peter -- Dr. Peter J. Verveer Cell Biology and Cell Biophysics Programme European Molecular Biology Laboratory Meyerhofstrasse 1 D-69117 Heidelberg Germany Tel. : +49 6221 387245 Fax : +49 6221 387306 From edcjones at erols.com Mon Jan 5 16:25:01 2004 From: edcjones at erols.com (Edward C. Jones) Date: Mon Jan 5 16:25:01 2004 Subject: [Numpy-discussion] Update for IM. 
a small image processing system In-Reply-To: <200401051243.50605.verveer@embl-heidelberg.de> References: <3FF624C5.7010400@erols.com> <200401051243.50605.verveer@embl-heidelberg.de> Message-ID: <3FF9FF44.8030606@erols.com> Peter Verveer wrote: >On Saturday 03 January 2004 03:11, Edward C. Jones wrote: > > >> IM >> >>I have uploaded a new version of my small image processing system IM to >>"http://members.tripod.com/~edcjones/IM-01.01.04.tar.gz". Most of the code >>in IM (pronounced "I'm") is inferior to "nd_image" so I will eventually >>convert it all to "nd_image". >> >> > >I had a look and I guess that indeed you could use the nd_image package for >some low level stuff (I am the author of nd_image). nd_image is however also >still being developed and I am looking for directions to further work on. I >wondered if there is anything you would like to see in there? > > Thanks for your response. I have put a slightly revised version of IM on my web page "http://members.tripod.com/~edcjones/". The new version includes functions, written in C, for slicing arrays. A cople of things I would like to see: The ability to read and write a variety of image formats. ImageMagick has a good set. All of ImageMagick should be wrapped. The Canny edge operator along with code for generating polygonal approximations to edges. See OpenCV. Do you have some examples of algorithms for multi-dimensional images that you think should be put in nd_image? Thanks, Ed Jones From verveer at embl-heidelberg.de Tue Jan 6 04:31:00 2004 From: verveer at embl-heidelberg.de (Peter Verveer) Date: Tue Jan 6 04:31:00 2004 Subject: [Numpy-discussion] Update for IM. a small image processing system In-Reply-To: <3FF9FF44.8030606@erols.com> References: <3FF624C5.7010400@erols.com> <200401051243.50605.verveer@embl-heidelberg.de> <3FF9FF44.8030606@erols.com> Message-ID: <200401061330.18932.verveer@embl-heidelberg.de> Hi Ed, > A cople of things I would like to see: > > The ability to read and write a variety of image formats. That is of course important. But in my view really a separate issue from developing a library of analysis routines. The latter just have to operate on numarray arrays and need not to worry about how the data gets there. Of course you need to get your data in numarray. PIL seems to do a good job with images, except for 16bit tiffs which causes me quiet some problems. Anybody know a good solution for getting 16bit tiffs into numarray? >ImageMagick > has a good set. All of ImageMagick should be wrapped. Isn't there already a python interface to ImageMagick? > The Canny edge operator along with code for generating polygonal > approximations to edges. See OpenCV. Canny I will likely implement at some point. Polygonal approximations to edges can be done in many ways I guess. I would need to find some reasonable method in the literature to do that. Suggestions are welcome. > Do you have some examples of algorithms for multi-dimensional images > that you think should be put in nd_image? At the moment I have only been looking at general basic image processing operations which normally generalize well to multiple dimensions. I will continue to do that. There are also somewhat higher level operations that I currently have not included. For instance, I implemented a sub-pixel shift estimator which I need for my work. That would be an example of a routine that is completely written in python using numarray and nd_image routines and does not need any C. 
This could be useful for others, but I am not sure if it belongs in a low-level library. Maybe we need some repository for that sort of python applications. Cheers, Peter From oliphant at ee.byu.edu Tue Jan 6 07:32:02 2004 From: oliphant at ee.byu.edu (Travis E. Oliphant) Date: Tue Jan 6 07:32:02 2004 Subject: [Numpy-discussion] Update for IM. a small image processing system In-Reply-To: <3FF9FF44.8030606@erols.com> References: <3FF624C5.7010400@erols.com> <200401051243.50605.verveer@embl-heidelberg.de> <3FF9FF44.8030606@erols.com> Message-ID: <3FFAD50E.6050500@ee.byu.edu> Edward C. Jones wrote: > Peter Verveer wrote: > >> On Saturday 03 January 2004 03:11, Edward C. Jones wrote: >> >> >>> IM >>> >>> I have uploaded a new version of my small image processing system IM to >>> "http://members.tripod.com/~edcjones/IM-01.01.04.tar.gz". Most of the >>> code >>> in IM (pronounced "I'm") is inferior to "nd_image" so I will eventually >>> convert it all to "nd_image". >>> >> >> >> I had a look and I guess that indeed you could use the nd_image >> package for some low level stuff (I am the author of nd_image). >> nd_image is however also still being developed and I am looking for >> directions to further work on. I wondered if there is anything you >> would like to see in there? >> > Thanks for your response. I have put a slightly revised version of IM on > my web page "http://members.tripod.com/~edcjones/". The new version > includes functions, written in C, for slicing arrays. > > A cople of things I would like to see: > > The ability to read and write a variety of image formats. ImageMagick > has a good set. All of ImageMagick should be wrapped. > I have wrappers for ImageMagick done (for Numeric). See pylab.sourceforge.net -Travis From perry at stsci.edu Tue Jan 6 07:49:00 2004 From: perry at stsci.edu (Perry Greenfield) Date: Tue Jan 6 07:49:00 2004 Subject: [Numpy-discussion] Update for IM. a small image processing system In-Reply-To: <200401061330.18932.verveer@embl-heidelberg.de> Message-ID: > Hi Ed, > > A cople of things I would like to see: > > > > The ability to read and write a variety of image formats. > > That is of course important. But in my view really a separate issue from > developing a library of analysis routines. The latter just have > to operate on > numarray arrays and need not to worry about how the data gets there. Of > course you need to get your data in numarray. PIL seems to do a > good job with > images, except for 16bit tiffs which causes me quiet some > problems. Anybody > know a good solution for getting 16bit tiffs into numarray? > I'd agree that support for image formats should be decoupled from processing functions > >ImageMagick > > has a good set. All of ImageMagick should be wrapped. > > Isn't there already a python interface to ImageMagick? > > Perhaps we should look at how much work it would be to adopt Travis's wrapped version for numarray. It may be fairly simple to do if his version uses the the more common api calls. Perry From edcjones at erols.com Tue Jan 6 08:54:01 2004 From: edcjones at erols.com (Edward C. Jones) Date: Tue Jan 6 08:54:01 2004 Subject: [Numpy-discussion] Update for IM. a small image processing system In-Reply-To: References: Message-ID: <3FFAE713.2020506@erols.com> Perry Greenfield wrote: >>Hi Ed, >> >> >>>A cople of things I would like to see: >>> >>>The ability to read and write a variety of image formats. >>> >>> >>That is of course important. But in my view really a separate issue from >>developing a library of analysis routines. 
The latter just have >>to operate on >>numarray arrays and need not to worry about how the data gets there. Of >>course you need to get your data in numarray. PIL seems to do a >>good job with >>images, except for 16bit tiffs which causes me quiet some >>problems. Anybody >>know a good solution for getting 16bit tiffs into numarray? >> >> >> >I'd agree that support for image formats should be decoupled >from processing functions > > > >>>ImageMagick >>>has a good set. All of ImageMagick should be wrapped. >>> >>> >>Isn't there already a python interface to ImageMagick? >> >> >> >> >Perhaps we should look at how much work it would be to adopt >Travis's wrapped version for numarray. It may be fairly simple >to do if his version uses the the more common api calls. > >Perry > > I have checked this out a bit. All the Numeric function calls are among the ones that numarray emulates. All but one of them seem to be properly DECREFed. The exception is in "imageobject.c", line 973, where "bitobj" is created. Also: ImageMagick was forked, producing GraphicsMagick. The two are very similar. Which is better to use? Ed Jones From rays at san.rr.com Tue Jan 6 22:52:01 2004 From: rays at san.rr.com (RJS) Date: Tue Jan 6 22:52:01 2004 Subject: [Numpy-discussion] Update for IM. a small image processing system Message-ID: <5.2.1.1.2.20040106220631.00aee968@pop-server.san.rr.com> > I have uploaded a new version of my small image processing system IM to > "http://members.tripod.com/~edcjones/IM-01.01.04.tar.gz". Most of the code > in IM (pronounced "I'm") is inferior to "nd_image" so I will eventually > convert it all to "nd_image". ... > nd_image is however also still being developed and I am looking for directions to > further work on. I wondered if there is anything you would like to see in there? I have been working with Pythonmagic and numarray for a particular astronomy project/technique, and IM has a few things I might use; nd_image also has some interesting functions as well. I want to align and specially stack 8-bit grayscale images from a FITS cube, or BMP set, currently. So, my suggestions (hint, hint) are: 1. A method to shift an array to efficiently give the best alignment with another. My brute force shifting and subtracting from the main image is slow... Most programs I have seen align a selected sub-image, then shift the whole image/array (without rotation, although that would be desirable) My _main_ objective is to stack progressively-longer-exposure 8-bit images into 16-bits, with the clipped pixels of longer exposures ignored in the summing process. The value of each pixel must be weighted inversely proportionately to it's exposure length (so shorter exposures "fill in" the clipped areas of the long exposures). So: 2. A fast method(ology) to do weighted sums of 2D arrays with a mask available for each array. I really do commend Peter and Edward for their contribution! By the way, if you do wxPython and haven't tried Boa Constructor, you might like it. I have been using the CVS version (now .2.8) for a few months, and it's working nicely. Ray Schumacher http://rjs.org/astro/1004x/ From verveer at embl-heidelberg.de Wed Jan 7 04:45:01 2004 From: verveer at embl-heidelberg.de (Peter Verveer) Date: Wed Jan 7 04:45:01 2004 Subject: [Numpy-discussion] Update for IM. 
a small image processing system In-Reply-To: <5.2.1.1.2.20040106220631.00aee968@pop-server.san.rr.com> References: <5.2.1.1.2.20040106220631.00aee968@pop-server.san.rr.com> Message-ID: <200401071344.24149.verveer@embl-heidelberg.de> On Wednesday 07 January 2004 07:50, RJS wrote: > > I have uploaded a new version of my small image processing system IM to > > "http://members.tripod.com/~edcjones/IM-01.01.04.tar.gz". Most of the > > code in IM (pronounced "I'm") is inferior to "nd_image" so I will > > eventually convert it all to "nd_image". > > ... > > > nd_image is however also still being developed and I am looking for > > directions to > > > further work on. I wondered if there is anything you would like to see > > in there? > > I have been working with Pythonmagic and numarray for a particular > astronomy project/technique, and IM has a few things I might use; nd_image > also has some interesting functions as well. > > I want to align and specially stack 8-bit grayscale images from a FITS > cube, or BMP set, currently. So, my suggestions (hint, hint) are: > 1. A method to shift an array to efficiently give the best alignment with > another. My brute force shifting and subtracting from the main image is > slow... Most programs I have seen align a selected sub-image, then shift > the whole image/array (without rotation, although that would be desirable) If I understand you well, you essentially want to estimate a shift between two images. I have some code that can do that. I do not intend to include that in nd_image for now, but I can send you the code. > My _main_ objective is to stack progressively-longer-exposure 8-bit images > into 16-bits, with the clipped pixels of longer exposures ignored in the > summing process. The value of each pixel must be weighted inversely > proportionately to it's exposure length (so shorter exposures "fill in" the > clipped areas of the long exposures). > > So: > 2. A fast method(ology) to do weighted sums of 2D arrays with a mask > available for each array. I think this can be achieved relatively easily with standard numarray operations. Cheers, Peter From haase at msg.ucsf.edu Wed Jan 7 11:44:01 2004 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Wed Jan 7 11:44:01 2004 Subject: [Numpy-discussion] Update for IM. a small image processing system References: <3FF624C5.7010400@erols.com> <200401051243.50605.verveer@embl-heidelberg.de> <3FF9FF44.8030606@erols.com> <200401061330.18932.verveer@embl-heidelberg.de> Message-ID: <024a01c3d556$8075bf90$421ee6a9@rodan> > Hi Ed, > > A cople of things I would like to see: > > > > The ability to read and write a variety of image formats. > > That is of course important. But in my view really a separate issue from > developing a library of analysis routines. The latter just have to operate on > numarray arrays and need not to worry about how the data gets there. Of > course you need to get your data in numarray. PIL seems to do a good job with > images, except for 16bit tiffs which causes me quiet some problems. Anybody > know a good solution for getting 16bit tiffs into numarray? > Hi Peter, When did you try that ? My info is the PIL released within the last few month version 1.1.4 which does the job. this is from http://effbot.org/zone/pil-changes-114.htm: (1.1.4a2 released) + Improved support for 16-bit unsigned integer images (mode "I;16"). This includes TIFF reader support, and support for "getextrema" and "point" (from Klamer Shutte). (Ooops: PIL 1.1.4 final was released on May 10, 2003. (time flies ...) 
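On the weighted, masked stacking that Ray asks about and that Peter says can be achieved with standard numarray operations, the following is one minimal sketch of what that might look like. It is only an illustration, not code from IM or nd_image: weighted_stack and its arguments are made-up names, the images are assumed to be equal-shape 2-D numarray arrays, the weights would typically be 1.0/exposure_time, and 255 is the clip level for 8-bit data.

import numarray

def weighted_stack(images, weights, clip=255):
    # images: list of equal-shape 2-D arrays; weights: e.g. 1.0/exposure_time
    acc  = numarray.zeros(images[0].shape, numarray.Float64)
    norm = numarray.zeros(images[0].shape, numarray.Float64)
    for img, w in zip(images, weights):
        # exclude pixels that reached the clip level of this exposure
        mask = numarray.where(img < clip, 1.0, 0.0)
        acc  = acc  + w * mask * img
        norm = norm + w * mask
    norm = numarray.where(norm == 0, 1.0, norm)   # avoid division by zero
    # weighted mean of the unclipped contributions; rescale or cast
    # (for instance to UInt16) as the application requires
    return acc / norm
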
Regards, Sebastian From rays at san.rr.com Wed Jan 7 22:35:04 2004 From: rays at san.rr.com (RJS) Date: Wed Jan 7 22:35:04 2004 Subject: [Numpy-discussion] Update for IM. a small image processing system Message-ID: <5.2.1.1.2.20040107221315.023e4e90@pop-server.san.rr.com> Hello Peter, > On Wednesday 07 January 2004 07:50, RJS wrote: > > Most programs I have seen align a selected sub-image, then shift > > the whole image/array (without rotation, although that would be desirable) > If I understand you well, you essentially want to estimate a shift between two > images. I have some code that can do that. I do not intend to include that in > nd_image for now, but I can send you the code. Yes, please. I sure that it's better/faster than my PIL or PythonMagick efforts. I don't know about machine vision etc, but shift is indispensable for video astronomy. > > 2. A fast method(ology) to do weighted sums of 2D arrays with a mask > > available for each array. > I think this can be achieved relatively easily with standard numarray > operations. Yes, it is straight-forward, in a way, but I'm always scouring the net for C and Python algorithms. Very (most?) often they're better than my own. This app is really a proof-of-concept; hopefully others will incorporate the "clipped image stacking" into their already fine astro apps. The problem is that the standard methods - median, mean, or summing - all suffer when long images in unequal exposure stacks have large clipped regions. Thanks, Ray http://rjs.org From verveer at embl-heidelberg.de Thu Jan 8 01:42:01 2004 From: verveer at embl-heidelberg.de (Peter Verveer) Date: Thu Jan 8 01:42:01 2004 Subject: [Numpy-discussion] Update for IM. a small image processing system In-Reply-To: <024a01c3d556$8075bf90$421ee6a9@rodan> References: <3FF624C5.7010400@erols.com> <200401061330.18932.verveer@embl-heidelberg.de> <024a01c3d556$8075bf90$421ee6a9@rodan> Message-ID: <200401081041.13964.verveer@embl-heidelberg.de> Hi Sebastian, I use the 1.1.4 final version. I do however, have images that are not read by PIL ('cannot identify image file'). I think these files are okay, since I can read them in a scientific imaging program. So maybe the 16bit support in PIL is not complete. Peter On Wednesday 07 January 2004 20:43, Sebastian Haase wrote: > > know a good solution for getting 16bit tiffs into numarray? > > Hi Peter, > When did you try that ? My info is the PIL released within the last few > month version 1.1.4 which does the job. > this is from http://effbot.org/zone/pil-changes-114.htm: > > (1.1.4a2 released) > > + Improved support for 16-bit unsigned integer images (mode "I;16"). > This includes TIFF reader support, and support for "getextrema" > and "point" (from Klamer Shutte). > > (Ooops: PIL 1.1.4 final was released on May 10, 2003. (time flies ...) From nadavh at visionsense.com Thu Jan 8 04:05:00 2004 From: nadavh at visionsense.com (Nadav Horesh) Date: Thu Jan 8 04:05:00 2004 Subject: [Numpy-discussion] Update for IM. a small image processing system Message-ID: <07C6A61102C94148B8104D42DE95F7E8066942@exchange2k.envision.co.il> I am producing and reading 16 bit tiff files using PIL. These files however can not be displayed by most image processing programs (gimp does fine). Nadav -----Original Message----- From: Peter Verveer [mailto:verveer at embl-heidelberg.de] Sent: Thu 08-Jan-04 11:41 To: Sebastian Haase; numpy-discussion at lists.sourceforge.net Cc: Subject: Re: [Numpy-discussion] Update for IM. 
a small image processing system Hi Sebastian, I use the 1.1.4 final version. I do however, have images that are not read by PIL ('cannot identify image file'). I think these files are okay, since I can read them in a scientific imaging program. So maybe the 16bit support in PIL is not complete. Peter On Wednesday 07 January 2004 20:43, Sebastian Haase wrote: > > know a good solution for getting 16bit tiffs into numarray? > > Hi Peter, > When did you try that ? My info is the PIL released within the last few > month version 1.1.4 which does the job. > this is from http://effbot.org/zone/pil-changes-114.htm: > > (1.1.4a2 released) > > + Improved support for 16-bit unsigned integer images (mode "I;16"). > This includes TIFF reader support, and support for "getextrema" > and "point" (from Klamer Shutte). > > (Ooops: PIL 1.1.4 final was released on May 10, 2003. (time flies ...) ------------------------------------------------------- This SF.net email is sponsored by: Perforce Software. Perforce is the Fast Software Configuration Management System offering advanced branching capabilities and atomic changes on 50+ platforms. Free Eval! http://www.perforce.com/perforce/loadprog.html _______________________________________________ Numpy-discussion mailing list Numpy-discussion at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/numpy-discussion From tim.hochberg at ieee.org Thu Jan 8 15:38:02 2004 From: tim.hochberg at ieee.org (Tim Hochberg) Date: Thu Jan 8 15:38:02 2004 Subject: [Numpy-discussion] License.txt inclussion breaks McMillan's Installer. Message-ID: <3FFDE9A2.4040806@ieee.org> The way LICENSE.txt is included in the __init__ file for numarray breaks McMillan's installer (and probably py2exe as well, although I haven't checked that). The offending line is: __LICENSE__ = open(_os.path.join(__path__[0],"LICENSE.txt")).read() The first problem is that the installer doesn't pick up the dependancy on LICENSE.txt. That's not a huge deal as it's relatively simple to add that to the list of dependancy's by hand. More serious is that the __path__ variable is bogus in an installer archive so that the reading of the license file fails, even if it's present. One solution is just include the license text directly instead of reading it from a separate file. This is simple and the license is short enough that this shouldn't clutter things too much. It's not like there's all that much in the __init__ file anyway <0.5 wink>. A second solution is to wrap the above incantation in try, except; however, this doesn't guarantee that the license file is included. A third solution is to come up with a different incantation that works for installer. I've looked at this briefly and it looks a little messy. Nevertheless, I'll come up with something that works if this is deemed the preferred solution. Someone else will have to figure out what works with py2exe. [ If the above makes no sense to those of you unfamilar with McMillan's installer, I apologize -- ask away and I'll try to clarify] Regards -tim From jmiller at stsci.edu Fri Jan 9 05:52:03 2004 From: jmiller at stsci.edu (Todd Miller) Date: Fri Jan 9 05:52:03 2004 Subject: [Numpy-discussion] License.txt inclussion breaks McMillan's Installer. 
In-Reply-To: <3FFDE9A2.4040806@ieee.org> References: <3FFDE9A2.4040806@ieee.org> Message-ID: <1073656204.10007.23.camel@halloween.stsci.edu> On Thu, 2004-01-08 at 18:37, Tim Hochberg wrote: > > The way LICENSE.txt is included in the __init__ file for numarray breaks > McMillan's installer (and probably py2exe as well, although I haven't > checked that). The offending line is: > > __LICENSE__ = open(_os.path.join(__path__[0],"LICENSE.txt")).read() > > > The first problem is that the installer doesn't pick up the dependancy > on LICENSE.txt. That's not a huge deal as it's relatively simple to add > that to the list of dependancy's by hand. > > More serious is that the __path__ variable is bogus in an installer > archive so that the reading of the license file fails, even if it's present. > > One solution is just include the license text directly instead of > reading it from a separate file. This is simple and the license is short > enough that this shouldn't clutter things too much. It's not like > there's all that much in the __init__ file anyway <0.5 wink>. I like this solution the best from the perspective of simplicity and fool-proof-ness. I had considered it before but rejected it as leading to duplication of the license. Now I realize I can just "put a symbolic link" in LICENSE.txt and move the actual text of the license to __init__.py as you suggest. This is fixed in CVS now. Todd > A second solution is to wrap the above incantation in try, except; > however, this doesn't guarantee that the license file is included. > > A third solution is to come up with a different incantation that works > for installer. I've looked at this briefly and it looks a little messy. > Nevertheless, I'll come up with something that works if this is deemed > the preferred solution. Someone else will have to figure out what works > with py2exe. > > [ If the above makes no sense to those of you unfamilar with McMillan's > installer, I apologize -- ask away and I'll try to clarify] > > Regards > > -tim > > > > > > ------------------------------------------------------- > This SF.net email is sponsored by: Perforce Software. > Perforce is the Fast Software Configuration Management System offering > advanced branching capabilities and atomic changes on 50+ platforms. > Free Eval! http://www.perforce.com/perforce/loadprog.html > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion -- Todd Miller Space Telescope Science Institute 3700 San Martin Drive Baltimore MD, 21030 (410) 338 - 4576 From cookedm at physics.mcmaster.ca Fri Jan 9 06:51:00 2004 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Fri Jan 9 06:51:00 2004 Subject: [Numpy-discussion] License.txt inclussion breaks McMillan's Installer. In-Reply-To: <1073656204.10007.23.camel@halloween.stsci.edu> References: <3FFDE9A2.4040806@ieee.org> <1073656204.10007.23.camel@halloween.stsci.edu> Message-ID: <20040109144917.GA3957@arbutus.physics.mcmaster.ca> On Fri, Jan 09, 2004 at 08:50:04AM -0500, Todd Miller wrote: > On Thu, 2004-01-08 at 18:37, Tim Hochberg wrote: > > > > The way LICENSE.txt is included in the __init__ file for numarray breaks > > McMillan's installer (and probably py2exe as well, although I haven't > > checked that). 
The offending line is: > > > > __LICENSE__ = open(_os.path.join(__path__[0],"LICENSE.txt")).read() > > > > > > The first problem is that the installer doesn't pick up the dependancy > > on LICENSE.txt. That's not a huge deal as it's relatively simple to add > > that to the list of dependancy's by hand. > > > > More serious is that the __path__ variable is bogus in an installer > > archive so that the reading of the license file fails, even if it's present. > > > > One solution is just include the license text directly instead of > > reading it from a separate file. This is simple and the license is short > > enough that this shouldn't clutter things too much. It's not like > > there's all that much in the __init__ file anyway <0.5 wink>. > > I like this solution the best from the perspective of simplicity and > fool-proof-ness. I had considered it before but rejected it as leading > to duplication of the license. Now I realize I can just "put a symbolic > link" in LICENSE.txt and move the actual text of the license to > __init__.py as you suggest. > > This is fixed in CVS now. > > Todd I have to admit that I read the problem above and thought, WHAT? numarray already takes longer to import than Numeric; you mean some of that time it's reading in a license file I'll never look at? Compare: $ time python -c 'import numarray' real 0m0.230s user 0m0.230s sys 0m0.000s $ time python -c 'import Numeric' real 0m0.076s user 0m0.050s sys 0m0.020s [final results after running each a couple times to get it in cache] numarray takes 3 times longer to import than Numeric. I know, it's only 0.154 s difference, but that's noticeable for small scripts. [Ok, so I just tested the change to reading the license, and I don't see any change in import times :-)] If I had any time, I'd look at making it import faster. Some playing around with the hotshot profiler shows that most of the time is spent in numarray.ufunc._makeCUFuncDict. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From childliteracyraffle204 at yahoo.com Fri Jan 9 14:47:00 2004 From: childliteracyraffle204 at yahoo.com (childliteracyraffle204 at yahoo.com) Date: Fri Jan 9 14:47:00 2004 Subject: [Numpy-discussion] Car Raffle Donate to Charity Cadillac Raffle Message-ID: <200401091554210788.001199B2@127.0.0.1> Car Raffle Donate to Charity Cadillac Raffle http://www.ChildLiteracy.org/ Current Raffles 2003 BLACK CADILLAC DEVILLE DTS $100 per ticket Mitsubishi's 2002 Montero Sport LS $35 per ticket Pioneer PDP-4330HD 43" Plasma TV $20 per ticket http://www.ChildLiteracy.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From jochen at fhi-berlin.mpg.de Tue Jan 13 23:59:00 2004 From: jochen at fhi-berlin.mpg.de (=?iso-8859-1?q?Jochen_K=FCpper?=) Date: Tue Jan 13 23:59:00 2004 Subject: [Numpy-discussion] numarray setup.py Message-ID: Sometime after v0.8 setup.py was changed to include some 'classifiers'. Doesn't work for me (python 2.2.2 on FreeBSD): ,----[python setup.py install --home=~/install/freebsd-x86] | Using EXTRA_COMPILE_ARGS = [] | error in setup script: invalid distribution option 'classifiers' `---- Greetings, Jochen -- Einigkeit und Recht und Freiheit http://www.Jochen-Kuepper.de Libert?, ?galit?, Fraternit? GnuPG key: CC1B0B4D (Part 3 you find in my messages before fall 2003.) 
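For anyone hitting the same "invalid distribution option 'classifiers'" error on an older interpreter: the 'classifiers' keyword was only added to distutils in Python 2.3, so the usual workaround at the time was to pass it to setup() conditionally. The following is a generic sketch, not numarray's actual setup.py; the package name, module name and classifier strings are made up.

import sys
from distutils.core import setup

extra = {}
if sys.version_info >= (2, 3):
    # older distutils reject unknown keywords such as 'classifiers'
    extra['classifiers'] = [
        'Programming Language :: Python',
        'Topic :: Scientific/Engineering',
    ]

setup(name='examplepkg',
      version='0.1',
      py_modules=['examplemod'],
      **extra)
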
From jmiller at stsci.edu Wed Jan 14 02:52:01 2004 From: jmiller at stsci.edu (Todd Miller) Date: Wed Jan 14 02:52:01 2004 Subject: [Numpy-discussion] numarray setup.py In-Reply-To: References: Message-ID: <1074077466.3718.26.camel@localhost.localdomain> On Wed, 2004-01-14 at 02:58, Jochen K?pper wrote: > Sometime after v0.8 setup.py was changed to include some > 'classifiers'. Doesn't work for me (python 2.2.2 on FreeBSD): I removed the classifiers for Pythons < 2.3. This is (theoretically) fixed in CVS and tested against 2.2.3. Let me know if there's still a problem with 2.2.2. Todd > > ,----[python setup.py install --home=~/install/freebsd-x86] > | Using EXTRA_COMPILE_ARGS = [] > | error in setup script: invalid distribution option 'classifiers' > `---- > > Greetings, > Jochen -- Todd Miller From jochen at fhi-berlin.mpg.de Wed Jan 14 05:15:02 2004 From: jochen at fhi-berlin.mpg.de (=?iso-8859-1?q?Jochen_K=FCpper?=) Date: Wed Jan 14 05:15:02 2004 Subject: [Numpy-discussion] numarray setup.py In-Reply-To: <1074077466.3718.26.camel@localhost.localdomain> (Todd Miller's message of "Wed, 14 Jan 2004 05:51:06 -0500") References: <1074077466.3718.26.camel@localhost.localdomain> Message-ID: On Wed, 14 Jan 2004 05:51:06 -0500 Todd Miller wrote: Todd> On Wed, 2004-01-14 at 02:58, Jochen K?pper wrote: >> Sometime after v0.8 setup.py was changed to include some >> 'classifiers'. Doesn't work for me (python 2.2.2 on FreeBSD): Todd> I removed the classifiers for Pythons < 2.3. This is Todd> (theoretically) fixed in CVS and tested against 2.2.3. Let me Todd> know if there's still a problem with 2.2.2. Seems to work now. Greetings, Jochen -- Einigkeit und Recht und Freiheit http://www.Jochen-Kuepper.de Libert?, ?galit?, Fraternit? GnuPG key: CC1B0B4D (Part 3 you find in my messages before fall 2003.) From cjw at sympatico.ca Thu Jan 15 06:31:04 2004 From: cjw at sympatico.ca (Colin J. Williams) Date: Thu Jan 15 06:31:04 2004 Subject: [Numpy-discussion] _clone, copy, view Message-ID: <4006A420.60500@sympatico.ca> It would help if someone could describe the intended functional differences between _clone, copy and view in numarray. Colin W. From jmiller at stsci.edu Thu Jan 15 07:07:06 2004 From: jmiller at stsci.edu (Todd Miller) Date: Thu Jan 15 07:07:06 2004 Subject: [Numpy-discussion] _clone, copy, view In-Reply-To: <4006A420.60500@sympatico.ca> References: <4006A420.60500@sympatico.ca> Message-ID: <1074179078.2009.32.camel@halloween.stsci.edu> On Thu, 2004-01-15 at 09:30, Colin J. Williams wrote: > It would help if someone could describe the intended functional > differences between _clone, copy and view in numarray. a.copy() returns a new array object with a copy of a's data. a's dictionary attributes are currently aliased, not deep copied. That may not be the way it should be. The copy is assumed to be a C_ARRAY, meaning it is aligned, not byteswapped, and contiguous. Thus, copy() is sometimes used as a cleanup operation. a.view() returns a shallow copy of a. Most importantly, the shallow copy aliases the same data as a. Views are used to look at the same data buffer in some new way, perhaps with a different shape, or perhaps as a subset of the original data. A view has the same special properties as the original, e.g. if the original is byteswapped, so is the view. a._clone() is an implementation detail of the generic take() method. The generic take() method is overridden for numerical arrays, but utilized for object arrays. 
clone()'s purpose is to return a new array of the same type as 'a' but with a different shape and total number of elements. Thus, clone() is useful for creating result arrays for take based on the input array being taken from. > > Colin W. > -- Todd Miller Space Telescope Science Institute 3700 San Martin Drive Baltimore MD, 21030 (410) 338 - 4576 From edcjones at erols.com Fri Jan 16 07:39:03 2004 From: edcjones at erols.com (Edward C. Jones) Date: Fri Jan 16 07:39:03 2004 Subject: [Numpy-discussion] Problem with Sourceforge mailing list archives? Message-ID: <40080487.4010309@erols.com> The "Numpy-discussion Archives" at "https://lists.sourceforge.net/lists/listinfo/numpy-discussion" are down or missing. Is there a problem? From edcjones at erols.com Fri Jan 16 07:52:00 2004 From: edcjones at erols.com (Edward C. Jones) Date: Fri Jan 16 07:52:00 2004 Subject: [Numpy-discussion] Tabs in numarray code Message-ID: <40080780.3040904@erols.com> What are the policies about tab characters in numarray Python and C code? What are the policies about indentation in numarray Python and C code? The following small program found a bunch of tabs in numarray code: -------- #! /usr/local/bin/python import os topdir = '/usr/local/src/numarray-0.8/' for dirpath, dirnames, filenames in os.walk(topdir): for name in filenames: if name.endswith('.py'): fullname = os.path.join(dirpath, name) lines = file(fullname, 'r').read().splitlines() for i, line in enumerate(lines): if '\t' in line: print fullname[len(topdir):], i+1, line From jmiller at stsci.edu Fri Jan 16 08:12:02 2004 From: jmiller at stsci.edu (Todd Miller) Date: Fri Jan 16 08:12:02 2004 Subject: [Numpy-discussion] Tabs in numarray code In-Reply-To: <40080780.3040904@erols.com> References: <40080780.3040904@erols.com> Message-ID: <1074269385.3715.19.camel@halloween.stsci.edu> On Fri, 2004-01-16 at 10:47, Edward C. Jones wrote: > What are the policies about tab characters in numarray Python and C > code? What are the policies about indentation in numarray Python and C code? The policy is "no tabs". Indentation in Python and C is 5 spaces per level. Enforcement of the policies is obviously currently lacking. Todd > > The following small program found a bunch of tabs in numarray code: > -------- > #! /usr/local/bin/python > > import os > > topdir = '/usr/local/src/numarray-0.8/' > for dirpath, dirnames, filenames in os.walk(topdir): > for name in filenames: > if name.endswith('.py'): > fullname = os.path.join(dirpath, name) > lines = file(fullname, 'r').read().splitlines() > for i, line in enumerate(lines): > if '\t' in line: > print fullname[len(topdir):], i+1, line > > > > > ------------------------------------------------------- > The SF.Net email is sponsored by EclipseCon 2004 > Premiere Conference on Open Tools Development and Integration > See the breadth of Eclipse activity. February 3-5 in Anaheim, CA. 
> http://www.eclipsecon.org/osdn > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion -- Todd Miller Space Telescope Science Institute 3700 San Martin Drive Baltimore MD, 21030 (410) 338 - 4576 From jmiller at stsci.edu Fri Jan 16 08:15:02 2004 From: jmiller at stsci.edu (Todd Miller) Date: Fri Jan 16 08:15:02 2004 Subject: [Numpy-discussion] Tabs in numarray code In-Reply-To: <1074269385.3715.19.camel@halloween.stsci.edu> References: <40080780.3040904@erols.com> <1074269385.3715.19.camel@halloween.stsci.edu> Message-ID: <1074269561.4020.21.camel@halloween.stsci.edu> On Fri, 2004-01-16 at 11:09, Todd Miller wrote: > On Fri, 2004-01-16 at 10:47, Edward C. Jones wrote: > > What are the policies about tab characters in numarray Python and C > > code? What are the policies about indentation in numarray Python and C code? > > The policy is "no tabs". > Indentation in Python and C is 5 spaces per level. Actually, I meant *4* spaces, and enforcement is somewhat worse than I thought. > Enforcement of the policies is obviously currently lacking. > > Todd > > > > > The following small program found a bunch of tabs in numarray code: > > -------- > > #! /usr/local/bin/python > > > > import os > > > > topdir = '/usr/local/src/numarray-0.8/' > > for dirpath, dirnames, filenames in os.walk(topdir): > > for name in filenames: > > if name.endswith('.py'): > > fullname = os.path.join(dirpath, name) > > lines = file(fullname, 'r').read().splitlines() > > for i, line in enumerate(lines): > > if '\t' in line: > > print fullname[len(topdir):], i+1, line > > > > > > > > > > ------------------------------------------------------- > > The SF.Net email is sponsored by EclipseCon 2004 > > Premiere Conference on Open Tools Development and Integration > > See the breadth of Eclipse activity. February 3-5 in Anaheim, CA. > > http://www.eclipsecon.org/osdn > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at lists.sourceforge.net > > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > -- > Todd Miller > Space Telescope Science Institute > 3700 San Martin Drive > Baltimore MD, 21030 > (410) 338 - 4576 > > > > ------------------------------------------------------- > The SF.Net email is sponsored by EclipseCon 2004 > Premiere Conference on Open Tools Development and Integration > See the breadth of Eclipse activity. February 3-5 in Anaheim, CA. > http://www.eclipsecon.org/osdn > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion -- Todd Miller Space Telescope Science Institute 3700 San Martin Drive Baltimore MD, 21030 (410) 338 - 4576 From haase at msg.ucsf.edu Fri Jan 16 16:02:01 2004 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Fri Jan 16 16:02:01 2004 Subject: [Numpy-discussion] numarray.records - get/set item References: <030401c3bac6$96382a20$421ee6a9@rodan> Message-ID: <01b901c3dc8d$051b71d0$421ee6a9@rodan> Hi everybody, I would like to check if there has been made a decision on this ? I'm planning to use record arrays to access image data header-information and having an attribute 'f' like suggested is still my favorite way. Is anyone besides me using record arrays on memory-mapped buffers ? 
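The 'f' attribute that Sebastian asks about below could be prototyped in a few lines on top of the accessors already discussed in that thread. This is only a sketch: FieldProxy is an illustrative name, not numarray API, and it assumes the Record object exposes the field() and setfield() methods used earlier in the thread.

class FieldProxy:
    # rec_proxy.mmm reads a field; rec_proxy.mmm = value writes it
    def __init__(self, record):
        # bypass __setattr__ while storing the wrapped record
        self.__dict__['_record'] = record
    def __getattr__(self, name):
        return self._record.field(name)
    def __setattr__(self, name, value):
        self._record.setfield(name, value)

# usage sketch:
#     hdr = FieldProxy(hdrArray[0])
#     hdr.mmm = (mi, ma, av)
#     print hdr.mmm
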
Thanks, Sebastian Haase ----- Original Message ----- From: "Sebastian Haase" To: Sent: Thursday, December 04, 2003 4:27 PM Subject: Fw: [Numpy-discussion] numarray.records - get/set item > My situation where I got onto this, is having one field named 'mmm' > ("MinMaxMean") being an 3 element array. > Now, to assign the values first I tried: > self.hdrArray = makeHdrArray(self.h) #this makes the record array > self.hdr = self.hdrArray[0].field #this is my shortcut to the > bound member function > # it essentially is a solution (hack) for the getitem part > # but regarding setitem I had to learn that "assigning to a function" is > illigal in Python - as opposed to C++ > #so to do assignment I need to do: > self.hdr('mmm')[0], self.hdr('mmm')[1], self.hdr('mmm')[2] = (mi,ma,av) > > now that I'm looking at it, > self.hdrArray[0].setfield('mmm', (mi,ma,av)) > would probably be better... > > How about adding an attribute 'f' which could serve as a "proxy" to allow: > myRec.f.mmm = (mi,ma,av) > and maybe even additionally: > myRec.f['mmm'] = (mi,ma,av) > > Regards, > Sebastian > > > > ----- Original Message ----- > From: "Perry Greenfield" > To: "Sebastian Haase" ; > > Sent: Thursday, December 04, 2003 3:08 PM > Subject: RE: [Numpy-discussion] numarray.records - get/set item > > > > > Hi, > > > Is it maybe a good idea to add this to the definition of 'class Record' > > > > > > class Record: > > > """Class for one single row.""" > > > > > > def __getitem__(self, fieldName): > > > return self.array.field(fieldName)[self.row] > > > def __setitem__(self, fieldName, value): > > > self.array.field(fieldName)[self.row] = value > > > > > > I don't know about the implications if __delitem __ and so on are not > > > defined. > > > I just think it would look quite nice to say > > > myRecArr[0]['mmm'] = 'hallo' > > > as opposed to > > > myRecArr[0].setfield('mmm', 'hallo') > > > > > > Actually I would even like > > > myRecArr[0].mmm = 'hallo' > > > > > > This should be possible by defining __setattr__. > > > It would obviously only work for fieldnames that do not contain '.' or ' > ' > > > or ... > > > > > > Any comments ? > > > > > > > > We've had many internal discussions about doing this. The latter was > > considered a problem because of possible name collisions of field > > names with other attributes or methods. The former is not bothered > > by this problem, but we decided to be conservative on this and see > > how strong the need was. We are interested in other opinions. > > > > Perry > > > > ------------------------------------------------------- > This SF.net email is sponsored by: SF.net Giveback Program. > Does SourceForge.net help you be more productive? Does it > help you create better code? SHARE THE LOVE, and help us help > YOU! Click Here: http://sourceforge.net/donate/ > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From oliphant at ee.byu.edu Mon Jan 19 13:34:07 2004 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Mon Jan 19 13:34:07 2004 Subject: [Numpy-discussion] Status of Numeric Message-ID: <400C3EF3.8090005@ee.byu.edu> Numarray is making great progress and is quite usable for many purposes. An idea that was championed by some is that the Numeric code base would stay static and be replaced entirely by Numarray. However, Numeric is currently used in a large installed base. In particular SciPy uses Numeric as its core array. 
While no doubt numarray arrays will be supported in the future, the speed of the less bulky Numeric arrays and the typical case that we encounter in SciPy of many, small arrays will make it difficult for people to abandon Numeric entirely with it's comparatively light-weight arrays. In the development of SciPy we have encountered issues in Numeric that we feel need to be fixed. As this has become an important path to success of several projects (both commercial and open) it is absolutely necessary that this issues be addressed. The purpose of this email is to assess the attitude of the community regarding how these changes to Numeric should be accomplished. These are the two options we can see: * freeze old Numeric 23.x and make all changes to Numeric 24.x still keeping Numeric separate from SciPy * freeze old Numeric 23.x and subsume Numeric into SciPy essentially creating a new SciPy arrayobject that is fast and lightweight. Anybody wanting this new array object would get it by installing scipy_base. Numeric would never change in the future but the array in scipy_base would. It is not an option to wait for numarray to get fast enough as these issues need to be addressed now. Ultimately I think it will be a wise thing to have two implementations of arrays: one that is fast and lightweight optimized for many relatively small arrays, and another that is optimized for large-scale arrays. Eventually, the use of these two underlying implementations should be automatic and invisible to the user. A few of the particular changes we need to make to the Numeric arrayobject are: 1) change the coercion model to reflect Numarray's choice and eliminate the savespace crutch. 2) Add indexing capability to Numeric arrays (similar to Numarray's) 3) Improve the interaction between Numeric arrays and scalars. 4) Optimization: Again, these changes are going to be made to some form of the Numeric arrays. What I am really interested in knowing is the attitude of the community towards keeping Numeric around. If most of the community wants to see Numeric go away then we will be forced to bring the Numeric array under the SciPy code-base and own it there. Your feedback is welcome and appreciated. Sincerely, Travis Oliphant and other SciPy developers From perry at stsci.edu Mon Jan 19 14:14:05 2004 From: perry at stsci.edu (Perry Greenfield) Date: Mon Jan 19 14:14:05 2004 Subject: [Numpy-discussion] Status of Numeric In-Reply-To: <400C3EF3.8090005@ee.byu.edu> Message-ID: Travis Oliphant writes: > > Numarray is making great progress and is quite usable for many > purposes. An idea that was championed by some is that the Numeric code > base would stay static and be replaced entirely by Numarray. > > However, Numeric is currently used in a large installed base. In > particular SciPy uses Numeric as its core array. While no doubt > numarray arrays will be supported in the future, the speed of the less > bulky Numeric arrays and the typical case that we encounter in SciPy of > many, small arrays will make it difficult for people to abandon Numeric > entirely with it's comparatively light-weight arrays. > I'd like to ask if the numarray option couldn't at least be considered. In particular with regard to speed, we'd like to know what the necessary threshold is. For many ufuncs, numarray is within a factor of 3 or so of Numeric for small arrays. Is this good enough or not? What would be good enough? It would probably be difficult to make it as fast in all cases, but how close does it have to be? A factor of 2? 1.5? 
We haven't gotten very much feedback on specific numbers in this regard. Are there other aspects of numarray performance that are a problem? What specifically? We don't have the resources to optimize everything in case it might affect someone. We need to know that it is particular problem with users to give it some priority (and know what the necessary threshold is for acceptable performance). Perhaps the two (Numeric and numarray) may need to coexist for a while, but we would like to isolate the issues that make that necessary. That hasn't really happened yet. Travis, do you have any specific nummarray speed issues that have arisen from your benchmarking or use that we can look at? Perry Greenfield From hinsen at cnrs-orleans.fr Tue Jan 20 03:16:02 2004 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Tue Jan 20 03:16:02 2004 Subject: [Numpy-discussion] Status of Numeric In-Reply-To: <400C3EF3.8090005@ee.byu.edu> References: <400C3EF3.8090005@ee.byu.edu> Message-ID: On 19.01.2004, at 21:32, Travis Oliphant wrote: > These are the two options we can see: > * freeze old Numeric 23.x and make all changes to Numeric 24.x still > keeping Numeric separate from SciPy > * freeze old Numeric 23.x and subsume Numeric into SciPy essentially > creating a new SciPy arrayobject that is fast and lightweight. > Anybody wanting this new array object would get it by installing > scipy_base. Numeric would never change in the future but the array in > scipy_base would. That depends on the exact nature of the changes. My view is that any package that is upwards-compatible with Numeric (except for bug fixes of course) should be called Numeric and distributed as such. Any package that is intentionally incompatible with Numeric in some important aspect should not be called Numeric. There is a lot of code out there that builds on Numeric, and some of it is hardly maintained any more, although there are still users around. Those users expect to be able to upgrade Numeric without breaking their code. Konrad. From p.magwene at snet.net Tue Jan 20 05:59:01 2004 From: p.magwene at snet.net (Paul Magwene) Date: Tue Jan 20 05:59:01 2004 Subject: [Numpy-discussion] Re: Numpy-discussion digest, Vol 1 #808 - 2 msgs In-Reply-To: <200401200407.i0K47Cfg004905@mta1.snet.net> References: <200401200407.i0K47Cfg004905@mta1.snet.net> Message-ID: <400D341F.1060504@snet.net> > > --__--__-- > > Message: 1 > Date: Mon, 19 Jan 2004 14:32:51 -0600 > From: Travis Oliphant > To: numpy-discussion at lists.sourceforge.net, python-list at python.org > Subject: [Numpy-discussion] Status of Numeric > > > The purpose of this email is to assess the attitude of the community > regarding how these changes to Numeric should be accomplished. > > These are the two options we can see: > * freeze old Numeric 23.x and make all changes to Numeric 24.x still > keeping Numeric separate from SciPy > * freeze old Numeric 23.x and subsume Numeric into SciPy essentially > creating a new SciPy arrayobject that is fast and lightweight. Anybody > wanting this new array object would get it by installing scipy_base. > Numeric would never change in the future but the array in scipy_base would. My preference would be for option #1 -- continue further development of Numeric as a separate package with new improvements going into the 24.x series. It's my experience that when projects get subsumed, additional requirements tend to creep in, even if they're not actually "required." 
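To make item 2 of Travis's list concrete: numarray already accepts index arrays and boolean masks directly in subscripts, where Numeric needs take() and compress(). A small sketch of the difference, with illustrative values only:

import Numeric
import numarray

a_num = Numeric.arange(10) * 10
a_na  = numarray.arange(10) * 10
idx   = [2, 5, 7]

print Numeric.take(a_num, idx)            # Numeric: explicit function call
print a_na[numarray.array(idx)]           # numarray: index-array subscript

mask_num = Numeric.greater(a_num, 40)
print Numeric.compress(mask_num, a_num)   # Numeric: compress with a mask
print a_na[a_na > 40]                     # numarray: boolean-mask subscript
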
--Paul Magwene

From falted at openlc.org Tue Jan 20 09:45:04 2004
From: falted at openlc.org (Francesc Alted)
Date: Tue Jan 20 09:45:04 2004
Subject: [Numpy-discussion] numarray 0.8 and MacOSX
Message-ID: <200401201844.31406.falted@openlc.org>

Hi,

I'm trying to compile numarray 0.8 on a MacOSX (Darwin 6.8). The compilation process seemed to go well, but an error happens when trying to import numarray:

[falted at ppc-osx2:numarray-0.8]$ python
Python 2.2 (#1, 10/24/02, 16:10:52)
[GCC Apple cpp-precomp 6.14] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import numarray
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/home/users/f/fa/falted/bin-macosx//lib/python/numarray/__init__.py", line 11, in ?
    from numarrayall import *
  File "/home/users/f/fa/falted/bin-macosx//lib/python/numarray/numarrayall.py", line 2, in ?
    from generic import *
  File "/home/users/f/fa/falted/bin-macosx//lib/python/numarray/generic.py", line 1030, in ?
    import numarraycore as _nc
  File "/home/users/f/fa/falted/bin-macosx//lib/python/numarray/numarraycore.py", line 29, in ?
    PyINT_TYPES = {
NameError: name 'bool' is not defined

I know that there are available ports of numarray 0.8 to Darwin, but, for a series of reasons, I prefer to compile it for myself. Anyone can provide a hint so as to compile it cleanly?

Thanks,

--
Francesc Alted

From jmiller at stsci.edu Tue Jan 20 10:07:03 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Tue Jan 20 10:07:03 2004
Subject: [Numpy-discussion] numarray 0.8 and MacOSX
In-Reply-To: <200401201844.31406.falted@openlc.org>
References: <200401201844.31406.falted@openlc.org>
Message-ID: <1074621875.20653.22.camel@halloween.stsci.edu>

On Tue, 2004-01-20 at 12:44, Francesc Alted wrote:
> Hi,
>
> I'm trying to compile numarray 0.8 on a MacOSX (Darwin 6.8). The compilation
> process seemed to go well, but an error happens when trying to import
> numarray:
>
> [falted at ppc-osx2:numarray-0.8]$ python
> Python 2.2 (#1, 10/24/02, 16:10:52)
> [GCC Apple cpp-precomp 6.14] on darwin
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import numarray
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
>   File "/home/users/f/fa/falted/bin-macosx//lib/python/numarray/__init__.py", line 11, in ?
>     from numarrayall import *
>   File "/home/users/f/fa/falted/bin-macosx//lib/python/numarray/numarrayall.py", line 2, in ?
>     from generic import *
>   File "/home/users/f/fa/falted/bin-macosx//lib/python/numarray/generic.py", line 1030, in ?
>     import numarraycore as _nc
>   File "/home/users/f/fa/falted/bin-macosx//lib/python/numarray/numarraycore.py", line 29, in ?
>     PyINT_TYPES = {
> NameError: name 'bool' is not defined
>
> I know that there are available ports of numarray 0.8 to Darwin, but, for a
> series of reasons, I prefer to compile it for myself. Anyone can provide a
> hint so as to compile it cleanly?

I tested numarray-0.8 on Darwin, but I tested it against user installed versions of Python 2.2.3 and 2.3.2. Both of these versions define bool. On the version of Mac OS-X I've got (10.2?), /usr/bin/python is 2.2.0, and it does not define bool. So, I don't think there is a clean compile, at least not for Mac users who don't also install updated versions of Python.
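One possible local workaround for the NameError Todd describes, offered only as a sketch rather than an official numarray or Apple fix: Python 2.2.0 predates the built-in name bool (a bool() function appeared in 2.2.1 and a true bool type in 2.3), so defining a fallback before numarray is imported -- for example from a sitecustomize.py on the import path -- gets past that one failure. It does not make Python 2.2.0 a supported platform.

    # Sketch of a sitecustomize.py shim for Apple's stock Python 2.2.0,
    # which lacks the built-in name bool that numarraycore.py trips over.
    # True and False are added as well in case other 2.3-era code expects
    # them.  This is a workaround sketch, not an official fix.
    try:
        bool                        # present on Python 2.2.1+ / 2.3+
    except NameError:
        import __builtin__
        def bool(value):
            return not not value    # minimal stand-in: returns 1 or 0
        __builtin__.bool = bool
        __builtin__.True = 1
        __builtin__.False = 0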
Todd

>
> Thanks,
>
> --
> Francesc Alted
>
>
>
>
> -------------------------------------------------------
> The SF.Net email is sponsored by EclipseCon 2004
> Premiere Conference on Open Tools Development and Integration
> See the breadth of Eclipse activity. February 3-5 in Anaheim, CA.
> http://www.eclipsecon.org/osdn
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/numpy-discussion

--
Todd Miller
Space Telescope Science Institute
3700 San Martin Drive
Baltimore MD, 21030
(410) 338 - 4576

From Chris.Barker at noaa.gov Tue Jan 20 11:13:05 2004
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Tue Jan 20 11:13:05 2004
Subject: [Numpy-discussion] Status of Numeric
In-Reply-To:
References: <400C3EF3.8090005@ee.byu.edu>
Message-ID: <400D7D66.5000504@noaa.gov>

Konrad Hinsen wrote:
> My view is that any
> package that is upwards-compatible with Numeric (except for bug fixes
> of course) should be called Numeric and distributed as such. Any
> package that is intentionally incompatible with Numeric in some
> important aspect should not be called Numeric.

I absolutely agree with this.

Travis Oliphant wrote:
> 1) change the coercion model to reflect Numarray's choice and eliminate
> the savespace crutch.
> 2) Add indexing capability to Numeric arrays (similar to Numarray's)
> 3) Improve the interaction between Numeric arrays and scalars.

These all look like backward-incompatible changes, so in that case, I vote for Sci-py-array, or whatever. However, it also looks like these are all moving toward the Numarray API. Is this the case? That would be great, as then Numarray would just be dropped in if/when it is deemed up to the task. It also leaves the door open for some sort of automagic selection of which array to use for a given instance.

> 4) Optimization:

Nothing wrong with that... as long as it's not premature!

> Numarray is making great progress and is quite usable for many
> purposes. An idea that was championed by some is that the Numeric code
> base would stay static and be replaced entirely by Numarray.
> However, Numeric is currently used in a large installed base. In
> particular SciPy uses Numeric as its core array. While no doubt
> numarray arrays will be supported in the future, the speed of the less
> bulky Numeric arrays and the typical case that we encounter in SciPy of
> many, small arrays will make it difficult for people to abandon Numeric
> entirely with it's comparatively light-weight arrays.

It was said that making Numarray more efficient with small arrays was a goal of the project... is it still? I'm still unclear on why Numarrays are so much more "heavy"... is it just that no one has taken the time to optimize them, or is there really something inherent (and important) in the design?

> As this has become an important path to
> success of several projects (both commercial and open) it is absolutely
> necessary that this issues be addressed.

From the small list above, it looks like what you need is an array that is like a Numarray, but faster for small arrays... Has anyone done an analysis of whether it would be harder to optimize Numarray than to make the above changes to Numeric, and continue to maintain two packages? You probably have, but I thought I'd ask anyway...
> Ultimately I think it will be a wise
> thing to have two implementations of arrays: one that is fast and
> lightweight optimized for many relatively small arrays, and another that
> is optimized for large-scale arrays.

Are these really incompatible goals?

> If most of the community
> wants to see Numeric go away then we will be forced to bring the
> Numeric array under the SciPy code-base and own it there.

I think it's quite the opposite... if most of the community wants to see Numeric continue on, it must be maintained (and improved) with little change to the API. If we're all going to switch to Numarray, then the SciPy project can do whatever it wants with Numeric...

In summary:

- Anything called "Numeric" should have an API compatible with the current version
- I'd much rather have just one N-d array type, preferably one that is part of the Python Standard Library... is that likely to ever happen?
- I also want fast small arrays.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer
NOAA/OR&R/HAZMAT           (206) 526-6959 voice
7600 Sand Point Way NE     (206) 526-6329 fax
Seattle, WA 98115          (206) 526-6317 main reception
Chris.Barker at noaa.gov

From edcjones at erols.com Tue Jan 20 11:47:04 2004
From: edcjones at erols.com (Edward C. Jones)
Date: Tue Jan 20 11:47:04 2004
Subject: [Numpy-discussion] How fast are small arrays currently?
Message-ID: <400D848B.8050004@erols.com>

Has anyone recently benchmarked the speed of numarray vs. Numeric? Why are numarrays so slow to create?

From falted at openlc.org Tue Jan 20 12:30:02 2004
From: falted at openlc.org (Francesc Alted)
Date: Tue Jan 20 12:30:02 2004
Subject: [Numpy-discussion] Status of Numeric
In-Reply-To: <400D7D66.5000504@noaa.gov>
References: <400C3EF3.8090005@ee.byu.edu> <400D7D66.5000504@noaa.gov>
Message-ID: <200401202129.20608.falted@openlc.org>

A Dimarts 20 Gener 2004 20:11, Chris Barker va escriure:
> > As this has become an important path to
> > success of several projects (both commercial and open) it is absolutely
> > necessary that this issues be addressed.
>
> From the sammll list above, it looks like what you need is an array
> that is like a Numarray, but faster for samll arrays...Has anyone done
> an analysis of whether it would be harder to optimize Numarray than to
> make the above changes to Numeric, and continue to maintain two
> packages? You probably have, but I though I'd ask anyway...

I agree. An analysis should be done in order to see whether it is better to concentrate on getting numarray better for small arrays or on having several array implementations. The problem is if numarray cannot be enhanced enough because of design problems, although I would bet that something can be done in order to get it close to Numeric performance. And I guess quite a few people on this list would be happy to collaborate in some way or another so as to achieve this goal. However, as Perry says, in order to do this analysis, an estimate of the needed speed-up should be made first.

I personally feel that it would be worth the effort to go and try to optimize the small arrays case in numarray instead of having to fight against a jungle of Numeric/numarray/python array implementations. I strongly believe that numarray has enough advantages over Numeric to compensate for the effort of addressing its present limitations rather than maintaining several packages.

Just my 2 cents,

--
Francesc Alted

From cookedm at physics.mcmaster.ca Tue Jan 20 12:33:01 2004
From: cookedm at physics.mcmaster.ca (David M. Cooke)
Date: Tue Jan 20 12:33:01 2004
Subject: [Numpy-discussion] How fast are small arrays currently?
In-Reply-To: <400D848B.8050004@erols.com>
References: <400D848B.8050004@erols.com>
Message-ID: <20040120203112.GA8661@arbutus.physics.mcmaster.ca>

On Tue, Jan 20, 2004 at 02:42:03PM -0500, Edward C. Jones wrote:
> Has anyone recently benchmarked the speed of numarray vs. Numeric?

Just what I was doing :-)

Check out http://arbutus.mcmaster.ca/dmc/numpy/ for a graph comparing the two.

Basically, I get on my machine (a 1.3 GHz Athlon running Linux), for an array of size N (of Float), the time to do a+a is

Numeric:  3.7940e-6 + 2.2556e-8 * N seconds
numarray: 3.7062e-5 + 5.8497e-9 * N

For sin(a),

Numeric:  1.7824e-6 + 1.1341e-7 * N
numarray: 2.8994e-5 + 9.8985e-8 * N

So the slowness of numarray vs. Numeric for small arrays is because of an overhead of 3.7e-5 s for numarray, as opposed to 3.8e-6 s for Numeric. Otherwise, numarray is 4 times faster for large arrays for addition (and multiplication, which I've also checked).

The crossover is at arrays of about 2000 elements.

If this overhead could be reduced by a factor of 3 or 4, I'd be much happier with using numarray for small arrays. But for now, it's not good enough.

--
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke                      http://arbutus.physics.mcmaster.ca/dmc/
|cookedm at physics.mcmaster.ca

From perry at stsci.edu Tue Jan 20 12:33:03 2004
From: perry at stsci.edu (Perry Greenfield)
Date: Tue Jan 20 12:33:03 2004
Subject: [Numpy-discussion] How fast are small arrays currently?
In-Reply-To: <400D848B.8050004@erols.com>
Message-ID:

> -----Original Message-----
> From: numpy-discussion-admin at lists.sourceforge.net
> [mailto:numpy-discussion-admin at lists.sourceforge.net]On Behalf Of Edward
> C. Jones
> Sent: Tuesday, January 20, 2004 2:42 PM
> To: numpy-discussion at lists.sourceforge.net
> Subject: [Numpy-discussion] How fast are small arrays currently?
>
>
> Has anyone recently benchmarked the speed of numarray vs. Numeric?
>

We presented some benchmarks at scipy 2003. It depends on many factors and what functions or operations are being performed, so it is hard to generalize (one reason I ask for specific cases that need improvement). But to take ufuncs as examples, the speed for 1 element arrays (about as small as they get) is:

                                  v0.4   v0.5
Int32 + Int32                       65    3.7
Int32 + Int32 discontiguous        104    7.3
Int32 + Float64                     95    4.9
add.reduce(Int32) NxN swapaxes     111    3.6
add.reduce(Int32, -1) NxN           98    3.2

What is shown is the (time for numarray operation)/(time for Numeric), for v0.4 and v0.5. Note that with v0.5, these are typically 3 to 4 times slower for small arrays, with a couple of cases somewhat worse (factors of 4.9 and 7.3). Speeds for v0.4 are substantially slower (orders of magnitude).

Note that the speedup is obtained through caching certain information. The first time you perform a certain operation (say an Int32/Int16 add), it will be slow. When repeated, it will be closer to the benchmark shown. If you are only going to do one operation on a small array, speed presumably doesn't matter much. It is only when you plan to iterate over many small arrays that it usually becomes an issue.

Other functions may be much worse (or better). If people let us know which things are too slow we can put that on our to do list. Is a factor of 3 or 4 times slower a killer? What about a factor of 2?

> Why are numarrays so slow to create?
>

I'll leave it to Todd to give the details of that.
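For anyone who wants to reproduce numbers like these, here is a minimal timing sketch. It is not David's actual script (his results are linked above); it assumes the 2004-era Numeric and numarray packages and Python 2 syntax, and it discards the first a+a call per size so that the one-time ufunc-cache setup Perry describes is not charged to the steady-state time. The time at small N approximates the fixed per-call overhead, the growth with N gives the per-element cost, and the crossover is roughly where the two curves meet.

    # Minimal per-call timing sketch for a+a (illustrative only, not the
    # script behind the numbers above).  Assumes the 2004-era Numeric and
    # numarray packages; Python 2 syntax.
    import time

    def per_call_time(mod, n, repeats=10000):
        a = mod.array([1.0] * n)        # Float64 array of length n
        a + a                           # warm-up call: let any caching happen
        t0 = time.clock()
        for i in xrange(repeats):
            a + a
        return (time.clock() - t0) / repeats

    if __name__ == '__main__':
        import Numeric, numarray
        print '%8s %12s %12s' % ('N', 'Numeric', 'numarray')
        for n in (1, 10, 100, 1000, 10000, 100000):
            print '%8d %12.3g %12.3g' % (n, per_call_time(Numeric, n),
                                         per_call_time(numarray, n))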
From verveer at embl.de Tue Jan 20 12:38:02 2004 From: verveer at embl.de (verveer at embl.de) Date: Tue Jan 20 12:38:02 2004 Subject: [Numpy-discussion] Status of Numeric Message-ID: <1074631021.400d916d09e59@webmail.embl.de> Just my 2 cents on the issue of replacing Numeric by Numarray: I was under the impression that Numarray was intended to be a replacement for Numeric, also as a building block for larger packages such as SciPy. Was Numarray not intended to be an "improved Numeric" in the first place? I chose to develop for Numarray rather than Numeric because of its improvements, under the assumption that eventually my code would also become available to the users of such packages as SciPy. (I wrote the nd_image extension that is now distributed with Numarray. I also contributed some improvements to RandomArray extension that are not in the Numeric version.) I believe that it would be a bad situation if the numerical python community would be split among two different array packages. (I think Paul Dubois expressed a similar sentiment on comp.lang.python). Supporting code for two incompatible packages would be a pain (I am personally not willing to do that). Not being able to use modules designed for one package in the other would be disappointing for many people, I think... If I understood well, the only issue with Numarray seems to be that the speed for handling small arrays is too low. So would it not be more efficient to focus on that problem rather than throwing away all the excellent work that has been done already on Numarray? Best regards, Peter -- Dr. Peter J. Verveer Cell Biology and Cell Biophysics Programme European Molecular Biology Laboratory Meyerhofstrasse 1 D-69117 Heidelberg Germany From perry at stsci.edu Tue Jan 20 12:39:03 2004 From: perry at stsci.edu (Perry Greenfield) Date: Tue Jan 20 12:39:03 2004 Subject: [Numpy-discussion] How fast are small arrays currently? In-Reply-To: <20040120203112.GA8661@arbutus.physics.mcmaster.ca> Message-ID: David M. Cooke writes: > Just what I was doing :-) > > Check out http://arbutus.mcmaster.ca/dmc/numpy/ for a graph comparing > the two. > > Basically, I get on my machine (a 1.3 GHz Athlon running Linux), for an > array of size N (of Float), the time to do a+a is > > Numeric: 3.7940e-6 + 2.2556e-8 * N seconds > numarray: 3.7062e-5 + 5.8497e-9 * N > > For sin(a), > Numeric: 1.7824e-6 + 1.1341e-7 * N > numarray: 2.8994e-5 + 9.8985e-8 * N > > So the slowness of numarray vs. Numeric for small arrays is because of > an overhead of 3.7e-5 s for numarray, as opposed to 3.8e-6 s for > Numeric. Otherwise, numarray is 4 times faster for large arrays > for addition (and multiplication, which I've also checked). > > The crossover is at arrays of about 2000 elements. > > If this overhead could be reduced by a factor of 3 or 4, I'd be much > happier with using numarray for small arrays. But for now, it's not > good enough. > How many times do you do the operation for each size? Because of caching, the first result may be much slower than the rest. If you didn't could you try computing it by discarding the first numarray time (or start timing after doing the first iteration)? Thanks, Perry From perry at stsci.edu Tue Jan 20 12:42:03 2004 From: perry at stsci.edu (Perry Greenfield) Date: Tue Jan 20 12:42:03 2004 Subject: [Numpy-discussion] Status of Numeric In-Reply-To: <1074631021.400d916d09e59@webmail.embl.de> Message-ID: Peter J. 
Verveer writes: > I was under the impression that Numarray was intended to be a > replacement for > Numeric, also as a building block for larger packages such as SciPy. Was > Numarray not intended to be an "improved Numeric" in the first > place? I chose > to develop for Numarray rather than Numeric because of its > improvements, under > the assumption that eventually my code would also become available to the > users of such packages as SciPy. (I wrote the nd_image extension > that is now > distributed with Numarray. I also contributed some improvements > to RandomArray > extension that are not in the Numeric version.) > It has been our intention to port scipy to use numarray soon. This work has been delayed somewhat since our current focus is on plotting. We do still intend to see that scipy works with numarray. Perry From cookedm at physics.mcmaster.ca Tue Jan 20 13:05:01 2004 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Tue Jan 20 13:05:01 2004 Subject: [Numpy-discussion] How fast are small arrays currently? In-Reply-To: References: <20040120203112.GA8661@arbutus.physics.mcmaster.ca> Message-ID: <20040120210414.GA9095@arbutus.physics.mcmaster.ca> On Tue, Jan 20, 2004 at 03:38:34PM -0500, Perry Greenfield wrote: > David M. Cooke writes: > > > Just what I was doing :-) > > > > Check out http://arbutus.mcmaster.ca/dmc/numpy/ for a graph comparing > > the two. > > > > Basically, I get on my machine (a 1.3 GHz Athlon running Linux), for an > > array of size N (of Float), the time to do a+a is > > > > Numeric: 3.7940e-6 + 2.2556e-8 * N seconds > > numarray: 3.7062e-5 + 5.8497e-9 * N > > > > For sin(a), > > Numeric: 1.7824e-6 + 1.1341e-7 * N > > numarray: 2.8994e-5 + 9.8985e-8 * N ... > How many times do you do the operation for each size? Because of > caching, the first result may be much slower than the rest. > If you didn't could you try computing it by discarding the first > numarray time (or start timing after doing the first iteration)? 10000 times per size. I'm re-running it like you suggested, but the difference is small (the new version is up on the above page). For numarray for addition, it's now 3.8771e-5 + 4.9832e-9 * N -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From cjw at sympatico.ca Tue Jan 20 14:20:01 2004 From: cjw at sympatico.ca (Colin J. Williams) Date: Tue Jan 20 14:20:01 2004 Subject: [Numpy-discussion] Status of Numeric In-Reply-To: <400C3EF3.8090005@ee.byu.edu> References: <400C3EF3.8090005@ee.byu.edu> Message-ID: <400DA93C.7030709@sympatico.ca> Travis Oliphant wrote: > > Numarray is making great progress and is quite usable for many > purposes. An idea that was championed by some is that the Numeric > code base would stay static and be replaced entirely by Numarray. It was my impression that this idea had been generally accepted. It was not just one of the proposals under discussion. I wonder how many others out there had assumed that, in spite of current speed problems, numarray was the way for the future, and had based their development endeavours on numarray. I did. To this relative outsider, there seem to have been three groups involved in efforts to provide Python with numerical array capabilities, those connected with Numeric, SciPy and numarray. SciPy would appear to be the most recent addition to the list. 
Is there any way that some agrement between these groups can be achieved to restore the hope for a common development path? This message from Travis Oliphant seems to envisage two paths. Is this the better way to go? > > However, Numeric is currently used in a large installed base. In > particular SciPy uses Numeric as its core array. While no doubt > numarray arrays will be supported in the future, the speed of the less > bulky Numeric arrays and the typical case that we encounter in SciPy > of many, small arrays will make it difficult for people to abandon > Numeric entirely with it's comparatively light-weight arrays. > > In the development of SciPy we have encountered issues in Numeric that > we feel need to be fixed. As this has become an important path to > success of several projects (both commercial and open) it is > absolutely necessary that this issues be addressed. > > > The purpose of this email is to assess the attitude of the community > regarding how these changes to Numeric should be accomplished. > These are the two options we can see: > * freeze old Numeric 23.x and make all changes to Numeric 24.x still > keeping Numeric separate from SciPy > * freeze old Numeric 23.x and subsume Numeric into SciPy essentially > creating a new SciPy arrayobject that is fast and lightweight. > Anybody wanting this new array object would get it by installing > scipy_base. Numeric would never change in the future but the array in > scipy_base would. > > It is not an option to wait for numarray to get fast enough as these > issues need to be addressed now. Ultimately I think it will be a wise > thing to have two implementations of arrays: one that is fast and > lightweight optimized for many relatively small arrays, and another > that is optimized for large-scale arrays. Eventually, the use of > these two underlying implementations should be automatic and invisible > to the user. Is this "automatic and invisible" practicable, excepts for trivial examples? > > A few of the particular changes we need to make to the Numeric > arrayobject are: > > 1) change the coercion model to reflect Numarray's choice and > eliminate the savespace crutch. > 2) Add indexing capability to Numeric arrays (similar to Numarray's) > 3) Improve the interaction between Numeric arrays and scalars. > 4) Optimization: > > Again, these changes are going to be made to some form of the Numeric > arrays. What I am really interested in knowing is the attitude of the > community towards keeping Numeric around. If most of the community > wants to see Numeric go away then we will be forced to bring the > Numeric array under the SciPy code-base and own it there. > > Your feedback is welcome and appreciated. > Sincerely, > > Travis Oliphant and other SciPy developers > > > > ------------------------------------------------------- > The SF.Net email is sponsored by EclipseCon 2004 > Premiere Conference on Open Tools Development and Integration > See the breadth of Eclipse activity. February 3-5 in Anaheim, CA. > http://www.eclipsecon.org/osdn > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion I hope that some cooperative approach can be devised. Colin W. 
From m.oliver at iu-bremen.de Tue Jan 20 14:52:02 2004
From: m.oliver at iu-bremen.de (Marcel Oliver)
Date: Tue Jan 20 14:52:02 2004
Subject: [Numpy-discussion] Status of Numeric
Message-ID: <16397.45434.933125.912105@localhost.localdomain>

Perry Greenfield writes:
> Peter J. Verveer writes:
>
> > I was under the impression that Numarray was intended to be a
> > replacement for Numeric, also as a building block for larger
> > packages such as SciPy. Was Numarray not intended to be an
> > "improved Numeric" in the first place? I chose to develop for
> > Numarray rather than Numeric because of its improvements, under
> > the assumption that eventually my code would also become
> > available to the users of such packages as SciPy. (I wrote the
> > nd_image extension that is now distributed with Numarray. I also
> > contributed some improvements to RandomArray extension that are
> > not in the Numeric version.)
> >
> It has been our intention to port scipy to use numarray soon. This
> work has been delayed somewhat since our current focus is on
> plotting. We do still intend to see that scipy works with numarray.

That this discussion is happening NOW really surprises me. I have been following this list for a couple of years now, with the intention of eventually using numerical Python as the main teaching toolbox for numerical analysis, and possibly for the migration of small research codes as well.

The possibility of doing numerics in Python has always intrigued me. Right now I am primarily using Matlab. It's very powerful, but not free and the language is horrible; Octave is trying to play catch up but has mostly lost steam. So a good scientific Python environment (of any sort) would be a really cool thing to have.

However, two things have always held me back (apart from coding small examples on a few occasions):

1. Numerical Python has been in a limbo for too long (I had even assumed a few times that both Numeric and Numarray were dead for all practical purposes). If there are two incompatible versions for years and no clear indication where the whole thing is going, I am very hesitant to invest any time into writing substantial code, or recommend it for classroom use.

2. Plotting is a major issue. There are a couple of semi-functional packages, but neither a comprehensive solution nor a clear direction for the plotting architecture.

In short, I see a lot of potential, unused mainly because the numerical Python community seems to lack clear direction and leadership. This is a real showstopper for someone who is primarily interested in building on top.

I am still hopeful that something will come of all this - any progress will be very much appreciated.

Best regards,
Marcel

---------------------------------------------------------------------
Marcel Oliver                          Phone: +49-421-200-3212
School of Engineering and Science      Fax:   +49-421-200-3103
International University Bremen        m.oliver at iu-bremen.de
Campus Ring 1                          oliver at member.ams.org
28759 Bremen, Germany                  http://math.iu-bremen.de/oliver
---------------------------------------------------------------------

From rays at blue-cove.com Tue Jan 20 14:56:04 2004
From: rays at blue-cove.com (Ray Schumacher)
Date: Tue Jan 20 14:56:04 2004
Subject: [Numpy-discussion] How fast are small arrays currently?
In-Reply-To: <20040120210414.GA9095@arbutus.physics.mcmaster.ca> References: <20040120203112.GA8661@arbutus.physics.mcmaster.ca> Message-ID: <5.2.0.4.2.20040120145122.13e4a098@blue-cove.com> With a cross-over at ~2000 elements, can we safely say that working with video, FITS cubes or other similar imagery would be fastest with numarray for summing or dividing 2D arrays? (~920K elements) Ray http://rjs.org/astro From cookedm at physics.mcmaster.ca Tue Jan 20 15:05:01 2004 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Tue Jan 20 15:05:01 2004 Subject: [Numpy-discussion] How fast are small arrays currently? In-Reply-To: <5.2.0.4.2.20040120145122.13e4a098@blue-cove.com> References: <5.2.0.4.2.20040120145122.13e4a098@blue-cove.com> Message-ID: <200401201803.53983.cookedm@physics.mcmaster.ca> On Tuesday 20 January 2004 17:54, Ray Schumacher wrote: > With a cross-over at ~2000 elements, can we safely say that working with > video, FITS cubes or other similar imagery would be fastest with numarray > for summing or dividing 2D arrays? (~920K elements) My benchmark was for 1-D arrays, but checking 2-D shows the crossover is in the same region. I'd say for these types of applications you really want to use numarray. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From Chris.Barker at noaa.gov Tue Jan 20 15:10:04 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Tue Jan 20 15:10:04 2004 Subject: [Numpy-discussion] Status of Numeric In-Reply-To: <16397.45434.933125.912105@localhost.localdomain> References: <16397.45434.933125.912105@localhost.localdomain> Message-ID: <400DB4DE.3030604@noaa.gov> > Perry Greenfield writes: > > It has been our intention to port scipy to use numarray soon. This > > work has been delayed somewhat since our current focus is on > > plotting. That is good news. What plotting package are you working on? Last I heard Chaco had turned into Enthought's (and STSci) in-house Windows only package. (Not because they want it that way, but because they don't have funding to make it work on other platforms, and support the broader community). I don't see anything new on the SciPy page after August '03. Frankly, weak plotting is a bigger deal to me than array performance. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From rays at blue-cove.com Tue Jan 20 16:03:04 2004 From: rays at blue-cove.com (Ray Schumacher) Date: Tue Jan 20 16:03:04 2004 Subject: [Numpy-discussion] Status of Numeric In-Reply-To: <16397.46274.310876.908257@localhost.localdomain> References: <5.2.0.4.2.20040120145558.13e5abf8@blue-cove.com> <16397.45434.933125.912105@localhost.localdomain> <5.2.0.4.2.20040120145558.13e5abf8@blue-cove.com> Message-ID: <5.2.0.4.2.20040120155807.13e7eca8@blue-cove.com> Hi Marcel, At 12:07 AM 1/21/2004 +0100, you wrote: > >Are you saying you have found that you have reinvented the wheel? >That's exactly what I suspect happening a lot... I'm sure a lot of people have written little plot utilities because of the size of Chaco and similar packages, or difficulty integrating with their favorite GUI or module. 
Ray From perry at stsci.edu Tue Jan 20 17:08:41 2004 From: perry at stsci.edu (Perry Greenfield) Date: Tue Jan 20 17:08:41 2004 Subject: [Numpy-discussion] How fast are small arrays currently? In-Reply-To: <20040120210414.GA9095@arbutus.physics.mcmaster.ca> Message-ID: <92F58991-4BAD-11D8-9B39-000393989D66@stsci.edu> > David M. Cooke writes: > > 10000 times per size. I'm re-running it like you suggested, but the > difference is small (the new version is up on the above page). For > numarray for addition, it's now > 3.8771e-5 + 4.9832e-9 * N > Well, OK we'll have to look into that. That's different by a factor of 3 or so than what I expected. I'll see if I can find what that is due to. Perry From perry at stsci.edu Tue Jan 20 17:32:06 2004 From: perry at stsci.edu (Perry Greenfield) Date: Tue Jan 20 17:32:06 2004 Subject: [Numpy-discussion] Status of Numeric In-Reply-To: <400DA93C.7030709@sympatico.ca> Message-ID: <57090E6E-4BB1-11D8-9B39-000393989D66@stsci.edu> On Tuesday, January 20, 2004, at 05:18 PM, Colin J. Williams wrote: > Travis Oliphant wrote: > >> >> Numarray is making great progress and is quite usable for many >> purposes. An idea that was championed by some is that the Numeric >> code base would stay static and be replaced entirely by Numarray. > > It was my impression that this idea had been generally accepted. It > was not just one of the proposals under discussion. > I don't think there was ever any formal vote. I think Paul Dubois had accepted the idea, others had a more "wait and see" attitude. Realistically, I think one can safely say that as one might expect, those that already were using Numeric probably were happy with its capabilities and that given normal motivations, there would be significant inertia on the part of well established users (those with a lot of code already) to switch over. But since it wasn't quite as usable for our needs, we decided that we needed a new version. We had to develop it to support our needs and would have done it regardless. We hoped that it would be suitable for all uses, and we've tried to involve all in the process as much as possible. As you might expect, we've devoted most of our attention to meeting our needs, but we have also expended significant energy trying to meet the needs of the more general community (and we will continue to try to do so within our resources). I don't know if it is reasonable to expect that a certain outcome has been blessed by all, nor did most of the existing Numeric users ask us to do this. But many did recognize (as Paul Dubois alluded to) that there was a need to recode the array stuff. Maybe someone could have done a better job of it, but no one else has yet (it is a fair amount of work after all). We do intend to support all the important packages that Numeric does, it make take some time to get there. I suppose our goal is to eventually attract all new users. We can't, nor should we expect that existing Numeric users will switch at our desire or whim. > I wonder how many others out there had assumed that, in spite of > current speed problems, numarray was the way for the future, and had > based their development endeavours on numarray. I did. > > To this relative outsider, there seem to have been three groups > involved in efforts to provide Python with numerical array > capabilities, those connected with Numeric, SciPy and numarray. SciPy > would appear to be the most recent addition to the list. 
> Actually, I think it would be more accurate to say that SciPy is an attempt to collect a large base of numeric code and integrate it into an array package (currently Numeric) rather than to develop a new array package. It was started before we started numarray and thus was centered around Numeric. They have found occasions to to modify and extend Numeric behavior. In that sense, it long has been somewhat incompatible with Numeric. (Travis can correct me if I got that wrong.) > Is there any way that some agrement between these groups can be > achieved to restore the hope for a common development path? > I would certainly like to, and in any case, we want to adapt scipy to be compatible with numarray. Perry Greenfield From perry at stsci.edu Tue Jan 20 17:43:02 2004 From: perry at stsci.edu (Perry Greenfield) Date: Tue Jan 20 17:43:02 2004 Subject: [Numpy-discussion] Status of Numeric In-Reply-To: <16397.45434.933125.912105@localhost.localdomain> Message-ID: On Tuesday, January 20, 2004, at 05:53 PM, Marcel Oliver wrote: > That this discussion is happening NOW really surprises me. I have > been following this list for a couple of years now, with the intention > of eventually using numerical Python as the main teaching toolbox for > numerical analysis, and possibly for the migration small research > codes as well. > > The possibility of doing numerics in Phython has always intrigued me. > Right now I am primarily using Matlab. It's very powerful, but not > free and the language is horrible; Octave is trying to play catch up > but has mostly lost steam. So a good scientific Phython environment > (of any sort) would be a really cool thing to have. > > However, two things have always held me back (apart from coding small > examples on a few occasions): > > 1. Numerical Phython has been in a limbo for too long (I had even > assumed a few times that both Numeric and Numarray were dead for > all practical purposes). If there are two incompatible version for > I don't know why you assumed that. Both have regularly been updated more than once in the past two years. > years and no clear indication where the whole thing is going, I am > very hesitant to invest any time into writing substantial code, or > recommend it for class room use. > That's your right of course. You have to remember that neither we (STScI) nor Enthought (who has funded virtually all the scipy work) are getting paid to do the work we are doing for the general community. In our case, we do much of it for our own purposes, and it would certainly be to our advantage if numarray were adopted by the general community so we invest resources in it. If you don't feel it is ready for your purposes, don't use numarray (or Numeric). We have only so many resources and while we wish we could do everything immediately, we can't. We are committed to making Python a good scientific environment, but we don't promise that it has everything that everyone would need now (and it certainly doesn't). > 2. Plotting is a major issue. There are a couple of semi-functional > packages, but neither a comprehensive solution nor a clear > direction for the plotting architecture. > I agree completely. A later (tonight) message will discuss the current situation at more length. Perry Greenfield From perry at stsci.edu Tue Jan 20 17:46:00 2004 From: perry at stsci.edu (Perry Greenfield) Date: Tue Jan 20 17:46:00 2004 Subject: [Numpy-discussion] How fast are small arrays currently? 
In-Reply-To: <5.2.0.4.2.20040120145122.13e4a098@blue-cove.com> Message-ID: <43E10567-4BB3-11D8-9B39-000393989D66@stsci.edu> On Tuesday, January 20, 2004, at 05:54 PM, Ray Schumacher wrote: > With a cross-over at ~2000 elements, can we safely say that working > with video, FITS cubes or other similar imagery would be fastest with > numarray for summing or dividing 2D arrays? (~920K elements) > As long as you treat the array as a whole, I'd say that usually numarray would be better suited. That doesn't mean you won't find some instances where it is slower for certain operations. (When you do, let us know). Perry Greenfield From bsder at allcaps.org Tue Jan 20 18:53:00 2004 From: bsder at allcaps.org (Andrew P. Lentvorski, Jr.) Date: Tue Jan 20 18:53:00 2004 Subject: [Numpy-discussion] Status of Numeric In-Reply-To: <400C3EF3.8090005@ee.byu.edu> References: <400C3EF3.8090005@ee.byu.edu> Message-ID: <20040120181047.G98683@mail.allcaps.org> On Mon, 19 Jan 2004, Travis Oliphant wrote: > ... Ultimately I think it will be a wise thing to have two > implementations of arrays: one that is fast and lightweight optimized > for many relatively small arrays, and another that is optimized for > large-scale arrays. I am *extremely* interested in the use case of the small arrays in SciPy. Which algorithms and modules are dominated by the small array speed? -a From perry at stsci.edu Tue Jan 20 19:04:02 2004 From: perry at stsci.edu (Perry Greenfield) Date: Tue Jan 20 19:04:02 2004 Subject: [Numpy-discussion] Status of Numeric (and plotting in particular) In-Reply-To: <400DB4DE.3030604@noaa.gov> Message-ID: <27E3B046-4BBE-11D8-9B39-000393989D66@stsci.edu> On Tuesday, January 20, 2004, at 06:08 PM, Chris Barker wrote: >> Perry Greenfield writes: > >> > It has been our intention to port scipy to use numarray soon. This >> > work has been delayed somewhat since our current focus is on >> > plotting. > > That is good news. What plotting package are you working on? Last I > heard Chaco had turned into Enthought's (and STSci) in-house Windows > only package. (Not because they want it that way, but because they > don't have funding to make it work on other platforms, and support the > broader community). > > I don't see anything new on the SciPy page after August '03. > > Frankly, weak plotting is a bigger deal to me than array performance. > Yes, I agree completely (and why we are giving plotting higher priority than scipy integration). I really was hoping to raise this issue later, but I might as well address it since the Numeric/numarray issue has raised it indirectly. Chaco had been the focus of our plotting efforts for more than a year. The effort started with our funding Enthought to start the effort. We had a number of requirements for a plotting package that weren't met by any existing package, and it didn't appear that any would be easily modified to our needs. The requirements we had (off the top of my head) included: 1) easy portability to graphics devices *and* different windowing systems. 2) it had to run on all major platforms including Solaris, Linux, Macs, and Windows. 3) the graphics had to be embedable within gui widgets. 4) it had to allow cursor interactions, at least to the point of being able to read cursor positions from python. 5) it had to be open source and preferably not gpl (though the latter was probably not a show stopper for us) 6) It also had to be customizable to the point of being able to produce very high quality hardcopy plots suitable for publication. 
7) object oriented plotting framework capable of sensible composition.
8) command line interface akin to that available in matlab or IDL to make producing quick interactive plots very, very easy.

Developing something that satisfies these is not at all trivial. In the process Enthought has expended much energy developing chaco, kiva and traits (and lately they are working on yet more extensions); easily much more of the effort has come from sources other than STScI. Kiva is the back end that presents a uniform api for different graphics devices. Traits handles many of the user interface issues for plot parameters, and handling the relationships of these parameters between plot components. Chaco is the higher level plotting software that provides the traditional plotting capabilities for 2-d data. Much has been invested in chaco.

It is with some regret that we (STScI) have concluded that chaco is not suitable for our needs and that we need to take a different approach (or at least give it a try). I'll take some space to explain why.

The short answer is that in the end we think it was too ambitious. We still aim to achieve the goals I listed above. The problem, we think, is that chaco was also tasked to try to achieve extra goals with regard to interactive capabilities that were, in the end, not really important to STScI and its community, but were important to Enthought (and presumably its clients, and the scipy community). More specifically, a lot of thought and work went into making it so that many aspects of the plots could be interactively modified. That is, by clicking on various aspects of plots, one could bring up editors for the attributes of that plot element, such as color, line style, font, size, etc. Many other interactive aspects have been enhanced as well. Much recent work by Enthought is going into extending the capabilities even further by adding gui kinds of features (e.g., widgets of all sorts).

Unfortunately these capabilities have come at a price, namely complexity. We have found it difficult to track the ongoing changes to chaco to become proficient enough to contribute significantly by adding capabilities we have needed. Perhaps that argues that we aren't competent to do so. To a certain degree, that is probably true. There is no doubt that Enthought has some very talented software engineers working on chaco and related products. On the other hand, our goal is to have this software be accessible by scientists in general, and particularly astronomers. Chaco is complex enough that we think that is a serious problem. Customizing its behavior requires a very large investment of time understanding how it works, far beyond what most astronomers are willing to tackle (at least that's my impression).

Much of this complexity (and many of its ongoing changes) is to support the interactive capabilities, and to make it responsive enough that plots can update themselves quickly enough not to lead to annoying lags. But frankly, we just want something to render plots on the screen and on hardcopy. Outside of being able to obtain cursor coordinates, we find many of the interactive capabilities secondary in importance. When most astronomers want to tune a plot (either for publication quality, or for batch processing), they usually want to be able to reproduce the adjustments for new data, for which the interactive attribute editing capability is of little use. Generally they would like to script the more customized plots so that they can be easily modified and reused.
So it seems that it is too difficult to accomplish all these aims within one package. We would like to develop a different plotting package (using many of ideas from chaco, and some code) based on kiva and the traits package. We have started on this over the past month, and hope to have some simple functionality available within a month (though when we make it public may take a bit longer). It will be open source and we hope significantly simpler than chaco. It will not focus on speed (well, we want fairly fast display times for plots of a reasonable number of points, but we don't need video refresh rates). If your interest in plotting matches ours, then this may be for you. We will welcome contributions and comments once we get it off the ground. (We are calling it pyxis by the way).

Enthought is continuing to work on chaco and at some point that will be mature, and will be capable of some sophisticated things. That may be more appropriate for some than what we are working on.

Perry Greenfield

From SKuzminski at fairisaac.com Wed Jan 21 05:06:02 2004
From: SKuzminski at fairisaac.com (Kuzminski, Stefan R)
Date: Wed Jan 21 05:06:02 2004
Subject: [Numpy-discussion] Status of Numeric (and plotting in particular)
Message-ID: <7646464ACC9B5347A4A5C57729D74A5503540DB4@srfmsg100.corp.fairisaac.com>

I'm working on a commercial product that produces publication quality plots from data contained in Numeric arrays. I also concluded that Chaco was a bit more involved than I needed. My question is what requirements are not met by the other available plotting packages such as..

http://matplotlib.sourceforge.net/

These don't have every bell and whistle ( esp. when it comes to the interactive 'properties' dialog ) but as you point out there is a dark side to those features. There are a number of quite capable plotting packages for Python, diversity is good up to a point, but this space ( Plotting packages ) seems ripe for a shakeout.

Stefan
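As a concrete illustration of the scripted, reproducible style Perry argues for above (his requirement 8), and of the matlab-like command interface that packages such as the matplotlib Stefan mentions aim to provide, here is a short sketch. The import spelling below is matplotlib's present-day pyplot interface, which postdates 2004 (early releases used a matlab-style module), so take it as an illustration of the style rather than period-accurate code.

    # Sketch of a fully scripted plot: rerunning the script reproduces the
    # figure exactly, with no interactive property editing.  Uses matplotlib's
    # modern pyplot interface, not the 2004-era API.
    import math
    import matplotlib.pyplot as plt

    x = [0.01 * i for i in range(629)]           # 0 .. ~2*pi
    y = [math.sin(v) for v in x]

    plt.plot(x, y, label='sin(x)')
    plt.xlabel('x [rad]')
    plt.ylabel('sin(x)')
    plt.title('A scripted, reproducible figure')
    plt.legend()
    plt.savefig('sin.png', dpi=150)              # hardcopy without GUI clicks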
From jh at oobleck.astro.cornell.edu Wed Jan 21 10:45:02 2004
From: jh at oobleck.astro.cornell.edu (Joe Harrington)
Date: Wed Jan 21 10:45:02 2004
Subject: [Numpy-discussion] the direction and pace of development
Message-ID: <200401211844.i0LIikEA004114@oobleck.astro.cornell.edu>

This is a necessarily long post about the path to an open-source replacement for IDL and Matlab. While I have tried to be fair to those who have contributed much more than I have, I have also tried to be direct about what I see as some fairly fundamental problems in the way we're going about this. I've given it some section titles so you can navigate, but I hope that you will read the whole thing before posting a reply. I fear that this will offend some people, but please know that I value all your efforts, and offense is not my intent.

THE PAST VS. NOW

While there is significant and dedicated effort going into numeric/numarray/scipy, it's becoming clear that we are not progressing quickly toward a replacement for IDL and Matlab. I have great respect for all those contributing to the code base, but I think the present discussion indicates some deep problems. If we don't identify those problems (easy) and solve them (harder, but not impossible), we will continue not to have the solution so many people want.
To be convinced that we are doing something wrong at a fundamental level, consider that Python was the clear choice for a replacement in 1996, when Paul Barrett and I ran a BoF at ADASS VI on interactive data analysis environments. That was over 7 years ago. When people asked at that conference, "what does Python need to replace IDL or Matlab", the answer was clearly "stable interfaces to basic numerics and plotting; then we can build it from there following the open-source model". Work on both these problems was already well underway then. Now, both the numerical and plotting development efforts have branched. There is still no stable base upon which to build. There aren't even packages for popular OSs that people can install and play with. The problem is not that we don't know how to do numerics or graphics; if anything, we know these things too well. In 1996, if anyone had told us that in 2004 there would be no ready-to-go replacement system because of a factor of 4 in small array creation overhead (on computers that ran 100x as fast as those then available) or the lack of interactive editing of plots at video speeds, the response would not have been pretty. How would you have felt? THE PROBLEM We are not following the open-source development model. Rather, we pay lip service to it. Open source's development mantra is "release early, release often". This means release to the public, for use, a package that has core capability and reasonably-defined interfaces. Release it in a way that as many people as possible will get it, install it, use it for real work, and contribute to it. Make the main focus of the core development team the evaluation and inclusion of contributions from others. Develop a common vision for the program, and use that vision to make decisions and keep efforts focused. Include contributing developers in decision making, but do make decisions and move on from them. Instead, there are no packages for general distribution. The basic interfaces are unstable, and not even being publicly debated to decide among them (save for the past 3 days). The core developers seem to spend most of their time developing, mostly out of view of the potential user base. I am asked probably twice a week by different fellow astronomers when an open-source replacement for IDL will be available. They are mostly unaware that this effort even exists. However, this indicates that there are at least hundreds of potential contributors of application code in astronomy alone, as I don't nearly know everyone. The current efforts look rather more like the GNU project than Linux. I'm sorry if that hurts, but it is true. I know that Perry's group at STScI and the fine folks at Enthought will say they have to work on what they are being paid to work on. Both groups should consider the long term cost, in dollars, of spending those development dollars 100% on coding, rather than 50% on coding and 50% on outreach and intake. Linus himself has written only a small fraction of the Linux kernel, and almost none of the applications, yet in much less than 7 years Linux became a viable operating system, something much bigger than what we are attempting here. He couldn't have done that himself, for any amount of money. We all know this. THE PATH Here is what I suggest: 1. We should identify the remaining open interface questions. Not, "why is numeric faster than numarray", but "what should the syntax of creating an array be, and of doing different basic operations". 
If numeric and numarray are in agreement on these issues, then we can move on, and debate performance and features later. 2. We should identify what we need out of the core plotting capability. Again, not "chaco vs. pyxis", but the list of requirements (as an astronomer, I very much like Perry's list). 3. We should collect or implement a very minimal version of the featureset, and document it well enough that others like us can do simple but real tasks to try it out, without reading source code. That documentation should include lists of things that still need to be done. 4. We should release a stand-alone version of the whole thing in the formats most likely to be installed by users on the four most popular OSs: Linux, Windows, Mac, and Solaris. For Linux, this means .rpm and .deb files for Fedora Core 1 and Debian 3.0r2. Tarballs and CVS checkouts are right out. We have seen that nobody in the real world installs them. To be most portable and robust, it would make sense to include the Python interpreter, named such that it does not stomp on versions of Python in the released operating systems. Static linking likewise solves a host of problems and greatly reduces the number of package variants we will have to maintain. 5. We should advertize and advocate the result at conferences and elsewhere, being sure to label it what it is: a first-cut effort designed to do a few things well and serve as a platform for building on. We should also solicit and encourage people either to work on the included TODO lists or to contribute applications. One item on the TODO list should be code converters from IDL and Matlab to Python, and compatibility libraries. 6. We should then all continue to participate in the discussions and development efforts that appeal to us. We should keep in mind that evaluating and incorporating code that comes in is in the long run much more efficient than writing the universe ourselves. 7. We should cut and package new releases frequently, at least once every six months. It is better to delay a wanted feature by one release than to hold a release for a wanted feature. The mountain is climbed in small steps. The open source model is successful because it follows closely something that has worked for a long time: the scientific method, with its community contributions, peer review, open discussion, and progress mainly in small steps. Once basic capability is out there, we can twiddle with how to improve things behind the scenes. IS SCIPY THE WAY? The recipe above sounds a lot like SciPy. SciPy began as a way to integrate the necessary add-ons to numeric for real work. It was supposed to test, document, and distribute everything together. I am aware that there are people who use it, but the numbers are small and they seem to be tightly connected to Enthought for support and application development. Enthought's focus seems to be on servicing its paying customers rather than on moving SciPy development along, and I fear they are building an installed customer base on interfaces that were not intended to be stable. So, I will raise the question: is SciPy the way? Rather than forking the plotting and numerical efforts from what SciPy is doing, should we not be creating a new effort to do what SciPy has so far not delivered? These are not rhetorical or leading questions. I don't know enough about the motivations, intentions, and resources of the folks at Enthought (and elsewhere) to know the answer. I do think that such a fork will occur unless SciPy's approach changes substantially. 
The way to decide is for us all to discuss the question openly on these lists, and for those willing to participate and contribute effort to declare so openly. I think all that is needed, either to help SciPy or replace it, is some leadership in the direction outlined above. I would be interested in hearing, perhaps from the folks at Enthought, alternative points of view. Why are there no packages for popular OSs for SciPy 0.2? Why are releases so infrequent? If the folks running the show at scipy.org disagree with many others on these lists, then perhaps those others would like to roll their own. Or, perhaps stable/testing/unstable releases of the whole package are in order. HOW TO CONTRIBUTE? Judging by the number of PhDs in sigs, there are a lot of researchers on this list. I'm one, and I know that our time for doing core development or providing the aforementioned leadership is very limited, if not zero. Later we will be in a much better position to contribute application software. However, there is a way we can contribute to the core effort even if we are not paid, and that is to put budget items in grant and project proposals to support the work of others. Those others could be either our own employees or subcontractors at places like Enthought or STScI. A handful of contributors would be all we'd need to support someone to produce OS packages and tutorial documentation (the stuff core developers find boring) for two releases a year. --jh-- From jwp at psychology.nottingham.ac.uk Wed Jan 21 11:10:04 2004 From: jwp at psychology.nottingham.ac.uk (Jon Peirce) Date: Wed Jan 21 11:10:04 2004 Subject: [Numpy-discussion] re: Status of Numeric (and plotting in particular) In-Reply-To: References: Message-ID: <400ECE6C.30106@psychology.nottingham.ac.uk> > > >We have started on this over the past month, and hope to have some >simple >functionality available within a month (though when we make it public >may >take a bit longer). It will be open source and we hope significantly >simpler >than chaco. It will not focus on speed (well, we want fairly fast >display times >for plots of a reasonable number of points, but we don't need video >refresh >rates). If your interest in plotting matches ours, then this may be for >you. >We will welcome contributions and comments once we get it off the >ground. >(We are calling it pyxis by the way). > I agree with the sentiment that chaco is a very heavy and confusing package for the average scientist (but maybe great for the full-time programmer) but I'm really concerned about the idea that we need *another* solution started from scratch. There are already so many including scipy.gplt, scipy.plt, dislin, biggles, pychart, piddle, pgplot, pyx (new)... In particular MatPlotLib looks promising - check out its examples: http://matplotlib.sourceforge.net/screenshots.html *Many* plotting types already, simple syntax, a few different backends. And already has something of a following. So is it really not possible for STScI to push its resources into aiding the development of something that's already begun? Would be great if we could develop a single package really well rather than everyone making their own.
-- Jon Peirce Nottingham University +44 (0)115 8467176 (tel) +44 (0)115 9515324 (fax) http://www.psychology.nottingham.ac.uk/staff/jwp/ From perry at stsci.edu Wed Jan 21 12:07:01 2004 From: perry at stsci.edu (Perry Greenfield) Date: Wed Jan 21 12:07:01 2004 Subject: [Numpy-discussion] re: Status of Numeric (and plotting in particular) In-Reply-To: <400ECE6C.30106@psychology.nottingham.ac.uk> Message-ID: Jon Peirce writes: > > I agree with the sentiment that chaco is a very heavy and confusing > package for the average scientist (but maybe great for the full-time > programmer) but I'm really concerned about the idea that we need > *another* solution started from scratch. There are already so many > including scipy.gplt, scipy.plt, dislin, biggles, pychart, piddle, > pgplot, pyx (new)... > We had looked at all of these and each had fallen short in some major way (though I thought piddle had much promise and perhaps could be built on; however it was intended as a back end only.) > In particular MatPlotLib looks promising - check out its examples: > http://matplotlib.sourceforge.net/screenshots.html > *Many* plotting types already, simple syntax, a few different backends. > And already has something of a following. > This we had not seen. A superficial look indicates that it is worth investigating further as a basis for a plotting package. I didn't see any major problem with it that contradicted our requirements, but obviously we will have to look at it in more depth to see if that is the case. It doesn't have to be perfect of course. And it is much more expensive to start from scratch (though we weren't doing that entirely since a number of components from the chaco effort would have been reused). But this is worth seriously considering. Perry Greenfield > So is it really not possible for STScI to push its resources into aiding > the development of something that's already begun? Would be great if we > could develop a single package really well rather than everyone making > their own. > From magnus at hetland.org Wed Jan 21 12:53:06 2004 From: magnus at hetland.org (Magnus Lie Hetland) Date: Wed Jan 21 12:53:06 2004 Subject: [Numpy-discussion] re: Status of Numeric (and plotting in particular) In-Reply-To: References: <400ECE6C.30106@psychology.nottingham.ac.uk> Message-ID: <20040121205248.GA24551@idi.ntnu.no> Perry Greenfield : > > Jon Peirce writes: > > > > I agree with the sentiment that chaco is a very heavy and confusing > > package for the average scientist (but maybe great for the full-time > > programmer) but I'm really concerned about the idea that we need > > *another* solution started from scratch. There are already so many > > including scipy.gplt, scipy.plt, dislin, biggles, pychart, piddle, > > pgplot, pyx (new)... > > > We had looked at all of these and each had fallen short in some major > way (though I thought piddle had much promise and perhaps could be > built on; however it was intended as a back end only.) Wohoo! Piddle lives ;) I think I'd be interested in resuming some of my earlier work on Piddle if it is ever used for something useful -- such as a proper plotting tool. (I was actually just thinking about wrapping PyX in the Piddle interface to make TeX typesetting available in Piddle.) [snip about matplotlib] Hm. Maybe a Piddle back-end could be written for it (which would instantly give it lots of extra back-ends)...? Two birds with one stone and all that... - M -- Magnus Lie Hetland "The mind is not a vessel to be filled, http://hetland.org but a fire to be lighted."
[Plutarch] From jmiller at stsci.edu Wed Jan 21 13:21:05 2004 From: jmiller at stsci.edu (Todd Miller) Date: Wed Jan 21 13:21:05 2004 Subject: [Numpy-discussion] How fast are small arrays currently? In-Reply-To: References: Message-ID: <1074719906.1424.251.camel@halloween.stsci.edu> > > Why are numarrays so slow to create? > > There are several portable ways to create numarrays (array(), arange(), zeros(), ones()) and I'm not really sure which one to address, so I poked around some. I discovered that numarray-0.8 has a problem with array() which causes very poor performance (~30x slower than Numeric) for arrays created from a sequence. The problem is with a private Python function, _all_arrays(), that scans the sequence to see if it consists only of arrays; _all_arrays() works badly for the ordinary case of a sequence of numbers. This is fixed now in CVS. Beyond this flaw in array(), it's a mixed bag, with numarray tending to do well with large arrays and certain use cases, and Numeric doing well with small arrays and other use cases. Todd > I'll leave it to Todd to give the details of that. > > > > > ------------------------------------------------------- > The SF.Net email is sponsored by EclipseCon 2004 > Premiere Conference on Open Tools Development and Integration > See the breadth of Eclipse activity. February 3-5 in Anaheim, CA. > http://www.eclipsecon.org/osdn > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion -- Todd Miller Space Telescope Science Institute 3700 San Martin Drive Baltimore MD, 21030 (410) 338 - 4576 From hinsen at cnrs-orleans.fr Wed Jan 21 13:27:00 2004 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Wed Jan 21 13:27:00 2004 Subject: [Numpy-discussion] the direction and pace of development In-Reply-To: <200401211844.i0LIikEA004114@oobleck.astro.cornell.edu> References: <200401211844.i0LIikEA004114@oobleck.astro.cornell.edu> Message-ID: <6DBBCCFD-4C58-11D8-8519-000A95AB5F10@cnrs-orleans.fr> On 21.01.2004, at 19:44, Joe Harrington wrote: > This is a necessarily long post about the path to an open-source > replacement for IDL and Matlab. While I have tried to be fair to You raise many good points here. Some comments: > those who have contributed much more than I have, I have also tried to > be direct about what I see as some fairly fundamental problems in the > way we're going about this. I've given it some section titles so you I'd say the fundamental problem is that "we" don't exist as a coherent group. There are a few developer groups (e.g. at STSC and Enthought) who write code primarily for their own need and then make it available. The rest of us are what one could call "power users": very interested in the code, knowledgeable about its use, but not contributing to its development other than through testing and feedback. > THE PROBLEM > > We are not following the open-source development model. Rather, we True. But is it perhaps because that model is not so well adapted to our situation? If you look at Linux (the OpenSource reference), it started out very differently. It was a fun project, done by hobby programmers who shared an idea of fun (kernel hacking). Linux was not goal-oriented in the beginnings. No deadlines, no usability criteria, but lots of technical challenges. Our situation is very different. We are scientists and engineers who want code to get our projects done. 
We have clear goals, and very limited means, plus we are mostly someone's employees and thus not free to do as we would like. On the other hand, our project doesn't provide the challenges that attract the kind of people who made Linux big. You don't get into the news by working on NumPy, you don't work against Microsoft, etc. Computational science and engineering just isn't the same as kernel hacking. I develop two scientific Python libraries myself, more specialized and thus with a smaller market share, but the situation is otherwise similar. And I work much like the Numarray people do: I write the code that I need, and I invest minimal effort in distribution and marketing. To get the same code developed in the Linux fashion, there would have to be many more developers. But they just don't exist. I know of three people worldwide whose competence in both Python/C and in the application domain is good enough that they could work on the code base. This is not enough to build a networked development community. The potential NumPy community is certainly much bigger, but I am not sure it is big enough. Working on NumPy/Numarray requires the combination of not-so-frequent competences, plus availability. I am not saying it can't be done, but it sure isn't obvious that it can be. > Release it in a way that as many people as possible will get it, > install it, use it for real work, and contribute to it. Make the main > focus of the core development team the evaluation and inclusion of > contributions from others. Develop a common vision for the program, This requires yet different competences, and thus different people. It takes people who are good at reading others' code and communicating with them about it. Some people are good programmers, some are good scientists, some are good communicators. How many are all of that - *and* available? > I know that Perry's group at STScI and the fine folks at Enthought > will say they have to work on what they are being paid to work on. > Both groups should consider the long term cost, in dollars, of > spending those development dollars 100% on coding, rather than 50% on > coding and 50% on outreach and intake. Linus himself has written only You are probably right. But does your employer think long-term? Mine doesn't. > applications, yet in much less than 7 years Linux became a viable > operating system, something much bigger than what we are attempting Exactly. We could be too small to follow the Linux way. > 1. We should identify the remaining open interface questions. Not, > "why is numeric faster than numarray", but "what should the syntax > of creating an array be, and of doing different basic operations". Yes, a very good point. Focus on the goal, not on the legacy code. However, a technical detail that should not be forgotten here: NumPy and Numarray have a C API as well, which is critical for many add-ons and applications. A C API is more closely tied to the implementation than a Python API. It might thus be difficult to settle on an API and then work on efficient implementations. > 2. We should identify what we need out of the core plotting > capability. Again, not "chaco vs. pyxis", but the list of > requirements (as an astronomer, I very much like Perry's list). 100% agreement. For plotting, defining the interface should be easier (no C stuff). Konrad.
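Konrad's point that the Python-level interface is easier to settle than the C API can be made concrete. The following is a minimal sketch, not taken from either package's documentation: it assumes Python 2.x with both Numeric and numarray installed, and the exact typecode spellings and any timing numbers will vary with versions and machines. At this level the creation calls that Joe's point 1 asks the community to agree on are already nearly identical; the open questions are mostly coercion rules, type objects, and per-call overhead.

    import time
    import Numeric
    import numarray

    # Creation calls under discussion: both packages accept essentially
    # the same syntax.
    a = Numeric.array([1.0, 2.0, 3.0])
    b = numarray.array([1.0, 2.0, 3.0])
    z1 = Numeric.zeros((100, 100), Numeric.Float64)
    z2 = numarray.zeros((100, 100), numarray.Float64)
    r1 = Numeric.arange(10)
    r2 = numarray.arange(10)

    # Rough measure of the small-array creation overhead discussed in
    # this thread; absolute numbers depend on the machine and versions.
    def creation_time(mod, repeats=10000):
        start = time.clock()
        for i in xrange(repeats):
            mod.array([1.0, 2.0, 3.0])
        return time.clock() - start

    print "Numeric :", creation_time(Numeric)
    print "numarray:", creation_time(numarray)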
From perry at stsci.edu Wed Jan 21 13:29:01 2004 From: perry at stsci.edu (Perry Greenfield) Date: Wed Jan 21 13:29:01 2004 Subject: [Numpy-discussion] the direction and pace of development In-Reply-To: <200401211844.i0LIikEA004114@oobleck.astro.cornell.edu> Message-ID: Joe Harrington writes: > > This is a necessarily long post about the path to an open-source > replacement for IDL and Matlab. While I have tried to be fair to > those who have contributed much more than I have, I have also tried to > be direct about what I see as some fairly fundamental problems in the > way we're going about this. I've given it some section titles so you > can navigate, but I hope that you will read the whole thing before > posting a reply. I fear that this will offend some people, but please > know that I value all your efforts, and offense is not my intent. > No offense taken. [...] > THE PROBLEM > > We are not following the open-source development model. Rather, we > pay lip service to it. Open source's development mantra is "release > early, release often". This means release to the public, for use, a > package that has core capability and reasonably-defined interfaces. > Release it in a way that as many people as possible will get it, > install it, use it for real work, and contribute to it. Make the main > focus of the core development team the evaluation and inclusion of > contributions from others. Develop a common vision for the program, > and use that vision to make decisions and keep efforts focused. > Include contributing developers in decision making, but do make > decisions and move on from them. > > Instead, there are no packages for general distribution. The basic > interfaces are unstable, and not even being publicly debated to decide > among them (save for the past 3 days). The core developers seem to > spend most of their time developing, mostly out of view of the > potential user base. I am asked probably twice a week by different > fellow astronomers when an open-source replacement for IDL will be > available. They are mostly unaware that this effort even exists. > However, this indicates that there are at least hundreds of potential > contributors of application code in astronomy alone, as I don't nearly > know everyone. The current efforts look rather more like the GNU > project than Linux. I'm sorry if that hurts, but it is true. > I'd both agree with this and disagree. Agree in the sense that many agree these are desirable traits of an open source project. Disagree in the sense that many projects don't exhibit all of these traits, and yet may be useful to some degree. Even Python is not released often, nor is it generally packaged by the core group. You will find packaging by special interest groups that may or may not be up to date for various platforms. There is a whole spectrum of other, useful open source projects that don't satisfy these requirements. I don't mean that in a defensive way; it's certainly fair to ask what is going wrong in the Python numeric world, but doing the above alone doesn't necessarily guarantee that you will be successful in attracting feedback and contributions; there are other factors as well that influence how a project develops. We have had experience with the packaging issue for PyRAF, and it isn't quite so simple: the binary package approach didn't always make life simpler for the user (arguably, we have found the source distribution approach more trouble-free than our original release).
Having one's own version of python packaged as a binary raises issues with LD_LIBRARY_PATH that there are just no good solutions to. > I know that Perry's group at STScI and the fine folks at Enthought > will say they have to work on what they are being paid to work on. > Both groups should consider the long term cost, in dollars, of > spending those development dollars 100% on coding, rather than 50% on > coding and 50% on outreach and intake. Linus himself has written only > a small fraction of the Linux kernel, and almost none of the > applications, yet in much less than 7 years Linux became a viable > operating system, something much bigger than what we are attempting > here. He couldn't have done that himself, for any amount of money. > We all know this. > I'd say we have tried our best to solicit input (and accept contributed code as well). You have to remember that how easily contributions come depends on what the critical mass is for usefulness. For something like numarray or Numeric, that critical mass is quite large. Few are interested in contributing when it can do very little and an older package exists that can do more. By the time it has comparable functionality, it is already quite large. A lot of projects like that start with a small group before more join in. There are others where the critical mass is low and many join in when functionality is still relatively low. > THE PATH > > Here is what I suggest: > > 1. We should identify the remaining open interface questions. Not, > "why is numeric faster than numarray", but "what should the syntax > of creating an array be, and of doing different basic operations". > If numeric and numarray are in agreement on these issues, then we > can move on, and debate performance and features later. > Well, there are, and continue to be, those who can't come to an agreement on even the interface. These issues have been raised many times in the past. Often consensus was hard to achieve. We tended to lean towards backward compatibility unless the change seemed really necessary. For type coercion and error handling, we thought it was. But I don't think we have tried to shield the decision making process from the community. I do think the difficulty in achieving a sense of consensus is a problem. Perhaps we are going about the process in the wrong way; I'd welcome suggestions as to how to improve that. > 2. We should identify what we need out of the core plotting > capability. Again, not "chaco vs. pyxis", but the list of > requirements (as an astronomer, I very much like Perry's list). > > 3. We should collect or implement a very minimal version of the > featureset, and document it well enough that others like us can do > simple but real tasks to try it out, without reading source code. > That documentation should include lists of things that still need > to be done. > > 4. We should release a stand-alone version of the whole thing in the > formats most likely to be installed by users on the four most > popular OSs: Linux, Windows, Mac, and Solaris. For Linux, this > means .rpm and .deb files for Fedora Core 1 and Debian 3.0r2. > Tarballs and CVS checkouts are right out. We have seen that nobody > in the real world installs them. To be most portable and robust, > it would make sense to include the Python interpreter, named such > that it does not stomp on versions of Python in the released > operating systems. Static linking likewise solves a host of > problems and greatly reduces the number of package variants we will > have to maintain.
Static linking also introduces other problems. And we have gone this route in the past so we have some knowledge of what it entails. > 5. We should advertize and advocate the result at conferences and > elsewhere, being sure to label it what it is: a first-cut effort > designed to do a few things well and serve as a platform for > building on. We should also solicit and encourage people either to > work on the included TODO lists or to contribute applications. One > item on the TODO list should be code converters from IDL and Matlab > to Python, and compatibility libraries. > > 6. We should then all continue to participate in the discussions and > development efforts that appeal to us. We should keep in mind that > evaluating and incorporating code that comes in is in the long run > much more efficient than writing the universe ourselves. > > 7. We should cut and package new releases frequently, at least once > every six months. It is better to delay a wanted feature by one > release than to hold a release for a wanted feature. The mountain > is climbed in small steps. > > The open source model is successful because it follows closely > something that has worked for a long time: the scientific method, with > its community contributions, peer review, open discussion, and > progress mainly in small steps. Once basic capability is out there, > we can twiddle with how to improve things behind the scenes. > In general, I can't disagree much with most of these. I'm happy for others to smack us when we are going away from this sort of process. Please do; it would be the only way we (and others) would learn how to really do it. But we have released fairly frequently, if not with rpms. We do provide pretty good support as well. We have incorporated most of the code sent to us, and considered and implemented many feature requests or performance issues. But the numarray core is not something one would casually change without spending some time understanding how it works; I suspect that is the biggest inhibitor to changes to the core. We are happy to work with others on it if they have the time to do so. If anyone feels we have discouraged people from contributing, please let me know (privately if you wish). > IS SCIPY THE WAY? > > The recipe above sounds a lot like SciPy. SciPy began as a way to > integrate the necessary add-ons to numeric for real work. It was > supposed to test, document, and distribute everything together. I am > aware that there are people who use it, but the numbers are small and > they seem to be tightly connected to Enthought for support and > application development. Enthought's focus seems to be on servicing > its paying customers rather than on moving SciPy development along, > and I fear they are building an installed customer base on interfaces > that were not intended to be stable. > I don't feel this is fair to Enthought. It is not my impression that they have made any money off of the scipy distribution directly (Chaco is a different issue). As far as I can tell, the only benefit they've generally gotten from it is from the visibility of sponsoring it, and perhaps from their own use of a few of the tools they have included as part of it. I doubt that their own clients have driven its development in any significant way. I'd guess they have sunk far more money into scipy than gotten out of it. I don't want others to get the impression that it is the other way around.
In fact, on a number of occasions I have heard users complain about the documentation and the standard response is "please help us improve it" with very little in response. They have gone the extra mile in soliciting contributions and help maintaining it. Perhaps it is part of my open source blind spot, but I have trouble seeing what else they could be doing to encourage others to contribute to scipy (besides paying them; which they have done as well!). The only thing I can think of is that because they are doing it, others feel that they don't. Perhaps there is a similar issue with numarray. I don't know. > So, I will raise the question: is SciPy the way? Rather than forking > the plotting and numerical efforts from what SciPy is doing, should we > not be creating a new effort to do what SciPy has so far not > delivered? These are not rhetorical or leading questions. I don't > know enough about the motivations, intentions, and resources of the > folks at Enthought (and elsewhere) to know the answer. I do think > that such a fork will occur unless SciPy's approach changes > substantially. The way to decide is for us all to discuss the > question openly on these lists, and for those willing to participate > and contribute effort to declare so openly. I think all that is > needed, either to help SciPy or replace it, is some leadership in the > direction outlined above. I would be interested in hearing, perhaps > from the folks at Enthought, alternative points of view. Why are > there no packages for popular OSs for SciPy 0.2? Why are releases so > infrequent? If the folks running the show at scipy.org disagree with > many others on these lists, then perhaps those others would like to > roll their own. Or, perhaps stable/testing/unstable releases of the > whole package are in order. > I think the answer is simple. Supporting distributions of the software they have pulled into scipy is a hell of a lot of work; work that nobody is paying them for. It gives me the shivers to think of our taking on all they have for scipy. > HOW TO CONTRIBUTE? > > Judging by the number of PhDs in sigs, there are a lot of researchers > on this list. I'm one, and I know that our time for doing core > development or providing the aforementioned leadership is very > limited, if not zero. Later we will be in a much better position to > contribute application software. However, there is a way we can > contribute to the core effort even if we are not paid, and that is to > put budget items in grant and project proposals to support the work of > others. Those others could be either our own employees or > subcontractors at places like Enthought or STScI. A handful of > contributors would be all we'd need to support someone to produce OS > packages and tutorial documentation (the stuff core developers find > boring) for two releases a year. > By all means, if there is a groundswell of support for development, please let us know. Perry Greenfield From tim.hochberg at ieee.org Wed Jan 21 15:24:00 2004 From: tim.hochberg at ieee.org (Tim Hochberg) Date: Wed Jan 21 15:24:00 2004 Subject: [Numpy-discussion] Status of Numeric In-Reply-To: References: Message-ID: <400F09C3.9020708@ieee.org> Arthur wrote: [SNIP] > Which, to me, seems like a worthy goal. > > On the other hand, it would seem that the goal of something to move > into the core would be performance optimized at the range of array > size most commonly encountered. 
Rather than for the extraordinary, > which seems to be the goal of numarray, responding to specific needs > of the numarray development team's applications. I'm not sure where you came up with this, but it's wrong on at least two counts. The first is that, last I heard, the crossover point where Numarray becomes faster than Numeric is about 2000 elements. It would be nice if that becomes smaller, but I certainly wouldn't call it extreme. In fact I'd venture that the majority of cases where numeric operations are a bottleneck would already be faster under Numarray. In my experience, while it's not uncommon to use short arrays, it is rare for them to be a bottleneck. The second point is that the relative speediness of Numeric at low array sizes results from nearly all of it being implemented in C, whereas much of Numarray is implemented in Python. This results in a larger overhead for Numarray, which is why it's slower for small arrays. As I understand it, the decision to base most of Numarray in Python was driven by maintainability; it wasn't an attempt to optimize large arrays at the expense of small ones. > Has the core Python development team given out clues about their > feelings/requirements for a move of either Numeric or numarray into > the core? I believe that one major requirement was that the numeric community come to a consensus on an array package and be willing to support it in the core. There may be other stuff. > It concerns me that this thread isn't trafficked. I suspect that most of the exchange has taken place on numpy-discussion at lists.sourceforge.net. [SNIP] -tim From rkern at ucsd.edu Wed Jan 21 15:42:01 2004 From: rkern at ucsd.edu (Robert Kern) Date: Wed Jan 21 15:42:01 2004 Subject: [Numpy-discussion] Status of Numeric In-Reply-To: <400F09C3.9020708@ieee.org> References: <400F09C3.9020708@ieee.org> Message-ID: <20040121234108.GA4602@taliesen.ucsd.edu> On Wed, Jan 21, 2004 at 04:22:43PM -0700, Tim Hochberg wrote: [snip] > The second point is that the relative speediness of Numeric at low array > sizes results from nearly all of it being implemented in C, whereas > much of Numarray is implemented in Python. This results in a larger > overhead for Numarray, which is why it's slower for small arrays. As I > understand it, the decision to base most of Numarray in Python was > driven by maintainability; it wasn't an attempt to optimize large arrays > at the expense of small ones. Has the numarray team (or anyone else for that matter) looked at using Pyrex[1] to implement any part of numarray? If not, then that's my next free-time experiment (i.e. avoiding homework while still looking productive at the office). [1] http://www.cosc.canterbury.ac.nz/~greg/python/Pyrex/ -- Robert Kern rkern at ucsd.edu "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From oliphant at ee.byu.edu Wed Jan 21 16:00:01 2004 From: oliphant at ee.byu.edu (Travis E. Oliphant) Date: Wed Jan 21 16:00:01 2004 Subject: [Numpy-discussion] Comments on the Numarray/Numeric discussion In-Reply-To: <6DBBCCFD-4C58-11D8-8519-000A95AB5F10@cnrs-orleans.fr> References: <200401211844.i0LIikEA004114@oobleck.astro.cornell.edu> <6DBBCCFD-4C58-11D8-8519-000A95AB5F10@cnrs-orleans.fr> Message-ID: <400F1289.2040403@ee.byu.edu> I would like to thank the contributors to the discussion as I think one of the problems we have had lately is that people haven't been talking much.
Partly because we have some fundamental differences of opinion caused by different goals and partly because we are all busy working on a variety of other pressing projects. The impression has been that Numarray will replace Numeric. I agree with Perry that this has always been less of a consensus and more of a hope. I am more than happy for Numarray to replace Numeric as long as it doesn't mean all my code slows down. I would say the threshold is that my code can't slow down by more than 10%. If there is a code-base out there (Numeric) that can allow my code to run 10% faster it will get used. I also don't think it's ideal to have multiple N-D arrays running around out there, but if they all have the same interface then it doesn't really matter. The two major problems I see with Numarray replacing Numeric are: 1) How is UFunc support? Can you create ufuncs in C easily (with a single function call or something similar)? 2) Speed for small arrays (array creation is the big one). It is actually quite a common thing to have a loop during which many small arrays get created and destroyed. Yes, you can usually make such code faster by "vectorizing" (if you can figure out how). But the average scientist just wants to (and should be able to) just write a loop. Regarding speed issues: actually, there are situations where I am very unsatisfied with Numeric's speed performance, and so the goal for Numarray should not be to achieve some percentage of Numeric's performance but to beat it. Frankly, I don't see how you can get the speed that I'm talking about by carrying around a lot of extras like byte-swapping support, memory-mapping support, record-array support. *Question*: Is there some way to turn on a flag in Numarray so that all of the extra stuff is ignored (i.e. create a small array that looks on a binary level just like a Numeric array)? It would seem to me that this is the only way that the speed issue will go away. Given that 1) Numeric already works and all of my code depends on it, 2) Numarray doesn't seem to have support for general purpose ufunctions (can the scipy.special package be ported to numarray?), 3) Numarray is slower for the common tasks I end up using SciPy for, and 4) I actually understand the Numeric code base quite well, I have a hard time justifying switching over to Numarray. Thanks again for the comments. -Travis O.
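To make Travis's point 2 concrete, here is a minimal, hypothetical sketch of the two styles he contrasts. It is written for Python 2.x with Numeric; the function and data names are invented for illustration, and nothing below is taken from SciPy or numarray. The same contrast applies under either array package; only the size of the per-call overhead differs.

    import Numeric

    # Loop style: one small temporary array is created per iteration,
    # so per-call array-creation overhead dominates the run time.
    def row_means_loop(points):
        # 'points' is a list of (x, y, z) triples
        result = []
        for p in points:
            v = Numeric.array(p, Numeric.Float64)   # small array created here
            result.append(Numeric.sum(v) / 3.0)
        return result

    # Vectorized style: one large array and one pass, with almost no
    # per-element Python or array-creation overhead.
    def row_means_vectorized(points):
        a = Numeric.array(points, Numeric.Float64)  # shape (N, 3)
        return Numeric.sum(a, 1) / 3.0              # sum along axis 1

Whether a rewrite like this is practical is exactly the point being argued: the vectorized form is faster under either package, but the loop form is what many scientists will naturally write, which is why small-array creation overhead keeps coming up in this thread.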
From paul at prescod.net Wed Jan 21 20:22:03 2004 From: paul at prescod.net (Paul Prescod) Date: Wed Jan 21 20:22:03 2004 Subject: [Numpy-discussion] Status of Numeric In-Reply-To: <400F09C3.9020708@ieee.org> References: <400F09C3.9020708@ieee.org> Message-ID: <400F4E5E.9020704@prescod.net> Tim Hochberg wrote: >... > > The second point is that the relative speediness of Numeric at low array > sizes results from nearly all of it being implemented in C, whereas > much of Numarray is implemented in Python. This results in a larger > overhead for Numarray, which is why it's slower for small arrays. As I > understand it, the decision to base most of Numarray in Python was > driven by maintainability; it wasn't an attempt to optimize large arrays > at the expense of small ones. What about Pyrex? If you code Pyrex as if it were exactly Python you won't get much optimization. But if you code it as if it were 90% as maintainable as Python you can often get 90% of the speed of C, which is pretty damn close to having all of the best of both worlds. If you point me to a few key functions in Numarray I could try to recode them in Pyrex and do some benchmarking for you (only if Pyrex is a serious option of course!). Paul Prescod From eric at enthought.com Thu Jan 22 00:05:01 2004 From: eric at enthought.com (eric jones) Date: Thu Jan 22 00:05:01 2004 Subject: [Numpy-discussion] the direction and pace of development In-Reply-To: <200401211844.i0LIikEA004114@oobleck.astro.cornell.edu> References: <200401211844.i0LIikEA004114@oobleck.astro.cornell.edu> Message-ID: <400F8400.4030208@enthought.com> Good thing Duke is beating Maryland as I read, otherwise, mail like this can make you grumpy. :-) Joe Harrington wrote: >This is a necessarily long post about the path to an open-source >replacement for IDL and Matlab. While I have tried to be fair to >those who have contributed much more than I have, I have also tried to >be direct about what I see as some fairly fundamental problems in the >way we're going about this. I've given it some section titles so you >can navigate, but I hope that you will read the whole thing before >posting a reply. I fear that this will offend some people, but please >know that I value all your efforts, and offense is not my intent. > > > >THE PAST VS. NOW > >While there is significant and dedicated effort going into >numeric/numarray/scipy, it's becoming clear that we are not >progressing quickly toward a replacement for IDL and Matlab. I have >great respect for all those contributing to the code base, but I think >the present discussion indicates some deep problems.
If we don't >identify those problems (easy) and solve them (harder, but not >impossible), we will continue not to have the solution so many people >want. To be convinced that we are doing something wrong at a >fundamental level, consider that Python was the clear choice for a >replacement in 1996, when Paul Barrett and I ran a BoF at ADASS VI on >interactive data analysis environments. That was over 7 years ago. > > > The effort has fallen short of the mark you set. I also wish the community was more efficient at pursuing this goal. There are fundamental issues. (1) The effort required is large. (2) Free time is in short supply. (3) Financial support is difficult to come by for library development. Other potential problems would be a lack of interest and a lack of competence. I do not think many of us suffer from the first. As for competence, the development team beyond the walls of Enthought self-selects in open source projects, so we're stuck with what we've got. I know most of the people and happen to think they are a talented bunch, so I'll consider us no worse than the average group of PhDs (some consider that a pretty low bar ...). I believe the tasks that go undone (multi-platform support, bi-yearly releases, documentation, etc.) are more due to (2) and (3) above than to some other deep (or shallow) issue. I guess another possibility is organization. This can be improved upon. Thanks to the gracious help of Cal Tech (CACR) and NCBR, the community has gathered at a low-cost SciPy workshop at Cal Tech the last couple of years. I believe this is a positive step. Adding this to the newsgroups and mailing lists provides us with a solid framework within which to operate. I still have confidence that we will reach the IDL/Matlab replacement point. We don't have the resources that those products have behind them. We do have a superior language, but without a lot of sweat and toiling at hours of grunt work, we don't stand a chance. As for Enthought's efforts, our success in building applications (scientific and otherwise) has diverted our developers (myself included) away from SciPy as the primary focus. We do continue to develop it and provide significant (for us) financial support to maintain it. I am lucky enough to work with a fine set of software engineers, and I am itching for us to get more time devoted to SciPy. I do believe that we will get the opportunity in the future -- it is just a matter of time. Call me an optimist. >replace IDL or Matlab", the answer was clearly "stable interfaces to >basic numerics and plotting; then we can build it from there following >the open-source model". Work on both these problems was already well >underway then. Now, both the numerical and plotting development >efforts have branched. There is still no stable base upon which to >build. There aren't even packages for popular OSs that people can >install and play with. The problem is not that we don't know how to >do numerics or graphics; if anything, we know these things too well. >In 1996, if anyone had told us that in 2004 there would be no >ready-to-go replacement system because of a factor of 4 in small array >creation overhead (on computers that ran 100x as fast as those then >available) or the lack of interactive editing of plots at video >speeds, the response would not have been pretty. How would you have >felt? > >THE PROBLEM > >We are not following the open-source development model. Rather, we >pay lip service to it. Open source's development mantra is "release >early, release often".
This means release to the public, for use, a >package that has core capability and reasonably-defined interfaces. > > >Release it in a way that as many people as possible will get it, >install it, use it for real work, and contribute to it. Make the main >focus of the core development team the evaluation and inclusion of >contributions from others. Develop a common vision for the program, >and use that vision to make decisions and keep efforts focused. >Include contributing developers in decision making, but do make >decisions and move on from them. > >Instead, there are no packages for general distribution. The basic >interfaces are unstable, and not even being publicly debated to decide >among them (save for the past 3 days). The core developers seem to >spend most of their time developing, mostly out of view of the >potential user base. I am asked probably twice a week by different >fellow astronomers when an open-source replacement for IDL will be >available. They are mostly unaware that this effort even exists. >However, this indicates that there are at least hundreds of potential >contributors of application code in astronomy alone, as I don't nearly >know everyone. The current efforts look rather more like the GNU >project than Linux. I'm sorry if that hurts, but it is true. > > > Speaking from the standpoint of SciPy, all I can say is we've tried to do what you outline here. Releasing the huge load of Fortran/C/C++/Python code across multiple platforms is difficult and takes many hours. I would venture that 90% of the effort on SciPy is with the build system. This means that the exact part of the process that you are discussing is the majority of the effort. We keep a version for Windows up to date because that is what our current clients use. In all the other categories, we do the best we can and ask others to fill the gaps. It is also worth saying that SciPy works quite well for most purposes once built -- we and others use it daily on commercial projects. >I know that Perry's group at STScI and the fine folks at Enthought >will say they have to work on what they are being paid to work on. >Both groups should consider the long term cost, in dollars, of >spending those development dollars 100% on coding, rather than 50% on >coding and 50% on outreach and intake. Linus himself has written only >a small fraction of the Linux kernel, and almost none of the >applications, yet in much less than 7 years Linux became a viable >operating system, something much bigger than what we are attempting >here. He couldn't have done that himself, for any amount of money. >We all know this. > > Elaborate on the outreach idea for me. We (spend money to) provide funding to core developers outside of our company (Travis and Pearu), we (spend money to) give talks at many conferences a year, we (spend a little money to) co-sponsor a 70-person workshop on scientific computing every year, we have an open mailing list, we release most of the general software that we write, and in the past I practically begged people to accept CVS write access when they provided a patch to SciPy. We even spent a lot of time early on trying to set up the scipy.org site as a collaborative Zope-based environment -- an effort that was largely a failure. Still, we have a functioning, largely static site, the mailing list, and CVS. As far as tools go, that should be sufficient. It is impossible to argue with the results though.
Linus pulled off the OS model, and Enthought and the SciPy community, thus far, have been less successful. If there are suggestions beyond "spend more *time* answering email," I am all ears. Time is the most precious commodity of all these days. Also, SciPy has only been around for 3+ years, so I guess we still have some rope left. I continue to believe it'll happen -- this seems like the perfect project for open source contributions. >THE PATH > >Here is what I suggest: > >1. We should identify the remaining open interface questions. Not, > "why is numeric faster than numarray", but "what should the syntax > of creating an array be, and of doing different basic operations". > If numeric and numarray are in agreement on these issues, then we > can move on, and debate performance and features later. > > ?? I don't get this one. This interface (at least for numarray) is largely decided. We have argued the points, and Perry et al. at STScI made the decisions. I didn't like some of them, and I'm sure everyone else had at least one thing they wished was changed, but that is the way this open stuff works. It is not the interface but the implementation that started this furor. Travis O.'s suggestion was to back port (much of) the numarray interface to the Numeric code base so that those stuck supporting large code bases (like SciPy) and needing fast small arrays could benefit from the interface enhancements. One or two of them had backward compatibility issues with Numeric, so he asked how it should be handled. Unless some magic porting fairy shows up, SciPy will be a Numeric-only tool for the next year or so. This means that users of SciPy either have to forgo some of these features or back port. On speed: Numeric is already too slow -- in a recent project we've had to recode a number of routines in C that I don't think we should have had to. For us, the goal is not to approach Numeric's speed but to significantly beat it for all array sizes. That has to be a possibility for any replacement. Otherwise, our needs (with the exception of a few features) are already better met by Numeric. I have some worries about all of the endianness and memory-mapped support that are built into Numarray imposing too much overhead for speed-ups on small arrays to be possible (this echoes Travis O's thoughts -- we will happily be proven wrong). None of our current work needs these features, and paying a price for them is hard to do with an alternative already there. It is fairly easy to improve its performance on mathematical operations by just changing the way the ufunc operations are coded. With some reasonably simple changes, Numeric should be comparable (or at least closer) to Numarray speed for large arrays. Numeric also has a large number of other optimizations that can be made (memory is zeroed twice in zeros(), asarray was recently improved significantly for the typical case, etc.). Making these changes would help our selling of Python and, since we have at least a year's worth of applications that will be on the SciPy/Numeric platform, it will also help the quality of these applications. Oh yeah, I have also been surprised at how much of our code uses alltrue(), take(), isnan(), etc. The speed of these array manipulation methods is really important for us. >2. We should identify what we need out of the core plotting > capability. Again, not "chaco vs. pyxis", but the list of > requirements (as an astronomer, I very much like Perry's list). > > Yep, we obviously missed on this one.
Chaco (and the related libraries) is extremely advanced in some areas but lags in ease-of-use. It is primarily written by a talented and experienced computer scientist (Dave Morrill) who likely does not have the perspective of an astronomer. It is clear that areas of the library need to be re-examined, simplified, and improved. Unfortunately, there is not time for us to do that right now, and the internals have proven too complex for others to contribute to in a meaningful way. I do not know when this will be addressed. The sad thing here is that STScI won't be using it. That pains me to no end, and Perry and I have tried to figure out some way to make it work for them. But, it sounds like, at least in the short term, there will be two new additions to the plotting stable. We will work hard though to make the future Chaco solve STScI's problems (and everyone else's) better than it currently does. By the way, there is a lot of Chaco bashing going on. It is worth saying that we use Chaco every day in commercial applications that require complex graphics and heavy interactivity with great success. But, we also have mixed teams of scientists and computer scientists along with the "U Manual" (If I have a question, I ask you -- being Dave) to answer any questions. I continue to believe Chaco's Traits-based approach is the only one currently out there that has the chance of improving on Matlab and other plotting packages available. And, while SciPy is moving slowly, Chaco is moving at a frantic development pace and gets new capabilities daily (which is part of the complaints about it). I feel certain in saying that it has more resources tied to its development than the other plotting option out there -- it is just currently being exercised in GUI environments instead of as a day-to-day plotting tool. My advice is dig in, learn traits, and learn Chaco. >3. We should collect or implement a very minimal version of the > featureset, and document it well enough that others like us can do > simple but real tasks to try it out, without reading source code. > That documentation should include lists of things that still need > to be done. > > >4. We should release a stand-alone version of the whole thing in the > formats most likely to be installed by users on the four most > popular OSs: Linux, Windows, Mac, and Solaris. For Linux, this > means .rpm and .deb files for Fedora Core 1 and Debian 3.0r2. > Tarballs and CVS checkouts are right out. We have seen that nobody > in the real world installs them. To be most portable and robust, > it would make sense to include the Python interpreter, named such > that it does not stomp on versions of Python in the released > operating systems. Static linking likewise solves a host of > problems and greatly reduces the number of package variants we will > have to maintain. > >5. We should advertize and advocate the result at conferences and > elsewhere, being sure to label it what it is: a first-cut effort > designed to do a few things well and serve as a platform for > building on. We should also solicit and encourage people either to > work on the included TODO lists or to contribute applications. One > item on the TODO list should be code converters from IDL and Matlab > to Python, and compatibility libraries. > >6. We should then all continue to participate in the discussions and > development efforts that appeal to us.
We should keep in mind that > evaluating and incorporating code that comes in is in the long run > much more efficient than writing the universe ourselves. > >7. We should cut and package new releases frequently, at least once > every six months. It is better to delay a wanted feature by one > release than to hold a release for a wanted feature. The mountain > is climbed in small steps. > >The open source model is successful because it follows closely >something that has worked for a long time: the scientific method, with >its community contributions, peer review, open discussion, and >progress mainly in small steps. Once basic capability is out there, >we can twiddle with how to improve things behind the scenes. > > > Everything here is great -- it is the implementation part that is hard. I am all for it happening though. >IS SCIPY THE WAY? > >The recipe above sounds a lot like SciPy. SciPy began as a way to >integrate the necessary add-ons to numeric for real work. It was >supposed to test, document, and distribute everything together. I am >aware that there are people who use it, but the numbers are small and >they seem to be tightly connected to Enthought for support and >application development. > Not so. The user base is not huge, but I would conservatively venture to say it is in the hundreds to thousands. We are a company of 12 without a single support contract for SciPy. >Enthought's focus seems to be on servicing >its paying customers rather than on moving SciPy development along, > > Continuing to move SciPy along at the pace we initially were would have ended Enthought -- something had to change. It is surprising how important paying customers are to a company. >and I fear they are building an installed customer base on interfaces >that were not intended to be stable. > > Not sure what you you mean here, but I'm all for stable interfaces. Huge portions of SciPy's interface haven't changed, and I doubt they will change. I do indeed feel, though, that SciPy is still a 0.2 release level, so some of the interfaces can change. It would be irresponsible to say otherwise. This is not "intentionally unstable" though... >So, I will raise the question: is SciPy the way? Rather than forking >the plotting and numerical efforts from what SciPy is doing, should we >not be creating a new effort to do what SciPy has so far not >delivered? These are not rhetorical or leading questions. I don't >know enough about the motivations, intentions, > Man this sounds like an interview (or interaction) question. We'll we're a company, so we do wish to make money -- otherwise, we'll have to do something else. We also care about deeply about science and are passionate about scientific computing. Let see, what else. We have made most of the things we do open source because we do believe in it in principle and as a good development philosophy. And, even though we all wish SciPy was moving faster, SciPy wouldn't be anywhere close to where it is without Travis Oliphant and Pearu Peterson -- neither of whom would have worked on it had it not been openly available. That alone validates the decision to make it open. I'm not sure what we have done to make someone question our "motivations and intentions" (sounds like a date interrogation), but it is hard to think of malicious ones when you are making the fruits of your labors and dollars freely available. >and resources of the > > Well, we have 12 people, and Pearu and Travis O work with us quite a bit also. 
The developers here are very good (if I do say so myself), but unfortunately primarily working on other projects at the moment. Besides scientists/computer scientists have a technical writer and a human-computer-interface specialist on staff. >folks at Enthought (and elsewhere) to know the answer. I do think >that such a fork will occur unless SciPy's approach changes >substantially. > Enthought has more commitments than we used to. SciPy remains important and core to what we do, it just has to share time with other things. Luckily Pearu and Travis have kept there ear to the ground to help out people on the mailing lists as well as working on the codebase. I'm not sure what our approach has been that would force a fork... It isn't like someone has come as asked to be release manager, offered to keep the web pages up to date, provided peer review of code, etc and we have turned them away. Almost from the beginning most effort is provided by a small team (fairly standard for OS stuff). We have repeatedly pointed out areas we need help at the conference and in mail -- code reviews, build help, release help, etc. In fact, I double dare ya to ask to manage the next release or the documentation effort. okay... triple dare ya. Some people have philosophical (like Konrad I believe) differences with how SciPy is packaged and believe it should be 12 smaller packages instead of one large one. This has its own set of problems obviously, but forking based on this kind of principle would make at least a modicum of sense. Forking because you don't like the pace of the project makes zero sense. Pitch in and solve the problem. The social barriers are very small. The code barriers (build, etc.) are what need to be solved. >The way to decide is for us all to discuss the >question openly on these lists, and for those willing to participate >and contribute effort to declare so openly. I think all that is >needed, either to help SciPy or replace it, is some leadership in the >direction outlined above. I would be interested in hearing, perhaps >from the folks at Enthought, alternative points of view. Why are >there no packages for popular OSs for SciPy 0.2? > Please build them, ask for web credentials, and up load them. Then answer the questions people have about them on the mailing list. It is as simple as that. There is no magic here -- just work. >Why are releases so >infrequent? > Ditto. >If the folks running the show at scipy.org disagree with >many others on these lists, then perhaps those others would like to >roll their own. Or, perhaps stable/testing/unstable releases of the >whole package are in order. > >HOW TO CONTRIBUTE? > >Judging by the number of PhDs in sigs, there are a lot of researchers >on this list. I'm one, and I know that our time for doing core >development or providing the aforementioned leadership is very >limited, if not zero. > Surprisingly, commercial developers have about the same amount of free time. > Later we will be in a much better position to >contribute application software. However, there is a way we can >contribute to the core effort even if we are not paid, and that is to >put budget items in grant and project proposals to support the work of >others. > For the academics, supporting a *dedicated* student to maintain SciPy would be much more cost effective use of your dollars. Unfortunately, it is hard to get a PhD for supporting SciPy... For companies, national laboratories, etc. Supporting development on SciPy (or numarray) directly is a great idea. 
Projects that we work on in other areas also indirectly support SciPy, Chaco, etc. so get us involved with the development efforts at your company/lab. Other options? Government (NASA, Military, NIH, etc) and national lab people can get SciPy/numarray/Python related SBIR (http://www.acq.osd.mil/sadbu/sbir/) topics that would impact there research/development put on the solicitation list this summer. Email me if you have any questions on this. ASCI people can propose PathForward projects. There are probably numerous other ways to do this. We will have a GSA schedule soon, so government contracting will also work. >subcontractors at places like Enthought or STScI. A handful of >contributors would be all we'd need to support someone to produce OS >packages and tutorial documentation (the stuff core developers find >boring) for two releases a year. > > Joe, as you say, things haven't gone as fast as any of us would wish, but it hasn't been for lack of trying. Many of us have put zillions of hours into this. The results are actually quite stable tools. Many people use Numeric/Numarray/SciPy in daily work without problems. But, like Linux in the early years, they still require "geeks" willing to do some amount of meddling to use them. Huge resources (developer and financial) have been pumped into Linux to get it to the point its at today. Anything we can do to increase the participation in building tools and financially supporting those who do build tools, I am all for... I'd love to see releases on 10 platforms and full documentation for the libraries as well as the next person. Whew, and Duke managed to hang on and win. my .01 worth, eric >--jh-- > > >------------------------------------------------------- >The SF.Net email is sponsored by EclipseCon 2004 >Premiere Conference on Open Tools Development and Integration >See the breadth of Eclipse activity. February 3-5 in Anaheim, CA. >http://www.eclipsecon.org/osdn >_______________________________________________ >Numpy-discussion mailing list >Numpy-discussion at lists.sourceforge.net >https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > From cjw at sympatico.ca Thu Jan 22 09:56:04 2004 From: cjw at sympatico.ca (Colin J. Williams) Date: Thu Jan 22 09:56:04 2004 Subject: [Numpy-discussion] the direction and pace of development In-Reply-To: <400F8400.4030208@enthought.com> References: <200401211844.i0LIikEA004114@oobleck.astro.cornell.edu> <400F8400.4030208@enthought.com> Message-ID: <40100EAA.8040204@sympatico.ca> As a relative newcomer to this discussion, I would like to respond on a couple of points. eric jones wrote: > Good thing Duke is beating Maryland as I read, otherwise, mail like > this can make you grumpy. :-) > > Joe Harrington wrote: > [snip] >> THE PATH >> >> Here is what I suggest: >> >> 1. We should identify the remaining open interface questions. Not, >> "why is numeric faster than numarray", but "what should the syntax >> of creating an array be, and of doing different basic operations". >> If numeric and numarray are in agreement on these issues, then we >> can move on, and debate performance and features later. >> >> > ?? I don't get this one. This interface (at least for numarray) is > largely decided. We have argued the points, and Perry et. al. at > STSci made the decisions. I didn't like some of them, and I'm sure > everyone else had at least one thing they wished was changed, but that > is the way this open stuff works. 
I have wondered whether the desire to be compatible with Numeric has been an inhibitory factor for numarray. It might be interesting to see the list of decisions which Eric Jones doesn't like. > > It is not the interface but the implementation that started this > furor. Travis O.'s suggestion was to back port (much of) the numarray > interface to the Numeric code base so that those stuck supporting > large co debases (like SciPy) and needing fast small arrays could > benefit from the interface enhancements. One or two of them had > backward compatibility issues with Numeric, so he asked how it should > be handled. Unless some magic porting fairy shows up, SciPy will be a > Numeric only tool for the next year or so. This means that users of > SciPy either have to forgo some of these features or back port. Back porting would appear, to this outsider, to be a regression. Is there no way of changing numarray so that it has the desired speed for small arrays? > > > On speed: > Numeric is already too slow -- we've had to recode a number of > routines in C that I don't think we should have in a recent project. > For us, the goal is not to approach Numeric's speed but to > significantly beat it for all array sizes. That has to be a > possibility for any replacement. Otherwise, our needs (with the > exception of a few features) are already better met by Numeric. I > have some worries about all of the endianness and memory mapped > support that are built into Numarray imposing to much overhead for > speed-ups on small arrays to be possible (this echo's Travis O's > thoughts -- we will happily be proven wrong). None of our current > work needs these features, and paying a price for them is hard to do > with an alternative already there. It is fairly easy to improve its > performance on mathematical by just changing the way the ufunc > operations are coded. With some reasonably simple changes, Numeric > should be comparable (or at least closer) to Numarray speed for large > arrays. Numeric also has a large number of other optimizations that > can be made (memory is zeroed twice in zeros(), asarray was recently > improved significantly for the typical case, etc.). Making these > changes would help our selling of Python and, since we have at least a > years worth of applications that will be on the SciPy/Numeric > platform, it will also help the quality of these applications. > > Oh yeah, I have also been surprised at how much of out code uses > alltrue(), take(), isnan(), etc. The speed of these array > manipulation methods is really important for us. I am surprised that alltrue() performance is a concern, but it should be easy to implement short circuit evaluation so that False responses are, on average, handled more quickly. If Boolean arrays are significant, in terms of the amount of computer time taken, should they be stored as bit arrays? Would there be a pay-off for the added complexity? > > [snip] > >> 3. We should collect or implement a very minimal version of the >> featureset, and document it well enough that others like us can do >> simple but real tasks to try it out, without reading source code. >> That documentation should include lists of things that still need >> to be done. >> > Does numarray not provide the basics? >> [snip >> The open source model is successful because it follows closely >> something that has worked for a long time: the scientific method, with >> its community contributions, peer review, open discussion, and >> progress mainly in small steps. 
Once basic capability is out there, >> we can twiddle with how to improve things behind the scenes. >> >> >> Colin W. From bsder at allcaps.org Thu Jan 22 15:03:01 2004 From: bsder at allcaps.org (Andrew P. Lentvorski, Jr.) Date: Thu Jan 22 15:03:01 2004 Subject: [Numpy-discussion] the direction and pace of development In-Reply-To: <400F8400.4030208@enthought.com> References: <200401211844.i0LIikEA004114@oobleck.astro.cornell.edu> <400F8400.4030208@enthought.com> Message-ID: <20040122142856.V3184@mail.allcaps.org> On Thu, 22 Jan 2004, eric jones wrote: > The effort has fallen short of the mark you set. I also wish the > community was more efficient at pursuing this goal. There are > fundamental issues. (1) The effort required is large. (2) Free time is > in short supply. (3) Financial support is difficult to come by for > library development. (4) There is no itch to scratch Matlab is somewhere about $20,000 (base+a couple of toolboxes) per year for corporations, and something like $500 (or less) for registered students. All of the signal processing packages and stuff are all written for Matlab. The time cost of learning a new tool (Python + SciPy + Numeric/numarray) far exceeds the base prices for the average company or person. However, some companies have to deliver an end product with Matlab embedded. This is *extremely* undesirable; consequently, they are likely to create add-ons and extend the Python interface. However, the progress will likely be slow. > Speaking from the standpoint of SciPy, all I can say is we've tried to > do what you outline here. The effort of releasing the huge load of > Fortran/C/C++/Python code across multiple platforms is difficult and > takes many hours. And since SciPy is mostly Windows, the users expect that one click installs the universe. Good for customer experience. Bad for maintainability which would really like to have independently maintained packages with hard API's surrounding them.. > On speed: > Numeric is already too slow -- we've had to recode a number of routines > in C that I don't think we should have in a recent project. Then the idea of optimizing numarray is DOA. The best you are going to get is a constant factor speedup in return for vastly complicating maintainability. That's not a good tradeoff for a multi-year open-source project. > Oh yeah, I have also been surprised at how much of out code uses > alltrue(), take(), isnan(), etc. The speed of these array manipulation > methods is really important for us. That seems ... odd. Scanning an array rather than handling a NaN trap seems like an awful tradeoff (ie. an O(n) operation repeated every time rather than an O(1) operation activated only on NaN generation--a rare occurrence normally). > -- code reviews, build help, release help, etc. In fact, I double dare > ya to ask to manage the next release or the documentation effort. > okay... triple dare ya. Shades of, "Take my wife ... please!" ;) -a From oliphant at ee.byu.edu Thu Jan 22 15:52:09 2004 From: oliphant at ee.byu.edu (Travis E. Oliphant) Date: Thu Jan 22 15:52:09 2004 Subject: [Numpy-discussion] the direction and pace of development In-Reply-To: <20040122142856.V3184@mail.allcaps.org> References: <200401211844.i0LIikEA004114@oobleck.astro.cornell.edu> <400F8400.4030208@enthought.com> <20040122142856.V3184@mail.allcaps.org> Message-ID: <4010622E.7040605@ee.byu.edu> Andrew P. Lentvorski, Jr. 
wrote: > On Thu, 22 Jan 2004, eric jones wrote: > > >>Speaking from the standpoint of SciPy, all I can say is we've tried to >>do what you outline here. The effort of releasing the huge load of >>Fortran/C/C++/Python code across multiple platforms is difficult and >>takes many hours. > > > And since SciPy is mostly Windows, the users expect that one click > installs the universe. Good for customer experience. Bad for > maintainability which would really like to have independently maintained > packages with hard API's surrounding them.. > What in the world does this mean? SciPy is "mostly Windows" Yes, there is a only a binary installer for windows available currently. But, how does that make this statement true. For me SciPy has always been used almost exclusively on Linux. In fact, the best plotting support for SciPy (in my mind) is xplt (pygist-based) and it works best on Linux. -Travis From perry at stsci.edu Thu Jan 22 18:50:02 2004 From: perry at stsci.edu (Perry Greenfield) Date: Thu Jan 22 18:50:02 2004 Subject: [Numpy-discussion] Status of Numeric In-Reply-To: <20040121234108.GA4602@taliesen.ucsd.edu> Message-ID: Robert Kern writes: > [snip] > > Tim Hochberg writes: > > The second point is the relative speediness of Numeric at low array > > sizes is the result that nearly all of it is implemented in C, whereas > > much of Numarray is implemented in Python. This results in a larger > > overhead for Numarray, which is why it's slower for small arrays. As I > > understand it, the decision to base most of Numarray in Python was > > driven by maintainability; it wasn't an attempt to optimize > large arrays > > at the expense of small ones. > > Has the numarray team (or anyone else for that matter) looked at using > Pyrex[1] to implement any part of numarray? If not, then that's my next > free-time experiment (i.e. avoiding homework while still looking > productive at the office). > We had looked at it at least a couple of times. I don't remember now all the conclusions, but I think one of the problems was that it wasn't as useful when one had to deal with data types not used in python itself (e.g., unsigned int16). I might be wrong about that. Numarray generates a lot of c code directly for the actual array computations. That is neither the slow part, nor the hard part to write. It is the array computation setup that is complicated. Much of that is now in C (and we do worry that it has greatly added to the complexity). Perhaps that part could be better handled by pyrex. I think some of the remaining overhead has to do with intrinsic python calls, and the differences between the simpler type used for Numeric versus the new style classes used for numarray. Don't hold me to that however. Perry From perry at stsci.edu Thu Jan 22 19:08:59 2004 From: perry at stsci.edu (Perry Greenfield) Date: Thu Jan 22 19:08:59 2004 Subject: [Numpy-discussion] Comments on the Numarray/Numeric disscussion In-Reply-To: <400F1289.2040403@ee.byu.edu> Message-ID: Travis Oliphant writes: > The two major problems I see with Numarray replacing Numeric are > > 1) How is UFunc support? Can you create ufuncs in C easily (with a > single function call or something similar). > Different, but I don't think it is difficult to add ufuncs (and probably easier if many types must be supported, though I doubt that is much of an issue for most mathematical functions which generally are only needed for the float types and perhaps complex). > 2) Speed for small arrays (array creation is the big one). 
> This is the much harder issue. I do wonder if it is possible to make numarray any faster than Numeric on this point (or as other later mention, whether the complexity that it introduces is worth it. > It is actually quite a common thing to have a loop during which many > small arrays get created and destroyed. Yes, you can usually make such > code faster by "vectorizing" (if you can figure out how). But the > average scientist just wants to (and should be able to) just write a loop. > I'll pick a small bone here. Well, yes, and I could say that a scientist should be able to write loops that iterate over all array elements and expect that they run as fast. But they can't. After all, using an array language within an interpreted language implies that users must cast their problems into array manipulations for it to work efficiently. By using Numeric or numarray they *must* buy into vectorizing at some level. Having said that, it certainly is true that there are problems with small arrays that cannot be easily vectorized by combining into higher dimension arrays (I think the two most common cases are with variable-sized small arrays or where there are iterative algorithms on small arrays that must be iterated many times (though some of these problems can be cast into larger vectors, but often not really easily). > Regarding speed issues. Actually, there are situations where I am very > unsatisfied with Numeric's speed performance and so the goal for > Numarray should not be to achieve some percentage of Numeric's > performance but to beat it. > > Frankly, I don't see how you can get speed that I'm talking about by > carrying around a lot of extras like byte-swapping support, > memory-mapping support, record-array support. > You may be right. But then I would argue that if one want to speed up small array performance, one should really go for big improvements. To do that suggests taking a signifcantly different approach than either Numeric or numarray. But that's a different topic ;-) To me, factors of a few are not necessarily worth the trouble (and I wonder how much of the phase space of problems they really help move into feasibility). Yes, if you've written a bunch of programs that use small arrays that are marginally fast enough, then a factor of two slower is painful. But there are many other small array problems that were too slow already that couldn't be done anyway. The ones that weren't marginal will likely still be acceptable. Those that live in the grey zone now are the ones that are most sensitive to the issue. All the rest don't care. I don't have a good feel for how many live in the grey zone. I know some do. Perry Greenfield From perry at stsci.edu Thu Jan 22 19:25:01 2004 From: perry at stsci.edu (Perry Greenfield) Date: Thu Jan 22 19:25:01 2004 Subject: [Numpy-discussion] the direction and pace of development In-Reply-To: <40100EAA.8040204@sympatico.ca> Message-ID: Colin J. Williams writes: > > I have wondered whether the desire to be compatible with Numeric has > been an inhibitory factor for numarray. It might be interesting to see > the list of decisions which Eric Jones doesn't like. > There weren't that many. The ones that I remember (and if Eric has time he can fill in the rest) were: 1) default axis for operations. Some use the last and some use the first depending on context. Eric and Travis wanted to use a consistent rule (I believe last always). 
I believe that scipy wraps Numeric so that it does just that (remember, the behavior in scipy of Numeric is not quite the same as the distributed Numeric (correct me if I'm wrong). 2) allowing complex comparisons. Since Python no longer allows these (and it is reasonable to question whether this was right since complex numbers now can no longer be part of a generic python sort), Many felt that numarray should be consistent with Python. This isn't a big issue since I had argued that those that wanted to do generic comparisons simply needed to cast it as x.real where the .real attribute was available for all types of arrays, thus using that would always work regardless of the type. 3) having single-element indexing return a rank-0 array rather than a python scalar. Numeric is quite inconsistent in this regard now. We decided to have numarray always return python scalars (exceptions may be made if Float128 is supported). The argument for rank-0 arrays was that it would support generic programming so that one didn't need to test for the kind of value for many functions (i.e., scalar or array). But the issue of contention was that Eric argued that len(rank-0) == 1 and that (rank-0)[0] give the value, neither of which is correct according to the strict definition of rank-0. We argued that using rank-1 len-1 arrays were really what was needed for that kind of programming. It turned out that the most common need was for the result of reduction operations, so we provided a version of reduce (areduce) which always returned an array result even if the array was 1-d, (the result would be a length-1 rank-1 array). There are others, but I don't recall immediately. > > > > It is not the interface but the implementation that started this > > furor. Travis O.'s suggestion was to back port (much of) the numarray > > interface to the Numeric code base so that those stuck supporting > > large co debases (like SciPy) and needing fast small arrays could > > benefit from the interface enhancements. One or two of them had > > backward compatibility issues with Numeric, so he asked how it should > > be handled. Unless some magic porting fairy shows up, SciPy will be a > > Numeric only tool for the next year or so. This means that users of > > SciPy either have to forgo some of these features or back port. > > Back porting would appear, to this outsider, to be a regression. Is > there no way of changing numarray so that it has the desired speed for > small arrays? > If it must be faster than Numeric, I do wonder if that is easily done without greatly complicating the code. > > > > > > I am surprised that alltrue() performance is a concern, but it should be > easy to implement short circuit evaluation so that False responses are, > on average, handled more quickly. If Boolean arrays are significant, > in terms of the amount of computer time taken, should they be stored as > bit arrays? Would there be a pay-off for the added complexity? > Making alltrue fast in numarray would not be hard. Just some work writing a special purpose function to short circuit. I doubt very much bit arrays would be much faster. They would also greatly complicate the code base. It is possible to add them, but I've always felt the reason would be to save memory, not increase speed. They haven't been high priority for us. 
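(A rough sketch of the short-circuiting Perry describes, written block-wise against the Numeric-style API of the day rather than as the special-purpose C function he has in mind. The function name and block size are illustrative only; the point is simply to stop as soon as a false element turns up instead of reducing over the whole array.)

import Numeric

def alltrue_shortcircuit(a, blocksize=4096):
    # Examine the array one block at a time and bail out at the first
    # block containing a false element.  A reduce-based alltrue always
    # touches every element, even when the answer is decided early.
    flat = Numeric.ravel(a)
    n = len(flat)
    for start in range(0, n, blocksize):
        if not Numeric.alltrue(flat[start:start + blocksize]):
            return 0
    return 1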
> > Perry Greenfield From oliphant at ee.byu.edu Thu Jan 22 19:33:01 2004 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Thu Jan 22 19:33:01 2004 Subject: [Numpy-discussion] More on transitioning to Numarray Message-ID: <4010958E.5050303@ee.byu.edu> Today, I realized that I needed to restate what my intention in raising the subject to begin with was. First of all, I would like to see everybody transition to Numarray someday. On the other hand, I'm not willing to ignore performance issues just to reach that desireable goal. I would like to recast my proposal into the framework of helping SciPy transition to Numarray. Basically, I don't think Numarray will be ready to fully support SciPy in less than a year (basically because it probably won't happen until some of us SciPy folks do a bit more work with Numarray). To help that along I am proposing making a few changes to the Numeric object that SciPy uses so that the array object SciPy expects starts looking more and more like the Numarray object. We have wanted to do this in SciPy and were simply wondering if it would make sense to change the Numeric object or to grab the Numeric code base into SciPy and make changes there. The feedback from the community has convinced me personally that we should leave Numeric alone and make any changes to something we create inside of SciPy. There is a lot of concern over having multiple implementations of nd arrays due to potential splitting of tools, etc. But, I should think that tools should be coded to an interface (API, methods, data structures) instead of a signle implementation, so that the actual underlying object should not matter greatly. I thought that was the point of modular development and object-orientedness .... Anyone doing coding with numeric arrays already has to distinguish between: Python Imaging Objects, Lists of lists, and other array-like objects. I think it is pretty clear that Numeric won't be changing. Thus, anything we do with the Numeric object will be done from the framework of SciPy. Best regards. Travis O. From paul at prescod.net Thu Jan 22 23:31:01 2004 From: paul at prescod.net (Paul Prescod) Date: Thu Jan 22 23:31:01 2004 Subject: [Numpy-discussion] Status of Numeric In-Reply-To: References: Message-ID: <4010CC24.3040500@prescod.net> Perry Greenfield wrote: > ... > We had looked at it at least a couple of times. I don't remember now > all the conclusions, but I think one of the problems was that > it wasn't as useful when one had to deal with data types not > used in python itself (e.g., unsigned int16). I might be wrong > about that. I would guess that the issue is more whether it is natively handled by Pyrex than whether it is handled by Python. Is there a finite list of these types that Numarray handles? If you have a list I could generate a patch to Pyrex that would support them. We could then ask Greg whether he could add them to Pyrex core or refactor it so that he doesn't have to. > Numarray generates a lot of c code directly for the actual > array computations. That is neither the slow part, nor the > hard part to write. It is the array computation setup that > is complicated. Much of that is now in C (and we do worry > that it has greatly added to the complexity). Perhaps that > part could be better handled by pyrex. It sounds like it. > I think some of the remaining overhead has to do with intrinsic > python calls, and the differences between the simpler type used > for Numeric versus the new style classes used for numarray. 
> Don't hold me to that however. Pyrex may be able to help with at least one of these. Calls between Pyrex-coded functions usually go at C speeds (although method calls may be slower). I don't know enough about the new-style, old-style issue to know about whether Pyrex can help with that but I would guess it might because a Pyrex "extension type" is more like a C extension type than a Python instance object. That implies some faster method lookup and calling. Numeric is the exact type of project Pyrex is designed for. And of course it works seamlessly with pre-existing Python and C code so you can selectively port things. Paul Prescod From bsder at allcaps.org Fri Jan 23 01:06:01 2004 From: bsder at allcaps.org (Andrew P. Lentvorski, Jr.) Date: Fri Jan 23 01:06:01 2004 Subject: [Numpy-discussion] the direction and pace of development In-Reply-To: <4010622E.7040605@ee.byu.edu> References: <200401211844.i0LIikEA004114@oobleck.astro.cornell.edu> <400F8400.4030208@enthought.com> <20040122142856.V3184@mail.allcaps.org> <4010622E.7040605@ee.byu.edu> Message-ID: <20040122160155.T3350@mail.allcaps.org> On Thu, 22 Jan 2004, Travis E. Oliphant wrote: > What in the world does this mean? SciPy is "mostly Windows" Yes, there > is a only a binary installer for windows available currently. But, how > does that make this statement true. > > For me SciPy has always been used almost exclusively on Linux. In fact, > the best plotting support for SciPy (in my mind) is xplt (pygist-based) > and it works best on Linux. I was referring to the installers, but I apparently did a thinko and omitted the reference. My apologies. I did not mean to imply that SciPy runs only on Windows, especially since I run it on FreeBSD. My intent was to comment about Win32 having a "one big lump" installer philosophy vs. the Linux "discrete packages" philosophy and the impact on maintainability of each. ie. the fact that releases suck up so much energy because of the need to integrate large chunks of code outside of SciPy itself. -a From falted at openlc.org Fri Jan 23 01:40:00 2004 From: falted at openlc.org (Francesc Alted) Date: Fri Jan 23 01:40:00 2004 Subject: [Numpy-discussion] Status of Numeric In-Reply-To: <4010CC24.3040500@prescod.net> References: <4010CC24.3040500@prescod.net> Message-ID: <200401231038.58199.falted@openlc.org> A Divendres 23 Gener 2004 08:24, Paul Prescod va escriure: > Perry Greenfield wrote: > > ... > > We had looked at it at least a couple of times. I don't remember now > > all the conclusions, but I think one of the problems was that > > it wasn't as useful when one had to deal with data types not > > used in python itself (e.g., unsigned int16). I might be wrong > > about that. > > I would guess that the issue is more whether it is natively handled by > Pyrex than whether it is handled by Python. Is there a finite list of > these types that Numarray handles? If you have a list I could generate a > patch to Pyrex that would support them. We could then ask Greg whether > he could add them to Pyrex core or refactor it so that he doesn't have to. I think the question rather was whether Pyrex would be able to work with templates (in the sense of C++), i.e. it can generate different functions depending on the datatypes passed to them. 
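(To make the "templates" under discussion concrete, here is a toy version of the approach: one loop body written once as a string and stamped out for each supported element type. This only illustrates the idea -- it is not numarray's actual generator -- and the type list and output file name are invented for the example.)

# One C loop body, written once, with the type-dependent parts left as
# %-placeholders to be filled in per element type.
ADD_TEMPLATE = """
static void add_%(name)s(%(ctype)s *in1, %(ctype)s *in2,
                         %(ctype)s *out, long n)
{
    long i;
    for (i = 0; i < n; i++) out[i] = in1[i] + in2[i];
}
"""

TYPES = [('Int16',   'short'),
         ('Int32',   'int'),
         ('Float32', 'float'),
         ('Float64', 'double')]

def generate_add_functions():
    # expand the template once per supported type
    chunks = []
    for name, ctype in TYPES:
        chunks.append(ADD_TEMPLATE % {'name': name, 'ctype': ctype})
    return ''.join(chunks)

if __name__ == '__main__':
    open('_add_generated.c', 'w').write(generate_add_functions())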
You can see some previous discussion on that list in: http://sourceforge.net/mailarchive/forum.php?thread_id=1642778&forum_id=4890 I've formulated the question to Greg and here you are his answer: http://sourceforge.net/mailarchive/forum.php?thread_id=1645713&forum_id=4890 So, it seems that he don't liked the idea to implement "templates" in Pyrex. > > > Numarray generates a lot of c code directly for the actual > > array computations. That is neither the slow part, nor the > > hard part to write. It is the array computation setup that > > is complicated. Much of that is now in C (and we do worry > > that it has greatly added to the complexity). Perhaps that > > part could be better handled by pyrex. > > It sounds like it. Yeah, I'm quite convinced that a mix between Pyrex and the existing solution in numarray for dealing with templates could be worth the effort. At least, some analysis could be done on that aspect. > > > I think some of the remaining overhead has to do with intrinsic > > python calls, and the differences between the simpler type used > > for Numeric versus the new style classes used for numarray. > > Don't hold me to that however. > > Pyrex may be able to help with at least one of these. Calls between > Pyrex-coded functions usually go at C speeds (although method calls may > be slower). Well, that should be clarified: that's only true for cdef's pyrex functions (i.e. C functions made in Pyrex). Pyrex functions that are able to be called from Python takes the same time whether they are called from Python or from the same Pyrex extension. See some timmings I've done on that subject some time ago: http://sourceforge.net/mailarchive/message.php?msg_id=3782230 Cheers, -- Francesc Alted Departament de Ci?ncies Experimentals Universitat Jaume I. Castell? de la Plana. Spain From hinsen at cnrs-orleans.fr Fri Jan 23 03:59:01 2004 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Fri Jan 23 03:59:01 2004 Subject: [Numpy-discussion] the direction and pace of development In-Reply-To: References: Message-ID: <200401231256.27137.hinsen@cnrs-orleans.fr> On Wednesday 21 January 2004 22:28, Perry Greenfield wrote: > contributed code as well). You have to remember that how easily > contributions come depends on what the critical mass is for > usefulness. For something like numarray or Numeric, that critical > mass is quite large. Few are interested in contributing when it > can do very little and and older package exists that can do more. I also find it difficult in practice to move code from Numeric to Numarray. While the two packages coexist peacefully, any C module that depends on the C API must be compiled for one or the other. Having both available for comparative testing thus means having two separate Python installations. And even with two installations, there is only one PYTHONPATH setting, which makes development under these conditions quite a pain. If someone has found a way out of that, please tell me! > many times in the past. Often consensus was hard to achieve. > We tended to lean towards backward compatibilty unless the change > seemed really necessary. For type coercion and error handling, > we thought it was. But I don't think we have tried shield the > decision making process from the community. I do think the difficulty > in achieving a sense of consensus is a problem. I think you did well on this - but then, I happen to share your general philosophy ;-) Konrad. 
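(On the pure-Python side of the coexistence problem Konrad describes earlier in his message, a common idiom of the period was a fallback import like the sketch below. It does nothing for extension modules, which still have to be compiled against one C API or the other.)

try:
    import numarray as N          # prefer numarray if it is installed
except ImportError:
    import Numeric as N           # otherwise fall back to Numeric

a = N.arange(10)
print(N.sum(a * a))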
-- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais -------------------------------------------------------------------------------

From rbastian at club-internet.fr Sat Jan 24 04:33:00 2004 From: rbastian at club-internet.fr (René Bastian) Date: Sat Jan 24 04:33:00 2004 Subject: [Numpy-discussion] numarray0.4->numarray0.8 Message-ID: <04012409233700.00754@rbastian> I need your help. I tried to update numarray-0.4 to numarray-0.8 I did not get error messages during "install" but lauching python2.3 >>>import numarray I get the message Fatal Python error : Can't import module numarray.libnumarray Uninstall 0.4 (or 0.8) ? How to uninstall numarray ? Thanks for your answers -- René
Bastian http://www.musiques-rb.org : Musique en Python From falted at openlc.org Sat Jan 24 04:48:00 2004 From: falted at openlc.org (Francesc Alted) Date: Sat Jan 24 04:48:00 2004 Subject: [Numpy-discussion] numarray0.4->numarray0.8 In-Reply-To: <04012409233700.00754@rbastian> References: <04012409233700.00754@rbastian> Message-ID: <200401241347.19384.falted@openlc.org> A Dissabte 24 Gener 2004 09:23, Ren? Bastian va escriure: > I need your help. > > I tried to update numarray-0.4 to numarray-0.8 > I did not get error messages during "install" > but lauching > python2.3 > > >>>import numarray > > I get the message > Fatal Python error : Can't import module numarray.libnumarray > > Uninstall 0.4 (or 0.8) ? > How to uninstall numarray ? > Perhaps there is a better way, but try with deleting the numarray directory in your python site-packages directory. In my case, the next does the work: rm -r /usr/lib/python2.3/site-packages/numarray/ -- Francesc Alted From nancyk at MIT.EDU Sun Jan 25 10:34:00 2004 From: nancyk at MIT.EDU (Nancy Keuss) Date: Sun Jan 25 10:34:00 2004 Subject: [Numpy-discussion] Numpy capabilities? Message-ID: Hi, I will be working a lot with matrices, and I am wondering a few things before I get started with NumPy: 1) Is there a function that performs matrix multiplication? 2) Is there a function that takes a tensor product, or Kronecker product, of two matrices? 3) Is it possible to concatenate two matrices together? 4) Is there a way to insert a matrix into a subsection of an already existing matrix. For instance, to insert a 2x2 matrix into the upper left hand corner of a 4x4 matrix? Thank you very much in advance! Nancy From hinsen at cnrs-orleans.fr Sun Jan 25 11:21:00 2004 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Sun Jan 25 11:21:00 2004 Subject: [Numpy-discussion] Numpy capabilities? In-Reply-To: References: Message-ID: <758CA5BB-4F6B-11D8-8519-000A95AB5F10@cnrs-orleans.fr> On 25.01.2004, at 19:33, Nancy Keuss wrote: > I will be working a lot with matrices, and I am wondering a few things > before I get started with NumPy: > > 1) Is there a function that performs matrix multiplication? Yes, Numeric.dot(matrix1, matrix2) > 2) Is there a function that takes a tensor product, or Kronecker > product, of > two matrices? Yes, Numeric.multiply.outer(matrix1, matrix2) > 3) Is it possible to concatenate two matrices together? Yes: Numeric.concatenate((matrix1, matrix2)) > 4) Is there a way to insert a matrix into a subsection of an already > existing matrix. For instance, to insert a 2x2 matrix into the upper > left > hand corner of a 4x4 matrix? Yes: matrix4x4[:2, :2] = matrix2x2 Konrad. From rays at san.rr.com Sun Jan 25 22:31:01 2004 From: rays at san.rr.com (RJS) Date: Sun Jan 25 22:31:01 2004 Subject: [Numpy-discussion] efficient sum of "sparse" 2D arrays? Message-ID: <5.2.1.1.2.20040125215330.0354efd0@pop-server.san.rr.com> Hi all, The problem: I have a "stack" of 8, 640 x 480 integer image arrays from a FITS cube concatenated into a 3D array, and I want to sum each pixel such that the result ignores clipped values (255+); i.e., if two images have clipped pixels at (x,y) the result along z will be the sum of the other 6. I'm trying to come up with a pure Numeric way (hopefully so that I can use weave.blitz) to speed up the calculation. I just looked into masked arrays, but I'm not familiar with that module at all. I was guessing someone out there has done this before... 
Ray From hinsen at cnrs-orleans.fr Mon Jan 26 00:17:01 2004 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Mon Jan 26 00:17:01 2004 Subject: [Numpy-discussion] efficient sum of "sparse" 2D arrays? In-Reply-To: <5.2.1.1.2.20040125215330.0354efd0@pop-server.san.rr.com> References: <5.2.1.1.2.20040125215330.0354efd0@pop-server.san.rr.com> Message-ID: <056D7E45-4FD8-11D8-B969-000A95AB5F10@cnrs-orleans.fr> On 26.01.2004, at 07:14, RJS wrote: > The problem: I have a "stack" of 8, 640 x 480 integer image arrays > from a FITS cube concatenated into a 3D array, and I want to sum each > pixel such that the result ignores clipped values (255+); i.e., if two > images have clipped pixels at (x,y) the result along z will be the sum > of the other 6. > Memory doesn't seem critical for such small arrays, so you can just do sum([where(a < 255, a, 0) for a in images]) Konrad. From SKuzminski at fairisaac.com Mon Jan 26 04:54:03 2004 From: SKuzminski at fairisaac.com (Kuzminski, Stefan R) Date: Mon Jan 26 04:54:03 2004 Subject: [Numpy-discussion] efficient sum of "sparse" 2D arrays? Message-ID: <7646464ACC9B5347A4A5C57729D74A550369C143@srfmsg100.corp.fairisaac.com> Could you use masked arrays more efficiently in this case? If you create the array so that values >255 and <0 are masked, then they will be excluded from the sum ( and from any other operations as well ). Stefan -----Original Message----- From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy-discussion-admin at lists.sourceforge.net] On Behalf Of Konrad Hinsen Sent: Monday, January 26, 2004 12:17 AM To: RJS Cc: numpy-discussion at lists.sourceforge.net Subject: Re: [Numpy-discussion] efficient sum of "sparse" 2D arrays? On 26.01.2004, at 07:14, RJS wrote: > The problem: I have a "stack" of 8, 640 x 480 integer image arrays > from a FITS cube concatenated into a 3D array, and I want to sum each > pixel such that the result ignores clipped values (255+); i.e., if two > images have clipped pixels at (x,y) the result along z will be the sum > of the other 6. > Memory doesn't seem critical for such small arrays, so you can just do sum([where(a < 255, a, 0) for a in images]) Konrad. ------------------------------------------------------- The SF.Net email is sponsored by EclipseCon 2004 Premiere Conference on Open Tools Development and Integration See the breadth of Eclipse activity. February 3-5 in Anaheim, CA. http://www.eclipsecon.org/osdn _______________________________________________ Numpy-discussion mailing list Numpy-discussion at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/numpy-discussion From nancyk at MIT.EDU Mon Jan 26 08:39:09 2004 From: nancyk at MIT.EDU (Nancy Keuss) Date: Mon Jan 26 08:39:09 2004 Subject: [Numpy-discussion] simple Numarray question Message-ID: Hi, What do I have to include in my Python file for Python to recognize Numarray functions? For instance, in a file called hello.py I try: a = arange(10) print a[1:5] and I get the error: Traceback (most recent call last): File "C:\Python23\hello.py", line 3, in ? a = arange(10) NameError: name 'arange' is not defined Thank you, Nancy From jsaenz at wm.lc.ehu.es Mon Jan 26 08:47:09 2004 From: jsaenz at wm.lc.ehu.es (Jon Saenz) Date: Mon Jan 26 08:47:09 2004 Subject: [Numpy-discussion] simple Numarray question In-Reply-To: Message-ID: Read point 6 of the Python Tutorial. Read chapter 2 (Installing NumPy) of Numeric Python manual. Hope this helps. Jon Saenz. | Tfno: +34 946012445 Depto. 
Fisica Aplicada II | Fax: +34 944648500 Facultad de Ciencias. \\ Universidad del Pais Vasco \\ Apdo. 644 \\ 48080 - Bilbao \\ SPAIN On Mon, 26 Jan 2004, Nancy Keuss wrote: > Hi, > > What do I have to include in my Python file for Python to recognize Numarray > functions? For instance, in a file called hello.py I try: > > a = arange(10) > print a[1:5] > > and I get the error: > > Traceback (most recent call last): > File "C:\Python23\hello.py", line 3, in ? > a = arange(10) > NameError: name 'arange' is not defined > > Thank you, > Nancy > > > > ------------------------------------------------------- > The SF.Net email is sponsored by EclipseCon 2004 > Premiere Conference on Open Tools Development and Integration > See the breadth of Eclipse activity. February 3-5 in Anaheim, CA. > http://www.eclipsecon.org/osdn > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From jmiller at stsci.edu Mon Jan 26 08:58:13 2004 From: jmiller at stsci.edu (Todd Miller) Date: Mon Jan 26 08:58:13 2004 Subject: [Numpy-discussion] simple Numarray question In-Reply-To: References: Message-ID: <1075136211.16499.24.camel@localhost.localdomain> On Mon, 2004-01-26 at 11:38, Nancy Keuss wrote: > Hi, > > What do I have to include in my Python file for Python to recognize Numarray > functions? For instance, in a file called hello.py I try: > > a = arange(10) > print a[1:5] > > and I get the error: > > Traceback (most recent call last): > File "C:\Python23\hello.py", line 3, in ? > a = arange(10) > NameError: name 'arange' is not defined > There are a number of ways to import numarray (or any Python module), but the way I recommend is this: import numarray a = numarray.arange(10) print a[1:5] If you're writing quick scripts that you're not worried about maintaining, do this: from numarray import * a = arange(10) print a[1:5] If writing "numarray." is too tedious, but you still care about maintenance, try something like this: import numarray as _n a = _n.arange(10) print a[1:5] Todd > Thank you, > Nancy > > > > ------------------------------------------------------- > The SF.Net email is sponsored by EclipseCon 2004 > Premiere Conference on Open Tools Development and Integration > See the breadth of Eclipse activity. February 3-5 in Anaheim, CA. > http://www.eclipsecon.org/osdn > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion -- Todd Miller From Chris.Barker at noaa.gov Mon Jan 26 09:52:03 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Mon Jan 26 09:52:03 2004 Subject: [Numpy-discussion] Status of Numeric In-Reply-To: <200401231038.58199.falted@openlc.org> References: <4010CC24.3040500@prescod.net> <200401231038.58199.falted@openlc.org> Message-ID: <401552DF.3000505@noaa.gov> I remember that thread clearly, as I think making it easy to write new Ufuncs (and others) that perform at C speed could make a real difference to how effective SciPy ultimately is. I say SciPy, because I believe a large collection of special purpose optimized functions probably doesn't belong in in Numarray itself. Francesc Alted wrote: > So, it seems that he don't liked the idea to implement "templates" in Pyrex. Yes, I remember that answer, and was disappointed, though the logic of not-re-implkimenting C++ templates is pretty obvious. 
Which brings up the obvious question: why not use C++ templates themselves? which is what Blitz does. This points ot weave.blitz at the obvious way to write optimized special purpose functions for SciPy. Does weave.Blitz work with Numarray yet? Clearly it's time for me to check it out more... > Yeah, I'm quite convinced that a mix between Pyrex and the existing solution > in numarray for dealing with templates could be worth the effort. At least, > some analysis could be done on that aspect. allowing Pyrex to use templates would be great.. but how would that be better than weave.blitz? Or maybe Pyrex could use blitz. I'm kind of over my head here, but I hope something comes of this. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From j_r_fonseca at yahoo.co.uk Mon Jan 26 10:11:02 2004 From: j_r_fonseca at yahoo.co.uk (=?iso-8859-1?Q?Jos=E9?= Fonseca) Date: Mon Jan 26 10:11:02 2004 Subject: [Numpy-discussion] Searchable online documentation of Numeric Python Message-ID: I've made this mainly for myself, but in case anybody here finds it useful, here is the online documentation of Numeric Python with TOC/index/search navigation: http://mefriss1.swan.ac.uk/htmlhelp/php/index.php?book_id=49 There's also Python documention there if you fancy it. NOTE: The URL may go dead eventually, but until I'll arrange a way to have this documentation elsewhere, so it won't worry about that. I hope you enjoy it. Jose Fonseca From dubois1 at llnl.gov Mon Jan 26 11:31:02 2004 From: dubois1 at llnl.gov (Paul F. Dubois) Date: Mon Jan 26 11:31:02 2004 Subject: [Numpy-discussion] cygwin problems? Message-ID: <40156AC3.1080501@llnl.gov> A user (see below) has complained that svd and other functions hang on cygwin, with numarray and Numeric. Anyone know anything about this? Hi Paul I tried the svd of numarray, version 0.6.2, I did not download the newest version of numarray, and still the same problem happen, it hanged on my cygwin box. I've also, noticed that calculating the eigenvalues of a square matrix with Numeric also hang the python 2.3 I wonder if the problem is related.... Let me know, if you have some idea of what to do next. From rays at blue-cove.com Mon Jan 26 11:55:09 2004 From: rays at blue-cove.com (Ray Schumacher) Date: Mon Jan 26 11:55:09 2004 Subject: [Numpy-discussion] summing "sparse" 2D arrays? Results... Message-ID: <5.2.0.4.2.20040126111759.09418258@blue-cove.com> I just realized... where() belongs to Numeric, so I need sum([Numeric.where(a < 255, a, 0) for a in y]) duh. I did just compare Numeric vs. 
Masked arrays: =========================================================== # test.py from MA import masked_array, sum from RandomArray import * import time seed() y = randint(240,256, (480,640,16)) start = time.time() x=masked_array(y, y>=255) maskTime = time.time() - start sum_1 = sum(x,axis=2) maskedTime = time.time() - start print sum_1.shape print sum_1 print "mask make time: " + str(maskTime) print "time using MA: " + str(maskedTime) + "\n" z = Numeric.reshape(y, (16, 480, 640)) newStart = time.time() sum_2 = sum([Numeric.where(a < 255, a, 2) for a in z]) numTime = time.time() - newStart print sum_2.shape print sum_2 print "time using Numeric: " + str(numTime) + "\n" ====================================================== Result: C:\projects\Astro>python test.py (480, 640) array (480,640) , type = O, has 307200 elements mask make time: 1.07899999619 time using MA: 3.39100003242 (480, 640) array (480,640) , type = l, has 307200 elements time using Numeric: 2.39099979401 So, MA's sum() is slightly faster, but the penalty for making a mask first is large. Now I have to figure out why I had to reshape the array for the second computation. Thanks, Ray At 09:17 AM 1/26/2004 +0100, you wrote: >On 26.01.2004, at 07:14, RJS wrote: > >>The problem: I have a "stack" of 8, 640 x 480 integer image arrays from >>a FITS cube concatenated into a 3D array, and I want to sum each pixel >>such that the result ignores clipped values (255+); i.e., if two images >>have clipped pixels at (x,y) the result along z will be the sum of the other 6. >Memory doesn't seem critical for such small arrays, so you can just do > >sum([where(a < 255, a, 0) for a in images]) Hello Konrad, I just tried: from MA import masked_array, sum from RandomArray import * seed() y = randint(240,256, (480,640,2)) print sum([where(a < 255, a, 0) for a in y]) and it errors: Traceback (most recent call last): File "test.py", line 21, in ? print sum([where(a < 255, a, 0) for a in y]) NameError: name 'where' is not defined Could you enlighten me further? I have not found a good resource for compound Numeric statements yet. Thank you, Ray From mdehoon at ims.u-tokyo.ac.jp Mon Jan 26 17:42:01 2004 From: mdehoon at ims.u-tokyo.ac.jp (Michiel Jan Laurens de Hoon) Date: Mon Jan 26 17:42:01 2004 Subject: [Numpy-discussion] cygwin problems? In-Reply-To: <40156AC3.1080501@llnl.gov> References: <40156AC3.1080501@llnl.gov> Message-ID: <4015C296.5000707@ims.u-tokyo.ac.jp> Patch 732520 for Numeric fixes both problems. The problem is caused by some lapack routines being inadvertently compiled with optimization; see the patch description for a full explanation. Note that the same error may occur on platforms other than Cygwin, and also with other linear algebra functions. --Michiel, U Tokyo. Paul F. Dubois wrote: > A user (see below) has complained that svd and other functions hang on > cygwin, with numarray and Numeric. Anyone know anything about this? > > Hi Paul > > I tried the svd of numarray, version 0.6.2, I did not download the > newest version of > numarray, and still the same problem happen, it hanged on my cygwin box. > > I've also, noticed that calculating the eigenvalues of a square matrix > with Numeric also > hang the python 2.3 I wonder if the problem is related.... > > Let me know, if you have some idea of what to do next. 
From mdehoon at ims.u-tokyo.ac.jp Mon Jan 26 17:42:01 2004
From: mdehoon at ims.u-tokyo.ac.jp (Michiel Jan Laurens de Hoon)
Date: Mon Jan 26 17:42:01 2004
Subject: [Numpy-discussion] cygwin problems?
In-Reply-To: <40156AC3.1080501@llnl.gov>
References: <40156AC3.1080501@llnl.gov>
Message-ID: <4015C296.5000707@ims.u-tokyo.ac.jp>

Patch 732520 for Numeric fixes both problems. The problem is caused by some lapack routines being inadvertently compiled with optimization; see the patch description for a full explanation. Note that the same error may occur on platforms other than Cygwin, and also with other linear algebra functions.

--Michiel, U Tokyo.

Paul F. Dubois wrote:
> A user (see below) has complained that svd and other functions hang on
> cygwin, with numarray and Numeric. Anyone know anything about this?
>
> Hi Paul
>
> I tried the svd of numarray, version 0.6.2, I did not download the
> newest version of numarray, and still the same problem happen, it hanged on my cygwin box.
>
> I've also, noticed that calculating the eigenvalues of a square matrix
> with Numeric also hang the python 2.3 I wonder if the problem is related....
>
> Let me know, if you have some idea of what to do next.

--
Michiel de Hoon, Assistant Professor
University of Tokyo, Institute of Medical Science
Human Genome Center
4-6-1 Shirokane-dai, Minato-ku
Tokyo 108-8639
Japan
http://bonsai.ims.u-tokyo.ac.jp/~mdehoon
From ariciputi at pito.com Tue Jan 27 09:45:26 2004
From: ariciputi at pito.com (Andrea Riciputi)
Date: Tue Jan 27 09:45:26 2004
Subject: [Numpy-discussion] Writing arrays to files.
Message-ID: <0627057D-50DE-11D8-9CFE-000393933E4E@pito.com>

Hi, I need a little help here. I need to write some Numeric arrays (coming from some simulations) to ASCII files. I'm sure it's a well known topic and I'd be happy not to have to reinvent the wheel. I've both 1-dim and 2-dim arrays and I'd like to get something like this:

- 1-dim array ASCII file:
0.1 0.2 0.3 etc...

- 2-dim array ASCII file:
0.1 0.2 0.3
0.4 0.5 0.6
etc....

How can I get them? Thanks in advance,

Andrea.

---
Andrea Riciputi

"Science is like sex: sometimes something useful comes out,
but that is not the reason we are doing it" -- (Richard Feynman)

From jdhunter at ace.bsd.uchicago.edu Tue Jan 27 10:45:56 2004
From: jdhunter at ace.bsd.uchicago.edu (John Hunter)
Date: Tue Jan 27 10:45:56 2004
Subject: [Numpy-discussion] Writing arrays to files.
In-Reply-To: <0627057D-50DE-11D8-9CFE-000393933E4E@pito.com> (Andrea Riciputi's message of "Tue, 27 Jan 2004 16:32:32 +0100")
References: <0627057D-50DE-11D8-9CFE-000393933E4E@pito.com>
Message-ID: 

>>>>> "Andrea" == Andrea Riciputi writes:

    Andrea> Hi, I need a little help here. I need to write some
    Andrea> Numeric arrays (coming from some simulations) to ASCII
    Andrea> files. I'm sure it's a well known topic and I'd be happy
    Andrea> not to have to reinvent the wheel. I've both 1-dim and
    Andrea> 2-dim arrays and I'd like to get something like this:

    Andrea> - 1-dim array ASCII file:
    Andrea> 0.1 0.2 0.3 etc...

If your array is not monstrously large, and you can do it all in memory, do

    fh = file('somefile.dat', 'w')
    s = ' '.join([str(val) for val in a])
    fh.write(s)

where a is your 1D array

    Andrea> - 2-dim array ASCII file:
    Andrea> 0.1 0.2 0.3
    Andrea> 0.4 0.5 0.6
    Andrea> etc....

    Andrea> How can I get them?

Same idea

    fh = file('somefile.dat', 'w')
    for row in a:
        s = ' '.join([str(val) for val in row])
        fh.write('%s\n' % s)

where a is your 2D array

The scipy module also has support for reading and writing ASCII files. Note if you are concerned about efficiency and are willing to use binary files, use the fromstring and tostring methods.

JDH
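A quick sketch of the binary route mentioned at the end there, hedged: it assumes a Numeric Float64 array and an illustrative file name, and the file itself records neither shape nor typecode, so you have to carry those along yourself.

    import Numeric

    a = Numeric.ones((1000, 1000), Numeric.Float64)

    # dump the raw bytes
    fh = file('somefile.bin', 'wb')
    fh.write(a.tostring())
    fh.close()

    # read them back and restore shape/typecode by hand
    data = file('somefile.bin', 'rb').read()
    b = Numeric.reshape(Numeric.fromstring(data, Numeric.Float64), (1000, 1000))

For moving data between machines with different byte order, the ASCII form above remains the safer choice.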
From jdhunter at ace.bsd.uchicago.edu Tue Jan 27 15:07:11 2004
From: jdhunter at ace.bsd.uchicago.edu (John Hunter)
Date: Tue Jan 27 15:07:11 2004
Subject: [Numpy-discussion] Writing arrays to files.
In-Reply-To: (Andrea Riciputi's message of "Tue, 27 Jan 2004 23:25:20 +0100")
References: <0627057D-50DE-11D8-9CFE-000393933E4E@pito.com>
Message-ID: 

>>>>> "Andrea" == Andrea Riciputi writes:

    Andrea> On 27 Jan 2004, at 19:31, John Hunter wrote:
    >> If your array is not monstrously large, and you can do it all
    >> in memory, do

    Andrea> 1-dim arrays with 1000 elements and 2-dim arrays with
    Andrea> (1000 x 1000) elements have to be considered "monstrously
    Andrea> large"?

You should have no trouble with either the 1D or 2D approaches I posted with arrays this size. Even though 1000x1000 is a lot of elements, the 2D approach does the string operations row by row, so only 1000 will be converted at a time, which will be trivial for all but the clunkiest machines.

JDH

From niall at lastminute.com Wed Jan 28 09:47:04 2004
From: niall at lastminute.com (Niall Dalton)
Date: Wed Jan 28 09:47:04 2004
Subject: [Numpy-discussion] atan2
Message-ID: <1075311725.3856.69.camel@localhost.localdomain>

Hello, I'm using Numeric 23.1, and finding it very useful! I do need to use atan2, and a browse of the manual suggests its not available as a binary ufunc. I'm happy to add it myself if I'm correct - I'm guessing it should be simple to based on the code of one of the existing functions. Is Numeric still accepting patches, or should I consider switching to Numarray?

Regards,
Niall

________________________________________________________________________
This e-mail has been scanned for all viruses by Star Internet. The service is powered by MessageLabs. For more information on a proactive anti-virus service working around the clock, around the globe, visit: http://www.star.net.uk
________________________________________________________________________

From jdhunter at ace.bsd.uchicago.edu Wed Jan 28 09:54:09 2004
From: jdhunter at ace.bsd.uchicago.edu (John Hunter)
Date: Wed Jan 28 09:54:09 2004
Subject: [Numpy-discussion] atan2
In-Reply-To: <1075311725.3856.69.camel@localhost.localdomain> (Niall Dalton's message of "Wed, 28 Jan 2004 17:42:05 +0000")
References: <1075311725.3856.69.camel@localhost.localdomain>
Message-ID: 

>>>>> "Niall" == Niall Dalton writes:

    Niall> Hello, I'm using Numeric 23.1, and finding it very useful!
    Niall> I do need to use atan2, and a browse of the manual suggests
    Niall> its not available as a binary ufunc.
Numeric.arctan2 If you run into a similar problem in the future, you may want to try >>> import Numeric >>> dir(Numeric) Hope this helps, JDH From niall at lastminute.com Wed Jan 28 10:03:03 2004 From: niall at lastminute.com (Niall Dalton) Date: Wed Jan 28 10:03:03 2004 Subject: [Numpy-discussion] atan2 In-Reply-To: References: <1075311725.3856.69.camel@localhost.localdomain> Message-ID: <1075312803.3856.71.camel@localhost.localdomain> On Wed, 2004-01-28 at 17:39, John Hunter wrote: > >>>>> "Niall" == Niall Dalton writes: > > Niall> I do need to use atan2 > Numeric.arctan2 > > If you run into a similar problem in the future, you may want to try > > >>> import Numeric > >>> dir(Numeric) > > Hope this helps, It does indeed, thanks! I blame the first snowfall we just had minutes ago for the oversight - its my story and I'm sticking to it ;-) Thanks, niall ________________________________________________________________________ This e-mail has been scanned for all viruses by Star Internet. The service is powered by MessageLabs. For more information on a proactive anti-virus service working around the clock, around the globe, visit: http://www.star.net.uk ________________________________________________________________________ From ariciputi at pito.com Thu Jan 29 11:27:03 2004 From: ariciputi at pito.com (Andrea Riciputi) Date: Thu Jan 29 11:27:03 2004 Subject: [Numpy-discussion] Writing arrays to files. In-Reply-To: References: <0627057D-50DE-11D8-9CFE-000393933E4E@pito.com> Message-ID: On 27 Jan 2004, at 19:31, John Hunter wrote: > If your array is not monstrously large, and you can do it all in > memory, do 1-dim arrays with 1000 elements and 2-dim arrays with (1000 x 1000) elements have to be considered "monstrously large"? Cheers, Andrea. --- Andrea Riciputi "Science is like sex: sometimes something useful comes out, but that is not the reason we are doing it" -- (Richard Feynman) From zk4hcgg at spray.se Fri Jan 30 07:40:00 2004 From: zk4hcgg at spray.se (Chuck Meadows) Date: Fri Jan 30 07:40:00 2004 Subject: [Numpy-discussion] Protect Yourself now Message-ID: An HTML attachment was scrubbed... URL: From dholth at fastmail.fm Fri Jan 30 13:23:06 2004 From: dholth at fastmail.fm (Daniel Holth) Date: Fri Jan 30 13:23:06 2004 Subject: [Numpy-discussion] shape, size Message-ID: <1075435976.13335.4.camel@bluefish> In python, if na is a numarray: >>> na array([[0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0], [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1]]) I can type >>> nb = na[:,:4] >>> nb array([[0, 1, 0, 1], [1, 0, 1, 0]]) >>> nb[0][0]=17 >>> na array([[17, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0], [ 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1]]) nb and na share data. How do you write nb = na[:,:4] in a C extension module? 
Thanks, Daniel Holth From jmiller at stsci.edu Fri Jan 30 13:50:02 2004 From: jmiller at stsci.edu (Todd Miller) Date: Fri Jan 30 13:50:02 2004 Subject: [Numpy-discussion] shape, size In-Reply-To: <1075435976.13335.4.camel@bluefish> References: <1075435976.13335.4.camel@bluefish> Message-ID: <1075499339.9028.16.camel@halloween.stsci.edu> On Thu, 2004-01-29 at 23:12, Daniel Holth wrote: > In python, if na is a numarray: > > >>> na > array([[0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0], > [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1]]) > > I can type > > >>> nb = na[:,:4] > >>> nb > array([[0, 1, 0, 1], > [1, 0, 1, 0]]) > > >>> nb[0][0]=17 > > >>> na > array([[17, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0], > [ 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1]]) > > nb and na share data. > > How do you write nb = na[:,:4] in a C extension module? Here's the quick and dirty way: nb = (PyArrayObject *) PyObject_CallMethod(na, "view", NULL); if (na->dimensions[1] >= 4) nb->dimensions[1] = 4; Todd > > Thanks, > > Daniel Holth > > > > ------------------------------------------------------- > The SF.Net email is sponsored by EclipseCon 2004 > Premiere Conference on Open Tools Development and Integration > See the breadth of Eclipse activity. February 3-5 in Anaheim, CA. > http://www.eclipsecon.org/osdn > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion -- Todd Miller Space Telescope Science Institute 3700 San Martin Drive Baltimore MD, 21030 (410) 338 - 4576 From dholth at sent.com Fri Jan 30 14:27:02 2004 From: dholth at sent.com (Daniel Holth) Date: Fri Jan 30 14:27:02 2004 Subject: [Numpy-discussion] shape, size Message-ID: <1075436585.13335.7.camel@bluefish> apoligies if this is a duplicate... How do you write nb = na[:,:4] in a C extension module? Thanks, Daniel Holth From austin at magicfish.net Fri Jan 30 18:33:36 2004 From: austin at magicfish.net (Austin Luminais) Date: Fri Jan 30 18:33:36 2004 Subject: [Numpy-discussion] numarray-0.8.1 Message-ID: <6.0.0.22.0.20040130202402.0281ee10@mail.magicfish.net> Hello, is there any place I can download a Windows installer for numarray 0.8.1? I upgraded to 0.8.2 a while back, but it does not work with McMillan's Installer. 0.8.1 worked fine, but I neglected to keep a copy of it. As for why it doesn't work with Installer, I'm not sure. At least part of the problem is that it is hardcoded to load LICENSE.TXT in __init__.py in a way that is incompatible with Installer. I tried removing the loading of LICENSE.TXT (which I realize is a questionable thing to do; I was just trying to get it working), but it doesn't work after that either. From jdhunter at ace.bsd.uchicago.edu Fri Jan 30 19:49:03 2004 From: jdhunter at ace.bsd.uchicago.edu (John Hunter) Date: Fri Jan 30 19:49:03 2004 Subject: [Numpy-discussion] [Daniel Holth ] numarray question Message-ID: Daniel asked me to forward this, as apparently he has had trouble getting mail through. -------------- next part -------------- An embedded message was scrubbed... 
From: unknown sender
Subject: no subject
Date: no date
Size: 38
URL:

From dholth at fastmail.fm Thu Jan 29 23:30:29 2004
From: dholth at fastmail.fm (Daniel Holth)
Date: Thu, 29 Jan 2004 23:30:29 -0500
Subject: numarray question
Message-ID: <1075437029.13335.12.camel@bluefish>

JDH, sf.net seems to be ignoring my messages, so would you relay this question: how do you write nb = na[:,:4], creating a slice of na that references nb, in an extension?

thanks,

Daniel Holth

From dholth at fastmail.fm Fri Jan 30 21:31:34 2004
From: dholth at fastmail.fm (Daniel Holth)
Date: Fri Jan 30 21:31:34 2004
Subject: [Numpy-discussion] array resizing?
Message-ID: <1075353684.10541.2.camel@bluefish>

Is it possible for a C function to take an array from Python and resize it for returned results?

From rays at blue-cove.com Sat Jan 31 07:04:02 2004
From: rays at blue-cove.com (RayS)
Date: Sat Jan 31 07:04:02 2004
Subject: [Numpy-discussion] numarray-0.8.1
In-Reply-To: <6.0.0.22.0.20040130202402.0281ee10@mail.magicfish.net>
Message-ID: <5.2.1.1.2.20040131070100.0810f060@216.122.242.54>

At 08:32 PM 1/30/04 -0600, Austin Luminais wrote:
>As for why it doesn't work with Installer, I'm not sure. At least part of
>the problem is that it is hardcoded to load LICENSE.TXT in __init__.py in
>a way that is incompatible with Installer.
>I tried removing the loading of LICENSE.TXT (which I realize is a
>questionable thing to do; I was just trying to get it working), but it
>doesn't work after that either.

I saw this thread before:
http://aspn.activestate.com/ASPN/Mail/Message/numpy-discussion/1967514
seems the solution though I haven't had to try it. I prefer McMillan to py2exe for it's smaller exe-s.

Ray

From verveer at embl-heidelberg.de Mon Jan 5 04:20:11 2004
From: verveer at embl-heidelberg.de (Peter Verveer)
Date: Mon Jan 5 04:20:11 2004
Subject: [Numpy-discussion] Update for IM.
a small image processing system In-Reply-To: <3FF624C5.7010400@erols.com> References: <3FF624C5.7010400@erols.com> Message-ID: <200401051243.50605.verveer@embl-heidelberg.de> On Saturday 03 January 2004 03:11, Edward C. Jones wrote: > IM > > I have uploaded a new version of my small image processing system IM to > "http://members.tripod.com/~edcjones/IM-01.01.04.tar.gz". Most of the code > in IM (pronounced "I'm") is inferior to "nd_image" so I will eventually > convert it all to "nd_image". I had a look and I guess that indeed you could use the nd_image package for some low level stuff (I am the author of nd_image). nd_image is however also still being developed and I am looking for directions to further work on. I wondered if there is anything you would like to see in there? > THOUGHTS > > There are many open source image processing systems but most of them get > only to the Canny edge operator and then stop. A sample of the better ones > are: > > ImageMagick http://www.imagemagick.org/ > OpenCV http://www.intel.com/research/mrl/research/opencv/ > Xite > http://www.ifi.uio.no/forskning/grupper/dsb/Software/Xite/ VXL > http://vxl.sourceforge.net/ > Gandalf http://sourceforge.net/projects/gandalf-library/ > imgSeek http://imgseek.sourceforge.net/ I think not all of these are general image processing systems and often a bit limited. One problem that I have with most of these packages is that they stop at processing 8bit or 16bit two-dimensional images. That is a limit for quite a lot of image analysis research, for instance medical imaging. That is why numarray is so great, it supports multi-dimensional arrays of arbritrary type. nd_image is designed to support multiple dimensions and any data type. That is not always easy and may prevent some optimizations, but I think it is an important feature. That idea is of course not new, matlab is starting to support multi-dimensional image routines and I am aware of at least one C library that does this, although it is not free software: http://www.ph.tn.tudelft.nl/DIPlib/ > And then there is the huge and hard to use "Image Understanding > Environment" (IUE) at "http://www.aai.com/AAI/IUE/IUE.html". Has anyone > used this? The website appears to updated last in 1999, which is not encouraging. Looks hideously complex too. > A good starting point is "The Computer Vision Homepage" at > "http://www-2.cs.cmu.edu/~cil/vision.html". At this site there is a list of > published software. A well-known example is the Kanade-Lucas-Tomasi Feature > Tracker coded by Stan Birchfield at > "http://vision.stanford.edu/~birch/klt/". Thanks. Note how short the > software list is compared with the size of the computer vision lterature. > > Why does so little software exists for the more advanced parts of computer > vision? I feel this is mostly because academic researchers seldom publish > their software. In some cases (for example, face recognition software) > there are financial motives. In most cases. I suspect that there is no > pressure on the researchers from journals or department chairmen to publish > the software. So they avoid the work of making their software presentable > by not releasing it. The result are many unreproduced experiments and slow > transitions of new algorithms out of academia. This is certainly true. I know from experience that often you simply cannot afford to design and maintain a software package after you came up with something new and published it. So a lot of things never leave the laboratory simply because it is hard to do properly. 
I hope that having a system around like numarray with packages will help. > A good computer vision system > Has an easy to use and widely used scripting language. > Python > Has powerful array processing capabilities. > numarray, nd_image > Wraps a variety of other computer vision systems. The wrapping process > should be straightforward. > SWIG, Pyrex, Psyco, ..., and the Python API. > Provides a uniform interface to its components. > Is used by many people. I intend to develop nd_image further as a basic component for multidimensional image analysis. It would be great if it would get picked up to be part of a system like to propose. Maybe in the future SciPy could play that role. What I would like to hear from people that use this type of software is what kind of basic operations you would like to see become part of nd_image. That will help me to further develop the package. Contributed code is obviously also welcome. Peter -- Dr. Peter J. Verveer Cell Biology and Cell Biophysics Programme European Molecular Biology Laboratory Meyerhofstrasse 1 D-69117 Heidelberg Germany Tel. : +49 6221 387245 Fax : +49 6221 387306 From edcjones at erols.com Mon Jan 5 16:25:01 2004 From: edcjones at erols.com (Edward C. Jones) Date: Mon Jan 5 16:25:01 2004 Subject: [Numpy-discussion] Update for IM. a small image processing system In-Reply-To: <200401051243.50605.verveer@embl-heidelberg.de> References: <3FF624C5.7010400@erols.com> <200401051243.50605.verveer@embl-heidelberg.de> Message-ID: <3FF9FF44.8030606@erols.com> Peter Verveer wrote: >On Saturday 03 January 2004 03:11, Edward C. Jones wrote: > > >> IM >> >>I have uploaded a new version of my small image processing system IM to >>"http://members.tripod.com/~edcjones/IM-01.01.04.tar.gz". Most of the code >>in IM (pronounced "I'm") is inferior to "nd_image" so I will eventually >>convert it all to "nd_image". >> >> > >I had a look and I guess that indeed you could use the nd_image package for >some low level stuff (I am the author of nd_image). nd_image is however also >still being developed and I am looking for directions to further work on. I >wondered if there is anything you would like to see in there? > > Thanks for your response. I have put a slightly revised version of IM on my web page "http://members.tripod.com/~edcjones/". The new version includes functions, written in C, for slicing arrays. A cople of things I would like to see: The ability to read and write a variety of image formats. ImageMagick has a good set. All of ImageMagick should be wrapped. The Canny edge operator along with code for generating polygonal approximations to edges. See OpenCV. Do you have some examples of algorithms for multi-dimensional images that you think should be put in nd_image? Thanks, Ed Jones From verveer at embl-heidelberg.de Tue Jan 6 04:31:00 2004 From: verveer at embl-heidelberg.de (Peter Verveer) Date: Tue Jan 6 04:31:00 2004 Subject: [Numpy-discussion] Update for IM. a small image processing system In-Reply-To: <3FF9FF44.8030606@erols.com> References: <3FF624C5.7010400@erols.com> <200401051243.50605.verveer@embl-heidelberg.de> <3FF9FF44.8030606@erols.com> Message-ID: <200401061330.18932.verveer@embl-heidelberg.de> Hi Ed, > A cople of things I would like to see: > > The ability to read and write a variety of image formats. That is of course important. But in my view really a separate issue from developing a library of analysis routines. The latter just have to operate on numarray arrays and need not to worry about how the data gets there. 
Of course you need to get your data in numarray. PIL seems to do a good job with images, except for 16bit tiffs which causes me quiet some problems. Anybody know a good solution for getting 16bit tiffs into numarray? >ImageMagick > has a good set. All of ImageMagick should be wrapped. Isn't there already a python interface to ImageMagick? > The Canny edge operator along with code for generating polygonal > approximations to edges. See OpenCV. Canny I will likely implement at some point. Polygonal approximations to edges can be done in many ways I guess. I would need to find some reasonable method in the literature to do that. Suggestions are welcome. > Do you have some examples of algorithms for multi-dimensional images > that you think should be put in nd_image? At the moment I have only been looking at general basic image processing operations which normally generalize well to multiple dimensions. I will continue to do that. There are also somewhat higher level operations that I currently have not included. For instance, I implemented a sub-pixel shift estimator which I need for my work. That would be an example of a routine that is completely written in python using numarray and nd_image routines and does not need any C. This could be useful for others, but I am not sure if it belongs in a low-level library. Maybe we need some repository for that sort of python applications. Cheers, Peter From oliphant at ee.byu.edu Tue Jan 6 07:32:02 2004 From: oliphant at ee.byu.edu (Travis E. Oliphant) Date: Tue Jan 6 07:32:02 2004 Subject: [Numpy-discussion] Update for IM. a small image processing system In-Reply-To: <3FF9FF44.8030606@erols.com> References: <3FF624C5.7010400@erols.com> <200401051243.50605.verveer@embl-heidelberg.de> <3FF9FF44.8030606@erols.com> Message-ID: <3FFAD50E.6050500@ee.byu.edu> Edward C. Jones wrote: > Peter Verveer wrote: > >> On Saturday 03 January 2004 03:11, Edward C. Jones wrote: >> >> >>> IM >>> >>> I have uploaded a new version of my small image processing system IM to >>> "http://members.tripod.com/~edcjones/IM-01.01.04.tar.gz". Most of the >>> code >>> in IM (pronounced "I'm") is inferior to "nd_image" so I will eventually >>> convert it all to "nd_image". >>> >> >> >> I had a look and I guess that indeed you could use the nd_image >> package for some low level stuff (I am the author of nd_image). >> nd_image is however also still being developed and I am looking for >> directions to further work on. I wondered if there is anything you >> would like to see in there? >> > Thanks for your response. I have put a slightly revised version of IM on > my web page "http://members.tripod.com/~edcjones/". The new version > includes functions, written in C, for slicing arrays. > > A cople of things I would like to see: > > The ability to read and write a variety of image formats. ImageMagick > has a good set. All of ImageMagick should be wrapped. > I have wrappers for ImageMagick done (for Numeric). See pylab.sourceforge.net -Travis From perry at stsci.edu Tue Jan 6 07:49:00 2004 From: perry at stsci.edu (Perry Greenfield) Date: Tue Jan 6 07:49:00 2004 Subject: [Numpy-discussion] Update for IM. a small image processing system In-Reply-To: <200401061330.18932.verveer@embl-heidelberg.de> Message-ID: > Hi Ed, > > A cople of things I would like to see: > > > > The ability to read and write a variety of image formats. > > That is of course important. But in my view really a separate issue from > developing a library of analysis routines. 
The latter just have > to operate on > numarray arrays and need not to worry about how the data gets there. Of > course you need to get your data in numarray. PIL seems to do a > good job with > images, except for 16bit tiffs which causes me quiet some > problems. Anybody > know a good solution for getting 16bit tiffs into numarray? > I'd agree that support for image formats should be decoupled from processing functions > >ImageMagick > > has a good set. All of ImageMagick should be wrapped. > > Isn't there already a python interface to ImageMagick? > > Perhaps we should look at how much work it would be to adopt Travis's wrapped version for numarray. It may be fairly simple to do if his version uses the the more common api calls. Perry From edcjones at erols.com Tue Jan 6 08:54:01 2004 From: edcjones at erols.com (Edward C. Jones) Date: Tue Jan 6 08:54:01 2004 Subject: [Numpy-discussion] Update for IM. a small image processing system In-Reply-To: References: Message-ID: <3FFAE713.2020506@erols.com> Perry Greenfield wrote: >>Hi Ed, >> >> >>>A cople of things I would like to see: >>> >>>The ability to read and write a variety of image formats. >>> >>> >>That is of course important. But in my view really a separate issue from >>developing a library of analysis routines. The latter just have >>to operate on >>numarray arrays and need not to worry about how the data gets there. Of >>course you need to get your data in numarray. PIL seems to do a >>good job with >>images, except for 16bit tiffs which causes me quiet some >>problems. Anybody >>know a good solution for getting 16bit tiffs into numarray? >> >> >> >I'd agree that support for image formats should be decoupled >from processing functions > > > >>>ImageMagick >>>has a good set. All of ImageMagick should be wrapped. >>> >>> >>Isn't there already a python interface to ImageMagick? >> >> >> >> >Perhaps we should look at how much work it would be to adopt >Travis's wrapped version for numarray. It may be fairly simple >to do if his version uses the the more common api calls. > >Perry > > I have checked this out a bit. All the Numeric function calls are among the ones that numarray emulates. All but one of them seem to be properly DECREFed. The exception is in "imageobject.c", line 973, where "bitobj" is created. Also: ImageMagick was forked, producing GraphicsMagick. The two are very similar. Which is better to use? Ed Jones From rays at san.rr.com Tue Jan 6 22:52:01 2004 From: rays at san.rr.com (RJS) Date: Tue Jan 6 22:52:01 2004 Subject: [Numpy-discussion] Update for IM. a small image processing system Message-ID: <5.2.1.1.2.20040106220631.00aee968@pop-server.san.rr.com> > I have uploaded a new version of my small image processing system IM to > "http://members.tripod.com/~edcjones/IM-01.01.04.tar.gz". Most of the code > in IM (pronounced "I'm") is inferior to "nd_image" so I will eventually > convert it all to "nd_image". ... > nd_image is however also still being developed and I am looking for directions to > further work on. I wondered if there is anything you would like to see in there? I have been working with Pythonmagic and numarray for a particular astronomy project/technique, and IM has a few things I might use; nd_image also has some interesting functions as well. I want to align and specially stack 8-bit grayscale images from a FITS cube, or BMP set, currently. So, my suggestions (hint, hint) are: 1. A method to shift an array to efficiently give the best alignment with another. 
My brute force shifting and subtracting from the main image is slow... Most programs I have seen align a selected sub-image, then shift the whole image/array (without rotation, although that would be desirable)

My _main_ objective is to stack progressively-longer-exposure 8-bit images into 16-bits, with the clipped pixels of longer exposures ignored in the summing process. The value of each pixel must be weighted inversely proportionately to its exposure length (so shorter exposures "fill in" the clipped areas of the long exposures).

So:
2. A fast method(ology) to do weighted sums of 2D arrays with a mask available for each array.

I really do commend Peter and Edward for their contribution!

By the way, if you do wxPython and haven't tried Boa Constructor, you might like it. I have been using the CVS version (now .2.8) for a few months, and it's working nicely.

Ray Schumacher
http://rjs.org/astro/1004x/

From verveer at embl-heidelberg.de Wed Jan 7 04:45:01 2004
From: verveer at embl-heidelberg.de (Peter Verveer)
Date: Wed Jan 7 04:45:01 2004
Subject: [Numpy-discussion] Update for IM. a small image processing system
In-Reply-To: <5.2.1.1.2.20040106220631.00aee968@pop-server.san.rr.com>
References: <5.2.1.1.2.20040106220631.00aee968@pop-server.san.rr.com>
Message-ID: <200401071344.24149.verveer@embl-heidelberg.de>

On Wednesday 07 January 2004 07:50, RJS wrote:
> > I have uploaded a new version of my small image processing system IM to
> > "http://members.tripod.com/~edcjones/IM-01.01.04.tar.gz". Most of the
> > code in IM (pronounced "I'm") is inferior to "nd_image" so I will
> > eventually convert it all to "nd_image".
>
> ...
>
> > nd_image is however also still being developed and I am looking for
> > directions to further work on. I wondered if there is anything you would
> > like to see in there?
>
> I have been working with Pythonmagic and numarray for a particular
> astronomy project/technique, and IM has a few things I might use; nd_image
> also has some interesting functions as well.
>
> I want to align and specially stack 8-bit grayscale images from a FITS
> cube, or BMP set, currently. So, my suggestions (hint, hint) are:
> 1. A method to shift an array to efficiently give the best alignment with
> another. My brute force shifting and subtracting from the main image is
> slow... Most programs I have seen align a selected sub-image, then shift
> the whole image/array (without rotation, although that would be desirable)

If I understand you well, you essentially want to estimate a shift between two images. I have some code that can do that. I do not intend to include that in nd_image for now, but I can send you the code.

> My _main_ objective is to stack progressively-longer-exposure 8-bit images
> into 16-bits, with the clipped pixels of longer exposures ignored in the
> summing process. The value of each pixel must be weighted inversely
> proportionately to its exposure length (so shorter exposures "fill in" the
> clipped areas of the long exposures).
>
> So:
> 2. A fast method(ology) to do weighted sums of 2D arrays with a mask
> available for each array.

I think this can be achieved relatively easily with standard numarray operations.

Cheers, Peter
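One way to flesh that out, as a hedged sketch rather than anything from IM or nd_image (the function and variable names below are illustrative): each exposure contributes only its unclipped pixels, scaled by a per-exposure weight, and the running sum is kept in a wider float type so nothing overflows.

    import numarray

    def weighted_clipped_sum(frames, weights, clip=255):
        """Sum 2D exposures, ignoring clipped pixels, weighting each frame."""
        acc = numarray.zeros(frames[0].shape, numarray.Float64)
        for frame, weight in zip(frames, weights):
            unclipped = frame < clip                        # mask of usable pixels
            acc = acc + weight * numarray.where(unclipped, frame, 0)
        return acc

With weights proportional to 1/exposure_time this matches the "shorter exposures fill in the clipped areas" scheme described above.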
From haase at msg.ucsf.edu Wed Jan 7 11:44:01 2004
From: haase at msg.ucsf.edu (Sebastian Haase)
Date: Wed Jan 7 11:44:01 2004
Subject: [Numpy-discussion] Update for IM. a small image processing system
References: <3FF624C5.7010400@erols.com> <200401051243.50605.verveer@embl-heidelberg.de> <3FF9FF44.8030606@erols.com> <200401061330.18932.verveer@embl-heidelberg.de>
Message-ID: <024a01c3d556$8075bf90$421ee6a9@rodan>

> Hi Ed,
> > A cople of things I would like to see:
> >
> > The ability to read and write a variety of image formats.
>
> That is of course important. But in my view really a separate issue from
> developing a library of analysis routines. The latter just have to operate on
> numarray arrays and need not to worry about how the data gets there. Of
> course you need to get your data in numarray. PIL seems to do a good job with
> images, except for 16bit tiffs which causes me quiet some problems. Anybody
> know a good solution for getting 16bit tiffs into numarray?
>
Hi Peter,
When did you try that? My info is that the PIL released within the last few months, version 1.1.4, does the job. This is from http://effbot.org/zone/pil-changes-114.htm:

(1.1.4a2 released)

+ Improved support for 16-bit unsigned integer images (mode "I;16").
  This includes TIFF reader support, and support for "getextrema"
  and "point" (from Klamer Shutte).

(Ooops: PIL 1.1.4 final was released on May 10, 2003. (time flies ...)

Regards,
Sebastian

From rays at san.rr.com Wed Jan 7 22:35:04 2004
From: rays at san.rr.com (RJS)
Date: Wed Jan 7 22:35:04 2004
Subject: [Numpy-discussion] Update for IM. a small image processing system
Message-ID: <5.2.1.1.2.20040107221315.023e4e90@pop-server.san.rr.com>

Hello Peter,

> On Wednesday 07 January 2004 07:50, RJS wrote:
> > Most programs I have seen align a selected sub-image, then shift
> > the whole image/array (without rotation, although that would be desirable)
> If I understand you well, you essentially want to estimate a shift between two
> images. I have some code that can do that. I do not intend to include that in
> nd_image for now, but I can send you the code.

Yes, please. I'm sure that it's better/faster than my PIL or PythonMagick efforts. I don't know about machine vision etc, but shift is indispensable for video astronomy.

> > 2. A fast method(ology) to do weighted sums of 2D arrays with a mask
> > available for each array.
> I think this can be achieved relatively easily with standard numarray
> operations.

Yes, it is straight-forward, in a way, but I'm always scouring the net for C and Python algorithms. Very (most?) often they're better than my own. This app is really a proof-of-concept; hopefully others will incorporate the "clipped image stacking" into their already fine astro apps. The problem is that the standard methods - median, mean, or summing - all suffer when long images in unequal exposure stacks have large clipped regions.

Thanks,
Ray
http://rjs.org

From verveer at embl-heidelberg.de Thu Jan 8 01:42:01 2004
From: verveer at embl-heidelberg.de (Peter Verveer)
Date: Thu Jan 8 01:42:01 2004
Subject: [Numpy-discussion] Update for IM. a small image processing system
In-Reply-To: <024a01c3d556$8075bf90$421ee6a9@rodan>
References: <3FF624C5.7010400@erols.com> <200401061330.18932.verveer@embl-heidelberg.de> <024a01c3d556$8075bf90$421ee6a9@rodan>
Message-ID: <200401081041.13964.verveer@embl-heidelberg.de>

Hi Sebastian,

I use the 1.1.4 final version. I do however, have images that are not read by PIL ('cannot identify image file'). I think these files are okay, since I can read them in a scientific imaging program. So maybe the 16bit support in PIL is not complete.
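For what it's worth, a minimal untested sketch of pulling one of the files that PIL does accept into numarray through the "I;16" mode (the file name is just an example, and the raw byte order may need checking on big-endian machines):

    import Image      # PIL 1.1.4
    import numarray

    im = Image.open('frame0001.tif')    # example file name
    assert im.mode == 'I;16', im.mode   # 16-bit unsigned greyscale TIFF
    a = numarray.fromstring(im.tostring(), numarray.UInt16)
    a = numarray.reshape(a, (im.size[1], im.size[0]))   # PIL size is (width, height)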
Peter On Wednesday 07 January 2004 20:43, Sebastian Haase wrote: > > know a good solution for getting 16bit tiffs into numarray? > > Hi Peter, > When did you try that ? My info is the PIL released within the last few > month version 1.1.4 which does the job. > this is from http://effbot.org/zone/pil-changes-114.htm: > > (1.1.4a2 released) > > + Improved support for 16-bit unsigned integer images (mode "I;16"). > This includes TIFF reader support, and support for "getextrema" > and "point" (from Klamer Shutte). > > (Ooops: PIL 1.1.4 final was released on May 10, 2003. (time flies ...) From nadavh at visionsense.com Thu Jan 8 04:05:00 2004 From: nadavh at visionsense.com (Nadav Horesh) Date: Thu Jan 8 04:05:00 2004 Subject: [Numpy-discussion] Update for IM. a small image processing system Message-ID: <07C6A61102C94148B8104D42DE95F7E8066942@exchange2k.envision.co.il> I am producing and reading 16 bit tiff files using PIL. These files however can not be displayed by most image processing programs (gimp does fine). Nadav -----Original Message----- From: Peter Verveer [mailto:verveer at embl-heidelberg.de] Sent: Thu 08-Jan-04 11:41 To: Sebastian Haase; numpy-discussion at lists.sourceforge.net Cc: Subject: Re: [Numpy-discussion] Update for IM. a small image processing system Hi Sebastian, I use the 1.1.4 final version. I do however, have images that are not read by PIL ('cannot identify image file'). I think these files are okay, since I can read them in a scientific imaging program. So maybe the 16bit support in PIL is not complete. Peter On Wednesday 07 January 2004 20:43, Sebastian Haase wrote: > > know a good solution for getting 16bit tiffs into numarray? > > Hi Peter, > When did you try that ? My info is the PIL released within the last few > month version 1.1.4 which does the job. > this is from http://effbot.org/zone/pil-changes-114.htm: > > (1.1.4a2 released) > > + Improved support for 16-bit unsigned integer images (mode "I;16"). > This includes TIFF reader support, and support for "getextrema" > and "point" (from Klamer Shutte). > > (Ooops: PIL 1.1.4 final was released on May 10, 2003. (time flies ...) ------------------------------------------------------- This SF.net email is sponsored by: Perforce Software. Perforce is the Fast Software Configuration Management System offering advanced branching capabilities and atomic changes on 50+ platforms. Free Eval! http://www.perforce.com/perforce/loadprog.html _______________________________________________ Numpy-discussion mailing list Numpy-discussion at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/numpy-discussion From tim.hochberg at ieee.org Thu Jan 8 15:38:02 2004 From: tim.hochberg at ieee.org (Tim Hochberg) Date: Thu Jan 8 15:38:02 2004 Subject: [Numpy-discussion] License.txt inclussion breaks McMillan's Installer. Message-ID: <3FFDE9A2.4040806@ieee.org> The way LICENSE.txt is included in the __init__ file for numarray breaks McMillan's installer (and probably py2exe as well, although I haven't checked that). The offending line is: __LICENSE__ = open(_os.path.join(__path__[0],"LICENSE.txt")).read() The first problem is that the installer doesn't pick up the dependancy on LICENSE.txt. That's not a huge deal as it's relatively simple to add that to the list of dependancy's by hand. More serious is that the __path__ variable is bogus in an installer archive so that the reading of the license file fails, even if it's present. 
One solution is just include the license text directly instead of reading it from a separate file. This is simple and the license is short enough that this shouldn't clutter things too much. It's not like there's all that much in the __init__ file anyway <0.5 wink>. A second solution is to wrap the above incantation in try, except; however, this doesn't guarantee that the license file is included. A third solution is to come up with a different incantation that works for installer. I've looked at this briefly and it looks a little messy. Nevertheless, I'll come up with something that works if this is deemed the preferred solution. Someone else will have to figure out what works with py2exe. [ If the above makes no sense to those of you unfamilar with McMillan's installer, I apologize -- ask away and I'll try to clarify] Regards -tim From jmiller at stsci.edu Fri Jan 9 05:52:03 2004 From: jmiller at stsci.edu (Todd Miller) Date: Fri Jan 9 05:52:03 2004 Subject: [Numpy-discussion] License.txt inclussion breaks McMillan's Installer. In-Reply-To: <3FFDE9A2.4040806@ieee.org> References: <3FFDE9A2.4040806@ieee.org> Message-ID: <1073656204.10007.23.camel@halloween.stsci.edu> On Thu, 2004-01-08 at 18:37, Tim Hochberg wrote: > > The way LICENSE.txt is included in the __init__ file for numarray breaks > McMillan's installer (and probably py2exe as well, although I haven't > checked that). The offending line is: > > __LICENSE__ = open(_os.path.join(__path__[0],"LICENSE.txt")).read() > > > The first problem is that the installer doesn't pick up the dependancy > on LICENSE.txt. That's not a huge deal as it's relatively simple to add > that to the list of dependancy's by hand. > > More serious is that the __path__ variable is bogus in an installer > archive so that the reading of the license file fails, even if it's present. > > One solution is just include the license text directly instead of > reading it from a separate file. This is simple and the license is short > enough that this shouldn't clutter things too much. It's not like > there's all that much in the __init__ file anyway <0.5 wink>. I like this solution the best from the perspective of simplicity and fool-proof-ness. I had considered it before but rejected it as leading to duplication of the license. Now I realize I can just "put a symbolic link" in LICENSE.txt and move the actual text of the license to __init__.py as you suggest. This is fixed in CVS now. Todd > A second solution is to wrap the above incantation in try, except; > however, this doesn't guarantee that the license file is included. > > A third solution is to come up with a different incantation that works > for installer. I've looked at this briefly and it looks a little messy. > Nevertheless, I'll come up with something that works if this is deemed > the preferred solution. Someone else will have to figure out what works > with py2exe. > > [ If the above makes no sense to those of you unfamilar with McMillan's > installer, I apologize -- ask away and I'll try to clarify] > > Regards > > -tim > > > > > > ------------------------------------------------------- > This SF.net email is sponsored by: Perforce Software. > Perforce is the Fast Software Configuration Management System offering > advanced branching capabilities and atomic changes on 50+ platforms. > Free Eval! 
http://www.perforce.com/perforce/loadprog.html > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion -- Todd Miller Space Telescope Science Institute 3700 San Martin Drive Baltimore MD, 21030 (410) 338 - 4576 From cookedm at physics.mcmaster.ca Fri Jan 9 06:51:00 2004 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Fri Jan 9 06:51:00 2004 Subject: [Numpy-discussion] License.txt inclussion breaks McMillan's Installer. In-Reply-To: <1073656204.10007.23.camel@halloween.stsci.edu> References: <3FFDE9A2.4040806@ieee.org> <1073656204.10007.23.camel@halloween.stsci.edu> Message-ID: <20040109144917.GA3957@arbutus.physics.mcmaster.ca> On Fri, Jan 09, 2004 at 08:50:04AM -0500, Todd Miller wrote: > On Thu, 2004-01-08 at 18:37, Tim Hochberg wrote: > > > > The way LICENSE.txt is included in the __init__ file for numarray breaks > > McMillan's installer (and probably py2exe as well, although I haven't > > checked that). The offending line is: > > > > __LICENSE__ = open(_os.path.join(__path__[0],"LICENSE.txt")).read() > > > > > > The first problem is that the installer doesn't pick up the dependancy > > on LICENSE.txt. That's not a huge deal as it's relatively simple to add > > that to the list of dependancy's by hand. > > > > More serious is that the __path__ variable is bogus in an installer > > archive so that the reading of the license file fails, even if it's present. > > > > One solution is just include the license text directly instead of > > reading it from a separate file. This is simple and the license is short > > enough that this shouldn't clutter things too much. It's not like > > there's all that much in the __init__ file anyway <0.5 wink>. > > I like this solution the best from the perspective of simplicity and > fool-proof-ness. I had considered it before but rejected it as leading > to duplication of the license. Now I realize I can just "put a symbolic > link" in LICENSE.txt and move the actual text of the license to > __init__.py as you suggest. > > This is fixed in CVS now. > > Todd I have to admit that I read the problem above and thought, WHAT? numarray already takes longer to import than Numeric; you mean some of that time it's reading in a license file I'll never look at? Compare: $ time python -c 'import numarray' real 0m0.230s user 0m0.230s sys 0m0.000s $ time python -c 'import Numeric' real 0m0.076s user 0m0.050s sys 0m0.020s [final results after running each a couple times to get it in cache] numarray takes 3 times longer to import than Numeric. I know, it's only 0.154 s difference, but that's noticeable for small scripts. [Ok, so I just tested the change to reading the license, and I don't see any change in import times :-)] If I had any time, I'd look at making it import faster. Some playing around with the hotshot profiler shows that most of the time is spent in numarray.ufunc._makeCUFuncDict. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. 
Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From childliteracyraffle204 at yahoo.com Fri Jan 9 14:47:00 2004 From: childliteracyraffle204 at yahoo.com (childliteracyraffle204 at yahoo.com) Date: Fri Jan 9 14:47:00 2004 Subject: [Numpy-discussion] Car Raffle Donate to Charity Cadillac Raffle Message-ID: <200401091554210788.001199B2@127.0.0.1> Car Raffle Donate to Charity Cadillac Raffle http://www.ChildLiteracy.org/ Current Raffles 2003 BLACK CADILLAC DEVILLE DTS $100 per ticket Mitsubishi's 2002 Montero Sport LS $35 per ticket Pioneer PDP-4330HD 43" Plasma TV $20 per ticket http://www.ChildLiteracy.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From jochen at fhi-berlin.mpg.de Tue Jan 13 23:59:00 2004 From: jochen at fhi-berlin.mpg.de (=?iso-8859-1?q?Jochen_K=FCpper?=) Date: Tue Jan 13 23:59:00 2004 Subject: [Numpy-discussion] numarray setup.py Message-ID: Sometime after v0.8 setup.py was changed to include some 'classifiers'. Doesn't work for me (python 2.2.2 on FreeBSD): ,----[python setup.py install --home=~/install/freebsd-x86] | Using EXTRA_COMPILE_ARGS = [] | error in setup script: invalid distribution option 'classifiers' `---- Greetings, Jochen -- Einigkeit und Recht und Freiheit http://www.Jochen-Kuepper.de Libert?, ?galit?, Fraternit? GnuPG key: CC1B0B4D (Part 3 you find in my messages before fall 2003.) From jmiller at stsci.edu Wed Jan 14 02:52:01 2004 From: jmiller at stsci.edu (Todd Miller) Date: Wed Jan 14 02:52:01 2004 Subject: [Numpy-discussion] numarray setup.py In-Reply-To: References: Message-ID: <1074077466.3718.26.camel@localhost.localdomain> On Wed, 2004-01-14 at 02:58, Jochen K?pper wrote: > Sometime after v0.8 setup.py was changed to include some > 'classifiers'. Doesn't work for me (python 2.2.2 on FreeBSD): I removed the classifiers for Pythons < 2.3. This is (theoretically) fixed in CVS and tested against 2.2.3. Let me know if there's still a problem with 2.2.2. Todd > > ,----[python setup.py install --home=~/install/freebsd-x86] > | Using EXTRA_COMPILE_ARGS = [] > | error in setup script: invalid distribution option 'classifiers' > `---- > > Greetings, > Jochen -- Todd Miller From jochen at fhi-berlin.mpg.de Wed Jan 14 05:15:02 2004 From: jochen at fhi-berlin.mpg.de (=?iso-8859-1?q?Jochen_K=FCpper?=) Date: Wed Jan 14 05:15:02 2004 Subject: [Numpy-discussion] numarray setup.py In-Reply-To: <1074077466.3718.26.camel@localhost.localdomain> (Todd Miller's message of "Wed, 14 Jan 2004 05:51:06 -0500") References: <1074077466.3718.26.camel@localhost.localdomain> Message-ID: On Wed, 14 Jan 2004 05:51:06 -0500 Todd Miller wrote: Todd> On Wed, 2004-01-14 at 02:58, Jochen K?pper wrote: >> Sometime after v0.8 setup.py was changed to include some >> 'classifiers'. Doesn't work for me (python 2.2.2 on FreeBSD): Todd> I removed the classifiers for Pythons < 2.3. This is Todd> (theoretically) fixed in CVS and tested against 2.2.3. Let me Todd> know if there's still a problem with 2.2.2. Seems to work now. Greetings, Jochen -- Einigkeit und Recht und Freiheit http://www.Jochen-Kuepper.de Libert?, ?galit?, Fraternit? GnuPG key: CC1B0B4D (Part 3 you find in my messages before fall 2003.) From cjw at sympatico.ca Thu Jan 15 06:31:04 2004 From: cjw at sympatico.ca (Colin J. 
Williams) Date: Thu Jan 15 06:31:04 2004 Subject: [Numpy-discussion] _clone, copy, view Message-ID: <4006A420.60500@sympatico.ca> It would help if someone could describe the intended functional differences between _clone, copy and view in numarray. Colin W. From jmiller at stsci.edu Thu Jan 15 07:07:06 2004 From: jmiller at stsci.edu (Todd Miller) Date: Thu Jan 15 07:07:06 2004 Subject: [Numpy-discussion] _clone, copy, view In-Reply-To: <4006A420.60500@sympatico.ca> References: <4006A420.60500@sympatico.ca> Message-ID: <1074179078.2009.32.camel@halloween.stsci.edu> On Thu, 2004-01-15 at 09:30, Colin J. Williams wrote: > It would help if someone could describe the intended functional > differences between _clone, copy and view in numarray. a.copy() returns a new array object with a copy of a's data. a's dictionary attributes are currently aliased, not deep copied. That may not be the way it should be. The copy is assumed to be a C_ARRAY, meaning it is aligned, not byteswapped, and contiguous. Thus, copy() is sometimes used as a cleanup operation. a.view() returns a shallow copy of a. Most importantly, the shallow copy aliases the same data as a. Views are used to look at the same data buffer in some new way, perhaps with a different shape, or perhaps as a subset of the original data. A view has the same special properties as the original, e.g. if the original is byteswapped, so is the view. a._clone() is an implementation detail of the generic take() method. The generic take() method is overridden for numerical arrays, but utilized for object arrays. clone()'s purpose is to return a new array of the same type as 'a' but with a different shape and total number of elements. Thus, clone() is useful for creating result arrays for take based on the input array being taken from. > > Colin W. > -- Todd Miller Space Telescope Science Institute 3700 San Martin Drive Baltimore MD, 21030 (410) 338 - 4576 From edcjones at erols.com Fri Jan 16 07:39:03 2004 From: edcjones at erols.com (Edward C. Jones) Date: Fri Jan 16 07:39:03 2004 Subject: [Numpy-discussion] Problem with Sourceforge mailing list archives? Message-ID: <40080487.4010309@erols.com> The "Numpy-discussion Archives" at "https://lists.sourceforge.net/lists/listinfo/numpy-discussion" are down or missing. Is there a problem? From edcjones at erols.com Fri Jan 16 07:52:00 2004 From: edcjones at erols.com (Edward C. Jones) Date: Fri Jan 16 07:52:00 2004 Subject: [Numpy-discussion] Tabs in numarray code Message-ID: <40080780.3040904@erols.com> What are the policies about tab characters in numarray Python and C code? What are the policies about indentation in numarray Python and C code? The following small program found a bunch of tabs in numarray code: -------- #! /usr/local/bin/python import os topdir = '/usr/local/src/numarray-0.8/' for dirpath, dirnames, filenames in os.walk(topdir): for name in filenames: if name.endswith('.py'): fullname = os.path.join(dirpath, name) lines = file(fullname, 'r').read().splitlines() for i, line in enumerate(lines): if '\t' in line: print fullname[len(topdir):], i+1, line From jmiller at stsci.edu Fri Jan 16 08:12:02 2004 From: jmiller at stsci.edu (Todd Miller) Date: Fri Jan 16 08:12:02 2004 Subject: [Numpy-discussion] Tabs in numarray code In-Reply-To: <40080780.3040904@erols.com> References: <40080780.3040904@erols.com> Message-ID: <1074269385.3715.19.camel@halloween.stsci.edu> On Fri, 2004-01-16 at 10:47, Edward C. Jones wrote: > What are the policies about tab characters in numarray Python and C > code? 
What are the policies about indentation in numarray Python and C code? The policy is "no tabs". Indentation in Python and C is 5 spaces per level. Enforcement of the policies is obviously currently lacking. Todd > > The following small program found a bunch of tabs in numarray code: > -------- > #! /usr/local/bin/python > > import os > > topdir = '/usr/local/src/numarray-0.8/' > for dirpath, dirnames, filenames in os.walk(topdir): > for name in filenames: > if name.endswith('.py'): > fullname = os.path.join(dirpath, name) > lines = file(fullname, 'r').read().splitlines() > for i, line in enumerate(lines): > if '\t' in line: > print fullname[len(topdir):], i+1, line > > > > > ------------------------------------------------------- > The SF.Net email is sponsored by EclipseCon 2004 > Premiere Conference on Open Tools Development and Integration > See the breadth of Eclipse activity. February 3-5 in Anaheim, CA. > http://www.eclipsecon.org/osdn > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion -- Todd Miller Space Telescope Science Institute 3700 San Martin Drive Baltimore MD, 21030 (410) 338 - 4576 From jmiller at stsci.edu Fri Jan 16 08:15:02 2004 From: jmiller at stsci.edu (Todd Miller) Date: Fri Jan 16 08:15:02 2004 Subject: [Numpy-discussion] Tabs in numarray code In-Reply-To: <1074269385.3715.19.camel@halloween.stsci.edu> References: <40080780.3040904@erols.com> <1074269385.3715.19.camel@halloween.stsci.edu> Message-ID: <1074269561.4020.21.camel@halloween.stsci.edu> On Fri, 2004-01-16 at 11:09, Todd Miller wrote: > On Fri, 2004-01-16 at 10:47, Edward C. Jones wrote: > > What are the policies about tab characters in numarray Python and C > > code? What are the policies about indentation in numarray Python and C code? > > The policy is "no tabs". > Indentation in Python and C is 5 spaces per level. Actually, I meant *4* spaces, and enforcement is somewhat worse than I thought. > Enforcement of the policies is obviously currently lacking. > > Todd > > > > > The following small program found a bunch of tabs in numarray code: > > -------- > > #! /usr/local/bin/python > > > > import os > > > > topdir = '/usr/local/src/numarray-0.8/' > > for dirpath, dirnames, filenames in os.walk(topdir): > > for name in filenames: > > if name.endswith('.py'): > > fullname = os.path.join(dirpath, name) > > lines = file(fullname, 'r').read().splitlines() > > for i, line in enumerate(lines): > > if '\t' in line: > > print fullname[len(topdir):], i+1, line > > > > > > > > > > ------------------------------------------------------- > > The SF.Net email is sponsored by EclipseCon 2004 > > Premiere Conference on Open Tools Development and Integration > > See the breadth of Eclipse activity. February 3-5 in Anaheim, CA. > > http://www.eclipsecon.org/osdn > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at lists.sourceforge.net > > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > -- > Todd Miller > Space Telescope Science Institute > 3700 San Martin Drive > Baltimore MD, 21030 > (410) 338 - 4576 > > > > ------------------------------------------------------- > The SF.Net email is sponsored by EclipseCon 2004 > Premiere Conference on Open Tools Development and Integration > See the breadth of Eclipse activity. February 3-5 in Anaheim, CA. 
> http://www.eclipsecon.org/osdn > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion -- Todd Miller Space Telescope Science Institute 3700 San Martin Drive Baltimore MD, 21030 (410) 338 - 4576 From haase at msg.ucsf.edu Fri Jan 16 16:02:01 2004 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Fri Jan 16 16:02:01 2004 Subject: [Numpy-discussion] numarray.records - get/set item References: <030401c3bac6$96382a20$421ee6a9@rodan> Message-ID: <01b901c3dc8d$051b71d0$421ee6a9@rodan> Hi everybody, I would like to check if there has been made a decision on this ? I'm planning to use record arrays to access image data header-information and having an attribute 'f' like suggested is still my favorite way. Is anyone besides me using record arrays on memory-mapped buffers ? Thanks, Sebastian Haase ----- Original Message ----- From: "Sebastian Haase" To: Sent: Thursday, December 04, 2003 4:27 PM Subject: Fw: [Numpy-discussion] numarray.records - get/set item > My situation where I got onto this, is having one field named 'mmm' > ("MinMaxMean") being an 3 element array. > Now, to assign the values first I tried: > self.hdrArray = makeHdrArray(self.h) #this makes the record array > self.hdr = self.hdrArray[0].field #this is my shortcut to the > bound member function > # it essentially is a solution (hack) for the getitem part > # but regarding setitem I had to learn that "assigning to a function" is > illigal in Python - as opposed to C++ > #so to do assignment I need to do: > self.hdr('mmm')[0], self.hdr('mmm')[1], self.hdr('mmm')[2] = (mi,ma,av) > > now that I'm looking at it, > self.hdrArray[0].setfield('mmm', (mi,ma,av)) > would probably be better... > > How about adding an attribute 'f' which could serve as a "proxy" to allow: > myRec.f.mmm = (mi,ma,av) > and maybe even additionally: > myRec.f['mmm'] = (mi,ma,av) > > Regards, > Sebastian > > > > ----- Original Message ----- > From: "Perry Greenfield" > To: "Sebastian Haase" ; > > Sent: Thursday, December 04, 2003 3:08 PM > Subject: RE: [Numpy-discussion] numarray.records - get/set item > > > > > Hi, > > > Is it maybe a good idea to add this to the definition of 'class Record' > > > > > > class Record: > > > """Class for one single row.""" > > > > > > def __getitem__(self, fieldName): > > > return self.array.field(fieldName)[self.row] > > > def __setitem__(self, fieldName, value): > > > self.array.field(fieldName)[self.row] = value > > > > > > I don't know about the implications if __delitem __ and so on are not > > > defined. > > > I just think it would look quite nice to say > > > myRecArr[0]['mmm'] = 'hallo' > > > as opposed to > > > myRecArr[0].setfield('mmm', 'hallo') > > > > > > Actually I would even like > > > myRecArr[0].mmm = 'hallo' > > > > > > This should be possible by defining __setattr__. > > > It would obviously only work for fieldnames that do not contain '.' or ' > ' > > > or ... > > > > > > Any comments ? > > > > > > > > We've had many internal discussions about doing this. The latter was > > considered a problem because of possible name collisions of field > > names with other attributes or methods. The former is not bothered > > by this problem, but we decided to be conservative on this and see > > how strong the need was. We are interested in other opinions. 
> > > > Perry > > > > ------------------------------------------------------- > This SF.net email is sponsored by: SF.net Giveback Program. > Does SourceForge.net help you be more productive? Does it > help you create better code? SHARE THE LOVE, and help us help > YOU! Click Here: http://sourceforge.net/donate/ > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From oliphant at ee.byu.edu Mon Jan 19 13:34:07 2004 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Mon Jan 19 13:34:07 2004 Subject: [Numpy-discussion] Status of Numeric Message-ID: <400C3EF3.8090005@ee.byu.edu> Numarray is making great progress and is quite usable for many purposes. An idea that was championed by some is that the Numeric code base would stay static and be replaced entirely by Numarray. However, Numeric is currently used in a large installed base. In particular SciPy uses Numeric as its core array. While no doubt numarray arrays will be supported in the future, the speed of the less bulky Numeric arrays and the typical case that we encounter in SciPy of many, small arrays will make it difficult for people to abandon Numeric entirely with it's comparatively light-weight arrays. In the development of SciPy we have encountered issues in Numeric that we feel need to be fixed. As this has become an important path to success of several projects (both commercial and open) it is absolutely necessary that this issues be addressed. The purpose of this email is to assess the attitude of the community regarding how these changes to Numeric should be accomplished. These are the two options we can see: * freeze old Numeric 23.x and make all changes to Numeric 24.x still keeping Numeric separate from SciPy * freeze old Numeric 23.x and subsume Numeric into SciPy essentially creating a new SciPy arrayobject that is fast and lightweight. Anybody wanting this new array object would get it by installing scipy_base. Numeric would never change in the future but the array in scipy_base would. It is not an option to wait for numarray to get fast enough as these issues need to be addressed now. Ultimately I think it will be a wise thing to have two implementations of arrays: one that is fast and lightweight optimized for many relatively small arrays, and another that is optimized for large-scale arrays. Eventually, the use of these two underlying implementations should be automatic and invisible to the user. A few of the particular changes we need to make to the Numeric arrayobject are: 1) change the coercion model to reflect Numarray's choice and eliminate the savespace crutch. 2) Add indexing capability to Numeric arrays (similar to Numarray's) 3) Improve the interaction between Numeric arrays and scalars. 4) Optimization: Again, these changes are going to be made to some form of the Numeric arrays. What I am really interested in knowing is the attitude of the community towards keeping Numeric around. If most of the community wants to see Numeric go away then we will be forced to bring the Numeric array under the SciPy code-base and own it there. Your feedback is welcome and appreciated. 
Sincerely, Travis Oliphant and other SciPy developers From perry at stsci.edu Mon Jan 19 14:14:05 2004 From: perry at stsci.edu (Perry Greenfield) Date: Mon Jan 19 14:14:05 2004 Subject: [Numpy-discussion] Status of Numeric In-Reply-To: <400C3EF3.8090005@ee.byu.edu> Message-ID: Travis Oliphant writes: > > Numarray is making great progress and is quite usable for many > purposes. An idea that was championed by some is that the Numeric code > base would stay static and be replaced entirely by Numarray. > > However, Numeric is currently used in a large installed base. In > particular SciPy uses Numeric as its core array. While no doubt > numarray arrays will be supported in the future, the speed of the less > bulky Numeric arrays and the typical case that we encounter in SciPy of > many, small arrays will make it difficult for people to abandon Numeric > entirely with it's comparatively light-weight arrays. > I'd like to ask if the numarray option couldn't at least be considered. In particular with regard to speed, we'd like to know what the necessary threshold is. For many ufuncs, numarray is within a factor of 3 or so of Numeric for small arrays. Is this good enough or not? What would be good enough? It would probably be difficult to make it as fast in all cases, but how close does it have to be? A factor of 2? 1.5? We haven't gotten very much feedback on specific numbers in this regard. Are there other aspects of numarray performance that are a problem? What specifically? We don't have the resources to optimize everything in case it might affect someone. We need to know that it is particular problem with users to give it some priority (and know what the necessary threshold is for acceptable performance). Perhaps the two (Numeric and numarray) may need to coexist for a while, but we would like to isolate the issues that make that necessary. That hasn't really happened yet. Travis, do you have any specific nummarray speed issues that have arisen from your benchmarking or use that we can look at? Perry Greenfield From hinsen at cnrs-orleans.fr Tue Jan 20 03:16:02 2004 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Tue Jan 20 03:16:02 2004 Subject: [Numpy-discussion] Status of Numeric In-Reply-To: <400C3EF3.8090005@ee.byu.edu> References: <400C3EF3.8090005@ee.byu.edu> Message-ID: On 19.01.2004, at 21:32, Travis Oliphant wrote: > These are the two options we can see: > * freeze old Numeric 23.x and make all changes to Numeric 24.x still > keeping Numeric separate from SciPy > * freeze old Numeric 23.x and subsume Numeric into SciPy essentially > creating a new SciPy arrayobject that is fast and lightweight. > Anybody wanting this new array object would get it by installing > scipy_base. Numeric would never change in the future but the array in > scipy_base would. That depends on the exact nature of the changes. My view is that any package that is upwards-compatible with Numeric (except for bug fixes of course) should be called Numeric and distributed as such. Any package that is intentionally incompatible with Numeric in some important aspect should not be called Numeric. There is a lot of code out there that builds on Numeric, and some of it is hardly maintained any more, although there are still users around. Those users expect to be able to upgrade Numeric without breaking their code. Konrad. 
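Perry's question above about what level of small-array overhead is acceptable is easy to turn into concrete numbers. The following is a minimal sketch of that kind of per-size measurement, assuming both Numeric and numarray are installed; it is only an illustration of the approach under discussion, not the benchmark script whose results appear later in the thread, and absolute times will vary with machine and version.

--------
# Illustrative timing sketch only: per-call cost of a + a over a range of
# array sizes, for both Numeric and numarray (assumes both are installed).
import time
import Numeric
import numarray

def per_call_time(mod, n, repeat=10000):
    a = mod.arange(n).astype(mod.Float64)
    a + a                              # warm-up; the first call may be slower
    t0 = time.clock()
    for i in xrange(repeat):
        a + a
    return (time.clock() - t0) / repeat

for n in (1, 10, 100, 1000, 10000):
    print "N=%6d  Numeric %.2e s  numarray %.2e s" % (
        n, per_call_time(Numeric, n), per_call_time(numarray, n))

Discarding the first call before timing matters for numarray in particular because, as comes up further down in the thread, part of the per-operation setup is cached after the first use.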
From p.magwene at snet.net Tue Jan 20 05:59:01 2004 From: p.magwene at snet.net (Paul Magwene) Date: Tue Jan 20 05:59:01 2004 Subject: [Numpy-discussion] Re: Numpy-discussion digest, Vol 1 #808 - 2 msgs In-Reply-To: <200401200407.i0K47Cfg004905@mta1.snet.net> References: <200401200407.i0K47Cfg004905@mta1.snet.net> Message-ID: <400D341F.1060504@snet.net> > > --__--__-- > > Message: 1 > Date: Mon, 19 Jan 2004 14:32:51 -0600 > From: Travis Oliphant > To: numpy-discussion at lists.sourceforge.net, python-list at python.org > Subject: [Numpy-discussion] Status of Numeric > > > The purpose of this email is to assess the attitude of the community > regarding how these changes to Numeric should be accomplished. > > These are the two options we can see: > * freeze old Numeric 23.x and make all changes to Numeric 24.x still > keeping Numeric separate from SciPy > * freeze old Numeric 23.x and subsume Numeric into SciPy essentially > creating a new SciPy arrayobject that is fast and lightweight. Anybody > wanting this new array object would get it by installing scipy_base. > Numeric would never change in the future but the array in scipy_base would. My preference would be for option #1 -- continue further development of Numeric as a separate package with new improvements going into the 24.x series. It's my experience that when projects get subsumed, additional requirements tend to creep in, even if they're not actually "required." --Paul Magwene From falted at openlc.org Tue Jan 20 09:45:04 2004 From: falted at openlc.org (Francesc Alted) Date: Tue Jan 20 09:45:04 2004 Subject: [Numpy-discussion] numarray 0.8 and MacOSX Message-ID: <200401201844.31406.falted@openlc.org> Hi, I'm trying to compile numarray 0.8 on a MacOSX (Darwin 6.8). The compilation process seemed to go well, but an error happens when trying to import numarray: [falted at ppc-osx2:numarray-0.8]$ python Python 2.2 (#1, 10/24/02, 16:10:52) [GCC Apple cpp-precomp 6.14] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> import numarray Traceback (most recent call last): File "", line 1, in ? File "/home/users/f/fa/falted/bin-macosx//lib/python/numarray/__init__.py", line 11, in ? from numarrayall import * File "/home/users/f/fa/falted/bin-macosx//lib/python/numarray/numarrayall.py", line 2, in ? from generic import * File "/home/users/f/fa/falted/bin-macosx//lib/python/numarray/generic.py", line 1030, in ? import numarraycore as _nc File "/home/users/f/fa/falted/bin-macosx//lib/python/numarray/numarraycore.py", line 29, in ? PyINT_TYPES = { NameError: name 'bool' is not defined I know that there are available ports of numarray 0.8 to Darwin, but, for a series of reasons, I prefer to compile it for myself. Anyone can provide a hint so as to compile it cleanly? Thanks, -- Francesc Alted From jmiller at stsci.edu Tue Jan 20 10:07:03 2004 From: jmiller at stsci.edu (Todd Miller) Date: Tue Jan 20 10:07:03 2004 Subject: [Numpy-discussion] numarray 0.8 and MacOSX In-Reply-To: <200401201844.31406.falted@openlc.org> References: <200401201844.31406.falted@openlc.org> Message-ID: <1074621875.20653.22.camel@halloween.stsci.edu> On Tue, 2004-01-20 at 12:44, Francesc Alted wrote: > Hi, > > I'm trying to compile numarray 0.8 on a MacOSX (Darwin 6.8). 
The compilation > process seemed to go well, but an error happens when trying to import > numarray: > > [falted at ppc-osx2:numarray-0.8]$ python > Python 2.2 (#1, 10/24/02, 16:10:52) > [GCC Apple cpp-precomp 6.14] on darwin > Type "help", "copyright", "credits" or "license" for more information. > >>> import numarray > Traceback (most recent call last): > File "", line 1, in ? > File "/home/users/f/fa/falted/bin-macosx//lib/python/numarray/__init__.py", > line 11, in ? > from numarrayall import * > File > "/home/users/f/fa/falted/bin-macosx//lib/python/numarray/numarrayall.py", > line 2, in ? > from generic import * > File "/home/users/f/fa/falted/bin-macosx//lib/python/numarray/generic.py", > line 1030, in ? > import numarraycore as _nc > File > "/home/users/f/fa/falted/bin-macosx//lib/python/numarray/numarraycore.py", > line 29, in ? > PyINT_TYPES = { > NameError: name 'bool' is not defined > > I know that there are available ports of numarray 0.8 to Darwin, but, for a > series of reasons, I prefer to compile it for myself. Anyone can provide a > hint so as to compile it cleanly? I tested numarray-0.8 on Darwin, but I tested it against user installed versions of Python 2.2.3 and 2.3.2. Both of these versions define bool. On the version of Mac OS-X I've got (10.2?), /usr/bin/python is 2.2.0, and it does not define bool. So, I don't think there is a clean compile, at least not for Mac users who don't also install updated versions of Python. Todd > > Thanks, > > -- > Francesc Alted > > > > > ------------------------------------------------------- > The SF.Net email is sponsored by EclipseCon 2004 > Premiere Conference on Open Tools Development and Integration > See the breadth of Eclipse activity. February 3-5 in Anaheim, CA. > http://www.eclipsecon.org/osdn > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion -- Todd Miller Space Telescope Science Institute 3700 San Martin Drive Baltimore MD, 21030 (410) 338 - 4576 From Chris.Barker at noaa.gov Tue Jan 20 11:13:05 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Tue Jan 20 11:13:05 2004 Subject: [Numpy-discussion] Status of Numeric In-Reply-To: References: <400C3EF3.8090005@ee.byu.edu> Message-ID: <400D7D66.5000504@noaa.gov> Konrad Hinsen wrote: > My view is that any > package that is upwards-compatible with Numeric (except for bug fixes > of course) should be called Numeric and distributed as such. Any > package that is intentionally incompatible with Numeric in some > important aspect should not be called Numeric. I absolutely agree with this. Travis Oliphant wrote: > 1) change the coercion model to reflect Numarray's choice and eliminate > the savespace crutch. > 2) Add indexing capability to Numeric arrays (similar to Numarray's) > 3) Improve the interaction between Numeric arrays and scalars. These all look like backward in-compatable changes, so in that case, I vote for Sci-py-array, or whatever. However, it also looks like these are all moving toward the Numarray API. Is this the case? That would be great, as then Numarray would just be dropped in if/when it is deemed up to the task. It also leaves the door open for some sort of automagic selection of which array to use for a given instance. > 4) Optimization: Nothing wrong with that...as long as it's not premature! > Numarray is making great progress and is quite usable for many > purposes. 
An idea that was championed by some is that the Numeric code > base would stay static and be replaced entirely by Numarray. > However, Numeric is currently used in a large installed base. In > particular SciPy uses Numeric as its core array. While no doubt > numarray arrays will be supported in the future, the speed of the less > bulky Numeric arrays and the typical case that we encounter in SciPy of > many, small arrays will make it difficult for people to abandon Numeric > entirely with it's comparatively light-weight arrays. It was said that making Numarray more efficient with small arrays was a goal of the project...is it still? I'm still unclear on why Numarrays are so much more "heavy"..is it just that no one has taken the time to optimize them, or is there really something inherent (and important) in the design? > As this has become an important path to > success of several projects (both commercial and open) it is absolutely > necessary that this issues be addressed. From the sammll list above, it looks like what you need is an array that is like a Numarray, but faster for samll arrays...Has anyone done an analysis of whether it would be harder to optimize Numarray than to make the above changes to Numeric, and continue to maintain two packages? You probably have, but I though I'd ask anyway... > Ultimately I think it will be a wise > thing to have two implementations of arrays: one that is fast and > lightweight optimized for many relatively small arrays, and another that > is optimized for large-scale arrays. Are these really incompatable goals? > If most of the community > wants to see Numeric go away then we will be forced to bring the > Numeric array under the SciPy code-base and own it there. I think it's quite the opposite... if most of the community wants to see Numeric continue on, it must be maintained (and improved) with little change to the API. If we're all going to switch to Numarray, then the SciPy project can do whatever it wants with Numeric... In Summary: - Anything called "Numeric" should have a compatable API to the current version - I'd much rather have just one N-d array type, preferable one that is part of the Python Standard Library...is likely to ever happen? - I also want fast small arrays. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From edcjones at erols.com Tue Jan 20 11:47:04 2004 From: edcjones at erols.com (Edward C. Jones) Date: Tue Jan 20 11:47:04 2004 Subject: [Numpy-discussion] How fast are small arrays currently? Message-ID: <400D848B.8050004@erols.com> Has anyone recently benchmarked the speed of numarray vs. Numeric? Why are numarrays so slow to create? From falted at openlc.org Tue Jan 20 12:30:02 2004 From: falted at openlc.org (Francesc Alted) Date: Tue Jan 20 12:30:02 2004 Subject: [Numpy-discussion] Status of Numeric In-Reply-To: <400D7D66.5000504@noaa.gov> References: <400C3EF3.8090005@ee.byu.edu> <400D7D66.5000504@noaa.gov> Message-ID: <200401202129.20608.falted@openlc.org> A Dimarts 20 Gener 2004 20:11, Chris Barker va escriure: > > As this has become an important path to > > success of several projects (both commercial and open) it is absolutely > > necessary that this issues be addressed. 
> > From the sammll list above, it looks like what you need is an array > that is like a Numarray, but faster for samll arrays...Has anyone done > an analysis of whether it would be harder to optimize Numarray than to > make the above changes to Numeric, and continue to maintain two > packages? You probably have, but I though I'd ask anyway... I agree. An analysis should be done in order to see if it is better to concentrate in getting numarray better for small arrays or in having several array implementations. The problem is if numarray cannot be enhanced enough because of design problems, although I would bet that something can be done in order to get it close to Numeric performance. And I guess quite a bit people on this list would be happy to collaborate in some way or another so as to achieve this goal. However, as Perry says, in order to do this analysis, an amount of the needed speed-up should be estimated first. I personaly feel that it would worth the effort to go and try to optimize the small arrays case in numarray instead of having to fight against a jungle of Numeric/numarray/python array implementations. I strongly believe that numarray has enough advantages over Numeric that would compensate the effort to further enhance its present limitations rather than maintain several packages. Just my 2 cents, -- Francesc Alted From cookedm at physics.mcmaster.ca Tue Jan 20 12:33:01 2004 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Tue Jan 20 12:33:01 2004 Subject: [Numpy-discussion] How fast are small arrays currently? In-Reply-To: <400D848B.8050004@erols.com> References: <400D848B.8050004@erols.com> Message-ID: <20040120203112.GA8661@arbutus.physics.mcmaster.ca> On Tue, Jan 20, 2004 at 02:42:03PM -0500, Edward C. Jones wrote: > Has anyone recently benchmarked the speed of numarray vs. Numeric? Just what I was doing :-) Check out http://arbutus.mcmaster.ca/dmc/numpy/ for a graph comparing the two. Basically, I get on my machine (a 1.3 GHz Athlon running Linux), for an array of size N (of Float), the time to do a+a is Numeric: 3.7940e-6 + 2.2556e-8 * N seconds numarray: 3.7062e-5 + 5.8497e-9 * N For sin(a), Numeric: 1.7824e-6 + 1.1341e-7 * N numarray: 2.8994e-5 + 9.8985e-8 * N So the slowness of numarray vs. Numeric for small arrays is because of an overhead of 3.7e-5 s for numarray, as opposed to 3.8e-6 s for Numeric. Otherwise, numarray is 4 times faster for large arrays for addition (and multiplication, which I've also checked). The crossover is at arrays of about 2000 elements. If this overhead could be reduced by a factor of 3 or 4, I'd be much happier with using numarray for small arrays. But for now, it's not good enough. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From perry at stsci.edu Tue Jan 20 12:33:03 2004 From: perry at stsci.edu (Perry Greenfield) Date: Tue Jan 20 12:33:03 2004 Subject: [Numpy-discussion] How fast are small arrays currently? In-Reply-To: <400D848B.8050004@erols.com> Message-ID: > -----Original Message----- > From: numpy-discussion-admin at lists.sourceforge.net > [mailto:numpy-discussion-admin at lists.sourceforge.net]On Behalf Of Edward > C. Jones > Sent: Tuesday, January 20, 2004 2:42 PM > To: numpy-discussion at lists.sourceforge.net > Subject: [Numpy-discussion] How fast are small arrays currently? > > > Has anyone recently benchmarked the speed of numarray vs. Numeric? 
> We presented some benchmarks at scipy 2003. It depends on many factors and what functions or operations are being performed so it is hard to generalize (one reason I ask for specific cases that need improvement). But to take ufuncs as examples: the speed for 1 element arrays (about as small as they get) is:

                                   v0.4    v0.5
  Int32 + Int32                      65     3.7
  Int32 + Int32 discontiguous       104     7.3
  Int32 + Float64                    95     4.9
  add.reduce(Int32) NxN swapaxes    111     3.6
  add.reduce(Int32, -1) NxN          98     3.2

What is shown is the (time for numarray operation)/(time for Numeric), for v0.4 and v0.5. Note that with v0.5, these are typically 3 to 4 times slower for small arrays, with a couple of cases somewhat worse (a factor of 4.9 and 7.3). Speeds for v0.4 are substantially slower (orders of magnitude). Note that the speedup is obtained through caching certain information. The first time you perform a certain operation (say an Int32/Int16 add), it will be slow. When repeated it will be closer to the benchmark shown. If you are only going to do one operation on a small array, speed presumably doesn't matter much. It is only when you plan to iterate over many small arrays that it usually becomes an issue. Other functions may be much worse (or better). If people let us know which things are too slow we can put that on our to do list. Is a factor of 3 or 4 times slower a killer? What about a factor of 2? > Why are numarrays so slow to create? > I'll leave it to Todd to give the details of that. From verveer at embl.de Tue Jan 20 12:38:02 2004 From: verveer at embl.de (verveer at embl.de) Date: Tue Jan 20 12:38:02 2004 Subject: [Numpy-discussion] Status of Numeric Message-ID: <1074631021.400d916d09e59@webmail.embl.de> Just my 2 cents on the issue of replacing Numeric by Numarray: I was under the impression that Numarray was intended to be a replacement for Numeric, also as a building block for larger packages such as SciPy. Was Numarray not intended to be an "improved Numeric" in the first place? I chose to develop for Numarray rather than Numeric because of its improvements, under the assumption that eventually my code would also become available to the users of such packages as SciPy. (I wrote the nd_image extension that is now distributed with Numarray. I also contributed some improvements to RandomArray extension that are not in the Numeric version.) I believe that it would be a bad situation if the numerical python community would be split among two different array packages. (I think Paul Dubois expressed a similar sentiment on comp.lang.python). Supporting code for two incompatible packages would be a pain (I am personally not willing to do that). Not being able to use modules designed for one package in the other would be disappointing for many people, I think... If I understood well, the only issue with Numarray seems to be that the speed for handling small arrays is too low. So would it not be more efficient to focus on that problem rather than throwing away all the excellent work that has been done already on Numarray? Best regards, Peter -- Dr. Peter J. Verveer Cell Biology and Cell Biophysics Programme European Molecular Biology Laboratory Meyerhofstrasse 1 D-69117 Heidelberg Germany From perry at stsci.edu Tue Jan 20 12:39:03 2004 From: perry at stsci.edu (Perry Greenfield) Date: Tue Jan 20 12:39:03 2004 Subject: [Numpy-discussion] How fast are small arrays currently? In-Reply-To: <20040120203112.GA8661@arbutus.physics.mcmaster.ca> Message-ID: David M.
Cooke writes: > Just what I was doing :-) > > Check out http://arbutus.mcmaster.ca/dmc/numpy/ for a graph comparing > the two. > > Basically, I get on my machine (a 1.3 GHz Athlon running Linux), for an > array of size N (of Float), the time to do a+a is > > Numeric: 3.7940e-6 + 2.2556e-8 * N seconds > numarray: 3.7062e-5 + 5.8497e-9 * N > > For sin(a), > Numeric: 1.7824e-6 + 1.1341e-7 * N > numarray: 2.8994e-5 + 9.8985e-8 * N > > So the slowness of numarray vs. Numeric for small arrays is because of > an overhead of 3.7e-5 s for numarray, as opposed to 3.8e-6 s for > Numeric. Otherwise, numarray is 4 times faster for large arrays > for addition (and multiplication, which I've also checked). > > The crossover is at arrays of about 2000 elements. > > If this overhead could be reduced by a factor of 3 or 4, I'd be much > happier with using numarray for small arrays. But for now, it's not > good enough. > How many times do you do the operation for each size? Because of caching, the first result may be much slower than the rest. If you didn't could you try computing it by discarding the first numarray time (or start timing after doing the first iteration)? Thanks, Perry From perry at stsci.edu Tue Jan 20 12:42:03 2004 From: perry at stsci.edu (Perry Greenfield) Date: Tue Jan 20 12:42:03 2004 Subject: [Numpy-discussion] Status of Numeric In-Reply-To: <1074631021.400d916d09e59@webmail.embl.de> Message-ID: Peter J. Verveer writes: > I was under the impression that Numarray was intended to be a > replacement for > Numeric, also as a building block for larger packages such as SciPy. Was > Numarray not intended to be an "improved Numeric" in the first > place? I chose > to develop for Numarray rather than Numeric because of its > improvements, under > the assumption that eventually my code would also become available to the > users of such packages as SciPy. (I wrote the nd_image extension > that is now > distributed with Numarray. I also contributed some improvements > to RandomArray > extension that are not in the Numeric version.) > It has been our intention to port scipy to use numarray soon. This work has been delayed somewhat since our current focus is on plotting. We do still intend to see that scipy works with numarray. Perry From cookedm at physics.mcmaster.ca Tue Jan 20 13:05:01 2004 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Tue Jan 20 13:05:01 2004 Subject: [Numpy-discussion] How fast are small arrays currently? In-Reply-To: References: <20040120203112.GA8661@arbutus.physics.mcmaster.ca> Message-ID: <20040120210414.GA9095@arbutus.physics.mcmaster.ca> On Tue, Jan 20, 2004 at 03:38:34PM -0500, Perry Greenfield wrote: > David M. Cooke writes: > > > Just what I was doing :-) > > > > Check out http://arbutus.mcmaster.ca/dmc/numpy/ for a graph comparing > > the two. > > > > Basically, I get on my machine (a 1.3 GHz Athlon running Linux), for an > > array of size N (of Float), the time to do a+a is > > > > Numeric: 3.7940e-6 + 2.2556e-8 * N seconds > > numarray: 3.7062e-5 + 5.8497e-9 * N > > > > For sin(a), > > Numeric: 1.7824e-6 + 1.1341e-7 * N > > numarray: 2.8994e-5 + 9.8985e-8 * N ... > How many times do you do the operation for each size? Because of > caching, the first result may be much slower than the rest. > If you didn't could you try computing it by discarding the first > numarray time (or start timing after doing the first iteration)? 10000 times per size. I'm re-running it like you suggested, but the difference is small (the new version is up on the above page). 
For numarray for addition, it's now 3.8771e-5 + 4.9832e-9 * N -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From cjw at sympatico.ca Tue Jan 20 14:20:01 2004 From: cjw at sympatico.ca (Colin J. Williams) Date: Tue Jan 20 14:20:01 2004 Subject: [Numpy-discussion] Status of Numeric In-Reply-To: <400C3EF3.8090005@ee.byu.edu> References: <400C3EF3.8090005@ee.byu.edu> Message-ID: <400DA93C.7030709@sympatico.ca> Travis Oliphant wrote: > > Numarray is making great progress and is quite usable for many > purposes. An idea that was championed by some is that the Numeric > code base would stay static and be replaced entirely by Numarray. It was my impression that this idea had been generally accepted. It was not just one of the proposals under discussion. I wonder how many others out there had assumed that, in spite of current speed problems, numarray was the way for the future, and had based their development endeavours on numarray. I did. To this relative outsider, there seem to have been three groups involved in efforts to provide Python with numerical array capabilities, those connected with Numeric, SciPy and numarray. SciPy would appear to be the most recent addition to the list. Is there any way that some agrement between these groups can be achieved to restore the hope for a common development path? This message from Travis Oliphant seems to envisage two paths. Is this the better way to go? > > However, Numeric is currently used in a large installed base. In > particular SciPy uses Numeric as its core array. While no doubt > numarray arrays will be supported in the future, the speed of the less > bulky Numeric arrays and the typical case that we encounter in SciPy > of many, small arrays will make it difficult for people to abandon > Numeric entirely with it's comparatively light-weight arrays. > > In the development of SciPy we have encountered issues in Numeric that > we feel need to be fixed. As this has become an important path to > success of several projects (both commercial and open) it is > absolutely necessary that this issues be addressed. > > > The purpose of this email is to assess the attitude of the community > regarding how these changes to Numeric should be accomplished. > These are the two options we can see: > * freeze old Numeric 23.x and make all changes to Numeric 24.x still > keeping Numeric separate from SciPy > * freeze old Numeric 23.x and subsume Numeric into SciPy essentially > creating a new SciPy arrayobject that is fast and lightweight. > Anybody wanting this new array object would get it by installing > scipy_base. Numeric would never change in the future but the array in > scipy_base would. > > It is not an option to wait for numarray to get fast enough as these > issues need to be addressed now. Ultimately I think it will be a wise > thing to have two implementations of arrays: one that is fast and > lightweight optimized for many relatively small arrays, and another > that is optimized for large-scale arrays. Eventually, the use of > these two underlying implementations should be automatic and invisible > to the user. Is this "automatic and invisible" practicable, excepts for trivial examples? > > A few of the particular changes we need to make to the Numeric > arrayobject are: > > 1) change the coercion model to reflect Numarray's choice and > eliminate the savespace crutch. 
> 2) Add indexing capability to Numeric arrays (similar to Numarray's) > 3) Improve the interaction between Numeric arrays and scalars. > 4) Optimization: > > Again, these changes are going to be made to some form of the Numeric > arrays. What I am really interested in knowing is the attitude of the > community towards keeping Numeric around. If most of the community > wants to see Numeric go away then we will be forced to bring the > Numeric array under the SciPy code-base and own it there. > > Your feedback is welcome and appreciated. > Sincerely, > > Travis Oliphant and other SciPy developers > > > > ------------------------------------------------------- > The SF.Net email is sponsored by EclipseCon 2004 > Premiere Conference on Open Tools Development and Integration > See the breadth of Eclipse activity. February 3-5 in Anaheim, CA. > http://www.eclipsecon.org/osdn > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion I hope that some cooperative approach can be devised. Colin W. From m.oliver at iu-bremen.de Tue Jan 20 14:52:02 2004 From: m.oliver at iu-bremen.de (Marcel Oliver) Date: Tue Jan 20 14:52:02 2004 Subject: [Numpy-discussion] Status of Numeric Message-ID: <16397.45434.933125.912105@localhost.localdomain> Perry Greenfield writes: > Peter J. Verveer writes: > > > I was under the impression that Numarray was intended to be a > > replacement for Numeric, also as a building block for larger > > packages such as SciPy. Was Numarray not intended to be an > > "improved Numeric" in the first place? I chose to develop for > > Numarray rather than Numeric because of its improvements, under > > the assumption that eventually my code would also become > > available to the users of such packages as SciPy. (I wrote the > > nd_image extension that is now distributed with Numarray. I also > > contributed some improvements to RandomArray extension that are > > not in the Numeric version.) > > > It has been our intention to port scipy to use numarray soon. This > work has been delayed somewhat since our current focus is on > plotting. We do still intend to see that scipy works with numarray. That this discussion is happening NOW really surprises me. I have been following this list for a couple of years now, with the intention of eventually using numerical Python as the main teaching toolbox for numerical analysis, and possibly for the migration small research codes as well. The possibility of doing numerics in Phython has always intrigued me. Right now I am primarily using Matlab. It's very powerful, but not free and the language is horrible; Octave is trying to play catch up but has mostly lost steam. So a good scientific Phython environment (of any sort) would be a really cool thing to have. However, two things have always held me back (apart from coding small examples on a few occasions): 1. Numerical Phython has been in a limbo for too long (I had even assumed a few times that both Numeric and Numarray were dead for all practical purposes). If there are two incompatible version for years and no clear indication where the whole thing is going, I am very hesitant to invest any time into writing substantial code, or recommend it for class room use. 2. Plotting is a major issue. There are a couple of semi-functional packages, but neither a comprehensive solution nor a clear direction for the plotting architecture. 
Short, I see a lot of potential, unused mainly because the numerical Python community seems to lack clear direction and leadership. This is a real showstopper for someone who is primarily interested in building on top. I am still hopeful that something will come of all this - any progress will be very much appreciated. Best regards, Marcel --------------------------------------------------------------------- Marcel Oliver Phone: +49-421-200-3212 School of Engineering and Science Fax: +49-421-200-3103 International University Bremen m.oliver at iu-bremen.de Campus Ring 1 oliver at member.ams.org 28759 Bremen, Germany http://math.iu-bremen.de/oliver --------------------------------------------------------------------- From rays at blue-cove.com Tue Jan 20 14:56:04 2004 From: rays at blue-cove.com (Ray Schumacher) Date: Tue Jan 20 14:56:04 2004 Subject: [Numpy-discussion] How fast are small arrays currently? In-Reply-To: <20040120210414.GA9095@arbutus.physics.mcmaster.ca> References: <20040120203112.GA8661@arbutus.physics.mcmaster.ca> Message-ID: <5.2.0.4.2.20040120145122.13e4a098@blue-cove.com> With a cross-over at ~2000 elements, can we safely say that working with video, FITS cubes or other similar imagery would be fastest with numarray for summing or dividing 2D arrays? (~920K elements) Ray http://rjs.org/astro From cookedm at physics.mcmaster.ca Tue Jan 20 15:05:01 2004 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Tue Jan 20 15:05:01 2004 Subject: [Numpy-discussion] How fast are small arrays currently? In-Reply-To: <5.2.0.4.2.20040120145122.13e4a098@blue-cove.com> References: <5.2.0.4.2.20040120145122.13e4a098@blue-cove.com> Message-ID: <200401201803.53983.cookedm@physics.mcmaster.ca> On Tuesday 20 January 2004 17:54, Ray Schumacher wrote: > With a cross-over at ~2000 elements, can we safely say that working with > video, FITS cubes or other similar imagery would be fastest with numarray > for summing or dividing 2D arrays? (~920K elements) My benchmark was for 1-D arrays, but checking 2-D shows the crossover is in the same region. I'd say for these types of applications you really want to use numarray. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From Chris.Barker at noaa.gov Tue Jan 20 15:10:04 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Tue Jan 20 15:10:04 2004 Subject: [Numpy-discussion] Status of Numeric In-Reply-To: <16397.45434.933125.912105@localhost.localdomain> References: <16397.45434.933125.912105@localhost.localdomain> Message-ID: <400DB4DE.3030604@noaa.gov> > Perry Greenfield writes: > > It has been our intention to port scipy to use numarray soon. This > > work has been delayed somewhat since our current focus is on > > plotting. That is good news. What plotting package are you working on? Last I heard Chaco had turned into Enthought's (and STSci) in-house Windows only package. (Not because they want it that way, but because they don't have funding to make it work on other platforms, and support the broader community). I don't see anything new on the SciPy page after August '03. Frankly, weak plotting is a bigger deal to me than array performance. -Chris -- Christopher Barker, Ph.D. 
Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From rays at blue-cove.com Tue Jan 20 16:03:04 2004 From: rays at blue-cove.com (Ray Schumacher) Date: Tue Jan 20 16:03:04 2004 Subject: [Numpy-discussion] Status of Numeric In-Reply-To: <16397.46274.310876.908257@localhost.localdomain> References: <5.2.0.4.2.20040120145558.13e5abf8@blue-cove.com> <16397.45434.933125.912105@localhost.localdomain> <5.2.0.4.2.20040120145558.13e5abf8@blue-cove.com> Message-ID: <5.2.0.4.2.20040120155807.13e7eca8@blue-cove.com> Hi Marcel, At 12:07 AM 1/21/2004 +0100, you wrote: > >Are you saying you have found that you have reinvented the wheel? >That's exactly what I suspect happening a lot... I'm sure a lot of people have written little plot utilities because of the size of Chaco and similar packages, or difficulty integrating with their favorite GUI or module. Ray From perry at stsci.edu Tue Jan 20 17:08:41 2004 From: perry at stsci.edu (Perry Greenfield) Date: Tue Jan 20 17:08:41 2004 Subject: [Numpy-discussion] How fast are small arrays currently? In-Reply-To: <20040120210414.GA9095@arbutus.physics.mcmaster.ca> Message-ID: <92F58991-4BAD-11D8-9B39-000393989D66@stsci.edu> > David M. Cooke writes: > > 10000 times per size. I'm re-running it like you suggested, but the > difference is small (the new version is up on the above page). For > numarray for addition, it's now > 3.8771e-5 + 4.9832e-9 * N > Well, OK we'll have to look into that. That's different by a factor of 3 or so than what I expected. I'll see if I can find what that is due to. Perry From perry at stsci.edu Tue Jan 20 17:32:06 2004 From: perry at stsci.edu (Perry Greenfield) Date: Tue Jan 20 17:32:06 2004 Subject: [Numpy-discussion] Status of Numeric In-Reply-To: <400DA93C.7030709@sympatico.ca> Message-ID: <57090E6E-4BB1-11D8-9B39-000393989D66@stsci.edu> On Tuesday, January 20, 2004, at 05:18 PM, Colin J. Williams wrote: > Travis Oliphant wrote: > >> >> Numarray is making great progress and is quite usable for many >> purposes. An idea that was championed by some is that the Numeric >> code base would stay static and be replaced entirely by Numarray. > > It was my impression that this idea had been generally accepted. It > was not just one of the proposals under discussion. > I don't think there was ever any formal vote. I think Paul Dubois had accepted the idea, others had a more "wait and see" attitude. Realistically, I think one can safely say that as one might expect, those that already were using Numeric probably were happy with its capabilities and that given normal motivations, there would be significant inertia on the part of well established users (those with a lot of code already) to switch over. But since it wasn't quite as usable for our needs, we decided that we needed a new version. We had to develop it to support our needs and would have done it regardless. We hoped that it would be suitable for all uses, and we've tried to involve all in the process as much as possible. As you might expect, we've devoted most of our attention to meeting our needs, but we have also expended significant energy trying to meet the needs of the more general community (and we will continue to try to do so within our resources). I don't know if it is reasonable to expect that a certain outcome has been blessed by all, nor did most of the existing Numeric users ask us to do this. 
But many did recognize (as Paul Dubois alluded to) that there was a need to recode the array stuff. Maybe someone could have done a better job of it, but no one else has yet (it is a fair amount of work after all). We do intend to support all the important packages that Numeric does, it make take some time to get there. I suppose our goal is to eventually attract all new users. We can't, nor should we expect that existing Numeric users will switch at our desire or whim. > I wonder how many others out there had assumed that, in spite of > current speed problems, numarray was the way for the future, and had > based their development endeavours on numarray. I did. > > To this relative outsider, there seem to have been three groups > involved in efforts to provide Python with numerical array > capabilities, those connected with Numeric, SciPy and numarray. SciPy > would appear to be the most recent addition to the list. > Actually, I think it would be more accurate to say that SciPy is an attempt to collect a large base of numeric code and integrate it into an array package (currently Numeric) rather than to develop a new array package. It was started before we started numarray and thus was centered around Numeric. They have found occasions to to modify and extend Numeric behavior. In that sense, it long has been somewhat incompatible with Numeric. (Travis can correct me if I got that wrong.) > Is there any way that some agrement between these groups can be > achieved to restore the hope for a common development path? > I would certainly like to, and in any case, we want to adapt scipy to be compatible with numarray. Perry Greenfield From perry at stsci.edu Tue Jan 20 17:43:02 2004 From: perry at stsci.edu (Perry Greenfield) Date: Tue Jan 20 17:43:02 2004 Subject: [Numpy-discussion] Status of Numeric In-Reply-To: <16397.45434.933125.912105@localhost.localdomain> Message-ID: On Tuesday, January 20, 2004, at 05:53 PM, Marcel Oliver wrote: > That this discussion is happening NOW really surprises me. I have > been following this list for a couple of years now, with the intention > of eventually using numerical Python as the main teaching toolbox for > numerical analysis, and possibly for the migration small research > codes as well. > > The possibility of doing numerics in Phython has always intrigued me. > Right now I am primarily using Matlab. It's very powerful, but not > free and the language is horrible; Octave is trying to play catch up > but has mostly lost steam. So a good scientific Phython environment > (of any sort) would be a really cool thing to have. > > However, two things have always held me back (apart from coding small > examples on a few occasions): > > 1. Numerical Phython has been in a limbo for too long (I had even > assumed a few times that both Numeric and Numarray were dead for > all practical purposes). If there are two incompatible version for > I don't know why you assumed that. Both have regularly been updated more than once in the past two years. > years and no clear indication where the whole thing is going, I am > very hesitant to invest any time into writing substantial code, or > recommend it for class room use. > That's your right of course. You have to remember that neither we (STScI) nor Enthought (who has funded virtually all the scipy work) are getting paid to do the work we are doing for the general community. 
In our case, we do much of it for our own purposes, and it would certainly be to our advantage if numarray were adopted by the general community so we invest resources in it. If you don't feel it is ready for your purposes, don't use numarray (or Numeric). We have only so many resources and while we wish we could do everything immediately, we can't. We are committed to making Python a good scientific environment, but we don't promise that it has everything that everyone would need now (and it certainly doesn't). > 2. Plotting is a major issue. There are a couple of semi-functional > packages, but neither a comprehensive solution nor a clear > direction for the plotting architecture. > I agree completely. A later (tonight) message will discuss the current situation at more length. Perry Greenfield From perry at stsci.edu Tue Jan 20 17:46:00 2004 From: perry at stsci.edu (Perry Greenfield) Date: Tue Jan 20 17:46:00 2004 Subject: [Numpy-discussion] How fast are small arrays currently? In-Reply-To: <5.2.0.4.2.20040120145122.13e4a098@blue-cove.com> Message-ID: <43E10567-4BB3-11D8-9B39-000393989D66@stsci.edu> On Tuesday, January 20, 2004, at 05:54 PM, Ray Schumacher wrote: > With a cross-over at ~2000 elements, can we safely say that working > with video, FITS cubes or other similar imagery would be fastest with > numarray for summing or dividing 2D arrays? (~920K elements) > As long as you treat the array as a whole, I'd say that usually numarray would be better suited. That doesn't mean you won't find some instances where it is slower for certain operations. (When you do, let us know). Perry Greenfield From bsder at allcaps.org Tue Jan 20 18:53:00 2004 From: bsder at allcaps.org (Andrew P. Lentvorski, Jr.) Date: Tue Jan 20 18:53:00 2004 Subject: [Numpy-discussion] Status of Numeric In-Reply-To: <400C3EF3.8090005@ee.byu.edu> References: <400C3EF3.8090005@ee.byu.edu> Message-ID: <20040120181047.G98683@mail.allcaps.org> On Mon, 19 Jan 2004, Travis Oliphant wrote: > ... Ultimately I think it will be a wise thing to have two > implementations of arrays: one that is fast and lightweight optimized > for many relatively small arrays, and another that is optimized for > large-scale arrays. I am *extremely* interested in the use case of the small arrays in SciPy. Which algorithms and modules are dominated by the small array speed? -a From perry at stsci.edu Tue Jan 20 19:04:02 2004 From: perry at stsci.edu (Perry Greenfield) Date: Tue Jan 20 19:04:02 2004 Subject: [Numpy-discussion] Status of Numeric (and plotting in particular) In-Reply-To: <400DB4DE.3030604@noaa.gov> Message-ID: <27E3B046-4BBE-11D8-9B39-000393989D66@stsci.edu> On Tuesday, January 20, 2004, at 06:08 PM, Chris Barker wrote: >> Perry Greenfield writes: > >> > It has been our intention to port scipy to use numarray soon. This >> > work has been delayed somewhat since our current focus is on >> > plotting. > > That is good news. What plotting package are you working on? Last I > heard Chaco had turned into Enthought's (and STSci) in-house Windows > only package. (Not because they want it that way, but because they > don't have funding to make it work on other platforms, and support the > broader community). > > I don't see anything new on the SciPy page after August '03. > > Frankly, weak plotting is a bigger deal to me than array performance. > Yes, I agree completely (and why we are giving plotting higher priority than scipy integration). 
I really was hoping to raise this issue later, but I might as well address it since the Numeric/numarray issue has raised it indirectly. Chaco had been the focus of our plotting efforts for more than a year. The effort started with our funding Enthought to begin the work. We had a number of requirements for a plotting package that weren't met by any existing package, and it didn't appear that any would be easily modified to our needs. The requirements we had (off the top of my head) included: 1) easy portability to graphics devices *and* different windowing systems. 2) it had to run on all major platforms including Solaris, Linux, Macs, and Windows. 3) the graphics had to be embeddable within gui widgets. 4) it had to allow cursor interactions, at least to the point of being able to read cursor positions from python. 5) it had to be open source and preferably not gpl (though the latter was probably not a show stopper for us) 6) It also had to be customizable to the point of being able to produce very high quality hardcopy plots suitable for publication. 7) object oriented plotting framework capable of sensible composition. 8) command line interface akin to that available in matlab or IDL to make producing quick interactive plots very, very easy. Developing something that satisfies these is not at all trivial. In the process Enthought has expended much energy developing chaco, kiva and traits (and lately they are working on yet more extensions); easily much more of the effort has come from sources other than STScI. Kiva is the back end that presents a uniform API for different graphics devices. Traits handles many of the user interface issues for plot parameters, and handles the relationships of these parameters between plot components. Chaco is the higher level plotting software that provides the traditional plotting capabilities for 2-d data. Much has been invested in chaco. It is with some regret that we (STScI) have concluded that chaco is not suitable for our needs and that we need to take a different approach (or at least give it a try). I'll take some space to explain why. The short answer is that in the end we think it was too ambitious. We still aim to achieve the goals I listed above. The problem, we think, is that chaco was also tasked to try to achieve extra goals with regard to interactive capabilities that were, in the end, not really important to STScI and its community, but were important to Enthought (and presumably its clients, and the scipy community). More specifically, a lot of thought and work went into making it possible for many aspects of the plots to be interactively modified. That is, by clicking on various aspects of plots, one could bring up editors for the attributes of that plot element, such as color, line style, font, size, etc. Many other interactive aspects have been enhanced as well. Much recent work by Enthought is going into extending the capabilities even further by adding gui kinds of features (e.g., widgets of all sorts). Unfortunately these capabilities have come at a price, namely complexity. We have found it difficult to keep up with the ongoing changes to chaco and to become proficient enough to contribute significantly by adding capabilities we have needed. Perhaps that argues that we aren't competent to do so. To a certain degree, that is probably true. There is no doubt that Enthought has some very talented software engineers working on chaco and related products. 
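A rough, plain-Python sketch may make the "validated, reactive plot attribute" idea that traits supplies concrete for readers who have not seen it. This is illustrative only: it is neither the Traits API nor code from chaco, and every name in it is invented.

    class LineAttributes:
        """Illustrative stand-in for a traits-like, validated set of plot attributes."""
        _allowed = {"color": ("black", "red", "blue", "green"),
                    "style": ("solid", "dashed", "dotted")}

        def __init__(self):
            self._values = {"color": "black", "style": "solid"}
            self._listeners = []              # e.g. a renderer that must redraw

        def on_change(self, callback):
            self._listeners.append(callback)

        def set(self, name, value):
            if value not in self._allowed[name]:
                raise ValueError("%r is not a valid %s" % (value, name))
            self._values[name] = value
            for callback in self._listeners:  # an attribute editor or a plot reacts here
                callback(name, value)

An interactive "properties" editor of the kind described above is then a dialog that calls set(); the complexity being described comes from every plot component having to participate in, and stay responsive to, this kind of machinery.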
On the other hand, our goal is to have this software be accessible by scientists in general, and particularly astronomers. Chaco is complex enough that we think that is a serious problem. Customizing its behavior requires a very large investment of time understanding how it works, far beyond what most astronomers are willing to tackle (at least that's my impression). Much of this complexity (and many of its ongoing changes) is to support the interactive capabilities, and to make it responsive enough that plots can update themselves quickly enough not to lead to annoying lags. But frankly, we just want something to render plots on the screen and on hardcopy. Outside of being able to obtain cursor coordinates, we regard many of the interactive capabilities as secondary in importance. When most astronomers want to tune a plot (either for publication quality, or for batch processing), they usually want to be able to reproduce the adjustments for new data, for which the interactive attribute editing capability is of little use. Generally they would like to script the more customized plots so that they can be easily modified and reused. So it seems that it is too difficult to accomplish all these aims within one package. We would like to develop a different plotting package (using many of the ideas from chaco, and some code) based on kiva and the traits package. We have started on this over the past month, and hope to have some simple functionality available within a month (though when we make it public may take a bit longer). It will be open source and, we hope, significantly simpler than chaco. It will not focus on speed (well, we want fairly fast display times for plots of a reasonable number of points, but we don't need video refresh rates). If your interest in plotting matches ours, then this may be for you. We will welcome contributions and comments once we get it off the ground. (We are calling it pyxis by the way). Enthought is continuing to work on chaco and at some point that will be mature, and will be capable of some sophisticated things. That may be more appropriate for some than what we are working on. Perry Greenfield From SKuzminski at fairisaac.com Wed Jan 21 05:06:02 2004 From: SKuzminski at fairisaac.com (Kuzminski, Stefan R) Date: Wed Jan 21 05:06:02 2004 Subject: [Numpy-discussion] Status of Numeric (and plotting in particular) Message-ID: <7646464ACC9B5347A4A5C57729D74A5503540DB4@srfmsg100.corp.fairisaac.com> I'm working on a commercial product that produces publication quality plots from data contained in Numeric arrays. I also concluded that Chaco was a bit more involved than I needed. My question is what requirements are not met by the other available plotting packages such as http://matplotlib.sourceforge.net/ These don't have every bell and whistle (esp. when it comes to the interactive 'properties' dialog) but as you point out there is a dark side to those features. There are a number of quite capable plotting packages for Python; diversity is good up to a point, but this space (plotting packages) seems ripe for a shakeout. 
Stefan 
From jh at oobleck.astro.cornell.edu Wed Jan 21 10:45:02 2004 From: jh at oobleck.astro.cornell.edu (Joe Harrington) Date: Wed Jan 21 10:45:02 2004 Subject: [Numpy-discussion] the direction and pace of development Message-ID: <200401211844.i0LIikEA004114@oobleck.astro.cornell.edu> This is a necessarily long post about the path to an open-source replacement for IDL and Matlab. While I have tried to be fair to those who have contributed much more than I have, I have also tried to be direct about what I see as some fairly fundamental problems in the way we're going about this. I've given it some section titles so you can navigate, but I hope that you will read the whole thing before posting a reply. I fear that this will offend some people, but please know that I value all your efforts, and offense is not my intent. THE PAST VS. NOW While there is significant and dedicated effort going into numeric/numarray/scipy, it's becoming clear that we are not progressing quickly toward a replacement for IDL and Matlab. I have great respect for all those contributing to the code base, but I think the present discussion indicates some deep problems. If we don't identify those problems (easy) and solve them (harder, but not impossible), we will continue not to have the solution so many people want. To be convinced that we are doing something wrong at a fundamental level, consider that Python was the clear choice for a replacement in 1996, when Paul Barrett and I ran a BoF at ADASS VI on interactive data analysis environments. That was over 7 years ago. When people asked at that conference, "what does Python need to replace IDL or Matlab", the answer was clearly "stable interfaces to basic numerics and plotting; then we can build it from there following the open-source model". Work on both these problems was already well underway then. Now, both the numerical and plotting development efforts have branched. There is still no stable base upon which to build. There aren't even packages for popular OSs that people can install and play with. The problem is not that we don't know how to do numerics or graphics; if anything, we know these things too well. In 1996, if anyone had told us that in 2004 there would be no ready-to-go replacement system because of a factor of 4 in small array creation overhead (on computers that ran 100x as fast as those then available) or the lack of interactive editing of plots at video speeds, the response would not have been pretty. How would you have felt? THE PROBLEM We are not following the open-source development model. Rather, we pay lip service to it. Open source's development mantra is "release early, release often". This means release to the public, for use, a package that has core capability and reasonably-defined interfaces. Release it in a way that as many people as possible will get it, install it, use it for real work, and contribute to it. Make the main focus of the core development team the evaluation and inclusion of contributions from others. 
Develop a common vision for the program, and use that vision to make decisions and keep efforts focused. Include contributing developers in decision making, but do make decisions and move on from them. Instead, there are no packages for general distribution. The basic interfaces are unstable, and not even being publicly debated to decide among them (save for the past 3 days). The core developers seem to spend most of their time developing, mostly out of view of the potential user base. I am asked probably twice a week by different fellow astronomers when an open-source replacement for IDL will be available. They are mostly unaware that this effort even exists. However, this indicates that there are at least hundreds of potential contributors of application code in astronomy alone, as I don't nearly know everyone. The current efforts look rather more like the GNU project than Linux. I'm sorry if that hurts, but it is true. I know that Perry's group at STScI and the fine folks at Enthought will say they have to work on what they are being paid to work on. Both groups should consider the long term cost, in dollars, of spending those development dollars 100% on coding, rather than 50% on coding and 50% on outreach and intake. Linus himself has written only a small fraction of the Linux kernel, and almost none of the applications, yet in much less than 7 years Linux became a viable operating system, something much bigger than what we are attempting here. He couldn't have done that himself, for any amount of money. We all know this. THE PATH Here is what I suggest: 1. We should identify the remaining open interface questions. Not, "why is numeric faster than numarray", but "what should the syntax of creating an array be, and of doing different basic operations". If numeric and numarray are in agreement on these issues, then we can move on, and debate performance and features later. 2. We should identify what we need out of the core plotting capability. Again, not "chaco vs. pyxis", but the list of requirements (as an astronomer, I very much like Perry's list). 3. We should collect or implement a very minimal version of the featureset, and document it well enough that others like us can do simple but real tasks to try it out, without reading source code. That documentation should include lists of things that still need to be done. 4. We should release a stand-alone version of the whole thing in the formats most likely to be installed by users on the four most popular OSs: Linux, Windows, Mac, and Solaris. For Linux, this means .rpm and .deb files for Fedora Core 1 and Debian 3.0r2. Tarballs and CVS checkouts are right out. We have seen that nobody in the real world installs them. To be most portable and robust, it would make sense to include the Python interpreter, named such that it does not stomp on versions of Python in the released operating systems. Static linking likewise solves a host of problems and greatly reduces the number of package variants we will have to maintain. 5. We should advertize and advocate the result at conferences and elsewhere, being sure to label it what it is: a first-cut effort designed to do a few things well and serve as a platform for building on. We should also solicit and encourage people either to work on the included TODO lists or to contribute applications. One item on the TODO list should be code converters from IDL and Matlab to Python, and compatibility libraries. 6. 
We should then all continue to participate in the discussions and development efforts that appeal to us. We should keep in mind that evaluating and incorporating code that comes in is in the long run much more efficient than writing the universe ourselves. 7. We should cut and package new releases frequently, at least once every six months. It is better to delay a wanted feature by one release than to hold a release for a wanted feature. The mountain is climbed in small steps. The open source model is successful because it follows closely something that has worked for a long time: the scientific method, with its community contributions, peer review, open discussion, and progress mainly in small steps. Once basic capability is out there, we can twiddle with how to improve things behind the scenes. IS SCIPY THE WAY? The recipe above sounds a lot like SciPy. SciPy began as a way to integrate the necessary add-ons to numeric for real work. It was supposed to test, document, and distribute everything together. I am aware that there are people who use it, but the numbers are small and they seem to be tightly connected to Enthought for support and application development. Enthought's focus seems to be on servicing its paying customers rather than on moving SciPy development along, and I fear they are building an installed customer base on interfaces that were not intended to be stable. So, I will raise the question: is SciPy the way? Rather than forking the plotting and numerical efforts from what SciPy is doing, should we not be creating a new effort to do what SciPy has so far not delivered? These are not rhetorical or leading questions. I don't know enough about the motivations, intentions, and resources of the folks at Enthought (and elsewhere) to know the answer. I do think that such a fork will occur unless SciPy's approach changes substantially. The way to decide is for us all to discuss the question openly on these lists, and for those willing to participate and contribute effort to declare so openly. I think all that is needed, either to help SciPy or replace it, is some leadership in the direction outlined above. I would be interested in hearing, perhaps from the folks at Enthought, alternative points of view. Why are there no packages for popular OSs for SciPy 0.2? Why are releases so infrequent? If the folks running the show at scipy.org disagree with many others on these lists, then perhaps those others would like to roll their own. Or, perhaps stable/testing/unstable releases of the whole package are in order. HOW TO CONTRIBUTE? Judging by the number of PhDs in sigs, there are a lot of researchers on this list. I'm one, and I know that our time for doing core development or providing the aforementioned leadership is very limited, if not zero. Later we will be in a much better position to contribute application software. However, there is a way we can contribute to the core effort even if we are not paid, and that is to put budget items in grant and project proposals to support the work of others. Those others could be either our own employees or subcontractors at places like Enthought or STScI. A handful of contributors would be all we'd need to support someone to produce OS packages and tutorial documentation (the stuff core developers find boring) for two releases a year. 
--jh-- From jwp at psychology.nottingham.ac.uk Wed Jan 21 11:10:04 2004 From: jwp at psychology.nottingham.ac.uk (Jon Peirce) Date: Wed Jan 21 11:10:04 2004 Subject: [Numpy-discussion] re: Status of Numeric (and plotting in particular) In-Reply-To: References: Message-ID: <400ECE6C.30106@psychology.nottingham.ac.uk> > > >We have started on this over the past month, and hope to have some >simple >functionality available within a month (though when we make it public >may >take a bit longer). It will be open source and we hope significantly >simpler >than chaco. It will not focus on speed (well, we want fairly fast >display times >for plots of a reasonable number of points, but we don't need video >refresh >rates). If your interest in plotting matches ours, then this may be for >you. >We will welcome contributions and comments once we get it off the >ground. >(We are calling it pyxis by the way). > I agree with the sentiment that chaco is a very heavy and confusing package for the average scientist (but maybe great for the full-time programmer) but I'm really concerned about the idea that we need *another* solution started from scratch. There are already so many including scipy.gplt, scipy.plt, dislin, biggles, pychart, piddle, pgplot, pyx (new)... In particular MatPlotLib looks promising - check out its examples: http://matplotlib.sourceforge.net/screenshots.html *Many* plotting types already, simple syntax, a few different backends. And already has something of a following. So is it really not possible for STScI to push its resources into aiding the development of something that's already begun? Would be great if we could develop a single package really well rather than everyone making their own. -- Jon Peirce Nottingham University +44 (0)115 8467176 (tel) +44 (0)115 9515324 (fax) http://www.psychology.nottingham.ac.uk/staff/jwp/ From perry at stsci.edu Wed Jan 21 12:07:01 2004 From: perry at stsci.edu (Perry Greenfield) Date: Wed Jan 21 12:07:01 2004 Subject: [Numpy-discussion] re: Status of Numeric (and plotting in particular) In-Reply-To: <400ECE6C.30106@psychology.nottingham.ac.uk> Message-ID: Jon Peirce writes: > > I agree with the sentiment that chaco is a very heavy and confusing > package for the average scientist (but maybe great for the full-time > programmer) but I'm really concerned about the idea that we need > *another* solution started from scratch. There are already so many > including scipy.gplt, scipy.plt, dislin, biggles, pychart, piddle, > pgplot, pyx (new)... > We had looked at all of these and each had fallen short in some major way (though I thought piddle had much promise and perhaps could be built on; however it was intended as a back end only.) > In particular MatPlotLib looks promising - check out its examples: > http://matplotlib.sourceforge.net/screenshots.html > *Many* plotting types already, simple syntax, a few different backends. > And already has something of a following. > This we had not seen. A superficial look indicates that it is worth investigating further as a basis for a plotting package. I didn't see any major problem with it that contradicted our requirements, but obviously we will have to look at it in more depth to see if that is the case. It doesn't have to be perfect of course. And it is much more expensive to start from scratch (though we weren't doing that entirely since a number of components from the chaco effort would have been reused). But this is worth seriously considering. 
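For comparison, the style of short script matplotlib was offering at the time looked roughly like the sketch below. It assumes the matlab-style procedural interface of the early releases (later renamed pylab/pyplot), so the exact import line and function spellings may differ between versions.

    # Minimal matlab-style plot with early matplotlib; the names here are
    # assumed from the 0.x procedural interface and may vary by version.
    from matplotlib.matlab import plot, xlabel, ylabel, title, show

    x = [0.1 * i for i in range(100)]
    y = [xi * xi for xi in x]

    plot(x, y)                    # line plot of y against x
    xlabel('x')
    ylabel('x squared')
    title('a minimal matplotlib example')
    show()                        # bring up the interactive window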
Perry Greenfield > So is it really not possible for STScI to push its resources into aiding > the development of something that's already begun? Would be great if we > could develop a single package really well rather than everyone making > their own. > From magnus at hetland.org Wed Jan 21 12:53:06 2004 From: magnus at hetland.org (Magnus Lie Hetland) Date: Wed Jan 21 12:53:06 2004 Subject: [Numpy-discussion] re: Status of Numeric (and plotting in particular) In-Reply-To: References: <400ECE6C.30106@psychology.nottingham.ac.uk> Message-ID: <20040121205248.GA24551@idi.ntnu.no> Perry Greenfield : > > Jon Peirce writes: > > > > I agree with the sentiment that chaco is a very heavy and confusing > > package for the average scientist (but maybe great for the full-time > > programmer) but I'm really concerned about the idea that we need > > *another* solution started from scratch. There are already so many > > including scipy.gplt, scipy.plt, dislin, biggles, pychart, piddle, > > pgplot, pyx (new)... > > > We had looked all of these and each had fallen short in some major > way (though I thought piddle had much promise and perhaps could be > built on; however it was intended as a back end only.) Wohoo! Piddle lives ;) I think I'd be interested in resuming some of my earlier work on Piddle if it is ever used for something useful -- such as a proper plotting tool. (I was actually just thinking about wrapping PyX in the Piddle interface to make TeX typesetting available in Piddle.) [snip about mathplotlib] Hm. Maybe a Piddle back-end could be written for it (which would instantly give it lots of extra back-ends)...? Two birds with one stone and all that... - M -- Magnus Lie Hetland "The mind is not a vessel to be filled, http://hetland.org but a fire to be lighted." [Plutarch] From jmiller at stsci.edu Wed Jan 21 13:21:05 2004 From: jmiller at stsci.edu (Todd Miller) Date: Wed Jan 21 13:21:05 2004 Subject: [Numpy-discussion] How fast are small arrays currently? In-Reply-To: References: Message-ID: <1074719906.1424.251.camel@halloween.stsci.edu> > > Why are numarrays so slow to create? > > There are several portable ways to create numarrays (array(), arange(), zeros(), ones()) and I'm not really sure which one to address, so I poked around some. I discovered that numarray-0.8 has a problem with array() which causes very poor performance (~30x slower than Numeric) for arrays created from a sequence. The problem is with a private Python function, _all_arrays(), that scans the sequence to see if it consists only of arrays; _all_arrays() works badly for the ordinary case of a sequence of numbers. This is fixed now in CVS. Beyond this flaw in array(), it's a mixed bag, with numarray tending to do well with large arrays and certain use cases, and Numeric doing well with small arrays and other use cases. Todd > I'll leave it to Todd to give the details of that. > > > > > ------------------------------------------------------- > The SF.Net email is sponsored by EclipseCon 2004 > Premiere Conference on Open Tools Development and Integration > See the breadth of Eclipse activity. February 3-5 in Anaheim, CA. 
> http://www.eclipsecon.org/osdn > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion -- Todd Miller Space Telescope Science Institute 3700 San Martin Drive Baltimore MD, 21030 (410) 338 - 4576 From hinsen at cnrs-orleans.fr Wed Jan 21 13:27:00 2004 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Wed Jan 21 13:27:00 2004 Subject: [Numpy-discussion] the direction and pace of development In-Reply-To: <200401211844.i0LIikEA004114@oobleck.astro.cornell.edu> References: <200401211844.i0LIikEA004114@oobleck.astro.cornell.edu> Message-ID: <6DBBCCFD-4C58-11D8-8519-000A95AB5F10@cnrs-orleans.fr> On 21.01.2004, at 19:44, Joe Harrington wrote: > This is a necessarily long post about the path to an open-source > replacement for IDL and Matlab. While I have tried to be fair to You raise many good points here. Some comments: > those who have contributed much more than I have, I have also tried to > be direct about what I see as some fairly fundamental problems in the > way we're going about this. I've given it some section titles so you I'd say the fundamental problem is that "we" don't exist as a coherent group. There are a few developer groups (e.g. at STScI and Enthought) who write code primarily for their own needs and then make it available. The rest of us are what one could call "power users": very interested in the code, knowledgeable about its use, but not contributing to its development other than through testing and feedback. > THE PROBLEM > > We are not following the open-source development model. Rather, we True. But is it perhaps because that model is not so well adapted to our situation? If you look at Linux (the OpenSource reference), it started out very differently. It was a fun project, done by hobby programmers who shared an idea of fun (kernel hacking). Linux was not goal-oriented in the beginning. No deadlines, no usability criteria, but lots of technical challenges. Our situation is very different. We are scientists and engineers who want code to get our projects done. We have clear goals, and very limited means, plus we are mostly someone's employees and thus not free to do as we would like. On the other hand, our project doesn't provide the challenges that attract the kind of people who made Linux big. You don't get into the news by working on NumPy, you don't work against Microsoft, etc. Computational science and engineering just isn't the same as kernel hacking. I develop two scientific Python libraries myself, more specialized and thus with a smaller market share, but the situation is otherwise similar. And I work much like the Numarray people do: I write the code that I need, and I invest minimal effort in distribution and marketing. To get the same code developed in the Linux fashion, there would have to be many more developers. But they just don't exist. I know of three people worldwide whose competence in both Python/C and in the application domain is good enough that they could work on the code base. This is not enough to build a networked development community. The potential NumPy community is certainly much bigger, but I am not sure it is big enough. Working on NumPy/Numarray requires the combination of not-so-frequent competences, plus availability. I am not saying it can't be done, but it sure isn't obvious that it can be. 
> Release it in a way that as many people as possible will get it, > install it, use it for real work, and contribute to it. Make the main > focus of the core development team the evaluation and inclusion of > contributions from others. Develop a common vision for the program, This requires yet different competences, and thus different people. It takes people who are good at reading others' code and communicating with them about it. Some people are good programmers, some are good scientists, some are good communicators. How many are all of that - *and* available? > I know that Perry's group at STScI and the fine folks at Enthought > will say they have to work on what they are being paid to work on. > Both groups should consider the long term cost, in dollars, of > spending those development dollars 100% on coding, rather than 50% on > coding and 50% on outreach and intake. Linus himself has written only You are probably right. But does your employer think long-term? Mine doesn't. > applications, yet in much less than 7 years Linux became a viable > operating system, something much bigger than what we are attempting Exactly. We could be too small to follow the Linux way. > 1. We should identify the remaining open interface questions. Not, > "why is numeric faster than numarray", but "what should the syntax > of creating an array be, and of doing different basic operations". Yes, a very good point. Focus on the goal, not on the legacy code. However, a technical detail that should not be forgotten here: NumPy and Numarray have a C API as well, which is critical for many add-ons and applications. A C API is more closely tied to the implementation than a Python API. It might thus be difficult to settle on an API and then work on efficient implementations. > 2. We should identify what we need out of the core plotting > capability. Again, not "chaco vs. pyxis", but the list of > requirements (as an astronomer, I very much like Perry's list). 100% agreement. For plotting, defining the interface should be easier (no C stuff). Konrad. From perry at stsci.edu Wed Jan 21 13:29:01 2004 From: perry at stsci.edu (Perry Greenfield) Date: Wed Jan 21 13:29:01 2004 Subject: [Numpy-discussion] the direction and pace of development In-Reply-To: <200401211844.i0LIikEA004114@oobleck.astro.cornell.edu> Message-ID: Joe Harrington writes: > > This is a necessarily long post about the path to an open-source > replacement for IDL and Matlab. While I have tried to be fair to > those who have contributed much more than I have, I have also tried to > be direct about what I see as some fairly fundamental problems in the > way we're going about this. I've given it some section titles so you > can navigate, but I hope that you will read the whole thing before > posting a reply. I fear that this will offend some people, but please > know that I value all your efforts, and offense is not my intent. > No offense taken. [...] > THE PROBLEM > > We are not following the open-source development model. Rather, we > pay lip service to it. Open source's development mantra is "release > early, release often". This means release to the public, for use, a > package that has core capability and reasonably-defined interfaces. > Release it in a way that as many people as possible will get it, > install it, use it for real work, and contribute to it. Make the main > focus of the core development team the evaluation and inclusion of > contributions from others. 
Develop a common vision for the program, > and use that vision to make decisions and keep efforts focused. > Include contributing developers in decision making, but do make > decisions and move on from them. > > Instead, there are no packages for general distribution. The basic > interfaces are unstable, and not even being publicly debated to decide > among them (save for the past 3 days). The core developers seem to > spend most of their time developing, mostly out of view of the > potential user base. I am asked probably twice a week by different > fellow astronomers when an open-source replacement for IDL will be > available. They are mostly unaware that this effort even exists. > However, this indicates that there are at least hundreds of potential > contributors of application code in astronomy alone, as I don't nearly > know everyone. The current efforts look rather more like the GNU > project than Linux. I'm sorry if that hurts, but it is true. > I'd both agree with this and disagree. Agree in the sense that many agree these are desirable traits of an open source project. Disagree in the sense that many projects don't meet all of these traits, and yet may be useful to some degree. Even Python is not released often, nor is it generally packaged by the core group. You will find packaging by special interest groups that may or may not be up to date for various platforms. There is a whole spectrum of other, useful open source projects that don't satisfy these requirements. I don't mean that in a defensive way; it's certainly fair to ask what is going wrong in the Python numeric world, but doing the above alone doesn't necessarily guarantee that you will be successful in attracting feedback and contributions; there are other factors as well that influence how a project develops. We have had experience with the packaging issue for PyRAF, and it isn't quite so simple; the binary package approach didn't always make life simpler for the user (arguably, we have found the source distribution approach more trouble-free than our original release). Having one's own version of python packaged as a binary raises issues with LD_LIBRARY_PATH that there are just no good solutions to. > I know that Perry's group at STScI and the fine folks at Enthought > will say they have to work on what they are being paid to work on. > Both groups should consider the long term cost, in dollars, of > spending those development dollars 100% on coding, rather than 50% on > coding and 50% on outreach and intake. Linus himself has written only > a small fraction of the Linux kernel, and almost none of the > applications, yet in much less than 7 years Linux became a viable > operating system, something much bigger than what we are attempting > here. He couldn't have done that himself, for any amount of money. > We all know this. > I'd say we have tried our best to solicit input (and accept contributed code as well). You have to remember that how easily contributions come depends on what the critical mass is for usefulness. For something like numarray or Numeric, that critical mass is quite large. Few are interested in contributing when it can do very little and an older package exists that can do more. By the time it has comparable functionality, it is already quite large. A lot of projects like that start with a small group before more join in. There are others where the critical mass is low and many join in when functionality is still relatively low. > THE PATH > Here is what I suggest: > 1. 
We should identify the remaining open interface questions. Not, > "why is numeric faster than numarray", but "what should the syntax > of creating an array be, and of doing different basic operations". > If numeric and numarray are in agreement on these issues, then we > can move on, and debate performance and features later. > Well, there are, and continue to be, those who can't come to an agreement on even the interface. These issues have been raised many times in the past. Often consensus was hard to achieve. We tended to lean towards backward compatibility unless the change seemed really necessary. For type coercion and error handling, we thought it was. But I don't think we have tried to shield the decision making process from the community. I do think the difficulty in achieving a sense of consensus is a problem. Perhaps we are going about the process in the wrong way; I'd welcome suggestions as to how to improve that. > 2. We should identify what we need out of the core plotting > capability. Again, not "chaco vs. pyxis", but the list of > requirements (as an astronomer, I very much like Perry's list). > > 3. We should collect or implement a very minimal version of the > featureset, and document it well enough that others like us can do > simple but real tasks to try it out, without reading source code. > That documentation should include lists of things that still need > to be done. > > 4. We should release a stand-alone version of the whole thing in the > formats most likely to be installed by users on the four most > popular OSs: Linux, Windows, Mac, and Solaris. For Linux, this > means .rpm and .deb files for Fedora Core 1 and Debian 3.0r2. > Tarballs and CVS checkouts are right out. We have seen that nobody > in the real world installs them. To be most portable and robust, > it would make sense to include the Python interpreter, named such > that it does not stomp on versions of Python in the released > operating systems. Static linking likewise solves a host of > problems and greatly reduces the number of package variants we will > have to maintain. > Static linking also introduces other problems. And we have gone this route in the past so we have some knowledge of what it entails. > 5. We should advertize and advocate the result at conferences and > elsewhere, being sure to label it what it is: a first-cut effort > designed to do a few things well and serve as a platform for > building on. We should also solicit and encourage people either to > work on the included TODO lists or to contribute applications. One > item on the TODO list should be code converters from IDL and Matlab > to Python, and compatibility libraries. > > 6. We should then all continue to participate in the discussions and > development efforts that appeal to us. We should keep in mind that > evaluating and incorporating code that comes in is in the long run > much more efficient than writing the universe ourselves. > > 7. We should cut and package new releases frequently, at least once > every six months. It is better to delay a wanted feature by one > release than to hold a release for a wanted feature. The mountain > is climbed in small steps. > > The open source model is successful because it follows closely > something that has worked for a long time: the scientific method, with > its community contributions, peer review, open discussion, and > progress mainly in small steps. Once basic capability is out there, > we can twiddle with how to improve things behind the scenes. 
> In general, I can't disagree much with most of these. I'm happy for others to smack us when we are going away from this sort of process. Please do; it would be the only way we (and others) would learn how to really do it. But we have released fairly frequently, if not with rpms. We do provide pretty good support as well. We have incorporated most of the code sent to us, and considered and implemented many feature requests and performance improvements. But the numarray core is not something one would casually change without spending some time understanding how it works; I suspect that is the biggest inhibitor to changes to the core. We are happy to work with others on it if they have the time to do so. If anyone feels we have discouraged people from contributing, please let me know (privately if you wish). > IS SCIPY THE WAY? > > The recipe above sounds a lot like SciPy. SciPy began as a way to > integrate the necessary add-ons to numeric for real work. It was > supposed to test, document, and distribute everything together. I am > aware that there are people who use it, but the numbers are small and > they seem to be tightly connected to Enthought for support and > application development. Enthought's focus seems to be on servicing > its paying customers rather than on moving SciPy development along, > and I fear they are building an installed customer base on interfaces > that were not intended to be stable. > I don't feel this is fair to Enthought. It is not my impression that they have made any money off of the scipy distribution directly (Chaco is a different issue). As far as I can tell, the only benefit they've generally gotten from it is from the visibility of sponsoring it, and perhaps from their own use of a few of the tools they have included as part of it. I doubt that their own clients have driven its development in any significant way. I'd guess they have sunk far more money into scipy than gotten out of it. I don't want others to get the impression that it is the other way around. In fact, on a number of occasions I have heard users complain about the documentation, and the standard response of "please help us improve it" has brought very little in response. They have gone the extra mile in soliciting contributions and help in maintaining it. Perhaps it is part of my open source blind spot, but I have trouble seeing what else they could be doing to encourage others to contribute to scipy (besides paying them; which they have done as well!). The only thing I can think of is that because they are doing it, others feel that they don't need to. Perhaps there is a similar issue with numarray. I don't know. > So, I will raise the question: is SciPy the way? Rather than forking > the plotting and numerical efforts from what SciPy is doing, should we > not be creating a new effort to do what SciPy has so far not > delivered? These are not rhetorical or leading questions. I don't > know enough about the motivations, intentions, and resources of the > folks at Enthought (and elsewhere) to know the answer. I do think > that such a fork will occur unless SciPy's approach changes > substantially. The way to decide is for us all to discuss the > question openly on these lists, and for those willing to participate > and contribute effort to declare so openly. I think all that is > needed, either to help SciPy or replace it, is some leadership in the > direction outlined above. I would be interested in hearing, perhaps > from the folks at Enthought, alternative points of view. 
Why are > there no packages for popular OSs for SciPy 0.2? Why are releases so > infrequent? If the folks running the show at scipy.org disagree with > many others on these lists, then perhaps those others would like to > roll their own. Or, perhaps stable/testing/unstable releases of the > whole package are in order. > I think the answer is simple. Supporting distributions of the software they have pulled into scipy is a hell of a lot of work; work that nobody is paying them for. It gives me the shivers to think of our taking on all they have for scipy. > HOW TO CONTRIBUTE? > > Judging by the number of PhDs in sigs, there are a lot of researchers > on this list. I'm one, and I know that our time for doing core > development or providing the aforementioned leadership is very > limited, if not zero. Later we will be in a much better position to > contribute application software. However, there is a way we can > contribute to the core effort even if we are not paid, and that is to > put budget items in grant and project proposals to support the work of > others. Those others could be either our own employees or > subcontractors at places like Enthought or STScI. A handful of > contributors would be all we'd need to support someone to produce OS > packages and tutorial documentation (the stuff core developers find > boring) for two releases a year. > By all means, if there is a groundswell of support for development, please let us know. Perry Greenfield From tim.hochberg at ieee.org Wed Jan 21 15:24:00 2004 From: tim.hochberg at ieee.org (Tim Hochberg) Date: Wed Jan 21 15:24:00 2004 Subject: [Numpy-discussion] Status of Numeric In-Reply-To: References: Message-ID: <400F09C3.9020708@ieee.org> Arthur wrote: [SNIP] > Which, to me, seems like a worthy goal. > > On the other hand, it would seem that the goal of something to move > into the core would be performance optimized at the range of array > size most commonly encountered. Rather than for the extraordinary, > which seems to be the goal of numarray, responding to specific needs > of the numarray development team's applications. I'm not sure where you came up with this, but it's wrong on at least two counts. The first is that, last I heard, the crossover point where Numarray becomes faster than Numeric is about 2000 elements. It would be nice if that became smaller, but I certainly wouldn't call it extreme. In fact I'd venture that the majority of cases where numeric operations are a bottleneck would already be faster under Numarray. In my experience, while it's not uncommon to use short arrays, it is rare for them to be a bottleneck. The second point is that the relative speediness of Numeric at low array sizes is the result of nearly all of it being implemented in C, whereas much of Numarray is implemented in Python. This results in a larger overhead for Numarray, which is why it's slower for small arrays. As I understand it, the decision to base most of Numarray in Python was driven by maintainability; it wasn't an attempt to optimize large arrays at the expense of small ones. > Has the core Python development team given out clues about their > feelings/requirements for a move of either Numeric or numarray into > the core? I believe that one major requirement was that the numeric community come to a consensus on an array package and be willing to support it in the core. There may be other stuff. > It concerns me that this thread isn't trafficked. I suspect that most of the exchange has taken place on numpy-discussion at lists.sourceforge.net. 
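To put rough numbers on the crossover being discussed, here is a small timing sketch (not taken from the thread) of the kind of comparison involved. It assumes both Numeric and numarray are importable; absolute figures depend heavily on machine, package version, and operation, and the largest size takes a few seconds to run.

    # Compare elementwise addition and array construction from a Python list
    # for Numeric and numarray across a range of sizes.
    import timeit

    for n in (10, 100, 2000, 100000):
        for mod in ("Numeric", "numarray"):
            setup = ("import %s as N; lst = range(%d); "
                     "a = N.arange(%d); b = N.arange(%d)" % (mod, n, n, n))
            t_add = timeit.Timer("a + b", setup).timeit(number=1000)
            t_new = timeit.Timer("N.array(lst)", setup).timeit(number=1000)
            print "%-9s n=%-7d add: %8.1f usec   array(list): %8.1f usec" % (
                mod, n, t_add * 1000, t_new * 1000)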
[SNIP] -tim From rkern at ucsd.edu Wed Jan 21 15:42:01 2004 From: rkern at ucsd.edu (Robert Kern) Date: Wed Jan 21 15:42:01 2004 Subject: [Numpy-discussion] Status of Numeric In-Reply-To: <400F09C3.9020708@ieee.org> References: <400F09C3.9020708@ieee.org> Message-ID: <20040121234108.GA4602@taliesen.ucsd.edu> On Wed, Jan 21, 2004 at 04:22:43PM -0700, Tim Hochberg wrote: [snip] > The second point is that the relative speediness of Numeric at low array > sizes is the result of nearly all of it being implemented in C, whereas > much of Numarray is implemented in Python. This results in a larger > overhead for Numarray, which is why it's slower for small arrays. As I > understand it, the decision to base most of Numarray in Python was > driven by maintainability; it wasn't an attempt to optimize large arrays > at the expense of small ones. Has the numarray team (or anyone else for that matter) looked at using Pyrex[1] to implement any part of numarray? If not, then that's my next free-time experiment (i.e. avoiding homework while still looking productive at the office). [1] http://www.cosc.canterbury.ac.nz/~greg/python/Pyrex/ -- Robert Kern rkern at ucsd.edu "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From oliphant at ee.byu.edu Wed Jan 21 16:00:01 2004 From: oliphant at ee.byu.edu (Travis E. Oliphant) Date: Wed Jan 21 16:00:01 2004 Subject: [Numpy-discussion] Comments on the Numarray/Numeric discussion In-Reply-To: <6DBBCCFD-4C58-11D8-8519-000A95AB5F10@cnrs-orleans.fr> References: <200401211844.i0LIikEA004114@oobleck.astro.cornell.edu> <6DBBCCFD-4C58-11D8-8519-000A95AB5F10@cnrs-orleans.fr> Message-ID: <400F1289.2040403@ee.byu.edu> I would like to thank the contributors to the discussion, as I think one of the problems we have had lately is that people haven't been talking much, partly because we have some fundamental differences of opinion caused by different goals and partly because we are all busy working on a variety of other pressing projects. The impression has been that Numarray will replace Numeric. I agree with Perry that this has always been less of a consensus and more of a hope. I am more than happy for Numarray to replace Numeric as long as it doesn't mean all my code slows down. I would say the threshold is that my code can't slow down by more than a factor of 10%. If there is a code-base out there (Numeric) that can allow my code to run 10% faster it will get used. I also don't think it's ideal to have multiple N-D arrays running around out there, but if they all have the same interface then it doesn't really matter. The two major problems I see with Numarray replacing Numeric are 1) How is UFunc support? Can you create ufuncs in C easily (with a single function call or something similar)? 2) Speed for small arrays (array creation is the big one). It is actually quite a common thing to have a loop during which many small arrays get created and destroyed. Yes, you can usually make such code faster by "vectorizing" (if you can figure out how). But the average scientist just wants to (and should be able to) write a loop. Regarding speed issues: actually, there are situations where I am very unsatisfied with Numeric's speed performance, and so the goal for Numarray should not be to achieve some percentage of Numeric's performance but to beat it. Frankly, I don't see how you can get the speed that I'm talking about by carrying around a lot of extras like byte-swapping support, memory-mapping support, record-array support. 
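A small illustration (not from the message above) of the loop-versus-vectorized pattern being described: the first form creates and destroys a small array on every pass, the second does the same work with whole-array operations. It is written against Numeric; numarray spells these operations the same way.

    import Numeric as N

    data = N.reshape(N.arange(10000.0), (1000, 10))   # 1000 small rows of 10 values

    # Loop form: every iteration builds and throws away small temporary arrays.
    norms_loop = []
    for i in range(data.shape[0]):
        row = data[i]                                  # a 10-element array
        norms_loop.append(N.sqrt(N.sum(row * row)))

    # Vectorized form: one pass over the whole array, no per-row temporaries.
    norms_vec = N.sqrt(N.sum(data * data, 1))          # sum along axis 1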
*Question*: Is there some way to turn on a flag in Numarray so that all of the extra stuff is ignored (i.e. create a small-array that looks on a binary level just like a Numeric array) ? It would seem to me that this is the only way that the speed issue will go away. Given that 1) Numeric already works and given that all of my code depends on it 2) Numarray doesn't seem to have support for general purpose ufunctions (can the scipy.special package be ported to numarray?) 3) Numarray is slower for the common tasks I end up using SciPy for and 4) I actually understand the Numeric code base quite well I have a hard time justifying switching over to Numarray. Thanks again for the comments. -Travis O. Konrad Hinsen wrote: > On 21.01.2004, at 19:44, Joe Harrington wrote: > >> This is a necessarily long post about the path to an open-source >> replacement for IDL and Matlab. While I have tried to be fair to > > > You raise many good points here. Some comments: > >> those who have contributed much more than I have, I have also tried to >> be direct about what I see as some fairly fundamental problems in the >> way we're going about this. I've given it some section titles so you > > > I'd say the fundamental problem is that "we" don't exist as a coherent > group. There are a few developer groups (e.g. at STSC and Enthought) who > write code primarily for their own need and then make it available. The > rest of us are what one could call "power users": very interested in the > code, knowledgeable about its use, but not contributing to its > development other than through testing and feedback. > >> THE PROBLEM >> >> We are not following the open-source development model. Rather, we > > > True. But is it perhaps because that model is not so well adapted to our > situation? If you look at Linux (the OpenSource reference), it started > out very differently. It was a fun project, done by hobby programmers > who shared an idea of fun (kernel hacking). Linux was not goal-oriented > in the beginnings. No deadlines, no usability criteria, but lots of > technical challenges. > > Our situation is very different. We are scientists and engineers who > want code to get our projects done. We have clear goals, and very > limited means, plus we are mostly somone's employees and thus not free > to do as we would like. On the other hand, our project doesn't provide > the challenges that attract the kind of people who made Linux big. You > don't get into the news by working on NumPy, you don't work against > Microsoft, etc. Computational science and engineering just isn't the > same as kernel hacking. > > I develop two scientific Python libraries myself, more specialized and > thus with a smaller market share, but the situation is otherwise > similar. And I work much like the Numarray people do: I write the code > that I need, and I invest minimal effort in distribution and marketing. > To get the same code developped in the Linux fashion, there would have > to be many more developers. But they just don't exist. I know of three > people worldwide whose competence in both Python/C and in the > application domain is good enough that they could work on the code base. > This is not enough to build a networked development community. The > potential NumPy community is certainly much bigger, but I am not sure it > is big enough. Working on NumPy/Numarray requires the combination of > not-so-frequent competences, plus availability. I am not saying it can't > be done, but it sure isn't obvious that it can be. 
> >> Release it in a way that as many people as possible will get it, >> install it, use it for real work, and contribute to it. Make the main >> focus of the core development team the evaluation and inclusion of >> contributions from others. Develop a common vision for the program, > > > This requires yet different competences, and thus different people. It > takes people who are good at reading others' code and communicating with > them about it. > Some people are good programmers, some are good scientists, some are > good communicators. How many are all of that - *and* available? > >> I know that Perry's group at STScI and the fine folks at Enthought >> will say they have to work on what they are being paid to work on. >> Both groups should consider the long term cost, in dollars, of >> spending those development dollars 100% on coding, rather than 50% on >> coding and 50% on outreach and intake. Linus himself has written only > > > You are probably right. But does your employer think long-term? Mine > doesn't. > >> applications, yet in much less than 7 years Linux became a viable >> operating system, something much bigger than what we are attempting > > > Exactly. We could be too small to follow the Linux way. > >> 1. We should identify the remaining open interface questions. Not, >> "why is numeric faster than numarray", but "what should the syntax >> of creating an array be, and of doing different basic operations". > > > Yes, a very good point. Focus on the goal, not on the legacy code. > However, a technical detail that should not be forgotten here: NumPy and > Numarray have a C API as well, which is critical for many add-ons and > applications. A C API is more closely tied to the implementation than a > Python API. It might thus be difficult to settle on an API and then work > on efficient implementations. > >> 2. We should identify what we need out of the core plotting >> capability. Again, not "chaco vs. pyxis", but the list of >> requirements (as an astronomer, I very much like Perry's list). > > > 100% agreement. For plotting, defining the interface should be easier > (no C stuff). > > Konrad. > > > > ------------------------------------------------------- > The SF.Net email is sponsored by EclipseCon 2004 > Premiere Conference on Open Tools Development and Integration > See the breadth of Eclipse activity. February 3-5 in Anaheim, CA. > http://www.eclipsecon.org/osdn > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion From paul at prescod.net Wed Jan 21 20:22:03 2004 From: paul at prescod.net (Paul Prescod) Date: Wed Jan 21 20:22:03 2004 Subject: [Numpy-discussion] Status of Numeric In-Reply-To: <400F09C3.9020708@ieee.org> References: <400F09C3.9020708@ieee.org> Message-ID: <400F4E5E.9020704@prescod.net> Tim Hochberg wrote: >... > > The second point is the relative speediness of Numeric at low array > sizes is the result that nearly all of it is implemented in C, whereas > much of Numarray is implemented in Python. This results in a larger > overhead for Numarray, which is why it's slower for small arrays. As I > understand it, the decision to base most of Numarray in Python was > driven by maintainability; it wasn't an attempt to optimize large arrays > at the expense of small ones. What about Pyrex? If you code Pyrex as if it were exactly Python you won't get much optimization. 
But if you code it as if it were 90% as maintainable as Python you can often get 90% of the speed of C, which is pretty damn close to having all of the best of both worlds. If you point me to a few key functions in Numarray I could try to recode them in Pyrex and do some benchmarking for you (only if Pyrex is a serious option of course!). Paul Prescod From eric at enthought.com Thu Jan 22 00:05:01 2004 From: eric at enthought.com (eric jones) Date: Thu Jan 22 00:05:01 2004 Subject: [Numpy-discussion] the direction and pace of development In-Reply-To: <200401211844.i0LIikEA004114@oobleck.astro.cornell.edu> References: <200401211844.i0LIikEA004114@oobleck.astro.cornell.edu> Message-ID: <400F8400.4030208@enthought.com> Good thing Duke is beating Maryland as I read, otherwise, mail like this can make you grumpy. :-) Joe Harrington wrote: >This is a necessarily long post about the path to an open-source >replacement for IDL and Matlab. While I have tried to be fair to >those who have contributed much more than I have, I have also tried to >be direct about what I see as some fairly fundamental problems in the >way we're going about this. I've given it some section titles so you >can navigate, but I hope that you will read the whole thing before >posting a reply. I fear that this will offend some people, but please >know that I value all your efforts, and offense is not my intent. > > > >THE PAST VS. NOW > >While there is significant and dedicated effort going into >numeric/numarray/scipy, it's becoming clear that we are not >progressing quickly toward a replacement for IDL and Matlab. I have >great respect for all those contributing to the code base, but I think >the present discussion indicates some deep problems. If we don't >identify those problems (easy) and solve them (harder, but not >impossible), we will continue not to have the solution so many people >want. To be convinced that we are doing something wrong at a >fundamental level, consider that Python was the clear choice for a >replacement in 1996, when Paul Barrett and I ran a BoF at ADASS VI on >interactive data analysis environments. That was over 7 years ago. > > > The effort has fallen short of the mark you set. I also wish the community was more efficient at pursuing this goal. There are fundamental issues. (1) The effort required is large. (2) Free time is in short supply. (3) Financial support is difficult to come by for library development. Other potential problems would be a lack of interest and a lack of competence. I do not think many of us suffer from the first. As for competence, the development team beyond the walls of Enthought self selects in open source projects, so we're stuck with what we've got. I know most of the people and happen to think they are a talented bunch, so I'll consider us no worse than the average group of PhDs (some consider that a pretty low bar ...). I believe the tasks that go undone (multi-platform support, bi-yearly releases, documentation, etc.) are more due to (2) and (3) above instead of some other deep (or shallow) issue. I guess another possibility is organization. This can be improved upon. Thanks to the gracious help of Cal Tech (CACR) and NCBR, the community has gathered at a low cost SciPy workshop at Cal Tech the last couple of years. I believe this is a positive step. Adding this to the newsgroups and mailing lists provides us with a solid framework within which to operate. I still have confidence that we will reach the IDL/Matlab replacement point. 
We don't have the resources that those products have behind them. We do have a superior language, but without a lot of sweat and toiling at hours of grunt work, we don't stand a chance. As for Enthought's efforts, our success in building applications (scientific and otherwise) has diverted our developers (myself included) away from SciPy as the primary focus. We do continue to develop it and provide significant (for us) financial support to maintain it. I am lucky enough to work with a fine set of software engineers, and I am itching to for us to get more time devoted to SciPy. I do believe that we will get the opportunity in the future -- it is just a matter of time. Call me an optimist. >replace IDL or Matlab", the answer was clearly "stable interfaces to >basic numerics and plotting; then we can build it from there following >the open-source model". Work on both these problems was already well >underway then. Now, both the numerical and plotting development >efforts have branched. There is still no stable base upon which to >build. There aren't even packages for popular OSs that people can >install and play with. The problem is not that we don't know how to >do numerics or graphics; if anything, we know these things too well. >In 1996, if anyone had told us that in 2004 there would be no >ready-to-go replacement system because of a factor of 4 in small array >creation overhead (on computers that ran 100x as fast as those then >available) or the lack of interactive editing of plots at video >speeds, the response would not have been pretty. How would you have >felt? > >THE PROBLEM > >We are not following the open-source development model. Rather, we >pay lip service to it. Open source's development mantra is "release >early, release often". This means release to the public, for use, a >package that has core capability and reasonably-defined interfaces. > > >Release it in a way that as many people as possible will get it, >install it, use it for real work, and contribute to it. Make the main >focus of the core development team the evaluation and inclusion of >contributions from others. Develop a common vision for the program, >and use that vision to make decisions and keep efforts focused. >Include contributing developers in decision making, but do make >decisions and move on from them. > >Instead, there are no packages for general distribution. The basic >interfaces are unstable, and not even being publicly debated to decide >among them (save for the past 3 days). The core developers seem to >spend most of their time developing, mostly out of view of the >potential user base. I am asked probably twice a week by different >fellow astronomers when an open-source replacement for IDL will be >available. They are mostly unaware that this effort even exists. >However, this indicates that there are at least hundreds of potential >contributors of application code in astronomy alone, as I don't nearly >know everyone. The current efforts look rather more like the GNU >project than Linux. I'm sorry if that hurts, but it is true. > > > Speaking from the standpoint of SciPy, all I can say is we've tried to do what you outline here. The effort of releasing the huge load of Fortran/C/C++/Python code across multiple platforms is difficult and takes many hours. I would venture that 90% of the effort on SciPy is with the build system. This means that the exact part of the process that you are discussing is the majority of the effort. 
We keep a version for Windows up to date because that is what our current clients use. In all the other categories, we do the best we can and ask others to fill the gaps. It is also worth saying that SciPy works quite well for most purposes once built -- we and others use it daily on commercial projects. >I know that Perry's group at STScI and the fine folks at Enthought >will say they have to work on what they are being paid to work on. >Both groups should consider the long term cost, in dollars, of >spending those development dollars 100% on coding, rather than 50% on >coding and 50% on outreach and intake. Linus himself has written only >a small fraction of the Linux kernel, and almost none of the >applications, yet in much less than 7 years Linux became a viable >operating system, something much bigger than what we are attempting >here. He couldn't have done that himself, for any amount of money. >We all know this. > > Elaborate on the outreach idea for me. Enthought (spend money to) provide funding to core developers outside of our company (Travis and Pearu), we (spend money to) give talks at many conferences a year, we (spend a little money to) co-sponsor a 70 person workshop on scientific computing every year, we have an open mailing list, we release most of the general software that we write, in the past I practically begged people to have CVS write access when they provide a patch to SciPy. We even spent a lot of time early on trying to set up the scipy.org site as a collaborative Zope based environment -- an effort that was largely a failure. Still we have a functioning largely static site, the mailing list, and CVS. As far as tools, that should be sufficient. It is impossible to argue with the results though. Linus pulled off the OS model, and Enthought and the SciPy community, thus far, has been less successful. If there are suggestions beyond "spend more *time* answering email," I am all ears. Time is the most precious commodity of all these days. Also, SciPy has only been around for 3+ years, so I guess we still have a some rope left. I continue to believe it'll happen -- this seems like the perfect project for open source contributions. >THE PATH > >Here is what I suggest: > >1. We should identify the remaining open interface questions. Not, > "why is numeric faster than numarray", but "what should the syntax > of creating an array be, and of doing different basic operations". > If numeric and numarray are in agreement on these issues, then we > can move on, and debate performance and features later. > > ?? I don't get this one. This interface (at least for numarray) is largely decided. We have argued the points, and Perry et. al. at STSci made the decisions. I didn't like some of them, and I'm sure everyone else had at least one thing they wished was changed, but that is the way this open stuff works. It is not the interface but the implementation that started this furor. Travis O.'s suggestion was to back port (much of) the numarray interface to the Numeric code base so that those stuck supporting large co debases (like SciPy) and needing fast small arrays could benefit from the interface enhancements. One or two of them had backward compatibility issues with Numeric, so he asked how it should be handled. Unless some magic porting fairy shows up, SciPy will be a Numeric only tool for the next year or so. This means that users of SciPy either have to forgo some of these features or back port. 
On speed: Numeric is already too slow -- we've had to recode a number of routines in C that I don't think we should have in a recent project. For us, the goal is not to approach Numeric's speed but to significantly beat it for all array sizes. That has to be a possibility for any replacement. Otherwise, our needs (with the exception of a few features) are already better met by Numeric. I have some worries about all of the endianness and memory mapped support that are built into Numarray imposing to much overhead for speed-ups on small arrays to be possible (this echo's Travis O's thoughts -- we will happily be proven wrong). None of our current work needs these features, and paying a price for them is hard to do with an alternative already there. It is fairly easy to improve its performance on mathematical by just changing the way the ufunc operations are coded. With some reasonably simple changes, Numeric should be comparable (or at least closer) to Numarray speed for large arrays. Numeric also has a large number of other optimizations that can be made (memory is zeroed twice in zeros(), asarray was recently improved significantly for the typical case, etc.). Making these changes would help our selling of Python and, since we have at least a years worth of applications that will be on the SciPy/Numeric platform, it will also help the quality of these applications. Oh yeah, I have also been surprised at how much of out code uses alltrue(), take(), isnan(), etc. The speed of these array manipulation methods is really important for us. >2. We should identify what we need out of the core plotting > capability. Again, not "chaco vs. pyxis", but the list of > requirements (as an astronomer, I very much like Perry's list). > > Yep, we obviously missed on this one. Chaco (and the related libraries) is extremely advanced in some areas but lags in ease-of-use. It is primarily written by a talented and experienced computer scientist (Dave Morrill) who likely does not have the perspective of an astronomer. It is clear that areas of the library need to be re-examined, simplified, and improved. Unfortunately, there is not time for us to do that right now, and the internals have proven to complex for others to contribute to in a meaningful way. I do not know when this will be addressed. The sad thing here is that STSci won't be using it. That pains me to no end, and Perry and I have tried to figure out some way to make it work for them. But, it sounds like, at least in the short term, there will be two new additions to the plotting stable. We will work hard though to make the future Chaco solve STSci's problems (and everyone elses) better than it currently does. By the way, there is a lot of Chaco bashing going on. It is worth saying that we use Chaco every day in commercial applications that require complex graphics and heavy interactivity with great success. But, we also have mixed teams of scientists and computer scientists along with the "U Manual" (If I have a question, I ask you -- being Dave) to answer any questions. I continue to believe Chaco's Traits based approach is the only one currently out there that has the chance of improving on Matlab and other plotting packages available. And, while SciPy is moving slowly, Chaco is moving at a frantic development pace and gets new capabilities daily (which is part of the complaints about it). 
I feel certain in saying that it has more resources tied to its development that the other plotting option out there -- it is just currently being exercised in GUI environments instead of as a day-to-day plotting tool. My advice is dig in, learn traits, and learn Chaco. >3. We should collect or implement a very minimal version of the > featureset, and document it well enough that others like us can do > simple but real tasks to try it out, without reading source code. > That documentation should include lists of things that still need > to be done. > > >4. We should release a stand-alone version of the whole thing in the > formats most likely to be installed by users on the four most > popular OSs: Linux, Windows, Mac, and Solaris. For Linux, this > means .rpm and .deb files for Fedora Core 1 and Debian 3.0r2. > Tarballs and CVS checkouts are right out. We have seen that nobody > in the real world installs them. To be most portable and robust, > it would make sense to include the Python interpreter, named such > that it does not stomp on versions of Python in the released > operating systems. Static linking likewise solves a host of > problems and greatly reduces the number of package variants we will > have to maintain. > >5. We should advertize and advocate the result at conferences and > elsewhere, being sure to label it what it is: a first-cut effort > designed to do a few things well and serve as a platform for > building on. We should also solicit and encourage people either to > work on the included TODO lists or to contribute applications. One > item on the TODO list should be code converters from IDL and Matlab > to Python, and compatibility libraries. > >6. We should then all continue to participate in the discussions and > development efforts that appeal to us. We should keep in mind that > evaluating and incorporating code that comes in is in the long run > much more efficient than writing the universe ourselves. > >7. We should cut and package new releases frequently, at least once > every six months. It is better to delay a wanted feature by one > release than to hold a release for a wanted feature. The mountain > is climbed in small steps. > >The open source model is successful because it follows closely >something that has worked for a long time: the scientific method, with >its community contributions, peer review, open discussion, and >progress mainly in small steps. Once basic capability is out there, >we can twiddle with how to improve things behind the scenes. > > > Everything here is great -- it is the implementation part that is hard. I am all for it happening though. >IS SCIPY THE WAY? > >The recipe above sounds a lot like SciPy. SciPy began as a way to >integrate the necessary add-ons to numeric for real work. It was >supposed to test, document, and distribute everything together. I am >aware that there are people who use it, but the numbers are small and >they seem to be tightly connected to Enthought for support and >application development. > Not so. The user base is not huge, but I would conservatively venture to say it is in the hundreds to thousands. We are a company of 12 without a single support contract for SciPy. >Enthought's focus seems to be on servicing >its paying customers rather than on moving SciPy development along, > > Continuing to move SciPy along at the pace we initially were would have ended Enthought -- something had to change. It is surprising how important paying customers are to a company. 
>and I fear they are building an installed customer base on interfaces >that were not intended to be stable. > > Not sure what you you mean here, but I'm all for stable interfaces. Huge portions of SciPy's interface haven't changed, and I doubt they will change. I do indeed feel, though, that SciPy is still a 0.2 release level, so some of the interfaces can change. It would be irresponsible to say otherwise. This is not "intentionally unstable" though... >So, I will raise the question: is SciPy the way? Rather than forking >the plotting and numerical efforts from what SciPy is doing, should we >not be creating a new effort to do what SciPy has so far not >delivered? These are not rhetorical or leading questions. I don't >know enough about the motivations, intentions, > Man this sounds like an interview (or interaction) question. We'll we're a company, so we do wish to make money -- otherwise, we'll have to do something else. We also care about deeply about science and are passionate about scientific computing. Let see, what else. We have made most of the things we do open source because we do believe in it in principle and as a good development philosophy. And, even though we all wish SciPy was moving faster, SciPy wouldn't be anywhere close to where it is without Travis Oliphant and Pearu Peterson -- neither of whom would have worked on it had it not been openly available. That alone validates the decision to make it open. I'm not sure what we have done to make someone question our "motivations and intentions" (sounds like a date interrogation), but it is hard to think of malicious ones when you are making the fruits of your labors and dollars freely available. >and resources of the > > Well, we have 12 people, and Pearu and Travis O work with us quite a bit also. The developers here are very good (if I do say so myself), but unfortunately primarily working on other projects at the moment. Besides scientists/computer scientists have a technical writer and a human-computer-interface specialist on staff. >folks at Enthought (and elsewhere) to know the answer. I do think >that such a fork will occur unless SciPy's approach changes >substantially. > Enthought has more commitments than we used to. SciPy remains important and core to what we do, it just has to share time with other things. Luckily Pearu and Travis have kept there ear to the ground to help out people on the mailing lists as well as working on the codebase. I'm not sure what our approach has been that would force a fork... It isn't like someone has come as asked to be release manager, offered to keep the web pages up to date, provided peer review of code, etc and we have turned them away. Almost from the beginning most effort is provided by a small team (fairly standard for OS stuff). We have repeatedly pointed out areas we need help at the conference and in mail -- code reviews, build help, release help, etc. In fact, I double dare ya to ask to manage the next release or the documentation effort. okay... triple dare ya. Some people have philosophical (like Konrad I believe) differences with how SciPy is packaged and believe it should be 12 smaller packages instead of one large one. This has its own set of problems obviously, but forking based on this kind of principle would make at least a modicum of sense. Forking because you don't like the pace of the project makes zero sense. Pitch in and solve the problem. The social barriers are very small. The code barriers (build, etc.) are what need to be solved. 
>The way to decide is for us all to discuss the >question openly on these lists, and for those willing to participate >and contribute effort to declare so openly. I think all that is >needed, either to help SciPy or replace it, is some leadership in the >direction outlined above. I would be interested in hearing, perhaps >from the folks at Enthought, alternative points of view. Why are >there no packages for popular OSs for SciPy 0.2? > Please build them, ask for web credentials, and up load them. Then answer the questions people have about them on the mailing list. It is as simple as that. There is no magic here -- just work. >Why are releases so >infrequent? > Ditto. >If the folks running the show at scipy.org disagree with >many others on these lists, then perhaps those others would like to >roll their own. Or, perhaps stable/testing/unstable releases of the >whole package are in order. > >HOW TO CONTRIBUTE? > >Judging by the number of PhDs in sigs, there are a lot of researchers >on this list. I'm one, and I know that our time for doing core >development or providing the aforementioned leadership is very >limited, if not zero. > Surprisingly, commercial developers have about the same amount of free time. > Later we will be in a much better position to >contribute application software. However, there is a way we can >contribute to the core effort even if we are not paid, and that is to >put budget items in grant and project proposals to support the work of >others. > For the academics, supporting a *dedicated* student to maintain SciPy would be much more cost effective use of your dollars. Unfortunately, it is hard to get a PhD for supporting SciPy... For companies, national laboratories, etc. Supporting development on SciPy (or numarray) directly is a great idea. Projects that we work on in other areas also indirectly support SciPy, Chaco, etc. so get us involved with the development efforts at your company/lab. Other options? Government (NASA, Military, NIH, etc) and national lab people can get SciPy/numarray/Python related SBIR (http://www.acq.osd.mil/sadbu/sbir/) topics that would impact there research/development put on the solicitation list this summer. Email me if you have any questions on this. ASCI people can propose PathForward projects. There are probably numerous other ways to do this. We will have a GSA schedule soon, so government contracting will also work. >subcontractors at places like Enthought or STScI. A handful of >contributors would be all we'd need to support someone to produce OS >packages and tutorial documentation (the stuff core developers find >boring) for two releases a year. > > Joe, as you say, things haven't gone as fast as any of us would wish, but it hasn't been for lack of trying. Many of us have put zillions of hours into this. The results are actually quite stable tools. Many people use Numeric/Numarray/SciPy in daily work without problems. But, like Linux in the early years, they still require "geeks" willing to do some amount of meddling to use them. Huge resources (developer and financial) have been pumped into Linux to get it to the point its at today. Anything we can do to increase the participation in building tools and financially supporting those who do build tools, I am all for... I'd love to see releases on 10 platforms and full documentation for the libraries as well as the next person. Whew, and Duke managed to hang on and win. 
my .01 worth, eric >--jh-- > > >------------------------------------------------------- >The SF.Net email is sponsored by EclipseCon 2004 >Premiere Conference on Open Tools Development and Integration >See the breadth of Eclipse activity. February 3-5 in Anaheim, CA. >http://www.eclipsecon.org/osdn >_______________________________________________ >Numpy-discussion mailing list >Numpy-discussion at lists.sourceforge.net >https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > From cjw at sympatico.ca Thu Jan 22 09:56:04 2004 From: cjw at sympatico.ca (Colin J. Williams) Date: Thu Jan 22 09:56:04 2004 Subject: [Numpy-discussion] the direction and pace of development In-Reply-To: <400F8400.4030208@enthought.com> References: <200401211844.i0LIikEA004114@oobleck.astro.cornell.edu> <400F8400.4030208@enthought.com> Message-ID: <40100EAA.8040204@sympatico.ca> As a relative newcomer to this discussion, I would like to respond on a couple of points. eric jones wrote: > Good thing Duke is beating Maryland as I read, otherwise, mail like > this can make you grumpy. :-) > > Joe Harrington wrote: > [snip] >> THE PATH >> >> Here is what I suggest: >> >> 1. We should identify the remaining open interface questions. Not, >> "why is numeric faster than numarray", but "what should the syntax >> of creating an array be, and of doing different basic operations". >> If numeric and numarray are in agreement on these issues, then we >> can move on, and debate performance and features later. >> >> > ?? I don't get this one. This interface (at least for numarray) is > largely decided. We have argued the points, and Perry et. al. at > STSci made the decisions. I didn't like some of them, and I'm sure > everyone else had at least one thing they wished was changed, but that > is the way this open stuff works. I have wondered whether the desire to be compatible with Numeric has been an inhibitory factor for numarray. It might be interesting to see the list of decisions which Eric Jones doesn't like. > > It is not the interface but the implementation that started this > furor. Travis O.'s suggestion was to back port (much of) the numarray > interface to the Numeric code base so that those stuck supporting > large co debases (like SciPy) and needing fast small arrays could > benefit from the interface enhancements. One or two of them had > backward compatibility issues with Numeric, so he asked how it should > be handled. Unless some magic porting fairy shows up, SciPy will be a > Numeric only tool for the next year or so. This means that users of > SciPy either have to forgo some of these features or back port. Back porting would appear, to this outsider, to be a regression. Is there no way of changing numarray so that it has the desired speed for small arrays? > > > On speed: > Numeric is already too slow -- we've had to recode a number of > routines in C that I don't think we should have in a recent project. > For us, the goal is not to approach Numeric's speed but to > significantly beat it for all array sizes. That has to be a > possibility for any replacement. Otherwise, our needs (with the > exception of a few features) are already better met by Numeric. I > have some worries about all of the endianness and memory mapped > support that are built into Numarray imposing to much overhead for > speed-ups on small arrays to be possible (this echo's Travis O's > thoughts -- we will happily be proven wrong). 
None of our current > work needs these features, and paying a price for them is hard to do > with an alternative already there. It is fairly easy to improve its > performance on mathematical by just changing the way the ufunc > operations are coded. With some reasonably simple changes, Numeric > should be comparable (or at least closer) to Numarray speed for large > arrays. Numeric also has a large number of other optimizations that > can be made (memory is zeroed twice in zeros(), asarray was recently > improved significantly for the typical case, etc.). Making these > changes would help our selling of Python and, since we have at least a > years worth of applications that will be on the SciPy/Numeric > platform, it will also help the quality of these applications. > > Oh yeah, I have also been surprised at how much of out code uses > alltrue(), take(), isnan(), etc. The speed of these array > manipulation methods is really important for us. I am surprised that alltrue() performance is a concern, but it should be easy to implement short circuit evaluation so that False responses are, on average, handled more quickly. If Boolean arrays are significant, in terms of the amount of computer time taken, should they be stored as bit arrays? Would there be a pay-off for the added complexity? > > [snip] > >> 3. We should collect or implement a very minimal version of the >> featureset, and document it well enough that others like us can do >> simple but real tasks to try it out, without reading source code. >> That documentation should include lists of things that still need >> to be done. >> > Does numarray not provide the basics? >> [snip >> The open source model is successful because it follows closely >> something that has worked for a long time: the scientific method, with >> its community contributions, peer review, open discussion, and >> progress mainly in small steps. Once basic capability is out there, >> we can twiddle with how to improve things behind the scenes. >> >> >> Colin W. From bsder at allcaps.org Thu Jan 22 15:03:01 2004 From: bsder at allcaps.org (Andrew P. Lentvorski, Jr.) Date: Thu Jan 22 15:03:01 2004 Subject: [Numpy-discussion] the direction and pace of development In-Reply-To: <400F8400.4030208@enthought.com> References: <200401211844.i0LIikEA004114@oobleck.astro.cornell.edu> <400F8400.4030208@enthought.com> Message-ID: <20040122142856.V3184@mail.allcaps.org> On Thu, 22 Jan 2004, eric jones wrote: > The effort has fallen short of the mark you set. I also wish the > community was more efficient at pursuing this goal. There are > fundamental issues. (1) The effort required is large. (2) Free time is > in short supply. (3) Financial support is difficult to come by for > library development. (4) There is no itch to scratch Matlab is somewhere about $20,000 (base+a couple of toolboxes) per year for corporations, and something like $500 (or less) for registered students. All of the signal processing packages and stuff are all written for Matlab. The time cost of learning a new tool (Python + SciPy + Numeric/numarray) far exceeds the base prices for the average company or person. However, some companies have to deliver an end product with Matlab embedded. This is *extremely* undesirable; consequently, they are likely to create add-ons and extend the Python interface. However, the progress will likely be slow. > Speaking from the standpoint of SciPy, all I can say is we've tried to > do what you outline here. 
The effort of releasing the huge load of > Fortran/C/C++/Python code across multiple platforms is difficult and > takes many hours. And since SciPy is mostly Windows, the users expect that one click installs the universe. Good for customer experience. Bad for maintainability which would really like to have independently maintained packages with hard API's surrounding them.. > On speed: > Numeric is already too slow -- we've had to recode a number of routines > in C that I don't think we should have in a recent project. Then the idea of optimizing numarray is DOA. The best you are going to get is a constant factor speedup in return for vastly complicating maintainability. That's not a good tradeoff for a multi-year open-source project. > Oh yeah, I have also been surprised at how much of out code uses > alltrue(), take(), isnan(), etc. The speed of these array manipulation > methods is really important for us. That seems ... odd. Scanning an array rather than handling a NaN trap seems like an awful tradeoff (ie. an O(n) operation repeated every time rather than an O(1) operation activated only on NaN generation--a rare occurrence normally). > -- code reviews, build help, release help, etc. In fact, I double dare > ya to ask to manage the next release or the documentation effort. > okay... triple dare ya. Shades of, "Take my wife ... please!" ;) -a From oliphant at ee.byu.edu Thu Jan 22 15:52:09 2004 From: oliphant at ee.byu.edu (Travis E. Oliphant) Date: Thu Jan 22 15:52:09 2004 Subject: [Numpy-discussion] the direction and pace of development In-Reply-To: <20040122142856.V3184@mail.allcaps.org> References: <200401211844.i0LIikEA004114@oobleck.astro.cornell.edu> <400F8400.4030208@enthought.com> <20040122142856.V3184@mail.allcaps.org> Message-ID: <4010622E.7040605@ee.byu.edu> Andrew P. Lentvorski, Jr. wrote: > On Thu, 22 Jan 2004, eric jones wrote: > > >>Speaking from the standpoint of SciPy, all I can say is we've tried to >>do what you outline here. The effort of releasing the huge load of >>Fortran/C/C++/Python code across multiple platforms is difficult and >>takes many hours. > > > And since SciPy is mostly Windows, the users expect that one click > installs the universe. Good for customer experience. Bad for > maintainability which would really like to have independently maintained > packages with hard API's surrounding them.. > What in the world does this mean? SciPy is "mostly Windows" Yes, there is a only a binary installer for windows available currently. But, how does that make this statement true. For me SciPy has always been used almost exclusively on Linux. In fact, the best plotting support for SciPy (in my mind) is xplt (pygist-based) and it works best on Linux. -Travis From perry at stsci.edu Thu Jan 22 18:50:02 2004 From: perry at stsci.edu (Perry Greenfield) Date: Thu Jan 22 18:50:02 2004 Subject: [Numpy-discussion] Status of Numeric In-Reply-To: <20040121234108.GA4602@taliesen.ucsd.edu> Message-ID: Robert Kern writes: > [snip] > > Tim Hochberg writes: > > The second point is the relative speediness of Numeric at low array > > sizes is the result that nearly all of it is implemented in C, whereas > > much of Numarray is implemented in Python. This results in a larger > > overhead for Numarray, which is why it's slower for small arrays. As I > > understand it, the decision to base most of Numarray in Python was > > driven by maintainability; it wasn't an attempt to optimize > large arrays > > at the expense of small ones. 
> > Has the numarray team (or anyone else for that matter) looked at using > Pyrex[1] to implement any part of numarray? If not, then that's my next > free-time experiment (i.e. avoiding homework while still looking > productive at the office). > We had looked at it at least a couple of times. I don't remember now all the conclusions, but I think one of the problems was that it wasn't as useful when one had to deal with data types not used in python itself (e.g., unsigned int16). I might be wrong about that. Numarray generates a lot of c code directly for the actual array computations. That is neither the slow part, nor the hard part to write. It is the array computation setup that is complicated. Much of that is now in C (and we do worry that it has greatly added to the complexity). Perhaps that part could be better handled by pyrex. I think some of the remaining overhead has to do with intrinsic python calls, and the differences between the simpler type used for Numeric versus the new style classes used for numarray. Don't hold me to that however. Perry From perry at stsci.edu Thu Jan 22 19:08:59 2004 From: perry at stsci.edu (Perry Greenfield) Date: Thu Jan 22 19:08:59 2004 Subject: [Numpy-discussion] Comments on the Numarray/Numeric disscussion In-Reply-To: <400F1289.2040403@ee.byu.edu> Message-ID: Travis Oliphant writes: > The two major problems I see with Numarray replacing Numeric are > > 1) How is UFunc support? Can you create ufuncs in C easily (with a > single function call or something similar). > Different, but I don't think it is difficult to add ufuncs (and probably easier if many types must be supported, though I doubt that is much of an issue for most mathematical functions which generally are only needed for the float types and perhaps complex). > 2) Speed for small arrays (array creation is the big one). > This is the much harder issue. I do wonder if it is possible to make numarray any faster than Numeric on this point (or as other later mention, whether the complexity that it introduces is worth it. > It is actually quite a common thing to have a loop during which many > small arrays get created and destroyed. Yes, you can usually make such > code faster by "vectorizing" (if you can figure out how). But the > average scientist just wants to (and should be able to) just write a loop. > I'll pick a small bone here. Well, yes, and I could say that a scientist should be able to write loops that iterate over all array elements and expect that they run as fast. But they can't. After all, using an array language within an interpreted language implies that users must cast their problems into array manipulations for it to work efficiently. By using Numeric or numarray they *must* buy into vectorizing at some level. Having said that, it certainly is true that there are problems with small arrays that cannot be easily vectorized by combining into higher dimension arrays (I think the two most common cases are with variable-sized small arrays or where there are iterative algorithms on small arrays that must be iterated many times (though some of these problems can be cast into larger vectors, but often not really easily). > Regarding speed issues. Actually, there are situations where I am very > unsatisfied with Numeric's speed performance and so the goal for > Numarray should not be to achieve some percentage of Numeric's > performance but to beat it. 
> > Frankly, I don't see how you can get speed that I'm talking about by > carrying around a lot of extras like byte-swapping support, > memory-mapping support, record-array support. > You may be right. But then I would argue that if one want to speed up small array performance, one should really go for big improvements. To do that suggests taking a signifcantly different approach than either Numeric or numarray. But that's a different topic ;-) To me, factors of a few are not necessarily worth the trouble (and I wonder how much of the phase space of problems they really help move into feasibility). Yes, if you've written a bunch of programs that use small arrays that are marginally fast enough, then a factor of two slower is painful. But there are many other small array problems that were too slow already that couldn't be done anyway. The ones that weren't marginal will likely still be acceptable. Those that live in the grey zone now are the ones that are most sensitive to the issue. All the rest don't care. I don't have a good feel for how many live in the grey zone. I know some do. Perry Greenfield From perry at stsci.edu Thu Jan 22 19:25:01 2004 From: perry at stsci.edu (Perry Greenfield) Date: Thu Jan 22 19:25:01 2004 Subject: [Numpy-discussion] the direction and pace of development In-Reply-To: <40100EAA.8040204@sympatico.ca> Message-ID: Colin J. Williams writes: > > I have wondered whether the desire to be compatible with Numeric has > been an inhibitory factor for numarray. It might be interesting to see > the list of decisions which Eric Jones doesn't like. > There weren't that many. The ones that I remember (and if Eric has time he can fill in the rest) were: 1) default axis for operations. Some use the last and some use the first depending on context. Eric and Travis wanted to use a consistent rule (I believe last always). I believe that scipy wraps Numeric so that it does just that (remember, the behavior in scipy of Numeric is not quite the same as the distributed Numeric (correct me if I'm wrong). 2) allowing complex comparisons. Since Python no longer allows these (and it is reasonable to question whether this was right since complex numbers now can no longer be part of a generic python sort), Many felt that numarray should be consistent with Python. This isn't a big issue since I had argued that those that wanted to do generic comparisons simply needed to cast it as x.real where the .real attribute was available for all types of arrays, thus using that would always work regardless of the type. 3) having single-element indexing return a rank-0 array rather than a python scalar. Numeric is quite inconsistent in this regard now. We decided to have numarray always return python scalars (exceptions may be made if Float128 is supported). The argument for rank-0 arrays was that it would support generic programming so that one didn't need to test for the kind of value for many functions (i.e., scalar or array). But the issue of contention was that Eric argued that len(rank-0) == 1 and that (rank-0)[0] give the value, neither of which is correct according to the strict definition of rank-0. We argued that using rank-1 len-1 arrays were really what was needed for that kind of programming. It turned out that the most common need was for the result of reduction operations, so we provided a version of reduce (areduce) which always returned an array result even if the array was 1-d, (the result would be a length-1 rank-1 array). There are others, but I don't recall immediately. 
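To make the complex-comparison and scalar-indexing decisions concrete, a small sketch in the numarray spellings of this era (purely illustrative, not a recommendation of any particular idiom):

import numarray

a = numarray.array([1, 2, 3])
x = a[0]      # single-element indexing yields a Python scalar, not a rank-0 array
y = a[0:1]    # generic code that needs an array result can keep a rank-1, length-1 slice

# complex arrays are not compared directly; the .real attribute exists for
# every array type, so the same expression covers float and complex alike
c = numarray.array([1+2j, 3+0j])
order = numarray.argsort(c.real)
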
> > It is not the interface but the implementation that started this
> > furor. Travis O.'s suggestion was to back port (much of) the numarray
> > interface to the Numeric code base so that those stuck supporting
> > large code bases (like SciPy) and needing fast small arrays could
> > benefit from the interface enhancements. One or two of them had
> > backward compatibility issues with Numeric, so he asked how it should
> > be handled. Unless some magic porting fairy shows up, SciPy will be a
> > Numeric-only tool for the next year or so. This means that users of
> > SciPy either have to forgo some of these features or back port.
>
> Back porting would appear, to this outsider, to be a regression. Is
> there no way of changing numarray so that it has the desired speed for
> small arrays?

If it must be faster than Numeric, I do wonder if that is easily done without greatly complicating the code.

> I am surprised that alltrue() performance is a concern, but it should be
> easy to implement short circuit evaluation so that False responses are,
> on average, handled more quickly. If Boolean arrays are significant,
> in terms of the amount of computer time taken, should they be stored as
> bit arrays? Would there be a pay-off for the added complexity?

Making alltrue fast in numarray would not be hard, just some work writing a special-purpose function to short circuit. I doubt very much that bit arrays would be much faster. They would also greatly complicate the code base. It is possible to add them, but I've always felt the reason would be to save memory, not increase speed. They haven't been a high priority for us.

Perry Greenfield

From oliphant at ee.byu.edu Thu Jan 22 19:33:01 2004
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Thu Jan 22 19:33:01 2004
Subject: [Numpy-discussion] More on transitioning to Numarray
Message-ID: <4010958E.5050303@ee.byu.edu>

Today, I realized that I needed to restate what my intention was in raising this subject to begin with. First of all, I would like to see everybody transition to Numarray someday. On the other hand, I'm not willing to ignore performance issues just to reach that desirable goal.

I would like to recast my proposal into the framework of helping SciPy transition to Numarray. Basically, I don't think Numarray will be ready to fully support SciPy in less than a year (largely because it probably won't happen until some of us SciPy folks do a bit more work with Numarray). To help that along, I am proposing making a few changes to the Numeric object that SciPy uses so that the array object SciPy expects starts looking more and more like the Numarray object. We have wanted to do this in SciPy and were simply wondering if it would make sense to change the Numeric object or to grab the Numeric code base into SciPy and make changes there. The feedback from the community has convinced me personally that we should leave Numeric alone and make any changes to something we create inside of SciPy.

There is a lot of concern over having multiple implementations of n-d arrays due to potential splitting of tools, etc. But I should think that tools should be coded to an interface (API, methods, data structures) instead of a single implementation, so that the actual underlying object should not matter greatly. I thought that was the point of modular development and object-orientedness. Anyone doing coding with numeric arrays already has to distinguish between Python Imaging objects, lists of lists, and other array-like objects.
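As a sketch of what coding to the interface rather than to a particular implementation can look like, the function and names below are made up for illustration and not taken from any package:

import math
import Numeric

def rms(seq):
    # accept anything array-like: a Numeric array, a plain list of numbers,
    # or any other object Numeric.asarray() understands; convert once at
    # the boundary and the body never needs to know what came in
    a = Numeric.asarray(seq, Numeric.Float)
    return math.sqrt(float(Numeric.add.reduce(a * a)) / len(a))

print rms([3.0, 4.0])    # a plain list works: prints roughly 3.5355
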
I think it is pretty clear that Numeric won't be changing. Thus, anything we do with the Numeric object will be done from the framework of SciPy. Best regards. Travis O. From paul at prescod.net Thu Jan 22 23:31:01 2004 From: paul at prescod.net (Paul Prescod) Date: Thu Jan 22 23:31:01 2004 Subject: [Numpy-discussion] Status of Numeric In-Reply-To: References: Message-ID: <4010CC24.3040500@prescod.net> Perry Greenfield wrote: > ... > We had looked at it at least a couple of times. I don't remember now > all the conclusions, but I think one of the problems was that > it wasn't as useful when one had to deal with data types not > used in python itself (e.g., unsigned int16). I might be wrong > about that. I would guess that the issue is more whether it is natively handled by Pyrex than whether it is handled by Python. Is there a finite list of these types that Numarray handles? If you have a list I could generate a patch to Pyrex that would support them. We could then ask Greg whether he could add them to Pyrex core or refactor it so that he doesn't have to. > Numarray generates a lot of c code directly for the actual > array computations. That is neither the slow part, nor the > hard part to write. It is the array computation setup that > is complicated. Much of that is now in C (and we do worry > that it has greatly added to the complexity). Perhaps that > part could be better handled by pyrex. It sounds like it. > I think some of the remaining overhead has to do with intrinsic > python calls, and the differences between the simpler type used > for Numeric versus the new style classes used for numarray. > Don't hold me to that however. Pyrex may be able to help with at least one of these. Calls between Pyrex-coded functions usually go at C speeds (although method calls may be slower). I don't know enough about the new-style, old-style issue to know about whether Pyrex can help with that but I would guess it might because a Pyrex "extension type" is more like a C extension type than a Python instance object. That implies some faster method lookup and calling. Numeric is the exact type of project Pyrex is designed for. And of course it works seamlessly with pre-existing Python and C code so you can selectively port things. Paul Prescod From bsder at allcaps.org Fri Jan 23 01:06:01 2004 From: bsder at allcaps.org (Andrew P. Lentvorski, Jr.) Date: Fri Jan 23 01:06:01 2004 Subject: [Numpy-discussion] the direction and pace of development In-Reply-To: <4010622E.7040605@ee.byu.edu> References: <200401211844.i0LIikEA004114@oobleck.astro.cornell.edu> <400F8400.4030208@enthought.com> <20040122142856.V3184@mail.allcaps.org> <4010622E.7040605@ee.byu.edu> Message-ID: <20040122160155.T3350@mail.allcaps.org> On Thu, 22 Jan 2004, Travis E. Oliphant wrote: > What in the world does this mean? SciPy is "mostly Windows" Yes, there > is a only a binary installer for windows available currently. But, how > does that make this statement true. > > For me SciPy has always been used almost exclusively on Linux. In fact, > the best plotting support for SciPy (in my mind) is xplt (pygist-based) > and it works best on Linux. I was referring to the installers, but I apparently did a thinko and omitted the reference. My apologies. I did not mean to imply that SciPy runs only on Windows, especially since I run it on FreeBSD. My intent was to comment about Win32 having a "one big lump" installer philosophy vs. the Linux "discrete packages" philosophy and the impact on maintainability of each. ie. 
the fact that releases suck up so much energy because of the need to integrate large chunks of code outside of SciPy itself. -a From falted at openlc.org Fri Jan 23 01:40:00 2004 From: falted at openlc.org (Francesc Alted) Date: Fri Jan 23 01:40:00 2004 Subject: [Numpy-discussion] Status of Numeric In-Reply-To: <4010CC24.3040500@prescod.net> References: <4010CC24.3040500@prescod.net> Message-ID: <200401231038.58199.falted@openlc.org> A Divendres 23 Gener 2004 08:24, Paul Prescod va escriure: > Perry Greenfield wrote: > > ... > > We had looked at it at least a couple of times. I don't remember now > > all the conclusions, but I think one of the problems was that > > it wasn't as useful when one had to deal with data types not > > used in python itself (e.g., unsigned int16). I might be wrong > > about that. > > I would guess that the issue is more whether it is natively handled by > Pyrex than whether it is handled by Python. Is there a finite list of > these types that Numarray handles? If you have a list I could generate a > patch to Pyrex that would support them. We could then ask Greg whether > he could add them to Pyrex core or refactor it so that he doesn't have to. I think the question rather was whether Pyrex would be able to work with templates (in the sense of C++), i.e. it can generate different functions depending on the datatypes passed to them. You can see some previous discussion on that list in: http://sourceforge.net/mailarchive/forum.php?thread_id=1642778&forum_id=4890 I've formulated the question to Greg and here you are his answer: http://sourceforge.net/mailarchive/forum.php?thread_id=1645713&forum_id=4890 So, it seems that he don't liked the idea to implement "templates" in Pyrex. > > > Numarray generates a lot of c code directly for the actual > > array computations. That is neither the slow part, nor the > > hard part to write. It is the array computation setup that > > is complicated. Much of that is now in C (and we do worry > > that it has greatly added to the complexity). Perhaps that > > part could be better handled by pyrex. > > It sounds like it. Yeah, I'm quite convinced that a mix between Pyrex and the existing solution in numarray for dealing with templates could be worth the effort. At least, some analysis could be done on that aspect. > > > I think some of the remaining overhead has to do with intrinsic > > python calls, and the differences between the simpler type used > > for Numeric versus the new style classes used for numarray. > > Don't hold me to that however. > > Pyrex may be able to help with at least one of these. Calls between > Pyrex-coded functions usually go at C speeds (although method calls may > be slower). Well, that should be clarified: that's only true for cdef's pyrex functions (i.e. C functions made in Pyrex). Pyrex functions that are able to be called from Python takes the same time whether they are called from Python or from the same Pyrex extension. See some timmings I've done on that subject some time ago: http://sourceforge.net/mailarchive/message.php?msg_id=3782230 Cheers, -- Francesc Alted Departament de Ci?ncies Experimentals Universitat Jaume I. Castell? de la Plana. 
Spain

From hinsen at cnrs-orleans.fr Fri Jan 23 03:59:01 2004
From: hinsen at cnrs-orleans.fr (Konrad Hinsen)
Date: Fri Jan 23 03:59:01 2004
Subject: [Numpy-discussion] the direction and pace of development
In-Reply-To: References: Message-ID: <200401231256.27137.hinsen@cnrs-orleans.fr>

On Wednesday 21 January 2004 22:28, Perry Greenfield wrote:
> contributed code as well). You have to remember that how easily
> contributions come depends on what the critical mass is for
> usefulness. For something like numarray or Numeric, that critical
> mass is quite large. Few are interested in contributing when it
> can do very little and an older package exists that can do more.

I also find it difficult in practice to move code from Numeric to Numarray. While the two packages coexist peacefully, any C module that depends on the C API must be compiled for one or the other. Having both available for comparative testing thus means having two separate Python installations. And even with two installations, there is only one PYTHONPATH setting, which makes development under these conditions quite a pain. If someone has found a way out of that, please tell me!

> many times in the past. Often consensus was hard to achieve.
> We tended to lean towards backward compatibility unless the change
> seemed really necessary. For type coercion and error handling,
> we thought it was. But I don't think we have tried to shield the
> decision making process from the community. I do think the difficulty
> in achieving a sense of consensus is a problem.

I think you did well on this - but then, I happen to share your general philosophy ;-)

Konrad.
--
-------------------------------------------------------------------------------
Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
Rue Charles Sadron | Fax: +33-2.38.63.15.17
45071 Orleans Cedex 2 | Deutsch/Esperanto/English/
France | Nederlands/Francais
-------------------------------------------------------------------------------
From rbastian at club-internet.fr Sat Jan 24 04:33:00 2004 From: rbastian at club-internet.fr (René Bastian) Date: Sat Jan 24 04:33:00 2004 Subject: [Numpy-discussion] numarray0.4->numarray0.8 Message-ID: <04012409233700.00754@rbastian> I need your help. I tried to update numarray-0.4 to numarray-0.8 I did not get error messages during "install" but lauching python2.3 >>>import numarray I get the message Fatal Python error : Can't import module numarray.libnumarray Uninstall 0.4 (or 0.8) ? How to uninstall numarray ? Thanks for your answers -- René Bastian http://www.musiques-rb.org : Musique en Python From falted at openlc.org Sat Jan 24 04:48:00 2004 From: falted at openlc.org (Francesc Alted) Date: Sat Jan 24 04:48:00 2004 Subject: [Numpy-discussion] numarray0.4->numarray0.8 In-Reply-To: <04012409233700.00754@rbastian> References: <04012409233700.00754@rbastian> Message-ID: <200401241347.19384.falted@openlc.org> A Dissabte 24 Gener 2004 09:23, René Bastian va escriure: > I need your help. > > I tried to update numarray-0.4 to numarray-0.8 > I did not get error messages during "install" > but lauching > python2.3 > > >>>import numarray > > I get the message > Fatal Python error : Can't import module numarray.libnumarray > > Uninstall 0.4 (or 0.8) ? > How to uninstall numarray ? > Perhaps there is a better way, but try with deleting the numarray directory in your python site-packages directory. In my case, the next does the work: rm -r /usr/lib/python2.3/site-packages/numarray/ -- Francesc Alted From nancyk at MIT.EDU Sun Jan 25 10:34:00 2004 From: nancyk at MIT.EDU (Nancy Keuss) Date: Sun Jan 25 10:34:00 2004 Subject: [Numpy-discussion] Numpy capabilities? Message-ID: Hi, I will be working a lot with matrices, and I am wondering a few things before I get started with NumPy: 1) Is there a function that performs matrix multiplication? 2) Is there a function that takes a tensor product, or Kronecker product, of two matrices? 3) Is it possible to concatenate two matrices together? 4) Is there a way to insert a matrix into a subsection of an already existing matrix. For instance, to insert a 2x2 matrix into the upper left hand corner of a 4x4 matrix? Thank you very much in advance!
Nancy From hinsen at cnrs-orleans.fr Sun Jan 25 11:21:00 2004 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Sun Jan 25 11:21:00 2004 Subject: [Numpy-discussion] Numpy capabilities? In-Reply-To: References: Message-ID: <758CA5BB-4F6B-11D8-8519-000A95AB5F10@cnrs-orleans.fr> On 25.01.2004, at 19:33, Nancy Keuss wrote: > I will be working a lot with matrices, and I am wondering a few things > before I get started with NumPy: > > 1) Is there a function that performs matrix multiplication? Yes, Numeric.dot(matrix1, matrix2) > 2) Is there a function that takes a tensor product, or Kronecker > product, of > two matrices? Yes, Numeric.multiply.outer(matrix1, matrix2) > 3) Is it possible to concatenate two matrices together? Yes: Numeric.concatenate((matrix1, matrix2)) > 4) Is there a way to insert a matrix into a subsection of an already > existing matrix. For instance, to insert a 2x2 matrix into the upper > left > hand corner of a 4x4 matrix? Yes: matrix4x4[:2, :2] = matrix2x2 Konrad. From rays at san.rr.com Sun Jan 25 22:31:01 2004 From: rays at san.rr.com (RJS) Date: Sun Jan 25 22:31:01 2004 Subject: [Numpy-discussion] efficient sum of "sparse" 2D arrays? Message-ID: <5.2.1.1.2.20040125215330.0354efd0@pop-server.san.rr.com> Hi all, The problem: I have a "stack" of 8, 640 x 480 integer image arrays from a FITS cube concatenated into a 3D array, and I want to sum each pixel such that the result ignores clipped values (255+); i.e., if two images have clipped pixels at (x,y) the result along z will be the sum of the other 6. I'm trying to come up with a pure Numeric way (hopefully so that I can use weave.blitz) to speed up the calculation. I just looked into masked arrays, but I'm not familiar with that module at all. I was guessing someone out there has done this before... Ray From hinsen at cnrs-orleans.fr Mon Jan 26 00:17:01 2004 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Mon Jan 26 00:17:01 2004 Subject: [Numpy-discussion] efficient sum of "sparse" 2D arrays? In-Reply-To: <5.2.1.1.2.20040125215330.0354efd0@pop-server.san.rr.com> References: <5.2.1.1.2.20040125215330.0354efd0@pop-server.san.rr.com> Message-ID: <056D7E45-4FD8-11D8-B969-000A95AB5F10@cnrs-orleans.fr> On 26.01.2004, at 07:14, RJS wrote: > The problem: I have a "stack" of 8, 640 x 480 integer image arrays > from a FITS cube concatenated into a 3D array, and I want to sum each > pixel such that the result ignores clipped values (255+); i.e., if two > images have clipped pixels at (x,y) the result along z will be the sum > of the other 6. > Memory doesn't seem critical for such small arrays, so you can just do sum([where(a < 255, a, 0) for a in images]) Konrad. From SKuzminski at fairisaac.com Mon Jan 26 04:54:03 2004 From: SKuzminski at fairisaac.com (Kuzminski, Stefan R) Date: Mon Jan 26 04:54:03 2004 Subject: [Numpy-discussion] efficient sum of "sparse" 2D arrays? Message-ID: <7646464ACC9B5347A4A5C57729D74A550369C143@srfmsg100.corp.fairisaac.com> Could you use masked arrays more efficiently in this case? If you create the array so that values >255 and <0 are masked, then they will be excluded from the sum ( and from any other operations as well ). Stefan -----Original Message----- From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy-discussion-admin at lists.sourceforge.net] On Behalf Of Konrad Hinsen Sent: Monday, January 26, 2004 12:17 AM To: RJS Cc: numpy-discussion at lists.sourceforge.net Subject: Re: [Numpy-discussion] efficient sum of "sparse" 2D arrays? 
On 26.01.2004, at 07:14, RJS wrote: > The problem: I have a "stack" of 8, 640 x 480 integer image arrays > from a FITS cube concatenated into a 3D array, and I want to sum each > pixel such that the result ignores clipped values (255+); i.e., if two > images have clipped pixels at (x,y) the result along z will be the sum > of the other 6. > Memory doesn't seem critical for such small arrays, so you can just do sum([where(a < 255, a, 0) for a in images]) Konrad. ------------------------------------------------------- The SF.Net email is sponsored by EclipseCon 2004 Premiere Conference on Open Tools Development and Integration See the breadth of Eclipse activity. February 3-5 in Anaheim, CA. http://www.eclipsecon.org/osdn _______________________________________________ Numpy-discussion mailing list Numpy-discussion at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/numpy-discussion From nancyk at MIT.EDU Mon Jan 26 08:39:09 2004 From: nancyk at MIT.EDU (Nancy Keuss) Date: Mon Jan 26 08:39:09 2004 Subject: [Numpy-discussion] simple Numarray question Message-ID: Hi, What do I have to include in my Python file for Python to recognize Numarray functions? For instance, in a file called hello.py I try: a = arange(10) print a[1:5] and I get the error: Traceback (most recent call last): File "C:\Python23\hello.py", line 3, in ? a = arange(10) NameError: name 'arange' is not defined Thank you, Nancy From jsaenz at wm.lc.ehu.es Mon Jan 26 08:47:09 2004 From: jsaenz at wm.lc.ehu.es (Jon Saenz) Date: Mon Jan 26 08:47:09 2004 Subject: [Numpy-discussion] simple Numarray question In-Reply-To: Message-ID: Read point 6 of the Python Tutorial. Read chapter 2 (Installing NumPy) of Numeric Python manual. Hope this helps. Jon Saenz. | Tfno: +34 946012445 Depto. Fisica Aplicada II | Fax: +34 944648500 Facultad de Ciencias. \\ Universidad del Pais Vasco \\ Apdo. 644 \\ 48080 - Bilbao \\ SPAIN On Mon, 26 Jan 2004, Nancy Keuss wrote: > Hi, > > What do I have to include in my Python file for Python to recognize Numarray > functions? For instance, in a file called hello.py I try: > > a = arange(10) > print a[1:5] > > and I get the error: > > Traceback (most recent call last): > File "C:\Python23\hello.py", line 3, in ? > a = arange(10) > NameError: name 'arange' is not defined > > Thank you, > Nancy > > > > ------------------------------------------------------- > The SF.Net email is sponsored by EclipseCon 2004 > Premiere Conference on Open Tools Development and Integration > See the breadth of Eclipse activity. February 3-5 in Anaheim, CA. > http://www.eclipsecon.org/osdn > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From jmiller at stsci.edu Mon Jan 26 08:58:13 2004 From: jmiller at stsci.edu (Todd Miller) Date: Mon Jan 26 08:58:13 2004 Subject: [Numpy-discussion] simple Numarray question In-Reply-To: References: Message-ID: <1075136211.16499.24.camel@localhost.localdomain> On Mon, 2004-01-26 at 11:38, Nancy Keuss wrote: > Hi, > > What do I have to include in my Python file for Python to recognize Numarray > functions? For instance, in a file called hello.py I try: > > a = arange(10) > print a[1:5] > > and I get the error: > > Traceback (most recent call last): > File "C:\Python23\hello.py", line 3, in ? 
> a = arange(10) > NameError: name 'arange' is not defined > There are a number of ways to import numarray (or any Python module), but the way I recommend is this: import numarray a = numarray.arange(10) print a[1:5] If you're writing quick scripts that you're not worried about maintaining, do this: from numarray import * a = arange(10) print a[1:5] If writing "numarray." is too tedious, but you still care about maintenance, try something like this: import numarray as _n a = _n.arange(10) print a[1:5] Todd > Thank you, > Nancy > > > > ------------------------------------------------------- > The SF.Net email is sponsored by EclipseCon 2004 > Premiere Conference on Open Tools Development and Integration > See the breadth of Eclipse activity. February 3-5 in Anaheim, CA. > http://www.eclipsecon.org/osdn > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion -- Todd Miller From Chris.Barker at noaa.gov Mon Jan 26 09:52:03 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Mon Jan 26 09:52:03 2004 Subject: [Numpy-discussion] Status of Numeric In-Reply-To: <200401231038.58199.falted@openlc.org> References: <4010CC24.3040500@prescod.net> <200401231038.58199.falted@openlc.org> Message-ID: <401552DF.3000505@noaa.gov> I remember that thread clearly, as I think making it easy to write new Ufuncs (and others) that perform at C speed could make a real difference to how effective SciPy ultimately is. I say SciPy, because I believe a large collection of special purpose optimized functions probably doesn't belong in in Numarray itself. Francesc Alted wrote: > So, it seems that he don't liked the idea to implement "templates" in Pyrex. Yes, I remember that answer, and was disappointed, though the logic of not-re-implkimenting C++ templates is pretty obvious. Which brings up the obvious question: why not use C++ templates themselves? which is what Blitz does. This points ot weave.blitz at the obvious way to write optimized special purpose functions for SciPy. Does weave.Blitz work with Numarray yet? Clearly it's time for me to check it out more... > Yeah, I'm quite convinced that a mix between Pyrex and the existing solution > in numarray for dealing with templates could be worth the effort. At least, > some analysis could be done on that aspect. allowing Pyrex to use templates would be great.. but how would that be better than weave.blitz? Or maybe Pyrex could use blitz. I'm kind of over my head here, but I hope something comes of this. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From j_r_fonseca at yahoo.co.uk Mon Jan 26 10:11:02 2004 From: j_r_fonseca at yahoo.co.uk (=?iso-8859-1?Q?Jos=E9?= Fonseca) Date: Mon Jan 26 10:11:02 2004 Subject: [Numpy-discussion] Searchable online documentation of Numeric Python Message-ID: I've made this mainly for myself, but in case anybody here finds it useful, here is the online documentation of Numeric Python with TOC/index/search navigation: http://mefriss1.swan.ac.uk/htmlhelp/php/index.php?book_id=49 There's also Python documention there if you fancy it. NOTE: The URL may go dead eventually, but until I'll arrange a way to have this documentation elsewhere, so it won't worry about that. I hope you enjoy it. 
Jose Fonseca From dubois1 at llnl.gov Mon Jan 26 11:31:02 2004 From: dubois1 at llnl.gov (Paul F. Dubois) Date: Mon Jan 26 11:31:02 2004 Subject: [Numpy-discussion] cygwin problems? Message-ID: <40156AC3.1080501@llnl.gov> A user (see below) has complained that svd and other functions hang on cygwin, with numarray and Numeric. Anyone know anything about this? Hi Paul I tried the svd of numarray, version 0.6.2, I did not download the newest version of numarray, and still the same problem happen, it hanged on my cygwin box. I've also, noticed that calculating the eigenvalues of a square matrix with Numeric also hang the python 2.3 I wonder if the problem is related.... Let me know, if you have some idea of what to do next. From rays at blue-cove.com Mon Jan 26 11:55:09 2004 From: rays at blue-cove.com (Ray Schumacher) Date: Mon Jan 26 11:55:09 2004 Subject: [Numpy-discussion] summing "sparse" 2D arrays? Results... Message-ID: <5.2.0.4.2.20040126111759.09418258@blue-cove.com> I just realized... where() belongs to Numeric, so I need sum([Numeric.where(a < 255, a, 0) for a in y]) duh. I did just compare Numeric vs. Masked arrays: =========================================================== # test.py from MA import masked_array, sum from RandomArray import * import time seed() y = randint(240,256, (480,640,16)) start = time.time() x=masked_array(y, y>=255) maskTime = time.time() - start sum_1 = sum(x,axis=2) maskedTime = time.time() - start print sum_1.shape print sum_1 print "mask make time: " + str(maskTime) print "time using MA: " + str(maskedTime) + "\n" z = Numeric.reshape(y, (16, 480, 640)) newStart = time.time() sum_2 = sum([Numeric.where(a < 255, a, 2) for a in z]) numTime = time.time() - newStart print sum_2.shape print sum_2 print "time using Numeric: " + str(numTime) + "\n" ====================================================== Result: C:\projects\Astro>python test.py (480, 640) array (480,640) , type = O, has 307200 elements mask make time: 1.07899999619 time using MA: 3.39100003242 (480, 640) array (480,640) , type = l, has 307200 elements time using Numeric: 2.39099979401 So, MA's sum() is slightly faster, but the penalty for making a mask first is large. Now I have to figure out why I had to reshape the array for the second computation. Thanks, Ray At 09:17 AM 1/26/2004 +0100, you wrote: >On 26.01.2004, at 07:14, RJS wrote: > >>The problem: I have a "stack" of 8, 640 x 480 integer image arrays from >>a FITS cube concatenated into a 3D array, and I want to sum each pixel >>such that the result ignores clipped values (255+); i.e., if two images >>have clipped pixels at (x,y) the result along z will be the sum of the other 6. >Memory doesn't seem critical for such small arrays, so you can just do > >sum([where(a < 255, a, 0) for a in images]) Hello Konrad, I just tried: from MA import masked_array, sum from RandomArray import * seed() y = randint(240,256, (480,640,2)) print sum([where(a < 255, a, 0) for a in y]) and it errors: Traceback (most recent call last): File "test.py", line 21, in ? print sum([where(a < 255, a, 0) for a in y]) NameError: name 'where' is not defined Could you enlighten me further? I have not found a good resource for compound Numeric statements yet. Thank you, Ray From mdehoon at ims.u-tokyo.ac.jp Mon Jan 26 17:42:01 2004 From: mdehoon at ims.u-tokyo.ac.jp (Michiel Jan Laurens de Hoon) Date: Mon Jan 26 17:42:01 2004 Subject: [Numpy-discussion] cygwin problems? 
In-Reply-To: <40156AC3.1080501@llnl.gov> References: <40156AC3.1080501@llnl.gov> Message-ID: <4015C296.5000707@ims.u-tokyo.ac.jp> Patch 732520 for Numeric fixes both problems. The problem is caused by some lapack routines being inadvertently compiled with optimization; see the patch description for a full explanation. Note that the same error may occur on platforms other than Cygwin, and also with other linear algebra functions. --Michiel, U Tokyo. Paul F. Dubois wrote: > A user (see below) has complained that svd and other functions hang on > cygwin, with numarray and Numeric. Anyone know anything about this? > > Hi Paul > > I tried the svd of numarray, version 0.6.2, I did not download the > newest version of > numarray, and still the same problem happen, it hanged on my cygwin box. > > I've also, noticed that calculating the eigenvalues of a square matrix > with Numeric also > hang the python 2.3 I wonder if the problem is related.... > > Let me know, if you have some idea of what to do next. > > > > > > > > ------------------------------------------------------- > The SF.Net email is sponsored by EclipseCon 2004 > Premiere Conference on Open Tools Development and Integration > See the breadth of Eclipse activity. February 3-5 in Anaheim, CA. > http://www.eclipsecon.org/osdn > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > -- Michiel de Hoon, Assistant Professor University of Tokyo, Institute of Medical Science Human Genome Center 4-6-1 Shirokane-dai, Minato-ku Tokyo 108-8639 Japan http://bonsai.ims.u-tokyo.ac.jp/~mdehoon
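For anyone who wants to check whether their own build shows the hang described above before applying the patch, a minimal test script along these lines can be used; this is only a sketch, and the 3x3 matrix is arbitrary. On an affected Cygwin build the script should stall at one of the two calls instead of printing results.

# hang_check.py -- minimal sketch; an affected build hangs instead of printing.
import Numeric
import LinearAlgebra
a = Numeric.array([[2.0, 0.0, 1.0],
                   [0.0, 3.0, 0.0],
                   [1.0, 0.0, 2.0]])
print "eigenvalues:", LinearAlgebra.eigenvalues(a)
print "singular values:", LinearAlgebra.singular_value_decomposition(a)[1]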
From ariciputi at pito.com Tue Jan 27 09:45:26 2004 From: ariciputi at pito.com (Andrea Riciputi) Date: Tue Jan 27 09:45:26 2004 Subject: [Numpy-discussion] Writing arrays to files. Message-ID: <0627057D-50DE-11D8-9CFE-000393933E4E@pito.com> Hi, I need a little help here. I need to write some Numeric arrays (coming from some simulations) to ASCII files. I'm sure it's a well known topic and I'd be happy not to have to reinvent the wheel. I've both 1-dim and 2-dim arrays and I'd like to get something like this: - 1-dim array ASCII file: 0.1 0.2 0.3 etc... - 2-dim array ASCII file: 0.1 0.2 0.3 0.4 0.5 0.6 etc.... How can I get them? Thanks in advance, Andrea. --- Andrea Riciputi "Science is like sex: sometimes something useful comes out, but that is not the reason we are doing it" -- (Richard Feynman) From jdhunter at ace.bsd.uchicago.edu Tue Jan 27 10:45:56 2004 From: jdhunter at ace.bsd.uchicago.edu (John Hunter) Date: Tue Jan 27 10:45:56 2004 Subject: [Numpy-discussion] Writing arrays to files. In-Reply-To: <0627057D-50DE-11D8-9CFE-000393933E4E@pito.com> (Andrea Riciputi's message of "Tue, 27 Jan 2004 16:32:32 +0100") References: <0627057D-50DE-11D8-9CFE-000393933E4E@pito.com> Message-ID: >>>>> "Andrea" == Andrea Riciputi writes: Andrea> Hi, I need a little help here. I need to write some Andrea> Numeric arrays (coming from some simulations) to ASCII Andrea> files. I'm sure it's a well known topic and I'd be happy Andrea> not to have to reinvent the wheel. I've both 1-dim and Andrea> 2-dim arrays and I'd like to get something like this: Andrea> - 1-dim array ASCII file: Andrea> 0.1 0.2 0.3 etc... If your array is not monstrously large, and you can do it all in memory, do fh = file('somefile.dat', 'w') s = ' '.join([str(val) for val in a]) fh.write(s) where a is your 1D array Andrea> - 2-dim array ASCII file: Andrea> 0.1 0.2 0.3 0.4 0.5 0.6 etc.... Andrea> How can I get them?
Same idea fh = file('somefile.dat', 'w') for row in a: s = ' '.join([str(val) for val in row]) fh.write('%s\n' % s) where a is your 2D array The scipy module also has support for reading and writing ASCII files. Note if you are concerned about efficiency and are willing to use binary files, use the fromstring and tostring methods. JDH From jdhunter at ace.bsd.uchicago.edu Tue Jan 27 15:07:11 2004 From: jdhunter at ace.bsd.uchicago.edu (John Hunter) Date: Tue Jan 27 15:07:11 2004 Subject: [Numpy-discussion] Writing arrays to files. In-Reply-To: (Andrea Riciputi's message of "Tue, 27 Jan 2004 23:25:20 +0100") References: <0627057D-50DE-11D8-9CFE-000393933E4E@pito.com> Message-ID: >>>>> "Andrea" == Andrea Riciputi writes: Andrea> On 27 Jan 2004, at 19:31, John Hunter wrote: >> If your array is not monstrously large, and you can do it all >> in memory, do Andrea> 1-dim arrays with 1000 elements and 2-dim arrays with Andrea> (1000 x 1000) elements have to be considered "monstrously Andrea> large"? You should have no trouble with either the 1D or 2D approaches I posted with arrays this size. Even though 1000x1000 is a lot of elements, the 2D approach does the string operations row by row, so only 1000 will be converted at a time, which will be trivial for all but the clunkiest machines. JDH From Mailer-Daemon at ensm-douai.fr Wed Jan 28 04:41:07 2004 From: Mailer-Daemon at ensm-douai.fr (Mail Delivery Subsystem) Date: Wed Jan 28 04:41:07 2004 Subject: [Numpy-discussion] Returned mail: see transcript for details Message-ID: <200401281240.i0SCeAp7030686@ecole.ensm-douai.fr> The original message was received at Wed, 28 Jan 2004 13:40:10 +0100 from viruswall.ensm-douai.fr [10.1.1.22] ----- The following addresses had permanent fatal errors ----- (reason: 501 numpy-discussion at lists.sourceforge.net... Unauthorized sender) ----- Transcript of session follows ----- ... while talking to [10.1.1.1]: >>> MAIL From: <<< 501 numpy-discussion at lists.sourceforge.net... Unauthorized sender 501 5.6.0 Data format error -------------- next part -------------- An embedded message was scrubbed... From: unknown sender Subject: no subject Date: no date Size: 38 URL: From numpy-discussion at lists.sourceforge.net Wed Jan 28 07:39:03 2004 From: numpy-discussion at lists.sourceforge.net (numpy-discussion at lists.sourceforge.net) Date: Wed, 28 Jan 2004 13:39:03 +0100 Subject: No subject Message-ID: <200401281240.i0SCeAp7030683@ecole.ensm-douai.fr> The message cannot be represented in 7-bit ASCII encoding and has been sent as a binary attachment. -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: InterScan_SafeStamp.txt URL: From niall at lastminute.com Wed Jan 28 09:47:04 2004 From: niall at lastminute.com (Niall Dalton) Date: Wed Jan 28 09:47:04 2004 Subject: [Numpy-discussion] atan2 Message-ID: <1075311725.3856.69.camel@localhost.localdomain> Hello, I'm using Numeric 23.1, and finding it very useful! I do need to use atan2, and a browse of the manual suggests its not available as a binary ufunc. I'm happy to add it myself if I'm correct - I'm guessing it should be simple to based on the code of one of the existing functions. Is Numeric still accepting patches, or should I consider switching to Numarray? Regards, Niall ________________________________________________________________________ This e-mail has been scanned for all viruses by Star Internet. The service is powered by MessageLabs. 
For more information on a proactive anti-virus service working around the clock, around the globe, visit: http://www.star.net.uk ________________________________________________________________________ From jdhunter at ace.bsd.uchicago.edu Wed Jan 28 09:54:09 2004 From: jdhunter at ace.bsd.uchicago.edu (John Hunter) Date: Wed Jan 28 09:54:09 2004 Subject: [Numpy-discussion] atan2 In-Reply-To: <1075311725.3856.69.camel@localhost.localdomain> (Niall Dalton's message of "Wed, 28 Jan 2004 17:42:05 +0000") References: <1075311725.3856.69.camel@localhost.localdomain> Message-ID: >>>>> "Niall" == Niall Dalton writes: Niall> Hello, I'm using Numeric 23.1, and finding it very useful! Niall> I do need to use atan2, and a browse of the manual suggests Niall> its not available as a binary ufunc. Numeric.arctan2 If you run into a similar problem in the future, you may want to try >>> import Numeric >>> dir(Numeric) Hope this helps, JDH From niall at lastminute.com Wed Jan 28 10:03:03 2004 From: niall at lastminute.com (Niall Dalton) Date: Wed Jan 28 10:03:03 2004 Subject: [Numpy-discussion] atan2 In-Reply-To: References: <1075311725.3856.69.camel@localhost.localdomain> Message-ID: <1075312803.3856.71.camel@localhost.localdomain> On Wed, 2004-01-28 at 17:39, John Hunter wrote: > >>>>> "Niall" == Niall Dalton writes: > > Niall> I do need to use atan2 > Numeric.arctan2 > > If you run into a similar problem in the future, you may want to try > > >>> import Numeric > >>> dir(Numeric) > > Hope this helps, It does indeed, thanks! I blame the first snowfall we just had minutes ago for the oversight - its my story and I'm sticking to it ;-) Thanks, niall ________________________________________________________________________ This e-mail has been scanned for all viruses by Star Internet. The service is powered by MessageLabs. For more information on a proactive anti-virus service working around the clock, around the globe, visit: http://www.star.net.uk ________________________________________________________________________ From ariciputi at pito.com Thu Jan 29 11:27:03 2004 From: ariciputi at pito.com (Andrea Riciputi) Date: Thu Jan 29 11:27:03 2004 Subject: [Numpy-discussion] Writing arrays to files. In-Reply-To: References: <0627057D-50DE-11D8-9CFE-000393933E4E@pito.com> Message-ID: On 27 Jan 2004, at 19:31, John Hunter wrote: > If your array is not monstrously large, and you can do it all in > memory, do 1-dim arrays with 1000 elements and 2-dim arrays with (1000 x 1000) elements have to be considered "monstrously large"? Cheers, Andrea. --- Andrea Riciputi "Science is like sex: sometimes something useful comes out, but that is not the reason we are doing it" -- (Richard Feynman) From zk4hcgg at spray.se Fri Jan 30 07:40:00 2004 From: zk4hcgg at spray.se (Chuck Meadows) Date: Fri Jan 30 07:40:00 2004 Subject: [Numpy-discussion] Protect Yourself now Message-ID: An HTML attachment was scrubbed... URL: From dholth at fastmail.fm Fri Jan 30 13:23:06 2004 From: dholth at fastmail.fm (Daniel Holth) Date: Fri Jan 30 13:23:06 2004 Subject: [Numpy-discussion] shape, size Message-ID: <1075435976.13335.4.camel@bluefish> In python, if na is a numarray: >>> na array([[0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0], [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1]]) I can type >>> nb = na[:,:4] >>> nb array([[0, 1, 0, 1], [1, 0, 1, 0]]) >>> nb[0][0]=17 >>> na array([[17, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0], [ 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1]]) nb and na share data. 
How do you write nb = na[:,:4] in a C extension module? Thanks, Daniel Holth From jmiller at stsci.edu Fri Jan 30 13:50:02 2004 From: jmiller at stsci.edu (Todd Miller) Date: Fri Jan 30 13:50:02 2004 Subject: [Numpy-discussion] shape, size In-Reply-To: <1075435976.13335.4.camel@bluefish> References: <1075435976.13335.4.camel@bluefish> Message-ID: <1075499339.9028.16.camel@halloween.stsci.edu> On Thu, 2004-01-29 at 23:12, Daniel Holth wrote: > In python, if na is a numarray: > > >>> na > array([[0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0], > [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1]]) > > I can type > > >>> nb = na[:,:4] > >>> nb > array([[0, 1, 0, 1], > [1, 0, 1, 0]]) > > >>> nb[0][0]=17 > > >>> na > array([[17, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0], > [ 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1]]) > > nb and na share data. > > How do you write nb = na[:,:4] in a C extension module? Here's the quick and dirty way: nb = (PyArrayObject *) PyObject_CallMethod(na, "view", NULL); if (na->dimensions[1] >= 4) nb->dimensions[1] = 4; Todd > > Thanks, > > Daniel Holth > > > > ------------------------------------------------------- > The SF.Net email is sponsored by EclipseCon 2004 > Premiere Conference on Open Tools Development and Integration > See the breadth of Eclipse activity. February 3-5 in Anaheim, CA. > http://www.eclipsecon.org/osdn > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion -- Todd Miller Space Telescope Science Institute 3700 San Martin Drive Baltimore MD, 21030 (410) 338 - 4576 From dholth at sent.com Fri Jan 30 14:27:02 2004 From: dholth at sent.com (Daniel Holth) Date: Fri Jan 30 14:27:02 2004 Subject: [Numpy-discussion] shape, size Message-ID: <1075436585.13335.7.camel@bluefish> apoligies if this is a duplicate... How do you write nb = na[:,:4] in a C extension module? Thanks, Daniel Holth From austin at magicfish.net Fri Jan 30 18:33:36 2004 From: austin at magicfish.net (Austin Luminais) Date: Fri Jan 30 18:33:36 2004 Subject: [Numpy-discussion] numarray-0.8.1 Message-ID: <6.0.0.22.0.20040130202402.0281ee10@mail.magicfish.net> Hello, is there any place I can download a Windows installer for numarray 0.8.1? I upgraded to 0.8.2 a while back, but it does not work with McMillan's Installer. 0.8.1 worked fine, but I neglected to keep a copy of it. As for why it doesn't work with Installer, I'm not sure. At least part of the problem is that it is hardcoded to load LICENSE.TXT in __init__.py in a way that is incompatible with Installer. I tried removing the loading of LICENSE.TXT (which I realize is a questionable thing to do; I was just trying to get it working), but it doesn't work after that either. From jdhunter at ace.bsd.uchicago.edu Fri Jan 30 19:49:03 2004 From: jdhunter at ace.bsd.uchicago.edu (John Hunter) Date: Fri Jan 30 19:49:03 2004 Subject: [Numpy-discussion] [Daniel Holth ] numarray question Message-ID: Daniel asked me to forward this, as apparently he has had trouble getting mail through. -------------- next part -------------- An embedded message was scrubbed... 
From: unknown sender Subject: no subject Date: no date Size: 38 URL: From dholth at fastmail.fm Thu Jan 29 23:30:29 2004 From: dholth at fastmail.fm (Daniel Holth) Date: Thu, 29 Jan 2004 23:30:29 -0500 Subject: numarray question Message-ID: <1075437029.13335.12.camel@bluefish> JDH, sf.net seems to be ignoring my messages, so would you relay this question: how do you write nb = na[:,:4], creating a slice of na that references nb, in an extension? thanks, Daniel Holth --=-=-=-- From dholth at fastmail.fm Fri Jan 30 21:31:34 2004 From: dholth at fastmail.fm (Daniel Holth) Date: Fri Jan 30 21:31:34 2004 Subject: [Numpy-discussion] array resizing? Message-ID: <1075353684.10541.2.camel@bluefish> Is it possible for a C function to take an array from Python and resize it for returned results? From rays at blue-cove.com Sat Jan 31 07:04:02 2004 From: rays at blue-cove.com (RayS) Date: Sat Jan 31 07:04:02 2004 Subject: [Numpy-discussion] numarray-0.8.1 In-Reply-To: <6.0.0.22.0.20040130202402.0281ee10@mail.magicfish.net> Message-ID: <5.2.1.1.2.20040131070100.0810f060@216.122.242.54> At 08:32 PM 1/30/04 -0600, Austin Luminais wrote: >As for why it doesn't work with Installer, I'm not sure. At least part of >the problem is that it is hardcoded to load LICENSE.TXT in __init__.py in >a way that is incompatible with Installer. >I tried removing the loading of LICENSE.TXT (which I realize is a >questionable thing to do; I was just trying to get it working), but it >doesn't work after that either. I saw this thread before: http://aspn.activestate.com/ASPN/Mail/Message/numpy-discussion/1967514 seems the solution though I haven't had to try it. I prefer McMillan to py2exe for it's smaller exe-s. Ray
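A closing note on the LICENSE.TXT problem discussed just above: the usual shape of a workaround is to make the data-file read defensive, so that a frozen executable which did not bundle the file can still import the package. The following is a hypothetical sketch only -- it is not numarray's actual __init__.py, and the attribute name __LICENSE__ and the fallback text are made up for this example.

# Hypothetical sketch of a defensive license-file read inside a package's
# __init__.py; not numarray's actual code.
import os
try:
    _f = open(os.path.join(os.path.dirname(__file__), "LICENSE.TXT"))
    __LICENSE__ = _f.read()
    _f.close()
except (IOError, OSError):
    # Frozen executables (McMillan Installer, py2exe) may not bundle the file.
    __LICENSE__ = "LICENSE.TXT not available in this installation"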