[Numpy-discussion] mysql -> record array

Travis Oliphant oliphant at ee.byu.edu
Thu Nov 16 19:44:56 EST 2006


John Hunter wrote:

>>>>>> "Erin" == Erin Sheldon <erin.sheldon at gmail.com> writes:
>
>    Erin> The question I have been asking myself is "what is the
>    Erin> advantage of such an approach?".  It would be faster, but by
>
> In the use case that prompted this message, the pull from mysql took
> almost 3 seconds, and the conversion from lists to numpy arrays took
> more than 4 seconds.  We have a list of about 500000 2-tuples of
> floats.
>
> Digging in a little bit, we found that numpy is about 3x slower than
> Numeric here:
>
>  peds-pc311:~> python test.py
>  with dtype: 4.25 elapsed seconds
>  w/o dtype 5.79 elapsed seconds
>  Numeric  1.58 elapsed seconds
>  24.0b2
>  1.0.1.dev3432
>
> Hmm... So maybe the question is -- is there some low-hanging fruit
> here to speed numpy up?
>
> import time
> import numpy
> import numpy.random
> rand = numpy.random.rand
>
> x = [(rand(), rand()) for i in xrange(500000)]
> tnow = time.time()
> y = numpy.array(x, dtype=numpy.float_)
> tdone = time.time()
> print 'with dtype: %1.2f elapsed seconds'%(tdone - tnow)
>
> tnow = time.time()
> y = numpy.array(x)
> tdone = time.time()
> print 'w/o dtype %1.2f elapsed seconds'%(tdone - tnow)
>
> import Numeric
> tnow = time.time()
> y = Numeric.array(x, Numeric.Float)
> tdone = time.time()
> print 'Numeric  %1.2f elapsed seconds'%(tdone - tnow)
>
> print Numeric.__version__
> print numpy.__version__
>  
>

I just adapted numarray's version of array (the one that uses the
fromlist method) to NumPy.  Because array is called in many, many ways,
this change needs some testing.  But I think it should be right (all
tests of numpy and scipy pass with it).
With the change I get:

with dtype: 0.22 elapsed seconds
w/o dtype 5.02 elapsed seconds
Numeric  7.38 elapsed seconds
numarray  0.55 elapsed seconds
24.2
1.0.1.dev3437
1.5.1
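
For anyone who wants to rerun the comparison on a current setup, here is
a minimal sketch of the same timing in Python 3.  It covers only the two
numpy cases and leaves out Numeric and numarray.  The function name
time_conversions, its n and seed parameters, and the use of
numpy.random.default_rng and time.perf_counter are assumptions of this
sketch, not part of the original test.py.  Timings depend on the machine
and the numpy version, so no expected numbers are given.

```python
import time

import numpy as np


def time_conversions(n=500000, seed=0):
    """Time list-of-tuples -> ndarray conversion with and without a dtype.

    Hypothetical helper for illustration; not from the original test.py.
    """
    rng = np.random.default_rng(seed)
    # Build a list of n 2-tuples of Python floats, as in the quoted test.
    x = [(a, b) for a, b in rng.random((n, 2)).tolist()]

    t0 = time.perf_counter()
    with_dtype = np.array(x, dtype=np.float64)
    t1 = time.perf_counter()
    without_dtype = np.array(x)
    t2 = time.perf_counter()

    timings = {"with dtype": t1 - t0, "w/o dtype": t2 - t1}
    return timings, with_dtype, without_dtype


if __name__ == "__main__":
    timings, a, b = time_conversions()
    for label, seconds in timings.items():
        print("%s: %.2f elapsed seconds" % (label, seconds))
    print(np.__version__)
```

Both calls should produce the same (n, 2) float64 array; only the time
spent getting there differs.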



-Travis





