[Numpy-discussion] merge_arrays is very slow; alternatives?
Gerrit Holl
gerrit.holl at gmail.com
Fri Nov 26 14:16:56 EST 2010
Hi,
Upon profiling my code, I found that
numpy.lib.recfunctions.merge_arrays is extremely slow: it processes
only about 7000 rows per second. That is not acceptable for me.
I have two large record arrays, i.e. arrays with a complicated dtype.
All I want to do is merge them into one. I don't think that should
have to be a slow operation: I don't need to copy anything, I
just want to view the two record arrays as one.
How can I do this faster?
In [45]: cProfile.runctx("numpy.lib.recfunctions.merge_arrays([metarows, targetrows2], flatten=True)", globals(), locals())
         225381902 function calls (150254635 primitive calls) in 166.620 CPU seconds

   Ordered by: standard name

            ncalls  tottime  percall  cumtime  percall filename:lineno(function)
                 1    0.031    0.031  166.620  166.620 <string>:1(<module>)
              68/1    0.000    0.000    0.000    0.000 _internal.py:82(_array_descr)
                 2    0.000    0.000    0.000    0.000 numeric.py:286(asanyarray)
                 2    0.000    0.000    0.000    0.000 recfunctions.py:135(flatten_descr)
                 1    0.000    0.000    0.001    0.001 recfunctions.py:161(zip_descr)
149165600/74038400  117.195    0.000  139.701    0.000 recfunctions.py:235(_izip_fields_flat)
           1088801   12.146    0.000  151.847    0.000 recfunctions.py:263(izip_records)
                 3    0.000    0.000    0.000    0.000 recfunctions.py:277(sentinel)
                 1    4.599    4.599  166.589  166.589 recfunctions.py:328(merge_arrays)
                 3    0.000    0.000    0.000    0.000 recfunctions.py:406(<genexpr>)
          75127201   22.506    0.000   22.506    0.000 {isinstance}
                69    0.000    0.000    0.000    0.000 {len}
                 1    0.000    0.000    0.000    0.000 {map}
                 1    0.000    0.000    0.000    0.000 {max}
                 2    0.000    0.000    0.000    0.000 {method '__array__' of 'numpy.ndarray' objects}
               136    0.000    0.000    0.000    0.000 {method 'append' of 'list' objects}
                 1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
                 2    0.000    0.000    0.000    0.000 {method 'extend' of 'list' objects}
                 2    0.000    0.000    0.000    0.000 {method 'pop' of 'list' objects}
                 2    0.000    0.000    0.000    0.000 {method 'ravel' of 'numpy.ndarray' objects}
                 2    0.000    0.000    0.000    0.000 {numpy.core.multiarray.array}
                 1   10.142   10.142   10.142   10.142 {numpy.core.multiarray.fromiter}
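
The profile above shows nearly all of the time going to per-row Python
iteration (_izip_fields_flat and izip_records). A minimal sketch of one
possible alternative is to allocate the output once and copy one field
at a time, so that the Python loop runs over fields rather than rows.
The helper name merge_fields is hypothetical, and the sketch assumes
both arrays have the same shape, unique field names, and flat
(non-nested) dtypes. Unlike a true view it does copy the data, since two
separately allocated arrays cannot in general be viewed as one
structured array.

```python
import numpy as np


def merge_fields(a, b):
    """Merge two same-shape structured arrays by copying field by field.

    Hypothetical helper, not part of numpy. Assumes unique field names
    across both inputs and flat (non-nested) dtypes.
    """
    if a.shape != b.shape:
        raise ValueError("both arrays must have the same shape")

    # Combined dtype: the fields of a followed by the fields of b.
    # dtype.fields[name][0] is the dtype of that field.
    descr = [(name, arr.dtype.fields[name][0])
             for arr in (a, b)
             for name in arr.dtype.names]
    out = np.empty(a.shape, dtype=np.dtype(descr))

    # One vectorised copy per field; no per-row Python iteration.
    for arr in (a, b):
        for name in arr.dtype.names:
            out[name] = arr[name]
    return out
```

With the arrays from the session above, this would be called as
merge_fields(metarows, targetrows2). Whether it is fast enough depends
on the number of fields, since the cost of the Python loop scales with
fields rather than with rows.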
Gerrit.
--
Gerrit Holl
PhD student at Department of Space Science, Luleå University of
Technology, Kiruna, Sweden
http://www.sat.ltu.se/members/gerrit/