[Numpy-discussion] Matlab/Numeric/numarray benchmarks
Jason Rennie
jrennie at csail.mit.edu
Thu Jan 20 07:25:49 EST 2005
I have access to a variety of intel machines running Debian Sarge, and
I'm trying to decide between numarray and Numeric for some experiments
I'm about to run, so I thought I'd try out this benchmark. I need
fast matrix multiplication and element-wise operations. Here are the
results I see:
Celeron/2.8GHz
--------------
Matlab:   0.0475  1.44  5.78
Numeric:  0.0842  1.19  6.28
numarray: 7.62    9.78  Floating point exception

Pentium4/2.8GHz
---------------
Matlab:   0.0143  1.00  3.08
Numeric:  0.0653  1.19  6.26
numarray: 3.46    8.30  Floating point exception

DualXeon/3.06GHz
----------------
Matlab:   0.0102  0.886  2.71
Numeric:  0.0272  10.2   2.46
numarray: 2.23    3.43   Floating point exception
Numarray performance is pitiful. Numeric ain't bad, except for that
matrixmultiply on the Xeon (10.2, versus 1.19 on the other two
machines). As luck would have it, our
cpu-cycle-servers are all Xeons, and the main big computations I have
to do are matrix multiplies... Grrr...
All three machines are Debian Sarge with atlas3-sse2 plus all the
python2.3 packages installed. I had to include /usr/lib/atlas/sse2 in
my LD_LIBRARY_PATH. Anyone have any clue why the Xeon would balk at
the Numeric matrixmultiply? Thinking it might be an atlas3-sse2
issue, I tried atlas-sse:
Xeon/atlas3-sse/Numeric:  0.0269  10.2  2.44
Xeon/atlas3-sse/numarray: 2.24    3.41  2.48
Apparently, there's a bug in the sse2 libraries that numarray is
tripping... Still horrible Numeric/matrixmultiply
performance... Interesting that sse2 doesn't provide a performance
boost over sse. I tried it on another Xeon machine... same bad
Numeric/matrixmultiply performance. I tried atlas3-base (386
instructions only):
Xeon/atlas3-base/Numeric:  0.0269  10.2  2.60
Xeon/atlas3-base/numarray: 2.23    3.41  2.54
Sheesh! No worse than the libraries w/ sse instructions... But
still, no improvement in the Numeric/matrixmultiply test. Next,
refblas3/lapack3:
Xeon/Numeric:  0.0271  3.45  2.72
Xeon/numarray: 2.24    3.42  2.62
Progress! Though, the Numeric/matrixmultiply is still four times
slower than Matlab...
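For reference, the "four times slower" figure can be checked from the numbers above. This is just a quick sketch of the arithmetic, assuming the second column of each result line is the matrixmultiply test (that is how the results are read throughout this post); the variable and function names are made up for illustration.

```python
# Second-column timings for the Xeon, copied from the results above.
# Assumption: this column is the matrixmultiply test.
matlab = 0.886
numeric_atlas = 10.2    # same value with atlas3-sse2, atlas3-sse and atlas3-base
numeric_refblas = 3.45  # refblas3/lapack3


def slowdown(t, baseline):
    """Return how many times slower t is than baseline."""
    return t / baseline


atlas_ratio = slowdown(numeric_atlas, matlab)
refblas_ratio = slowdown(numeric_refblas, matlab)

print("Numeric + ATLAS   vs Matlab: %.1fx" % atlas_ratio)
print("Numeric + refblas vs Matlab: %.1fx" % refblas_ratio)
```

So switching from ATLAS to refblas3 takes Numeric from roughly 11.5 times slower than Matlab to a bit under 4 times slower on this test.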
As far as I can tell, I'm out of (Debian Sarge) libraries to
try... Any ideas as to why the Numeric matrixmultiply would be so slow
on the Xeon?
Thanks,
Jason
P.S. I had to move the import statements to the top of the file to get
benchmark.py to work. As a sanity check, I tried only importing sys,
time, Numeric, and RandomArray, defining test10. I then called
test10(). Same results as above.
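For anyone who wants to repeat this kind of sanity check without the full benchmark, a stripped-down timing harness might look roughly like the sketch below. It is not test10 from benchmark.py: best_time and matmul are hypothetical names, the pure-Python matmul is only a stand-in for the library call being timed, and it is written for a current Python 3 (time.perf_counter) rather than python2.3.

```python
import time


def matmul(a, b):
    """Pure-Python matrix product of two lists of rows (stand-in only)."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b]
            for row in a]


def best_time(func, repeats=3):
    """Call func() `repeats` times; return the fastest wall-clock time in seconds."""
    best = None
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        elapsed = time.perf_counter() - start
        if best is None or elapsed < best:
            best = elapsed
    return best


# Small square matrices so the stand-in finishes quickly.
n = 60
a = [[float(i + j) for j in range(n)] for i in range(n)]
b = [[float(i - j) for j in range(n)] for i in range(n)]

print("matmul %dx%d: %.4f s" % (n, n, best_time(lambda: matmul(a, b))))
```

To time a real library, replace the lambda with a call to that library's matrix multiply; taking the best of several runs reduces noise from whatever else the machine is doing.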