[Numpy-discussion] Numpy x Matlab: some synthetic benchmarks

Wed Jan 18 03:45:02 EST 2006

Hello,

Travis asked me to benchmark numpy versus Matlab on some basic linear
algebra operations. Here are the results for matrices/vectors of
dimensions 5, 50 and 500:

Operation     x'*y    x*y'    A*x     A*B     A'*x    Half    2in2

Dimension 5
Array         0.94    0.7     0.22    0.28    1.12    0.98    1.1
Matrix        7.06    1.57    0.66    0.79    1.6     3.11    4.56
Matlab        1.88    0.44    0.41    0.35    0.37    1.2     0.98

Dimension 50
Array         9.74    3.09    0.56    18.12   13.93   4.2     4.33
Matrix        81.99   3.81    1.04    19.13   14.58   6.3     7.88
Matlab        16.98   1.94    1.07    17.86   0.73    1.57    1.77

Dimension 500
Array         1.2     8.97    2.03    166.59  20.34   3.99    4.31
Matrix        17.95   9.09    2.07    166.62  20.67   4.11    4.45
Matlab        2.09    6.07    2.17    169.45  2.1     2.56    3.06

Obs: The operation Half is actually A*x using only the lower half of the
matrix and vector. The operation 2in2 is A*x using only the even
indexes.
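
For readers who want to see what the columns correspond to, here is a
minimal sketch of the seven operations in numpy. It is not the original
benchmark script: it uses the current numpy API (np.random.default_rng,
np.outer), and the slices chosen for Half and 2in2 are my assumption about
what "lower half" and "even indexes" mean (Matlab indexes from 1, so its
even indexes 2, 4, ... become 1, 3, ... here).

```python
import numpy as np

n = 5  # one of the benchmarked dimensions (5, 50, 500)
rng = np.random.default_rng(0)
A = rng.random((n, n))
B = rng.random((n, n))
x = rng.random(n)
y = rng.random(n)

inner = np.dot(x, y)      # x'*y : inner product, a scalar
outer = np.outer(x, y)    # x*y' : outer product, an n-by-n matrix
Ax = np.dot(A, x)         # A*x  : matrix times vector
AB = np.dot(A, B)         # A*B  : matrix times matrix
Atx = np.dot(A.T, x)      # A'*x : transposed matrix times vector

# Half: A*x restricted to the second half of the indexes (assumed reading).
h = n // 2
half = np.dot(A[h:, h:], x[h:])

# 2in2: A*x restricted to every second index (Matlab's even indexes, assumed).
two_in_two = np.dot(A[1::2, 1::2], x[1::2])
```

Both Half and 2in2 operate on slices, so they exercise numpy's handling of
subarrays rather than of freshly allocated contiguous arrays.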

Of course there are many repetitions of the same operation: 100000 for
dim 5 and 50, and 1000 for dim 500. For the inner product, the number of
repetitions is further multiplied by the dimension (it is very fast).
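
The repetition scheme above can be sketched with the standard timeit
module. This is only an illustration, not the script used for the table;
the helper names repetitions and time_op are hypothetical, and the demo at
the end uses a much smaller repetition count so that it runs quickly.

```python
import timeit

import numpy as np


def repetitions(dim, inner_product=False):
    """Repetition count described in the post (hypothetical helper)."""
    base = {5: 100000, 50: 100000, 500: 1000}[dim]
    # The inner product is very fast, so its count is scaled by the dimension.
    return base * dim if inner_product else base


def time_op(op, reps):
    """Total wall-clock seconds for calling op() reps times."""
    return timeit.timeit(op, number=reps)


# Small demo: time A*x for dimension 5 with a reduced repetition count.
rng = np.random.default_rng(0)
A = rng.random((5, 5))
x = rng.random(5)
elapsed = time_op(lambda: np.dot(A, x), 1000)
print("A*x, dim 5, 1000 repetitions: %.4f s" % elapsed)
```

To follow the scheme in the post, one would pass repetitions(dim) (or
repetitions(dim, inner_product=True) for x'*y) as the reps argument.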

The software is

numpy svn version 1926
Matlab 6.5.0.180913a Release 13 (Jun 18 2002)

Both programs use the *same* BLAS and LAPACK (ATLAS for SSE).

As you can see, numpy array looks very competitive. The matrix class in
numpy has too much overhead for small dimensions, though. This overhead
is very small for medium-sized arrays. Looking at the results above
(especially the small-dimension ones; for higher dimensions the main
computations are performed by the same BLAS), I believe we can say:

1) Numpy array is faster on the usual operations, except for the outer
product (I believe the reason is that the dot function uses regular
matrix multiplication to compute outer products, instead of using a
special function. This can "easily" be changed). In particular, numpy was
faster in matrix-times-vector operations, which are the most common in
numerical linear algebra.

2) Any operation that involves a transpose suffers a very big penalty in
numpy. Compare A'*x and A*x: at dimension 500 it is about 10 times slower
(20.34 versus 2.03). In contrast, Matlab deals with transposes quite
well. Travis is already aware of this and it can probably be solved.

3) When using subarrays, numpy is somewhat slower. The difference seems
acceptable. Travis, can this be improved?
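
To make points 1) and 2) concrete, here is a small sketch written against
the current numpy API, not svn 1926. It computes an outer product both as a
general matrix product and with the dedicated np.outer, and it shows that a
transpose is a strided view on the same memory rather than a copy. It does
not measure anything and is not meant to explain where the penalty in the
table comes from; that remains my guess above.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
A = rng.random((n, n))
x = rng.random(n)
y = rng.random(n)

# Point 1: outer product as a regular matrix product of an (n, 1) column
# with a (1, n) row, versus the dedicated outer-product function.
outer_via_dot = np.dot(x[:, np.newaxis], y[np.newaxis, :])
outer_direct = np.outer(x, y)
same_outer = np.allclose(outer_via_dot, outer_direct)

# Point 2: A.T shares memory with A and is no longer C-contiguous, so any
# routine that needs row-major data has to cope with the different strides.
At = A.T
shares = np.shares_memory(A, At)
c_contiguous = At.flags["C_CONTIGUOUS"]

# The product itself is the same whether or not the transpose is copied.
same_product = np.allclose(np.dot(At, x), np.dot(np.ascontiguousarray(At), x))
```

Timing np.dot(At, x) against np.dot(A, x) with a harness of your choice is
a direct way to check whether the transpose penalty is present in a given
numpy build.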

Best,

Paulo

Obs: Later on (in a couple of days) I may present less synthetic
benchmarks (a QR factorization and a Modified Cholesky).