[MATRIX-SIG] random number generator?

Geoffrey Furnish furnish@acl.lanl.gov
Wed, 29 Oct 1997 12:27:43 -0700 (MST)


David J. C. Beach writes:
 > Geoffrey Furnish wrote:
 > 
 > > Speed is the most important counter argument, imo, and one which ought
 > > to be of considerable concern to NumPy customers.  I have previously
 > > done timing tests between loops containing expressions like:
 > >
 > >         a[i] = a[i] + x[i];
 > >
 > > versus
 > >
 > >         a[i] += x[i];
 > >
 > > in C and C++.  The += form was as much as 20% faster on some tests on
 > > some architectures.
 > >
 > > Frankly, for a language littered with "self." on nearly every line,
 > > it is very hard for me to buy that "a = a + 1" is syntactically more
 > > beautiful.  It certainly is slower.
 > 
 > There's nothing inherently slower about "a = a + 1" than there is
 > about "a += 1".  The difference is in how the compiler interprets
 > them.  Now, any good optimizing C compiler should transform "a = a
 > + 1" into "a += 1" automatically.  (Were you using optimizations?)

Yes there is a difference.  "a = a + 1" evaluates a twice; "a += 1"
does it only once.  That's why it's faster, and yes, my tests were with
maximal optimization.  The fact that this result is a little
surprising is THE reason I shared it with you.  I'm not confused--I
have already done this test, and am aware of the surprising fact of
the matter.  I brought it up in this forum because one could presume
that IF NumPy had a += operator at the Python language level, its
underlying C implementation would use the same operator, and hence
would stand to run faster.  (Not just the use of the operator itself
here, but also the issue of temporary objects that Mike alluded to.)
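
To make the distinction concrete, here is a rough sketch (not my
actual timing code; the strided indexing is only there to make the
cost of re-evaluating the lvalue expression visible) of the two loop
bodies.  Whether an optimizer folds the duplicated subscript in the
first form is exactly the question at hand.

    // Sketch only: in the "=" form the subscript expression
    // a[i*stride] is written twice, once for the load and once for
    // the store; in the "+=" form it appears, and is evaluated, once.
    void update_assign(double *a, const double *x, int n, int stride)
    {
        for (int i = 0; i < n; ++i)
            a[i * stride] = a[i * stride] + x[i];
    }

    void update_plus_equals(double *a, const double *x, int n, int stride)
    {
        for (int i = 0; i < n; ++i)
            a[i * stride] += x[i];
    }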

Now whether it is fair for an optimizing compiler to reinterpret
things in the way you suggest is a topic that is out of my league.
What I know of this issue is that there are a variety of fairly
surprising limitations that compiler writers must operate under.  For
example, in C++, you cannot assume that a != b <==> !(a == b).
Evidently there are efforts on the part of certain influential
individuals in the community to lobby for certain "common sense"
semantic interpretations that would make life easier for compiler
optimizers, but that remains unresolved at this time.
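
A contrived sketch of my own (not something you would ever write on
purpose) shows why the compiler has no license to rewrite one operator
in terms of the other: the two are independently overloadable.

    #include <iostream>

    // operator== and operator!= are separate functions in C++, so
    // !(a == b) and (a != b) need not agree.
    struct Odd {
        int v;
        bool operator==(const Odd &o) const { return v == o.v; }
        bool operator!=(const Odd &o) const { return false; }  // deliberately inconsistent
    };

    int main()
    {
        Odd a = {1}, b = {2};
        std::cout << !(a == b) << " vs " << (a != b) << "\n";  // prints "1 vs 0"
        return 0;
    }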

Could Python do it?  Sure.  There is no standards body to bicker
with...

 > Perhaps the Python byte code compiler would only look up the lvalue
 > for a (or a[i]) once.  This would make "a = a + 1" just as fast as
 > "a += 1".  The point here is that you're confusing a language
 > difference with a performance difference.  The language is what you
 > type, but the performance depends on how the compiler/interpreter
 > transforms that language into machine instructions.

I'm not actually as confused as you think.  :-).

BTW, besides the operator evaluation semantics, there is also the
(possibly larger?) issue of temporary proliferation on binary
operators.  This has prompted very interesting research in the C++
community into techniques for doing array/matrix math without
producing temporaries from binary operators, with speed-ups as large
as 10x for some applications.  If you're interested, you could
inspect the literature on "expression templates".

 > Come to think of it, I'm pretty sure that FORTRAN users (not that I
 > like the language) don't have either a += or a ++ operator, and I'd
 > be pretty willing to bet that your a[i] = a[i] + x[i] test on a
 > good optimizing FORTRAN compiler would outperform the C version of
 > a[i] += x[i]. 

Think apples to apples.  Fortran optimizers operate under vastly
different semantic requirements than C or C++ do.
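
One well-known instance of those differing requirements, sketched
below from memory rather than from my timing code: Fortran forbids its
array arguments from aliasing one another, while a C or C++ compiler
has to allow for the possibility that the pointers overlap, which
constrains how aggressively it can reorder the loop unless it can
prove or check otherwise.

    // The compiler must allow for a and x overlapping in memory, so it
    // cannot simply assume the iterations are independent, the way a
    // Fortran compiler may for the equivalent subroutine.
    void accumulate(double *a, const double *x, int n)
    {
        for (int i = 0; i < n; ++i)
            a[i] += x[i];
    }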

 > You might give it a try.

Thanks.  

 > As for the littered with "self" argument, I'm assuming that you're
 > complaining because you're needing to type "self.a[i] = self.a[i] +
 > 1" instead of "self.a[i] += 1".  Well, I rather like the "self"
 > prefix for object attributes because it makes it crystal clear that
 > you're setting an attribute of the object instead of some other
 > variable.  I find that it's easier to read other people's code when
 > this self "littered" style is employed.  And I could be wrong, but
 > I doubt I'm the only one.

I am aware that there is a spectrum of opinions, and that many, maybe
even most people who use Python, prefer "self.".  I also know I am not
alone in finding it really annoying, and I have met people who cite it
as a principal reason for not wanting to use Python.  Oh well.  I just
grin and bear it.  But I don't personally buy the argument that we
should keep fancy operators out of Python because it keeps the
language looking pretty, or because the hoomenfligutz operator in C++
was error prone.  I have never been "bitten" by ++, for instance, and
cannot for the life of me even imagine what on earth these comments
could be referring to.  If someone would like to enlighten me about
the perils of ++ and -- in private email, I would be interested in
hearing it.

 > And I think there's room for some fair comparison here.  How do you
 > like the complete lack of standards for container classes in
 > C/C++?  Sometimes people use Rogue Wave's classes, sometimes they
 > use STL, sometimes they write their own, sometimes they use their
 > "in-house" class library.  But they're all different: different
 > ways to get elements, slices, append, insert, etc.  I'll grant you
 > in an instant that C++, as a machine-compiled language, runs faster
 > than Python byte-compiled code, but it sure seems to lack any
 > consistency.  In C++, templates are generally a mess, there are six
 > different kinds of inheritance (public virtual, protected virtual,
 > private virtual, public, protected, and private), compilation tends
 > to be a slow process, and you get all the advantages (AND
 > DISADVANTAGES) of a strongly typed language.  (In Python, I was able
 > to use the same runge-kutta4 function for a single equation system
 > and a multiple equation system, just by virtue of whether I passed
 > it floats or matrices.  You could get that same behavior from C++,
 > but you'd have to go well out of your way to spoof dynamic typing,
 > or simply write multiple versions of runge-kutta4.)

This is by now way, way off topic for this email list.  I will respond
to this portion of the thread one time only.

C: There are no container classes, period.  Or any other kind of
	classes.  I try not to use C if I can help it.
C++: I do not agree with your assessment of the market realities.
	Every project I've been associated with over the last 18
	months has been fully committed to using the STL.  Your
	remark has merit from a historical standpoint, but not much
	relevance from the standpoint of a current assessment of the
	state of C++.
Saying that "[C++] sure seems to lack any consistency" is patently
absurd.  If you would like to be made aware of some of the modernities
of C++, there is a great article, "Foundations for a Native C++ Style",
by Koenig and Stroustrup in a recent issue of Software: Practice and
Experience.
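
As a small illustration of the kind of consistency the STL brought (my
own sketch, not drawn from that article): the standard containers
share one iterator interface, so the same algorithm works on all of
them.

    #include <list>
    #include <numeric>
    #include <vector>

    // The same begin()/end() idiom and the same algorithm apply to
    // either container; that uniformity is the point of the STL.
    int main()
    {
        std::vector<int> v(5, 2);
        std::list<int>   l(5, 2);

        int sv = std::accumulate(v.begin(), v.end(), 0);
        int sl = std::accumulate(l.begin(), l.end(), 0);

        return (sv == sl) ? 0 : 1;   // both sums are 10, so this returns 0
    }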

The rest of your comments above are sufficiently undirected that I
can't formulate a good response.  I'm glad you have an rk4 in Python
that pleases you.  My own view is that there is room for both compiled
and interpreted languages in the solution space for large systems
programming projects, and that is precisely why I chair the Python C++
sig.

 > I'm pretty willing to bet that the Python language as a whole more
 > than stands up to C or C++.

I certainly do not view Python as a credible threat to C++.  Nor do I
view C++ as a credible threat to Python.  That's why I use them
together. 

-- 
Geoffrey Furnish                    email: furnish@lanl.gov
LANL XTM Radiation Transport/POOMA  phone: 505-665-4529     fax: 505-665-7880

"Software complexity is an artifact of implementation."    -Dan Quinlan

_______________
MATRIX-SIG  - SIG on Matrix Math for Python

send messages to: matrix-sig@python.org
administrivia to: matrix-sig-request@python.org
_______________