Which happens first?

Sun Apr 8 17:29:40 EDT 2001

At 00:52 08/04/01 -0400, Tim Peters wrote:
>Because it does no optimizations, there's no supporting machinery for *doing*
>optimizations either, so "the first one" it tries is going to be more of a
>pain than you imagine.  In particular, negative integer literals don't exist
>in Python's grammar (only non-negative integer literals exist), and Python
>compiles straight from parse tree to bytecode.  So even that trivial little
>optimization would require a (non-existent) peephole optimizer doing pattern
>matching on the parse tree or bytecode stream.

Just curious, but is it so hard to change the grammar to detect negative 
integer literals? I know it involves a little bit of lookahead work, but 
there are well-known techniques to handle it. On the other hand, I can see 
that this is not a big issue the vast majority of the time (it would save 
only a few bytes of bytecode, and almost nothing in run time...)
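
Tim's point about the grammar can be checked from Python itself. The sketch 
below assumes a Python version that ships the standard ast module; it only 
shows how the source text -1 is parsed, not what the compiler does with it 
afterwards.

```python
import ast

# Parse the expression "-1" and look at the node the parser produced.
tree = ast.parse("-1", mode="eval")
node = tree.body

# The minus sign shows up as a unary operator applied to a literal,
# not as part of the literal itself.
is_unary_minus = isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub)

print(ast.dump(node))
print(is_unary_minus)
```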

> > 5) Other strange thing: at least in this example, the compiler did
> > not made any optimization on constant expressions.
>
>Right, and it never does.  This isn't C, and Python assignments aren't
>macros, so doing stuff like
>
>TOTAL_BYTES = BYTES_PER_SECTOR * SECTORS_PER_TRACK * NUM_TRACKS
>
>at module level is a one-time module import cost, no matter how often
>TOTAL_BYTES is dynamically referenced later.  But, yes, doing that inside a
>loop is a Possibly Poor Idea.

Ok. So let's change my conclusion as follows: "Avoid any calculation on 
literals inside loops in Python, as the compiler will not optimize 
anything. Explicitly move everything you can out of the loop. This kind 
of code is optimized in some other languages/compilers, such as C or 
Pascal, but not in Python."
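
A minimal sketch of that advice follows. The function names and the 
geometry values are hypothetical; only the constant names come from Tim's 
example. Whether a given interpreter version folds literal-only expressions 
may vary, so the sketch uses named values.

```python
# Hypothetical disk geometry values, named as in the TOTAL_BYTES example.
BYTES_PER_SECTOR = 512
SECTORS_PER_TRACK = 18
NUM_TRACKS = 80

def offsets_recomputed(n):
    # The product is re-evaluated on every iteration.
    result = []
    for i in range(n):
        total = BYTES_PER_SECTOR * SECTORS_PER_TRACK * NUM_TRACKS
        result.append(i % total)
    return result

def offsets_hoisted(n):
    # The product is evaluated once, before the loop starts.
    total = BYTES_PER_SECTOR * SECTORS_PER_TRACK * NUM_TRACKS
    result = []
    for i in range(n):
        result.append(i % total)
    return result
```

Both functions return the same list; the second simply does less work per 
iteration.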

As always, it's good to make this point; people come to Python from very 
different backgrounds, and, for some of us, this may be a problem. In my 
case, I tend to use lots of small Python scripts to parse log files, some 
of them in binary format. So I sometimes end up coding stuff like this to 
quickly get some information from inside complex records or structures. 
There are also some alternatives, such as the struct and re modules, where 
the coding style does not suffer from this problem.
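
As a rough illustration of the struct alternative, the sketch below unpacks 
a made-up record of three little-endian DWORDs. The record layout and the 
byte order are assumptions for the example, not the format of any real log 
file.

```python
import struct

# Hypothetical record: three little-endian 32-bit DWORDs.
record = struct.pack("<3I", 0x01020304, 0x05060708, 0x11223344)

# Unpack all three DWORDs at once instead of computing byte offsets by hand.
first, second, third = struct.unpack_from("<3I", record, 0)

# Second byte (in memory order) of the third DWORD.
second_byte = (third >> 8) & 0xFF
print(hex(second_byte))
```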

> >  >>> p = a[(3*4+2)-1]   # get the second byte of the third DWORD
>
>I'll take your word for it that you find that easy to understand <wink>.

Ok, it was not a good example. Using pseudo-literals (as in your TOTAL_BYTES 
example) is a better technique, especially in Python, where everything is 
referenced as an object, even the literals.
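
One way the indexing example might look with named values is sketched 
below. The helper byte_offset, the constant names, and the zero-based 
counting are all assumptions of the sketch; the original expression does 
not spell out its counting convention.

```python
DWORD_SIZE = 4  # bytes per DWORD

def byte_offset(dword_index, byte_index):
    # Both indices are zero-based (an assumption of this sketch).
    return dword_index * DWORD_SIZE + byte_index

# Hypothetical 16-byte buffer whose bytes equal their own offsets.
a = bytes(range(16))

# Second byte of the third DWORD: DWORD index 2, byte index 1.
SECOND_BYTE_OF_THIRD_DWORD = byte_offset(2, 1)
p = a[SECOND_BYTE_OF_THIRD_DWORD]
```

Computing the offset once and giving it a name also keeps the arithmetic 
out of any loop that reads many records.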

> > 2) If your expression involves value of mixed types, check
> > *carefully* the sequence of the calculation.
>
>That's crucial advice in any language.

Another good point. Maybe it should go in a reference document somewhere in 
the documentation. The most obvious things are the easiest to overlook.
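
A small sketch of why the order matters with mixed types. It uses the // 
operator to make the integer division explicit, so it should behave the 
same regardless of how plain / treats two ints in a given Python version; 
the variable names are made up for the example.

```python
total = 7   # int
count = 2   # int

# Integer division happens first, so the fraction is lost before the
# float ever enters the calculation.
floor_then_float = total // count * 1.0

# Converting to float first keeps the fraction.
float_then_divide = total * 1.0 / count

print(floor_then_float, float_then_divide)
```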

Carlos Ribeiro