PyWart: The problem with "print"

Sun Jun 2 20:16:21 EDT 2013

On Sun, Jun 2, 2013 at 1:20 PM, Chris Angelico <rosuav at gmail.com> wrote:

>
> Hmm. Could be costly. Hey, you know, Python has something for testing that.
>
> >>> timeit.timeit('debugprint("asdf")','def debugprint(*args):\n\tif not
> DEBUG: return\n\tprint(*args)\nDEBUG=False',number=1000000)
> 0.5838018519113444
>
> That's roughly half a second for a million calls to debugprint().
> That's a 580ns cost per call. Rather than fiddle with the language,
> I'd rather just take this cost. Oh, and there's another way, too: If
> you make the DEBUG flag have effect only on startup, you could write
> your module thus:
>

This is a slightly contrived demonstration... The time lost in a function
call is not just the time it takes to execute that function.  If it
consistently increases the frequency of cache misses then the cost is much
greater -- possibly by orders of magnitude if the application is truly
bound by the bandwidth of the memory bus and the CPU pipeline is almost
always saturated.

I'm actually with RR in terms of eliminating the overhead involved with
'dead' function calls, since there are instances when optimizing in Python
is desirable.  I actually recently adjusted one of my own scripts to
eliminate branching and improve data layout to achieve a 1000-fold
improvement in efficiency (~45 minutes to 0.42 s. for one example) --- all
in pure Python.  The first approach was unacceptable, the second is fine.
 For comparison, if I add a 'deactivated' debugprint call into the inner
loop (executed 243K times in this particular test), then the time of the
double-loop step that I optimized takes 0.73 seconds (nearly doubling the
duration of the whole step).  The whole program is too large to post here,
but the relevant code portion is shown below:

         i = 0
         begin = time.time()
         for mol in owner:
            for atm in mol:
               blankfunc("Print i %d" % i)
               new_atoms[i] = self.atom_list[atm]
               i += 1
         self.atom_list = new_atoms
         print "Done in %f seconds." % (time.time() - begin)

from another module:

DEBUG = False

[snip]

def blankfunc(instring):
   if DEBUG:
      print instring

Also, you're often not passing a constant literal to the debug print --
you're doing some type of string formatting or substitution if you're
really inspecting the value of a particular variable, and this also takes
time.  In the test I gave the timings for above, I passed a string the
counter substituted to the 'dead' debug function.  Copy-and-pasting your
timeit experiment on my machine yields different timings (Python 2.7):

>>> import sys
>>> timeit.timeit('debugprint("asdf")','def debugprint(*args):\n\tif not
DEBUG: return\n\tsys.stdout.write(*args)\nDEBUG=False',number=1000000)
0.15644001960754395

which is ~150 ns/function call, versus ~1300 ns/function call.  And there
may be even more extreme examples, this is just one I was able to cook up
quickly.

This is, I'm sad to say, where my alignment with RR ends.  While I use
prints in debugging all the time, it can also become a 'crutch', just like
reliance on valgrind or gdb.  If you don't believe me, you've never hit a
bug that 'magically' disappears when you add a debugging print statement
;-).

The easiest way to eliminate these 'dead' calls is to simply comment-out
the print call, but I would be quite upset if the interpreter tried to
outsmart me and do it automagically as RR seems to be suggesting.  And if
you're actually debugging, then you typically only add a few targeted print
statements -- not too hard to comment-out.  If it is, and you're really
that lazy, then by all means add a new, greppable function call and use a
sed command to comment those lines out for you.

BTW: *you* in the above message refers to a generic person -- none of my
comments were aimed at anybody in particular

All the best,
Jason

P.S. All that said, I would agree with ChrisA's suggestion that the
overhead is negligible is most cases...
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-list/attachments/20130602/14cf46dc/attachment.html>