SQL/Python question -- slow... What is the fixed cost? -- ANSWER SUMMARY

Leonardo B Lopes leo at iems.nwu.edu
Fri Oct 5 16:24:11 EDT 2001


Dear Friends,

	Thanks for your help with this problem. Here is a summary of this
thread:

	Question 1 of 2: What is the lower bound on cpu time to get a query
back from mysql using python and MySQLdb under Linux on a P3-800?

	Answer: ~ .0004 cpusec
	Relevant part of Message (Leo Lopes):

      1 #!/usr/bin/env python
      2 
      3 import MySQLdb
      4 import profile
      5 
      6 db = MySQLdb.connect(passwd='Iwanit2')
      7 c = db.cursor()
      8 
      9 def test():
     10     for i in range(0,10000):
***  11         c.execute('select * from mematomic where id=3')
     12         x = c.fetchall()
     13 
     14 c.execute('use atom')
***  15 c.execute(' CREATE TABLE mematomic (id MEDIUMINT PRIMARY KEY) \
***  16             TYPE = HEAP')
***  17 c.execute(' INSERT INTO mematomic (id) VALUES (3)')
     18 profile.run('test()')
     19 

and the new results:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    5.920    5.920 <string>:1(?)
    10000    1.440    0.000    2.910    0.000 cursors.py:110(__do_query)
    10000    0.200    0.000    0.200    0.000 cursors.py:137(_fetch_row)
    10000    0.520    0.000    0.520    0.000
cursors.py:147(_check_for_warnings)
    10000    0.600    0.000    0.760    0.000
cursors.py:162(_get_result)
    10000    0.650    0.000    3.760    0.000 cursors.py:169(_query)
    10000    0.380    0.000    0.500    0.000 cursors.py:193(fetchall)
    10000    0.120    0.000    0.120    0.000
cursors.py:36(_check_executed)
    30000    0.560    0.000    0.560    0.000 cursors.py:46(_get_db)
    10000    1.000    0.000    4.970    0.000 cursors.py:51(execute)
        0    0.000             0.000          profile:0(profiler)
        1    0.010    0.010    5.930    5.930 profile:0(test())
        1    0.450    0.450    5.920    5.920 testdb.py:9(test)

	Question 2 of 2: What does the python profiler measure? Elapsed time or
cpu time used by the process?
	Answer: Platform Dependant
	Relevant part of Message (Skip Montanaro):

    Leonardo> Thanks for the tip. I actually have used the explain
    Leonardo> statement. But are you sure about the profiler ignoring
the
    Leonardo> time taken by the query? Why would it be excluded from the
    Leonardo> computation time?

If the profiler is measuring cpu seconds, it will be measuring the
computation time of the Python process.  The computation time of other
processes involved in the computation (possibly on other machines) just
isn't available to it.  If it's measuring elapsed (wall clock) time,
then
sure, it gives you some measure of the time it takes for the entire
computation to complete.  However, that would include networking delays
as
well as delays caused by either or both the Python program and the MySQL
database server not actually having the CPU.

    Leonardo> How would the profiler even be able to distinguish between
    Leonardo> time waiting in a function for some external process and
time
    Leonardo> spent in the python interpreter itself?

The profiler uses time.clock() if it's available, which is generally
true on
most Unix-like systems.  Again, on Unix-like systems, time.clock()
returns
the CPU time of the current process, not the elapsed time.  From the
clock()
man page:

    DESCRIPTION
           The clock() function returns an approximation of processor
           time used by the program.

    Leonardo> I checked the manual for the profiler section. It says
that
    Leonardo> the profiler is activated by callbacks when functions are
    Leonardo> called and when they return, so if a function is blocked
    Leonardo> waiting for mysql, that time should be computed.

It all depends.  Take a look at the code in the Profile class's __init__
method:

    if not timer:
        if os.name == 'mac':
            self.timer = MacOS.GetTicks
            self.dispatcher = self.trace_dispatch_mac
            self.get_time = _get_time_mac
        elif hasattr(time, 'clock'):
            self.timer = self.get_time = time.clock
            self.dispatcher = self.trace_dispatch_i
        elif hasattr(os, 'times'):
            self.timer = os.times
            self.dispatcher = self.trace_dispatch
            self.get_time = _get_time_times
        else:
            self.timer = self.get_time = time.time
            self.dispatcher = self.trace_dispatch_i

If you're running on Windows (I think you said something about Linux),
then
time.clock is either unavailable or is, but doesn't record CPU time.  On
Linux systems it does (though to varying degrees of (in)accuracy).



-- 
=======================================================================
Leonardo B. Lopes                                      leo at iems.nwu.edu 
Ph.D. Student                                             (847)491-8470
IEMS - Northwestern University             http://www.iems.nwu.edu/~leo



More information about the Python-list mailing list