profiler -- numeric overflow?

kmself at ix.netcom.com
Mon Jun 26 23:30:57 EDT 2000


I'm new to Python.

I'm maintaining, extending, and optimizing an inherited weblog analyzer.
It runs in about an hour on a 420,000 line logfile, which strikes me as
longer than necessary, so I ran the code through the Python profiler.
The results are interesting, but several of the reported times are
negative, which looks like numeric overflow (output copied inline below).
I'm assuming an internal representation in units of 1/1000 second, which
would overflow for large values -- I'm assuming 2^20 is the maximum.
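
To make the overflow reasoning concrete, here is a minimal sketch of how a
wrapping tick counter produces negative times.  It rests on a different and
equally unverified assumption from the one above: that the timer is a signed
32-bit counter of microsecond ticks (as clock() is on many 32-bit systems),
which would wrap after 2^31 microseconds, roughly 2147 seconds.  The function
names are illustrative only and are not part of profile.py.

```python
# Assumption (not confirmed by the post): the timer is a signed 32-bit
# counter of microsecond ticks.
TICKS_PER_SECOND = 1000000
WRAP = 2 ** 32


def to_int32(ticks):
    """Reduce an unbounded tick count to what a signed 32-bit counter holds."""
    ticks = ticks % WRAP
    if ticks >= 2 ** 31:
        ticks -= WRAP
    return ticks


def reported_seconds(true_seconds):
    """Seconds a wrapped counter would report for a given true duration."""
    ticks = int(round(true_seconds * TICKS_PER_SECOND))
    return to_int32(ticks) / float(TICKS_PER_SECOND)


def unwrap_once(reported):
    """Undo a single wrap; only meaningful if exactly one wrap occurred."""
    return reported + WRAP / float(TICKS_PER_SECOND)


print(reported_seconds(2000.0))           # still below the wrap point
print(reported_seconds(3000.0))           # past 2**31 microseconds: negative
print(round(unwrap_once(-2605.747), 3))   # the reported total, if one wrap
```

If the assumption holds and the counter wrapped exactly once, adding
2^32 / 10^6 seconds back to a negative figure recovers the true value; if it
wrapped more than once, the figure cannot be recovered from the output alone.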

I'm not fully up on all my Python functions and internals, but the
intensive calls appear to be:
    - <string> (1 call, -2605.767 cumtime)
    - __call__ (line 76, 4,135,458 calls, -4029.837 tottime)
    - _regularize (line 131, 362,448 calls, 166.910 cumtime; an internal
      function)


From this, does anyone have suggestions on how to modify the profiler so
that it doesn't overflow?  I've located what appears to be the source
under /usr/lib/python1.5/profile.py (Debian GNU/Linux).  Would it be
possible to change the time values to longs, and would this negatively
impact performance?
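
One alternative to editing profile.py is to hand the profiler a timer that
does not wrap.  The sketch below is written for a current Python 3, where
profile.Profile accepts a timer function as its first argument; whether the
python1.5 module accepts the same argument is not verified here.  It uses
wall-clock time (time.time), so the figures include time spent waiting, not
just CPU time.  The work() function is a hypothetical stand-in for the
analyzer's main().

```python
import io
import profile
import pstats
import time


def work(n):
    """Hypothetical stand-in for the real workload."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_with_wall_clock(func, *args):
    """Profile func(*args) with time.time as the timer; return (result, report)."""
    profiler = profile.Profile(time.time)
    result = profiler.runcall(func, *args)
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    stats.sort_stats("cumulative").print_stats(5)
    return result, buf.getvalue()


result, report = profile_with_wall_clock(work, 100000)
print(report)
```

Sorting by cumulative time rather than by standard name also puts the most
expensive call chains at the top of the listing.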

...yes, I've also considered running a smaller sample of the log, but I
suspect that runtime may grow nonlinearly, so large input samples are
useful.
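
The nonlinearity suspicion can be checked cheaply by timing the same code on
samples of increasing size and comparing the cost per line.  This is only a
sketch: count_hits() and the synthetic fake_log are hypothetical stand-ins
for the real analyzer and logfile.

```python
import time


def time_samples(process, lines, sizes):
    """Run process() on the first n lines for each n; return (n, seconds) pairs."""
    timings = []
    for n in sizes:
        sample = lines[:n]
        start = time.perf_counter()
        process(sample)
        elapsed = time.perf_counter() - start
        timings.append((len(sample), elapsed))
    return timings


def count_hits(lines):
    """Hypothetical stand-in for the analyzer: count lines per first field."""
    counts = {}
    for line in lines:
        key = line.split(" ", 1)[0]
        counts[key] = counts.get(key, 0) + 1
    return counts


# Synthetic input, for illustration only.
fake_log = ["10.0.0.%d - - GET /index.html" % (i % 50) for i in range(20000)]

for n, secs in time_samples(count_hits, fake_log, [5000, 10000, 20000]):
    print("%6d lines  %.4f s  %.2f us/line" % (n, secs, 1e6 * secs / n))
```

A roughly constant per-line cost across sizes suggests linear growth, in
which case a sample short enough to stay under any timer limit would be
representative; a rising per-line cost would support profiling the full log.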


13166270 function calls (12889814 primitive calls) in -2605.747 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000 -2605.767 -2605.767 <string>:1(?)
        1    0.000    0.000    0.000    0.000 getopt.py:12(?)
        1    0.000    0.000    0.000    0.000 getopt.py:20(getopt)
        2    0.000    0.000    0.000    0.000 getopt.py:64(do_longs)
        2    0.000    0.000    0.000    0.000 getopt.py:85(long_has_args)
   362448   50.870    0.000  166.910    0.000 hitcount3.py:131(_regularize)
        5    0.000    0.000    0.000    0.000 hitcount3.py:141(__init__)
   190108    9.360    0.000    9.360    0.000 hitcount3.py:153(increment_dict)
        3    0.020    0.007    0.190    0.063 hitcount3.py:158(__repr__)
      321    0.050    0.000    0.050    0.000 hitcount3.py:169(<lambda>)
        1    0.000    0.000    0.000    0.000 hitcount3.py:177(__init__)
   384665   71.950    0.000  145.580    0.000 hitcount3.py:180(process_line)
        1    0.000    0.000    0.000    0.000 hitcount3.py:192(__init__)
   384665   69.810    0.000  143.600    0.000 hitcount3.py:195(process_line)
        1    0.000    0.000    0.000    0.000 hitcount3.py:205(__init__)
   384665   68.060    0.000  371.640    0.001 hitcount3.py:209(process_line)
        1    0.000    0.000    0.000    0.000 hitcount3.py:216(__repr__)
        1    0.000    0.000    0.000    0.000 hitcount3.py:221(__init__)
   384665   33.220    0.000  158.360    0.000 hitcount3.py:224(process_line)
        1    0.000    0.000    0.000    0.000 hitcount3.py:232(__init__)
   384665   33.650    0.000   54.130    0.000 hitcount3.py:235(process_line)
        1    0.000    0.000    0.010    0.010 hitcount3.py:241(__repr__)
      134    0.000    0.000    0.000    0.000 hitcount3.py:249(<lambda>)
       33    0.010    0.000    0.010    0.000 hitcount3.py:253(<lambda>)
        3    0.080    0.027    0.120    0.040 hitcount3.py:258(sort_by_value)
     1677    0.040    0.000    0.040    0.000 hitcount3.py:262(<lambda>)
        1    0.000    0.000    0.000    0.000 hitcount3.py:282(__init__)
   405299   35.170    0.000   65.460    0.000 hitcount3.py:284(__call__)
        1    0.000    0.000    0.000    0.000 hitcount3.py:289(__init__)
   424910   18.380    0.000   18.380    0.000 hitcount3.py:293(__call__)
        2    0.000    0.000    0.000    0.000 hitcount3.py:299(datestring_to_time)
        1  309.710  309.710 -2605.767 -2605.767 hitcount3.py:311(main)
   424910   71.730    0.000  265.550    0.001 hitcount3.py:62(__init__)
4135458/3859002 -4029.837   -0.001 -3649.137   -0.001 hitcount3.py:76(__call__)
  1172023  193.480    0.000  395.050    0.000 hitcount3.py:89(_compute_data)
        1    0.020    0.020 -2605.747 -2605.747 profile:0(main())
        0    0.000             0.000          profile:0(profiler)
  1192657  181.410    0.000  220.470    0.000 re.py:112(match)
  1057390   52.320    0.000   52.320    0.000 re.py:290(__init__)
   344808   41.110    0.000   41.110    0.000 re.py:335(group)
   424910   78.570    0.000   78.570    0.000 re.py:360(groupdict)
        1    0.000    0.000    0.000    0.000 re.py:76(compile)
        1    0.000    0.000    0.000    0.000 re.py:89(__init__)
  1105824  105.070    0.000  118.330    0.000 re.py:95(search)
        2    0.000    0.000    0.000    0.000 string.py:213(index)



-- 
Karsten M. Self <kmself at ix.netcom.com>         http://www.netcom.com/~kmself
  Evangelist, Opensales, Inc.                       http://www.opensales.org
   What part of "Gestalt" don't you understand?      Debian GNU/Linux rocks!
     http://gestalt-system.sourceforge.net/      K5: http://www.kuro5hin.org
GPG fingerprint: F932 8B25 5FDD 2528 D595  DC61 3847 889F 55F2 B9B0


