performance problem with time.strptime()

Nils Rüttershoff nils at ccsg.de
Thu Jul 2 09:00:11 EDT 2009


Hi Casey
Casey Webster wrote:
> On Jul 2, 7:30 am, Nils Rüttershoff <n... at ccsg.de> wrote:
>
>   
>> Rec = re.compile(r"^\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}\s-\s\d+\s\[(\d{2}/\w+/\d{4}:\d{2}:\d{2}:\d{2})\s\+\d{4}\].*")
>> Line = '1.2.3.4 - 4459 [02/Jul/2009:01:50:26 +0200] "GET /foo HTTP/1.0" 200 - "-" "www.example.org" "-" "-" "-"'
>>     
>
> I'm not sure how much it will help but if you are only using the regex
> to get the date/time group element, it might be faster to replace the
> regex with:
>
>   
>>>> date_string = Line.split()[3][1:]
>>>>         

Indeed, this gives a small speedup (approx. 3-4 sec over 1000000
iterations), but that is only a small piece of the cake. Thanks anyway :)
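
For reference, a minimal sketch of the two ways to pull the timestamp out
of the line, so they can be timed side by side. The function names
date_by_regex and date_by_split are made up for this illustration, and the
dots in the pattern are escaped here, which the pattern quoted above does
not do. The slice is [1:] because the fourth whitespace-separated field is
'[02/Jul/2009:01:50:26' and carries only a leading bracket.

```python
import re
import timeit

# Same pattern as quoted above, but with the dots escaped (an assumption
# about the intent; the quoted pattern uses bare dots).
REC = re.compile(
    r"^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\s-\s\d+\s"
    r"\[(\d{2}/\w+/\d{4}:\d{2}:\d{2}:\d{2})\s\+\d{4}\].*"
)
LINE = ('1.2.3.4 - 4459 [02/Jul/2009:01:50:26 +0200] "GET /foo HTTP/1.0" '
        '200 - "-" "www.example.org" "-" "-" "-"')


def date_by_regex(line):
    """Extract the timestamp via the compiled regex (hypothetical helper)."""
    return REC.match(line).group(1)


def date_by_split(line):
    """Extract the timestamp by splitting on whitespace (hypothetical helper)."""
    # Field 3 is '[02/Jul/2009:01:50:26'; drop only the leading '['.
    return line.split()[3][1:]


if __name__ == "__main__":
    # Rough timing only; absolute numbers depend on machine and Python version.
    for func in (date_by_regex, date_by_split):
        secs = timeit.timeit(lambda: func(LINE), number=100000)
        print("%-14s %.3f s" % (func.__name__, secs))
```

Both helpers should return the same string for this sample line; the split
variant does no validation, so it only fits logs with a fixed field layout.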

The problem is that time.strptime() consults locale.py on every
call. Here is the whole cProfile trace:

First with epoch, then with strptime (condensed):

         5000009 function calls in 33.084 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   33.084   33.084 <string>:1(<module>)
        1    2.417    2.417   33.084   33.084 <timeit-src>:2(inner)
  1000000    9.648    0.000   30.667    0.000 time_test.py:30(epoch)
        1    0.000    0.000   33.084   33.084 timeit.py:177(timeit)
  1000000    3.711    0.000    3.711    0.000 {built-in method groupdict}
  1000000    4.318    0.000    4.318    0.000 {built-in method match}
        1    0.000    0.000    0.000    0.000 {gc.disable}
        1    0.000    0.000    0.000    0.000 {gc.enable}
        1    0.000    0.000    0.000    0.000 {gc.isenabled}
  1000000    7.764    0.000    7.764    0.000 {map}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
  1000000    5.225    0.000    5.225    0.000 {time.mktime}
        2    0.000    0.000    0.000    0.000 {time.time}

################################################################

         29000009 function calls in 124.449 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000  124.449  124.449 <string>:1(<module>)
        1    2.244    2.244  124.449  124.449 <timeit-src>:2(inner)
  1000000    3.500    0.000   33.559    0.000 _strptime.py:27(_getlang)
  1000000   41.814    0.000  100.754    0.000 _strptime.py:295(_strptime)
  1000000    4.010    0.000  104.764    0.000 _strptime.py:453(_strptime_time)
  1000000   11.647    0.000   19.529    0.000 locale.py:316(normalize)
  1000000    3.638    0.000   23.167    0.000 locale.py:382(_parse_localename)
  1000000    5.120    0.000   30.059    0.000 locale.py:481(getlocale)
  1000000    7.242    0.000  122.205    0.000 time_test.py:37(strptime)
        1    0.000    0.000  124.449  124.449 timeit.py:177(timeit)
  1000000    1.771    0.000    1.771    0.000 {_locale.setlocale}
  1000000    1.735    0.000    1.735    0.000 {built-in method __enter__}
  1000000    1.626    0.000    1.626    0.000 {built-in method end}
  1000000    3.854    0.000    3.854    0.000 {built-in method groupdict}
  1000000    1.646    0.000    1.646    0.000 {built-in method group}
  2000000    8.409    0.000    8.409    0.000 {built-in method match}
        1    0.000    0.000    0.000    0.000 {gc.disable}
        1    0.000    0.000    0.000    0.000 {gc.enable}
        1    0.000    0.000    0.000    0.000 {gc.isenabled}
  2000000    2.942    0.000    2.942    0.000 {len}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
  3000000    4.552    0.000    4.552    0.000 {method 'get' of 'dict' objects}
  1000000    2.072    0.000    2.072    0.000 {method 'index' of 'list' objects}
  1000000    1.517    0.000    1.517    0.000 {method 'iterkeys' of 'dict' objects}
  2000000    3.113    0.000    3.113    0.000 {method 'lower' of 'str' objects}
  2000000    3.233    0.000    3.233    0.000 {method 'replace' of 'str' objects}
  2000000    2.953    0.000    2.953    0.000 {method 'toordinal' of 'datetime.date' objects}
  1000000    1.476    0.000    1.476    0.000 {method 'weekday' of 'datetime.date' objects}
  1000000    4.332    0.000  109.097    0.000 {time.strptime}
        2    0.000    0.000    0.000    0.000 {time.time}
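
The bodies of epoch() and strptime() in time_test.py are not included in
this mail, so the following is only a sketch of the general idea under my
own assumptions: split the timestamp by hand, look the month up in a table,
and hand the fields to time.mktime(), which avoids the per-call locale
lookup seen in the second trace. The names MONTHS, parse_manual and
parse_strptime are made up for illustration. The table assumes English
month abbreviations, and the "+0200" offset is ignored in both variants.

```python
import time
import timeit

# Assumption: the log always uses English three-letter month names.
MONTHS = {"Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
          "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12}


def parse_manual(stamp):
    """Turn 'DD/Mon/YYYY:HH:MM:SS' into epoch seconds, read as local time."""
    day, mon, rest = stamp.split("/")
    year, hour, minute, second = rest.split(":")
    # Weekday and day-of-year are ignored by mktime(); -1 lets the C
    # library decide whether DST applies.
    return time.mktime((int(year), MONTHS[mon], int(day),
                        int(hour), int(minute), int(second), 0, 0, -1))


def parse_strptime(stamp):
    """Same result via time.strptime(), for comparison."""
    return time.mktime(time.strptime(stamp, "%d/%b/%Y:%H:%M:%S"))


if __name__ == "__main__":
    stamp = "02/Jul/2009:01:50:26"
    # Rough timing only; absolute numbers depend on machine and Python version.
    for func in (parse_manual, parse_strptime):
        secs = timeit.timeit(lambda: func(stamp), number=100000)
        print("%-15s %.3f s" % (func.__name__, secs))
```

The manual variant gives up what strptime() provides for free: it does not
validate the input and it will not follow a non-English locale.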
