Big speed boost in os.walk in Python 2.5

looping kadeko at gmail.com
Fri Oct 13 05:27:26 EDT 2006


Hi,
I noticed a big speed improvement in some of my scripts that use os.walk,
and I wrote a small script to check it:
import os
for path, dirs, files in os.walk('D:\\FILES\\'):
    pass

Results on Windows XP after a few runs to fill the disk cache (with
~59000 files and ~3500 folders):
Python 2.4.3 : 45s
Python 2.5 : 10s
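
For anyone who wants to repeat the measurement, here is a minimal timing sketch. It assumes a current Python 3 interpreter (time.perf_counter did not exist in 2.4 or 2.5), and the helper name time_walk is made up for illustration. The first run is likely to include cold-cache disk reads, so later runs are the ones to compare.

```python
import os
import time

def time_walk(top, repeats=3):
    """Walk `top` `repeats` times and return the elapsed seconds of each run."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        for path, dirs, files in os.walk(top):
            pass  # same empty loop body as the test script
        timings.append(time.perf_counter() - start)
    return timings

if __name__ == "__main__":
    # 'D:\\FILES\\' is the directory from the test script; substitute your own.
    for i, seconds in enumerate(time_walk('D:\\FILES\\')):
        print("run %d: %.2fs" % (i + 1, seconds))
```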

Very nice, but somewhat strange...
Is os.walk in Python 2.4.3 buggy?
Are these results specific to Windows, or do *nix systems show the same
difference?
The profiler shows that most of the time is spent in ntpath.isdir (almost
entirely in the stat call it makes), and this function is *a lot* faster
in Python 2.5.
Maybe this improvement could be backported to the Python 2.4 branch for the
next release?
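
A profile of this kind can be collected with a sketch like the following. The output quoted in this message appears to come from the profile module run through execfile('test.py'); the version here swaps in cProfile and pstats instead, and the names walk_all and profile_walk are hypothetical helpers. On current Python versions the listing will not match the tables in this message line for line.

```python
import cProfile
import os
import pstats

def walk_all(top):
    """Consume os.walk(top) and return how many directories were visited."""
    count = 0
    for path, dirs, files in os.walk(top):
        count += 1
    return count

def profile_walk(top, limit=10):
    """Profile walk_all(top) and print the `limit` costliest entries."""
    profiler = cProfile.Profile()
    visited = profiler.runcall(walk_all, top)  # returns walk_all's result
    stats = pstats.Stats(profiler)
    # Sort by cumulative time so wrappers such as isdir rank above the
    # primitive calls they make.
    stats.sort_stats("cumulative").print_stats(limit)
    return visited
```

Calling profile_walk('D:\\FILES\\') prints the table and returns the number of directories walked.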


Python 2.4.3:
         604295 function calls (587634 primitive calls) in 48.629 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    62554    0.264    0.000    0.264    0.000 :0(append)
        1    0.001    0.001   48.593   48.593 :0(execfile)
    66074    0.197    0.000    0.197    0.000 :0(len)
     3521    5.219    0.001    5.219    0.001 :0(listdir)
        1    0.036    0.036    0.036    0.036 :0(setprofile)
    62554   38.812    0.001   38.812    0.001 :0(stat)
        1    0.000    0.000   48.593   48.593 <string>:1(?)
    66074    0.218    0.000    0.218    0.000 ntpath.py:116(splitdrive)
     3520    0.009    0.000    0.009    0.000 ntpath.py:246(islink)
    62554    0.767    0.000   40.137    0.001 ntpath.py:268(isdir)
    66074    0.433    0.000    0.650    0.000 ntpath.py:51(isabs)
    66074    0.880    0.000    1.726    0.000 ntpath.py:59(join)
20183/3522    1.217    0.000   48.573    0.014 os.py:211(walk)
        1    0.000    0.000   48.629   48.629 profile:0(execfile('test.py'))
        0    0.000             0.000          profile:0(profiler)
    62554    0.174    0.000    0.174    0.000 stat.py:29(S_IFMT)
    62554    0.385    0.000    0.559    0.000 stat.py:45(S_ISDIR)
        1    0.019    0.019   48.592   48.592 test.py:1(?)


Python 2.5:
         604295 function calls (587634 primitive calls) in 17.386 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    62554    0.247    0.000    0.247    0.000 :0(append)
        1    0.001    0.001   17.315   17.315 :0(execfile)
    66074    0.168    0.000    0.168    0.000 :0(len)
     3521    5.287    0.002    5.287    0.002 :0(listdir)
        1    0.071    0.071    0.071    0.071 :0(setprofile)
    62554    7.812    0.000    7.812    0.000 :0(stat)
        1    0.000    0.000   17.315   17.315 <string>:1(<module>)
    66074    0.186    0.000    0.186    0.000 ntpath.py:116(splitdrive)
     3520    0.009    0.000    0.009    0.000 ntpath.py:245(islink)
    62554    0.712    0.000    9.013    0.000 ntpath.py:267(isdir)
    66074    0.394    0.000    0.581    0.000 ntpath.py:51(isabs)
    66074    0.815    0.000    1.564    0.000 ntpath.py:59(join)
20183/3522    1.176    0.000   17.296    0.005 os.py:218(walk)
        1    0.000    0.000   17.386   17.386 profile:0(execfile('test.py'))
        0    0.000             0.000          profile:0(profiler)
    62554    0.159    0.000    0.159    0.000 stat.py:29(S_IFMT)
    62554    0.331    0.000    0.489    0.000 stat.py:45(S_ISDIR)
        1    0.018    0.018   17.314   17.314 test.py:1(<module>)
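
To put the two profiles side by side, the short sketch below divides the 2.4.3 times by the 2.5 ones. The figures are copied from the tables in this message; the dictionary and function names are only for illustration.

```python
# Seconds copied from the two profiles in this message.
py243 = {"total": 48.629, "isdir (cumulative)": 40.137,
         "stat": 38.812, "listdir": 5.219}
py25 = {"total": 17.386, "isdir (cumulative)": 9.013,
        "stat": 7.812, "listdir": 5.287}

def speedups(old, new):
    """Return old/new time ratios for every entry present in `old`."""
    return {name: old[name] / new[name] for name in old}

ratios = speedups(py243, py25)
for name in sorted(ratios):
    print("%-20s %.1fx" % (name, ratios[name]))
```

The stat call alone drops by about 31 seconds, which accounts for nearly the whole difference between the two totals, while listdir is essentially unchanged.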



