[Python-Dev] PEP 471 Final: os.scandir() merged into Python 3.5

Ben Hoyt benhoyt at gmail.com
Mon Mar 9 12:58:26 CET 2015


Hi Ryan,

> ./configure --with-pydebug && make -j7
>
> I then ran ./python.exe ~/Workspace/python/scandir/benchmark.py and I got:
>
> Creating tree at /Users/rstuart/Workspace/python/scandir/benchtree: depth=4, num_dirs=5, num_files=50
> Using slower ctypes version of scandir
> Comparing against builtin version of os.walk()
> Priming the system's cache...
> Benchmarking walks on /Users/rstuart/Workspace/python/scandir/benchtree, repeat 1/3...
> Benchmarking walks on /Users/rstuart/Workspace/python/scandir/benchtree, repeat 2/3...
> Benchmarking walks on /Users/rstuart/Workspace/python/scandir/benchtree, repeat 3/3...
> os.walk took 0.184s, scandir.walk took 0.158s -- 1.2x as fast

Note that this benchmark is invalid for a couple of reasons. First,
you're compiling Python in debug mode (--with-pydebug), which produces
significantly slower code in my tests -- for example, on Windows
benchmark.py is about twice as slow when Python is compiled in debug
mode.

Second, as the output above shows, benchmark.py is "Using slower
ctypes version of scandir" and not a C version at all. If os.scandir()
is available, benchmark.py should use that, so there's something wrong
here -- maybe the patch didn't apply correctly or maybe you're testing
with a different version of Python than the one you built?
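
A quick way to rule out both problems is to ask the interpreter itself. The snippet below is only a sketch: describe_interpreter() is a name I made up for illustration, and the debug check assumes that sys.gettotalrefcount exists only in debug builds such as those made with --with-pydebug. Run it with the same ./python.exe you used for benchmark.py.

```python
import os
import sys


def describe_interpreter():
    """Collect facts that decide whether a scandir benchmark is meaningful.

    Hypothetical helper for illustration; not part of benchmark.py.
    """
    return {
        # Which binary is actually running (catches "wrong Python" mistakes).
        "executable": sys.executable,
        "version": sys.version.split()[0],
        # True only if this build has os.scandir() built in.
        "has_os_scandir": hasattr(os, "scandir"),
        # Assumption: sys.gettotalrefcount is present only in debug builds.
        "debug_build": hasattr(sys, "gettotalrefcount"),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print("{}: {}".format(key, value))
```

If has_os_scandir is False, or executable is not the binary you just built, benchmark.py will fall back to the ctypes version as in the output above.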

In any case, the easiest way to test it now is to download Python 3.5
alpha 2 which just came out:
https://www.python.org/downloads/release/python-350a2/

I just tried this on my Mac Mini (i5 2.3GHz, 2 GB RAM, HFS+ on
rotational drive) and got the following results:

Using Python 3.5's builtin os.scandir()
Comparing against builtin version of os.walk()
Priming the system's cache...
Benchmarking walks on benchtree, repeat 1/3...
Benchmarking walks on benchtree, repeat 2/3...
Benchmarking walks on benchtree, repeat 3/3...
os.walk took 0.074s, scandir.walk took 0.016s -- 4.7x as fast

> I then did ./python.exe ~/Workspace/python/scandir/benchmark.py -s and got:

Also note that "benchmark.py -s" tests the system os.walk() against a
get_tree_size() function using scandir's DirEntry.stat().st_size,
which provides huge gains on Windows (because stat().st_size doesn't
require an OS call) but only modest gains on POSIX systems, which
still require an OS stat call to get the size (though not the file
type, so at least it's only one stat call). I get "2.2x as fast" on my
Mac for "benchmark.py -s".
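
For reference, a get_tree_size() along these lines looks roughly like the sketch below. It follows the example in PEP 471 and is not necessarily identical to the function in benchmark.py; it assumes Python 3.5+ where os.scandir() is available.

```python
import os


def get_tree_size(path):
    """Return the total size in bytes of all files under path.

    Sketch based on the PEP 471 example; symlinks are not followed.
    """
    total = 0
    for entry in os.scandir(path):
        # is_dir() normally needs no extra system call: the file type
        # comes back with the directory listing on most platforms.
        if entry.is_dir(follow_symlinks=False):
            total += get_tree_size(entry.path)
        else:
            # On Windows the size is already cached in the DirEntry;
            # on POSIX this is the one stat call per file mentioned above.
            total += entry.stat(follow_symlinks=False).st_size
    return total
```

Calling get_tree_size('benchtree') returns the size of the whole benchmark tree in bytes.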

-Ben

