[Python-ideas] Speed up os.walk() 5x to 9x by using file attributes from FindFirst/NextFile() and readdir()

Andrew Barnert abarnert at yahoo.com
Thu Nov 15 16:36:34 CET 2012


From: Mike Meyer <mwm at mired.org>
Sent: Thu, November 15, 2012 2:29:44 AM

>If the goal is to make os.walk fast, then it might be better (on Posix systems,
>anyway) to see if it can be built on top of ftw instead of low-level directory
>scanning routines.


You can't actually use ftw, because it doesn't give you any way to handle the
options to os.walk. Plus, it's "obsolescent" (POSIX.1-2008), "deprecated" (Linux),
or "legacy" (OS X), and at least some platforms will throw warnings when you
call it. It's also somewhat underspecified, and different platforms, or even
different versions of the same platform, will give you different behavior in
various cases (especially with symlinks).
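
To make the "options" point concrete, here is a minimal sketch of the os.walk
behavior a plain ftw callback has no obvious way to reproduce: top-down pruning by
editing dirnames in place, an onerror hook, and the followlinks switch. The
helper name walk_pruned and its skip parameter are made up for illustration.

```python
import os


def walk_pruned(top, skip):
    """Return the directories os.walk visits under `top`, relative to `top`,
    without descending into any directory whose name is in `skip`."""
    visited = []
    errors = []  # os.walk passes OSError instances to onerror instead of raising
    for dirpath, dirnames, filenames in os.walk(
        top, topdown=True, onerror=errors.append, followlinks=False
    ):
        # In top-down mode, editing dirnames in place tells os.walk which
        # subdirectories to skip.
        dirnames[:] = [d for d in dirnames if d not in skip]
        visited.append(os.path.relpath(dirpath, top))
    return sorted(visited)
```

Calling walk_pruned(top, {".git"}), for example, walks everything under top
except the .git subtrees.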

But you could, e.g., use fts on platforms that have it, nftw on platforms that
have a version close enough to recent Linux/glibc for our purposes, and fall
back to readdir+stat for the rest. That could give a big speed improvement on
the most popular platforms, and on the others, at least things would be no worse
than today (and anyone who cared could much more easily write the appropriate
nftw/fts/whatever port for their platform once the framework was in place).
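
A rough sketch of that selection order, just to show the shape of the
framework. The names libc_has and choose_walk_backend are hypothetical, and
checking that a symbol exists is only a stand-in: a real implementation would
also have to verify that the platform's nftw behaves closely enough to glibc's,
which this does not attempt.

```python
import ctypes
import ctypes.util


def libc_has(symbol):
    """Best-effort check for whether the C library exports `symbol`."""
    try:
        # find_library may return None; on POSIX, CDLL(None) then opens the
        # current process, which normally includes libc.
        libc = ctypes.CDLL(ctypes.util.find_library("c"))
    except Exception:
        return False
    return hasattr(libc, symbol)


def choose_walk_backend(has_symbol=libc_has):
    """Pick a traversal backend name in the order proposed above."""
    if has_symbol("fts_open"):
        return "fts"
    if has_symbol("nftw"):
        # Assumption: presence alone says nothing about glibc compatibility.
        return "nftw"
    return "readdir+stat"
```

The probe is passed in as a parameter so the choice can be exercised without
depending on the host platform, e.g. choose_walk_backend(lambda name: False)
always selects the readdir+stat fallback.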
