Faster os.walk()

Philippe C. Martin philippe at philippecmartin.com
Wed Apr 20 13:09:59 EDT 2005


How about rerouting stdout/stderr and popen-ing something like

/bin/find -name '*' -exec
a_script_or_cmd_that_does_what_i_want_with_the_file {} \;

?
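
A minimal sketch of that idea in Python, assuming a `find` binary is on the PATH and a Python 3.7+ interpreter (for `capture_output`). The helper names `build_find_command` and `run_find` are made up for illustration, and `echo` merely stands in for the script or command that does the real work on each file.

```python
import subprocess


def build_find_command(root, handler):
    """Return the argv list for: find ROOT -name '*' -exec HANDLER... {} ;"""
    # No shell is involved, so '*' and ';' need no quoting or backslash.
    return ["find", root, "-name", "*", "-exec", *handler, "{}", ";"]


def run_find(root, handler):
    """Run find once and return its captured (stdout, stderr) as text."""
    result = subprocess.run(
        build_find_command(root, handler),
        capture_output=True,  # reroute stdout and stderr into the result
        text=True,
        check=False,          # leave the exit status for the caller to judge
    )
    return result.stdout, result.stderr


if __name__ == "__main__":
    # 'echo' is a placeholder handler; swap in the real per-file command.
    out, err = run_find(".", ["echo"])
    print(out, end="")
```

Note that find starts the handler once per file here, so the process-spawning cost is paid for every file rather than once per directory tree.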

Regards,

Philippe




fuzzylollipop wrote:

> du is faster than my code that does the same thing in python, it is
> highly optimized at the os level.
> 
> that said, I profiled spawning an external process to call du and over
> the large number of times I need to do this it is actually slower to
> execute du externally than my os.walk() implementation.
> 
> du does not return the value I need anyway, I need files only not raw
> blocks consumed which is what du returns. also I need to filter out
> some files and dirs.
> 
> after extensive profiling I found out that the way that os.walk() is
> implemented it calls os.stat() on the dirs and files multiple times and
> that is where all the time is going.
> 
> I guess I need something like os.statcache() but that is deprecated,
> and probably wouldn't fix my problem. I only walk the dir once and then
> cache all the byte counts; the time goes into the multiple calls to
> os.stat() that the os.walk() command kicks off internally, on top of
> all the isdir() and getsize() calls and whatnot.
> 
> just wanted to check and see if anyone had already solved this problem.
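
For readers on a current interpreter, the repeated-stat cost described above can be sidestepped with `os.scandir()`, which was added in Python 3.5 and so was not an option when this thread was written. The sketch below is only one possible approach, not the original poster's code: `total_file_bytes` and its `skip` callback are hypothetical names. It sums the sizes of regular files only (not blocks consumed) and lets the caller filter out files and directories.

```python
import os


def total_file_bytes(root, skip=lambda entry: False):
    """Sum the sizes of regular files under root, at most one stat per file.

    skip is a hypothetical filter hook: it receives each os.DirEntry and
    returns True for files or directories that should be ignored.
    """
    total = 0
    stack = [root]  # explicit stack instead of recursion
    while stack:
        current = stack.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                if skip(entry):
                    continue
                # The type checks can usually be answered from the directory
                # listing itself, without an extra system call.
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    # The stat result is cached on the DirEntry object.
                    total += entry.stat(follow_symlinks=False).st_size
    return total


if __name__ == "__main__":
    # Example filter: ignore anything named ".git" or ending in ".tmp".
    ignore = lambda e: e.name == ".git" or e.name.endswith(".tmp")
    print(total_file_bytes(".", skip=ignore))
```

Symbolic links are neither followed nor counted in this sketch; whether that matches the filtering needed here is an assumption.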



