[Tutor] why is os.path.walk so slow?

Hugo Arts hugo.yoshi at gmail.com
Wed Nov 4 17:30:47 CET 2009


On Wed, Nov 4, 2009 at 4:56 PM, Wayne Werner <waynejwerner at gmail.com> wrote:
> On Wed, Nov 4, 2009 at 6:16 AM, Garry Willgoose
> <garry.willgoose at newcastle.edu.au> wrote:
>>
>> <snip>
>>
>> This is very fast for a directory on my local machine but significantly
>> slower on the remote machine. Not surprising but I would have expected that
>> the run time for the remote directory would be limited by my broadband speed
>> but when I look at upload/download in real time it's less than 10% of
>> maximum. Is this just par for the course or is there something I can do that
>> better utilizes my broadband bandwidth?
>
> I'm not sure if there's a correlation, but there probably is. What OS are
> you (and the remote system) using? What service are you using to connect?
> By way of disclosure, I don't have a lot of experience in this category, but
> I would venture that whatever service you're using has to send/receive
> requests for each file/dir that os.walk checks.
>
> <snip>

I'm taking a stab in the dark, but latency may be the bottleneck
here. The process sends a request for each file or directory, waits
for the answer, and only then sends the next request. All those little
waits add up, even though the bandwidth consumed is negligible.
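
To put rough numbers on that, here is a back-of-the-envelope sketch. The
function name and the figures (10,000 entries, a 50 ms round trip, one
request per entry) are made up for illustration, not measured on your
setup, and the real number of requests per entry depends on the service
you are using to connect.

```python
def estimated_wait_seconds(num_entries, round_trip_seconds, requests_per_entry=1):
    """Rough time spent waiting when requests are sent strictly one after another."""
    return num_entries * requests_per_entry * round_trip_seconds


# Hypothetical figures: 10,000 files and directories, 50 ms per round trip.
# That is roughly 500 seconds of pure waiting, however fast the link is.
print(estimated_wait_seconds(10000, 0.05))
```

Each reply is tiny, so the link sits mostly idle in between, which would
fit the low bandwidth usage you are seeing.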

If that is the case, running the script on the remote server should be
the solution: it can gather the data locally and then transmit it in
one go, eliminating most of the waiting.
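
Something along these lines might work as the remote-side script. It is
only a sketch: it assumes you can run Python on the remote machine at
all, and the names list_tree and write_listing, as well as the choice of
JSON and of recording file sizes, are my own placeholders rather than
anything from your original script.

```python
import json
import os


def list_tree(root):
    """Walk root on the local disk and collect every file path with its size."""
    entries = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # The file vanished or is unreadable; keep the path anyway.
                size = None
            entries.append({"path": path, "size": size})
    return entries


def write_listing(root, out):
    """Write the whole listing to the file-like object out as one JSON document."""
    json.dump(list_tree(root), out)
```

Calling write_listing(".", sys.stdout) on the server (after importing
sys) prints the entire listing in a single response, so you pay the
round-trip delay once instead of once per file.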

Hugo

