Optimizing tips for os.listdir

Jeremy Jones zanesdad at bellsouth.net
Mon Sep 27 09:01:57 EDT 2004


Thomas wrote:

>I'm doing this:
>
>[os.path.join(path, p) for p in os.listdir(path) if \
>os.path.isdir(os.path.join(path, p))]
>
>to get a list of folders in a given directory, skipping all plain
>files. When used on folders with lots of files, it takes a rather long
>time to finish. Just doing a listdir, filtering out all plain files,
>and a couple of joins, I didn't think this would take so long.
>
>Is there a faster way of doing stuff like this?
>
>Best regards,
>Thomas
>  
>

I was going to use timeit, but kept encountering a "global name 'os' is 
not defined" error, so I wrote my own timing function named "time_me".  I 
created a directory with 5000 files and 3 directories, ran your code 
on it, and got around 0.34 seconds:

In [38]: time_me('''[os.path.join(path, p) for p in os.listdir(".") if 
os.path.isdir(os.path.join(path, p))]''')
Out[38]: 0.33953285217285156
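
On the timeit error mentioned above: timeit compiles the statement in its 
own namespace, so names such as os are not visible unless they are 
supplied through the setup string.  A minimal sketch of that approach 
follows.  It is not the time_me function used for the numbers in this 
post; the value of path and the run count of 10 are assumptions chosen 
for illustration.

```python
import timeit

# timeit runs the statement in its own namespace, so the import and the
# `path` variable are passed in through the setup string.
# path = '.' is an assumption for this sketch.
setup = "import os; path = '.'"
stmt = """[os.path.join(path, p) for p in os.listdir(path)
           if os.path.isdir(os.path.join(path, p))]"""

# number=10 is an arbitrary choice; timeit returns the total time in
# seconds for all runs, so divide to get a per-run figure.
runs = 10
total = timeit.timeit(stmt, setup=setup, number=runs)
per_run = total / runs
print("%.6f seconds per run" % per_run)
```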

I tried glob rather than os.listdir.  Not any better:

In [39]: import glob

In [40]: time_me('''[os.path.join(path, p) for p in glob.glob("*") if 
os.path.isdir(os.path.join(path, p))]''')
Out[40]: 0.38322591781616211

I decided, since I was already in the directory, to skip the 
os.path.join() on path and p.  That took 0.23 seconds rather than 0.34.  
It's probably not an option for you, though, and it didn't really buy a 
whole lot:

In [42]: time_me('''[p for p in os.listdir(".") if os.path.isdir(p)]''')
Out[42]: 0.23234295845031738
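
If the full paths are needed, one small variation is to build each joined 
path once and reuse it for both the test and the result, since the 
original expression calls os.path.join twice for every directory it 
keeps.  This is only a sketch: list_subdirs is a hypothetical name, and I 
have not timed it, so any saving is an assumption and is likely small 
next to the cost of the isdir checks.

```python
import os

def list_subdirs(path):
    """Return the full paths of the immediate subdirectories of `path`."""
    result = []
    for name in os.listdir(path):
        full = os.path.join(path, name)  # joined once, reused below
        if os.path.isdir(full):          # still one stat per entry
            result.append(full)
    return result

if __name__ == "__main__":
    print(list_subdirs("."))
```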

os.listdir seems to be fairly cheap:

In [43]: time_me("os.listdir('.')")
Out[43]: 0.030396938323974609

I know that none of this is helpful so far, but I don't see how you're 
going to get around getting a directory listing and then statting each 
file to determine whether it's a directory.  The way you are doing it is 
a reasonable solution.  How many files is "lots of files"?  And how long 
does it take on your system?
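
For readers on Python 3.5 or later, os.scandir is worth trying for this 
listing-plus-stat pattern: it yields directory entries whose is_dir() 
method can often be answered from information the operating system 
returns with the listing, which on many platforms avoids a separate stat 
call per file.  The sketch below assumes Python 3.6+ (for the with 
statement on scandir); list_subdirs_scandir is a hypothetical name, and 
no timings are claimed for it here.

```python
import os

def list_subdirs_scandir(path):
    """Return the full paths of the immediate subdirectories of `path`."""
    with os.scandir(path) as entries:
        # entry.path is the scandir argument joined with the entry name.
        # entry.is_dir() follows symlinks by default, like os.path.isdir.
        return [entry.path for entry in entries if entry.is_dir()]

if __name__ == "__main__":
    print(list_subdirs_scandir("."))
```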


Jeremy