Optimizing tips for os.listdir
Jeremy Jones
zanesdad at bellsouth.net
Mon Sep 27 09:01:57 EDT 2004
Thomas wrote:
>I'm doing this :
>
>[os.path.join(path, p) for p in os.listdir(path) if \
>os.path.isdir(os.path.join(path, p))]
>
>to get a list of folders in a given directory, skipping all plain
>files. When used on folders with lots of files, it takes rather long
>time to finish. Just doing a listdir, filtering out all plain files
>and a couple of joins, I didn't think this would take so long.
>
>Is there a faster way of doing stuff like this?
>
>Best regards,
>Thomas
>
>
I was going to use timeit, but kept encountering a "global name 'os' is
not defined" error. So, I created my own function named "time_me." I
created a directory with 5000 files and 3 directories. I ran your code
on it and got around .34 seconds:
In [38]: time_me('''[os.path.join(path, p) for p in os.listdir(".") if
os.path.isdir(os.path.join(path, p))]''')
Out[38]: 0.33953285217285156
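For what it's worth, the "global name 'os' is not defined" error comes from timeit running the statement in its own namespace rather than in the interactive session, so nothing imported at the prompt is visible to it. Here is a minimal sketch of the usual workaround, passing the import through the setup string. The path "." and the single run (number=1) are assumptions for illustration, not what time_me does:

```python
import timeit

# The statement is compiled and run inside timeit's own namespace,
# so both 'os' and 'path' have to be defined by the setup string.
stmt = '''[os.path.join(path, p) for p in os.listdir(path)
           if os.path.isdir(os.path.join(path, p))]'''
setup = 'import os; path = "."'

timer = timeit.Timer(stmt, setup)
elapsed = timer.timeit(number=1)  # total seconds for one run
print(elapsed)
```

Raising number and dividing the result by it would smooth out noise from disk caching.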
I tried glob rather than os.listdir. Not any better:
In [39]: import glob
In [40]: time_me('''[os.path.join(path, p) for p in glob.glob("*") if
os.path.isdir(os.path.join(path, p))]''')
Out[40]: 0.38322591781616211
I decided, since I was already in the directory, to skip the
os.path.join() on path and p. That took .23 seconds rather than .34.
It only works when the current directory is the one being listed, so
it's probably not an option, and it doesn't buy you a whole lot anyway:
In [42]: time_me('''[p for p in os.listdir(".") if os.path.isdir(p)]''')
Out[42]: 0.23234295845031738
os.listdir seems to be fairly cheap:
In [43]: time_me("os.listdir('.')")
Out[43]: 0.030396938323974609
I know that none of this is helpful so far, but I don't see how you
can avoid getting a directory listing and then statting each entry to
determine whether it's a directory. The way you are doing it is a
reasonable solution. How many files is "lots of files"? And how long
does it take on your system?
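To make the per-entry cost explicit, here is a sketch of what the list comprehension boils down to: one listing, then one stat call per entry. The name list_dirs is hypothetical. I wouldn't expect it to be dramatically faster; it only builds each joined path once instead of twice:

```python
import os
import stat

def list_dirs(path):
    """Return full paths of the subdirectories directly under path."""
    dirs = []
    for name in os.listdir(path):
        full = os.path.join(path, name)  # join once, reuse below
        try:
            mode = os.stat(full).st_mode  # one stat call per entry
        except OSError:
            continue  # entry vanished or is a broken symlink
        if stat.S_ISDIR(mode):
            dirs.append(full)
    return dirs
```

Calling list_dirs(".") should return the same entries as the original expression with path set to ".".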
Jeremy