Iterating over files of a huge directory

Chris Angelico rosuav at gmail.com
Mon Dec 17 10:41:42 EST 2012


On Tue, Dec 18, 2012 at 2:28 AM, Gilles Lenfant
<gilles.lenfant at gmail.com> wrote:
> Hi,
>
> I have googled but did not find an efficient solution to my problem. My customer provides a directory with a huuuuge list of files (flat, potentially 100000+), and I cannot reasonably use os.listdir(this_path) without creating a big memory footprint.
>
> So I'm looking for an iterator that yields the file names of a directory and does not build a giant list of what's in it.

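Sounds like you want os.walk, though note that it still hands you each
directory's file names as a list, so it only helps once the files are
spread across subdirectories. But... a hundred thousand files? I know
the Zen of Python says that flat is better than nested, but surely
there's some kind of directory structure that would make this
marginally manageable?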

http://docs.python.org/3.3/library/os.html#os.walk
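
For what it's worth, here is a minimal sketch of the kind of lazy
iterator the original question asks for. It assumes Python 3.6 or
later, where os.scandir exists and works as a context manager; that
function postdates this thread, so it is not what was available at the
time. The helper name iter_file_names is made up for illustration.

```python
import os


def iter_file_names(path):
    """Yield the names of regular files in `path` one at a time.

    Hypothetical helper: relies on os.scandir (Python 3.5+, context
    manager support from 3.6), which returns an iterator of directory
    entries rather than a full list like os.listdir.
    """
    with os.scandir(path) as entries:
        for entry in entries:
            # Skip subdirectories and other non-file entries.
            if entry.is_file():
                yield entry.name
```

Something like "for name in iter_file_names(this_path): ..." would then
process one name at a time without holding all of them in memory. The
order of the names is arbitrary, as with os.listdir.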

ChrisA


