scandir slower than listdir

Steve D'Aprano steve+python at pearwood.info
Thu Jul 20 07:43:02 EDT 2017


On Thu, 20 Jul 2017 03:33 pm, Torsten Bronger wrote:

> Hello!
> 
> With a directory of 24,000 files on an SSD running Ubuntu,
> 
>     #!/usr/bin/python3
> 
>     import os, time
> 
> 
>     start = time.time()
>     list(os.listdir("/home/bronger/.saves"))
>     print("listdir:", time.time() - start)
> 
>     start = time.time()
>     list(os.scandir("/home/bronger/.saves"))
>     print("scandir:", time.time() - start)
> 
> yields
> 
> listdir: 0.045470237731933594
> scandir: 0.08043360710144043
> 
> However, scandir is supposed to be faster than listdir.  Why do I
> see this?

The documentation says:


"Using scandir() instead of listdir() can significantly increase the performance
of code that ALSO NEEDS FILE TYPE OR FILE ATTRIBUTE INFORMATION"

[emphasis added]

https://docs.python.org/3.5/library/os.html#os.scandir

If all you need is the names, listdir() is faster because it only returns the
names, as plain strings. scandir() returns an iterator of os.DirEntry objects,
and building each of those costs a little extra. Each entry may include cached
values for:

- the name
- full path
- flag whether it is a directory
- flag whether it is a file
- flag whether it is a symlink
- inode number
- file stat record
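
Where scandir() should pay off is when you need that extra information for
every entry. Here is a rough sketch of the kind of comparison the docs have in
mind, counting subdirectories both ways. The function names are made up for
illustration, and I haven't timed this on your directory, so treat it as a
starting point for your own measurements rather than a benchmark result:

```python
import os
import tempfile


def count_dirs_listdir(path):
    # listdir() gives only names, so every type check needs its own
    # os.path.isdir() call, which means an extra stat() per entry.
    return sum(1 for name in os.listdir(path)
               if os.path.isdir(os.path.join(path, name)))


def count_dirs_scandir(path):
    # DirEntry.is_dir() can usually answer from information the OS
    # supplied while scanning, avoiding a separate stat() in most cases.
    return sum(1 for entry in os.scandir(path) if entry.is_dir())


if __name__ == "__main__":
    # Hypothetical sample tree: 3 subdirectories and 5 empty files.
    with tempfile.TemporaryDirectory() as tmp:
        for i in range(3):
            os.mkdir(os.path.join(tmp, "dir%d" % i))
        for i in range(5):
            open(os.path.join(tmp, "file%d" % i), "w").close()
        print("listdir + isdir:", count_dirs_listdir(tmp))
        print("scandir + is_dir:", count_dirs_scandir(tmp))
```

Wrap each function in time.time() calls, the way you did in your script, and
point it at your real directory. I would expect the scandir() version to come
out ahead there, since it skips most of the per-file stat() calls.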




-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.



