[Python-Dev] pathlib and issue 11406 (a directory iterator returning stat-like info)

Ben Hoyt benhoyt at gmail.com
Mon Nov 25 00:31:36 CET 2013


Antoine's class-global flag seems like a bad idea.

> A global string (or Path) keyed cache (rather than a per-object cache) would
> actually be a safer option, since it would ensure distinct path objects
> always gave the same answer. That's the approach I will likely pursue at
> some point in walkdir.

Interesting approach. This wouldn't really solve the problem for
scandir / DirEntry / performance issues, but it's a fair idea in
general.

> It's also quite likely the "rich stat object" API will be pursued for 3.5,
> which is a much safer approach to stat result caching than trying to embed
> it directly in pathlib.Path objects.

As a Windows dev, I'm not sure I love the "rich stat object" idea,
because stat_result objects are sooo POSIX-y. On Windows, (some of) the
file attribute info is stuffed into a stat_result struct, which kinda
works. But I like how Path exposes the higher-level, cross-platform
stuff like .is_dir() so that most of the time you don't need to worry
about stat. (You still need to worry about caching, though.)
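
To illustrate the difference, a small sketch comparing the two styles.
The helper names is_dir_via_stat() and is_dir_via_path() are invented
for the example. Neither version caches anything, so every call goes
back to the filesystem.

```python
import os
import stat
from pathlib import Path


def is_dir_via_stat(path):
    """POSIX-flavoured: fetch a stat_result and decode st_mode by hand."""
    return stat.S_ISDIR(os.stat(path).st_mode)


def is_dir_via_path(path):
    """Higher-level: let Path hide the stat call and the mode bits."""
    return Path(path).is_dir()
```

One behavioural difference worth noting: os.stat() raises an error for
a path that does not exist, while Path.is_dir() simply returns False.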

> That's why we decided to punt on the caching question until 3.5 - it's
> better to provide a predictable building block that doesn't provide caching,
> and then work out how to provide a sensible caching layer on top of that,
> rather than trying to rush a potentially flawed caching design that leads to
> inconsistent behaviour.

Yep, agreed about rushing in a potentially flawed caching design. But
I also don't want to "rush in" a design that prohibits scandir()-style
performance optimizations -- though I guess it can still go in there
one way or the other.

"Worst case", we can add os.scandir() separately, which returns
"path-like" DirEntry objects.
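
A sketch of what using such an API could look like. It assumes
os.scandir() in the form that later shipped in the os module (Python
3.5, with context-manager support from 3.6); count_entries() is a
made-up helper for the example.

```python
import os


def count_entries(top):
    """Count the files and subdirectories directly under top."""
    files = dirs = 0
    with os.scandir(top) as entries:
        for entry in entries:
            # DirEntry.is_dir() / is_file() can usually be answered from
            # the directory listing itself, without an extra stat call.
            if entry.is_dir():
                dirs += 1
            elif entry.is_file():
                files += 1
    return files, dirs
```

The point is that the caller never touches stat_result directly for
the common "is this a file or a directory?" questions.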

-Ben

