[Python-Dev] pathlib and issue 11406 (a directory iterator returning stat-like info)

Nick Coghlan ncoghlan at gmail.com
Mon Nov 25 00:53:44 CET 2013


On 25 Nov 2013 09:31, "Ben Hoyt" <benhoyt at gmail.com> wrote:
>
> > It's also quite likely the "rich stat object" API will be pursued for 3.5,
> > which is a much safer approach to stat result caching than trying to embed
> > it directly in pathlib.Path objects.
>
> As a Windows dev, I'm not sure I love the "rich stat object idea",
> because stat_result objects are sooo Posixy. On Windows, (some of) the
> file attribute info is stuffed into a stat_result struct. Which kinda
> works, but I like how Path exposes the higher-level, cross-platform
> stuff like .is_dir() so that most of the time you don't need to worry
> about stat. (You still need to worry about caching, though.)

The idea of the rich stat result object is that it has all that info
prepopulated, based on an initial stat call. "Caching" it amounts to "keep
a reference to it".
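
A minimal sketch of that idea, assuming a hypothetical wrapper class named
RichStat (neither the name nor the method set comes from the tracker issue).
It captures one os.stat_result up front and answers Path-style questions
from it without touching the filesystem again:

```python
import os
import stat


class RichStat:
    """Hypothetical wrapper: answers Path-style queries from one stat call."""

    def __init__(self, st):
        self._st = st  # an os.stat_result captured exactly once

    @classmethod
    def from_path(cls, path, follow_symlinks=True):
        # The only system call; every later query reuses its result.
        st = os.stat(path) if follow_symlinks else os.lstat(path)
        return cls(st)

    def is_dir(self):
        return stat.S_ISDIR(self._st.st_mode)

    def is_file(self):
        return stat.S_ISREG(self._st.st_mode)

    def is_symlink(self):
        # Only ever true when the result came from lstat().
        return stat.S_ISLNK(self._st.st_mode)

    def __getattr__(self, name):
        # Delegate st_size, st_mtime, etc. to the wrapped stat_result so
        # the object stays a superset of the existing stat API.
        if name == "_st":
            raise AttributeError(name)
        return getattr(self._st, name)
```

Holding on to the object returned by RichStat.from_path(some_dir) is the
whole cache: info.is_dir() and info.st_mtime both read the stored result,
and dropping the reference discards it.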

The suggestion is that its API would be a subset of the pathlib.Path API:
http://bugs.python.org/issue19725

If it's also a superset of the existing stat object API, then at least
Path.stat and Path.lstat (and perhaps the lower level APIs) can be updated
to return it in 3.5.

> > That's why we decided to punt on the caching question until 3.5 - it's
> > better to provide a predictable building block that doesn't provide caching,
> > and then work out how to provide a sensible caching layer on top of that,
> > rather than trying to rush a potentially flawed caching design that leads to
> > inconsistent behaviour.
>
> Yep, agreed about rushing in a potentially flawed caching design. But
> I also don't want to "rush in" a design that prohibits scandir()-style
> performance optimizations -- though I guess it can still go in there
> one way or the other.

Yeah, the realisation that an initial non-caching approach didn't lock us
out of external caching may not have been well communicated to the list. I
was discussing the walkdir integration possibilities with Antoine and Guido
and realised I would likely still need an external cache, even if pathlib
had its own internal caching. At that point, it seemed highly desirable to
duck the caching question entirely.
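
As a rough illustration of what an external cache layered over a non-caching
building block might look like, here is a sketch. The StatCache class and its
methods are hypothetical, not part of walkdir or pathlib; the point is only
that the caller, not the path object, decides when a result goes stale:

```python
import os


class StatCache:
    """Hypothetical external cache keyed by path string."""

    def __init__(self):
        self._entries = {}
        self.misses = 0  # counts real os.stat() calls, for illustration

    def stat(self, path):
        key = os.fspath(path)  # accepts str or pathlib.Path
        try:
            return self._entries[key]
        except KeyError:
            self.misses += 1
            result = self._entries[key] = os.stat(key)
            return result

    def invalidate(self, path=None):
        # Explicit invalidation: one entry, or everything when path is None.
        if path is None:
            self._entries.clear()
        else:
            self._entries.pop(os.fspath(path), None)
```

Because the cache lives outside the path objects, two callers with different
staleness requirements can each keep their own instance over the same paths.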

> "Worst case", we can add os.scandir() separately, which returns
> DirEntry, "path-like" objects.

Indeed, we may still want such an object API, since dirent doesn't provide
full stat info.
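
For reference, a sketch using os.scandir() as it later shipped in Python 3.5
(the context-manager form needs 3.6+), which is not necessarily the shape
being discussed here. The helper name list_subdirs_and_sizes is made up for
the example. Type checks are usually answered from the directory entry
itself, while entry.stat() may need a further system call on POSIX when full
stat info such as the size is wanted:

```python
import os


def list_subdirs_and_sizes(top):
    """Return (sorted subdirectory names, {file name: size}) for one directory."""
    subdirs, sizes = [], {}
    with os.scandir(top) as entries:
        for entry in entries:
            # is_dir()/is_file() can often be answered from the dirent alone.
            if entry.is_dir(follow_symlinks=False):
                subdirs.append(entry.name)
            elif entry.is_file(follow_symlinks=False):
                # Full stat info; the DirEntry caches the result after the
                # first call.
                sizes[entry.name] = entry.stat(follow_symlinks=False).st_size
    return sorted(subdirs), sizes
```

Each DirEntry keeps whatever the OS handed back for that entry, so the
"keep a reference to it" style of caching applies here as well.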

A PEP reviewing all this for 3.5 and proposing a specific os.scandir API
would be a good thing.

Cheers,
Nick.

>
> -Ben