[Python-ideas] find-like functionality in pathlib

Guido van Rossum guido at python.org
Wed Jan 6 11:48:30 EST 2016


On Wed, Jan 6, 2016 at 8:11 AM, Random832 <random832 at fastmail.com> wrote:

> On Tue, Jan 5, 2016, at 16:04, Guido van Rossum wrote:
> > One problem with stat() caching is that Path objects are considered
> > immutable, and two Path objects referring to the same path are completely
> > interchangeable. For example, {pathlib.Path('/a'), pathlib.Path('/a')} is
> > a set of length 1: {PosixPath('/a')}. But if we had e.g. Path('/a',
> > cache_stat=True), the behavior of two instances of that object might be
> > observably different (if they were instantiated at times when the
> > contents of the filesystem were different). So maybe stat-caching Path
> > instances should be considered unequal, or perhaps unhashable. Or perhaps
> > they should only be considered equal if their stat() values are actually
> > equal (i.e. if the file's stat() info didn't change).
>
> What about a global cache?


It would have to use a weak dict so if the last reference goes away it
discards the cached stats for a given path, otherwise you'd have trouble
containing the cache size.

And caching Path objects should still not be comparable to non-caching Path
objects (which we still need, to preserve the semantics that repeatedly
calling stat() on a Path object created the default way will always redo
the syscall). The main advantage would be that caching Path objects could
be compared safely.

It could still cause unexpected results. E.g. if you have just traversed
some big tree using caching, and saved some results (so hanging on to some
paths and hence their stat() results), and then you make some changes and
traverse it again to look for something else, you might accidentally be
seeing stale (i.e. cached) stat() results.
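
The stale-result scenario can be illustrated with a small sketch. The
cached_stat() helper and the plain dict used as a cache are assumptions
made for this example only, not a proposed API:

```python
import os
import tempfile


def cached_stat(path, cache):
    """Return os.stat(path), reusing any result already stored in cache."""
    if path not in cache:
        cache[path] = os.stat(path)
    return cache[path]


def demo_stale_size():
    """Return (size at first pass, cached size at second pass, real size)."""
    cache = {}
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "data.txt")
        with open(target, "w") as f:
            f.write("abc")
        first = cached_stat(target, cache).st_size   # first traversal
        with open(target, "a") as f:
            f.write("defgh")                         # change made in between
        second = cached_stat(target, cache).st_size  # second traversal
        fresh = os.stat(target).st_size              # what the syscall says now
    return first, second, fresh
```

Here the second lookup still reports the size from before the file was
changed, because the cached result is reused instead of redoing the syscall.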

Maybe there's a middle ground, where the user can create a StatCache object
and pass it into Path creation and traversal operations. Paths with the
same StatCache object (or both None) compare equal if their path components
are equal. Paths with different StatCache objects never compare equal (but
otherwise are ordered by path as usual -- the StatCache object's identity
is only used when the paths are equal).
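
A minimal sketch of these comparison rules, under stated assumptions:
StatCache and CachedPath are hypothetical names, CachedPath wraps a
PurePath rather than subclassing pathlib.Path, and id() stands in for
"the StatCache object's identity" as the tie-breaker:

```python
import os
from functools import total_ordering
from pathlib import PurePath


class StatCache:
    """Hypothetical cache of stat() results, keyed by path string."""

    def __init__(self):
        self._results = {}

    def stat(self, path):
        key = os.fspath(path)
        if key not in self._results:
            self._results[key] = os.stat(key)
        return self._results[key]


@total_ordering
class CachedPath:
    """Hypothetical wrapper pairing a path with an optional StatCache."""

    def __init__(self, path, stat_cache=None):
        self.pure = PurePath(path)
        self.stat_cache = stat_cache

    def stat(self):
        if self.stat_cache is None:
            return os.stat(self.pure)  # default: always redo the syscall
        return self.stat_cache.stat(self.pure)

    def __eq__(self, other):
        if not isinstance(other, CachedPath):
            return NotImplemented
        # Equal only if the paths match AND the cache is the same object
        # (or both are None).
        return self.pure == other.pure and self.stat_cache is other.stat_cache

    def __hash__(self):
        return hash((self.pure, id(self.stat_cache)))

    def __lt__(self, other):
        if not isinstance(other, CachedPath):
            return NotImplemented
        if self.pure != other.pure:
            return self.pure < other.pure  # ordered by path as usual
        # Cache identity is consulted only when the paths are equal.
        return id(self.stat_cache) < id(other.stat_cache)
```

With this sketch, CachedPath('/a', c) == CachedPath('/a', c) holds for a
shared cache c, while the same path with two different StatCache objects
compares unequal and occupies two slots in a set.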

Are you (or anyone still reading this) interested in implementing this idea?

-- 
--Guido van Rossum (python.org/~guido)