[Python-ideas] os.path.commonprefix: Yes that old chestnut.

Andrew Barnert abarnert at yahoo.com
Tue Mar 24 14:55:18 CET 2015


On Mar 24, 2015, at 4:56 AM, Paul Moore <p.f.moore at gmail.com> wrote:
> 
>> On 23 March 2015 at 21:33, Gregory P. Smith <greg at krypto.org> wrote:
>> +1 pathlib would be the appropriate place for the correctly behaving
>> function to appear.
> 
> OK, so here's a question. What actual use cases exist for a
> common_prefix function? The reason I ask is that I'm looking at some
> of the edge cases, and the obvious behaviour isn't particularly clear
> to me.
> 
> For example, common_prefix('a/b/file.c', 'a/b/file.c'). The common
> prefix is obviously 'a/b/file.c' - but I can imagine people *actually*
> wanting the common *directory* containing both files. But taken
> literally, that's only possible if you check the filesystem, so it
> would no longer be a PurePath operation.

The traditional way to handle this is that the basename (the part after the last '/') is assumed to be a file (if you don't want that, include the trailing slash). POSIX even defines the technical term "path prefix" to mean everything up to the last slash, so something called a "common path prefix" sounds like it should be the common prefix of the path prefixes, right? Except that not every command and function in POSIX works this way, requiring you to memorize or look up the man page to see what someone chose as "obvious" back in the 1970s....
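
A minimal sketch of that reading, for illustration only: the names path_prefix and common_path_prefix are hypothetical, not an existing or proposed API, and the sketch assumes POSIX-style paths compared component by component. It is set next to os.path.commonprefix, which compares character by character.

```python
import os.path
from pathlib import PurePosixPath


def path_prefix(path):
    """Everything up to the last slash; the basename is assumed to be a file."""
    if path.endswith("/"):
        # A trailing slash marks the whole path as a directory.
        return PurePosixPath(path)
    return PurePosixPath(path).parent


def common_path_prefix(*paths):
    """Hypothetical helper: common prefix of the path prefixes, by component."""
    part_lists = [path_prefix(p).parts for p in paths]
    common = []
    for components in zip(*part_lists):
        if len(set(components)) != 1:
            break
        common.append(components[0])
    # With nothing in common this yields PurePosixPath('.').
    return PurePosixPath(*common)


print(common_path_prefix("a/b/file.c", "a/b/file.c"))          # a/b
print(common_path_prefix("a/b/c/", "a/b/c/"))                  # a/b/c
print(os.path.commonprefix(["a/b/file.c", "a/b/filter.py"]))   # a/b/fil
print(common_path_prefix("a/b/file.c", "a/b/filter.py"))       # a/b
```

Under this convention the first example returns the containing directory without touching the filesystem, at the cost of requiring a trailing slash whenever the last component is itself meant as a directory.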

At any rate, we probably don't need to figure this out from first principles; I'm pretty sure some subset of {Java, Boost, Cocoa, .NET, JUCE, one overwhelmingly popular CPAN library, etc.} have already come up with an answer, and if most of them agree, we probably want to follow suit (even if it seems silly).

> And what about common_prefix('foo/bar', '../here/foo')? Or
> common_prefix('bar/baz', 'foo/../bar/baz')? Pathlib avoids collapsing
> .. because the meaning could change in the face of symlinks. I believe
> the same applies here. Maybe you need to call resolve() before doing
> the common prefix operation (but that gives an absolute path).
> 
> With the above limitations, would a common_prefix function actually
> help typical use cases? In my experience, doing list operations on
> pathobj.parts is often simple enough that I don't need specialised
> functions like common_prefix...
> 
> Getting the edge cases right is fiddly enough for common_prefix that a
> specialised function is a reasonable idea, but only if the "obvious"
> behaviour is clear. If there's a lot of conflicting possibilities,
> maybe a recipe in the docs would be a better option.
> 
> Paul
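
For reference, the kind of list operation on pathobj.parts mentioned in the quoted message might look like the sketch below. The function name common_parts is hypothetical, and the sketch assumes pure POSIX paths with no filesystem access, so '..' is left exactly as written.

```python
from pathlib import PurePosixPath


def common_parts(*paths):
    """Hypothetical recipe: longest shared leading run of path components."""
    part_lists = [PurePosixPath(p).parts for p in paths]
    shared = []
    for components in zip(*part_lists):
        if len(set(components)) != 1:
            break
        shared.append(components[0])
    return tuple(shared)


# pathlib keeps '..' as an ordinary component, so neither pair shares a prefix.
print(common_parts("foo/bar", "../here/foo"))      # ()
print(common_parts("bar/baz", "foo/../bar/baz"))   # ()

# Taken literally, identical paths share every component, basename included.
print(common_parts("a/b/file.c", "a/b/file.c"))    # ('a', 'b', 'file.c')
```

Whether 'foo/../bar/baz' ought to match 'bar/baz' depends on symlinks, which a pure-path recipe like this cannot see.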

