[Python-Dev] Re: os.path.commonprefix breakage

Skip Montanaro skip@mojam.com (Skip Montanaro)
Wed, 16 Aug 2000 23:41:59 -0500 (CDT)


    Fred> I'd guess that the path separator should only be appended if it's
    Fred> part of the passed-in strings; that would make it a legitimate
    Fred> part of the prefix.  If it isn't present for all of them, it
    Fred> shouldn't be part of the result:

    >>> os.path.commonprefix(["foo", "foo/bar"])
    'foo'

Hmmm... I think you're looking at it character-by-character again.  I see
three possibilities:

    * it's invalid to have a path with a trailing separator

    * it's okay to have a path with a trailing separator

    * it's required to have a path with a trailing separator

In the first and third cases, you have no choice.  In the second you have to
decide which would be best.

On Unix my preference would be to not include the trailing "/" for aesthetic
reasons.  The shell's pwd command, the os.getcwd function and the
os.path.normpath function all return directories without the trailing slash.
Also, while Python may not have this problem (and os.path.join seems to
normalize things), some external tools will interpret doubled "/" characters
as a single separator, while others (most notably Emacs) will treat the
second slash as "erase the prefix and start from /".
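
To make the trailing-slash point concrete, here is a small check of the two
library functions mentioned above.  It uses posixpath (the Unix flavour of
os.path) so it behaves the same on any platform; the example paths and the
variable names are made up for illustration.

```python
import posixpath  # Unix flavour of os.path, importable on any platform

# normpath drops a trailing slash and collapses doubled slashes.
trimmed = posixpath.normpath("/home/skip/")
collapsed = posixpath.normpath("/home//skip")

# join inserts a separator only when the left-hand side lacks one,
# so neither spelling of the directory produces "/home//skip".
joined_plain = posixpath.join("/home", "skip")
joined_slash = posixpath.join("/home/", "skip")
```

All four names end up bound to "/home/skip", which is why code that goes
through os.path.join doesn't need to care whether the directory came with a
trailing slash.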

In fact, the more I think about it, the more I think that Mark's reliance on
the trailing slash is a bug waiting to happen (indeed, it just happened
;-).  There's certainly nothing wrong (on Unix anyway) with paths that don't
end in a trailing slash, so if you're going to join paths together, you
ought to be using os.path.join.  To whack off prefixes, perhaps we need
something more general than os.path.split, so instead of

    prefix = os.path.commonprefix(files)
    for file in files:
        tail_portion = file[len(prefix):]

Mark would have used

    prefix = os.path.commonprefix(files)
    for file in files:
        tail_portion = os.path.splitprefix(prefix, file)[1]

The assumption being that

    os.path.splitprefix("/home", "/home/beluga/skip")

would return

    ["/home", "beluga/skip"]
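
A rough sketch of what such a function might look like, written as a
free-standing helper since os.path.splitprefix does not exist.  The name, the
argument order and the two-element list come from the example above; the
"sep" parameter, the ValueError for a non-matching prefix and the handling of
an empty prefix are my own assumptions.

```python
def splitprefix(prefix, path, sep="/"):
    """Hypothetical helper: return [prefix, remainder of path].

    The prefix must match on a path-component boundary, so "/home" is
    a prefix of "/home/skip" but not of "/homestead".
    """
    if not prefix:
        # Nothing to strip (assumed behaviour for an empty prefix).
        return [prefix, path]
    # Treat "/home" and "/home/" as the same prefix.
    stem = prefix.rstrip(sep)
    if path.rstrip(sep) == stem:
        # The path is the prefix itself; nothing is left over.
        return [prefix, ""]
    if not path.startswith(stem + sep):
        raise ValueError("%r is not a path prefix of %r" % (prefix, path))
    # Drop the prefix and the separator(s) that followed it.
    return [prefix, path[len(stem):].lstrip(sep)]


parts = splitprefix("/home", "/home/beluga/skip")
```

Here parts is ["/home", "beluga/skip"], and the tail has no leading slash
whether or not the prefix was passed with a trailing one.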

Alternatively, how about os.path.suffixes?  It would work similarly to
os.path.commonprefix, but instead of returning the prefix of a group of
files, it would return a list of the suffixes that remain after the common
prefix has been stripped from each file:

    >>> files = ["/home/swen", "/home/swanson", "/home/jules"]
    >>> prefix = os.path.commonprefix(files)
    >>> print prefix
    /home
    >>> suffixes = os.path.suffixes(prefix, files)
    >>> print suffixes
    ['swen', 'swanson', 'jules']
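
As a sketch only, since os.path.suffixes is just a proposal: the helper below
is one way it could behave.  To get a component-wise prefix it relies on
posixpath.commonpath, which only exists in much later Pythons (3.5 and up);
the ValueError for a file outside the prefix is an assumption of mine.

```python
import posixpath  # Unix flavour of os.path, importable on any platform


def suffixes(prefix, files, sep="/"):
    """Hypothetical helper: strip prefix (and its separator) from each file."""
    # Treat "/home" and "/home/" as the same prefix.
    stem = prefix.rstrip(sep)
    result = []
    for name in files:
        # Only accept matches that fall on a path-component boundary.
        if name != stem and not name.startswith(stem + sep):
            raise ValueError("%r is not under %r" % (name, prefix))
        result.append(name[len(stem):].lstrip(sep))
    return result


files = ["/home/swen", "/home/swanson", "/home/jules"]
prefix = posixpath.commonpath(files)  # component-wise common prefix
tails = suffixes(prefix, files)
```

With these inputs prefix is "/home" and tails is ["swen", "swanson", "jules"].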

Skip