[Python-Dev] The path module PEP

Björn Lindqvist bjourne at gmail.com
Tue Jan 24 21:22:01 CET 2006


The last time this was discussed, six months ago, it seemed like most
of python-dev fancied Jason Orendorff's path module. But Guido wanted
a PEP and no one created one. So I decided to claim the fame and write
one since I also love the path module. :) Much of it is copy-pasted
from Peter Astrand's PEP 324, many thanks to him.


#################################################
PEP: XXX
Title: path - Object oriented handling of filesystem paths
Version: $Revision: XXXX $
Last-Modified: $Date: 2006-01-24 19:24:59 +0100 (Tue, 24 Jan 2006) $
Author: Björn Lindqvist <bjourne at gmail.com>
Status: Draft
Type: Standards Track (library)
Created: 24-Jan-2006
Content-Type: text/plain
Python-Version: 2.4


Abstract

    This PEP describes a new class for handling paths in an object
    oriented fashion.


Motivation

    Dealing with filesystem paths is a common task in any programming
    language, and very common in a high-level language like
    Python. Good support for this task is needed, because:

    - Almost every program uses paths to access files. It makes sense
      that a task that is performed so often should be as intuitive
      and as easy to perform as possible.

    - It makes Python an even better replacement language for
      over-complicated shell scripts.

    Currently, Python has a large number of different functions
    scattered over half a dozen modules for handling paths. This makes
    it hard for newbies and experienced developers alike to choose the
    right method.

    The Path class provides the following enhancements over the
    current common practice:

    - One "unified" object provides all functionality from previous
      functions.

    - Subclassability - the Path object can be extended to support
      paths other than filesystem paths. The programmer does not need
      to learn a new API, but can reuse his or her knowledge of Path
      to deal with the extended class.

    - With all related functionality in one place, the right approach
      is easier to learn as one does not have to hunt through many
      different modules for the right functions.

    - Python is an object oriented language. Just as files, datetimes
      and sockets are objects, so are paths; they are not merely
      strings to be passed to functions. Path objects are inherently
      a pythonic idea.

    - Path takes advantage of properties. Properties make for more
      readable code.

      if imgpath.ext == '.jpg':
          jpegdecode(imgpath)

      Is better than:

      if os.path.splitext(imgpath)[1] == '.jpg':
          jpegdecode(imgpath)
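
    As a minimal sketch of how such a property could work, assuming a
    hypothetical stand-in class (not the reference implementation) and
    modern Python syntax, ext can simply wrap os.path.splitext, which
    returns the extension including its leading dot:

```python
import os.path


class Path(str):
    """Hypothetical minimal stand-in for the proposed class."""

    @property
    def ext(self):
        # Same value as os.path.splitext(self)[1]; includes the leading dot.
        return os.path.splitext(self)[1]


imgpath = Path("holiday.jpg")
print(imgpath.ext == ".jpg")                    # True
print(os.path.splitext(imgpath)[1] == ".jpg")   # True
```

    Both spellings compare against the same value; the property only
    changes how the call site reads.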


Rationale

    The following points summarize the design:

    - Path extends from string, therefore all code which expects
      string pathnames need not be modified and no existing code will
      break.

    - A Path object can be created either by using the classmethod
      Path.cwd, by instantiating the class with a string representing
      a path or by using the default constructor which is equivalent
      to Path(".").

    - Path provides for common pathname manipulation, pattern
      expansion, pattern matching and other high-level file operations
      including copying. Basically everything path related except for
      manipulation of file contents which file objects are better
      suited for.

    - Platform incompatibilities are dealt with by not instantiating
      system specific methods.
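
    The first two design points can be illustrated with a small
    sketch. It is written against modern Python and uses a
    hypothetical, heavily reduced Path class; it is not the reference
    implementation, only an illustration of subclassing str, the
    default constructor, the cwd classmethod and the "/" operator:

```python
import os


class Path(str):
    """Hypothetical sketch of the design points, not the reference implementation."""

    def __new__(cls, init=os.curdir):
        # The default constructor is equivalent to Path(".").
        return str.__new__(cls, init)

    @classmethod
    def cwd(cls):
        # Alternative constructor wrapping os.getcwd().
        return cls(os.getcwd())

    def __truediv__(self, rel):
        # Path("a") / "b" joins components like os.path.join.
        return self.__class__(os.path.join(self, rel))

    def __repr__(self):
        return "Path(%s)" % str.__repr__(self)


p = Path("foo") / "bar"
print(isinstance(p, str))      # a Path is still a string
print(os.path.basename(p))     # existing string-based functions accept it
```

    Under Python 2 without "from __future__ import division", the "/"
    operator calls __div__ instead of __truediv__, which is why the
    specification lists both.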


Specification

    This class defines the following public methods:

        # Special Python methods.
        def __new__(cls, init = os.curdir): ...
        def __repr__(self): ...
        def __add__(self, more): ...
        def __radd__(self, other): ...
        def __div__(self, rel): ...
        def __truediv__(self, rel): ...

        # Alternative constructor.
        def cwd(cls): ...

        # Operations on path strings.
        def abspath(self): ...
        def normcase(self): ...
        def normpath(self): ...
        def realpath(self): ...
        def expanduser(self): ...
        def expandvars(self): ...
        def dirname(self): ...
        def basename(self): ...
        def expand(self): ...
        def splitpath(self): ...
        def splitdrive(self): ...
        def splitext(self): ...
        def stripext(self): ...
        def splitunc(self): ... [1]
        def joinpath(self, *args): ...
        def splitall(self): ...
        def relpath(self): ...
        def relpathto(self, dest): ...

        # Properties about the path.
        parent, name, namebase, ext, drive, uncshare[1]

        # Operations that return lists of paths.
        def listdir(self, pattern = None): ...
        def dirs(self, pattern = None): ...
        def files(self, pattern = None): ...
        def walk(self, pattern = None): ...
        def walkdirs(self, pattern = None): ...
        def walkfiles(self, pattern = None): ...
        def match(self, pattern): ...
        def matchcase(self, pattern): ...
        def glob(self, pattern): ...

        # Methods for retrieving information about the filesystem
        # path.
        def exists(self): ...
        def isabs(self): ...
        def isdir(self): ...
        def isfile(self): ...
        def islink(self): ...
        def ismount(self): ...
        def samefile(self, other): ... [1]
        def getatime(self): ...
        def getmtime(self): ...
        def getctime(self): ...
        def getsize(self): ...
        def access(self, mode): ... [1]
        def stat(self): ...
        def lstat(self): ...
        def statvfs(self): ... [1]
        def pathconf(self, name): ... [1]

        # Filesystem properties for path.
        atime, mtime, ctime, size

        # Methods for manipulating information about the filesystem
        # path.
        def utime(self, times): ...
        def chmod(self, mode): ...
        def chown(self, uid, gid): ... [1]
        def rename(self, new): ...
        def renames(self, new): ...

        # Create/delete operations on directories
        def mkdir(self, mode = 0777): ...
        def makedirs(self, mode = 0777): ...
        def rmdir(self): ...
        def removedirs(self): ...

        # Modifying operations on files
        def touch(self): ...
        def remove(self): ...
        def unlink(self): ...

        # Modifying operations on links
        def link(self, newpath): ...
        def symlink(self, newlink): ...
        def readlink(self): ...
        def readlinkabs(self): ...

        # High-level functions from shutil
        def copyfile(self, dst): ...
        def copymode(self, dst): ...
        def copystat(self, dst): ...
        def copy(self, dst): ...
        def copy2(self, dst): ...
        def copytree(self, dst, symlinks = True): ...
        def move(self, dst): ...
        def rmtree(self, ignore_errors = False, onerror = None): ...

        # Special stuff from os
        def chroot(self): ... [1]
        def startfile(self): ... [1]

    [1] - Method is not available on all platforms.
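
    One way the methods marked [1] could be left out on platforms
    that lack them is to define them conditionally in the class body.
    The following is only a sketch under that assumption, using a
    hypothetical reduced Path class in modern Python; the PEP itself
    does not prescribe the mechanism:

```python
import os


class Path(str):
    """Hypothetical sketch: only define methods the platform supports."""

    def stat(self):
        return os.stat(self)

    # os.chroot is not available on every platform, so the method is
    # defined only when the underlying function is present.
    if hasattr(os, "chroot"):
        def chroot(self):
            os.chroot(self)

    # os.startfile exists only on Windows.
    if hasattr(os, "startfile"):
        def startfile(self):
            os.startfile(self)


print(hasattr(Path, "chroot") == hasattr(os, "chroot"))
print(hasattr(Path, "startfile") == hasattr(os, "startfile"))
```

    Callers can then test for a method with hasattr(), the same way
    they would test the os module today.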


Replacing older functions with the Path class

    In this section, "a ==> b" means that b can be used as a
    replacement for a.

    In the following examples, we assume that the Path class is
    imported with "from path import Path".

    Replacing os.path.join
    ----------------------

    os.path.join(os.getcwd(), "foobar")
    ==>
    Path.cwd() / "foobar"


    Replacing os.path.splitext
    --------------------------

    os.path.splitext("Python2.4.tar.gz")[1]
    ==>
    Path("Python2.4.tar.gz").ext


    Replacing glob.glob
    -------------------

    glob.glob("/lib/*.so")
    ==>
    Path("/lib").glob("*.so")
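
    To make the equivalence concrete, here is a sketch of how a
    glob() method could be layered on glob.glob. The Path class shown
    is a hypothetical reduced stand-in written for modern Python, and
    the demonstration uses a temporary directory rather than /lib:

```python
import glob
import os
import shutil
import tempfile


class Path(str):
    """Hypothetical reduced stand-in for the proposed class."""

    def __truediv__(self, rel):
        return self.__class__(os.path.join(self, rel))

    def glob(self, pattern):
        # Expand the pattern relative to this directory.
        matches = glob.glob(os.path.join(self, pattern))
        return [self.__class__(m) for m in matches]


# Demonstrate on a throwaway directory.
tmp = Path(tempfile.mkdtemp())
for name in ("a.so", "b.so", "c.txt"):
    open(tmp / name, "w").close()

old = sorted(glob.glob(os.path.join(tmp, "*.so")))
new = sorted(tmp.glob("*.so"))
print(old == new)   # both spellings find the same files

shutil.rmtree(tmp)
```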


Deprecations

    Introducing this module to the standard library introduces the
    need to deprecate a number of existing modules and functions. The
    table below lists the existing functionality that must be
    deprecated.

        PATH METHOD         DEPRECATES FUNCTION
        normcase()          os.path.normcase()
        normpath()          os.path.normpath()
        realpath()          os.path.realpath()
        expanduser()        os.path.expanduser()
        expandvars()        os.path.expandvars()
        dirname()           os.path.dirname()
        basename()          os.path.basename()
        splitpath()         os.path.split()
        splitdrive()        os.path.splitdrive()
        splitext()          os.path.splitext()
        splitunc()          os.path.splitunc()
        joinpath()          os.path.join()
        listdir()           os.listdir() [fnmatch.filter()]
        match()             fnmatch.fnmatch()
        matchcase()         fnmatch.fnmatchcase()
        glob()              glob.glob()
        exists()            os.path.exists()
        isabs()             os.path.isabs()
        isdir()             os.path.isdir()
        isfile()            os.path.isfile()
        islink()            os.path.islink()
        ismount()           os.path.ismount()
        samefile()          os.path.samefile()
        getatime()          os.path.getatime()
        getmtime()          os.path.getmtime()
        getsize()           os.path.getsize()
        cwd()               os.getcwd()
        access()            os.access()
        stat()              os.stat()
        lstat()             os.lstat()
        statvfs()           os.statvfs()
        pathconf()          os.pathconf()
        utime()             os.utime()
        chmod()             os.chmod()
        chown()             os.chown()
        rename()            os.rename()
        renames()           os.renames()
        mkdir()             os.mkdir()
        makedirs()          os.makedirs()
        rmdir()             os.rmdir()
        removedirs()        os.removedirs()
        remove()            os.remove()
        unlink()            os.unlink()
        link()              os.link()
        symlink()           os.symlink()
        readlink()          os.readlink()
        chroot()            os.chroot()
        startfile()         os.startfile()
        copyfile()          shutil.copyfile()
        copymode()          shutil.copymode()
        copystat()          shutil.copystat()
        copy()              shutil.copy()
        copy2()             shutil.copy2()
        copytree()          shutil.copytree()
        move()              shutil.move()
        rmtree()            shutil.rmtree()

    The Path class deprecates the whole of os.path, shutil, fnmatch
    and glob. A big chunk of os is also deprecated.


Open Issues

    Some functionality of Jason Orendorff's path module has been
    omitted:

    * Function for opening a path - better handled by the builtin
      open().

    * Functions for reading and writing a whole file - better handled
      by file objects' read() and write() methods.

    * A chdir() function may be a worthy inclusion.

    * A deprecation schedule needs to be set up. How much functionality
      should Path implement? How much of existing functionality should
      it deprecate and when?

    * Where should the class be placed and what should it be called?

    The functions and modules that this new module is trying to
    replace (os.path, shutil, fnmatch, glob and parts of os) are
    expected to be available in future Python versions for a long
    time, to preserve backwards compatibility.


Reference Implementation

    Currently, the Path class is implemented as a thin wrapper around
    the standard library modules sys, os, fnmatch, glob and
    shutil. The intention of this PEP is to move functionality from
    the aforementioned modules to Path while they are being
    deprecated.

    For more detail and diverging implementations, see:

        * http://www.jorendorff.com/articles/python/path/path.py
        * http://svn.python.org/projects/sandbox/trunk/path/path.py
        * http://cafepy.com/quixote_extras/rex/path/enhpath.py


Examples

    In this section, "a ==> b" means that b can be used as a
    replacement for a.

    1. Make all python files in a directory executable:

        DIR = '/usr/home/guido/bin'
        for f in os.listdir(DIR):
            if f.endswith('.py'):
                path = os.path.join(DIR, f)
                os.chmod(path, 0755)
        ==>
        for f in Path('/usr/home/guido/bin').files("*.py"):
            f.chmod(0755)

    2. Delete emacs backup files:

        def delete_backups(arg, dirname, names):
            for name in names:
                if name.endswith('~'):
                    os.remove(os.path.join(dirname, name))
        os.path.walk(os.environ['HOME'], delete_backups, None)
        ==>
        d = Path(os.environ['HOME'])
        for f in d.walkfiles('*~'):
            f.remove()

    3. Finding the relative path to a file:

        b = Path('/users/peter/')
        a = Path('/users/peter/synergy/tiki.txt')
        b.relpathto(a)

    4. Splitting a path into directory and filename:

        os.path.split("/path/to/foo/bar.txt")
        ==>
        Path("/path/to/foo/bar.txt").splitpath()

    5. List all Python scripts in the current directory tree:

        list(Path().walkfiles("*.py"))

    6. Create directory paths:

        os.path.join("foo", "bar", "baz")
        ==>
        Path("foo") / "bar" / "baz"
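
    For readers who want to see what a pattern-filtering walk might
    look like underneath, the sketch below builds walkfiles() on
    os.walk and fnmatch. It is an illustration under assumptions: the
    reduced Path class is hypothetical, the code targets modern
    Python, and it works in a temporary directory instead of $HOME:

```python
import fnmatch
import os
import shutil
import tempfile


class Path(str):
    """Hypothetical reduced stand-in for the proposed class."""

    def walkfiles(self, pattern=None):
        # Yield every file below this directory, optionally filtered by
        # an fnmatch-style pattern applied to the file name.
        for dirpath, dirnames, filenames in os.walk(self):
            for name in filenames:
                if pattern is None or fnmatch.fnmatch(name, pattern):
                    yield self.__class__(os.path.join(dirpath, name))


root = Path(tempfile.mkdtemp())
os.makedirs(os.path.join(root, "sub"))
for rel in ("notes.txt", "notes.txt~", os.path.join("sub", "old.py~")):
    open(os.path.join(root, rel), "w").close()

# Delete the backup files, as in example 2.
for f in root.walkfiles("*~"):
    os.remove(f)

remaining = [os.path.basename(f) for f in root.walkfiles()]
print(remaining)   # ['notes.txt']

shutil.rmtree(root)
```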


Copyright

    This document has been placed in the public domain.


Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
End:

--
mvh Björn

