[py-dev] thoughts on embedding files

Matthew Scott mscott at goldenspud.com
Thu Jan 27 07:33:53 CET 2005


Matthew Scott wrote:
> (cross-posting this to both the Schevo and Py mailing lists, to get the 
> input of both sets of people)
> 
> I will be writing more as I discover more about py.path's innards, and 
> determine whether this would be a good choice or not for embedding files.

As promised, here is more.

I've begun working on a py.path.embed implementation.  Once I've 
completed a working first pass at it, I'll post a patch to the list for 
review.

So far, I've skimmed over the py.path.local API, and have also written 
down some thoughts and "forward-looking documentation" on how I think 
this could work.

Before I paste that in though, I'll preface it by saying that the first 
implementation of embedding files within Schevo was very specific.  That 
is, for schema, icons, and general files, there were (well... still are) 
three different ways of transforming files into Python modules, and 
three different sets of loaders that each had "filesystem loaders" and 
"python object loaders".

In short, lots of duplication and specialization with little gain.

What I plan to do once py.path.embed is implemented is to convert 
Schevo's schema and icon loaders to use py.path.embed exclusively, so 
that it can transparently use either local files and directories, or 
embedded "virtual" files and directories.

And with that out of the way, here are my notes, straight from the 
py.__impl__.path.embed docstring  :-)

"""Embedded path implementation.

Introduction
============

This path implementation, `py.path.embed`, lets you use a single API
to access either a set of files that are within a common directory, or
a Python object that contains 'virtual' files and directories that can
be embedded in a Python module.

No matter which mode you're in, you can do nearly all typical
path/file operations that you would use if you were using
`py.path.local` objects.  You can also transform a set of real files
into a dictionary suitable for embedding within a Python module, and
vice versa.

Such embedding of data using a file/directory metaphor allows you to
develop an application's external data (such as image files for use in
a user interface) using standard files and directories, which are easy
to use with external tools such as editors and source code
repositories.

When packaging these files in a form that will be distributed to
end-users, it is often desirable to not expose these files and
directories to those users.  As long as you are using `py.path.embed`
to work with sets of files located in your local file system, you can
embed those same files into a Python module, which can in turn be
included within your distributed application alongside the other
required modules.

Contexts
========

`py.path.embed` is not a general-purpose `py.path` implementation that
can work with any local path.  It works within *contexts*, which are
self-contained sets of files and directories.

There are two types of contexts:

- A *Local context* is a collection of all files and directories that
   are within a single base directory.

   A local context is created based on a `py.path.local` object, which
   becomes the *root* of the context.

   When using `py.path.embed` to access paths, files, and directories
   within a local context, it essentially acts as a proxy to the
   `py.path.local` root path of the context and the real file objects
   that `py.path.local` makes available.

- A *Virtual context* is a collection of virtual files and directories
   that are stored as strings and dictionaries within a Python
   dictionary.

   A virtual context is created based on either an existing dictionary
   created by `py.path.embed`, or an empty dictionary.  The dictionary
   becomes the *root* of the context.

   When using `py.path.embed` to access paths, files, and directories
   within a virtual context, it provides an API that acts as if you
   were accessing paths via `py.path.local` and provides file-like
   objects that act similarly to actual file objects.

Transformation
==============

The tranformation of files and directories between local and virtual
contexts is perhaps the primary reason that `py.path.embed`
exists. Thankfully, this becomes an easy process since `py.path.embed`
closely mirrors the `py.path.local` API.

To transform from a local directory to a module containing a
dictionary, follow these steps:

1. Create a local context `localCtx` whose root is the local
    directory.  Create a corresponding `py.path.embed` instance called
    `local` using that context::

      root = py.path.local('/path/to/the/files/to/embed')
      localCtx = py.path.embed.LocalContext(root)
      local = py.path.embed(localCtx)

2. Create a virtual context `virtualCtx` whose root is an empty
    dictionary.  Create a corresponding `py.path.embed` instance called
    `virtual` using that context::

      virtualCtx = py.path.embed.VirtualContext()
      virtual = py.path.embed(virtualCtx)

3. Populate the virtual context with copies of local files::

      local.copy(target=virtual)

4. Open a local file for output which will be your Python module
    containing the embedded files::

      foo = open('foo.py', 'w')

5. Write a representation of the virtual context's root dictionary to
    the Python module::

      foo.write('root = %r\n' % virtualCtx.root)
      foo.close()

Of course, if you want to populate the virtual context from scratch
using a more complex algorithm than simply copying files from the
local filesystem, you can simply create the virtual context and
associated `py.path.embed` object, and use it as if you were creating
files using `py.path.local`.

Once you have this module, you can now use code similar to the
following to attempt to load the embedded version that you distribute,
then fall back to the local version that you use during development if
the embedded version couldn't be loaded::

   try:
     import foo
   except ImportError:
     ctx = py.path.embed.LocalContext(root=py.path.local(...))
   else:
     ctx = py.path.embed.VirtualContext(root=foo.root)
   rootPath = py.path.embed(ctx)

"""



More information about the Pytest-dev mailing list