[Import-SIG] Loading Resources From a Python Module/Package

Sat Jan 31 17:19:25 CET 2015

On Sat Jan 31 2015 at 10:34:45 AM Donald Stufft <donald at stufft.io> wrote:

>
> On Jan 31, 2015, at 9:48 AM, Brett Cannon <brett at python.org> wrote:
>
> The reason Loader.get_data() takes absolute paths is to do away with
> ambiguity. If you have a relative path and ask a loader to read that path,
> where should that relative path be anchored? Should it be the top-level
> package? What about the module that loader was returned to handle? But
> then what about if a finder caches loaders and reuses them across modules
> (nothing in PEP 302 says you can't do this and in actuality the frozen and
> built-in loaders are just static and class methods). The choice of dealing
> exclusively in absolute paths was a conscious choice on my part.
>
> Now having said that, there is nothing to say absolute paths require file
> system-based paths. What you should really do is think of these paths as
> opaque, non-ambiguous paths for the loader which claimed it knew what file
> path was needed to pass to get_data(). If you think that way then you
> realize you can use markers in the path as necessary, e.g.
> some/path/file.zip/pkg/sub/data.txt. As long as loader.get_data() can
> unambiguously read that path as returned by get_data_filename() or whatever
> the method is called then you have fully abstracted paths out while still
> being able to read data from a loader.
>
> Basically any API dealing with paths for loaders needs to abstract away
> the concept of files, file-like paths, etc. and rely on using the loader
> API on pretty much everything as a simple os.path of its own. This is why I
> have not tried to tackle the issue of the list_contents() or some such API
> to list modules and potentially data files as it needs to not really have a
> concrete concept of file paths (and it really should be on finders and not
> loaders which complicates discovery, selecting the right finder, etc.).
> This is also why APIs wanting a file path instead of taking a file-like
> object simply cannot play well with importlib and loaders which have
> alternative back end storage without simply being lucky that the loader
> they are working with uses filesystem paths (or writing out to a temp file).
>
>
> I think that dealing in absolute file paths (whether they are “real” paths
> or not) makes the APIs super hard to use in anything but the simple case.
>

I think we are talking about two different things when we say "relative"; I
clarify later.

> For instance what do you do in a namespace package (either PEP 420 or one
> that extends the module __path__)? There you have multiple candidate file
> paths and no good way to figure out which one you need to use. It requires
> that your code couple itself with the implementation of the package, and it
> will break if someone changes from a module to a namespace package.
>

Yep, but that's just life. If you're reading data out of a package anyway
then you are already coupled to its structure so this is no different.

>
> The way the PEP 302 Loaders work isn’t super obvious to me, so I’m looking
> at the implementation and making assumptions about it and I thought that it
> was one Loader per importable name. Looking closer it appears the way you
> “import” a module from a Loader is using Loader().exec_module(“foo.bar”).
> So I’d say then that the Loader() APIs should be
> Loader().get_bytes(“foo.bar”, “relative/to/foo.bar/file.txt”). That should
> resolve the case about not knowing what it should be relative to, since it
> should be relative to the name given. Then the Loader() can encapsulate the
> logic about how to turn “foo.bar” + “relative/to/foo.bar/file.txt” into an
> absolute path to get some data (or something else).
>

Yes, specifying the package anchor point does away with the ambiguity of
relativity as it has an absolute position in a namespace. As long as we do
**that** then there are no relative paths to speak of as all the
information necessary to calculate an absolute path without ambiguity is
provided.
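
To make the point above concrete, here is a minimal sketch, not an
implementation from this thread. DictLoader and its storage layout are
invented for illustration, and get_bytes() is only the method name proposed
in the quoted message; it is not part of importlib. The sketch assumes a
loader whose "absolute paths" are opaque keys, and shows that a module name
plus a resource name is enough to build one such key without ambiguity.

```python
class DictLoader:
    """Toy loader (hypothetical): its 'absolute paths' are opaque keys."""

    def __init__(self, root, files):
        self._root = root    # opaque prefix, e.g. "some/path/file.zip"
        self._files = files  # maps opaque absolute path -> bytes

    def get_data(self, path):
        # Same contract as the existing get_data(): the caller passes a
        # complete, unambiguous path; OSError if it cannot be read.
        try:
            return self._files[path]
        except KeyError:
            raise OSError("cannot read {!r}".format(path))

    def get_bytes(self, name, resource):
        # Hypothetical API from the thread: the dotted module name is the
        # anchor, so name + resource fully determine the opaque path.
        anchor = "/".join([self._root] + name.split("."))
        return self.get_data(anchor + "/" + resource)


loader = DictLoader(
    "some/path/file.zip",
    {"some/path/file.zip/pkg/sub/data.txt": b"example"},
)
data = loader.get_bytes("pkg.sub", "data.txt")
```

Nothing here touches the file system; the caller never sees or builds the
opaque path, which is the abstraction being argued for.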

>
> It seems obvious to me that requiring a full path like that is the wrong
> way to expect people to work with constructing full paths for resources. It
> would be similar to expecting people to do ``import
> /data/foo.zip/submodule``. The import system should be abstracting all of
> that away for them.
>

I think what you mean by "relative" and what I mean by "relative" are
different. When I say "relative" I mean what you pass to loader.get_data().
What you mean by "relative" is, I think, the "file.txt" part of a call to
get_bytes('some.module', "file.txt"), which I don't consider relative as you
specify everything needed for an absolute path. IOW I'm talking about the existing
API and its semantics and you're talking in terms of your new API, so we
are talking past each other. =)
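
The two meanings can be put side by side with the standard library as it
exists today. This is a rough sketch under assumptions: the helper names
read_via_loader and read_via_anchor are invented here, and the first helper
assumes a regular (non-namespace) package stored as plain files, so that
spec.origin points at its __init__.py. pkgutil.get_data() is a real function
that already takes a package name plus a resource name.

```python
import importlib.util
import os
import pkgutil


def read_via_loader(package, resource):
    """Existing API: the caller builds the full path for get_data()."""
    spec = importlib.util.find_spec(package)
    # Assumes spec.origin is the package's __init__.py on disk.
    absolute = os.path.join(os.path.dirname(spec.origin), resource)
    return spec.loader.get_data(absolute)


def read_via_anchor(package, resource):
    """Package-anchored style: the path is resolved for the caller."""
    # pkgutil.get_data() imports the package and calls the loader's
    # get_data() with a path it computes itself.
    return pkgutil.get_data(package, resource)
```

For a package installed as ordinary files, both helpers should return the
same bytes; only the second keeps path construction out of the caller's
hands.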

-Brett

>
> ---
> Donald Stufft
> PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
>
>