[Import-SIG] making it feasible to rely on loaders for reading intra-package data files

Nick Coghlan ncoghlan at gmail.com
Tue Feb 4 16:55:04 CET 2014


On 4 February 2014 07:00, Barry Warsaw <barry at python.org> wrote:
> On Feb 01, 2014, at 01:44 PM, Brett Cannon wrote:
>
>>Over on distutils-sig it came up that getting people to not simply assume
>>that __file__ points to an actual file and thus avoid using open() directly
>>to read intra-package files is an issue.
>
> I've always recommended that people use the Resource Manager APIs of
> pkg_resources to get at in-package data[*].  Those have always been the most
> reliable APIs AFAICT, but it's a shame that they're not available in the
> stdlib in any kind of backward compatible way.  Maybe the breadth or
> implementation of pkg_resources prevents it from being adopted wholesale into
> stdlib (and of course, it's too late for 3.4), but I really think we need
> something like that which we can promote loud and far.  And then there's PEP
> 365.

The problem with trying to use pkg_resources is that it conflates
multiple concepts in a hard-to-disentangle way, and its import-time
side effects on sys.path are brutally confusing if you're trying to
use it to depend on non-default versions of a package on Fedora. You
have to get __main__.__requires__ set before pkg_resources is
imported, which means you're in a world of pain if you're trying to
run inside something like sphinx, gunicorn or nosetests that uses a
pkg_resources-dependent wrapper script. Instead of using the normal
CLI for those tools, you have to bypass that script to avoid
importing pkg_resources too early, and thus you end up with invocation
gems like these ones from Beaker:

===
args=[sys.executable, '-c',
                '__requires__ = ["CherryPy < 3.0"]; import pkg_resources; ' \
                'from gunicorn.app.wsgiapp import run; run()' ...
===
python -c '__requires__ = ["CherryPy < 3.0"]; import pkg_resources;
from nose.core import main; main()'
===
python -c '__requires__ = [$(SPHINXREQUIRES)]; import pkg_resources; \ ...
===

We have to do that so we can get our multi-version support
requirements into place without the underlying utility choosing the
wrong version of key dependencies by default as a side effect of
importing pkg_resources to look for the project's entry point.
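
As a rough sketch of the pattern in those snippets (the helper name
pinned_argv is made up for illustration and is not taken from Beaker),
the idea is to build a "python -c" command line that assigns
__requires__ in __main__ before anything gets a chance to import
pkg_resources:

```python
import sys


def pinned_argv(requirements, import_stmt, call):
    """Build an argv that sets __requires__ before pkg_resources is imported.

    Hypothetical helper, for illustration only.
    """
    # repr() of a list of strings is a valid Python list literal, so it
    # can be pasted straight into the -c source text.
    code = "__requires__ = {!r}; import pkg_resources; {}; {}".format(
        list(requirements), import_stmt, call)
    return [sys.executable, "-c", code]


# Mirrors the nose invocation quoted above; nothing is executed here,
# the list would be handed to something like subprocess.call().
argv = pinned_argv(["CherryPy < 3.0"], "from nose.core import main", "main()")
```

The ordering inside the -c string is the whole point: the assignment
has to come first, because pkg_resources reads __main__.__requires__
when it is imported.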

The two core problems from my point of view are that pkg_resources is
difficult to comprehend and difficult to refactor. It is difficult to
comprehend because so much of it relies on implicit side effects (as
triggers react to data changes), and because it has non-trivial side
effects on process-global state at import time that may cause
failures later. It is difficult to refactor because it's hard to tell
what is a guaranteed API and what can be safely changed. There are
also a couple of thorny usability bugs that confused even me for a
while, and I have a pretty good idea how the import system works:
https://bitbucket.org/pypa/setuptools/issue/6/pkg_resources-merrily-adds-site-packages
and https://bitbucket.org/pypa/setuptools/issue/2/emit-less-cryptic-error-message-for-a

However, once you figure out those arcane workarounds and usability
traps (or if you're always using virtual environments and hence never
run into them), pkg_resources *works well*. It's only if you're trying
to use it in a shared distro environment with multi-level constructs
that it can cause trouble.

I have some ideas on how to fix those issues (see
https://bitbucket.org/pypa/import_resources/overview), but it hasn't
made it to the top of my todo list in a very long time (and doesn't
appear likely to get there any time soon, either).

> pkgutil.get_data() is as close as the stdlib comes I think, but it's not
> enough since sometimes you actually need a file name, or some of the other
> pkg_resources APIs.
>
> -Barry
>
> [*] Specifically: resource_exists(), resource_stream(), resource_string(),
> resource_isdir(), resource_listdir().

But unfortunately, you can't even import pkg_resources to get at those
without it version locking your entire sys.path.
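
For comparison, here is a minimal sketch of the stdlib route Barry
mentions, pkgutil.get_data(). The package name demo_pkg and the file
greeting.txt are invented for the example, and the package is created
in a temporary directory purely so the snippet is self-contained:

```python
import importlib
import os
import pkgutil
import sys
import tempfile

# Build a throwaway package on disk. "demo_pkg" and "greeting.txt" are
# made-up names used only for this illustration.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")
os.mkdir(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("")
with open(os.path.join(pkg_dir, "greeting.txt"), "wb") as f:
    f.write(b"hello\n")

sys.path.insert(0, root)
importlib.invalidate_caches()

# get_data() imports the package and asks its loader for the bytes, so
# the calling code never assumes that __file__ names a real file.
data = pkgutil.get_data("demo_pkg", "greeting.txt")
print(data)
```

This only returns the contents as bytes; as Barry notes, it doesn't
help when you actually need a file name.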

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

