[issue42531] importlib.resources.path() raises TypeError for packages without __file__
William Schwartz
report at bugs.python.org
Thu Dec 3 11:07:46 EST 2020
William Schwartz <wkschwartz at gmail.com> added the comment:
> If the issue has been fixed on Python 3.9 but not on 3.8, then it was likely a redesign that enabled the improved behavior
That appears to be the case: path() shares code with files().
> a redesign that won't be ported back to Python 3.8 and earlier.
Nor should it.
> In these situations, the best recommendation is often to just rely on importlib_resources (the backport) for those older Python versions.
I do not need the files() API or anything else added in 3.9 at this time. I just need the existing 3.7/3.8 functionality to work as documented.
> have you considered using importlib_resources for Python 3.8 and earlier? That would likely also address the issue and you could adopt it sooner.
My application is sensitive to the size of the installed site-packages, both in bytes and in the number of packages. Yes, importlib_resources would very likely solve the problem reported in the OP. However, I don't need the files() API, so adding an extra requirement feels like a heavy solution.
> To some extent, the behavior you've described could be considered a bug or could be considered a feature request (add support for path on packages that have no __spec__.origin). I am not aware whether this limitation was by design or incidental.
I agree there should be a high bar for patching old versions, but I posit that the behavior is an unintentional bug. In particular, I believe the behavior contradicts the documentation. Below I link and quote relevant documentation.
The function in question:
> importlib.resources.path(package, resource)¶
> ...package is either a name or a module object which conforms to the
> Package requirements.
https://docs.python.org/3.8/library/importlib.html#importlib.resources.path
So we jump to Package:
> importlib.resources.Package
> The Package type is defined as Union[str, ModuleType]. This means that
> where the function describes accepting a Package, you can pass in either a
> string or a module. Module objects must have a resolvable
> __spec__.submodule_search_locations that is not None.
https://docs.python.org/3.8/library/importlib.html#importlib.resources.Package
The Package type restricts the types of modules based on __spec__.submodule_search_locations. This suggests to me that the original author thought about which __spec__s to accept and which to reject but chose not to say anything about __spec__.origin, which is documented as possibly being None:
> class importlib.machinery.ModuleSpec(...)
> ...module.__spec__.origin == module.__file__.... Normally “origin” should
> be set, but it may be None (the default) which indicates it is unspecified
> (e.g. for namespace packages).
https://docs.python.org/3.8/library/importlib.html#importlib.machinery.ModuleSpec.origin
In particular, __spec__.origin *should* be None in certain situations:
> __file__
> __file__ is optional.... The import system may opt to leave __file__ unset
> if it has no semantic meaning (e.g. a module loaded from a database).
https://docs.python.org/3.8/reference/import.html#__file__
Taken together, the foregoing passages describe an `import` API in which path() works for all modules that are packages (i.e., __spec__.submodule_search_locations is not None), and in which some packages' __spec__.origin is None. That path() fails in practice to support some packages is therefore a bug, not the absence of a feature.
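To make the kind of package under discussion concrete, here is a minimal sketch of a module object that meets the documented Package requirement while having no __spec__.origin and no __file__. The package name "dbpkg" and the helper meets_package_requirement() are hypothetical, chosen only for illustration. The sketch does not call path(); it only shows the module state described above.

```python
import types
from importlib.machinery import ModuleSpec


def meets_package_requirement(module):
    """Hypothetical helper: check the documented Package rule for a module.

    The docs require a resolvable __spec__.submodule_search_locations
    that is not None; they say nothing about __spec__.origin.
    """
    spec = getattr(module, "__spec__", None)
    return spec is not None and spec.submodule_search_locations is not None


# is_package=True gives the spec an empty submodule_search_locations list;
# origin is left at its default of None.
spec = ModuleSpec("dbpkg", loader=None, is_package=True)

# Build a bare module by hand, as a custom import system might,
# and leave __file__ unset.
module = types.ModuleType("dbpkg")
module.__spec__ = spec

print(meets_package_requirement(module))  # the Package rule is satisfied
print(spec.origin)                        # origin was never specified
print(hasattr(module, "__file__"))        # __file__ was never set
```

A real importer (for example, one loading modules from a database or a frozen executable) would also supply a loader; that part is omitted here.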
Regardless of whether PR 23611 is accepted, the test that it adds should be added to Python master to guard against regressions. I can submit that as a separate PR. Before I do that, do I need to create a new bpo ticket, or can I just reference this one?
----------
_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue42531>
_______________________________________