[Python-checkins] peps: Big update to PEP 451 in response to feedback.
eric.snow
python-checkins at python.org
Wed Aug 28 10:45:53 CEST 2013
http://hg.python.org/peps/rev/f139785b970b
changeset: 5081:f139785b970b
user: Eric Snow <ericsnowcurrently at gmail.com>
date: Wed Aug 28 02:43:45 2013 -0600
summary:
Big update to PEP 451 in response to feedback.
files:
pep-0451.txt | 905 +++++++++++++++++++++++++-------------
1 files changed, 581 insertions(+), 324 deletions(-)
diff --git a/pep-0451.txt b/pep-0451.txt
--- a/pep-0451.txt
+++ b/pep-0451.txt
@@ -17,10 +17,11 @@
========
This PEP proposes to add a new class to ``importlib.machinery`` called
-``ModuleSpec``. It will contain all the import-related information
-about a module without needing to load the module first. Finders will
-now return a module's spec rather than a loader. The import system will
-use the spec to load the module.
+``ModuleSpec``. It will be authoritative for all the import-related
+information about a module, and will be available without needing to
+load the module first. Finders will provide a module's spec instead of
+a loader. The import machinery will be adjusted to take advantage of
+module specs, including using them to load modules.
Motivation
@@ -85,7 +86,7 @@
As an example of complexity attributable to this flaw, the
implementation of namespace packages in Python 3.3 (see PEP 420) added
``FileFinder.find_loader()`` because there was no good way for
-``find_module()`` to provide the namespace path.
+``find_module()`` to provide the namespace search locations.
The answer to this gap is a ``ModuleSpec`` object that contains the
per-module information and takes care of the boilerplate functionality
@@ -100,334 +101,603 @@
The goal is to address the gap between finders and loaders while
changing as little of their semantics as possible. Though some
functionality and information is moved to the new ``ModuleSpec`` type,
-their semantics should remain the same. However, for the sake of
-clarity, those semantics will be explicitly identified.
+their behavior should remain the same. However, for the sake of clarity
+the finder and loader semantics will be explicitly identified.
+
+This is a high-level summary of the changes described by this PEP. More
+detail is available in later sections.
+
+importlib.machinery.ModuleSpec (new)
+------------------------------------
+
+Attributes:
+
+* name - a string for the name of the module.
+* loader - the loader to use for loading and for module data.
+* origin - a string for the location from which the module is loaded.
+* submodule_search_locations - strings for where to find submodules,
+ if a package.
+* loading_info - a container of data for use during loading (or None).
+* cached (property) - a string for where the compiled module will be
+ stored.
+* is_location (RO-property) - the module's origin refers to a location.
+
+.. XXX Find a better name than loading_info?
+.. XXX Add ``submodules`` (RO-property) - returns possible submodules
+ relative to spec (or None)?
+.. XXX Add ``loaded`` (RO-property) - the module in sys.modules, if any?
+
+Factory Methods:
+
+* from_file_location() - factory for file-based module specs.
+* from_module() - factory based on import-related module attributes.
+* from_loader() - factory based on information provided by loaders.
+
+.. XXX Move the factories to importlib.util or make class-only?
+
+Instance Methods:
+
+* init_module_attrs() - populate a module's import-related attributes.
+* module_repr() - provide a repr string for a module.
+* create() - provide a new module to use for loading.
+* exec() - execute the spec into a module namespace.
+* load() - prepare a module and execute it in a protected way.
+* reload() - re-execute a module in a protected way.
+
+.. XXX Make module_repr() match the spec (BC problem?)?
+
+API Additions
+-------------
+
+* ``importlib.abc.Loader.exec_module()`` will execute a module in its
+ own namespace, replacing ``importlib.abc.Loader.load_module()``.
+* ``importlib.abc.Loader.create_module()`` (optional) will return a new
+ module to use for loading.
+* Module objects will have a new attribute: ``__spec__``.
+* ``importlib.find_spec()`` will return the spec for a module.
+* ``__subclasshook__()`` will be implemented on the importlib ABCs.
+
+.. XXX Do __subclasshook__() separately from the PEP (issue18862).
+
+API Changes
+-----------
+
+* Import-related module attributes will no longer be authoritative nor
+ used by the import system.
+* ``InspectLoader.is_package()`` will become optional.
+
+.. XXX module __repr__() will prefer spec attributes?
+
+Deprecations
+------------
+
+* ``importlib.abc.MetaPathFinder.find_module()``
+* ``importlib.abc.PathEntryFinder.find_module()``
+* ``importlib.abc.PathEntryFinder.find_loader()``
+* ``importlib.abc.Loader.load_module()``
+* ``importlib.abc.Loader.module_repr()``
+* The parameters and attributes of the various loaders in
+ ``importlib.machinery``
+* ``importlib.util.set_package()``
+* ``importlib.util.set_loader()``
+* ``importlib.find_loader()``
+
+Removals
+--------
+
+* ``importlib.abc.Loader.init_module_attrs()``
+* ``importlib.util.module_to_load()``
+
+Other Changes
+-------------
+
+* The spec for the ``__main__`` module will reflect the appropriate
+ name and origin.
+* The module type's ``__repr__`` will defer to ModuleSpec exclusively.
+
+Backward-Compatibility
+----------------------
+
+* If a finder does not define ``find_spec()``, a spec is derived from
+ the loader returned by ``find_module()``.
+* ``PathEntryFinder.find_loader()`` will be used, if defined.
+* ``Loader.load_module()`` is used if ``exec_module()`` is not defined.
+* ``Loader.module_repr()`` is used by ``ModuleSpec.module_repr()`` if it
+ exists.
+
+What Will not Change?
+---------------------
+
+* The syntax and semantics of the import statement.
+* Existing finders and loaders will continue to work normally.
+* The import-related module attributes will still be initialized with
+ the same information.
+* Finders will still create loaders, storing them in the specs.
+* ``Loader.load_module()``, if a module defines it, will have all the
+ same requirements and may still be called directly.
+* Loaders will still be responsible for module data APIs.
+
+
+ModuleSpec Users
+================
+
+``ModuleSpec`` objects have 3 distinct target audiences: Python itself,
+import hooks, and normal Python users.
+
+Python will use specs in the import machinery, in interpreter startup,
+and in various standard library modules. Some modules are
+import-oriented, like pkgutil, and others are not, like pickle and
+pydoc. In all cases, the full ``ModuleSpec`` API will get used.
+
+Import hooks (finders and loaders) will make use of the spec in specific
+ways, mostly without using the ``ModuleSpec`` instance methods. First
+of all, finders will use the factory methods to create spec objects.
+They may also directly adjust the spec attributes after the spec is
+created. Secondly, the finder may bind additional information to the
+spec for the loader to consume during module creation/execution.
+Finally, loaders will make use of the attributes on a spec when creating
+and/or executing a module.
+
+Python users will be able to inspect a module's ``__spec__`` to get
+import-related information about the object. Generally, they will not
+be using the ``ModuleSpec`` factory methods nor the instance methods.
+However, each spec has methods named ``create``, ``exec``, ``load``, and
+``reload``. Since they are so easy to access (and misunderstand/abuse),
+their function and availability require explicit consideration in this
+proposal.
+
+
+What Will Existing Finders and Loaders Have to Do Differently?
+==============================================================
+
+Immediately? Nothing. The status quo will be deprecated, but will
+continue working. However, here are the things that the authors of
+finders and loaders should change relative to this PEP:
+
+* Implement ``find_spec()`` on finders.
+* Implement ``exec_module()`` on loaders, if possible.
+
+The factory methods of ``ModuleSpec`` are intended to be helpful for
+converting existing finders. ``from_loader()`` and
+``from_file_location()`` are both straightforward utilities in this
+regard. In the case where loaders already expose methods for creating
+and preparing modules, a finder may use ``ModuleSpec.from_module()`` on
+a throw-away module to create the appropriate spec.
+
+As for loaders, ``exec_module()`` should be a relatively direct
+conversion from a portion of the existing ``load_module()``. However,
+``Loader.create_module()`` will also be necessary in some uncommon
+cases. Furthermore, ``load_module()`` will still work as a final option
+when ``exec_module()`` is not appropriate.
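The conversion described above can be sketched as follows. This is only an illustration, not part of the changeset: it uses the API as it later shipped in Python 3.4+ (``importlib.util.spec_from_loader()`` in place of the draft's ``ModuleSpec.from_loader()``), and the ``SOURCES`` mapping, ``InMemoryLoader``, and ``InMemoryFinder`` are hypothetical names invented for the example.

```python
import importlib
import importlib.abc
import importlib.util
import sys

# Hypothetical in-memory "storage" standing in for a real backing store.
SOURCES = {"inmem_demo": "GREETING = 'hello'"}


class InMemoryLoader(importlib.abc.Loader):
    def exec_module(self, module):
        # Only execution is handled here; module creation and the
        # sys.modules bookkeeping are left to the import machinery.
        exec(SOURCES[module.__name__], module.__dict__)


class InMemoryFinder(importlib.abc.MetaPathFinder):
    def find_spec(self, fullname, path=None, target=None):
        if fullname not in SOURCES:
            return None  # let the remaining finders try
        # The finder still creates the loader and stores it on the spec.
        return importlib.util.spec_from_loader(
            fullname, InMemoryLoader(), origin="in-memory")


sys.meta_path.insert(0, InMemoryFinder())
demo = importlib.import_module("inmem_demo")
```

Note that the loader contains none of the ``sys.modules`` handling that a ``load_module()`` implementation would need.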
+
+
+How Loading Will Work
+=====================
+
+This is an outline of what happens in ``ModuleSpec.load()``.
+
+1. A new module is created by calling ``spec.create()``.
+
+ a. If the loader has a ``create_module()`` method, it gets called.
+ Otherwise a new module gets created.
+ b. The import-related module attributes are set.
+
+2. The module is added to sys.modules.
+3. ``spec.exec(module)`` gets called.
+
+ a. If the loader has an ``exec_module()`` method, it gets called.
+ Otherwise ``load_module()`` gets called for backward-compatibility
+ and the resulting module is updated to match the spec.
+
+4. If there were any errors the module is removed from sys.modules.
+5. If the module was replaced in sys.modules during ``exec()``, the one
+ in sys.modules is updated to match the spec.
+6. The module in sys.modules is returned.
+
+These steps are exactly what ``Loader.load_module()`` is already
+expected to do. Loaders will thus be simplified since they will only
+need to implement the portion in step 3a.
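The outline above can be approximated with the functions that eventually shipped in ``importlib.util``. This is a rough sketch under assumptions: the free function ``load()`` stands in for the draft's ``ModuleSpec.load()``, ``StringLoader`` is a hypothetical loader, and step 5 (re-syncing a module that replaced itself in ``sys.modules``) and the ``load_module()`` fallback are left out.

```python
import importlib.util
import sys


class StringLoader:
    """Hypothetical loader that executes source code held in memory."""

    def __init__(self, source):
        self.source = source

    def create_module(self, spec):
        return None  # ask for the default module object (step 1a)

    def exec_module(self, module):
        exec(self.source, module.__dict__)  # step 3a


def load(spec):
    # Step 1: create the module and set its import-related attributes.
    module = importlib.util.module_from_spec(spec)
    # Step 2: make it visible before execution.
    sys.modules[spec.name] = module
    try:
        # Step 3: run the module's code in its own namespace.
        spec.loader.exec_module(module)
    except BaseException:
        # Step 4: do not leave a half-initialized module behind.
        sys.modules.pop(spec.name, None)
        raise
    # Step 6: return whatever is in sys.modules, since the module may
    # have replaced itself there while executing.
    return sys.modules[spec.name]


spec = importlib.util.spec_from_loader("demo_mod", StringLoader("x = 41 + 1"))
mod = load(spec)
```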
+
ModuleSpec
+==========
+
+This is a new class which defines the import-related values to use when
+loading the module. It closely corresponds to the import-related
+attributes of module objects. ``ModuleSpec`` objects may also be used
+by finders and loaders and other import-related APIs to hold extra
+import-related state concerning the module. This greatly reduces the
+need to add any new import-related attributes to module objects, and
+loader ``__init__`` methods will no longer need to accommodate such
+per-module state.
+
+General Notes
+-------------
+
+* The spec for each module instance will be unique to that instance even
+ if the information is identical to that of another spec.
+* A module's spec is not intended to be modified by anything but
+ finders.
+
+Creating a ModuleSpec
+---------------------
+
+**ModuleSpec(name, loader, *, origin=None, is_package=None)**
+
+.. container::
+
+ ``name``, ``loader``, and ``origin`` are set on the new instance
+ without any modification. If ``is_package`` is not passed in, the
+ loader's ``is_package()`` gets called (if available), or it defaults
+ to ``False``. If ``is_package`` is true,
+ ``submodule_search_locations`` is set to a new empty list. Otherwise
+ it is set to None.
+
+ Other attributes not listed as parameters (such as ``package``) are
+ either read-only dynamic properties or default to None.
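A small sketch of the constructor behaviour, using ``importlib.machinery.ModuleSpec`` as it shipped in Python 3.4+. One difference from the text above is an assumption worth flagging: the shipped constructor does not consult the loader's ``is_package()`` and simply treats a missing ``is_package`` as false. The module names are made up.

```python
from importlib.machinery import ModuleSpec

# No loader is needed just to look at how the attributes are populated.
plain = ModuleSpec("pkg.mod", None, origin="somewhere")
package = ModuleSpec("pkg", None, origin="somewhere", is_package=True)

# A non-package has no search locations; a package starts with an
# empty list that the finder may then fill in.
plain_locations = plain.submodule_search_locations
package_locations = package.submodule_search_locations
```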
+
+**from_filename(name, loader, *, filename=None, submodule_search_locations=None)**
+
+.. container::
+
+ This factory classmethod allows a suitable ModuleSpec instance to be
+ easily created with extra file-related information. This includes
+ the values that would be set on a module as ``__file__`` or
+ ``__cached__``.
+
+ ``is_location`` is set to True for specs created using
+ ``from_filename()``.
+
+**from_module(module, loader=None)**
+
+.. container::
+
+ This factory is used to create a spec based on the import-related
+ attributes of an existing module. Since modules should already have
+ ``__spec__`` set, this method has limited utility.
+
+**from_loader(name, loader, *, origin=None, is_package=None)**
+
+.. container::
+
+ A factory classmethod that returns a new ``ModuleSpec`` derived from
+ the arguments. ``is_package`` is used inside the method to indicate
+ that the module is a package. If not explicitly passed in, it falls
+ back to using the result of the loader's ``is_package()``, if
+ available. If not available, it defaults to False.
+
+ In contrast to ``ModuleSpec.__init__()``, which takes the arguments
+ as-is, ``from_loader()`` calculates missing values from the ones
+ passed in, as much as possible. This replaces the behavior that is
+ currently provided by several ``importlib.util`` functions as well as
+ the optional ``init_module_attrs()`` method of loaders. Just to be
+ clear, here is a more detailed description of those calculations::
+
+ If not passed in, ``filename`` is set to the result of calling the
+ loader's ``get_filename()``, if available. Otherwise it stays
+ unset (``None``).
+
+ If not passed in, ``submodule_search_locations`` is set to an empty
+ list if ``is_package`` is true. Then the directory from ``filename``
+ is appended to it, if possible. If ``is_package`` is false,
+ ``submodule_search_locations`` stays unset.
+
+ If ``cached`` is not passed in and ``filename`` is passed in,
+ ``cached`` is derived from it. For filenames with a source suffix,
+ it is set to the result of calling
+ ``importlib.util.cache_from_source()``. For bytecode suffixes (e.g.
+ ``.pyc``), ``cached`` is set to the value of ``filename``. If
+ ``filename`` is not passed in or ``cache_from_source()`` raises
+ ``NotImplementedError``, ``cached`` stays unset.
+
+ If not passed in, ``origin`` is set to ``filename``. Thus if
+ ``filename`` is unset, ``origin`` stays unset.
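The calculations above can be observed with ``importlib.util.spec_from_file_location()``, the function that shipped in Python 3.4+ in place of the file-based factory described in this draft. This is a sketch under that assumption; the paths are hypothetical and the files do not need to exist.

```python
import importlib.util
import os
import tempfile

base = tempfile.gettempdir()
mod_path = os.path.join(base, "example_mod.py")
pkg_path = os.path.join(base, "example_pkg", "__init__.py")

# Loader, origin, and cache location are all derived from the path.
mod_spec = importlib.util.spec_from_file_location("example_mod", mod_path)

# An __init__ file marks a package, so its directory becomes the first
# submodule search location.
pkg_spec = importlib.util.spec_from_file_location("example_pkg", pkg_path)
```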
+
+
+Attributes
----------
-A new class which defines the import-related values to use when loading
-the module. It closely corresponds to the import-related attributes of
-module objects. ``ModuleSpec`` objects may also be used by finders and
-loaders and other import-related APIs to hold extra import-related
-state about the module. This greatly reduces the need to add any new
-new import-related attributes to module objects, and loader ``__init__``
-methods won't need to accommodate such per-module state.
-
-Creating a ModuleSpec:
-
-``ModuleSpec(name, loader, *, origin=None, filename=None, cached=None,
-path=None)``
-
-Passed in parameter values are assigned directly to the corresponding
-attributes below. Other attributes not listed as parameters (such as
-``package``) are read-only properties that are automatically derived
-from these values.
-
-The ``ModuleSpec.from_loader()`` class method allows a suitable
-ModuleSpec instance to be easily created from a PEP 302 loader object.
-
-ModuleSpec Attributes
----------------------
-
Each of the following names is an attribute on ``ModuleSpec`` objects.
A value of ``None`` indicates "not set". This contrasts with module
objects where the attribute simply doesn't exist.
-While ``package`` and ``is_package`` are read-only properties, the
-remaining attributes can be replaced after the module spec is created
-and after import is complete. This allows for unusual cases where
-modifying the spec is the best option. However, typical use should not
-involve changing the state of a module's spec.
+While ``package`` is a read-only property, the remaining attributes can
+be replaced after the module spec is created and even after import is
+complete. This allows for unusual cases where directly modifying the
+spec is the best option. However, typical use should not involve
+changing the state of a module's spec.
Most of the attributes correspond to the import-related attributes of
modules. Here is the mapping, followed by a description of the
attributes. The reverse of this mapping is used by
-``init_module_attrs()``.
+``ModuleSpec.init_module_attrs()``.
-============= ===========
-On ModuleSpec On Modules
-============= ===========
-name __name__
-loader __loader__
-package __package__
-is_package -
-origin -
-filename __file__
-cached __cached__
-path __path__
-============= ===========
+========================== ===========
+On ModuleSpec On Modules
+========================== ===========
+name __name__
+loader __loader__
+package __package__
+origin __file__*
+cached __cached__*
+submodule_search_locations __path__**
+loading_info \-
+has_location (RO-property) \-
+========================== ===========
-``name``
+\* Only if ``has_location`` is true.
+\*\* Only if not None.
-The module's fully resolved and absolute name. It must be set.
+**name**
-``loader``
+.. container::
-The loader to use during loading and for module data. These specific
-functionalities do not change for loaders. Finders are still
-responsible for creating the loader and this attribute is where it is
-stored. The loader must be set.
+ The module's fully resolved and absolute name. It must be set.
-``package``
+**loader**
-The name of the module's parent. This is a dynamic attribute with a
-value derived from ``name`` and ``is_package``. For packages it is the
-value of ``name``. Otherwise it is equivalent to
-``name.rpartition('.')[0]``. Consequently, a top-level module will have
-the empty string for ``package``.
+.. container::
+ The loader to use during loading and for module data. These specific
+ functionalities do not change for loaders. Finders are still
+ responsible for creating the loader and this attribute is where it is
+ stored. The loader must be set.
-``is_package``
+**origin**
-Whether or not the module is a package. This dynamic attribute is True
-if ``path`` is not None (e.g. the empty list is a "true" value), else it
-is false.
+.. container::
-``origin``
+ A string for the location from which the module originates. Aside from
+ the informational value, it is also used in ``module_repr()``.
-A string for the location from which the module originates. If
-``filename`` is set, ``origin`` should be set to the same value unless
-some other value is more appropriate. ``origin`` is used in
-``module_repr()`` if it does not match the value of ``filename``.
+ The module attribute ``__file__`` has a similar but more restricted
+ meaning. Not all modules have it set (e.g. built-in modules). However,
+ ``origin`` is applicable to essentially all modules. For built-in
+ modules it would be set to "built-in".
-Using ``filename`` for this meaning would be inaccurate, since not all
-modules have path-based locations. For instance, built-in modules do
-not have ``__file__`` set. Yet it is useful to have a descriptive
-string indicating that it originated from the interpreter as a built-in
-module. So built-in modules will have ``origin`` set to ``"built-in"``.
+Secondary Attributes
+--------------------
-Path-based attributes:
+Some of the ``ModuleSpec`` attributes are not set via arguments when
+creating a new spec. Either they are strictly dynamically calculated
+properties or they are simply set to None (aka "not set"). For the
+latter case, those attributes may still be set directly.
-If any of these is set, it indicates that the module is path-based. For
-reference, a path entry is a string for a location where the import
-system will look for modules, e.g. the path entries in ``sys.path`` or a
-package's ``__path__``).
+**package**
-``filename``
+.. container::
-Like ``origin``, but limited to a path-based location. If ``filename``
-is set, ``origin`` should be set to the same string, unless origin is
-explicitly set to something else. ``filename`` is not necessarily an
-actual file name, but could be any location string based on a path
-entry. Regarding the attribute name, while it is potentially
-inaccurate, it is both consistent with the equivalent module attribute
-and generally accurate.
+ A dynamic property that gives the name of the module's parent. The
+ value is derived from ``name`` and ``is_package``. For packages it is
+ the value of ``name``. Otherwise it is equivalent to
+ ``name.rpartition('.')[0]``. Consequently, a top-level module will have
+ the empty string for ``package``.
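The derivation can be checked against the shipped ``ModuleSpec``, where this property ended up being called ``parent`` rather than ``package`` (an assumption relative to this draft). The module names are made up.

```python
from importlib.machinery import ModuleSpec

submodule = ModuleSpec("pkg.sub.mod", None)
package = ModuleSpec("pkg.sub", None, is_package=True)
top_level = ModuleSpec("top", None)

# Non-packages use everything before the last dot; packages are their
# own parent; top-level modules get the empty string.
parents = (submodule.parent, package.parent, top_level.parent)
```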
-.. XXX Would a different name be better? ``path_location``?
+**has_location**
-``cached``
+.. container::
-The path-based location where the compiled code for a module should be
-stored. If ``filename`` is set to a source file, this should be set to
-corresponding path that PEP 3147 specifies. The
-``importlib.util.source_to_cache()`` function facilitates getting the
-correct value.
+ Some modules can be loaded by reference to a location, e.g. a filesystem
+ path or a URL or something of the sort. Having the location lets you
+ load the module, but in theory you could load that module under various
+ names.
-``path``
+ In contrast, non-located modules can't be loaded in this fashion, e.g.
+ builtin modules and modules dynamically created in code. For these, the
+ name is the only way to access them, so they have an "origin" but not a
+ "location".
-The list of path entries in which to search for submodules if this
-module is a package. Otherwise it is ``None``.
+ This attribute reflects whether or not the module is locatable. If it
+ is, ``origin`` must be set to the module's location and ``__file__``
+ will be set on the module. Furthermore, a locatable module is also
+ cacheable and so ``__cached__`` is tied to ``has_location``.
-.. XXX add a path-based subclass?
+ The corresponding module attribute name, ``__file__``, is somewhat
+ inaccurate and potentially confusing, so we will use a more explicit
+ combination of ``origin`` and ``has_location`` to represent the same
+ information. Having a separate ``filename`` is unnecessary since we have
+ ``origin``.
-ModuleSpec Methods
-------------------
+**cached**
-``from_loader(name, loader, *, is_package=None, origin=None, filename=None, cached=None, path=None)``
+.. container::
-.. XXX use a different name?
+ A string for the location where the compiled code for a module should be
+ stored. PEP 3147 details the caching mechanism of the import system.
-A factory classmethod that returns a new ``ModuleSpec`` derived from the
-arguments. ``is_package`` is used inside the method to indicate that
-the module is a package. If not explicitly passed in, it is set to
-``True`` if ``path`` is passed in. It falls back to using the result of
-the loader's ``is_package()``, if available. Finally it defaults to
-False. The remaining parameters have the same meaning as the
-corresponding ``ModuleSpec`` attributes.
+ If ``has_location`` is true, this location string is set on the module
+ as ``__cached__``. When ``from_filename()`` is used to create a spec,
+ ``cached`` is set to the result of calling
+ ``importlib.util.cache_from_source()``.
-In contrast to ``ModuleSpec.__init__()``, which takes the arguments
-as-is, ``from_loader()`` calculates missing values from the ones passed
-in, as much as possible. This replaces the behavior that is currently
-provided by several ``importlib.util`` functions as well as the optional
-``init_module_attrs()`` method of loaders. Just to be clear, here is a
-more detailed description of those calculations::
+ ``cached`` is not necessarily a file location. A finder or loader may
+ store an alternate location string in ``cached``. However, in practice
+ this will be the file location dictated by PEP 3147.
- If not passed in, ``filename`` is to the result of calling the
- loader's ``get_filename()``, if available. Otherwise it stays
- unset (``None``).
+**submodule_search_locations**
- If not passed in, ``path`` is set to an empty list if
- ``is_package`` is true. Then the directory from ``filename`` is
- appended to it, if possible. If ``is_package`` is false, ``path``
- stays unset.
+.. container::
- If ``cached`` is not passed in and ``filename`` is passed in,
- ``cached`` is derived from it. For filenames with a source suffix,
- it set to the result of calling
- ``importlib.util.cache_from_source()``. For bytecode suffixes (e.g.
- ``.pyc``), ``cached`` is set to the value of ``filename``. If
- ``filename`` is not passed in or ``cache_from_source()`` raises
- ``NotImplementedError``, ``cached`` stays unset.
+ The list of location strings, typically directory paths, in which to
+ search for submodules. If the module is a package this will be set to
+ a list (even an empty one). Otherwise it is ``None``.
- If not passed in, ``origin`` is set to ``filename``. Thus if
- ``filename`` is unset, ``origin`` stays unset.
+ The corresponding module attribute's name, ``__path__``, is relatively
+ ambiguous. Instead of mirroring it, we use a more explicit name that
+ makes the purpose clear.
-``module_repr()``
+**loading_info**
-Returns a repr string for the module if ``origin`` is set and
-``filename`` is not set. The string refers to the value of ``origin``.
-Otherwise ``module_repr()`` returns None. This indicates to the module
-type's ``__repr__()`` that it should fall back to the default repr.
+.. container::
-We could also have ``module_repr()`` produce the repr for the case where
-``filename`` is set or where ``origin`` is not set, mirroring the repr
-that the module type produces directly. However, the repr string is
-derived from the import-related module attributes, which might be out of
-sync with the spec.
+ A finder may set ``loading_info`` to any value to provide additional
+ data for the loader to use during loading. A value of ``None`` is the
+ default and indicates that there is no additional data. Otherwise it is
+ likely set to some container, such as a ``dict``, ``list``, or
+ ``types.SimpleNamespace`` containing the relevant extra information.
-.. XXX Is using the spec close enough? Probably not.
+ For example, ``zipimporter`` could use it to pass the zip archive name
+ to the loader directly, rather than needing to derive it from ``origin``
+ or create a custom loader for each find operation.
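A sketch of the hand-off from finder to loader. It assumes the shipped name for this attribute, ``loader_state``, in place of the draft's ``loading_info``; ``ArchiveLoader`` and the archive name are hypothetical and no real zip file is involved.

```python
import importlib.util
from importlib.machinery import ModuleSpec


class ArchiveLoader:
    """Hypothetical loader shared by every module from every archive."""

    def create_module(self, spec):
        return None  # default module creation

    def exec_module(self, module):
        # The per-module data travels on the spec, so the loader itself
        # carries no per-module state.
        state = module.__spec__.loader_state
        module.loaded_from = state["archive"]


shared_loader = ArchiveLoader()
# What a finder would do: attach the extra data while building the spec.
spec = ModuleSpec("bundled_mod", shared_loader,
                  origin="bundle.zip/bundled_mod.py",
                  loader_state={"archive": "bundle.zip"})

module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
```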
-The implementation of the module type's ``__repr__()`` will change to
-accommodate this PEP. However, the current functionality will remain to
-handle the case where a module does not have a ``__spec__`` attribute.
+Methods
+-------
-.. XXX Clarify the above justification.
+**module_repr()**
-``init_module_attrs(module)``
+.. container::
-Sets the module's import-related attributes to the corresponding values
-in the module spec. If a path-based attribute is not set on the spec,
-it is not set on the module. For the rest, a ``None`` value on the spec
-(aka "not set") means ``None`` will be set on the module. If any of the
-attributes are already set on the module, the existing values are
-replaced. The module's own ``__spec__`` is not consulted but does get
-replaced with the spec on which ``init_module_attrs()`` was called.
-The earlier mapping of ``ModuleSpec`` attributes to module attributes
-indicates which attributes are involved on both sides.
+ Returns a repr string for the module, based on the module's import-
+ related attributes and falling back to the spec's attributes. The
+ string will reflect the current output of the module type's
+ ``__repr__()``.
-``load(module=None, *, is_reload=False)``
+ The module type's ``__repr__()`` will use the module's ``__spec__``
+ exclusively. If the module does not have ``__spec__`` set, a spec is
+ generated using ``ModuleSpec.from_module()``.
-This method captures the current functionality of and requirements on
-``Loader.load_module()`` without any semantic changes, except one.
-Reloading a module when ``exec_module()`` is available actually uses
-``module`` rather than ignoring it in favor of the one in
-``sys.modules``, as ``Loader.load_module()`` does.
+ Since the module attributes may be out of sync with the spec and to
+ preserve backward-compatibility in that case, we defer to the module
+ attributes and only when they are missing do we fall back to the spec
+ attributes.
-``module`` is only allowed when ``is_reload`` is true. This means that
-``is_reload`` could be dropped as a parameter. However, doing so would
-mean we could not use ``None`` to indicate that the module should be
-pulled from ``sys.modules``. Furthermore, ``is_reload`` makes the
-intent of the call clear.
+**init_module_attrs(module)**
-There are two parts to what happens in ``load()``. First, the module is
-prepared, loaded, updated appropriately, and left available for the
-second part. This is described in more detail shortly.
+.. container::
-Second, in the case of error during a normal load (not reload) the
-module is removed from ``sys.modules``. If no error happened, the
-module is pulled from ``sys.modules``. This the module returned by
-``load()``. Before it is returned, if it is a different object than the
-one produced by the first part, attributes of the module from
-``sys.modules`` are updated to reflect the spec.
+ Sets the module's import-related attributes to the corresponding values
+ in the module spec. If ``has_location`` is false on the spec,
+ ``__file__`` and ``__cached__`` are not set on the module. ``__path__``
+ is only set on the module if ``submodule_search_locations`` is not None.
+ For the rest of the import-related module attributes, a ``None`` value
+ on the spec (aka "not set") means ``None`` will be set on the module.
+ If any of the attributes are already set on the module, the existing
+ values are replaced. The module's own ``__spec__`` is not consulted but
+ does get replaced with the spec on which ``init_module_attrs()`` was
+ called. The earlier mapping of ``ModuleSpec`` attributes to module
+ attributes indicates which attributes are involved on both sides.
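The attribute mapping can be observed through ``importlib.util.module_from_spec()``, which in shipped Python creates a module and initializes its import-related attributes from a spec. Treating it as a stand-in for ``create()`` plus ``init_module_attrs()`` is an assumption; ``NullLoader`` and the module names are hypothetical.

```python
import importlib.util
import os
import tempfile
from importlib.machinery import ModuleSpec


class NullLoader:
    """Hypothetical loader that creates default modules and runs nothing."""

    def create_module(self, spec):
        return None

    def exec_module(self, module):
        pass


# A spec with an origin but no location: __file__ is not set.
virtual = importlib.util.module_from_spec(
    ModuleSpec("virtual_mod", NullLoader(), origin="virtual"))

# A package spec: __path__ mirrors submodule_search_locations.
virtual_pkg = importlib.util.module_from_spec(
    ModuleSpec("virtual_pkg", NullLoader(), is_package=True))

# A located spec: __file__ mirrors origin (the file need not exist).
path = os.path.join(tempfile.gettempdir(), "located_mod.py")
located_spec = importlib.util.spec_from_file_location("located_mod", path)
located = importlib.util.module_from_spec(located_spec)
```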
-Returning the module from ``sys.modules`` accommodates the ability of
-the module to replace itself there while it is executing (during load).
+**create()**
-As already noted, this is what already happens in the import system.
-``load()`` is not meant to change any of this behavior.
+.. container::
-Regarding the first part of ``load()``, the following describes what
-happens. It depends on if ``is_reload`` is true and if the loader has
-``exec_module()``.
+ A new module is created relative to the spec and its import-related
+ attributes are set accordingly. If the spec's loader has a
+ ``create_module()`` method, that gets called to create the module. This
+ gives the loader a chance to do any pre-loading initialization that can't
+ otherwise be accomplished elsewhere. Otherwise a bare module object is
+ created. In both cases ``init_module_attrs()`` is called on the module
+ before it gets returned.
-For normal load with ``exec_module()`` available::
+**exec(module)**
- A new module is created, ``init_module_attrs()`` is called to set
- its attributes, and it is set on sys.modules. At that point
- the loader's ``exec_module()`` is called, after which the module
- is ready for the second part of loading.
+.. container::
-.. XXX What if the module already exists in sys.modules?
+ The spec's loader is used to execute the module. If the loader has
+ ``exec_module()`` defined, the namespace of ``module`` is the target of
+ execution. Otherwise the loader's ``load_module()`` is called, which
+ ignores ``module`` and returns the module that was the actual
+ execution target. In that case the import-related attributes of that
+ module are updated to reflect the spec. In both cases the targeted
+ module is the one that gets returned.
-For normal load without ``exec_module()`` available::
+**load()**
- The loader's ``load_module()`` is called and the attributes of the
- module it returns are updated to match the spec.
+.. container::
-For reload with ``exec_module()`` available::
+ This method captures the current functionality of and requirements on
+ ``Loader.load_module()`` without any semantic changes. It is
+ essentially a wrapper around ``create()`` and ``exec()`` with some
+ extra functionality regarding ``sys.modules``.
- If ``module`` is ``None``, it is pulled from ``sys.modules``. If
- still ``None``, ImportError is raised. Otherwise ``exec_module()``
- is called, passing in the module-to-be-reloaded.
+ itself in ``sys.modules`` while executing. Consequently, the module in
+ ``sys.modules`` is the one that gets returned by ``load()``.
-For reload without ``exec_module()`` available::
+ Right before ``exec()`` is called, the module is added to
+ ``sys.modules``. In the case of error during loading the module is
+ removed from ``sys.modules``. The module in ``sys.modules`` when
+ ``load()`` finishes is the one that gets returned. Returning the module
+ from ``sys.modules`` accommodates the ability of the module to replace
+ itself there while it is executing (during load).
- The loader's ``load_module()`` is called and the attributes of the
- module it returns are updated to match the spec.
+ As noted, this is what already happens in the import system.
+ ``load()`` is not meant to change any of this behavior.
-There is some boilerplate involved when ``exec_module()`` is available,
-but only the boilerplate that the import system uses currently.
+ If ``loader`` is not set (``None``), ``load()`` raises a ValueError.
-If ``loader`` is not set (``None``), ``load()`` raises a ValueError. If
-``module`` is passed in but ``is_reload`` is false, a ValueError is also
-raised to indicate that ``load()`` was called incorrectly. There may be
-use cases for calling ``load()`` in that way, but they are outside the
-scope of this PEP.
+**reload(module)**
-.. XXX add reload(module=None) and drop load()'s parameters entirely?
-.. XXX add more of importlib.reload()'s boilerplate to load()/reload()?
+.. container::
+
+ As with ``load()`` this method faithfully fulfills the semantics of
+ ``Loader.load_module()`` in the reload case, with one exception:
+ reloading a module when ``exec_module()`` is available actually uses
+ ``module`` rather than ignoring it in favor of the one in
+ ``sys.modules``, as ``Loader.load_module()`` does. The functionality
+ here mirrors that of ``load()``, minus the ``create()`` call and the
+ ``sys.modules`` handling.
+
+.. XXX add more of importlib.reload()'s boilerplate to reload()?
Omitted Attributes and Methods
------------------------------
-``ModuleSpec`` does not have a ``from_module()`` factory method since
-all modules should already have a spec.
+There is no ``PathModuleSpec`` subclass of ``ModuleSpec`` that provides
+the ``has_location``, ``cached``, and ``submodule_search_locations``
+functionality. While that might make the separation cleaner, module
+objects don't have that distinction. ``ModuleSpec`` will support both
+cases equally well.
-Additionally, there is no ``PathModuleSpec`` subclass of ``ModuleSpec``
-that provides the ``filename``, ``cached``, and ``path`` functionality.
-While that might make the separation cleaner, module objects don't have
-that distinction. ``ModuleSpec`` will support both cases equally well.
+While ``is_package`` would be a simple additional attribute (aliasing
+``self.submodule_search_locations is not None``), it perpetuates the
+artificial (and mostly erroneous) distinction between modules and
+packages.
+
+Conceivably, ``ModuleSpec.load()`` could optionally take a list of
+modules with which to interact instead of ``sys.modules``. That
+capability is left out of this PEP, but may be pursued separately at
+some other time, including relative to PEP 406 (import engine).
+
+Likewise ``load()`` could be leveraged to implement multi-version
+imports. While interesting, doing so is outside the scope of this
+proposal.
Backward Compatibility
----------------------
-Since ``Finder.find_module()`` methods would now return a module spec
-instead of loader, specs must act like the loader that would have been
-returned instead. This is relatively simple to solve since the loader
-is available as an attribute of the spec. We will use ``__getattr__()``
-to do it.
-
-However, ``ModuleSpec.is_package`` (an attribute) conflicts with
-``InspectLoader.is_package()`` (a method). Working around this requires
-a more complicated solution but is not a large obstacle. Simply making
-``ModuleSpec.is_package`` a method does not reflect that it is a relatively
-static piece of data. ``module_repr()`` also conflicts with the same
-method on loaders, but that workaround is not complicated since both are
-methods.
-
-Unfortunately, the ability to proxy does not extend to ``id()``
-comparisons and ``isinstance()`` tests. In the case of the return value
-of ``find_module()``, we accept that break in backward compatibility.
-However, we will mitigate the problem with ``isinstance()`` somewhat by
-registering ``ModuleSpec`` on the loaders in ``importlib.abc``.
+``ModuleSpec`` doesn't have any. This would be a different story if
+``Finder.find_module()`` were to return a module spec instead of loader.
+In that case, specs would have to act like the loader that would have
+been returned instead. Doing so would be relatively simple, but is an
+unnecessary complication.
Subclassing
-----------
Subclasses of ModuleSpec are allowed, but should not be necessary.
-Adding functionality to a custom finder or loader will likely be a
-better fit and should be tried first. However, as long as a subclass
-still fulfills the requirements of the import system, objects of that
-type are completely fine as the return value of ``find_module()``.
+Simply setting ``loading_info`` or adding functionality to a custom
+finder or loader will likely be a better fit and should be tried first.
+However, as long as a subclass still fulfills the requirements of the
+import system, objects of that type are completely fine as the return
+value of ``Finder.find_spec()``.
+
+
+Existing Types
+==============
Module Objects
--------------
-Module objects will now have a ``__spec__`` attribute to which the
-module's spec will be bound. None of the other import-related module
-attributes will be changed or deprecated, though some of them could be;
-any such deprecation can wait until Python 4.
+**__spec__**
+
+.. container::
+
+ Module objects will now have a ``__spec__`` attribute to which the
+ module's spec will be bound.
+
+None of the other import-related module attributes will be changed or
+deprecated, though some of them could be; any such deprecation can wait
+until Python 4.
``ModuleSpec`` objects will not be kept in sync with the corresponding
module object's import-related attributes. Though they may differ, in
@@ -438,32 +708,30 @@
reflect the actual module name while ``module.__name__`` will be
``__main__``.
-The ``__file__`` attribute will be set where applicable in the same way
-it is now. For instance, zip imports will still have it set for
-backward-compatibility reasons. However, the recommendation will be to
-have ``__file__`` set only for actual filenames from now on.
-
Finders
-------
-Finders will now return ModuleSpec objects when ``find_module()`` is
-called rather than loaders. For backward compatibility, ``ModuleSpec``
-objects proxy the attributes of their ``loader`` attribute.
+**MetaPathFinder.find_spec(name, path=None)**
-Adding another similar method to avoid backward-compatibility issues
-is undesirable if avoidable. The import APIs have suffered enough,
-especially considering ``PathEntryFinder.find_loader()`` was just
-added in Python 3.3. The approach taken by this PEP should be
-sufficient to address backward-compatibility issues for
-``find_module()``.
+**PathEntryFinder.find_spec(name)**
-The change to ``find_module()`` applies to both ``MetaPathFinder`` and
-``PathEntryFinder``. ``PathEntryFinder.find_loader()`` will be
-deprecated and, for backward compatibility, implicitly special-cased if
-the method exists on a finder.
+.. container::
+
+ Finders will return ModuleSpec objects when ``find_spec()`` is
+ called. This new method replaces ``find_module()`` and
+ ``find_loader()`` (in the ``PathEntryFinder`` case). If a finder does
+ not have ``find_spec()``, ``find_module()`` and ``find_loader()`` are
+ used instead, for backward-compatibility.
+
+ Adding yet another similar method to finders is a case of practicality.
+ ``find_module()`` could be changed to return specs instead of loaders.
+ This is tempting because the import APIs have suffered enough,
+ especially considering ``PathEntryFinder.find_loader()`` was just
+ added in Python 3.3. However, the extra complexity and a less-than-
+ explicit method name aren't worth it.
Finders are still responsible for creating the loader. That loader will
-now be stored in the module spec returned by ``find_module()`` rather
+now be stored in the module spec returned by ``find_spec()`` rather
than returned directly. As is currently the case without the PEP, if a
loader would be costly to create, that loader can be designed to defer
the cost until later.
@@ -471,26 +739,45 @@
Loaders
-------
-Loaders will have a new method, ``exec_module(module)``. Its only job
-is to "exec" the module and consequently populate the module's
-namespace. It is not responsible for creating or preparing the module
-object, nor for any cleanup afterward. It has no return value.
+**Loader.exec_module(module)**
-The ``load_module()`` of loaders will still work and be an active part
-of the loader API. It is still useful for cases where the default
-module creation/preparation/cleanup is not appropriate for the loader.
+.. container::
-For example, the C API for extension modules only supports the full
-control of ``load_module()``. As such, ``ExtensionFileLoader`` will not
-implement ``exec_module()``. In the future it may be appropriate to
-produce a second C API that would support an ``exec_module()``
-implementation for ``ExtensionFileLoader``. Such a change is outside
-the scope of this PEP.
+ Loaders will have a new method, ``exec_module()``. Its only job
+ is to "exec" the module and consequently populate the module's
+ namespace. It is not responsible for creating or preparing the module
+ object, nor for any cleanup afterward. It has no return value.
+
+**Loader.load_module(fullname)**
+
+.. container::
+
+ The ``load_module()`` of loaders will still work and be an active part
+ of the loader API. It is still useful for cases where the default
+ module creation/preparation/cleanup is not appropriate for the loader.
+ If implemented, ``load_module()`` will still be responsible for its
+ current requirements (prep/exec/etc.) since the method may be called
+ directly.
+
+ For example, the C API for extension modules only supports the full
+ control of ``load_module()``. As such, ``ExtensionFileLoader`` will not
+ implement ``exec_module()``. In the future it may be appropriate to
+ produce a second C API that would support an ``exec_module()``
+ implementation for ``ExtensionFileLoader``. Such a change is outside
+ the scope of this PEP.
A loader must define either ``exec_module()`` or ``load_module()``. If
both exist on the loader, ``ModuleSpec.load()`` uses ``exec_module()``
and ignores ``load_module()``.
+**Loader.create_module(spec)**
+
+.. container::
+
+ Loaders may also implement ``create_module()`` that will return a
+ new module to exec. However, most loaders will not need to implement
+ the method.
+
PEP 420 introduced the optional ``module_repr()`` loader method to limit
the amount of special-casing in the module type's ``__repr__()``. Since
this method is part of ``ModuleSpec``, it will be deprecated on loaders.
@@ -506,86 +793,56 @@
The path-based loaders in ``importlib`` take arguments in their
``__init__()`` and have corresponding attributes. However, the need for
-those values is eliminated. The only exception is
+those values is eliminated by module specs. The only exception is
``FileLoader.get_filename()``, which uses ``self.path``. The signatures
for these loaders and the accompanying attributes will be deprecated.
In addition to executing a module during loading, loaders will still be
directly responsible for providing APIs concerning module-related data.
+
Other Changes
--------------
+=============
* The various finders and loaders provided by ``importlib`` will be
updated to comply with this proposal.
-
* The spec for the ``__main__`` module will reflect how the interpreter
was started. For instance, with ``-m`` the spec's name will be that
of the run module, while ``__main__.__name__`` will still be
"__main__".
-
-* We add ``importlib.find_module()`` to mirror
+* We add ``importlib.find_spec()`` to mirror
``importlib.find_loader()`` (which becomes deprecated).
-
* Deprecations in ``importlib.util``: ``set_package()``,
``set_loader()``, and ``module_for_loader()``. ``module_to_load()``
(introduced prior to Python 3.4's release) can be removed.
-
* ``importlib.reload()`` is changed to use ``ModuleSpec.load()``.
-
* ``ModuleSpec.load()`` and ``importlib.reload()`` will now make use of
the per-module import lock, whereas ``Loader.load_module()`` did not.
+
Reference Implementation
-------------------------
+========================
-A reference implementation is available at <TBD>.
+A reference implementation will be available at
+http://bugs.python.org/issue18864.
-Open Questions
+Open Issues
==============
-* How to avoid having custom ModuleSpec attributes conflict with future
- normal attributes?
+\* The impact of this change on pkgutil (and setuptools) needs looking
+into. It has some generic function-based extensions to PEP 302. These
+may break if importlib starts wrapping loaders without the tools'
+knowledge.
-This could be done with a sub-namespace bound to a single ModuleSpec
-attribute. It could also be done by reserving names with a single
-leading underscore for custom attributes. Or we could just not worry
-about it.
+\* Other modules to look at: runpy (and pythonrun.c), pickle, pydoc,
+inspect.
-* Get rid of the ``is_package`` property?
+\* Add ``ModuleSpec.data`` as a descriptor that wraps the data API of the
+spec's loader?
-It duplicates information
-both in the ``ModuleSpec()`` signature and in attributes. It is
-technically unncessary in light of the path attribute and it conflicts
-with ``InspectLoader.is_package()``, which makes the implementation more
-complicated. However, it also provides an explicit indicator of
-package-ness, which helps those less familiar with the import system.
-
-* Deprecate the use of ``__file__`` for anything except actual files?
-
-* Introduce a new extension module API that takes advantage of
- ``ModuleSpec``? I'd rather that be part of a separate proposal.
-
-* Add ``create_module()`` to loaders?
-
-It would take a ``ModuleSpec``
-and return the module that should be passed to ``spec.exec()``. This
-method would be helpful for new extension module import APIs.
-
-* Have ``ModuleSpec.module_repr()`` replace more of the module type's
- ``__repr__()`` implementation?
-
-A compliant module is required to have
-``__spec__`` set so that should work. However, currently the repr uses
-the module attributes. Using the spec attributes would give precedence
-to the spec in the case that they differ, which would be
-backward-incompatible.
-
-* Factor the path-based attributes/functionality into a subclass--
- something like ``PathModuleSpec``?
-
-It looks like there just isn't enough benefit to doing so.
+\* How to limit possible end-user confusion/abuses relative to spec
+attributes (since __spec__ will make them really accessible)?
References
--
Repository URL: http://hg.python.org/peps
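
The ``load()`` steps described in the diff above (create the module, add
it to ``sys.modules`` right before execution, remove it on error, and
return whatever is in ``sys.modules`` afterward) can be sketched roughly
as follows. This is a minimal sketch, not the reference implementation:
``StringLoader`` and ``load_from_spec()`` are hypothetical names, and
``importlib.util.spec_from_loader()`` and
``importlib.util.module_from_spec()`` come from later released Python
versions rather than from this draft.

```python
import importlib.abc
import importlib.util
import sys


class StringLoader(importlib.abc.Loader):
    """Hypothetical loader that executes source code held in memory."""

    def __init__(self, source):
        self.source = source

    def create_module(self, spec):
        # Returning None requests a bare module object.
        return None

    def exec_module(self, module):
        # Only job: populate the module's namespace. No return value.
        exec(self.source, module.__dict__)


def load_from_spec(spec):
    """Hypothetical stand-in for the draft's ``ModuleSpec.load()``."""
    if spec.loader is None:
        raise ValueError("spec has no loader")
    # Create the module and set its import-related attributes.
    module = importlib.util.module_from_spec(spec)
    # Register the module right before executing it.
    sys.modules[spec.name] = module
    try:
        spec.loader.exec_module(module)
    except BaseException:
        # On error, the module is removed from sys.modules.
        sys.modules.pop(spec.name, None)
        raise
    # Return the entry in sys.modules, since the module may have
    # replaced itself there while executing.
    return sys.modules[spec.name]


spec = importlib.util.spec_from_loader("demo_mod", StringLoader("answer = 42"))
module = load_from_spec(spec)
```

Returning the ``sys.modules`` entry rather than the local ``module``
variable is what allows a module to swap itself out during execution.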