[Python-checkins] peps: Big update to PEP 451 in response to feedback.

eric.snow python-checkins at python.org
Wed Aug 28 10:45:53 CEST 2013


http://hg.python.org/peps/rev/f139785b970b
changeset:   5081:f139785b970b
user:        Eric Snow <ericsnowcurrently at gmail.com>
date:        Wed Aug 28 02:43:45 2013 -0600
summary:
  Big update to PEP 451 in response to feedback.

files:
  pep-0451.txt |  905 +++++++++++++++++++++++++-------------
  1 files changed, 581 insertions(+), 324 deletions(-)


diff --git a/pep-0451.txt b/pep-0451.txt
--- a/pep-0451.txt
+++ b/pep-0451.txt
@@ -17,10 +17,11 @@
 ========
 
 This PEP proposes to add a new class to ``importlib.machinery`` called
-``ModuleSpec``.  It will contain all the import-related information
-about a module without needing to load the module first.  Finders will
-now return a module's spec rather than a loader.  The import system will
-use the spec to load the module.
+``ModuleSpec``.  It will be authoritative for all the import-related
+information about a module, and will be available without needing to
+load the module first.  Finders will provide a module's spec instead of
+a loader.  The import machinery will be adjusted to take advantage of
+module specs, including using them to load modules.
 
 
 Motivation
@@ -85,7 +86,7 @@
 As an example of complexity attributable to this flaw, the
 implementation of namespace packages in Python 3.3 (see PEP 420) added
 ``FileFinder.find_loader()`` because there was no good way for
-``find_module()`` to provide the namespace path.
+``find_module()`` to provide the namespace search locations.
 
 The answer to this gap is a ``ModuleSpec`` object that contains the
 per-module information and takes care of the boilerplate functionality
@@ -100,334 +101,603 @@
 The goal is to address the gap between finders and loaders while
 changing as little of their semantics as possible.  Though some
 functionality and information is moved to the new ``ModuleSpec`` type,
-their semantics should remain the same.  However, for the sake of
-clarity, those semantics will be explicitly identified.
+their behavior should remain the same.  However, for the sake of clarity
+the finder and loader semantics will be explicitly identified.
+
+This is a high-level summary of the changes described by this PEP.  More
+detail is available in later sections.
+
+importlib.machinery.ModuleSpec (new)
+------------------------------------
+
+Attributes:
+
+* name - a string for the name of the module.
+* loader - the loader to use for loading and for module data.
+* origin - a string for the location from which the module is loaded.
+* submodule_search_locations - strings for where to find submodules,
+  if a package.
+* loading_info - a container of data for use during loading (or None).
+* cached (property) - a string for where the compiled module will be
+  stored.
+* has_location (RO-property) - the module's origin refers to a location.
+
+.. XXX Find a better name than loading_info?
+.. XXX Add ``submodules`` (RO-property) - returns possible submodules
+   relative to spec (or None)?
+.. XXX Add ``loaded`` (RO-property) - the module in sys.modules, if any?
+
+Factory Methods:
+
+* from_file_location() - factory for file-based module specs.
+* from_module() - factory based on import-related module attributes.
+* from_loader() - factory based on information provided by loaders.
+
+.. XXX Move the factories to importlib.util or make class-only?
+
+Instance Methods:
+
+* init_module_attrs() - populate a module's import-related attributes.
+* module_repr() - provide a repr string for a module.
+* create() - provide a new module to use for loading.
+* exec() - execute the spec into a module namespace.
+* load() - prepare a module and execute it in a protected way.
+* reload() - re-execute a module in a protected way.
+
+.. XXX Make module_repr() match the spec (BC problem?)?
+
+API Additions
+-------------
+
+* ``importlib.abc.Loader.exec_module()`` will execute a module in its
+  own namespace, replacing ``importlib.abc.Loader.load_module()``.
+* ``importlib.abc.Loader.create_module()`` (optional) will return a new
+  module to use for loading.
+* Module objects will have a new attribute: ``__spec__``.
+* ``importlib.find_spec()`` will return the spec for a module.
+* ``__subclasshook__()`` will be implemented on the importlib ABCs.
+
+.. XXX Do __subclasshook__() separately from the PEP (issue18862).
+
+API Changes
+-----------
+
+* Import-related module attributes will no longer be authoritative nor
+  used by the import system.
+* ``InspectLoader.is_package()`` will become optional.
+
+.. XXX module __repr__() will prefer spec attributes?
+
+Deprecations
+------------
+
+* ``importlib.abc.MetaPathFinder.find_module()``
+* ``importlib.abc.PathEntryFinder.find_module()``
+* ``importlib.abc.PathEntryFinder.find_loader()``
+* ``importlib.abc.Loader.load_module()``
+* ``importlib.abc.Loader.module_repr()``
+* The parameters and attributes of the various loaders in
+  ``importlib.machinery``
+* ``importlib.util.set_package()``
+* ``importlib.util.set_loader()``
+* ``importlib.find_loader()``
+
+Removals
+--------
+
+* ``importlib.abc.Loader.init_module_attrs()``
+* ``importlib.util.module_to_load()``
+
+Other Changes
+-------------
+
+* The spec for the ``__main__`` module will reflect the appropriate
+  name and origin.
+* The module type's ``__repr__`` will defer to ModuleSpec exclusively.
+
+Backward-Compatibility
+----------------------
+
+* If a finder does not define ``find_spec()``, a spec is derived from
+  the loader returned by ``find_module()``.
+* ``PathEntryFinder.find_loader()`` will be used, if defined.
+* ``Loader.load_module()`` is used if ``exec_module()`` is not defined.
+* ``Loader.module_repr()`` is used by ``ModuleSpec.module_repr()`` if it
+  exists.
+
+What Will Not Change?
+---------------------
+
+* The syntax and semantics of the import statement.
+* Existing finders and loaders will continue to work normally.
+* The import-related module attributes will still be initialized with
+  the same information.
+* Finders will still create loaders, storing them in the specs.
+* ``Loader.load_module()``, if a loader defines it, will have all the
+  same requirements and may still be called directly.
+* Loaders will still be responsible for module data APIs.
+
+
+ModuleSpec Users
+================
+
+``ModuleSpec`` objects have 3 distinct target audiences: Python itself,
+import hooks, and normal Python users.
+
+Python will use specs in the import machinery, in interpreter startup,
+and in various standard library modules.  Some modules are
+import-oriented, like pkgutil, and others are not, like pickle and
+pydoc.  In all cases, the full ``ModuleSpec`` API will get used.
+
+Import hooks (finders and loaders) will make use of the spec in specific
+ways, mostly without using the ``ModuleSpec`` instance methods.  First
+of all, finders will use the factory methods to create spec objects.
+They may also directly adjust the spec attributes after the spec is
+created.  Secondly, the finder may bind additional information to the
+spec for the loader to consume during module creation/execution.
+Finally, loaders will make use of the attributes on a spec when creating
+and/or executing a module.
+
+Python users will be able to inspect a module's ``__spec__`` to get
+import-related information about the object.  Generally, they will not
+be using the ``ModuleSpec`` factory methods nor the instance methods.
+However, each spec has methods named ``create``, ``exec``, ``load``, and
+``reload``.  Since they are so easy to access (and misunderstand/abuse),
+their function and availability require explicit consideration in this
+proposal.
+
+
+What Will Existing Finders and Loaders Have to Do Differently?
+==============================================================
+
+Immediately?  Nothing.  The status quo will be deprecated, but will
+continue working.  However, here are the things that the authors of
+finders and loaders should change relative to this PEP:
+
+* Implement ``find_spec()`` on finders.
+* Implement ``exec_module()`` on loaders, if possible.
+
+The factory methods of ``ModuleSpec`` are intended to be helpful for
+converting existing finders.  ``from_loader()`` and
+``from_file_location()`` are both straight-forward utilities in this
+regard.  In the case where loaders already expose methods for creating
+and preparing modules, a finder may use ``ModuleSpec.from_module()`` on
+a throw-away module to create the appropriate spec.
+
+As for loaders, ``exec_module()`` should be a relatively direct
+conversion from a portion of the existing ``load_module()``.  However,
+``Loader.create_module()`` will also be necessary in some uncommon
+cases.  Furthermore, ``load_module()`` will still work as a final option
+when ``exec_module()`` is not appropriate.
+
+
+How Loading Will Work
+=====================
+
+This is an outline of what happens in ``ModuleSpec.load()``.
+
+1. A new module is created by calling ``spec.create()``.
+
+   a. If the loader has a ``create_module()`` method, it gets called.
+      Otherwise a new module gets created.
+   b. The import-related module attributes are set.
+
+2. The module is added to sys.modules.
+3. ``spec.exec(module)`` gets called.
+
+   a. If the loader has an ``exec_module()`` method, it gets called.
+      Otherwise ``load_module()`` gets called for backward-compatibility
+      and the resulting module is updated to match the spec.
+
+4. If there were any errors the module is removed from sys.modules.
+5. If the module was replaced in sys.modules during ``exec()``, the one
+   in sys.modules is updated to match the spec.
+6. The module in sys.modules is returned.
+
+These steps are exactly what ``Loader.load_module()`` is already
+expected to do.  Loaders will thus be simplified since they will only
+need to implement the portion in step 3a.
+
 
 ModuleSpec
+==========
+
+This is a new class which defines the import-related values to use when
+loading the module.  It closely corresponds to the import-related
+attributes of module objects.  ``ModuleSpec`` objects may also be used
+by finders and loaders and other import-related APIs to hold extra
+import-related state concerning the module.  This greatly reduces the
+need to add any new import-related attributes to module objects, and
+loader ``__init__`` methods will no longer need to accommodate such
+per-module state.
+
+General Notes
+-------------
+
+* The spec for each module instance will be unique to that instance even
+  if the information is identical to that of another spec.
+* A module's spec is not intended to be modified by anything but
+  finders.
+
+Creating a ModuleSpec
+---------------------
+
+**ModuleSpec(name, loader, *, origin=None, is_package=None)**
+
+.. container::
+
+   ``name``, ``loader``, and ``origin`` are set on the new instance
+   without any modification.  If ``is_package`` is not passed in, the
+   loader's ``is_package()`` gets called (if available), or it defaults
+   to ``False``.  If ``is_package`` is true,
+   ``submodule_search_locations`` is set to a new empty list.  Otherwise
+   it is set to None.
+
+   Other attributes not listed as parameters (such as ``package``) are
+   either read-only dynamic properties or default to None.
+
+**from_filename(name, loader, *, filename=None, submodule_search_locations=None)**
+
+.. container::
+
+   This factory classmethod allows a suitable ModuleSpec instance to be
+   easily created with extra file-related information.  This includes
+   the values that would be set on a module as ``__file__`` or
+   ``__cached__``.
+
+   ``has_location`` is set to True for specs created using
+   ``from_filename()``.
+
+**from_module(module, loader=None)**
+
+.. container::
+
+   This factory is used to create a spec based on the import-related
+   attributes of an existing module.  Since modules should already have
+   ``__spec__`` set, this method has limited utility.
+
+**from_loader(name, loader, *, origin=None, is_package=None)**
+
+.. container::
+
+   A factory classmethod that returns a new ``ModuleSpec`` derived from
+   the arguments.  ``is_package`` is used inside the method to indicate
+   that the module is a package.  If not explicitly passed in, it falls
+   back to using the result of the loader's ``is_package()``, if
+   available.  If not available, it defaults to False.
+
+   In contrast to ``ModuleSpec.__init__()``, which takes the arguments
+   as-is, ``from_loader()`` calculates missing values from the ones
+   passed in, as much as possible.  This replaces the behavior that is
+   currently provided by several ``importlib.util`` functions as well as
+   the optional ``init_module_attrs()`` method of loaders.  Just to be
+   clear, here is a more detailed description of those calculations::
+
+      If not passed in, ``filename`` is set to the result of calling the
+      loader's ``get_filename()``, if available.  Otherwise it stays
+      unset (``None``).
+
+      If not passed in, ``submodule_search_locations`` is set to an empty
+      list if ``is_package`` is true.  Then the directory from ``filename``
+      is appended to it, if possible.  If ``is_package`` is false,
+      ``submodule_search_locations`` stays unset.
+
+      If ``cached`` is not passed in and ``filename`` is passed in,
+      ``cached`` is derived from it.  For filenames with a source suffix,
+      it is set to the result of calling
+      ``importlib.util.cache_from_source()``.  For bytecode suffixes (e.g.
+      ``.pyc``), ``cached`` is set to the value of ``filename``.  If
+      ``filename`` is not passed in or ``cache_from_source()`` raises
+      ``NotImplementedError``, ``cached`` stays unset.
+
+      If not passed in, ``origin`` is set to ``filename``.  Thus if
+      ``filename`` is unset, ``origin`` stays unset.
+
+
+Attributes
 ----------
 
-A new class which defines the import-related values to use when loading
-the module.  It closely corresponds to the import-related attributes of
-module objects.  ``ModuleSpec`` objects may also be used by finders and
-loaders and other import-related APIs to hold extra import-related
-state about the module.  This greatly reduces the need to add any new
-new import-related attributes to module objects, and loader ``__init__``
-methods won't need to accommodate such per-module state.
-
-Creating a ModuleSpec:
-
-``ModuleSpec(name, loader, *, origin=None, filename=None, cached=None,
-path=None)``
-
-Passed in parameter values are assigned directly to the corresponding
-attributes below.  Other attributes not listed as parameters (such as
-``package``) are read-only properties that are automatically derived
-from these values.
-
-The ``ModuleSpec.from_loader()`` class method allows a suitable
-ModuleSpec instance to be easily created from a PEP 302 loader object.
-
-ModuleSpec Attributes
----------------------
-
 Each of the following names is an attribute on ``ModuleSpec`` objects.
 A value of ``None`` indicates "not set".  This contrasts with module
 objects where the attribute simply doesn't exist.
 
-While ``package`` and ``is_package`` are read-only properties, the
-remaining attributes can be replaced after the module spec is created
-and after import is complete.  This allows for unusual cases where
-modifying the spec is the best option.  However, typical use should not
-involve changing the state of a module's spec.
+While ``package`` is a read-only property, the remaining attributes can
+be replaced after the module spec is created and even after import is
+complete.  This allows for unusual cases where directly modifying the
+spec is the best option.  However, typical use should not involve
+changing the state of a module's spec.
 
 Most of the attributes correspond to the import-related attributes of
 modules.  Here is the mapping, followed by a description of the
 attributes.  The reverse of this mapping is used by
-``init_module_attrs()``.
+``ModuleSpec.init_module_attrs()``.
 
-============= ===========
-On ModuleSpec On Modules
-============= ===========
-name          __name__
-loader        __loader__
-package       __package__
-is_package    -
-origin        -
-filename      __file__
-cached        __cached__
-path          __path__
-============= ===========
+========================== ===========
+On ModuleSpec              On Modules
+========================== ===========
+name                       __name__
+loader                     __loader__
+package                    __package__
+origin                     __file__*
+cached                     __cached__*
+submodule_search_locations __path__**
+loading_info                \-
+has_location (RO-property)  \-
+========================== ===========
 
-``name``
+\* Only if ``has_location`` is true.
+\*\* Only if not None.
 
-The module's fully resolved and absolute name.  It must be set.
+**name**
 
-``loader``
+.. container::
 
-The loader to use during loading and for module data.  These specific
-functionalities do not change for loaders.  Finders are still
-responsible for creating the loader and this attribute is where it is
-stored.  The loader must be set.
+   The module's fully resolved and absolute name.  It must be set.
 
-``package``
+**loader**
 
-The name of the module's parent.  This is a dynamic attribute with a
-value derived from ``name`` and ``is_package``.  For packages it is the
-value of ``name``.  Otherwise it is equivalent to
-``name.rpartition('.')[0]``.  Consequently, a top-level module will have
-the empty string for ``package``.
+.. container::
 
+   The loader to use during loading and for module data.  These specific
+   functionalities do not change for loaders.  Finders are still
+   responsible for creating the loader and this attribute is where it is
+   stored.  The loader must be set.
 
-``is_package``
+**origin**
 
-Whether or not the module is a package.  This dynamic attribute is True
-if ``path`` is not None (e.g. the empty list is a "true" value), else it
-is false.
+.. container::
 
-``origin``
+   A string for the location from which the module originates.  Aside from
+   the informational value, it is also used in ``module_repr()``.
 
-A string for the location from which the module originates.  If
-``filename`` is set, ``origin`` should be set to the same value unless
-some other value is more appropriate.  ``origin`` is used in
-``module_repr()`` if it does not match the value of ``filename``.
+   The module attribute ``__file__`` has a similar but more restricted
+   meaning.  Not all modules have it set (e.g. built-in modules).  However,
+   ``origin`` is applicable to essentially all modules.  For built-in
+   modules it would be set to "built-in".
 
-Using ``filename`` for this meaning would be inaccurate, since not all
-modules have path-based locations.  For instance, built-in modules do
-not have ``__file__`` set.  Yet it is useful to have a descriptive
-string indicating that it originated from the interpreter as a built-in
-module.  So built-in modules will have ``origin`` set to ``"built-in"``.
+Secondary Attributes
+--------------------
 
-Path-based attributes:
+Some of the ``ModuleSpec`` attributes are not set via arguments when
+creating a new spec.  Either they are strictly dynamically calculated
+properties or they are simply set to None (aka "not set").  For the
+latter case, those attributes may still be set directly.
 
-If any of these is set, it indicates that the module is path-based.  For
-reference, a path entry is a string for a location where the import
-system will look for modules, e.g. the path entries in ``sys.path`` or a
-package's ``__path__``).
+**package**
 
-``filename``
+.. container::
 
-Like ``origin``, but limited to a path-based location.  If ``filename``
-is set, ``origin`` should be set to the same string, unless origin is
-explicitly set to something else.  ``filename`` is not necessarily an
-actual file name, but could be any location string based on a path
-entry.  Regarding the attribute name, while it is potentially
-inaccurate, it is both consistent with the equivalent module attribute
-and generally accurate.
+   A dynamic property that gives the name of the module's parent.  The
+   value is derived from ``name`` and ``is_package``.  For packages it is
+   the value of ``name``.  Otherwise it is equivalent to
+   ``name.rpartition('.')[0]``.  Consequently, a top-level module will have
+   the empty string for ``package``.
 
-.. XXX Would a different name be better?  ``path_location``?
+**has_location**
 
-``cached``
+.. container::
 
-The path-based location where the compiled code for a module should be
-stored.  If ``filename`` is set to a source file, this should be set to
-corresponding path that PEP 3147 specifies.  The
-``importlib.util.source_to_cache()`` function facilitates getting the
-correct value.
+   Some modules can be loaded by reference to a location, e.g. a filesystem
+   path or a URL or something of the sort.  Having the location lets you
+   load the module, but in theory you could load that module under various
+   names.
 
-``path``
+   In contrast, non-located modules can't be loaded in this fashion, e.g.
+   builtin modules and modules dynamically created in code.  For these, the
+   name is the only way to access them, so they have an "origin" but not a
+   "location".
 
-The list of path entries in which to search for submodules if this
-module is a package.  Otherwise it is ``None``.
+   This attribute reflects whether or not the module is locatable.  If it
+   is, ``origin`` must be set to the module's location and ``__file__``
+   will be set on the module.  Furthermore, a locatable module is also
+   cacheable and so ``__cached__`` is tied to ``has_location``.
 
-.. XXX add a path-based subclass?
+   The corresponding module attribute name, ``__file__``, is somewhat
+   inaccurate and potentially confusing, so we will use a more explicit
+   combination of ``origin`` and ``has_location`` to represent the same
+   information.  Having a separate ``filename`` is unnecessary since we have
+   ``origin``.
 
-ModuleSpec Methods
-------------------
+**cached**
 
-``from_loader(name, loader, *, is_package=None, origin=None, filename=None, cached=None, path=None)``
+.. container::
 
-.. XXX use a different name?
+   A string for the location where the compiled code for a module should be
+   stored.  PEP 3147 details the caching mechanism of the import system.
 
-A factory classmethod that returns a new ``ModuleSpec`` derived from the
-arguments.  ``is_package`` is used inside the method to indicate that
-the module is a package.  If not explicitly passed in, it is set to
-``True`` if ``path`` is passed in.  It falls back to using the result of
-the loader's ``is_package()``, if available.  Finally it defaults to
-False.  The remaining parameters have the same meaning as the
-corresponding ``ModuleSpec`` attributes.
+   If ``has_location`` is true, this location string is set on the module
+   as ``__cached__``.  When ``from_filename()`` is used to create a spec,
+   ``cached`` is set to the result of calling
+   ``importlib.util.cache_from_source()``.
 
-In contrast to ``ModuleSpec.__init__()``, which takes the arguments
-as-is, ``from_loader()`` calculates missing values from the ones passed
-in, as much as possible.  This replaces the behavior that is currently
-provided by several ``importlib.util`` functions as well as the optional
-``init_module_attrs()`` method of loaders.  Just to be clear, here is a
-more detailed description of those calculations::
+   ``cached`` is not necessarily a file location.  A finder or loader may
+   store an alternate location string in ``cached``.  However, in practice
+   this will be the file location dictated by PEP 3147.
 
-   If not passed in, ``filename`` is to the result of calling the
-   loader's ``get_filename()``, if available.  Otherwise it stays
-   unset (``None``).
+**submodule_search_locations**
 
-   If not passed in, ``path`` is set to an empty list if
-   ``is_package`` is true.  Then the directory from ``filename`` is
-   appended to it, if possible.  If ``is_package`` is false, ``path``
-   stays unset.
+.. container::
 
-   If ``cached`` is not passed in and ``filename`` is passed in,
-   ``cached`` is derived from it.  For filenames with a source suffix,
-   it set to the result of calling
-   ``importlib.util.cache_from_source()``.  For bytecode suffixes (e.g.
-   ``.pyc``), ``cached`` is set to the value of ``filename``.  If
-   ``filename`` is not passed in or ``cache_from_source()`` raises
-   ``NotImplementedError``, ``cached`` stays unset.
+   The list of location strings, typically directory paths, in which to
+   search for submodules.  If the module is a package this will be set to
+   a list (even an empty one).  Otherwise it is ``None``.
 
-   If not passed in, ``origin`` is set to ``filename``.  Thus if
-   ``filename`` is unset, ``origin`` stays unset.
+   The corresponding module attribute's name, ``__path__``, is relatively
+   ambiguous.  Instead of mirroring it, we use a more explicit name that
+   makes the purpose clear.
 
-``module_repr()``
+**loading_info**
 
-Returns a repr string for the module if ``origin`` is set and
-``filename`` is not set.  The string refers to the value of ``origin``.
-Otherwise ``module_repr()`` returns None.  This indicates to the module
-type's ``__repr__()`` that it should fall back to the default repr.
+.. container::
 
-We could also have ``module_repr()`` produce the repr for the case where
-``filename`` is set or where ``origin`` is not set, mirroring the repr
-that the module type produces directly.  However, the repr string is
-derived from the import-related module attributes, which might be out of
-sync with the spec.
+   A finder may set ``loading_info`` to any value to provide additional
+   data for the loader to use during loading.  A value of ``None`` is the
+   default and indicates that there is no additional data.  Otherwise it is
+   likely set to some container, such as a ``dict``, ``list``, or
+   ``types.SimpleNamespace`` containing the relevant extra information.
 
-.. XXX Is using the spec close enough?  Probably not.
+   For example, ``zipimporter`` could use it to pass the zip archive name
+   to the loader directly, rather than needing to derive it from ``origin``
+   or create a custom loader for each find operation.
 
-The implementation of the module type's ``__repr__()`` will change to
-accommodate this PEP.  However, the current functionality will remain to
-handle the case where a module does not have a ``__spec__`` attribute.
+Methods
+-------
 
-.. XXX Clarify the above justification.
+**module_repr()**
 
-``init_module_attrs(module)``
+.. container::
 
-Sets the module's import-related attributes to the corresponding values
-in the module spec.  If a path-based attribute is not set on the spec,
-it is not set on the module.  For the rest, a ``None`` value on the spec
-(aka "not set") means ``None`` will be set on the module.  If any of the
-attributes are already set on the module, the existing values are
-replaced.  The module's own ``__spec__`` is not consulted but does get
-replaced with the spec on which ``init_module_attrs()`` was called.
-The earlier mapping of ``ModuleSpec`` attributes to module attributes
-indicates which attributes are involved on both sides.
+   Returns a repr string for the module, based on the module's
+   import-related attributes and falling back to the spec's attributes.  The
+   string will reflect the current output of the module type's
+   ``__repr__()``.
 
-``load(module=None, *, is_reload=False)``
+   The module type's ``__repr__()`` will use the module's ``__spec__``
+   exclusively.  If the module does not have ``__spec__`` set, a spec is
+   generated using ``ModuleSpec.from_module()``.
 
-This method captures the current functionality of and requirements on
-``Loader.load_module()`` without any semantic changes, except one.
-Reloading a module when ``exec_module()`` is available actually uses
-``module`` rather than ignoring it in favor of the one in
-``sys.modules``, as ``Loader.load_module()`` does.
+   Since the module attributes may be out of sync with the spec and to
+   preserve backward-compatibility in that case, we defer to the module
+   attributes and only when they are missing do we fall back to the spec
+   attributes.
 
-``module`` is only allowed when ``is_reload`` is true.  This means that
-``is_reload`` could be dropped as a parameter.  However, doing so would
-mean we could not use ``None`` to indicate that the module should be
-pulled from ``sys.modules``.  Furthermore, ``is_reload`` makes the
-intent of the call clear.
+**init_module_attrs(module)**
 
-There are two parts to what happens in ``load()``.  First, the module is
-prepared, loaded, updated appropriately, and left available for the
-second part.  This is described in more detail shortly.
+.. container::
 
-Second, in the case of error during a normal load (not reload) the
-module is removed from ``sys.modules``.  If no error happened, the
-module is pulled from ``sys.modules``.  This the module returned by
-``load()``.  Before it is returned, if it is a different object than the
-one produced by the first part, attributes of the module from
-``sys.modules`` are updated to reflect the spec.
+   Sets the module's import-related attributes to the corresponding values
+   in the module spec.  If ``has_location`` is false on the spec,
+   ``__file__`` and ``__cached__`` are not set on the module.  ``__path__``
+   is only set on the module if ``submodule_search_locations`` is not None.
+   For the rest of the import-related module attributes, a ``None`` value
+   on the spec (aka "not set") means ``None`` will be set on the module.
+   If any of the attributes are already set on the module, the existing
+   values are replaced.  The module's own ``__spec__`` is not consulted but
+   does get replaced with the spec on which ``init_module_attrs()`` was
+   called.  The earlier mapping of ``ModuleSpec`` attributes to module
+   attributes indicates which attributes are involved on both sides.
 
-Returning the module from ``sys.modules`` accommodates the ability of
-the module to replace itself there while it is executing (during load).
+**create()**
 
-As already noted, this is what already happens in the import system.
-``load()`` is not meant to change any of this behavior.
+.. container::
 
-Regarding the first part of ``load()``, the following describes what
-happens.  It depends on if ``is_reload`` is true and if the loader has
-``exec_module()``.
+   A new module is created relative to the spec and its import-related
+   attributes are set accordingly.  If the spec's loader has a
+   ``create_module()`` method, that gets called to create the module.  This
+   gives the loader a chance to do any pre-loading initialization that can't
+   otherwise be accomplished elsewhere.  Otherwise a bare module object is
+   created.  In both cases ``init_module_attrs()`` is called on the module
+   before it gets returned.
 
-For normal load with ``exec_module()`` available::
+**exec(module)**
 
-   A new module is created, ``init_module_attrs()`` is called to set
-   its attributes, and it is set on sys.modules.  At that point
-   the loader's ``exec_module()`` is called, after which the module
-   is ready for the second part of loading.
+.. container::
 
-.. XXX What if the module already exists in sys.modules?
+   The spec's loader is used to execute the module.  If the loader has
+   ``exec_module()`` defined, the namespace of ``module`` is the target of
+   execution.  Otherwise the loader's ``load_module()`` is called, which
+   ignores ``module`` and returns the module that was the actual
+   execution target.  In that case the import-related attributes of that
+   module are updated to reflect the spec.  In both cases the targeted
+   module is the one that gets returned.
 
-For normal load without ``exec_module()`` available::
+**load()**
 
-   The loader's ``load_module()`` is called and the attributes of the
-   module it returns are updated to match the spec.
+.. container::
 
-For reload with ``exec_module()`` available::
+   This method captures the current functionality of and requirements on
+   ``Loader.load_module()`` without any semantic changes.  It is
+   essentially a wrapper around ``create()`` and ``exec()`` with some
+   extra functionality regarding ``sys.modules``.
 
-   If ``module`` is ``None``, it is pulled from ``sys.modules``.  If
-   still ``None``, ImportError is raised.  Otherwise ``exec_module()``
-   is called, passing in the module-to-be-reloaded.
+   A module may replace itself in ``sys.modules`` while executing.
+   Consequently, the module in ``sys.modules`` is the one ``load()`` returns.
 
-For reload without ``exec_module()`` available::
+   Right before ``exec()`` is called, the module is added to
+   ``sys.modules``.  In the case of error during loading the module is
+   removed from ``sys.modules``.  The module in ``sys.modules`` when
+   ``load()`` finishes is the one that gets returned.  Returning the module
+   from ``sys.modules`` accommodates the ability of the module to replace
+   itself there while it is executing (during load).
 
-   The loader's ``load_module()`` is called and the attributes of the
-   module it returns are updated to match the spec.
+   As already noted, this is what currently happens in the import system.
+   ``load()`` is not meant to change any of this behavior.
 
-There is some boilerplate involved when ``exec_module()`` is available,
-but only the boilerplate that the import system uses currently.
+   If ``loader`` is not set (``None``), ``load()`` raises a ValueError.
 
-If ``loader`` is not set (``None``), ``load()`` raises a ValueError.  If
-``module`` is passed in but ``is_reload`` is false, a ValueError is also
-raises to indicate that ``load()`` was called incorrectly.  There may be
-use cases for calling ``load()`` in that way, but they are outside the
-scope of this PEP
+**reload(module)**
 
-.. XXX add reload(module=None) and drop load()'s parameters entirely?
-.. XXX add more of importlib.reload()'s boilerplate to load()/reload()?
+.. container::
+
+   As with ``load()`` this method faithfully fulfills the semantics of
+   ``Loader.load_module()`` in the reload case, with one exception:
+   reloading a module when ``exec_module()`` is available actually uses
+   ``module`` rather than ignoring it in favor of the one in
+   ``sys.modules``, as ``Loader.load_module()`` does.  The functionality
+   here mirrors that of ``load()``, minus the ``create()`` call and the
+   ``sys.modules`` handling.
+
+.. XXX add more of importlib.reload()'s boilerplate to reload()?
 
 Omitted Attributes and Methods
 ------------------------------
 
-``ModuleSpec`` does not have a ``from_module()`` factory method since
-all modules should already have a spec.
+There is no ``PathModuleSpec`` subclass of ``ModuleSpec`` that provides
+the ``has_location``, ``cached``, and ``submodule_search_locations``
+functionality.  While that might make the separation cleaner, module
+objects don't have that distinction.  ``ModuleSpec`` will support both
+cases equally well.
 
-Additionally, there is no ``PathModuleSpec`` subclass of ``ModuleSpec``
-that provides the ``filename``, ``cached``, and ``path`` functionality.
-While that might make the separation cleaner, module objects don't have
-that distinction.  ``ModuleSpec`` will support both cases equally well.
+While ``is_package`` would be a simple additional attribute (aliasing
+``self.submodule_search_locations is not None``), it perpetuates the
+artificial (and mostly erroneous) distinction between modules and
+packages.
+
+Conceivably, ``ModuleSpec.load()`` could optionally take a list of
+modules with which to interact instead of ``sys.modules``.  That
+capability is left out of this PEP, but may be pursued separately at
+some other time, including relative to PEP 406 (import engine).
+
+Likewise ``load()`` could be leveraged to implement multi-version
+imports.  While interesting, doing so is outside the scope of this
+proposal.
 
 Backward Compatibility
 ----------------------
 
-Since ``Finder.find_module()`` methods would now return a module spec
-instead of loader, specs must act like the loader that would have been
-returned instead.  This is relatively simple to solve since the loader
-is available as an attribute of the spec.  We will use ``__getattr__()``
-to do it.
-
-However, ``ModuleSpec.is_package`` (an attribute) conflicts with
-``InspectLoader.is_package()`` (a method).  Working around this requires
-a more complicated solution but is not a large obstacle.  Simply making
-``ModuleSpec.is_package`` a method does not reflect that is a relatively
-static piece of data.  ``module_repr()`` also conflicts with the same
-method on loaders, but that workaround is not complicated since both are
-methods.
-
-Unfortunately, the ability to proxy does not extend to ``id()``
-comparisons and ``isinstance()`` tests.  In the case of the return value
-of ``find_module()``, we accept that break in backward compatibility.
-However, we will mitigate the problem with ``isinstance()`` somewhat by
-registering ``ModuleSpec`` on the loaders in ``importlib.abc``.
+``ModuleSpec`` doesn't have any.  This would be a different story if
+``Finder.find_module()`` were to return a module spec instead of loader.
+In that case, specs would have to act like the loader that would have
+been returned instead.  Doing so would be relatively simple, but is an
+unnecessary complication.
 
 Subclassing
 -----------
 
 Subclasses of ModuleSpec are allowed, but should not be necessary.
-Adding functionality to a custom finder or loader will likely be a
-better fit and should be tried first.  However, as long as a subclass
-still fulfills the requirements of the import system, objects of that
-type are completely fine as the return value of ``find_module()``.
+Simply setting ``loading_info`` or adding functionality to a custom
+finder or loader will likely be a better fit and should be tried first.
+However, as long as a subclass still fulfills the requirements of the
+import system, objects of that type are completely fine as the return
+value of ``Finder.find_spec()``.
+
+
+Existing Types
+==============
 
 Module Objects
 --------------
 
-Module objects will now have a ``__spec__`` attribute to which the
-module's spec will be bound.  None of the other import-related module
-attributes will be changed or deprecated, though some of them could be;
-any such deprecation can wait until Python 4.
+**__spec__**
+
+.. container::
+
+   Module objects will now have a ``__spec__`` attribute to which the
+   module's spec will be bound.
+
+None of the other import-related module attributes will be changed or
+deprecated, though some of them could be; any such deprecation can wait
+until Python 4.
 
 ``ModuleSpec`` objects will not be kept in sync with the corresponding
 module object's import-related attributes.  Though they may differ, in
@@ -438,32 +708,30 @@
 reflect the actual module name while ``module.__name__`` will be
 ``__main__``.
 
-The ``__file__`` attribute will be set where applicable in the same way
-it is now.  For instance, zip imports will still have it set for
-backward-compatibility reasons.  However, the recommendation will be to
-have ``__file__`` set only for actual filenames from now on.
-
 Finders
 -------
 
-Finders will now return ModuleSpec objects when ``find_module()`` is
-called rather than loaders.  For backward compatility, ``Modulespec``
-objects proxy the attributes of their ``loader`` attribute.
+**MetaPathFinder.find_spec(name, path=None)**
 
-Adding another similar method to avoid backward-compatibility issues
-is undersireable if avoidable.  The import APIs have suffered enough,
-especially considering ``PathEntryFinder.find_loader()`` was just
-added in Python 3.3.  The approach taken by this PEP should be
-sufficient to address backward-compatibility issues for
-``find_module()``.
+**PathEntryFinder.find_spec(name)**
 
-The change to ``find_module()`` applies to both ``MetaPathFinder`` and
-``PathEntryFinder``.  ``PathEntryFinder.find_loader()`` will be
-deprecated and, for backward compatibility, implicitly special-cased if
-the method exists on a finder.
+.. container::
+
+   Finders will return ModuleSpec objects when ``find_spec()`` is
+   called.  This new method replaces ``find_module()`` and
+   ``find_loader()`` (in the ``PathEntryFinder`` case).  If a finder does
+   not have ``find_spec()``, ``find_module()`` and ``find_loader()`` are
+   used instead, for backward compatibility.
+
+   Adding yet another similar method to finders is a case of practicality.
+   ``find_module()`` could be changed to return specs instead of loaders.
+   This is tempting because the import APIs have suffered enough,
+   especially considering ``PathEntryFinder.find_loader()`` was just
+   added in Python 3.3.  However, the extra complexity and a
+   less-than-explicit method name aren't worth it.
 
 Finders are still responsible for creating the loader.  That loader will
-now be stored in the module spec returned by ``find_module()`` rather
+now be stored in the module spec returned by ``find_spec()`` rather
 than returned directly.  As is currently the case without the PEP, if a
 loader would be costly to create, that loader can be designed to defer
 the cost until later.
@@ -471,26 +739,45 @@
 Loaders
 -------
 
-Loaders will have a new method, ``exec_module(module)``.  Its only job
-is to "exec" the module and consequently populate the module's
-namespace.  It is not responsible for creating or preparing the module
-object, nor for any cleanup afterward.  It has no return value.
+**Loader.exec_module(module)**
 
-The ``load_module()`` of loaders will still work and be an active part
-of the loader API.  It is still useful for cases where the default
-module creation/prepartion/cleanup is not appropriate for the loader.
+.. container::
 
-For example, the C API for extension modules only supports the full
-control of ``load_module()``.  As such, ``ExtensionFileLoader`` will not
-implement ``exec_module()``.  In the future it may be appropriate to
-produce a second C API that would support an ``exec_module()``
-implementation for ``ExtensionFileLoader``.  Such a change is outside
-the scope of this PEP.
+   Loaders will have a new method, ``exec_module()``.  Its only job
+   is to "exec" the module and consequently populate the module's
+   namespace.  It is not responsible for creating or preparing the module
+   object, nor for any cleanup afterward.  It has no return value.
+
+**Loader.load_module(fullname)**
+
+.. container::
+
+   The ``load_module()`` of loaders will still work and be an active part
+   of the loader API.  It is still useful for cases where the default
+   module creation/preparation/cleanup is not appropriate for the loader.
+   If implemented, ``load_module()`` will still be responsible for its
+   current requirements (prep/exec/etc.) since the method may be called
+   directly.
+
+   For example, the C API for extension modules only supports the full
+   control of ``load_module()``.  As such, ``ExtensionFileLoader`` will not
+   implement ``exec_module()``.  In the future it may be appropriate to
+   produce a second C API that would support an ``exec_module()``
+   implementation for ``ExtensionFileLoader``.  Such a change is outside
+   the scope of this PEP.
 
 A loader must define either ``exec_module()`` or ``load_module()``.  If
 both exist on the loader, ``ModuleSpec.load()`` uses ``exec_module()``
 and ignores ``load_module()``.
 
+**Loader.create_module(spec)**
+
+.. container::
+
+   Loaders may also implement ``create_module()``, which will return a
+   new module to exec.  However, most loaders will not need to implement
+   the method.
+
 PEP 420 introduced the optional ``module_repr()`` loader method to limit
 the amount of special-casing in the module type's ``__repr__()``.  Since
 this method is part of ``ModuleSpec``, it will be deprecated on loaders.
@@ -506,86 +793,56 @@
 
 The path-based loaders in ``importlib`` take arguments in their
 ``__init__()`` and have corresponding attributes.  However, the need for
-those values is eliminated.  The only exception is
+those values is eliminated by module specs.  The only exception is
 ``FileLoader.get_filename()``, which uses ``self.path``.  The signatures
 for these loaders and the accompanying attributes will be deprecated.
 
 In addition to executing a module during loading, loaders will still be
 directly responsible for providing APIs concerning module-related data.
 
+
 Other Changes
--------------
+=============
 
 * The various finders and loaders provided by ``importlib`` will be
   updated to comply with this proposal.
-
 * The spec for the ``__main__`` module will reflect how the interpreter
   was started.  For instance, with ``-m`` the spec's name will be that
   of the run module, while ``__main__.__name__`` will still be
   "__main__".
-
-* We add ``importlib.find_module()`` to mirror
+* We add ``importlib.find_spec()`` to mirror
   ``importlib.find_loader()`` (which becomes deprecated).
-
 * Deprecations in ``importlib.util``: ``set_package()``,
   ``set_loader()``, and ``module_for_loader()``.  ``module_to_load()``
   (introduced prior to Python 3.4's release) can be removed.
-
 * ``importlib.reload()`` is changed to use ``ModuleSpec.load()``.
-
 * ``ModuleSpec.load()`` and ``importlib.reload()`` will now make use of
   the per-module import lock, whereas ``Loader.load_module()`` did not.
 
+
 Reference Implementation
-------------------------
+========================
 
-A reference implementation is available at <TBD>.
+A reference implementation will be available at
+http://bugs.python.org/issue18864.
 
 
-Open Questions
+Open Issues
 ==============
 
-* How to avoid having custom ModuleSpec attributes conflict with future
-  normal attributes?
+\* The impact of this change on pkgutil (and setuptools) needs looking
+into.  It has some generic function-based extensions to PEP 302.  These
+may break if importlib starts wrapping loaders without the tools'
+knowledge.
 
-This could be done with a sub-namespace bound to a single ModuleSpec
-attribute.  It could also be done by reserving names with a single
-leading underscore for custom attributes.  Or we could just not worry
-about it.
+\* Other modules to look at: runpy (and pythonrun.c), pickle, pydoc,
+inspect.
 
-* Get rid of the ``is_package`` property?
+\* Add ``ModuleSpec.data`` as a descriptor that wraps the data API of the
+spec's loader?
 
-It duplicates information
-both in the ``ModuleSpec()`` signature and in attributes.  It is
-technically unncessary in light of the path attribute and it conflicts
-with ``InspectLoader.is_package()``, which makes the implementation more
-complicated.  However, it also provides an explicit indicator of
-package-ness, which helps those less familiar with the import system.
-
-* Deprecate the use of ``__file__`` for anything except actual files?
-
-* Introduce a new extension module API that takes advantage of
-  ``ModuleSpec``?  I'd rather that be part of a separate proposal.
-
-* Add ``create_module()`` to loaders?
-
-It would take a ``ModuleSpec``
-and return the module that should be passed to ``spec.exec()``.  This
-method would be helpful for new extension module import APIs.
-
-* Have ``ModuleSpec.module_repr()`` replace more of the module type's
-  ``__repr__()`` implementation?
-
-A compliant module is required to have
-``__spec__`` set so that should work.  However, currently the repr uses
-the module attributes.  Using the spec attributes would give precedence
-to the spec in the case that they differ, which would be
-backward-incompatible.
-
-* Factor the path-based attributes/functionality into a subclass--
-  something like ``PathModuleSpec``?
-
-It looks like there just isn't enough benefit to doing so.
+\* How to limit possible end-user confusion/abuses relative to spec
+attributes (since ``__spec__`` will make them really accessible)?
 
 
 References

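For readers following the ``create()``/``exec()``/``load()`` text added in
this changeset, here is a rough Python sketch of the flow as described: create
the module, add it to ``sys.modules`` right before execution, remove it on
error, and return whatever is in ``sys.modules`` at the end.  It is only an
illustration, not the reference implementation.  The free functions
``create()`` and ``load()``, the ``SourceStringLoader`` class, and the module
name ``pep451_demo`` are all hypothetical names made up for the sketch; only
the ``exec_module()`` path is covered (the ``load_module()`` fallback is left
out), and ``init_module_attrs()`` is reduced to two attributes.

```python
import sys
import types
from importlib.machinery import ModuleSpec


def create(spec):
    """Hypothetical stand-in for the create() step described in the PEP text."""
    create_module = getattr(spec.loader, "create_module", None)
    if create_module is not None:
        # The loader gets a chance to build the module itself.
        module = create_module(spec)
    else:
        # Otherwise a bare module object is created.
        module = types.ModuleType(spec.name)
    # Greatly simplified stand-in for init_module_attrs().
    module.__spec__ = spec
    module.__loader__ = spec.loader
    return module


def load(spec):
    """Hypothetical stand-in for load(): create, register, exec, return."""
    if spec.loader is None:
        raise ValueError("spec has no loader")
    module = create(spec)
    # The module is added to sys.modules right before execution.
    sys.modules[spec.name] = module
    try:
        spec.loader.exec_module(module)
    except BaseException:
        # On error during loading the module is removed again.
        sys.modules.pop(spec.name, None)
        raise
    # Return what is in sys.modules, since the module may have replaced itself.
    return sys.modules[spec.name]


class SourceStringLoader:
    """Toy loader (an assumption of this sketch) that execs a source string."""

    def __init__(self, source):
        self.source = source

    def exec_module(self, module):
        # Only populates the module's namespace; it has no return value.
        exec(self.source, module.__dict__)


spec = ModuleSpec("pep451_demo", SourceStringLoader("answer = 6 * 7"))
module = load(spec)
```

Looking the result up in ``sys.modules`` at the end, rather than returning the
local ``module`` variable, is what lets a module swap itself out during
execution, as the ``load()`` text explains.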
-- 
Repository URL: http://hg.python.org/peps

