[Python-checkins] r63920 - in peps/trunk: pep-0000.txt pep-0371.txt

Tue Jun 3 16:14:50 CEST 2008

Author: david.goodger
Date: Tue Jun  3 16:14:50 2008
New Revision: 63920

Log:
changes from PEP authors; corrections

Modified:
   peps/trunk/pep-0000.txt
   peps/trunk/pep-0371.txt

Modified: peps/trunk/pep-0000.txt
==============================================================================

--- peps/trunk/pep-0000.txt	(original)
+++ peps/trunk/pep-0000.txt	Tue Jun  3 16:14:50 2008
@@ -96,7 +96,7 @@
  S   364  Transitioning to the Py3K Standard Library   Warsaw
  S   368  Standard image protocol and class            Mastrodomenico
  S   369  Post import hooks                            Heimes
- S   371  Addition of the Processing module            Noller, Oudkerk
+ S   371  Addition of the multiprocessing package      Noller, Oudkerk
  S  3134  Exception Chaining and Embedded Tracebacks   Yee
  S  3135  New Super                                    Spealman, Delaney
  S  3138  String representation in Python 3000         Ishimoto
@@ -475,7 +475,7 @@
  S   368  Standard image protocol and class            Mastrodomenico
  S   369  Post import hooks                            Heimes
  SA  370  Per user site-packages directory             Heimes
- S   371  Addition of the Processing module            Noller, Oudkerk
+ S   371  Addition of the multiprocessing package      Noller, Oudkerk
  SR  666  Reject Foolish Indentation                   Creighton
  SR  754  IEEE 754 Floating Point Special Values       Warnes
  P  3000  Python 3000                                  GvR

Modified: peps/trunk/pep-0371.txt
==============================================================================
--- peps/trunk/pep-0371.txt	(original)
+++ peps/trunk/pep-0371.txt	Tue Jun  3 16:14:50 2008
@@ -1,5 +1,5 @@
 PEP: 371
-Title: Addition of the Processing module to standard library
+Title: Addition of the multiprocessing package to the standard library
 Version: $Revision: $
 Last-Modified: $Date: $
 Author: Jesse Noller <jnoller at gmail.com>
@@ -14,22 +14,22 @@
 
 Abstract
 
-    This PEP proposes the inclusion of the pyProcessing [1] module into the
-    python standard library.
+    This PEP proposes the inclusion of the pyProcessing [1] package into the
+    Python standard library, renamed to "multiprocessing".
 
-    The processing module mimics the standard library threading module and API
+    The processing package mimics the standard library threading module and API
     to provide a process-based approach to "threaded programming" allowing
     end-users to dispatch multiple tasks that effectively side-step the global
     interpreter lock.
 
-    The module also provides server and client modules to provide remote-
-    sharing and management of objects and tasks so that applications may not
-    only leverage multiple cores on the local machine, but also distribute
-    objects and tasks across a cluster of networked machines.
+    The package also provides server and client functionality (processing.Manager)
+    to provide remote sharing and management of objects and tasks so that 
+    applications may not only leverage multiple cores on the local machine, 
+    but also distribute objects and tasks across a cluster of networked machines.
 
-    While the distributed capabilities of the module are beneficial, the primary
+    While the distributed capabilities of the package are beneficial, the primary
     focus of this PEP is the core threading-like API and capabilities of the
-    module.
+    package.
 
 Rationale
 
@@ -41,20 +41,20 @@
     Python programmers who are leveraging multi-core machines.
 
     The GIL itself prevents more than a single thread from running within the
-    interpreter at any given point in time, effectively removing python's
+    interpreter at any given point in time, effectively removing Python's
     ability to take advantage of multi-processor systems.  While I/O bound
     applications do not suffer the same slow-down when using threading, they do
     suffer some performance cost due to the GIL.
 
-    The Processing module offers a method to side-step the GIL allowing
+    The pyProcessing package offers a method to side-step the GIL allowing
     applications within CPython to take advantage of multi-core architectures
     without asking users to completely change their programming paradigm (i.e.:
     dropping threaded programming for another "concurrent" approach - Twisted,
     etc).
 
-    The Processing module offers CPython users a known API (that of the
+    The Processing package offers CPython users a known API (that of the
     threading module), with known semantics and easy-scalability.  In the
-    future, the module might not be as relevant should the CPython interpreter
+    future, the package might not be as relevant should the CPython interpreter
     enable "true" threading, however for some applications, forking an OS
     process may sometimes be more desirable than using lightweight threads,
     especially on those platforms where process creation is fast/optimized.
@@ -70,7 +70,7 @@
         t.start()
         t.join()
 
-    The pyprocessing module mirrors the API so well, that with a simple change
+    The pyprocessing package mirrors the API so well, that with a simple change
     of the import to:
 
         from processing import Process as worker
@@ -78,17 +78,17 @@
     The code now executes through the processing.Process class.  This type of
     compatibility means that, with a minor (in most cases) change in code,
     users' applications will be able to leverage all cores and processors on a
-    given machine for parallel execution.  In many cases the pyprocessing module
+    given machine for parallel execution.  In many cases the pyprocessing package
     is even faster than the normal threading approach for I/O bound programs.
-    This of course, takes into account that the pyprocessing module is in
+    This of course, takes into account that the pyprocessing package is in
     optimized C code, while the threading module is not.
 
 The "Distributed" Problem
 
-    In the discussion on Python-Dev about the inclusion of this module [3] there
+    In the discussion on Python-Dev about the inclusion of this package [3] there
     was confusion about the intentions this PEP with an attempt to solve the
     "Distributed" problem - frequently comparing the functionality of this
-    module with other solutions like MPI-based communication [4], CORBA, or
+    package with other solutions like MPI-based communication [4], CORBA, or
     other distributed object approaches [5].
 
     The "distributed" problem is large and varied.  Each programmer working
@@ -96,24 +96,24 @@
     module/method or a highly customized problem for which no existing solution
     works.
 
-    The acceptance of this module does not preclude or recommend that
+    The acceptance of this package does not preclude or recommend that
     programmers working on the "distributed" problem not examine other solutions
-    for their problem domain.  The intent of including this module is to provide
+    for their problem domain.  The intent of including this package is to provide
     entry-level capabilities for local concurrency and the basic support to
     spread that concurrency across a network of machines - although the two are
-    not tightly coupled, the pyprocessing module could in fact, be used in
+    not tightly coupled, the pyprocessing package could in fact, be used in
     conjunction with any of the other solutions including MPI/etc.
 
     If necessary - it is possible to completely decouple the local concurrency
-    abilities of the module from the network-capable/shared aspects of the
-    module. Without serious concerns or cause however, the author of this PEP
+    abilities of the package from the network-capable/shared aspects of the
+    package. Without serious concerns or cause however, the author of this PEP
     does not recommend that approach.
 
 Performance Comparison
 
     As we all know - there are "lies, damned lies, and benchmarks".  These speed
     comparisons, while aimed at showcasing the performance of the pyprocessing
-    module, are by no means comprehensive or applicable to all possible use
+    package, are by no means comprehensive or applicable to all possible use
     cases or environments.  Especially for those platforms with sluggish process
     forking timing.
 
@@ -157,10 +157,10 @@
         threaded (8 threads)    0.007990 seconds
         processes (8 procs)     0.005512 seconds
 
-    As you can see, process forking via the pyprocessing module is faster than
+    As you can see, process forking via the pyprocessing package is faster than
     the speed of building and then executing the threaded version of the code.
 
-    The second test calculates 50000 fibonacci numbers inside of each thread
+    The second test calculates 50000 Fibonacci numbers inside of each thread
     (isolated and shared nothing):
 
         cmd: python run_benchmarks.py fibonacci.py
@@ -209,7 +209,7 @@
     showcase how the current threading implementation does hinder non-I/O
     applications.  Obviously, these tests could be improved to use a queue for
     coordination of results and chunks of work but that is not required to show
-    the performance of the module.
+    the performance of the package and core Processing module.
 
     The next test is an I/O bound test.  This is normally where we see a steep
     improvement in the threading module approach versus a single-threaded
@@ -264,51 +264,93 @@
         processes (8 procs)     0.298625 seconds
 
     We finally see threaded performance surpass that of single-threaded
-    execution, but the pyprocessing module is still faster when increasing the
+    execution, but the pyprocessing package is still faster when increasing the
     number of workers.  If you stay with one or two threads/workers, then the
     timing between threads and pyprocessing is fairly close.
 
-    Additional benchmarks can be found in the pyprocessing module's source
-    distribution's examples/ directory.
+    One item of note however, is that there is an implicit overhead within the
+    pyprocessing package's Queue implementation due to the object serialization.
+    
+    Alec Thomas provided a short example based on the run_benchmarks.py script
+    to demonstrate this overhead versus the default Queue implementation:
+
+        cmd: run_bench_queue.py 
+        non_threaded (1 iters)  0.010546 seconds
+        threaded (1 threads)    0.015164 seconds
+        processes (1 procs)     0.066167 seconds
+
+        non_threaded (2 iters)  0.020768 seconds
+        threaded (2 threads)    0.041635 seconds
+        processes (2 procs)     0.084270 seconds
+
+        non_threaded (4 iters)  0.041718 seconds
+        threaded (4 threads)    0.086394 seconds
+        processes (4 procs)     0.144176 seconds
+
+        non_threaded (8 iters)  0.083488 seconds
+        threaded (8 threads)    0.184254 seconds
+        processes (8 procs)     0.302999 seconds
+
+    Additional benchmarks can be found in the pyprocessing package's source
+    distribution's examples/ directory. The examples will be included in the
+    package's documentation.
 
 Maintenance
 
-    Richard M. Oudkerk - the author of the pyprocessing module has agreed to
-    maintaing the module within Python SVN.  Jesse Noller has volunteered to
-    also help maintain/document and test the module.
+    Richard M. Oudkerk - the author of the pyprocessing package has agreed to
+    maintain the package within Python SVN.  Jesse Noller has volunteered to
+    also help maintain/document and test the package.
+
+API Naming
+
+    The API of the pyprocessing package is designed to closely mimic that of
+    the threading and Queue modules. It has been proposed that instead of 
+    adding the package as-is, we rename it to be PEP 8 compliant instead.
+
+    Since the aim of the package is to be a drop-in for the threading
+    module, the authors feel that the current API should be used.
+    When the threading and Queue modules are updated to fully reflect
+    PEP 8, the pyprocessing/multiprocessing naming can be revised.
 
 Timing/Schedule
 
     Some concerns have been raised about the timing/lateness of this PEP
     for the 2.6 and 3.0 releases this year, however it is felt by both
-    the authors and others that the functionality this module offers
+    the authors and others that the functionality this package offers
     surpasses the risk of inclusion.
 
-    However, taking into account the desire not to destabilize python-core, some
-    refactoring of pyprocessing's code "into" python-core can be withheld until
-    the next 2.x/3.x releases.  This means that the actual risk to python-core
-    is minimal, and largely constrained to the actual module itself.
+    However, taking into account the desire not to destabilize Python-core, some
+    refactoring of pyprocessing's code "into" Python-core can be withheld until
+    the next 2.x/3.x releases.  This means that the actual risk to Python-core
+    is minimal, and largely constrained to the actual package itself.
 
 Open Issues
 
-    * All existing tests for the module should be converted to UnitTest format.
+    * All existing tests for the package should be converted to UnitTest format.
     * Existing documentation has to be moved to ReST formatting.
     * Verify code coverage percentage of existing test suite.
     * Identify any requirements to achieve a 1.0 milestone if required.
     * Verify current source tree conforms to standard library practices.
-    * Rename top-level module from "pyprocessing" to "multiprocessing".
+    * Rename top-level package from "pyprocessing" to "multiprocessing".
     * Confirm no "default" remote connection capabilities, if needed enable the
       remote security mechanisms by default for those classes which offer remote
       capabilities.
     * Some of the API (Queue methods qsize(), task_done() and join()) either
       need to be added, or the reason for their exclusion needs to be identified
       and documented clearly.
+    * Add in "multiprocessing.setExecutable()" method to override the default
+      behavior of the package to spawn processes using the current executable
+      name rather than the Python interpreter. Note that Mark Hammond has 
+      suggested a factory-style interface for this[7].
+        * Also note that the default behavior of process spawning does not make
+          it compatible with use within IDLE as-is, this will be examined as
+          a bug-fix or "setExecutable" enhancement.
 
 Closed Issues
 
-    * Reliance on ctypes: The pyprocessing module's reliance on ctypes prevents
-      the module from functioning on platforms where ctypes is not supported.
-      This is not a restriction of this module, but rather ctypes.
+    * Reliance on ctypes: The pyprocessing package's reliance on ctypes prevents
+      the package from functioning on platforms where ctypes is not supported.
+      This is not a restriction of this package, but rather of ctypes.
 
 References
 
@@ -330,6 +372,8 @@
         Magazine in December 2008: "Python Threads and the Global Interpreter
         Lock" by Jesse Noller.  It has been modified for this PEP.
 
+    [7] http://groups.google.com/group/python-dev2/msg/54cf06d15cbcbc34
+
 Copyright
 
     This document has been placed in the public domain.
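
Note on the import swap described in the pep-0371.txt hunk above (the "simple change of the import" from threading to processing): the sketch below is a hedged illustration, not text from the PEP. It assumes the package is importable under the proposed name "multiprocessing" and uses Python 3 print syntax; the names triple and run_with are hypothetical and chosen only for this example. It shows that the same start/join call sequence can drive either worker class.

```python
import multiprocessing
import threading


def triple(number):
    """Task body; runs inside whichever worker type is chosen."""
    print(number * 3)


def run_with(worker_cls, number):
    """Start one worker of the given class, wait for it, and return it.

    worker_cls is assumed to accept target= and args= and to provide
    start() and join(), as threading.Thread and multiprocessing.Process do.
    """
    worker = worker_cls(target=triple, args=(number,))
    worker.start()
    worker.join()
    return worker
```

Calling run_with(threading.Thread, 4) runs the task in a thread. Calling run_with(multiprocessing.Process, 4) runs it in a separate OS process; that call is best placed under an `if __name__ == "__main__":` guard, since some platforms re-import the main module when starting a child process.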