[Python-checkins] python/nondist/peps pep-0307.txt,1.9,1.10

gvanrossum@users.sourceforge.net gvanrossum@users.sourceforge.net
Tue, 04 Feb 2003 11:12:28 -0800


Update of /cvsroot/python/python/nondist/peps
In directory sc8-pr-cvs1:/tmp/cvs-serv26330

Modified Files:
	pep-0307.txt 
Log Message:
Introduce extension codes.


Index: pep-0307.txt
===================================================================
RCS file: /cvsroot/python/python/nondist/peps/pep-0307.txt,v
retrieving revision 1.9
retrieving revision 1.10
diff -C2 -d -r1.9 -r1.10
*** pep-0307.txt	4 Feb 2003 17:53:55 -0000	1.9
--- pep-0307.txt	4 Feb 2003 19:12:25 -0000	1.10
***************
*** 489,492 ****
--- 489,568 ----
  
  
+ The extension registry
+ 
+     Protocol 2 supports a new mechanism to reduce the size of pickles.
+ 
+     When class instances (classic or new-style) are pickled, the full
+     name of the class (module name including package name, and class
+     name) is included in the pickle.  Especially for applications that
+     generate many small pickles, this is a lot of overhead that has to
+     be repeated in each pickle.  For large pickles, when using
+     protocol 1, repeated references to the same class name are
+     compressed using the "memo" feature; but each class name must be
+     spelled in full at least once per pickle, and this causes a lot of
+     overhead for small pickles.
+ 
+     The extension registry allows one to represent the most frequently
+     used names by small integers, which are pickled very efficiently:
+     an extension code in the range 1-255 requires only two bytes
+     including the opcode, and one in the range 256-65535 requires only
+     three bytes including the opcode.
+ 
+     One of the design goals of the pickle protocol is to make pickles
+     "context-free": as long as you have installed the modules
+     containing the classes referenced by a pickle, you can unpickle
+     it, without needing to import any of those classes ahead of time.
+ 
+     Unbridled use of extension codes could jeopardize this desirable
+     property of pickles.  Therefore, the main use of extension codes
+     is reserved for a set of codes to be standardized by some
+     standard-setting body.  This being Python, the standard-setting
+     body is the PSF.  From time to time, the PSF will decide on a
+     table mapping extension codes to class names (or occasionally
+     names of other global objects; functions are also eligible).  This
+     table will be incorporated in the next Python release(s).
+ 
+     However, for some applications, like Zope, context-free pickles
+     are not a requirement, and waiting for the PSF to standardize
+     some codes may not be practical.  Two solutions are offered for
+     such applications.
+ 
+     First of all, a few ranges of extension codes are reserved for
+     private use.  Any application can register codes in these ranges.
+     Two applications exchanging pickles using codes in these ranges
+     need to have some out-of-band mechanism to agree on the mapping
+     between extension codes and names.
+ 
+     Second, some large Python projects (e.g. Zope or Twisted) can be
+     assigned a range of extension codes outside the "private use"
+     range that they can assign as they see fit.
+ 
+     The extension registry is defined as a mapping between extension
+     codes and names.  When an extension code is unpickled, it ends up
+     producing an object, but this object is obtained by interpreting the
+     name as a module name followed by a class (or function) name.  The
+     mapping from names to objects is cached.  It is quite possible
+     that certain names cannot be imported; that should not be a
+     problem as long as no pickle containing a reference to such names
+     has to be unpickled.  (The same issue already exists for direct
+     references to such names in pickles that use protocols 0 or 1.)
+ 
+     Here is the proposed initial assignment of extension code ranges:
+ 
+       First  Last Count  Purpose
+ 
+           0     0     1  Reserved -- will never be used
+           1   127   127  Reserved for Python standard library
+         128   191    64  Reserved for Zope 3
+         192   239    48  Reserved for 3rd parties
+         240   255    16  Reserved for private use (will never be assigned)
+         256   Max   Max  Reserved for future assignment
+ 
+     'Max' stands for 2147483647, or 2**31-1.  This is a hard
+     limitation of the protocol as currently defined.
+ 
+     No specific extension codes have been assigned yet.
+ 
+ 
  TBD
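
Illustration (not part of the check-in above): the byte counts claimed for extension codes can be checked against a current CPython, where the registry is exposed through add_extension() and remove_extension() in the copyreg module (copy_reg in Python 2). The sketch below is a minimal example under stated assumptions: the helper pickle_with_code is hypothetical, fractions.Fraction is an arbitrary stand-in for a frequently pickled class, 240 is taken from the private-use range in the table, and 256 is used only to show the three-byte form.

```python
import copyreg
import pickle
from fractions import Fraction


def pickle_with_code(obj, code):
    """Pickle the global `obj` with protocol 2 while `code` is registered for it.

    Hypothetical helper for illustration only; it registers the code,
    pickles, checks the round trip, and unregisters the code again.
    """
    module, name = obj.__module__, obj.__qualname__
    copyreg.add_extension(module, name, code)
    try:
        data = pickle.dumps(obj, protocol=2)
        # The reader needs the same code -> name mapping to unpickle.
        assert pickle.loads(data) is obj
    finally:
        copyreg.remove_extension(module, name, code)
    return data


# Without a registered code, the module and class names are spelled out.
plain = pickle.dumps(Fraction, protocol=2)

# Code 240 fits in one byte: EXT1 opcode (0x82) plus one byte.
short = pickle_with_code(Fraction, 240)

# Code 256 needs two bytes: EXT2 opcode (0x83) plus two little-endian bytes.
longer = pickle_with_code(Fraction, 256)

print(len(plain), len(short), len(longer))
```

Each pickle also carries the two-byte protocol header and the one-byte STOP opcode, so the class reference itself accounts for two bytes in `short` and three bytes in `longer`. Once a code is unregistered, a pickle that uses it can no longer be loaded, which is the loss of the "context-free" property described in the text.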