namespaces module (a.k.a. bunch, struct, generic object, etc.) PEP

Steven Bethard steven.bethard at gmail.com
Thu Feb 10 13:56:45 EST 2005


In the "empty classes as c structs?" thread, we've been talking in some 
detail about my proposed "generic objects" PEP.  Based on a number of 
suggestions, I'm thinking more and more that instead of a single 
collections type, I should be proposing a new "namespaces" module.  
Some of my reasons:

(1) Namespace is feeling less and less like a collection to me.  Even 
though it's still intended as a data-only structure, the use cases seem 
pretty distinct from other collections.

(2) I think there are a couple different types that should be provided, 
not just one.  For example:

* The suggested namespace view (which allows, for example, globals() to 
be manipulated with the getattr protocol) should probably be its own 
class, NamespaceView, since the behavior of a NamespaceView is 
substantially different from that of a Namespace.  (Modifying a 
NamespaceView modifies the dict, while modifying a Namespace doesn't.) 
This also allows NamespaceView to have a separate __repr__ to indicate 
some of the differences.

* The suggested namespace chain, if included, should also probably be 
its own class, NamespaceChain.  As defined, it doesn't depend on 
Namespace at all -- it could be used just as well for other 
non-Namespace objects...


I've updated the PEP to reflect these changes.  Comments and suggestions 
greatly appreciated!  (Note that I've included the current 
implementation of the module at the end of the PEP.)

----------------------------------------------------------------------
PEP: XXX
Title: Attribute-Value Mapping Data Type
Version:
Last-Modified:
Author: Steven Bethard <steven.bethard at gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 10-Feb-2005
Python-Version: 2.5
Post-History: 10-Feb-2005


Abstract
========

This PEP proposes a standard library addition to support the simple
creation of attribute-value mapping objects which can be given named
attributes without the need to declare a class. Such attribute-value
mappings are intended to complement the name-value mappings provided
by Python's builtin dict objects.


Motivation
==========

Python's dict objects provide a simple way of creating anonymous
name-value mappings. These mappings use the __getitem__ protocol to
access the value associated with a name, so that code generally
appears like::

     mapping['name']

Occasionally, a programmer may decide that dotted-attribute style
access is more appropriate to the domain than __getitem__ style
access, and that their mapping should be accessed like::

     mapping.name

Currently, if a Python programmer makes this design decision, they
are forced to declare a new class, and then build instances of this
class. When no methods are to be associated with the attribute-value
mappings, declaring a new class can be overkill.  This PEP proposes
adding a new module to the standard library to provide a few simple
types that can be used to build such attribute-value mappings.

Providing such types allows the Python programmer to determine which
type of mapping is most appropriate to their domain and apply this
choice with minimal effort.  Some of the suggested uses include:


Returning Named Results
-----------------------

It is often appropriate for a function that returns multiple items to
give names to the different items returned.  The types suggested in
this PEP provide a simple means of doing this that allows the
returned values to be accessed using the usual attribute syntax::

     >>> def f(x):
     ...     return Namespace(double=2*x, squared=x**2)
     ...
     >>> y = f(10)
     >>> y.double
     20
     >>> y.squared
     100


Representing Hierarchical Data
------------------------------

The types suggested in this PEP also provide a simple means of
representing hierarchical data with attribute-style access::

     >>> x = Namespace(spam=Namespace(rabbit=1, badger=[2]), ham='3')
     >>> x.spam.badger
     [2]
     >>> x.ham
     '3'


Manipulating Dicts through Attributes
-------------------------------------

Sometimes it is desirable to access the items of an existing dict
object using dotted-attribute style access instead of __getitem__
style access. The types suggested in this PEP provide a simple means
of doing so::

     >>> d = {'name':'Pervis', 'lumberjack':True}
     >>> ns = NamespaceView(d)
     >>> ns
     NamespaceView(lumberjack=True, name='Pervis')
     >>> ns.rugged = False
     >>> del ns.lumberjack
     >>> d
     {'rugged': False, 'name': 'Pervis'}


Rationale
=========

As Namespace objects are intended primarily to replace simple,
data-only classes, simple Namespace construction was a primary
concern.  As such, the Namespace constructor supports creation from
keyword arguments, dicts, and sequences of (attribute, value) pairs::

     >>> Namespace(eggs=1, spam=2, ham=3)
     Namespace(eggs=1, ham=3, spam=2)
     >>> Namespace({'eggs':1, 'spam':2, 'ham':3})
     Namespace(eggs=1, ham=3, spam=2)
     >>> Namespace([('eggs',1), ('spam',2), ('ham',3)])
     Namespace(eggs=1, ham=3, spam=2)

To allow attribute-value mappings to be easily combined, the
Namespace type provides an update staticmethod that supports similar
arguments::

     >>> ns = Namespace(eggs=1)
     >>> Namespace.update(ns, [('spam', 2)], ham=3)
     >>> ns
     Namespace(eggs=1, ham=3, spam=2)

Note that update should be used through the class, not through the
instances, to avoid the confusion that might arise if an 'update'
attribute added to a Namespace instance hid the update method.
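
As a minimal sketch of the confusion described above, the following
uses a cut-down stand-in for Namespace (not the full reference
implementation, and written to run on current Python 3 rather than
2.5, which is an assumption of this example).  It shows an 'update'
attribute hiding the method on the instance while the call through
the class still works:

```python
class Namespace(object):
    """Cut-down stand-in for the proposed type (illustration only)."""

    def __init__(*args, **kwargs):
        # Same inheritance-friendly dispatch as the reference code.
        type(args[0]).update(*args, **kwargs)

    def update(*args, **kwargs):
        self = args[0]
        if len(args) == 2:
            other = args[1]
            if isinstance(other, Namespace):
                other = other.__dict__
            self.__dict__.update(other)
        self.__dict__.update(kwargs)


ns = Namespace(eggs=1, update='not a method')

# Instance attributes are found before plain functions defined on the
# class, so ns.update is now the string, not the method.
shadowed = ns.update

# Going through the class still reaches the real update function.
Namespace.update(ns, spam=2)
```

Calling ns.update(spam=2) here would fail, because a str is not
callable; Namespace.update(ns, spam=2) is unaffected.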


Namespace objects also support object equality, comparing Namespace
objects by attributes recursively::

     >>> x = Namespace(parrot=Namespace(lumberjack=True, spam=42),
     ...               peng='shrub')
     >>> y = Namespace(peng='shrub',
     ...               parrot=Namespace(spam=42, lumberjack=True))
     >>> z = Namespace(parrot=Namespace(lumberjack=True), peng='shrub')
     >>> x == y
     True
     >>> x == z
     False


NamespaceView objects are intended to allow manipulating dict objects
through the getattr protocol instead of the getitem protocol.  For
example usage, see "Manipulating Dicts through Attributes" above.
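
The difference between a copy and a view can be sketched as follows.
The two classes below are simplified stand-ins (the Namespace
constructor here accepts only a mapping or sequence of pairs plus
keyword arguments, an assumption of this example) and are written to
run on current Python 3:

```python
class Namespace(object):
    """Simplified stand-in: copies its input into a fresh __dict__."""

    def __init__(self, mapping=(), **kwargs):
        self.__dict__.update(mapping)
        self.__dict__.update(kwargs)


class NamespaceView(Namespace):
    """Simplified stand-in: shares the dict it is given."""

    def __init__(self, d):
        self.__dict__ = d


d = {'name': 'Pervis'}
copied = Namespace(d)
view = NamespaceView(d)

copied.rugged = False    # stored on the copy only; d is untouched
view.lumberjack = True   # written through to d
d['spam'] = 42           # visible through the view, not the copy
```

After these statements d holds 'name', 'lumberjack' and 'spam', while
'rugged' exists only on the copied Namespace.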


Note that support for the various mapping methods, e.g.
__(get|set|del)item__, __len__, __iter__, __contains__, items, keys,
values, etc. was intentionally omitted as these methods did not seem
to be necessary for the core uses of an attribute-value mapping.  If
such methods are truly necessary for a given use case, this may
suggest that a dict object is a more appropriate type for that use.
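
For the occasional case where dict-style access is wanted, one option
(an illustration only, not part of this proposal) is the builtin
vars(), which returns the instance __dict__.  The sketch below assumes
a keyword-only stand-in for Namespace and current Python 3:

```python
class Namespace(object):
    """Keyword-only stand-in for the proposed type (illustration only)."""

    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)


ns = Namespace(eggs=1, spam=2)

# The mapping protocol is not implemented, so len() is rejected.
try:
    len(ns)
except TypeError:
    has_len = False
else:
    has_len = True

# vars() exposes the underlying dict when mapping operations are needed.
names = sorted(vars(ns))
total = sum(vars(ns).values())
```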


Examples
========

Converting an XML DOM tree into a tree of nested Namespace objects::

     >>> import xml.dom.minidom
     >>> def gettree(element):
     ...     result = Namespace()
     ...     if element.attributes:
     ...         Namespace.update(result, element.attributes.items())
     ...     children = {}
     ...     for child in element.childNodes:
     ...         if child.nodeType == xml.dom.minidom.Node.TEXT_NODE:
     ...             children.setdefault('text', []).append(
     ...                 child.nodeValue)
     ...         else:
     ...             children.setdefault(child.nodeName, []).append(
     ...                 gettree(child))
     ...     Namespace.update(result, children)
     ...     return result
     ...
     >>> doc = xml.dom.minidom.parseString("""\
     ... <xml>
     ...   <a attr_a="1">
     ...     a text 1
     ...     <b attr_b="2" />
     ...     <b attr_b="3"> b text </b>
     ...     a text 2
     ...   </a>
     ...   <c attr_c="4"> c text </c>
     ... </xml>""")
     >>> tree = gettree(doc.documentElement)
     >>> tree.a[0].b[1]
     Namespace(attr_b=u'3', text=[u' b text '])

Reference Implementation
========================

::

     class Namespace(object):
         """Namespace([namespace|dict|seq], **kwargs) -> Namespace object

         The new Namespace object's attributes are initialized from (if
         provided) either another Namespace object's attributes, a
         dictionary, or a sequence of (name, value) pairs, then from the
         name=value pairs in the keyword argument list.
         """

         def __init__(*args, **kwargs):
             """Initializes a Namespace instance."""
             # inheritance-friendly update call
             type(args[0]).update(*args, **kwargs)

         def __eq__(self, other):
             """x.__eq__(y) <==> x == y

             Two Namespace objects are considered equal if they have the
             same attributes and the same values for each of those
             attributes.
             """
             return (other.__class__ == self.__class__ and
                     self.__dict__ == other.__dict__)

         def __repr__(self):
             """x.__repr__() <==> repr(x)

             If all attribute values in this namespace (and any nested
             namespaces) are reproducible with eval(repr(x)), then the
             Namespace object is also reproducible for eval(repr(x)).
             """
             return '%s(%s)' % (
                 type(self).__name__,
                 ', '.join('%s=%r' % (k, v) for k, v in sorted(
                     self.__dict__.iteritems())))

         def update(*args, **kwargs):
             """Namespace.update(ns, [ns|dict|seq,] **kwargs) -> None

             Updates the first Namespace object's attributes from (if
             provided) either another Namespace object's attributes, a
             dictionary, or a sequence of (name, value) pairs, then from
             the name=value pairs in the keyword argument list.
             """
             if not 1 <= len(args) <= 2:
                 raise TypeError('expected 1 or 2 arguments, got %i' %
                                 len(args))
             self = args[0]
             if not isinstance(self, Namespace):
                 raise TypeError('first argument to update should be '
                                 'Namespace, not %s' %
                                 type(self).__name__)
             if len(args) == 2:
                 other = args[1]
                 if isinstance(other, Namespace):
                     other = other.__dict__
                 try:
                     self.__dict__.update(other)
                 except (TypeError, ValueError):
                     raise TypeError('cannot update Namespace with %s' %
                                     type(other).__name__)
             self.__dict__.update(kwargs)


     class NamespaceView(Namespace):
         """NamespaceView(dict) -> new Namespace view of the dict

         Creates a Namespace that is a view of the original dictionary,
         that is, changes to the Namespace object will be reflected in
         the dictionary, and vice versa.
         """
         def __init__(self, d):
             self.__dict__ = d


Open Issues
===========

What should the types be named?  Some suggestions include 'Bunch',
'Record', 'Struct' and 'Namespace'.

Where should the types be placed?  The current suggestion is a new
"namespaces" module.

Should namespace chaining be supported?  One suggestion would add a
NamespaceChain object to the module::

     class NamespaceChain(object):
         """NamespaceChain(*objects) -> new attribute lookup chain

         The new NamespaceChain object's attributes are defined by the
         attributes of the provided objects.  When an attribute is
         requested, the sequence is searched sequentially for an object
         with such an attribute.  The first such attribute found is
         returned, or an AttributeError is raised if none is found.

         Note that this chaining is only provided for getattr and delattr
         operations -- setattr operations must be applied explicitly to
         the appropriate objects.

         The list of objects is stored in a NamespaceChain object's
         __namespaces__ attribute.
         """
         def __init__(self, *objects):
             """Initialize the NamespaceChain object"""
             self.__namespaces__ = objects

         def __getattr__(self, name):
             """Return the first such attribute found in the object list
             """
             for obj in self.__namespaces__:
                 try:
                     return getattr(obj, name)
                 except AttributeError:
                     pass
             raise AttributeError(name)

         def __delattr__(self, name):
             """Delete the first such attribute found in the object list
             """
             for obj in self.__namespaces__:
                 try:
                     return delattr(obj, name)
                 except AttributeError:
                     pass
             raise AttributeError(name)
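
A short usage sketch of the chain follows.  The NamespaceChain class
is a condensed copy of the one above; the Settings class and the
attribute names are hypothetical and exist only for this
demonstration.  The example is written to run on current Python 3:

```python
class NamespaceChain(object):
    """Condensed copy of the chain class sketched above."""

    def __init__(self, *objects):
        self.__namespaces__ = objects

    def __getattr__(self, name):
        for obj in self.__namespaces__:
            try:
                return getattr(obj, name)
            except AttributeError:
                pass
        raise AttributeError(name)

    def __delattr__(self, name):
        for obj in self.__namespaces__:
            try:
                return delattr(obj, name)
            except AttributeError:
                pass
        raise AttributeError(name)


class Settings(object):
    """Hypothetical plain data holder used only for this example."""

    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)


user = Settings(color='red')
defaults = Settings(color='blue', size=10)
chain = NamespaceChain(user, defaults)

first_color = chain.color    # found on user, the first object
fallback_size = chain.size   # user has no size, so defaults is used
del chain.color              # removes the attribute from user only
second_color = chain.color   # lookup now falls through to defaults
```

Note that assigning chain.color = 'green' would set the attribute on
the chain object itself, not on user or defaults, which is why the
docstring asks for setattr to be applied to the underlying objects
explicitly.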


References
==========





..
    Local Variables:
    mode: indented-text
    indent-tabs-mode: nil
    sentence-end-double-space: t
    fill-column: 70
    End:


