[Python-Dev] Suggestions for Improvements to namedtuple

Isaac Morland ijmorlan at cs.uwaterloo.ca
Wed Nov 14 19:30:04 CET 2007


I was working on something very similar to namedtuple for a project of my own, 
when it occurred to me that it's a generally useful idea and maybe somebody else 
was working on it too.  So I searched, and found Raymond Hettinger's addition 
to collections.py, and it does pretty much what I want.  I have a few 
suggestions which I hope are improvements.  I will discuss each one, and then 
at the bottom I will put my version of namedtuple.  It is not based on the one 
in Python SVN because it was mostly written before I found out about that one. 
If my suggestions meet with approval, I could check out a copy of 
collections.py and make a patch for further comment and eventual submission to 
somebody with checkin privileges.

1. I think it is important that there be a way to create individual namedtuple 
instances from an existing sequence that doesn't involve splitting the sequence 
out into individual parameters and then re-assembling a tuple to pass to the 
base tuple constructor.  In my application, I'm taking database rows and 
creating named tuples from them, with the named tuple type being automatically 
created as appropriate.  There will be *lots* of named tuple instances 
created, so for efficiency I would prefer to avoid using * to break up the 
sequences generated directly by the database interface.  I would like to pass 
those sequences directly to the base tuple constructor.

To restore to my code the feature of being able to use individual parameters as 
in collections.py, I added a classmethod to the generated classes called 
__fromvalues__.  This uses Signature, my other idea (next message) to convert a 
call matching a procedure signature of (fieldname1, ...) into a dictionary, and 
passes that dictionary into another classmethod __fromdict__ which creates a 
named tuple instance from the dictionary contents.

The problem I see with this is that having to say

Point.__fromvalues__ (11, y=22)

instead of

Point (11, y=22)

is a little verbose.  Perhaps there could be a __fromsequence__ instead for 
the no-unpacking method of instance creation, as I think the most common use 
of direct-from-sequence creation is in more general code.
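
To illustrate the two construction paths, here is a minimal sketch.  It is not 
the code from collections.py; Row and from_args are names made up for this 
example.

```python
class Row(tuple):
    """Hypothetical stand-in for a generated named tuple class."""
    __slots__ = ()

    @classmethod
    def from_args(cls, *values):
        # Individual parameters: the caller unpacks, we re-pack into a tuple.
        return cls(values)


db_row = [11, 22]            # e.g. a row as returned by a database driver

a = Row(db_row)              # sequence handed straight to tuple's constructor
b = Row.from_args(*db_row)   # sequence split with * and then re-assembled

assert a == b == (11, 22)
```

Both calls give equal results; the first simply skips the unpack/re-pack step.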

2.  It would be nice to be able to have default values for named tuple fields. 
Using Signature it's easy to do this - I just specify a dictionary of defaults 
at named tuple class creation time.
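
Signature itself is in the next message, so the following is only a sketch of 
the effect of defaults, done with a plain dictionary merge.  The helper 
fill_defaults is hypothetical and is not part of either implementation.

```python
def fill_defaults(fields, defaults, given):
    """Hypothetical helper: merge caller-supplied values over the defaults."""
    values = dict(defaults or {})
    values.update(given)
    missing = [name for name in fields if name not in values]
    if missing:
        raise TypeError("missing values for: " + ", ".join(missing))
    # Return the values in field order, ready for the tuple constructor.
    return tuple(values[name] for name in fields)


fields = ("realname", "email")
defaults = {"realname": None}

row = fill_defaults(fields, defaults, {"email": "user@example.com"})
assert row == (None, "user@example.com")
```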

3.  In my opinion __replace__ should be able to replace multiple fields. My 
version takes either two parameters, same as collections.py, or a single 
dictionary containing replacements.
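
The dictionary form amounts to something like the following sketch, shown on a 
plain tuple.  The function replace_fields and the indices mapping are made up 
for this example.

```python
def replace_fields(values, indices, changes):
    """Return a new tuple with every field named in changes replaced."""
    items = list(values)
    for name, new_value in changes.items():
        items[indices[name]] = new_value
    return tuple(items)


indices = {"x": 0, "y": 1, "z": 2}   # field name -> position
p = (1, 2, 3)

q = replace_fields(p, indices, {"x": 10, "z": 30})
assert q == (10, 2, 30)
assert p == (1, 2, 3)                # the original tuple is untouched
```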

4.  I put as much of the implementation as possible of the named tuple classes 
into a base class which I've called BaseNamedTuple.  This defines the 
classmethods __fromvalues__ and __fromdict__, as well as the regular methods 
__repr__, __asdict__, and __replace__.

5.  It didn't occur to me to use exec ... in, so I just create the new type 
using the type() function.  To me, exec is a last resort, but I'm a Python 
newbie so I'd be interested to hear what people have to say about this.
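
For comparison, the type() approach boils down to something like this sketch. 
It is a simplified illustration, not my full version; make_class is a name made 
up for this example.

```python
from operator import itemgetter


def make_class(name, fields):
    """Build a tuple subclass with one read-only property per field."""
    namespace = {"__slots__": ()}
    for index, field in enumerate(fields):
        # itemgetter(index) reads the matching tuple slot from the instance.
        namespace[field] = property(itemgetter(index))
    # type(name, bases, namespace) creates the class without any exec.
    return type(name, (tuple,), namespace)


Point = make_class("Point", ("x", "y"))
p = Point((11, 22))

assert (p.x, p.y) == (11, 22)
assert Point.__name__ == "Point"
```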

6.  Not an improvement but a concern about my code: the generated classes and 
instances have all the crucial stuff like __fields__ and __signature__ fully 
read-write.  It feels like those should be read-only properties.  I think that 
would require namedtuple to be a metaclass instead of just a function (in order 
for the properties of the generated classes to be read-only).  On the other 
hand, I'm a recovering Java programmer, so maybe it's un-Pythonic to want stuff 
to be read-only.  Here I would especially appreciate any guidance more 
experienced hands can offer.
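
A sketch of what I mean by the metaclass approach follows.  This is an 
assumption about how it could be done, not tested against my real code; 
FieldsMeta and _fields_store are hypothetical names.

```python
class FieldsMeta(type):
    """Hypothetical metaclass exposing __fields__ as a read-only property."""

    @property
    def __fields__(cls):
        return cls._fields_store


def make_class(name, fields):
    # Calling the metaclass directly works the same way as calling type().
    return FieldsMeta(name, (tuple,), {"_fields_store": tuple(fields)})


Point = make_class("Point", ("x", "y"))
assert Point.__fields__ == ("x", "y")

try:
    Point.__fields__ = ()        # no setter on the property, so this fails
except AttributeError:
    pass
else:
    raise AssertionError("assignment should have been rejected")
```

Note that the property is visible on the class only, not on its instances, and 
that the underlying _fields_store attribute is still writable, so this only 
guards against accidental assignment.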

And now, here is the code, together with a rudimentary example of how this 
could be used to improve the "addr" functions in email.utils:

#!/usr/bin/env python

from operator import itemgetter

class BaseNamedTuple (tuple):
     @classmethod
     def __fromvalues__ (cls, *args, **keys):
         return cls.__fromdict__ (cls.__signature__.expand_args (*args, **keys))

     @classmethod
     def __fromdict__ (cls, d):
         return cls ([d[name] for name in cls.__fields__])

     def __repr__ (self):
         return self.__reprtemplate__ % self

     def __asdict__ (self):
         return dict (zip (self.__fields__, self))

     def __replace__ (self, *args):
         slist = list (self)
         if len (args) == 1:
             sdict = args[0]
         elif len (args) == 2:
             sdict = {args[0]: args[1]}
         else:
             raise TypeError ("__replace__ takes a dict or a (field, value) pair")

         for key in sdict:
             slist[self.__indices__[key]] = sdict[key]
         return self.__class__ (slist)

def namedtuple (name, fields, defaults=None):
     fields = tuple (fields)
     result = type (name, (BaseNamedTuple,), {})
     # One read-only property per field, reading the matching tuple slot.
     for i in range (len (fields)):
         setattr (result, fields[i], property (itemgetter (i)))
     result.__fields__ = fields
     # Signature is the helper class described in the next message.
     result.__signature__ = Signature (fields, defaults=defaults)
     result.__reprtemplate__ = "%s(%s)" % (name,
      ", ".join ("%s=%%r" % field for field in fields))
     result.__indices__ = dict ((field, i) for i, field in enumerate (fields))
     return result

from email.utils import formataddr

class test (namedtuple ("test", ("realname", "email"), {'realname': None})):
     @property
     def valid (self):
         return self.email.find ("@") >= 0

     __str__ = formataddr

if __name__ == "__main__":
     e1 = test (("Smith, John", "jsmith@example.com"))
     print "e1.realname =", e1.realname
     print "e1.email =", e1.email
     print "e1 =", repr (e1)
     print "str(e1) =", str (e1)

     e2 = test.__fromvalues__ (email="test@example.com")
     print "e2 =", repr (e2)
     print "str(e2) =", str (e2)

Isaac Morland			CSCF Web Guru
DC 2554C, x36650		WWW Software Specialist

