[Python-ideas] recorarray: a mutable alternative to namedtuple

Eric Snow ericsnowcurrently at gmail.com
Sat Mar 28 19:11:34 CET 2015


On Sat, Mar 28, 2015 at 7:37 AM, Steven D'Aprano <steve at pearwood.info> wrote:
> Effectively, namedtuple is just a convenience function for wrapping up a
> bunch of nice-to-have but not essential functionality around an
> immutable struct. Python got by with unnamed tuples for over a decade,
> so it's not like we *have* to have namedtuples. But having got them,
> would we go back to using regular tuples as a struct? Hell no. Having
> named fields is so much better.

+1, though it doesn't *necessarily* follow that a mutable equivalent
is a good idea.  This (and related threads) imply there is at least
some support for the new type in principle.  I haven't followed the
threads too closely, so I've missed any mention of concrete pythonic
use cases that would give the idea firmer footing.  However, I
have seen references to prior art on the cheeseshop which may be used
to provide harder evidence (both of support and of solid use cases).

Regardless, I concur that there are many cases where types and
functions have been added to the stdlib that weren't strictly
necessary.  Perhaps if those proposals had come from someone else or
when the mood on python-dev was different then they would not have
been added.  That is what has happened with numerous other
we-have-gotten-by-without-it-so-why-add-it ideas (which might also
have proven themselves, as namedtuple has).

Ultimately we have to be careful in this space because, as Raymond
often reminds us, it really is important to make the effort to keep
Python small enough to fit in people's brains (and in *some* regard
we've probably failed there already).  With the great efforts in the
last year to improve packaging, the cheeseshop is increasingly the
better place for new types and helpers to live.

With that in mind, perhaps we should start adding a section to the
bottom of relevant docs that contains links to vetted PyPI packages
(and recipes) that provide extended capabilities.  We've done this
already in a few select places (e.g. the 3.2+ namedtuple docs).

> If this is so easy, why do we have namedtuple *and* SimpleNamespace
> in the standard library? Are they both mistakes?
>
> SimpleNamespace is especially interesting. The docs say:
>
> "However, for a structured record type use namedtuple() instead."
>
> https://docs.python.org/3/library/types.html#types.SimpleNamespace

As the person who wrote that I'll point out that I added it to help
make the distinction clearer between the two.  At the time there were
concerns about the similarities and with users getting confused about
which to use.  I will argue that "record type" implies an archive of
data, ergo immutable.  "Structured" refers to being ordered and having
attribute access.  IMHO that statement is clear and helpful, but if it
has proven otherwise we should consider improving it.
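
A minimal sketch of that distinction, using only the two stdlib types
(the Point name and its fields are illustrative, not from the docs):

```python
from collections import namedtuple
from types import SimpleNamespace

# namedtuple: ordered, fixed fields, immutable.
Point = namedtuple('Point', 'x y')
p = Point(1, 2)
x, y = p                      # ordered, so it unpacks
try:
    p.x = 10                  # fields cannot be rebound
    tuple_is_mutable = True
except AttributeError:
    tuple_is_mutable = False

# SimpleNamespace: mutable, open-ended, unordered.
ns = SimpleNamespace(x=1, y=2)
ns.x = 10                     # rebinding is fine
ns.z = 3                      # new attributes may be added at will
try:
    a, b = ns                 # not iterable, so no unpacking
    namespace_unpacks = True
except TypeError:
    namespace_unpacks = False
```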

In contrast, I see the proposal here as somewhat of a middle ground.
Folks are looking for a factory mechanism that produces slotted
classes that support both iteration (for unpacking) and index lookup.  So
something like this:

class FixedClassMeta(type):
    # Ideally this would be a "classonly" method (like classmethod, but
    # callable only on the class) on FixedClass, with no metaclass needed.
    def subclass(base, name, *fields):
        # XXX validate fields first
        args = ', '.join(fields)
        body = '\n    '.join('self.{0} = {0}'.format(f) for f in fields)
        code = """def __init__(self, {}):\n    {}""".format(args, body)
        ns = {}
        exec(code, ns)
        class X(base):
            __slots__ = fields
            __init__ = ns['__init__']
        X.__name__ = name
        X.__qualname__ = X.__qualname__.replace('X', name, 1)
        X.__doc__ = """..."""
        return X


class FixedClass(metaclass=FixedClassMeta):
    __slots__ = ()
    def __repr__(self):
        items = ("{}={!r}".format(f, getattr(self, f)) for f in self.__slots__)
        return "{}({})".format(self.__class__.__name__, ', '.join(items))
    def __iter__(self):  # for unpacking
        return (getattr(self, f) for f in self.__slots__)
    def __getitem__(self, index):
        field = self.__slots__[index]
        try:
            return getattr(self, field)
        except AttributeError:
            raise IndexError(index)
    # Index lookup exists for convenience, but assignment & deletion
    # are not in scope.


def fixedClass(name, field_names):
    """A factory that produces classes with fixed, ordered attributes.

    The returned class has __slots__ set to the field names, as well as
    __iter__ (for unpacking) and __getitem__ implemented.

    """
    if isinstance(field_names, str):
        fields = field_names.replace(',', ' ').split()
    else:
        fields = field_names
    return FixedClass.subclass(name, *fields)


That said, I'm still not clear on what the use cases are.

> Which brings us back to where this thread started: a request for a
> mutable version of namedtuple. That's trickier than namedtuple, because
> we don't have a mutable version of a tuple to inherit from. Lists won't
> do the job, because they have a whole lot of functionality that is
> inappropriate, e.g. sort, reverse, pop methods.
>
> That makes it harder to create a mutable structured record type, not
> simpler.
>
> Think about the functional requirements:
>
> - it should be semantically a struct, not a list or array;

This is the key point.  It is a fixed-size class with iteration for
unpacking and index lookup for convenience.  A full-fledged mutable
namedtuple doesn't make sense (without clear use cases).

>
> - with a fixed set of named fields;
>
> - fields should be ordered: a record with fields foo and bar is not the
> same as a record with fields bar and foo;

Ah, my example above would have to grow __eq__ then.
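
One way that __eq__ might look, as a rough sketch only.  SlotsEq,
FooBar and BarFoo are hypothetical stand-ins for the base class and
for two classes the factory would produce; comparing exact types is
an assumption about the desired semantics, chosen so that field order
matters:

```python
class SlotsEq:
    """Hypothetical base class: equality driven by __slots__."""
    __slots__ = ()

    def __eq__(self, other):
        # Different classes (hence different field names or order)
        # never compare equal.
        if type(other) is not type(self):
            return NotImplemented
        return all(getattr(self, f) == getattr(other, f)
                   for f in self.__slots__)

    # Mutable, so explicitly unhashable.
    __hash__ = None


class FooBar(SlotsEq):
    __slots__ = ('foo', 'bar')

    def __init__(self, foo, bar):
        self.foo = foo
        self.bar = bar


class BarFoo(SlotsEq):
    __slots__ = ('bar', 'foo')

    def __init__(self, bar, foo):
        self.bar = bar
        self.foo = foo
```

With this, FooBar(1, 2) == FooBar(1, 2) holds, while FooBar(1, 2) and
BarFoo(2, 1) are unequal even though both have foo=1 and bar=2.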

>
> - accessing fields by index would be a Nice To Have, but not essential;

Exactly.  Not Essential.

>
> - but iteration is essential, for sequence unpacking;

This brings to mind a different proposal that has come up in the past
(a separate "dunder" method for unpacking).  Iteration seems out of
place here, but we need it for sequence unpacking.

>
> - values in the fields must be mutable;
>
> - it should support equality, but not hashing (since it is mutable);
>
> - it must have a nice repr and/or str;
>
> - being mutable, it may directly or indirectly contain a reference to
> itself (e.g. x.field = x) so it needs to deal with that correctly;

Ah, yes.  "RuntimeError: maximum recursion depth exceeded". :)
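
The stdlib already has a tool for the repr side of this:
reprlib.recursive_repr().  A sketch, assuming a hand-written slotted
class (Node and its fields are made up for illustration) rather than
anything generated by a factory:

```python
from reprlib import recursive_repr


class Node:
    __slots__ = ('name', 'link')

    def __init__(self, name, link=None):
        self.name = name
        self.link = link

    @recursive_repr()  # substitutes '...' when repr re-enters this object
    def __repr__(self):
        items = ', '.join('{}={!r}'.format(f, getattr(self, f))
                          for f in self.__slots__)
        return '{}({})'.format(type(self).__name__, items)


x = Node('x')
x.link = x          # direct self-reference
text = repr(x)      # "Node(name='x', link=...)" instead of RuntimeError
```

This only covers repr; equality and pickling of self-referential
instances would need separate care.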

>
> - support for pickle;
>
> - like namedtuple, it may benefit from a handful of methods such as
> '_asdict', '_fields', '_make', '_replace' or similar.

Perhaps.  I think there are a few things we can learn from namedtuple
that can be applied for this hypothetical new type/factory.

And to add to your list:

- performance should be a consideration since the apparent use cases
relate to handling many of these as "records".

Again, I'm not sold on the benefit of this over the existing
alternatives.  For records use namedtuple (with the _replace method
for "mutation").

-eric

