[Python-ideas] namedtuple literals [Was: RE a new namedtuple]

Steven D'Aprano steve at pearwood.info
Tue Jul 25 21:05:12 EDT 2017


On Tue, Jul 25, 2017 at 08:30:14PM +0100, MRAB wrote:

> Given:
> 
> >>> nt = ntuple(x=1, y=2)
> 
> you have nt[0] == 1 because that's the order of the args.
> 
> But what about:
> 
> >>> nt2 = ntuple(y=2, x=1)
> 
> ? Does that mean that nt2[0] == 2? Presumably, yes.

It better be.

> Does nt == nt2?
> 
> If it's False, then you've lost some of the advantage of using names 
> instead of positions.

Not at all. It's a *tuple*, so the fields have a definite order. If you 
don't want a tuple, why are you using a tuple? Use SimpleNamespace for an 
unordered "bag of attributes":

py> from types import SimpleNamespace
py> x = SimpleNamespace(spam=4, eggs=3)
py> y = SimpleNamespace(eggs=3, spam=4)
py> x == y
True

> It's a little like saying that functions can be called with keyword 
> arguments, but the order of those arguments still matters!

That's the wrong analogy and it won't work. But people will expect that 
it will, and be surprised when it doesn't!


The real problem here is that we're combining two distinct steps into 
one. The *first* step should be to define the order of the fields in the 
record (a tuple): [x, y] is not the same as [y, x]. Once the field order 
is defined, then you can *instantiate* those fields either positionally, 
or by name in any order.
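
To illustrate the two steps with the existing collections.namedtuple 
(the name Point is only an example, not part of anyone's proposal):

```python
from collections import namedtuple

# Step 1: define the field order once, explicitly.
Point = namedtuple("Point", ["x", "y"])

# Step 2: instantiate positionally, or by keyword in any order.
a = Point(1, 2)
b = Point(y=2, x=1)

assert a == b
assert a[0] == b[0] == 1   # index 0 is always x, however it was built
```

Because the order lives in the class definition, the order of the 
keyword arguments at the call site carries no meaning.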

But by getting rid of that first step, we no longer have the option to 
specify the order of the fields. We can only infer them from the order 
they are given when you instantiate the fields.

Technically, Nick's scheme to implicitly cache the type could work 
around this at the cost of making it impossible to have two types with 
the same field names in different orders. Given:

ntuple(y=1, x=2)

ntuple could look up the *unordered set* {y, x} in the cache, and if 
found, use that type. If not found, define a new type with the fields in 
the stated order [y, x].

So now you can, or at least you will *think* that you can, safely write 
this:

spam = ntuple(x=2, y=1, z=0)  # defines the field order [x, y, z]
eggs = ntuple(z=0, y=1, x=2)  # instantiate using kwargs in any order
assert spam == eggs


But this has a hidden landmine. If *any* module happens to use ntuple 
with the same field names as you, but in a different order, you will 
have mysterious bugs:

x, y, z = spam

You expect x=2, y=1, z=0 because that's the way you defined the field 
order, but unknown to you some other module got in first and defined it 
as [z, y, x] and so your code will silently do the wrong thing.

Even if the cache is per-module, the same problem will apply. If the 
spam and eggs assignments above are in different functions, the field 
order will depend on which function happens to be called first, which 
may not be easily predictable.
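
A rough sketch of the failure mode, assuming the cache is keyed on the 
unordered set of field names. The ntuple function and _cache dict below 
are my own stand-ins for illustration, not Nick's actual design, and 
they rely on keyword-argument order being preserved (Python 3.6+):

```python
from collections import namedtuple

_cache = {}  # hypothetical cache: frozenset of field names -> tuple type

def ntuple(**kwargs):
    """Illustrative stand-in for the proposed ntuple, with implicit caching."""
    key = frozenset(kwargs)
    cls = _cache.get(key)
    if cls is None:
        # Whoever calls first fixes the field order for everybody else.
        cls = _cache[key] = namedtuple("ntuple", list(kwargs))
    return cls(**kwargs)

# Some other code happens to run first and fixes the order as [z, y, x].
other = ntuple(z=0, y=1, x=2)

spam = ntuple(x=2, y=1, z=0)   # meant to define the order [x, y, z]
eggs = ntuple(z=0, y=1, x=2)
assert spam == eggs            # equality holds, as hoped

x, y, z = spam                 # but unpacking follows the cached [z, y, x]
print(x, y, z)                 # 0 1 2, not the 2 1 0 you expected
```

Delete the "other" line and the same unpacking gives 2 1 0, which is 
exactly the order-of-first-call dependence described above.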

I don't see any way that this proposal can be anything but a subtle 
source of bugs. We have two *incompatible* requirements:

- we want to define the order of the fields according to the 
  order we give keyword arguments;

- we want to give keyword arguments in any order without 
  caring about the field order.

We can't have both, and we can't give up either without creating a 
surprising source of annoyance and bugs.

As far as I am concerned, this kills the proposal for me. If you care 
about field order, then use namedtuple and explicitly define a class 
with the field order you want. If you don't care about field order, use 
SimpleNamespace.



-- 
Steve

