Differences creating tuples and collections.namedtuples

Fri Feb 22 05:38:03 EST 2013

On Tue, 19 Feb 2013 22:38:32 -0500, Terry Reedy wrote:

> On 2/18/2013 7:18 PM, Steven D'Aprano wrote:
>> Terry Reedy wrote:
>>
>>> On 2/18/2013 6:47 AM, John Reid wrote:
>>>
>>>> I was hoping namedtuples could be used as replacements for tuples
>>>>  in all instances.
>>>
>>> This is a mistake in the following two senses. First, tuple is a class
>>> with instances while namedtuple is a class factory that produces
>>> classes. (One could think of namedtuple as a metaclass, but it was not
>>> implemented that way.)
>
>> I think you have misunderstood.
>
> Wrong, which should be evident to anyone who reads the entire paragraph
> as the complete thought exposition it was meant to be. Beside which,
> this negative ad hominem comment is irrelevant to the rest of your post
> about the Liskov Substitution Principle.

Terry, I'm sorry that I have stood on your toes here; no offense was
intended. It seemed to me, based on the entire paragraph that you wrote,
that you may have misunderstood the OP's question. The difference in 
signatures between the namedtuple class factory and tuple is irrelevant, 
as I can now see you understand, but by raising it in the first place 
you gave me the impression that you may have misunderstood what the OP 
was attempting to do.

> The rest of the paragraph, in two more pieces:
>
>>> Second, a tuple instance can have any length and different instances
>>> can have different lengths. On the other hand, all instances of a
>>> particular namedtuple class have a fixed length.
>
> In other words, neither the namedtuple object nor any namedtuple class
> object can fully substitute for the tuple class object. Nor can
> instances of any namedtuple class fully substitute for instances of the
> tuple class. Therefore, I claim, the hope that "namedtuples could be
> used as replacements for tuples in all instances" is a futile hope,
> however one interprets that hope.

I did discuss the fixed length issue directly, and agreed with you that 
if your contract is to construct variable-length tuples, then a 
fixed-length namedtuple is not substitutable.
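
To make the fixed-length point concrete, here is a minimal sketch. The
class name Triple is made up for illustration and is not from the OP's
code. It records which argument counts each constructor accepts:

```python
from collections import namedtuple

# Hypothetical three-field record type, for illustration only.
Triple = namedtuple("Triple", ["a", "b", "c"])

# The builtin tuple happily builds instances of any length.
lengths = [len(tuple(range(n))) for n in range(5)]
print(lengths)  # [0, 1, 2, 3, 4]

# A namedtuple class accepts exactly as many values as it has fields.
accepted = []
for n in range(5):
    try:
        Triple(*range(n))
        accepted.append(n)
    except TypeError:
        # Too few or too many positional arguments.
        pass
print(accepted)  # [3]
```

So code whose contract is "build me a tuple of whatever length" cannot be
handed Triple instead of tuple.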

But in practice, one common use-case for tuples (whether named or not) 
is for fixed-length records, and in that use-case, a namedtuple of 
length N should be substitutable for a tuple of length N.
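
As a rough sketch of that use-case (Point and manhattan are names I have
invented for the example), a function that relies only on the promises of
a length-2 record cannot tell the two apart:

```python
from collections import namedtuple

# Hypothetical record type, used only for illustration.
Point = namedtuple("Point", ["x", "y"])

def manhattan(pair):
    # Relies only on unpacking a length-2 sequence, a promise that
    # plain tuples and namedtuples of length 2 both keep.
    x, y = pair
    return abs(x) + abs(y)

plain = (3, -4)
named = Point(3, -4)

print(manhattan(plain))   # 7
print(manhattan(named))   # 7
print(plain == named)     # True: equality compares the items
print(named[0], len(named))  # 3 2
```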

>  >> This affects their initialization.
>
> Part of the effect is independent of initialization. Even if namedtuples
> were initialized by iterator, there would still be glitches. In
> particular, even if John's named tuple class B *could* be initialized as
> B((1,2,3)), it still could not be substituted for t in the code below.
>
>  >>> t = (1,2,3)
>  >>> type(t) is type(t[1:])
> True

Agreed. There are other differences as well, e.g. repr(t) will differ 
between builtin tuples and namedtuples. The only type which is identical 
in every conceivable aspect to tuple is tuple itself. Any subclass or 
subtype[1] must by definition differ from tuple in at least one aspect, the result of:

type(some_tuple) is type(())

and in practice will differ in other aspects as well.

  Footnote: [1] Subclass meaning it inherits from tuple; subtype in the
  sense that it duck-types as a tuple, but may or may not share any
  implementation.
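
The differences mentioned so far can be checked directly. This is only a
sketch; B here is my stand-in for the OP's namedtuple class, with field
names I have made up:

```python
from collections import namedtuple

# Stand-in for the OP's class; the field names are assumptions.
B = namedtuple("B", ["a", "b", "c"])

t = (1, 2, 3)
b = B(1, 2, 3)

# Slicing a plain tuple gives back the same type...
print(type(t) is type(t[1:]))   # True
# ...but slicing a namedtuple instance gives a plain tuple.
print(type(b) is type(b[1:]))   # False
print(type(b[1:]) is tuple)     # True

# The reprs differ too.
print(repr(t))   # (1, 2, 3)
print(repr(b))   # B(a=1, b=2, c=3)

# The one check every subclass must fail, and the one it still passes.
print(type(b) is type(()))      # False
print(isinstance(b, tuple))     # True
```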

LSP cannot be interpreted in isolation. Any non-trivial modification of
a class will change *something* about the class; after all, that's why we
subclassed it in the first place. Either the interface will be
different, or the semantics will be different, or both. LSP must always 
be interpreted in the intersection between the promises made by the 
class and the promises your application cares about.

Some promises are more important than others, hence some violations are 
more serious than others. For instance, I think that tuple indexing is a 
critical promise: a "tuple" that cannot be indexed is surely not a tuple. 
The exact form of the repr() of a tuple is generally not important at 
all: a tuple that prints as MyBunchOStuff(...) is still a tuple. In my 
experience, the constructor signature is of moderate importance. But of
course that depends on what promises you rely on: if you are relying on
the tuple constructor, then it is critical *to you*.
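
For example, here is a sketch of code that does rely on the constructor
promise. The helper doubled is hypothetical and B again stands in for the
OP's class. It rebuilds its argument with type(t)(iterable), which works
for tuple but not for a namedtuple class; the documented _make
classmethod is the namedtuple way to build from an iterable:

```python
from collections import namedtuple

# Stand-in for the OP's class; the field names are assumptions.
B = namedtuple("B", ["a", "b", "c"])

def doubled(t):
    # Relies on the tuple constructor signature: type(t)(iterable).
    return type(t)(x * 2 for x in t)

print(doubled((1, 2, 3)))   # (2, 4, 6)

try:
    # B expects three positional arguments, not a single iterable.
    doubled(B(1, 2, 3))
except TypeError as exc:
    print("TypeError:", exc)

# namedtuple classes provide _make for construction from an iterable.
print(B._make(x * 2 for x in B(1, 2, 3)))   # B(a=2, b=4, c=6)
```

Whether that counts as a serious LSP violation depends, as above, on
whether your code looks like doubled() or not.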

> The problem I see with the LSP for modeling either abstract or concrete
> entities is that we in fact do define subclasses by subtraction or
> limitation, as well as by augmentation, while the LSP only allows the
> latter.

People do all sorts of things. They write code that is O(N**2) or worse, 
they call eval() on untrusted data, they use isinstance() and break 
duck-typing, etc. That they break LSP does not necessarily mean that 
they should. LSP is one of the five fundamental best practices for
object-oriented code, "SOLID":

http://en.wikipedia.org/wiki/SOLID_%28object-oriented_design%29

Breaking any of the SOLID principles is a code-smell. That does not mean
that there is never a good reason to do so, but SOLID is a set of
principles which have stood the test of time and practice. Any code that
breaks one of those principles should be considered smelly, or worse,
until justified.

(And for the avoidance of doubt, I am more than satisfied with the 
justification given for the difference in signature between tuples and 
namedtuples.)

-- 
Steven