tuples, index method, Python's design

Thu Apr 12 10:10:12 EDT 2007

On 2007-04-12, Steve Holden <steve at holdenweb.com> wrote:
> Antoon Pardon wrote:
>> On 2007-04-11, Terry Reedy <tjreedy at udel.edu> wrote:
>>> "Björn Lindqvist" <bjourne at gmail.com> wrote in message 
>>> news:740c3aec0704100824m132c45fbi5c4c3ec0c0fa3a67 at mail.gmail.com...
>>> On 4/10/07, Steve Holden <steve at holdenweb.com> wrote:
>>>> One might perversely allow extension to lists and tuples to allow
>>>>
>>>>    [3, 4] in [1, 2, 3, 4, 5, 6]
>>>>
>>>> to succeed, but that's forcing the use case beyond normal limits.
>>> I'd love to have that! There are at least one million use cases for
>>> finding a sequence in a sequence and implementing it yourself is
>>> non-trivial. Plus then both list and tuple's index methods would work
>>> *exactly* like string's. It would be easier to document and more
>>> useful. A big win.
>>>
>>> =======================
>>> It would be ambiguous: [3,4] in [[1,2], [3,4], [5,6]] is True now.
>>>
>>> Strings are special in that s[i] can only be a (sub)string of length 1.
>>> 'b' in 'abc' is True.  This makes looking for longer substrings easy.
>>>
>>> However, [2] in [1,2,3] is False.  IE, list[i] is not normally a list.  So 
>>> looking for sublists is different from looking for items.
>> 
>> Well I think this illustrates nicely what can happen if you design by
>> use cases.
>> 
>> Let us assume for a moment that finding out if one list is a sublist of
>> a second list gets considered something useful enough to be included
>> in Python. Now the in operator can't be used for this because it
>> would create ambiguities. So it would become either a new operator
>> or a new method. But whatever the solution it would be different
>> from the string solution.
>> 
> That's because strings are different from other sequences. See below.
>
>> Now if someone had thought about how "st1 in st2" would
>> generalize to other sequences when st1 contains more than one
>> character, they probably would have found the possible inconsistency
>> that could create, and thought about using some way other than
>> the in-operator for this with strings: a way that wouldn't create
>> ambiguities when extending it to other sequences was considered.
>> 
> The fact is that strings are the only sequences composed of subsequences 
> of length 1 - in other words the only sequences where type(s) == 
> type(s[0:1]) is an invariant condition.

Yes, this allows you to do some things for strings in a way that would
be impossible or ambiguous should you want to do the same things for
other kinds of sequences.
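
To make the ambiguity concrete, here is a small sketch using only the
built-in str and list types. The variable names are invented for this
illustration. Note that it compares the type of an item, s[0], with the
type of the sequence, since that is where strings and lists part ways.

```python
# For strings, an item is itself a string, so one operator covers both tests.
s = "abc"
same_type_str = type(s[0]) is type(s)      # True: s[0] is a str of length 1
item_in_str = "b" in s                     # True
substring_in_str = "bc" in s               # True, same operator

# For lists, an item is normally not a list, so the two tests differ.
nums = [1, 2, 3, 4, 5, 6]
same_type_list = type(nums[0]) is type(nums)         # False: nums[0] is an int
item_reading = [3, 4] in nums                        # False: no element equals [3, 4]
nested_reading = [3, 4] in [[1, 2], [3, 4], [5, 6]]  # True: an element equals [3, 4]
```

If "in" also meant "is a sublist of", item_reading would have to be True,
and nested_reading could be read either way.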

The question is: should you?  

If you want to provide new functionality for strings and you have
the choice between doing it

  1) in a way that will make it easy to extend this functionality
     in a consistent way to other sequences

  2) in a way that will make it impossible to extend this functionality
     in a consistent way to other sequences

then I think that unless you have very good arguments you should pick (1).

Because if you pick (2), then even if this functionality is never extended
in the language itself, you make it more difficult for programmers to add
it themselves, in a consistent way, to a subclass of list.

People are always defending duck typing in this newsgroup, and now Python
has chosen the option that makes duck typing more difficult.
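
As a rough sketch of what option (1) could look like, suppose the
subsequence test lives in a method rather than in the in-operator. The
class SeqList and the helper has_run below are hypothetical names made up
for this example. The method is called find and returns -1 on failure
only to mirror the existing str.find. This is one possible design, not
something Python provides for lists.

```python
class SeqList(list):
    """Hypothetical list subclass; not part of Python or any library."""

    def find(self, sub):
        """Return the first index where sub occurs as a contiguous run, else -1.

        The name and the -1 convention mirror str.find; that is a choice
        made for this sketch.
        """
        sub = list(sub)
        n = len(sub)
        # Naive scan: compare each window of length n with sub.
        for i in range(len(self) - n + 1):
            if self[i:i + n] == sub:
                return i
        return -1


def has_run(seq, sub):
    """Duck-typed helper: works on anything with a str.find-like method."""
    return seq.find(sub) != -1


flat = SeqList([1, 2, 3, 4, 5, 6])
nested = SeqList([[1, 2], [3, 4], [5, 6]])

print(has_run("abcdef", "cd"))    # True
print(has_run(flat, [3, 4]))      # True: 3, 4 occur as a run
print([3, 4] in flat)             # False: "in" still tests items
print([3, 4] in nested)           # True: an item equals [3, 4]
print(has_run(nested, [3, 4]))    # False: no run of items 3, 4
```

With the test spelled as a method, the same has_run works on strings and
on the subclass, and "in" keeps a single meaning for every sequence.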

> This was discussed (at my instigation, IIRC) on python-dev when Python 
> (2.4?) adopted the enhanced semantics for "in" on strings - formerly 
> only tests for single characters were allowed - but wasn't thought 
> significant enough to deny what was felt to be a "natural" usage for 
> strings only.

Which I consider a pity.

-- 
Antoon Pardon