Flatten... or How to determine sequenceability?

Mon May 28 00:46:31 EDT 2001

[Alex Martelli]
> OK, "is-a sequence" is a typical Korzybskian "is of identity", thus
> I guess it's quite reasonable to have it be a fuzzy concept.
>
> But a need for something sharper does arise.

I agree, but haven't seen anything convincingly useful toward that end.
Call that "something sharper" S.  Is S more akin to universal truth, or to
something merely expedient depending on context?  In my experience, it's the
latter.

> "Behaves like a sequence" (or "can adapt to sequence protocol", or
> "respects sequence interface").  A dictionary "behaves" like a sequence
> in a very partial way: if I do a "for k in dict" and just process
> k I've lost substantial information (all of the values) and have
> warmly and lovingly been fed non-information (an arbitrary
> ordering of the keys).

Except we can't know that *just* from knowing "it's a dict".  For example,
dicts are often used to implement sets, typically using 1 or None as the
only value.  In that context all the info is in the keys, and hiding the
values is a helpful weeding out of irrelevant implementation detail.  We can
agree to say that a dict does, or does not, implement the sequence
interface, but for some apps some of the time either choice will get in the
way.
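
To make the two readings concrete, here is a minimal sketch, written for current Python 3 rather than the 2.2 under discussion. The names `seen` and `prices` are made up for illustration. The same "for k in dict" loses nothing in the first case and loses every value in the second.

```python
# A dict used as a set: every value is a dummy, so the keys carry all the info.
seen = {}
for word in ["spam", "eggs", "spam", "ham"]:
    seen[word] = 1  # 1 is only a placeholder value

# Iterating the dict yields keys only, which is all a set user wants.
# sorted() is used so the result does not depend on key order.
unique_words = sorted(seen)

# A dict used as a mapping: iterating over it silently drops the values.
prices = {"spam": 3, "eggs": 2}
keys_only = sorted(prices)
```

Nothing about the two dict objects themselves says which reading applies; that knowledge lives in the app.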

Overall I expect it's more useful more often to say that a dict does not
implement the sequence interface.  But in 2.2 it does implement the
iteration interface, and people will have to get used to the idea that
sequences are a proper subset of iterable objects.
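
The subset relation can be probed roughly as follows. This is a sketch only, in current Python 3, and both helper names are hypothetical. The indexing probe is deliberately naive: a dict that happens to have the key 0 fools it, which rather supports the point that no single test is sharp.

```python
def supports_iteration(obj):
    """True if iter(obj) works, i.e. obj can be used in a for loop."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


def supports_integer_indexing(obj):
    """Naive probe: try obj[0] and see what happens."""
    try:
        obj[0]
    except IndexError:
        # An empty list or tuple still supports integer indexing.
        return True
    except (TypeError, KeyError):
        # TypeError: not subscriptable at all; KeyError: a mapping without key 0.
        return False
    return True


d = {"a": 1}
print(supports_iteration(d), supports_integer_indexing(d))
print(supports_iteration([1, 2]), supports_integer_indexing([1, 2]))
```

A dict like `d` passes the first test and fails the second, while a list passes both.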

> I believe the normal concept of sequence does include ordering
> as being significant, and no information loss -- list, tuple,
> array.array, UserList -- no doubt files and (special cases that
> it would be nice to have another sharp test for:-) strings and
> unicode-strings, too.

Files are alone there in not supporting integer indexing (__getitem__(i)),
and that's "the essence" of a sequence to many apps.  So are files "really"
sequences?  Strings are alone in that s[i] returns an object of the same
type as s, and that's been the cause of many an unbounded recursion in
"generic sequence" routines.  So I fear that no matter how we define "it's a
sequence", it won't be particularly useful.  Perhaps we should have a large
set of predefined tag classes (IsIterable, IsOrdered, IsIntegerIndexable,
IsMutable, IsFixedLength, IsHashable, IsLexicographicallyOrdered, IsAString
<wink -- but it reads better than IsElementTypeSameAsContainerType!>, ...)
and let people derive from them however best suits their apps.  Not
particularly useful today, but Guido is working on changes that should allow
efficient subclassing of the builtin types.
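
The string trap is easy to show with the flatten routine this thread is named after. The following is a sketch in current Python 3, not anybody's official answer: `flatten` is a hypothetical helper that treats anything iterable as a container, with one explicit special case for strings. Without that special case, a one-character string iterates to itself and the recursion never bottoms out.

```python
def flatten(obj):
    """Return a flat list of the leaves of an arbitrarily nested iterable."""
    # Special case: iterating a str yields more str objects, so without this
    # guard flatten("a") would call flatten("a") until the recursion limit.
    # bytes is treated as an atom too, purely by choice.
    if isinstance(obj, (str, bytes)):
        return [obj]
    try:
        items = iter(obj)
    except TypeError:
        # Not iterable, so it is a leaf.
        return [obj]
    result = []
    for item in items:
        result.extend(flatten(item))
    return result


print(flatten([1, [2, (3, "ab")], "c"]))
```

Note that this version happily flattens a dict into its keys, which is one more context-dependent choice, not a universal truth.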

> With distinctions within the overall category of sequences, sure
> (mutable vs not, with predefined length vs without).  But it seems
> to me that if either or both of the interfaces and protocol-adaptation
> PEPs passed, 'such-and-such-a-sequence's would be useful
> interfaces/protocols to have.

I do wonder.  If the goal is to document assumptions (which is a good
thing), people could write comments today.  If the goal is to automate
error-checking, there's more value but it's likely to slow calls
significantly -- and then the audience in practice is slashed.  If the goal
is to speed programs by hoping Python will exploit type information,
somebody's smoking grass again <wink>.  Maybe the greatest value would be to
give us *a* documented answer to "what's a sequence?", which, perfect or
not, would be something everyone can rely on.

otoh-you-can-just-try-it-and-see-whether-it-breaks-ly y'rs  - tim