newbie Q: sequence membership

Sat Nov 17 04:35:41 EST 2007

On Nov 17, 6:02 pm, saccade <tri... at gmail.com> wrote:
> >>> a, b = [], []
> >>> a.append(b)
> >>> b.append(a)
> >>> b in a
> True
> >>> a in a
>
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> RuntimeError: maximum recursion depth exceeded in cmp
>
> >>> a is a[0]
> False
> >>> a == a[0]
>
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> RuntimeError: maximum recursion depth exceeded in cmp
>
> ----------
>
> I'm a little new to this language so my mental model on whats going on
> may need to be refined.

And that can be done by reading the fine manual, specifically
    http://docs.python.org/ref/comparisons.html

This chapter contains two rules that are relevant to your questions:

R1: """ For the list and tuple types, x in y is true if and only if
there exists an index i such that x == y[i] is true. """ This might be
slightly clearer if read as "In the case of y being a tuple or list
(and x of course being any expression), x in y is true ...."

So that immediately tells you that it uses "x == y[i]", not "x is
y[i]".
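
A small sketch of R1 in action (the variable names are mine, purely
for illustration): the float 1.0 is not the same object as the int 1,
yet it is found in the list because the two compare equal.

```python
# Membership on lists and tuples is decided by ==, not by object identity.
x = 1.0
y = [1, 2, 3]

same_object = any(x is item for item in y)   # identity check on each element
equal_value = any(x == item for item in y)   # the test that R1 describes
found = x in y                               # agrees with equal_value

print(same_object, equal_value, found)
```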

R2: """ Tuples and lists are compared lexicographically using
comparison of corresponding elements. This means that to compare
equal, each element must compare equal and the two sequences must be
of the same type and have the same length. """
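
To make R2 concrete, here is a short sketch (names are illustrative
only) of equal and unequal sequences: same type and same elements
compare equal, while a different type or a different length does not.

```python
# Element-by-element comparison, per R2.
same = [1, 2] == [1, 2]            # same type, same length, equal elements
diff_type = [1, 2] == (1, 2)       # list vs tuple: never equal
diff_len = [1, 2] == [1, 2, 3]     # different lengths: not equal
nested = [[1], [2]] == [[1], [2]]  # nested lists are compared recursively

print(same, diff_type, diff_len, nested)
```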

>
> I expect "a in a" to evaluate to "False". Since it does not it may be
> that while checking equality it uses "==" and not "is".

Yes, it uses "=="; see R1 above.

> If that is the
> reason then the question becomes why doesn't "a == a[0]" evaluate to
> "False"?

R1 says that you must evaluate a == a[0], but both are lists, so R2
says you must evaluate a[0] == a[0][0], but both of those are lists,
so you must evaluate a[0][0] == a[0][0][0] and so on ad infinitum.
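
The runaway comparison can be reproduced with a sketch like the one
below. It assumes CPython; the exception is caught as RuntimeError,
which also covers its subclass RecursionError in newer versions. The
names result and error_name are mine.

```python
# Build the two mutually containing lists from the original post.
a, b = [], []
a.append(b)
b.append(a)

result = None
error_name = None
try:
    # a == a[0] compares a[0] with a[0][0], then a[0][0] with a[0][0][0], ...
    result = (a == a[0])
except RuntimeError as exc:  # RecursionError is a subclass in Python 3
    error_name = type(exc).__name__

print(result, error_name)
```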

So, you might ask, why do "a in b" and "b in a" both return True
(correctly)? That's because "a in b" needs to test "a == b[0]", and
b[0] is the very same object as a. The membership test is smart enough
to make the cheap test "x is y[i]" first; a true value here means that
a less cheap (and possibly infinite) evaluation of "x == y[i]" can be
avoided.
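
The identity shortcut can be seen without any recursion at all. A
sketch, assuming CPython: a NaN is not equal to itself, yet it is
found in a list that holds that very object.

```python
# The cyclic lists from the original post.
a, b = [], []
a.append(b)
b.append(a)

a_in_b = a in b   # b[0] is a, so the identity test succeeds at once
b_in_a = b in a   # a[0] is b, likewise

# NaN shows the shortcut in isolation.
nan = float("nan")
nan_equal = (nan == nan)   # NaN never compares equal, even to itself
nan_found = nan in [nan]   # found anyway, because the element is nan itself

print(a_in_b, b_in_a, nan_equal, nan_found)
```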

> As a side, and if that is the reason, is there a version of
> "in" that uses "is"?

No. You could write your own, but it just doesn't appear practically
useful. For example,

x = string_extracted_from_file_or_db
y = ['1A', '9Z']

It's highly likely that when x == '1A', "x is_in y" is False --
because x and y[0] are different objects. Try explaining that to the
novices.

Worse: Consider z = ['A1', 'Z9']. It can easily happen that when x ==
'A1', "x is_in z" is True -- because an unguaranteed implementation-
dependent caper caches or "interns" some values, so that x and z[0]
are the same object. Try explaining that to the novices!!

Do you have a use case for that?
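
If you do want one, a minimal sketch of an identity-based membership
test could look like this. The function name is_in is hypothetical,
not anything in the standard library.

```python
def is_in(x, seq):
    """Return True if some element of seq is the very same object as x."""
    return any(x is item for item in seq)

# The cyclic lists from the original post.
a, b = [], []
a.append(b)
b.append(a)

a_is_in_a = is_in(a, a)   # a[0] is b, not a; no == is ever evaluated
b_is_in_a = is_in(b, a)   # a[0] is b

# Equal but distinct objects are not found by identity.
target = []
others = [[], target]
found_same = is_in(target, others)
found_equal_only = is_in([], others)

print(a_is_in_a, b_is_in_a, found_same, found_equal_only)
```

Because only "is" is used, the recursive comparison never starts, so
is_in(a, a) returns False rather than raising.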

> "a is in a" does not work.

Correct, it's not valid syntax.

HTH,
John