Usefulness of the "not in" operator

Steven D'Aprano steve+comp.lang.python at pearwood.info
Sun Oct 16 01:05:25 EDT 2011


On Sat, 15 Oct 2011 15:04:24 -0700, DevPlayer wrote:

> 1. I thought "x not in y" was later added as syntax sugar for
> "not x in y" meaning they used the same set of tokens. (Too lazy to
> check the actual tokens)

Whether the compiler has a special token for "not in" is irrelevant.
Perhaps it uses one token, or two, or none at all because a pre-processor
changes "x not in y" to "not x in y". That's an implementation detail.
What matters is whether it is valid syntax, and what it does when it
runs.

As it turns out, the Python compiler does not distinguish the two forms:

>>> from dis import dis
>>> dis(compile('x not in y', '', 'single'))
  1           0 LOAD_NAME                0 (x)
              3 LOAD_NAME                1 (y)
              6 COMPARE_OP               7 (not in)
              9 PRINT_EXPR          
             10 LOAD_CONST               0 (None)
             13 RETURN_VALUE        
>>> dis(compile('not x in y', '', 'single'))
  1           0 LOAD_NAME                0 (x)
              3 LOAD_NAME                1 (y)
              6 COMPARE_OP               7 (not in)
              9 PRINT_EXPR          
             10 LOAD_CONST               0 (None)
             13 RETURN_VALUE        
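
The session above comes from an older CPython; newer releases print
different opcode names and offsets, so the exact listing will not match.
As a rough sketch for repeating the comparison on whatever interpreter
you have, the snippet below disassembles both spellings and records what
each evaluates to. The sample values for x and y and the "results" dict
are assumptions made up for the illustration.

```python
import dis

# Hypothetical sample values, chosen only for illustration.
namespace = {"x": 3, "y": [1, 2, 3]}

results = {}
for source in ("x not in y", "not x in y"):
    code = compile(source, "<example>", "eval")
    # Evaluate in a copy so eval() does not modify the shared dict.
    results[source] = eval(code, dict(namespace))
    print(source, "->", results[source])
    dis.dis(code)  # output format depends on the Python version
```

Whatever the listing looks like, both spellings should evaluate to the
same value.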


Also, for what it is worth, "x not in y" goes back to at least Python 1.5,
and possibly even earlier. (I don't have any older versions available to
test.)



> 2. "x not in y" ==>> (True if y.__call__(x) else False) 

y.__call__ is irrelevant: membership tests never call y. But for what
it's worth:

(1) Instead of writing "y.__call__(x)", write "y(x)"

(2) Instead of writing "True if blah else False", write "bool(blah)".


> class Y(object):
>     def __contains__(self, x):
>         for item in self:
>             if x == item:
>                 return True
>         return False

You don't have to define a __contains__ method if you just want to test 
each item sequentially. All you need is to obey the sequence protocol and 
define a __getitem__ that works in the conventional way:


>>> class Test:
...     def __init__(self, args):
...             self._data = list(args)
...     def __getitem__(self, i):
...             return self._data[i]
... 
>>> t = Test("abcde")
>>> "c" in t
True
>>> "x" in t
False
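
To make the fallback visible, here is a small sketch of the same idea
with logging added. The class name "Logged" and the "requested" list are
made up for the illustration; the point is only to show which indexes
Python asks for when there is no __contains__.

```python
class Logged:
    """Like Test above, but records every index passed to __getitem__."""

    def __init__(self, args):
        self._data = list(args)
        self.requested = []

    def __getitem__(self, i):
        self.requested.append(i)
        return self._data[i]  # raises IndexError past the end


t = Logged("abcde")
print("c" in t, t.requested)   # True [0, 1, 2]

t.requested.clear()
print("x" in t, t.requested)   # False [0, 1, 2, 3, 4, 5]
```

On a hit the scan stops at the matching index; on a miss it runs until
__getitem__ raises IndexError.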

Defining a specialist __contains__ method is only necessary for
non-sequences, or if you have some faster way of testing whether an item
is in the object. If all you do is test each element one at a time, in
numeric order, don't bother writing __contains__.
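
As a hedged illustration of the two cases where __contains__ does earn
its keep, here is a minimal sketch. Both classes ("SortedBag" and
"Evens") are hypothetical examples invented for this purpose, not
anything from the standard library.

```python
from bisect import bisect_left


class SortedBag:
    """Hypothetical sequence that keeps its items sorted."""

    def __init__(self, items):
        self._data = sorted(items)

    def __getitem__(self, i):
        return self._data[i]

    def __contains__(self, x):
        # Binary search instead of the linear scan that the
        # __getitem__ fallback would perform.
        i = bisect_left(self._data, x)
        return i < len(self._data) and self._data[i] == x


class Evens:
    """Hypothetical non-sequence: membership is a rule, not a lookup."""

    def __contains__(self, x):
        return isinstance(x, int) and x % 2 == 0


bag = SortedBag([5, 1, 4, 2])
print(3 in bag, 4 in bag)                # False True
print(10 in Evens(), 7 not in Evens())   # True True
```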


> And if you wanted "x not in y" to be a different token you'd have to ADD

Tokens are irrelevant. "x not in y" is defined to be the same as "not x 
in y" no matter what. You can't define "not in" to do something 
completely different.
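
A quick way to check this for yourself, as a sketch: give a class both a
real __contains__ and a made-up __not_contained__, and record which one
gets called. The "calls" list is an assumption added purely for the
demonstration.

```python
class Y:
    def __init__(self, items):
        self._items = list(items)
        self.calls = []

    def __contains__(self, x):
        self.calls.append("__contains__")
        return x in self._items

    def __not_contained__(self, x):
        # Not a real special method: Python never looks for it.
        self.calls.append("__not_contained__")
        return True


y = Y("abc")
print("z" not in y)   # True
print(not "z" in y)   # True
print("a" not in y)   # False
print(y.calls)        # ['__contains__', '__contains__', '__contains__']
```

Both spellings go through __contains__ and negate the result.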


> class Y(object):
>     def __not_contained__(self, x):
>         for item in self:
>             if x == item:
>                 return False
>         return True
> 
> AND with __not_contained__() you'd always have to iterate the entire
> sequence to make sure even the last item doesn't match.
> 
> SO with one token "x not in y" you DON'T have to iterate through the
> entire sequence thus it is more efficient.

That's not correct.
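
One way to see it, offered as a sketch rather than a definitive
argument: write both loops as plain functions that also count how many
items they examine. The function names and the counter are assumptions
made up for the illustration.

```python
def contains(seq, x):
    """Linear scan; return (found, items_examined)."""
    examined = 0
    for item in seq:
        examined += 1
        if item == x:
            return True, examined
    return False, examined


def not_contained(seq, x):
    """The quoted __not_contained__ logic, as a plain function."""
    examined = 0
    for item in seq:
        examined += 1
        if item == x:
            return False, examined
    return True, examined


data = list("abcde")
print(contains(data, "b"), not_contained(data, "b"))   # (True, 2) (False, 2)
print(contains(data, "z"), not_contained(data, "z"))   # (False, 5) (True, 5)
```

Both loops stop at the first match, and both have to walk the whole
sequence when there is no match, so neither spelling saves any work over
the other.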




-- 
Steven


