[Tutor] Problem with if statements and else statements
Peter Otten
__peter__ at web.de
Tue May 30 06:06:47 EDT 2017
Cameron Simpson wrote:
> On 29May2017 01:17, Alan Gauld <alan.gauld at yahoo.co.uk> wrote:
>>On 29/05/17 00:12, Alex Kleider wrote:
>>> Would
>>>>>> if Month in {'January', '1'}:
>>>
>>> be even better? (regarding efficiency perhaps? Trivial point, I know,
>>> but just wondering.)
>>
>>If in doubt try it out and profile/time it.
>>
>>But I don't think it will make much difference since ultimately
>>it still has to test each value (although a hashing algorithm
>>may be involved that works on partial matches...) But if in
>>doubt...
>
> Hi Alex,
>
> As written it should be a bit slower: to construct a set each member get
> tested for presence. The cost is in making the set, not in searching it.
No, CPython is a bit smarter than that:
>>> dis.dis('if m in {"1", "January"}: pass')
1 0 LOAD_NAME 0 (m)
2 LOAD_CONST 3 (frozenset({'1', 'January'}))
4 COMPARE_OP 6 (in)
6 POP_JUMP_IF_FALSE 8
>> 8 LOAD_CONST 2 (None)
10 RETURN_VALUE
However, there seems to be limit:
>>> def check(n):
... f = io.StringIO()
... dis.dis("if m in {%s}: pass" % ",".join(map(str, range(n))), file=f)
... return f.getvalue().count("LOAD_CONST")
...
>>> check(123)
2
>>> check(124)
125
This is python 3.6; for 3.5 and 3.4 I found a maximum length of 80 using the
same method. Further tests indicate that you cannot have two sets of length
80.
> _However_, supposing your program were doing this a lot. You might well
> have a global (or, better, long lived shared object) containing a set that
> has already been constructed. Then:
>
> if Month in the_set:
>
> is very fast; constant time.
Overall, using globals may still be a good idea. Even if there's often no
direct performance reward (looking up a global is even a tad slower)
you gain predictability.
> Whereas as you would expect, checking a list
> is linear with the size of the list.
>
> So, using a list:
>
> seen = []
> for item in some_item_generator():
> if item in seen:
> continue
> seen.append(item)
> ... do stuff with item, which is new ...
>
> The cost of the "if" goes up linearly as you add more items.
>
> Using a set:
>
> seen = {}
> for item in some_item_generator():
> if item in seen:
> continue
> seen.add(item)
> ... do stuff with item, which is new ...
>
> The second version will be much more effiient as the "seen" set grows; the
> lookup time on the set is essentially O(1) (constant time).
>
> But for an ad hoc 2 element list as in your original example the
> difference will be pretty small; making the 2 element set _should_ be
> slightly more expensive, and isn't the common idiom (==> less readable).
> Personally I use:
>
> if value in ('a', 'b', 'c'):
>
> BTW, in Python we tend to use named like "Fred" for classes (or
> factories), and "fred" for regular variables. And "FRED" for things that
> would be constants in other languages. Eg:
>
> MAX_THINGS = 16
>
> class Foo:
> ....
>
> def FooBah(x):
> return Foo(x, style="bah")
>
> for fred in ....:
>
> Cheers,
> Cameron Simpson <cs at zip.com.au>
> _______________________________________________
> Tutor maillist - Tutor at python.org
> To unsubscribe or change subscription options:
> https://mail.python.org/mailman/listinfo/tutor
More information about the Tutor
mailing list