help make it faster please

Bengt Richter bokr at oz.net
Sat Nov 12 06:10:06 EST 2005


On Sat, 12 Nov 2005 10:46:53 +0100, Sybren Stuvel <sybrenUSE at YOURthirdtower.com.imagination> wrote:

>Bengt Richter enlightened us with:
>> I suspect it's not possible to get '' in the list from
>> somestring.split()
>
>Time to adjust your suspicions:
>
>>>> ';abc;'.split(';')
>['', 'abc', '']
I know about that one ;-)
I meant somestring.split() just like that -- without a splitter argument.
My suspicion remains ;-)

>
>
>>>                    countDict[w] += 1
>>>                else:
>>>                    countDict[w] = 1
>> does that beat the try and get versions? I.e., (untested)
>>                  try: countDict[w] += 1
>>                  except KeyError: countDict[w] = 1
>
>OPs way is faster. A 'try' block is very fast, but trowing & handling
>an exception is slow.
Yes, so it's data sensitive. Same with the other one not mentioned:
                    countDict.setdefault(w, [0])[0] += 1  # accum count in length one list vs as int
                                                          # too much subscripting work though
>
>>                  countDict[w] = countDict.get(w, 0) + 1
>
>I think this has the potential of being faster than OPs method,
>because it's likely to be fully implemented in C.
>
Not sure what you mean. Don't forget, all those expressions are evaluated dynamically
step by step with byte codes being switched on in c, but still a bunch of steps. I.e.,
 >>> import dis
 >>> dis.dis(compile('countDict[w] = countDict.get(w, 0) + 1', '', 'exec'))
   1           0 LOAD_NAME                0 (countDict)
               3 LOAD_ATTR                1 (get)
               6 LOAD_NAME                2 (w)
               9 LOAD_CONST               0 (0)
              12 CALL_FUNCTION            2
              15 LOAD_CONST               1 (1)
              18 BINARY_ADD
              19 LOAD_NAME                0 (countDict)
              22 LOAD_NAME                2 (w)
              25 STORE_SUBSCR
              26 LOAD_CONST               2 (None)
              29 RETURN_VALUE

Looks like if you succeed most of the time without a KeyError, the try/except might be
a tad faster, depending on what the exception block setup/teardown costs.

 >>> dis.dis(compile("""\
 ... try: countDict[w] += 1
 ... except KeyError: countDict[w] = 1
 ... """, '', 'exec'))
   1           0 SETUP_EXCEPT            20 (to 23)
               3 LOAD_NAME                0 (countDict)
               6 LOAD_NAME                1 (w)
               9 DUP_TOPX                 2
              12 BINARY_SUBSCR
              13 LOAD_CONST               0 (1)
              16 INPLACE_ADD
              17 ROT_THREE
              18 STORE_SUBSCR
              19 POP_BLOCK
              20 JUMP_FORWARD            29 (to 52)

   2     >>   23 DUP_TOP
              24 LOAD_NAME                2 (KeyError)
              27 COMPARE_OP              10 (exception match)
              30 JUMP_IF_FALSE           17 (to 50)
              33 POP_TOP
              34 POP_TOP
              35 POP_TOP
              36 POP_TOP
              37 LOAD_CONST               0 (1)
              40 LOAD_NAME                0 (countDict)
              43 LOAD_NAME                1 (w)
              46 STORE_SUBSCR
              47 JUMP_FORWARD             2 (to 52)
         >>   50 POP_TOP
              51 END_FINALLY
         >>   52 LOAD_CONST               1 (None)
              55 RETURN_VALUE

Regards,
Bengt Richter



More information about the Python-list mailing list