iterating over strings seems to be really slow?

Wed Aug 27 17:21:57 EDT 2014

On 2014-08-27 16:53, Rodrick Brown wrote:
> I'm confused why the former function runs significantly faster when
> wc1() builds the hash on a single pass and doesn't waste memory of
> returning an array of strings?
> 
> I would think wc2() to be slower what's going on here?
> 
> 
> #!/usr/bin/env python
> 
> s = "The black cat jump over the bigger black cat"
> 
> 
> def wc1():
>     word=""
>     m={}
>     for c in s:
>         if c != " ":
>             word += c

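Building a string one character at a time is slow: every character
goes through a Python-level loop iteration and a string concatenation,
whereas s.split() does the same scanning inside the interpreter's own
implementation.  Also, wc1() doesn't produce the same results as
wc2() does.  Check: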

  if wc1() == wc2():
    print("Success")
  else:
    print("doh!")
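
As a rough illustration of how a character-at-a-time loop can drift away
from split(), here is a minimal sketch.  The names count_by_char() and
count_by_split() are made up for this example, and count_by_char() is
only an assumption about one common way such a loop goes wrong (it
records a word only when it sees a space); it is not necessarily what
the trimmed wc1() above does.

```python
s = "The black cat jump over the bigger black cat"


def count_by_char(text):
    """Hypothetical character-at-a-time counter (not the original wc1)."""
    counts = {}
    word = ""
    for c in text:
        if c != " ":
            word += c
        else:
            # A word is only recorded when a space terminates it.
            counts[word] = counts.get(word, 0) + 1
            word = ""
    # Nothing flushes `word` here, so a final word with no trailing
    # space is never counted.
    return counts


def count_by_split(text):
    """Count words using str.split(), as wc2() does."""
    counts = {}
    for w in text.split():
        counts[w] = counts.get(w, 0) + 1
    return counts


by_char = count_by_char(s)
by_split = count_by_split(s)

# Words whose counts differ, with how many occurrences went missing.
missing = {
    w: n - by_char.get(w, 0)
    for w, n in by_split.items()
    if by_char.get(w, 0) != n
}
print(by_char == by_split, missing)
```

In this sketch the two dicts differ only in the count for the last word
of s, which is the kind of mismatch the comparison above would flag.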

> def wc2():
>     m={}
>     for c in s.split():
>         if m.has_key(c):
>             m[c] += 1
>         else:
>             m[c] = 1
>     return(m)

The thing that surprises me is that both collections.Counter() and
collections.defaultdict(int) are also slower than your wc2():

from collections import defaultdict, Counter

def wc3():
    return Counter(s.split())

def wc4():
    m = defaultdict(int)
    for c in s.split():
        m[c] += 1
    return m
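
For anyone who wants to reproduce the comparison, a timing harness along
these lines should do.  It is only a sketch: wc2() is rewritten with
`in` because has_key() no longer exists in Python 3, the repeat and
number values are arbitrary choices, and the ranking you get may well
depend on the Python version and machine, so no figures are quoted here.

```python
import timeit
from collections import Counter, defaultdict

s = "The black cat jump over the bigger black cat"


def wc2():
    m = {}
    for c in s.split():
        if c in m:          # `in` stands in for has_key() on Python 3
            m[c] += 1
        else:
            m[c] = 1
    return m


def wc3():
    return Counter(s.split())


def wc4():
    m = defaultdict(int)
    for c in s.split():
        m[c] += 1
    return m


# Make sure all three agree before comparing their speed.
assert dict(wc2()) == dict(wc3()) == dict(wc4())

timings = {}
for fn in (wc2, wc3, wc4):
    # Best of 3 runs of 10000 calls each; the minimum is the least noisy.
    timings[fn.__name__] = min(timeit.repeat(fn, repeat=3, number=10000))

for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print("%s: %.4f s per 10000 calls" % (name, seconds))
```

With an input this short, the fixed cost of constructing a Counter or
defaultdict is likely a large share of the total, so it may be worth
rerunning with a much longer string before drawing conclusions.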

-tkc