Problem of function calls from map()

Tue Aug 22 09:46:02 EDT 2006

Dasn  <dasn at bluebottle.com> wrote:
># size of 'dict.txt' is about 3.6M, 154563 lines
>f = open('dict.txt', 'r')
>print "Reading lines..."
>lines = f.readlines()
>print "Done."
 [ ... ]
>def sp1(lines):
>	"""====> sp1() -- List-comprehension"""
>	return [s.split('\t') for s in lines]
 [ ... ]
>def sp4(lines):
>	"""====> sp4() -- Not correct, but very fast"""
>	return map(str.split, lines)
>
>for num in xrange(5):
>	fname = 'sp%(num)s' % locals()
>	print eval(fname).__doc__
>	profile.run(fname+'(lines)')

>====> sp1() -- List-comprehension
>         154567 function calls in 12.240 CPU seconds
 [ ... ]
>====> sp4() -- Not correct, but very fast
>         5 function calls in 3.090 CPU seconds
 [ ... ]
>The problem is the default behavior of str.split should be more complex
>than str.split('\t'). If we could use the str.split('\t') in map(), the
>result would be witty. What do u guys think?

I think there's something weird going on -- sp4 should be making
154563 calls to str.split, one per line, yet the profile reports
only 5 function calls. So no wonder it goes faster -- it's not
doing any work.
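
One way to test that suspicion, as a rough sketch: route the split
through a counting wrapper and see how often map() calls it. The
name counting_split and the sample lines are made up for
illustration; they are not from the original script or dict.txt.

```python
# Hypothetical sample data standing in for dict.txt.
lines = ["alpha\tfirst letter\n",
         "beta\tsecond letter\n",
         "gamma\tthird letter\n"]

calls = [0]  # mutable counter so the wrapper can update it


def counting_split(s):
    """Behave like str.split(s), but record every call."""
    calls[0] += 1
    return s.split()


# list() forces evaluation, so this also runs where map() is lazy.
result = list(map(counting_split, lines))

print("lines: %d, calls: %d, results: %d"
      % (len(lines), calls[0], len(result)))
print(result[0])
```

If the call count matches the line count here, then the 5 calls in
the profile output say more about what the profiler counts than
about what map() actually did.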

How does [s.split() for s in lines] compare to sp1's
[s.split('\t') for s in lines] ?
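
A sketch of how that comparison could be timed, with a map()
variant that does pass '\t' thrown in. Assumptions: the sample
lines are invented, the function names are hypothetical, and
operator.methodcaller needs Python 2.6 or later. timeit is used
here instead of profile so that per-call profiler overhead does
not skew the numbers.

```python
import timeit
from operator import methodcaller  # assumes Python 2.6 or later

# Hypothetical sample data standing in for dict.txt.
lines = ["word%d\tsome definition text\n" % i for i in range(1000)]

# Callable equivalent of s.split('\t'), usable directly in map().
split_tab = methodcaller('split', '\t')


def comp_default():
    """List comprehension, whitespace split."""
    return [s.split() for s in lines]


def comp_tab():
    """List comprehension, tab split (what sp1 does)."""
    return [s.split('\t') for s in lines]


def map_tab():
    """map() with a tab split; list() forces evaluation."""
    return list(map(split_tab, lines))


for func in (comp_default, comp_tab, map_tab):
    seconds = timeit.timeit(func, number=100)
    print("%-12s %.4f s" % (func.__name__, seconds))
```

Note that comp_default and comp_tab do not return the same thing:
the whitespace split also breaks on the spaces inside a field and
drops the trailing newline, which is why sp4 is "not correct".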

-- 
\S -- siona at chiark.greenend.org.uk -- http://www.chaos.org.uk/~sion/
  ___  |  "Frankly I have no feelings towards penguins one way or the other"
  \X/  |    -- Arthur C. Clarke
   her nu becomeþ se bera eadward ofdun hlæddre heafdes bæce bump bump bump