Method much slower than function?
Neil Cerutti
horpner at yahoo.com
Wed Jun 13 21:28:27 EDT 2007
On 2007-06-14, idoerg at gmail.com <idoerg at gmail.com> wrote:
> Hi all,
>
> I am running Python 2.5 on Feisty Ubuntu. I came across some code that
> is substantially slower when in a method than in a function.
>
> ################# START SOURCE #############
> # The function
>
> def readgenome(filehandle):
>     s = ''
>     for line in filehandle.xreadlines():
>         if '>' in line:
>             continue
>         s += line.strip()
>     return s
>
> # The method in a class
> class bar:
>     def readgenome(self, filehandle):
>         self.s = ''
>         for line in filehandle.xreadlines():
>             if '>' in line:
>                 continue
>             self.s += line.strip()
>
> ################# END SOURCE ##############
> When running the function and the method on a 20,000 line text file, I
> get the following:
>
> >>> cProfile.run("bar.readgenome(open('cb_foo'))")
>          20004 function calls in 10.214 CPU seconds
>
>    Ordered by: standard name
>
>    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
>         1    0.000    0.000   10.214   10.214 <string>:1(<module>)
>         1   10.205   10.205   10.214   10.214 reader.py:11(readgenome)
>         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
>     19999    0.009    0.000    0.009    0.000 {method 'strip' of 'str' objects}
>         1    0.000    0.000    0.000    0.000 {method 'xreadlines' of 'file' objects}
>         1    0.000    0.000    0.000    0.000 {open}
>
>
> >>> cProfile.run("z=r.readgenome(open('cb_foo'))")
>          20004 function calls in 0.041 CPU seconds
>
>    Ordered by: standard name
>
>    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
>         1    0.000    0.000    0.041    0.041 <string>:1(<module>)
>         1    0.035    0.035    0.041    0.041 reader.py:2(readgenome)
>         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
>     19999    0.007    0.000    0.007    0.000 {method 'strip' of 'str' objects}
>         1    0.000    0.000    0.000    0.000 {method 'xreadlines' of 'file' objects}
>         1    0.000    0.000    0.000    0.000 {open}
>
>
> The method takes > 10 seconds, the function call 0.041 seconds!
>
> Yes, I know that I wrote the underlying code rather
> inefficiently, and I can streamline it with a single
> file.read() call instead of an xreadlines() + strip loop.
> Still, the differences in performance are rather staggering!
> Any comments?
It is likely the repeated attribute lookup, self.s, that's
slowing it down in comparison to the non-method version.
Try the following simple optimization, using a local variable
instead of an attribute to build up the result.
# The method in a class
class bar:
    def readgenome(self, filehandle):
        s = ''
        for line in filehandle.xreadlines():
            if '>' in line:
                continue
            s += line.strip()
        self.s = s
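
To check whether the local-variable version actually helps on your machine, a small timing harness along these lines might be useful. This is only a sketch under some assumptions: it is written for Python 3 (so it iterates the file object directly rather than calling xreadlines()), it uses synthetic in-memory data instead of your cb_foo file, and the names make_fasta, AttrReader, LocalReader and time_reader are made up for illustration. The timings will vary with interpreter version and platform, so no expected numbers are shown.

```python
import io
import timeit


def make_fasta(n_lines=20000):
    """Build synthetic FASTA-like text: one header line, then sequence lines."""
    return ">header\n" + "ACGTACGTAC\n" * n_lines


class AttrReader:
    """Accumulates directly on the attribute, like the slow method."""

    def readgenome(self, filehandle):
        self.s = ''
        for line in filehandle:
            if '>' in line:
                continue
            self.s += line.strip()


class LocalReader:
    """Accumulates in a local variable and assigns the attribute once."""

    def readgenome(self, filehandle):
        s = ''
        for line in filehandle:
            if '>' in line:
                continue
            s += line.strip()
        self.s = s


def time_reader(cls, text, repeat=3):
    """Return the best wall-clock time of `repeat` single runs."""
    def run():
        cls().readgenome(io.StringIO(text))
    return min(timeit.repeat(run, number=1, repeat=repeat))


if __name__ == "__main__":
    text = make_fasta()
    print("attribute accumulation: %.4f s" % time_reader(AttrReader, text))
    print("local accumulation:     %.4f s" % time_reader(LocalReader, text))
```

Both classes should leave the same string in self.s; only the time taken to build it is expected to differ.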
To further speed things up, think about using the str.join idiom
instead of repeated += on strings, and using a generator expression
instead of an explicit loop.
# The method in a class
class bar:
    def readgenome(self, filehandle):
        # Skip lines containing '>', as the loop version does.
        self.s = ''.join(line.strip() for line in filehandle
                         if '>' not in line)
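
As a quick sanity check that a join-based version can match the explicit loop, including the skipping of '>' lines, something like the following could be run. It is a sketch written for Python 3, with plain functions named readgenome_loop and readgenome_join (names invented here) and a tiny made-up sample in place of a real genome file.

```python
import io


def readgenome_loop(filehandle):
    """Explicit loop with +=, skipping lines that contain '>'."""
    s = ''
    for line in filehandle:
        if '>' in line:
            continue
        s += line.strip()
    return s


def readgenome_join(filehandle):
    """Same result via str.join and a filtered generator expression."""
    return ''.join(line.strip() for line in filehandle if '>' not in line)


# Hypothetical sample: two header lines and three sequence lines.
sample = ">seq1 description\nACGT\nTTGA\n>seq2\nCCAA\n"

loop_result = readgenome_loop(io.StringIO(sample))
join_result = readgenome_join(io.StringIO(sample))
assert loop_result == join_result
print(join_result)  # prints ACGTTTGACCAA
```

The join version collects all the stripped pieces and concatenates them in one pass, so it does not depend on how the interpreter handles repeated += on a growing string.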
--
Neil Cerutti