[Python-Dev] xreadlines : readlines :: xrange : range
Guido van Rossum
guido@python.org
Thu, 04 Jan 2001 09:16:39 -0500
[Thomas finds that on FreeBSD, getc() is faster than getc_unlocked().]
Thomas, I really don't understand it. The getc() source code you
showed calls getc_unlocked(). So how can it be faster? The answer
must be somewhere else... Cache line conflicts, the rewriting of the
loop that I did, a compiler bug, the inlining, who knows. Can you
compare the generated assembly code? On other platforms,
getc_unlocked() typically speeds the readline() test case up by a
significant factor (as in your BSDI numbers, where it's almost 3x
faster).
Could it be that you're mistaken and that somehow getc_unlocked() is
*not* chosen on FreeBSD? Then I could believe it: the rewritten loop
is so different that the optimizer might have treated it differently.
to it. (Check config.h. When all else fails, I put an #error in the
#ifdef branch that I expect not to be taken.)
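
A minimal sketch of the #error check described above. It assumes
HAVE_GETC_UNLOCKED is the macro that config.h defines; the GETC and
GETC_NAME macros and both helper functions are hypothetical names used
only for illustration, not code from the Python source. The #error line
is left commented out so the fragment compiles either way. Uncommenting
it makes the compile stop if the unexpected branch is taken.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for the getc_unlocked() declaration */
#endif
#include <assert.h>
#include <stdio.h>

/* Assumption: HAVE_GETC_UNLOCKED is the macro that config.h defines. */
#ifdef HAVE_GETC_UNLOCKED
#define GETC(f) getc_unlocked(f)
#define GETC_NAME "getc_unlocked"
#else
/* Branch expected NOT to be taken. Uncomment the next line to make
   the compile fail loudly if it is taken after all: */
/* #error "HAVE_GETC_UNLOCKED not defined: plain getc() selected" */
#define GETC(f) getc(f)
#define GETC_NAME "getc"
#endif

/* Hypothetical helper: reports which branch was compiled in. */
const char *getc_choice(void)
{
    return GETC_NAME;
}

/* Hypothetical helper: reads one character through the chosen macro. */
int read_one(FILE *fp)
{
    assert(fp != NULL);
    return GETC(fp);
}
```

Printing getc_choice() from a test build is a gentler alternative when
a hard compile failure is not wanted.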
Could it be that somehow getc_unlocked() is later defined to be the
same as getc(), so choosing it just adds the overhead of calling
f[un]lockfile() for each line?
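
To make the overhead in question concrete, here is a rough sketch of a
line reader that locks once per line and then uses getc_unlocked() per
character. The function name read_line_unlocked and its buffer handling
are assumptions for illustration, not the actual readline() loop. If
getc_unlocked() were really just getc(), every character would still
take the stream lock internally, and the flockfile()/funlockfile() pair
would be pure extra cost on each line.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for flockfile() and getc_unlocked() */
#endif
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads up to size-1 characters, stopping after a
   newline or at EOF. Returns the number of characters stored. */
size_t read_line_unlocked(FILE *fp, char *buf, size_t size)
{
    size_t n = 0;
    int c;

    assert(fp != NULL && buf != NULL && size > 0);

    flockfile(fp);                      /* one lock for the whole line */
    while (n + 1 < size && (c = getc_unlocked(fp)) != EOF) {
        buf[n++] = (char)c;             /* no per-character locking here */
        if (c == '\n')
            break;
    }
    funlockfile(fp);                    /* one unlock for the whole line */

    buf[n] = '\0';
    return n;
}
```

The payoff depends on getc_unlocked() truly skipping the lock, which is
why checking what the platform headers define it to be is worthwhile.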
--Guido van Rossum (home page: http://www.python.org/~guido/)