[Python-Dev] xreadlines : readlines :: xrange : range

Guido van Rossum guido@python.org
Thu, 04 Jan 2001 09:16:39 -0500


[Thomas finds that on FreeBSD, getc() is faster than getc_unlocked().]

Thomas, I really don't understand it.  The getc() source code you
showed calls getc_unlocked().  So how can it be faster?  The answer
must be somewhere else...  Cache line conflicts, the rewriting of the
loop that I did, a compiler bug, the inlining, who knows.  Can you
compare the generated assembly code?  On other platforms,
getc_unlocked() typically speeds the readline() test case up by a
significant factor (as in your BSDI numbers, where it's almost 3x
faster).

Could it be that you're mistaken and that somehow getc_unlocked() is
*not* chosen on FreeBSD?  Then I could believe it: the rewritten loop
is so different that the optimizer might have done something different
to it.  (Check config.h.  When all else fails, I put an #error in the
#ifdef branch that I expect not to be taken.)
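
A minimal sketch of the #error check described above.  The macro name
HAVE_GETC_UNLOCKED is assumed here to be what configure writes into
config.h, and it is defined by hand so the fragment compiles on its own;
GETC, GETC_NAME and chosen_getc() are hypothetical names used only for
illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* ask for the getc_unlocked() declaration */
#endif
#include <stdio.h>

/* Assumption: stand-in for the definition that config.h would supply. */
#define HAVE_GETC_UNLOCKED 1

#ifdef HAVE_GETC_UNLOCKED
#define GETC(f) getc_unlocked(f)
#define GETC_NAME "getc_unlocked"
#else
/* Branch expected NOT to be taken: the build stops here if it is. */
#error "HAVE_GETC_UNLOCKED is not defined; the getc() fallback was selected"
#define GETC(f) getc(f)
#define GETC_NAME "getc"
#endif

/* Hypothetical helper: report which variant the preprocessor selected. */
const char *chosen_getc(void)
{
    return GETC_NAME;
}
```

If the file compiles, the branch believed to be taken really was taken.
Moving the #error into the other branch tests the opposite belief.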

Could it be that somehow getc_unlocked() is later defined to be the
same as getc(), so choosing it just adds the overhead of calling
f[un]lockfile() for each line?
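
To make the cost in question concrete, here is a rough sketch of a
line reader that locks the stream once per line and fetches characters
with getc_unlocked().  It is not the actual readline() code from the
patch; read_line_unlocked() and its parameters are hypothetical.  If
getc_unlocked() were just an alias for getc(), the flockfile() and
funlockfile() calls below would be pure overhead.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* ask for flockfile() and friends */
#endif
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: read one line (at most size-1 bytes) into buf,
   NUL-terminate it, and return the number of bytes stored. */
size_t read_line_unlocked(FILE *fp, char *buf, size_t size)
{
    size_t n = 0;
    int c;

    if (size == 0)
        return 0;

    flockfile(fp);                 /* one lock per line ... */
    while (n + 1 < size && (c = getc_unlocked(fp)) != EOF) {
        buf[n++] = (char)c;
        if (c == '\n')             /* keep the newline, then stop */
            break;
    }
    funlockfile(fp);               /* ... and one unlock per line */

    buf[n] = '\0';
    return n;
}
```

The lock is taken once per line rather than once per character, which
is the whole point of the unlocked variant; the gain disappears if the
per-character call still locks internally.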

--Guido van Rossum (home page: http://www.python.org/~guido/)