[Python-Dev] xreadlines : readlines :: xrange : range

Tim Peters tim.one@home.com
Fri, 5 Jan 2001 01:04:56 -0500


[Guido van Rossum]
> ...
> (Unfortunately, from a phone conversation I had last night with
> Tim, there's not much hope of doing something there -- and that
> platform [Win32] sorely needs it!  The hacks that Tim reported
> earlier are definitely not thread-safe.  While it's easy to come
> up with getc_unlocked() for Windows, the locking operations used
> internally there by the /MT code are not exported from MSVCRT.DLL,
> and that's crucial.)

The short course is that I still haven't found a workable way to lock
streams on Windows:  they do have a complete set of stream-locking functions
and macros, but there's no way short of deep magic I can find to get at them
("deep magic" == resort to assembler and patch in function addresses).
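
To make concrete why access to the stream locks matters, here is a minimal sketch (not the C runtime's code, and not anything in CVS) that models a stream with its own lock and counts lock acquisitions when reading one line. The class LockedStream, its getc/getc_unlocked methods, and both get_line functions are hypothetical names invented for the illustration; they only mimic the lock-per-character versus lock-once-per-line distinction that getc() and getc_unlocked() embody on platforms that export the locking functions.

```python
import io
import threading


class LockedStream:
    """Hypothetical stand-in for a C FILE* that carries its own lock."""

    def __init__(self, text):
        self._buf = io.StringIO(text)
        self.lock = threading.Lock()
        self.lock_acquisitions = 0  # how many times the lock was taken

    def getc(self):
        # Models a thread-safe getc(): lock, read one character, unlock.
        with self.lock:
            self.lock_acquisitions += 1
            return self._buf.read(1)

    def getc_unlocked(self):
        # Models getc_unlocked(): the caller must already hold self.lock.
        return self._buf.read(1)


def get_line_locked_per_char(stream):
    """Read one line, paying for the lock on every character."""
    chars = []
    while True:
        c = stream.getc()
        if c == "":          # end of input
            break
        chars.append(c)
        if c == "\n":
            break
    return "".join(chars)


def get_line_locked_once(stream):
    """Read one line, taking the lock a single time around the whole loop."""
    chars = []
    with stream.lock:
        stream.lock_acquisitions += 1
        while True:
            c = stream.getc_unlocked()
            if c == "":
                break
            chars.append(c)
            if c == "\n":
                break
    return "".join(chars)


slow = LockedStream("hello world\nsecond\n")
fast = LockedStream("hello world\nsecond\n")
slow_line = get_line_locked_per_char(slow)
fast_line = get_line_locked_once(fast)
print(slow.lock_acquisitions, fast.lock_acquisitions)  # prints: 12 1
```

Both functions return the same line; the only difference is how often the lock is taken. Without a way to reach the runtime's own stream lock, the second form can't be written safely on Windows.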

The only file-locking functions advertised in the C and platform SDK
libraries are trivial variants of Python's msvcrt.locking, but those lock
specific byte-position ranges of a file across processes; they do nothing
to ensure the integrity of runtime stream structures across threads.

Perl appears to ignore the issue of thread safety here (on Windows and
everywhere else).

Revealing experiment!

1. I threw away my changes and rebuilt from current CVS.

2. I made one change, expanding the getc() call in get_line to what MSVC
*would* expand it to if we weren't building in thread mode:

    if ((c = (--fp->_cnt >= 0 ?
              0xff & *fp->_ptr++ :
              _filbuf(fp))) == EOF) {

That alone reduced the runtime of my "while 1: readline" test case from over
30 seconds to 12.8 seconds.  What I did before went beyond that, by also (in
effect) unrolling the loop and optimizing it.  That bought an additional ~2
seconds.
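
For readers who want to try something similar, a "while 1: readline" timing test might look roughly like the sketch below. The actual test script and input file are not shown in this message, so everything here is an assumption: the function names, the generated 100,000-line input, and the use of time.perf_counter (which postdates this post) are all made up for illustration.

```python
import os
import tempfile
import time


def count_lines_with_readline(path):
    """Read a file with the 'while 1: readline' idiom; return the line count."""
    count = 0
    with open(path) as fp:
        while 1:
            line = fp.readline()
            if not line:      # empty string means end of file
                break
            count += 1
    return count


def time_readline_loop(path):
    """Return (line_count, elapsed_seconds) for one pass over the file."""
    start = time.perf_counter()
    count = count_lines_with_readline(path)
    return count, time.perf_counter() - start


# Build a throwaway input file (hypothetical data, not the original test file).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.writelines(f"line {i}\n" for i in range(100_000))
    sample_path = tmp.name

line_count, elapsed = time_readline_loop(sample_path)
os.remove(sample_path)
print(f"{line_count} lines in {elapsed:.3f} s")
```

The absolute numbers from a sketch like this say nothing about the 2001 figures quoted here; it only shows the shape of the loop being timed.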

So compared to Perl's 6 seconds, it looks like we're paying (on Win98SE)
approximately:

   17 seconds for compiling with _MT (threadsafe libc)
    6 seconds to do the work <wink>
    5 seconds for "other stuff", best guess mostly a poor
          platform malloc/realloc
    2 seconds for not optimizing the loop
   --
   30 total

Unfortunately, the smoking gun is the only one whose firing pin we can't
file down on this platform.
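
The arithmetic behind the breakdown can be checked with a few lines. This is only a restatement of the approximate figures above; the dictionary keys and variable names are made up, and the "best case" line is one reading of the numbers (everything except the _MT cost somehow fixed), not a measurement.

```python
# Approximate costs in seconds, copied from the breakdown above.
breakdown = {
    "compiling with _MT (threadsafe libc)": 17,
    "doing the work (Perl's time)": 6,
    "other stuff (guess: platform malloc/realloc)": 5,
    "not optimizing the loop": 2,
}

total = sum(breakdown.values())
perl_seconds = breakdown["doing the work (Perl's time)"]
mt_seconds = breakdown["compiling with _MT (threadsafe libc)"]

# Dropping only the _MT cost should land near the measured 12.8 seconds.
without_mt = total - mt_seconds

# If the _MT cost is the one item that cannot be removed, the floor is
# the work itself plus the locking overhead.
best_case_with_mt = perl_seconds + mt_seconds

print(total)                             # 30
print(without_mt)                        # 13
print(best_case_with_mt)                 # 23
print(best_case_with_mt / perl_seconds)  # roughly 3.8 times Perl's time
```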

so-the-good-news-is-that-it's-impossible-for-perl-not-to-be-at-
    least-twice-as-fast<wink>-ly y'rs  - tim