[Python-Dev] file.readinto performance regression in Python 3.2 vs. 2.7?

Fri Nov 25 10:34:21 CET 2011

On Fri, Nov 25, 2011 at 5:41 PM, Eli Bendersky <eliben at gmail.com> wrote:
>> Eli, the use pattern I was referring to is when you read in chunks,
>> and and append to a running buffer. Presumably if you know in advance
>> the size of the data, you can readinto directly to a region of a
>> bytearray. There by avoiding having to allocate a temporary buffer for
>> the read, and creating a new buffer containing the running buffer,
>> plus the new.
>>
>> Strangely, I find that your readandcopy is faster at this, but not by
>> much, than readinto. Here's the code, it's a bit explicit, but then so
>> was the original:
>>
>> BUFSIZE = 0x10000
>>
>> def justread():
>>    # Just read a file's contents into a string/bytes object
>>    f = open(FILENAME, 'rb')
>>    s = b''
>>    while True:
>>        b = f.read(BUFSIZE)
>>        if not b:
>>            break
>>        s += b
>>
>> def readandcopy():
>>    # Read a file's contents and copy them into a bytearray.
>>    # An extra copy is done here.
>>    f = open(FILENAME, 'rb')
>>    s = bytearray()
>>    while True:
>>        b = f.read(BUFSIZE)
>>        if not b:
>>            break
>>        s += b
>>
>> def readinto():
>>    # Read a file's contents directly into a bytearray,
>>    # hopefully employing its buffer interface
>>    f = open(FILENAME, 'rb')
>>    s = bytearray(os.path.getsize(FILENAME))
>>    o = 0
>>    while True:
>>        b = f.readinto(memoryview(s)[o:o+BUFSIZE])
>>        if not b:
>>            break
>>        o += b
>>
>> And the timings:
>>
>> $ python3 -O -m timeit 'import fileread_bytearray'
>> 'fileread_bytearray.justread()'
>> 10 loops, best of 3: 298 msec per loop
>> $ python3 -O -m timeit 'import fileread_bytearray'
>> 'fileread_bytearray.readandcopy()'
>> 100 loops, best of 3: 9.22 msec per loop
>> $ python3 -O -m timeit 'import fileread_bytearray'
>> 'fileread_bytearray.readinto()'
>> 100 loops, best of 3: 9.31 msec per loop
>>
>> The file was 10MB. I expected readinto to perform much better than
>> readandcopy. I expected readandcopy to perform slightly better than
>> justread. This clearly isn't the case.
>>
>
> What is 'python3' on your machine? If it's 3.2, then this is consistent with
> my results. Try it with 3.3 and for a larger file (say ~100MB and up), you
> may see the same speed as on 2.7

It's Python 3.2. I tried it for larger files and got some interesting results.

readinto() for 10MB files, reading 10MB all at once:

readinto/2.7 100 loops, best of 3: 8.6 msec per loop
readinto/3.2 10 loops, best of 3: 29.6 msec per loop
readinto/3.3 100 loops, best of 3: 19.5 msec per loop

With 100KB chunks for the 10MB file (annotated with #):

matt at stanley:~/Desktop$ for f in read bytearray_read readinto; do for
v in 2.7 3.2 3.3; do echo -n "$f/$v "; "python$v" -m timeit -s 'import
readinto' "readinto.$f()"; done; done
read/2.7 100 loops, best of 3: 7.86 msec per loop # this is actually
faster than the 10MB read
read/3.2 10 loops, best of 3: 253 msec per loop # wtf?
read/3.3 10 loops, best of 3: 747 msec per loop # wtf??
bytearray_read/2.7 100 loops, best of 3: 7.9 msec per loop
bytearray_read/3.2 100 loops, best of 3: 7.48 msec per loop
bytearray_read/3.3 100 loops, best of 3: 15.8 msec per loop # wtf?
readinto/2.7 100 loops, best of 3: 8.93 msec per loop
readinto/3.2 100 loops, best of 3: 10.3 msec per loop # suddenly 3.2
is performing well?
readinto/3.3 10 loops, best of 3: 20.4 msec per loop

Here's the code: http://pastebin.com/nUy3kWHQ

>
> Also, why do you think chunked reads are better here than slurping the whole
> file into the bytearray in one go? If you need it wholly in memory anyway,
> why not just issue a single read?

Sometimes it's not available all at once, I do a lot of socket
programming, so this case is of interest to me. As shown above, it's
also faster for python2.7. readinto() should also be significantly
faster for this case, tho it isn't.

>
> Eli
>
>