[Python-ideas] speeding up shutil.copy*()

Daniel Holth dholth at gmail.com
Sun Mar 3 19:50:23 CET 2013


Great idea. I would also appreciate being able to simply specify the block
size in more places.

This is probably the kind of change that you could get in as a patch.
On Mar 3, 2013 1:40 PM, "Charles-François Natali" <cf.natali at gmail.com>
wrote:

> > This allocates and frees a lot of buffers, and could be optimized with
> > readinto().
> > Unfortunately, I don't think we can change copyfileobj(), because it
> > might be passed objects that don't implement readinto().
>
> Or we could just use:
> if hasattr(fileobj, 'readinto')
>
> hoping that readinto() is really a readinto() implementation and not
> an unrelated method :-)
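
A minimal sketch of the hasattr() idea quoted above, not code from the thread or from shutil itself. The function name copyfileobj_readinto and the default block size are made up for illustration. It assumes that any readinto() it finds follows the io module's contract (fills the buffer and returns the byte count).

```python
import io


def copyfileobj_readinto(fsrc, fdst, length=64 * 1024):
    """Copy fsrc to fdst, reusing one buffer when fsrc offers readinto().

    Hypothetical helper: 'length' is the block size, exposed so callers
    can tune it.
    """
    if not hasattr(fsrc, "readinto"):
        # No readinto(): classic loop, one new bytes object per block.
        while True:
            chunk = fsrc.read(length)
            if not chunk:
                break
            fdst.write(chunk)
        return

    buf = bytearray(length)   # allocated once, reused for every block
    view = memoryview(buf)    # slicing a memoryview does not copy
    while True:
        n = fsrc.readinto(buf)
        if not n:             # 0 at EOF (None from a non-blocking raw stream also stops here)
            break
        fdst.write(view[:n])
```

Usage would look like copyfileobj_readinto(open(src, "rb"), open(dst, "wb"), length=1024 * 1024); objects without readinto() simply take the old path.
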
>
> > sendfile() is a Linux-only syscall. It's also limited to certain kinds
> > of file descriptors. The limitations have been lifted in recent kernel
> > versions.
>
> No, it's not Linux-only: many BSDs also have it, although not all of
> them support an arbitrary output file descriptor (Solaris does allow
> regular files too). It would be possible to catch EINVAL/EBADF, and
> fall back to a regular copy loop.
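
A rough sketch of the "catch EINVAL/EBADF and fall back" approach quoted above, under stated assumptions. The name copy_with_sendfile and the block size are hypothetical. os.sendfile() exists only on some platforms (Python 3.3+), and ENOTSOCK is an extra errno added here for systems whose sendfile() only accepts a socket as output; it is not mentioned in the thread.

```python
import errno
import os
import shutil


def copy_with_sendfile(src_path, dst_path, blocksize=1024 * 1024):
    """Copy a file with os.sendfile() where possible, else a plain loop."""
    # EINVAL/EBADF come from the thread; ENOTSOCK is an assumption.
    fallback_errnos = (errno.EINVAL, errno.EBADF, errno.ENOTSOCK)
    with open(src_path, "rb") as fsrc, open(dst_path, "wb") as fdst:
        offset = 0
        if hasattr(os, "sendfile"):
            try:
                while True:
                    sent = os.sendfile(fdst.fileno(), fsrc.fileno(),
                                       offset, blocksize)
                    if sent == 0:      # EOF reached, copy is complete
                        return
                    offset += sent
            except OSError as exc:
                if exc.errno not in fallback_errnos:
                    raise
        # sendfile() unavailable or unsupported for these descriptors:
        # resume from whatever was already transferred.
        fsrc.seek(offset)
        fdst.seek(offset)
        shutil.copyfileobj(fsrc, fdst, blocksize)
```

As the quoted message says, the gain depends heavily on the target: with a real disk the I/O devices are likely to be the bottleneck either way.
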
>
> Note that the above benchmark is really biased by writing the data to
> /dev/null: with a real target file, the zero-copy wouldn't bring such
> a large gain, because the bottleneck will really be the I/O devices
> (also a read()/write() loop is more expensive in Python than in C).
> But I see at least two cases where it could be interesting: when
> reading/writing from/to a tmpfs partition, or when the source and
> target files are on different disks.
>
> I'm not sure it's worth it though, that's why I'm asking here :-) (but
> I do think readinto() is interesting).