[issue33671] Efficient zero-copy syscalls for shutil.copy* functions (Linux, OSX and Win)

Giampaolo Rodola' report at bugs.python.org
Mon May 28 12:17:25 EDT 2018


New submission from Giampaolo Rodola' <g.rodola at gmail.com>:

The attached patch uses platform-specific zero-copy syscalls on Linux and Solaris (sendfile(2), via os.sendfile()), Windows (CopyFileW) and OSX (fcopyfile(2)), speeding up shutil.copyfile() and the other functions that use it (copy(), copy2(), copytree(), move()).

The average speedup for a 512MB file copy is +26% on Linux, +50% on OSX and +48% on Windows when copying a file within the same partition (an SSD was used).
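
For readers unfamiliar with the technique, here is a minimal sketch of what a sendfile-based copy can look like on Linux. It is not the code in the attached patch: the helper name sendfile_copy() and its chunk_size parameter are made up for illustration, and it assumes a platform where os.sendfile() accepts a regular file as the destination (Linux does; OSX and Windows need the other calls mentioned above). Error handling and fallbacks are left out.

```python
import os


def sendfile_copy(src_path, dst_path, chunk_size=8 * 1024 * 1024):
    """Copy src_path to dst_path using os.sendfile(); return bytes copied.

    Hypothetical helper for illustration only, not the patch itself.
    The data moves from one file descriptor to the other inside the
    kernel, so it never passes through a Python-level buffer.
    """
    with open(src_path, "rb") as fsrc, open(dst_path, "wb") as fdst:
        infd = fsrc.fileno()
        outfd = fdst.fileno()
        offset = 0
        while True:
            # Ask the kernel to copy up to chunk_size bytes starting at offset.
            sent = os.sendfile(outfd, infd, offset, chunk_size)
            if sent == 0:
                break  # end of the source file
            offset += sent
    return offset
```

The loop is needed because a single os.sendfile() call may transfer fewer bytes than requested; a return value of 0 signals that the source is exhausted.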

Some benchmarks follow.

Setup
=====

Create a 128K, 8M or 512M file (one command per size):

    $ python -c "import os; f = open('f1', 'wb'); f.write(os.urandom(128 * 1024))"
    $ python -c "import os; f = open('f1', 'wb'); f.write(os.urandom(8 * 1024 * 1024))"
    $ python -c "import os; f = open('f1', 'wb'); f.write(os.urandom(512 * 1024 * 1024))"

Benchmark:

    $ time ./python -m timeit -s 'import shutil; p1 = "f1"; p2 = "f2"' 'shutil.copyfile(p1, p2)'
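
The same measurement can be scripted instead of being run from the shell. The sketch below is one way to do it; the function name bench_copyfile() and its defaults are assumptions for illustration, and it writes to a temporary directory, which may live on a different filesystem than the SSD partition used for the numbers in this report.

```python
import os
import shutil
import tempfile
import timeit


def bench_copyfile(size, number=5, repeat=3):
    """Return the best time in seconds for one shutil.copyfile() of `size` bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "f1")
        dst = os.path.join(tmp, "f2")
        # Same setup as the shell one-liners: a file of random bytes.
        with open(src, "wb") as f:
            f.write(os.urandom(size))
        timings = timeit.repeat(
            lambda: shutil.copyfile(src, dst), number=number, repeat=repeat
        )
        # Like "best of 5" in the timeit CLI: keep the fastest repetition.
        return min(timings) / number


if __name__ == "__main__":
    for size in (128 * 1024, 8 * 1024 * 1024):
        best = bench_copyfile(size)
        print(f"{size} bytes: {best * 1e3:.3f} msec per copy")
```

Run it once with an unpatched interpreter and once with a patched one to compare.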

Linux
=====

128K copy (+13%):

    without patch:
        1000 loops, best of 5: 228 usec per loop
        real    0m1.756s
        user    0m0.386s
        sys     0m1.116s

    with patch:
        1000 loops, best of 5: 198 usec per loop
        real    0m1.464s
        user    0m0.281s
        sys     0m0.958s

8MB copy (+24%):

    without patch:
        50 loops, best of 5: 10.1 msec per loop
        real    0m2.703s
        user    0m0.316s
        sys     0m1.847s

    with patch:
        50 loops, best of 5: 7.78 msec per loop
        real    0m2.447s
        user    0m0.086s
        sys     0m1.682s

512MB copy (+26%):

    without patch:
        1 loop, best of 5: 872 msec per loop
        real    0m5.574s
        user    0m0.402s
        sys     0m3.115s

    with patch:
        1 loop, best of 5: 646 msec per loop
        real    0m5.475s
        user    0m0.037s
        sys     0m2.959s

OSX
===

128K copy (+8.5%):

    without patch:
        500 loops, best of 5: 508 usec per loop
        real    0m2.971s
        user    0m0.442s
        sys     0m2.168s

    with patch:
        500 loops, best of 5: 464 usec per loop
        real    0m2.798s
        user    0m0.379s
        sys     0m2.031s

8MB copy (+67%):

    without patch:
        20 loops, best of 5: 32.8 msec per loop
        real    0m3.672s
        user    0m0.357s
        sys     0m1.434s

    with patch:
        20 loops, best of 5: 10.8 msec per loop
        real    0m1.860s
        user    0m0.079s
        sys     0m0.719s

512MB copy (+50%):

    without patch:
        1 loop, best of 5: 953 msec per loop
        real    0m5.930s
        user    0m1.021s
        sys     0m4.835s
    
    with patch:
        1 loop, best of 5: 480 msec per loop
        real    0m3.150s
        user    0m0.067s
        sys     0m2.740s

Windows
=======

128K copy (+69%):

    without patch:
        50 loops, best of 5: 6.45 msec per loop
    with patch:
        50 loops, best of 5: 1.99 msec per loop

8MB copy (+64%):

    without patch:
        10 loops, best of 5: 22.6 msec per loop
    with patch:
        50 loops, best of 5: 7.95 msec per loop

512MB copy (+48%):

    without patch:
        1 loop, best of 5: 1.21 sec per loop
    with patch:
        1 loop, best of 5: 629 msec per loop
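
The percentages in the headings above appear to be the relative reduction in timeit's per-loop time, i.e. (without - with) / without, rather than a throughput ratio. That reading is inferred from the numbers, not stated anywhere in the report. A small sketch to recompute them from the figures listed (the function name is made up for illustration):

```python
def time_reduction_pct(before, after):
    """Percentage by which the per-loop time dropped; both in the same unit."""
    return (before - after) / before * 100


# (platform, size, without patch, with patch), all in msec per loop,
# copied from the timeit output in this report.
results = [
    ("Linux", "128K", 0.228, 0.198),
    ("Linux", "8MB", 10.1, 7.78),
    ("Linux", "512MB", 872, 646),
    ("OSX", "128K", 0.508, 0.464),
    ("OSX", "8MB", 32.8, 10.8),
    ("OSX", "512MB", 953, 480),
    ("Windows", "128K", 6.45, 1.99),
    ("Windows", "8MB", 22.6, 7.95),
    ("Windows", "512MB", 1210, 629),
]

for platform, size, before, after in results:
    pct = time_reduction_pct(before, after)
    print(f"{platform:8} {size:6} {pct:5.1f}%")
```

The recomputed values agree with the headings to within about a percentage point.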

----------
components: Library (Lib)
files: shutil-zero-copy.diff
keywords: needs review, patch
messages: 317878
nosy: giampaolo.rodola
priority: normal
severity: normal
stage: patch review
status: open
title: Efficient zero-copy syscalls for shutil.copy* functions (Linux, OSX and Win)
type: performance
versions: Python 3.8
Added file: https://bugs.python.org/file47621/shutil-zero-copy.diff

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue33671>
_______________________________________

