[Python-checkins] bpo-36103: change default buffer size of shutil.copyfileobj() (GH-12115)

Inada Naoki webhook-mailer at python.org
Fri Mar 1 23:31:04 EST 2019


https://github.com/python/cpython/commit/4f1903061877776973c1bbfadd3d3f146920856e
commit: 4f1903061877776973c1bbfadd3d3f146920856e
branch: master
author: Inada Naoki <methane at users.noreply.github.com>
committer: GitHub <noreply at github.com>
date: 2019-03-02T13:31:01+09:00
summary:

bpo-36103: change default buffer size of shutil.copyfileobj() (GH-12115)

The default is changed from 16 KiB to 64 KiB.  The previous default value
has been in use since 1990.

coreutils chose 128 KiB as minimum buffer size for block device I/O.

But shutil.copyfileobj() can also be used with non-block devices,
so I chose a more conservative value.

In my quick benchmark, the performance difference between 64 KiB and
128 KiB is up to ~5%.  On the other hand, the performance difference
between 32 KiB and 64 KiB can be more than 10% when the file is fully
buffered.

This is why 64 KiB is a reasonable value.
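
The comparison above can be reproduced roughly with a sketch like the
following. It is not the benchmark used for this commit: the helper names
time_copy() and read_calls(), the 8 MiB test file, and the list of buffer
sizes are assumptions chosen for illustration. It relies only on the
documented length argument of shutil.copyfileobj(), and the read-call
count assumes every read returns a full buffer until the file is exhausted.

```python
import math
import os
import shutil
import tempfile
import time


def time_copy(src_path, dst_path, bufsize):
    """Copy src_path to dst_path via copyfileobj() and return elapsed seconds."""
    start = time.perf_counter()
    with open(src_path, "rb") as fsrc, open(dst_path, "wb") as fdst:
        # An explicit length overrides the module default (COPY_BUFSIZE).
        shutil.copyfileobj(fsrc, fdst, length=bufsize)
    return time.perf_counter() - start


def read_calls(size, bufsize):
    """Estimate read() calls made by the copy loop.

    One call per chunk, plus the final call that returns b"" and ends
    the loop. Assumes no short reads (hypothetical best case).
    """
    return math.ceil(size / bufsize) + 1


if __name__ == "__main__":
    size = 8 * 1024 * 1024  # 8 MiB test file (arbitrary choice)
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src.bin")
        dst = os.path.join(tmp, "dst.bin")
        with open(src, "wb") as f:
            f.write(os.urandom(size))
        for kib in (16, 32, 64, 128):
            bufsize = kib * 1024
            # Take the best of several runs to reduce noise.
            best = min(time_copy(src, dst, bufsize) for _ in range(5))
            print(f"{kib:4d} KiB: {best * 1000:7.2f} ms, "
                  f"~{read_calls(size, bufsize)} read() calls")
```

Timings depend heavily on the platform, the file system, and whether the
source file is already in the page cache, so the numbers printed by this
sketch should not be expected to match the percentages quoted above.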

files:
A Misc/NEWS.d/next/Library/2019-03-01-16-10-01.bpo-36103.n6VgXL.rst
M Doc/library/shutil.rst
M Lib/shutil.py

diff --git a/Doc/library/shutil.rst b/Doc/library/shutil.rst
index 79d6bd4a06c8..587be3befa09 100644
--- a/Doc/library/shutil.rst
+++ b/Doc/library/shutil.rst
@@ -424,7 +424,7 @@ On Linux, Solaris and other POSIX platforms where :func:`os.sendfile` supports
 copies between 2 regular file descriptors :func:`os.sendfile` is used.
 
 On Windows :func:`shutil.copyfile` uses a bigger default buffer size (1 MiB
-instead of 16 KiB) and a :func:`memoryview`-based variant of
+instead of 64 KiB) and a :func:`memoryview`-based variant of
 :func:`shutil.copyfileobj` is used.
 
 If the fast-copy operation fails and no data was written in the destination
diff --git a/Lib/shutil.py b/Lib/shutil.py
index 9b50c2a9833a..7dd470dfaba4 100644
--- a/Lib/shutil.py
+++ b/Lib/shutil.py
@@ -49,7 +49,7 @@
 elif _WINDOWS:
     import nt
 
-COPY_BUFSIZE = 1024 * 1024 if _WINDOWS else 16 * 1024
+COPY_BUFSIZE = 1024 * 1024 if _WINDOWS else 64 * 1024
 _HAS_SENDFILE = posix and hasattr(os, "sendfile")
 _HAS_FCOPYFILE = posix and hasattr(posix, "_fcopyfile")  # macOS
 
diff --git a/Misc/NEWS.d/next/Library/2019-03-01-16-10-01.bpo-36103.n6VgXL.rst b/Misc/NEWS.d/next/Library/2019-03-01-16-10-01.bpo-36103.n6VgXL.rst
new file mode 100644
index 000000000000..97ed658d3762
--- /dev/null
+++ b/Misc/NEWS.d/next/Library/2019-03-01-16-10-01.bpo-36103.n6VgXL.rst
@@ -0,0 +1,3 @@
+Default buffer size used by ``shutil.copyfileobj()`` is changed from 16 KiB
+to 64 KiB on non-Windows platform to reduce system call overhead. Contributed
+by INADA Naoki.


