[issue32453] shutil.rmtree can have O(n^2) performance on large dirs
Niklas Hambüchen
report at bugs.python.org
Sun Dec 31 15:11:40 EST 2017
Niklas Hambüchen <nh2 at deditus.de> added the comment:
It turns out I was wrong when saying that there's some cache we're hitting.
In fact, `rm -r` is still precisely O(n^2), even with the coreutils patch I linked.
Quick overview table of the benchmark:
nfiles real user sys
100000 0.51s 0.07s 0.43s
200000 2.46s 0.15s 0.89s
400000 10.78s 0.26s 2.21s
800000 44.72s 0.58s 6.03s
1600000 180.37s 1.06s 10.70s
Each 2x increase of number of files results in 4x increased deletion time.
Full benchmark output:
# time (mkdir dirtest && cd dirtest && seq 1 100000 | xargs touch)
real 0m0.722s
user 0m0.032s
sys 0m0.680s
# time rm -rf dirtest/
real 0m0.519s
user 0m0.074s
sys 0m0.437s
# time (mkdir dirtest && cd dirtest && seq 1 200000 | xargs touch)
real 0m1.576s
user 0m0.044s
sys 0m1.275s
# time rm -r dirtest/
real 0m2.469s
user 0m0.150s
sys 0m0.890s
# time (mkdir dirtest && cd dirtest && seq 1 400000 | xargs touch)
real 0m4.249s
user 0m0.098s
sys 0m2.804s
# time rm -rf dirtest/
real 0m10.782s
user 0m0.265s
sys 0m2.213s
# time (mkdir dirtest && cd dirtest && seq 1 800000 | xargs touch)
real 0m10.533s
user 0m0.204s
sys 0m5.758s
# time rm -rf dirtest/
real 0m44.725s
user 0m0.589s
sys 0m6.037s
# time (mkdir dirtest && cd dirtest && seq 1 1600000 | xargs touch)
real 0m34.480s
user 0m0.382s
sys 0m12.057s
# time rm -r dirtest/
real 3m0.371s
user 0m1.069s
sys 0m10.704s
----------
_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue32453>
_______________________________________
More information about the Python-bugs-list
mailing list