[Python-checkins] Add comments regarding speed/space/entropy trade-offs (GH-10885)

Miss Islington (bot) webhook-mailer at python.org
Tue Dec 4 03:13:43 EST 2018


https://github.com/python/cpython/commit/7fc633f5a56d9e672cd24133e2e1376347abac6c
commit: 7fc633f5a56d9e672cd24133e2e1376347abac6c
branch: master
author: Raymond Hettinger <rhettinger at users.noreply.github.com>
committer: Miss Islington (bot) <31488909+miss-islington at users.noreply.github.com>
date: 2018-12-04T00:13:38-08:00
summary:

Add comments regarding speed/space/entropy trade-offs (GH-10885)

files:
M Lib/random.py

diff --git a/Lib/random.py b/Lib/random.py
index 4b51b6696bfc..a7a86070c0a9 100644
--- a/Lib/random.py
+++ b/Lib/random.py
@@ -333,6 +333,19 @@ def sample(self, population, k):
         # preferred since the list takes less space than the
         # set and it doesn't suffer from frequent reselections.
 
+        # The number of calls to _randbelow() is kept at or near k, the
+        # theoretical minimum.  This is important because running time
+        # is dominated by _randbelow() and because it extracts the
+        # least entropy from the underlying random number generators.
+
+        # Memory requirements are kept to the smaller of a k-length
+        # set or an n-length list.
+
+        # There are other sampling algorithms that do not require
+        # auxiliary memory, but they were rejected because they made
+        # too many calls to _randbelow(), making them slower and
+        # causing them to eat more entropy than necessary.
+
         if isinstance(population, _Set):
             population = tuple(population)
         if not isinstance(population, _Sequence):



More information about the Python-checkins mailing list