[issue37682] random.sample should support iterators

Thomas Dybdahl Ahle report at bugs.python.org
Thu Jul 25 13:03:28 EDT 2019


New submission from Thomas Dybdahl Ahle <lobais at gmail.com>:

Given a generator `f()`, we can use `random.sample(list(f()), 10)` to get a uniform sample of the values generated.
This is fine, and fast, as long as `list(f())` easily fits in memory.
However, if it does not, one has to implement the reservoir sampling algorithm as a pure-Python function, which is much slower and not trivial to get right.
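
For illustration, the pure-Python fallback mentioned above might look roughly like the following. This is a minimal sketch of the classic reservoir sampling scheme (often called Algorithm R), not code from the report or from the `random` module; the name `reservoir_sample` and its `rng` parameter are hypothetical.

```python
import random
from itertools import islice


def reservoir_sample(iterable, k, rng=random):
    """Return k items chosen uniformly at random from an iterable.

    Hypothetical helper: a sketch of reservoir sampling (Algorithm R).
    Only k items are held in memory at any time.
    """
    it = iter(iterable)
    # Fill the reservoir with the first k items.
    reservoir = list(islice(it, k))
    if len(reservoir) < k:
        raise ValueError("Sample larger than population")
    # The item at 0-based position i replaces a random slot
    # with probability k / (i + 1).
    for i, item in enumerate(it, start=k):
        j = rng.randrange(i + 1)
        if j < k:
            reservoir[j] = item
    return reservoir


# Usage: sample 10 values from a generator without building a list first.
sample = reservoir_sample((x * x for x in range(1_000_000)), 10)
```

Unlike `random.sample`, this sketch does not return the items in random order: an item that is never replaced stays in the slot it first occupied. Shuffle the result if the order matters.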

It seems that a fast reservoir sampling implementation in `random.sample`, used when the population is an iterator, would both be useful and make the API more predictable.

Currently, passing an iterator to `random.sample` raises `TypeError: Population must be a sequence or set.`.
This is inconsistent with much of the standard library, which accepts lists and iterators transparently.
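
The behaviour described here can be reproduced with a short script. This is a sketch only: `f` is a hypothetical stand-in for the generator in the report, and the exact wording of the error message varies between Python versions.

```python
import random


def f():
    # Hypothetical generator standing in for the `f()` of the report.
    yield from range(100)


try:
    random.sample(f(), 10)  # a generator is neither a sequence nor a set
except TypeError as exc:
    error = exc
    print(f"TypeError: {exc}")
else:
    error = None

# The workaround from the report: materialise the generator first.
workaround = random.sample(list(f()), 10)
```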

I apologize if this enhancement has already been discussed.
I wasn't able to find it.
If wanted, I can write up a pull request.
I believe questions like this one: https://stackoverflow.com/questions/12581437/python-random-sample-with-a-generator-iterable-iterator make it clear that such functionality is wanted and non-obvious.

----------
components: Library (Lib)
messages: 348445
nosy: thomasahle
priority: normal
severity: normal
status: open
title: random.sample should support iterators
type: enhancement
versions: Python 2.7, Python 3.5, Python 3.6, Python 3.7, Python 3.8, Python 3.9

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue37682>
_______________________________________
