sampling from frequency distribution / histogram without replacement

Gregory Ewing greg.ewing at canterbury.ac.nz
Mon Jan 14 17:59:38 EST 2019


duncan smith wrote:
> Hello,
>       Just checking to see if anyone has attacked this problem before
> for cases where the population size is unfeasibly large.

The fastest way I know of is to create a list of cumulative
frequencies, then generate uniformly distributed numbers and
use a binary search to find where they fall in the list.
That's O(log n) per sample in the size of the list once it's
been set up.

-- 
Greg



More information about the Python-list mailing list