performance of script to write very long lines of random chars

Chris Angelico rosuav at gmail.com
Thu Apr 11 01:53:30 EDT 2013


On Thu, Apr 11, 2013 at 3:33 PM, Steven D'Aprano
<steve+comp.lang.python at pearwood.info> wrote:
> I was originally going to write that using the base64 module would
> introduce bias into the random strings, but after a little investigation,
> I don't think it does.

Assuming that os.urandom() returns bytes with a perfectly fair
distribution (exactly equal chance of any value 00-FF; it probably
does, or close to it), and assuming that you work with exact multiples
of 3 input bytes and 4 output characters, base64 will give you a
perfectly fair distribution of result characters. You take three bytes
(24 bits) and turn them into four characters (6 bits per character,
= 24 bits). You might see some bias if you use fewer than a full set
of four output characters, though; I haven't dug into the details to
check that.
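
A rough way to check the partial-group case is to encode every
possible short input and count which characters turn up at each
position. This is only a sketch: char_counts is a made-up helper name,
and it assumes the standard alphabet used by base64.b64encode. With 1
or 2 input bytes the last data character carries zero-padded bits, so
it can only take a few of the 64 values.

```python
import base64
from collections import Counter


def char_counts(n_bytes, position):
    """Count the base64 characters seen at `position` over every
    possible input of `n_bytes` bytes (hypothetical helper)."""
    counts = Counter()
    for value in range(256 ** n_bytes):
        encoded = base64.b64encode(value.to_bytes(n_bytes, "big"))
        # Slice rather than index so the key is a 1-byte bytes object.
        counts[encoded[position:position + 1]] += 1
    return counts


# Leading character of a 2-byte input: a full 6 bits of random data.
full = char_counts(2, 0)
# Third character of a 2-byte input: 4 data bits plus 2 zero bits.
after_two = char_counts(2, 2)
# Second character of a 1-byte input: 2 data bits plus 4 zero bits.
after_one = char_counts(1, 1)

print(len(full), len(after_two), len(after_one))
```

The three counts come out as 64, 16 and 4 distinct characters, so the
bias only shows up when the input is not a multiple of 3 bytes (which
is also when "=" padding appears).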

ChrisA
