Generating a large random string

Sean Ross sross at connectmail.carleton.ca
Thu Feb 19 17:28:07 EST 2004


"Andreas Lobinger" <andreas.lobinger at netsurf.de> wrote in message
news:4034D506.2E37519A at netsurf.de...
[snip]
> How to generate (memory and time)-efficient a string containing
> random characters? I have never worked with generators, so my solution
> at the moment is:
>
> import string
> import random
> random.seed(14)
> d = [random.choice(string.letters) for x in xrange(3000)]
> s = "".join(d)
> print s
>
> which is feasible for the 3000, but i need lengths in the range
> 10.000 to 1.000.000.
[snip]

There are several things to try, but here's one attempt that's
relatively fast, though its time and memory use still grow
linearly with n:


from string import letters
from random import choice, sample, seed

# Note: should probably use timeit.py but this will do ...
from time import clock as now

n = 1000000

# your approach
seed(14)
start = now()
s = ''.join([choice(letters) for i in xrange(n)])
took = now() - start
print "old way n: %d took: %2.2fs"%(n, took)

# different approach
seed(14)
# add 1 so population > sample size (n)
factor = n // len(letters) + 1
start = now()
s = ''.join(sample(letters*factor, n))
took = now() - start
print "new way n: %d took: %2.2fs"%(n, took)


# Output: tested on Windows 98, 500+ MHz, 128 MB
old way n: 1000000 took: 23.94s
new way n: 1000000 took: 8.90s
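
For anyone on Python 3, where string.letters, xrange, the print
statement and time.clock are gone, the comparison can be sketched
roughly as below. This is an assumed port, not the code that produced
the timings above: it uses string.ascii_letters as a stand-in for
letters, times with timeit as the comment suggests, and adds
random.choices (Python 3.6+) as a third variant. The function names
are made up for illustration.

```python
import random
import string
import timeit

LETTERS = string.ascii_letters  # assumed stand-in for Python 2's string.letters


def by_choice(n):
    """Original approach: one random.choice() call per character."""
    return ''.join([random.choice(LETTERS) for _ in range(n)])


def by_sample(n):
    """Approach from this post: sample n characters from a repeated alphabet."""
    # add 1 so the population is at least as large as the sample size (n)
    factor = n // len(LETTERS) + 1
    return ''.join(random.sample(LETTERS * factor, n))


def by_choices(n):
    """Python 3.6+ only: random.choices() draws n characters with replacement."""
    return ''.join(random.choices(LETTERS, k=n))


if __name__ == '__main__':
    n = 100000
    for func in (by_choice, by_sample, by_choices):
        random.seed(14)
        took = timeit.timeit(lambda: func(n), number=1)
        print("%-10s n: %d took: %2.2fs" % (func.__name__, n, took))
```

Timings will differ a lot by machine and Python version, so treat
any numbers from this as relative only.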

There's a start ...
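
One caveat worth checking before relying on the sample() version:
sample() draws without replacement, so no letter can appear more
than factor times in the result. Assuming 52 letters and n = 1000000,
factor is 19231 and the population has 1000012 characters, so the
result is close to a shuffle of the whole population and the letter
counts come out nearly equal, unlike independent choice() calls.
The sketch below shows the effect with a tiny hypothetical alphabet
(the names are mine, not from the code above):

```python
import random
from collections import Counter

ALPHABET = 'ab'  # tiny hypothetical alphabet to make the effect visible


def sample_string(alphabet, n):
    """Build a string of length n the same way as the sample() approach."""
    factor = n // len(alphabet) + 1
    return ''.join(random.sample(alphabet * factor, n))


def max_letter_count(s):
    """Return how often the most frequent character occurs in s."""
    return max(Counter(s).values())


n = 4
factor = n // len(ALPHABET) + 1  # 3 copies of each letter in the population

# With sample(), 'aaaa' or 'bbbb' can never come out: each letter is
# available only factor (here 3) times.
worst = max(max_letter_count(sample_string(ALPHABET, n)) for _ in range(1000))
print("largest count seen with sample():", worst, "limit:", factor)
```

If the string needs independent, uniformly chosen characters, the
choice() version (or random.choices on newer Pythons) is the safer bet.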

Sean




