a better prime number generator

Tue Oct 23 14:18:28 EDT 2001

On Tue, 23 Oct 2001 11:26:46 -0300, Alves, Carlos Alberto - Coelce
<calves at coelce.com.br> wrote:
>Why not try a variation from basic example in Python Tutorial by Guido van
>Rossum?!
>
>def getPrimesTill(x):
>	primes=[2]
>	for i in range(3,x,2):
>		for x in range(2,i):

        Whoa nellie!           ^^^^

>			if i%x==0:
>				break
>		else:
>			primes.append(i)
>	return primes

First of all, I'd replace both calls to range() with xrange(), to
avoid memory problems.

That done, it works, but it tests many more possible factors than it
needs to if the number actually is prime. Since we're talking about
speed here: On my machine, it takes 9 minutes to run getPrimesTill(2
** 16) even after changing range() to xrange(). The inner loop
actually spends *most* of its time looking for factors of numbers that
*are* prime (since non-primes usually break out of the loop pretty
quickly).  So if we can drastically reduce the number of candidate
factors, it'll help.

As others have pointed out in this thread: If i has no factors less
than or equal to sqrt(i), then there are no factors at all. Why?
Because if there is a factor, x, larger than sqrt(i), then there is a
corresponding factor (i / x) which must be less than sqrt(i).

So you could change that line to

		for x in xrange(2, int(math.sqrt(i)) +1):

Now I get it to compute up to 2 ** 16 in 4 seconds.  Now that's a
hefty optimization!

But wait a minute - we already know that even numbers aren't prime,
and we skip them in the outer loop. So why do we bother testing even
factors? If x was divisible by an even number, x would be even, and we
would have skipped it in the outer loop! We can skip half the tests:

		for x in xrange(3, int(math.sqrt(i)) +1, 2):

That further cuts the execution time by about 40%.  It also, in my
tests, beats Brian Zhou's recursive list comprehension versions by a
wide margin.

But why stop there? You can apply the same principle to other numbers
than 2. Suppose we test 5 and it's not a factor. Then we know we can
skip all multiples of 5 because they won't be factors either! What
this means is that we only need to test prime numbers as possible
factors, and conveniently, we've been building a list of them all
along. Well, surprise - the version posted by Emille van Sebille, from
demo/scripts in the python distribution, does exactly that. So let's
modify it to have the same interface as your function.

def getPrimesTill(max):
    # Based on primes.py from python cvs
    primes = [2]
    i = 3
    while i <= max:
        for p in primes:
            if i%p == 0 or p*p > i: break
        if i%p <> 0:
            primes.append(i)
        i += 2
    # 1 is prime, but we don't want it in the list during the loop
    primes.insert(0, 1) 
    return primes

This takes about 60% of the time of my fastest version.  But I'm not a
big fan of using while to iterate over a sequence, since I think it's
less readable than the equivalent xrange(), so here's my final answer:

def getPrimesTill(max):
    # Based on primes.py from python cvs
    primes = [2]
    for i in xrange(3, max, 2):
        for p in primes:
            if i % p == 0 or p * p > i:
                break
        if i % p != 0:
            primes.append(i)
    primes.insert(0, 1)
    return primes

Hope that helps.

-- Paul Winkler