[Tutor] Prime Factorization Tool

Thu Dec 1 15:11:23 CET 2011

On Thu, Dec 1, 2011 at 7:15 AM, Robert Sjoblom <robert.sjoblom at gmail.com>wrote:

> So I've recently started poking at the Project Euler site, because I
> feel that I need to practice writing code. For those of you interested
> in solving the problems on your own I advice you to not read this, as
> it will spoil the solution.
>
> Problem 3 is this:
> The prime factors of 13195 are 5, 7, 13 and 29.
>
> What is the largest prime factor of the number 600851475143 ?
>
> I came up with this pseudocode:
> #pseudocode:
> #    divide by x until non-prime is found:
> #        if not remainder:
> #            check if it's prime or not:
> #                if not prime: exit loop
> #            if not prime: store in variable
> #        increment x by 1
>
> Here are the functions I came up with:
>
> def isprime(n):
>    if n == 1:
>        return False
>    for x in range(2, n):
>        if n % x == 0:
>            return False
>    else:
>        return True
>
> def factorization(n):
>    factor = 0
>    x = 2
>    while True:
>        if n % x == 0:
>            if isprime(x):
>                factor = x
>            else:
>                return factor
>        x += 1
>
> This, however, feels horribly inefficient. Is there a better way to do
> it? I think most of my problems stem from my weak mathematical skills,
> to be honest. This wasn't too bad, but even for problems 1 and 2 I've
> relied on brute forcing the answer instead of looking for a
> mathematical solution.

Well, there are really only a couple of optimizations that you could make.
That's the nice (bad?) thing about primes - you really only *can* brute
force a solution. That's why nice things like encryption exist.

The less obvious optimization is in reference to primes - you don't
actually have to check all the way up to N. Or even N/2. You only have to
check numbers up to the square root of N. This explanation may not be
mathematically sound, but basically the reason this works is the definition
of prime: N is divisible only by N and 1. If you divide a number by 2 then
then the result will be the largest factor (e.g. 100/2 = 50). But as you
start increasing the divisor then obviously the result has to decrease. The
point at which these numbers are equivalent? The square root, of course.

That will easily decrease the running time of your program because now
instead of dividing N numbers you're dividing sqrt(N) numbers.

The other optimization that you can do is in the factorization loop -
you're checking all factors from 2 to N. The easiest fix here is the fact
that you *know* that no even number can be a prime factor for the simple
reason that it's divisible by 2. So start your loop at 3 and increment by 2
and you've just cut out half the numbers. But wait, there's more! What is
the largest possible factor of a number?

I'll let you think about it for a moment...

Got it?

Hey, no peeking!

Well, I guess you can peek.

N/2 is the largest possible factor. Any number larger than N/2 is somewhere
between N/2 and N/1, and since there's no whole number between 2 and 1 you
know that the largest number you need to check for a factor is N/2, or in
the case of 100/2 = 50.

So for 100, instead of checking if 100 numbers are prime, now you're only
checking:

3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 49

*much* fewer numbers. And for each N in that list you're only checking up
to the sqrt(N). Now you've vastly improved your running time.

Now, I'm not sure what the time complexity of these two operations is, but
go ahead and put this line in your isprime:

if n == 11: print("Checked 11... again")

Can you think of a way to compute the primes you need to check only once?

HTH,
Wayne
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/tutor/attachments/20111201/19c66bd2/attachment-0001.html>