[Edu-sig] Pythonic Math must include...

Gregor Lingl gregor.lingl at aon.at
Mon Jan 19 00:31:36 CET 2009


kirby urner schrieb:
> I think you're correct that the sieve best works with a pre-specified
> finite domain: 
>   
....

>> To continue work in this area one (or at least me) has to have
>> some criteria to judge the solutions.
>> Clearly it was advantageous if there was some consensus about
>> these criteria in the community.
>>     
>
> Fortunately, we have hundreds of years of math pedagogy, so in terms
> of avoiding quarrels, start with what's already on the books are "must
> have" and just render it Pythonically.
....

>> There should be some criteria concerning
>> (a) the choice of problems and themes,
>>    e.g. to prefer small problems that expose a single idea  -  or rather not
>> ...   etc.,
>> as well as some
>> (b) code related criteria, like clarity, conciseness, efficiency, beauty (!)
>> etc., ranked according to
>> their priorities.
>>     
>
> This will be up to each professional teacher in whatever walk of life
> -- to judge what to include and what to exclude.  Each teacher will
> find her or himself in agreement with some, disagreement with others,
> over what to include.  Twas ever thus.
>   

I think it's not that easy. I'd like to dive a bit into this topic, 
resuming the code examples
given in this thread and adding a few additional ideas. What concerns 
efficiency, I've
measured the time to compute the 9592/9593 primes in the range up to 
100000 and (in order to
get some idea how the algorithm scales) also those up to 500000 (on my 
machine).

Here is your, Kirby's, code:

def primes():
    sofar = [-1, 2,3] # a running start, -1 proposed by J.H. Conway
    yield sofar[0] # get these out of the way
    yield sofar[1] # the only even prime
    yield sofar[2] # and then 3
    candidate = 5 # we'll increment from here on
    while True: # go forever
        for factor in sofar[1:]: # skip -1 (or don't use it in the first 
place)
            if factor ** 2 > candidate:  # did we pass?
                yield candidate # woo hoo!
                sofar.append(candidate) # keep the gold
                break  # onward!
            if not candidate % factor: # oops, no remainder
                break  # this is a composite
        candidate += 2 # next odd number please

Time:    100000:  1.71 s                500000:  42.72 s
-----

Michel Paul's code:

def primes():
    sofar = [-1, 2,3] # a running start, -1 proposed by J.H. Conway
    yield sofar[0] # get these out of the way
    yield sofar[1] # the only even prime
    yield sofar[2] # and then 3
    candidate = 5 # we'll increment from here on
    while True: # go forever
        for factor in sofar[1:]: # skip -1 (or don't use it in the first 
place)
            if factor ** 2 > candidate:  # did we pass?
                yield candidate # woo hoo!
                sofar.append(candidate) # keep the gold
                break  # onward!
            if not candidate % factor: # oops, no remainder
                break  # this is a composite
        candidate += 2 # next odd number please

Time:    100000:  1.58 s                500000:  32.25 s
-----

I've modified this one slightly, with a surprising effect
(I conjecture mainly due to avoiding copying p repeatedly):

def primes():
    yield(-1)
    p = [2, 3]
    for x in p: yield x
    def is_prime(n):
        for factor in p:
            if factor**2 > n: return True
            if n%factor == 0: return False
    multiple = 6
    while True:
        for cand in multiple-1, multiple+1:
            if is_prime(cand):
                yield cand
                p.append(cand)
        multiple += 6

Time:    100000:  0.14 s                500000:  1.05 s
-----

Scott Daniels code:

def primes():
    for x in -1, 2, 3:
        yield x
    gen = primes().next
    for top in iter(gen, 3):
        pass # skip useless tests (we skip all multiples of 2 or 3)
    factors = []
    # get pump ready for a 5
    check = -1
    limit = 3 * 3 - 2 # How far will 3 work as top prime?
    factors = []
    while True:
        check += 6
        if check >= limit: # move if this pair needs another factor
            top = gen()
            limit = top * top - 2 # limit for both candidates
            factors.append(top)
        for element in check, check + 2:
            for factor in factors:
                if element % factor == 0:
                    break
            else:
                yield element

Time:    100000:  0.07 s                500000:  0.50 s
-----

Compare the above generators to sieve algorithms:

G.L. minimal sieve

def primes(n):
    s = set(range(3,n+1,2))
    for m in range(3, int(n**.5)+1, 2):
        s.difference_update(range(m*m, n+1, 2*m))
    return [2]*(2<=n) + sorted(s)

Time:    100000:  0.014 s                500000:  0.11 s
-----


G.L.: sieve

def primes(n):
    s = set(range(3,n+1,2))
    if n >= 2: s.add(2)
    m=3
    while m * m < n:
        s.difference_update(range(m*m, n+1, 2*m))
        m += 2
        while m not in s: m += 2
    return sorted(s)

Time:    100000:  0.012 s                500000:  0.086 s
-----

Apparently sieves are considerably faster at the cost
that the whole sieve has to be calculated before the primes
can be retrieved. Not a disadvantage if you only want to know
the n-th prime.

So this suggests to make use of the sieve-paradigm also for
prime generators. Here is a rather primitive example:

G.L. sieve based generator:

def primes():
    CHUNKSIZE = 5000 # arbitrary even
    primes = []
    yield 2

    # first chunk
    candidates = range(3,CHUNKSIZE,2)
    chunk = set(candidates)
    for m in candidates:
        if m*m > CHUNKSIZE:
            break
        chunk.difference_update(range(m*m, CHUNKSIZE, m))
    chunk = sorted(list(chunk))
    for p in chunk:
        yield p
    primes.extend(chunk)
    lower = CHUNKSIZE
   
    # more chunks
    while True:
        upper = lower + CHUNKSIZE
        chunk = set(range(lower+1,upper,2))
        for m in primes:
            if m*m > upper:
                break
            chunk.difference_update(range(lower + m - lower%m, upper, m))
        chunk = sorted(list(chunk))
        for p in chunk:
            yield p
        primes.extend(chunk)
        lower = upper

Time:    100000:  0.023 s                500000:  0.113 s
-----
      
Of course there are many more possibilities to realize this, for
instance to use slice-assignment instead of sets. This would
require some somewhat opaque index calculations, but be - as far
as I rember - even faster. Particularly I suppose that there
exist faster algorithms for generating primes, but I admittedly
don't know them.

======================================================

Summing up:

Kirby                          1.71 s                  42.72 s
Michel Paul                    1.58 s                  32.25 s
Michel Paul modified::         0.14 s                   1.05 s
Scott Daniels:                 0.07 s                   0.50 s
G.L. minimal sieve:            0.014 s                  0.109 s
G.L. sieve:                    0.012 s                  0.086 s
G.L. sieve-based generator:    0.023 s                  0.113 s   
(performance depends on CHUNKSIZE)

This exposes in my opinion an unsurmountable dilemma, namely
that usually you cannot meet even those few criteria mentioned
in the beginning in a single solution. 

So under the aspects you exposed at the beginning of this thread,
"Pythonic Math", which title in some sense includes and/or addresses
this dilemma, how would you prefer to weight those criteria?

Regards

Gregor














More information about the Edu-sig mailing list