Algorithm help per favore

Anton Vredegoor anton at vredegoor.doge.nl
Thu Jun 19 10:31:20 EDT 2003


psimmo60 at hotmail.com (Paul Simmonds) wrote:

>wrbt at email.com (Larry) wrote in message news:<2ec1bc1c.0306180746.159679d6 at posting.google.com>...
>> I need to take a list (probably 20k to 40k elements) of numbers and
>> remove consecutive duplicates. Non consecutive duplicates are ok.
>> 
>> Example: [6,3,3,3,5,7,6,3,4,4,3] => [6,3,5,7,6,3,4,3]
>> 
>> The 3 and 6 can appear more than once in the result set because
>> they're separated by another value. Obviously this is trivial to
>> accomplish by walking thru the list and building a new one (or yanking
>> elements out of the existing one) but I'm curious if anyone knows of a
>> more clever way, with speed being a goal more than memory usage.
>
>Another list comp method, taking a slightly different angle:
>
>[lst[i] for i in range(len(lst)) if(i!=len(lst)-1 and
>lst[i]!=lst[i+1])]
>
>Iterating over a list of integers rather than a modified list of
>objects seems to be faster. This would depend how the list was
>modified, but the len method is in C, and I'm guessing the range
>method is too.

Better to measure it! I've written a test script. By the way: your
method has an error, because it chops off the last element
unconditionally ... Having a lot of people compare code produces
alternatives, which in my opinion comes before passing tests.
Tests can be done later.

So below's my script (based partly on your idea and those of some
other posters).

Anton

"""
Test script to time various functions found on clp that
remove elements from L if equal to their neigbour

Output from a test run:

zipped  : 0.985 sec.
ranged  : 0.425 sec.
inplace : 0.206 sec.

"""

import random
import time

def zipped(L):
    return  list(L[:1]) + [x for x,y in zip(L[1:],L[:-1]) if x!=y] 

def ranged(L):
    return list(L[:1])+[L[i] for i in xrange(1,len(L))
        if L[i] != L[i-1]]

def inplace(thelist):
    i=0
    for item in thelist:
        if item==thelist[i]: continue
        i+=1
        thelist[i]=item
    del thelist[i+1:]
    return thelist
    
def test():
    n = 100000
    uniques = 10
    U=range(uniques)
 
    def choice():
        return random.choice(U)
    
    #make a list with random elements selected from U with replacement
    L = [choice() for i in range(n)]
    
    def timeit(f,arg):
        t = time.clock()
        res = f(L)
        print "%-8s: %.3f sec." %(f.__name__,time.clock()-t)
        return res
    
    #remove elements from L if equal to their neigbour
    #the inplace method is called last because it modifies L
    a,b,c = map(timeit,[zipped,ranged,inplace,],[L,L,L])

    assert a==b==c==L

if __name__=='__main__':
    test()





More information about the Python-list mailing list