[Tutor] Removing duplicates

Peter Otten __peter__ at web.de
Thu Aug 2 08:49:12 EDT 2018


Oscar Benjamin wrote:

> On 1 August 2018 at 21:38, Roger Lea Scherer <rls4jc at gmail.com> wrote:
>>
>> I'm trying to get a list of tuples to be a float, a numerator, and a
>> denominator for all the fractions: halves, thirds, fourths etc up to
>> ninths. 1/2 returns the same float as 2/4, 3/6, 4/8. I would like to keep
>> only the 1/2. When I try (line 18) to "pop" from the list I get a
>> "TypeError: integer argument expected, got float". When I try (line 18)
>> to "remove" from the list, nothing happens: nothing is removed and I do
>> not receive an error message.
>>
>> What do you think is a good way to solve this?
>>
>> Thank you as always.
>>
>> import math
>>
>> fractions = [(0, 0, 0)]
>>
>> for i in range(1, 10):
>>     for j in range(1, 10):
>>         if i < j:
>>             x = i/j
>>             if x not in fractions:
>>                 fractions.append((x, i, j))
>>     sortedFrac =  sorted(fractions)
>>
>> print(sortedFrac)
>>
>> for i in range(len(sortedFrac)):
>>     try:
>>         if sortedFrac[i][0] == sortedFrac[i-1][0]:  # so if the float equals the previous float
>>             sortedFrac.pop(sortedFrac[i][0])  # remove the second float
>>         else:
>>             sortedFrac.append(sortedFrac[i][0])
>>     except ValueError:
>>         continue
> 
> Comparing floats for equality can be flakey. Sometimes two floats that
> should be equal will not compare equal e.g.:
> 
> >>> 0.01 + 0.1 - 0.1 == 0.01
> False

Do you know if there's a way to construct an example where

i/k != (n*i)/(n*k)

preferably with small integers i, k, and n? Python's integer division
algorithm defeats my naive attempts ;)

I had to resort to i/k == i/(k + 1)

>>> k = 10**18
>>> 1/k == 1/(k+1)
True
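
A brute-force search is a quick way to look for such an example. The
sketch below is only illustrative: the limit of 50 is arbitrary, and the
name "counterexamples" is made up for this snippet. As far as I know,
CPython rounds int/int true division correctly, so both sides should be
the nearest float to the same rational number and the list should come
out empty.

```python
# Search small positive integers for a case where i/k != (n*i)/(n*k).
# The upper bound of 50 is an arbitrary choice for illustration.
counterexamples = [
    (i, k, n)
    for i in range(1, 50)
    for k in range(1, 50)
    for n in range(1, 50)
    if i / k != (n * i) / (n * k)
]
print(counterexamples)
```

An empty result here does not prove anything about larger integers, but
it is consistent with the division being correctly rounded.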

> This happens in this case because of intermediate rounding errors. I
> don't think that should affect you since you are doing precisely one
> floating point operation i/j to calculate each of your floats but
> unless you have a very solid understanding of binary floating point I
> would recommend that you shouldn't compare floating point values as in
> a==b.
> 
> For this particular problem I would use integer arithmetic or I would
> use the fractions module. Doing this with integers you should
> normalise the numerator and denominator by dividing out their GCD. The
> fractions module takes care of this for you internally.
> https://docs.python.org/3.7/library/fractions.html
> 
> Otherwise in general to remove duplicates you would be better off with
> a set rather than a list. If you only put x in the set and not i and j
> then the set will automatically take care of duplicates:
> https://docs.python.org/3.7/tutorial/datastructures.html#sets

And if you want to keep i and j you can use a dict:

fractions = {}
for i in range(1, 10):
    for j in range(1, 10):
        if i < j:
            x = i/j
            if x not in fractions:
                # first hit wins, so 1/2 is kept and 2/4, 3/6, 4/8 are skipped
                fractions[x] = x, i, j
sorted_unique_fractions = sorted(fractions.values())
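
The integer and fractions-module approaches Oscar mentions could look
roughly like the following. This is a sketch, not a definitive solution;
the names "reduced", "sorted_reduced", "unique" and "sorted_fractions"
are invented here. The first part keeps only pairs that are already in
lowest terms (gcd equal to 1), the second lets Fraction normalise each
pair and uses a set to drop the duplicates.

```python
from fractions import Fraction
from math import gcd

# Integer approach: keep (i, j) only if the fraction is in lowest terms.
reduced = [
    (i / j, i, j)
    for i in range(1, 10)
    for j in range(i + 1, 10)  # j > i, i.e. proper fractions only
    if gcd(i, j) == 1
]
sorted_reduced = sorted(reduced)

# fractions module: Fraction(2, 4) is stored as Fraction(1, 2), so a set
# of Fraction objects removes duplicates without comparing floats.
unique = {Fraction(i, j) for i in range(1, 10) for j in range(i + 1, 10)}
sorted_fractions = sorted(unique)

# Both approaches should keep the same numerator/denominator pairs.
assert [(f.numerator, f.denominator) for f in sorted_fractions] == [
    (i, j) for _, i, j in sorted_reduced
]
```

Neither variant compares floats for equality; the float in each tuple is
only used for sorting and display.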


