[Tutor] Ways of removing consecutive duplicates from a list

Peter Otten __peter__ at web.de
Mon Jul 18 03:15:14 EDT 2022


On 17/07/2022 18:59, avi.e.gross at gmail.com wrote:
> You could make the case, Peter, that you can use anything as a start that
> will not likely match in your domain. You are correct if an empty string may
> be in the data.
>
> Now an object returned by object is pretty esoteric and ought to be rare and
> indeed each new object seems to be individual.
>
> val=object()
>
> [(val := ele) for ele in [1,1,2,object(),3,3,3] if ele != val]
> -->> [1, 2, <object object at 0x00000176F33150D0>, 3]
>
> So the only way to trip this up is to use the same object or another
> reference to it where it is silently ignored.

When you want a general solution for removal of consecutive duplicates
you can put the line

val = object()

into the deduplication function, which makes it *very* unlikely that val
will also be passed as an argument to that function.
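
A minimal sketch of that idea, reusing the comprehension from the thread.
The function name dedupe_consecutive is hypothetical (it is not from the
thread), and the assignment expression requires Python 3.8 or later:

```python
def dedupe_consecutive(items):
    """Return a list with runs of equal neighbours collapsed to one item."""
    # A fresh sentinel is created on every call. No caller can hold a
    # reference to it, so it cannot appear in `items`.
    prev = object()
    return [(prev := item) for item in items if item != prev]


print(dedupe_consecutive([1, 1, 2, "", "", 3, 3, 3]))  # [1, 2, '', 3]
print(dedupe_consecutive([]))                          # []
```

Because the sentinel is local to the call, an empty string or None in the
data is treated like any other value.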

To quote myself:

> Manprit avoided that in his similar solution by using a special value
> that will compare false except in pathological cases:
>
>> val = object()
>> [(val := ele) for ele in lst if ele != val]

What did I mean by "pathological"?

One problematic case would be an object that compares equal to everything,

class A:
     def __eq__(self, other): return True
     def __ne__(self, other): return False

but that is likely to break the algorithm anyway.
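
To illustrate, here is a small sketch of how such an object behaves with
the sentinel approach. The names AlwaysEqual and dedupe are made up for
this example:

```python
class AlwaysEqual:
    """Pathological object: claims to be equal to everything."""
    def __eq__(self, other):
        return True

    def __ne__(self, other):
        return False


def dedupe(items):
    prev = object()  # out-of-band start value, as in the thread
    return [(prev := item) for item in items if item != prev]


# The object compares "not different" even from the sentinel, so it is
# silently dropped wherever it appears.
print(dedupe([AlwaysEqual(), 1, 1, 2]))  # [1, 2]
print(dedupe([1, AlwaysEqual(), 2]))     # [1, 2]
```

No choice of start value helps here, because the object also looks like
a duplicate of whatever item precedes it.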

Another problematic case: objects that only implement comparison with
other objects of the same type. For these, comparing against an
out-of-band value fails, but deduplication works if you avoid that value:

 >>> class A:
     def __init__(self, name):
         self.name = name
     def __eq__(self, other): return self.name == other.name
     def __ne__(self, other): return self.name != other.name
     def __repr__(self): return f"A(name={self.name})"


 >>> prev = object()
 >>>
 >>> [(prev:=item) for item in map(A, "abc") if item != prev]
Traceback (most recent call last):
   File "<pyshell#57>", line 1, in <module>
     [(prev:=item) for item in map(A, "abc") if item != prev]
   File "<pyshell#57>", line 1, in <listcomp>
     [(prev:=item) for item in map(A, "abc") if item != prev]
   File "<pyshell#54>", line 5, in __ne__
     def __ne__(self, other): return self.name != other.name
AttributeError: 'object' object has no attribute 'name'


Taking the first item of the data as the initial value avoids the
out-of-band value:

 >>> def rm_duplicates(iterable):
     it = iter(iterable)
     try:
         last = next(it)
     except StopIteration:
         return
     yield last
     for item in it:
         if item != last:
             yield item
             last = item

 >>> list(rm_duplicates(map(A, "aabccc")))
[A(name=a), A(name=b), A(name=c)]
 >>>
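
If you prefer to keep the comprehension, another option is to test the
start value by identity before comparing, so that the item's own __ne__
is never called with the sentinel. This is only a sketch; the function
name dedupe_with_identity_check is hypothetical, and the class A is
repeated from above so that the example is self-contained:

```python
class A:
    def __init__(self, name):
        self.name = name
    def __eq__(self, other): return self.name == other.name
    def __ne__(self, other): return self.name != other.name
    def __repr__(self): return f"A(name={self.name})"


def dedupe_with_identity_check(items):
    missing = object()  # per-call sentinel, only ever tested with `is`
    prev = missing
    # `prev is missing` is true only for the first item, so `item != prev`
    # is evaluated solely between two items from the data.
    return [(prev := item) for item in items
            if prev is missing or item != prev]


print(dedupe_with_identity_check(map(A, "aabccc")))
# [A(name=a), A(name=b), A(name=c)]
```

The identity test adds one cheap check per item, in exchange for not
having to special-case the first element with next() and StopIteration.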

