[Tutor] Ways of removing consecutive duplicates from a list

avi.e.gross at gmail.com avi.e.gross at gmail.com
Mon Jul 18 00:22:32 EDT 2022


Dennis,

Unpacking is an interesting approach. Your list example seems to return a shorter list which remains iterable. But what does it mean to unpack other iterables like a function that yields? Does the unpacking call it as often as needed to satisfy the first variables you want filled and then pass a usable version of the iterable to the last argument?

Since the question asked was which approach is in some way better, this matters: unpacking can carry a hidden cost, or it can be done very efficiently.
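
A small experiment can settle the generator question. This is only a sketch: counting_gen and the pulled list are names made up here for illustration. As far as I can tell, a starred target always runs the iterable to exhaustion and collects the remainder into a new list, so it is not lazy.

```python
pulled = []  # hypothetical log of every value the generator hands out

def counting_gen(n):
    """Yield 0..n-1, recording each value as it is produced."""
    for i in range(n):
        pulled.append(i)
        yield i

first, *rest = counting_gen(5)

print(first)       # 0
print(rest)        # [1, 2, 3, 4]
print(type(rest))  # <class 'list'>
print(pulled)      # [0, 1, 2, 3, 4]  (the generator was fully consumed)
```

So the hidden cost is a full copy of everything after the first item. By contrast, it = iter(iterable) followed by first = next(it) takes one value and leaves the rest lazy.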



-----Original Message-----
From: Tutor <tutor-bounces+avi.e.gross=gmail.com at python.org> On Behalf Of dn
Sent: Sunday, July 17, 2022 11:34 PM
To: tutor at python.org
Subject: Re: [Tutor] Ways of removing consecutive duplicates from a list

On 17/07/2022 20.26, Peter Otten wrote:
> On 17/07/2022 00:01, Alex Kleider wrote:
> 
>> PS My (at least for me easier to comprehend) solution:
>>
>> def rm_duplicates(iterable):
>>      last = ''
>>      for item in iterable:
>>          if item != last:
>>              yield item
>>              last = item
> 
> The problem with this is the choice of the initial value for 'last':

Remember "unpacking", eg

>> def rm_duplicates(iterable):
>>      current, *the_rest = iterable
>>      for item in the_rest:

Then there is the special case of an empty iterable, which (assuming it is possible) can be caught as an exception. That exception will likely need to 'ripple up' through the function calls, because the final collection of 'duplicates' will be empty and cannot be processed. (See the later comment about "unconditionally".)
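
Filling in the rest of that function might look like the sketch below. Only the first two body lines come from the snippet above; the remainder, and the helper dedupe_or_empty, are assumptions added for illustration.

```python
def rm_duplicates(iterable):
    """Yield items from iterable, skipping consecutive repeats."""
    # Raises ValueError on an empty iterable. Note that the starred
    # target copies all remaining items into a new list.
    current, *the_rest = iterable
    yield current  # the first item is always kept
    for item in the_rest:
        if item != current:
            yield item
            current = item

def dedupe_or_empty(iterable):
    """Hypothetical caller that treats 'nothing to unpack' as 'no output'."""
    try:
        return list(rm_duplicates(iterable))
    except ValueError:
        return []

print(dedupe_or_empty(["", "", 42, "a", "a", ""]))  # ['', 42, 'a', '']
print(dedupe_or_empty([]))                          # []
```

Because rm_duplicates is a generator, the ValueError only surfaces when the result is first iterated, which is why the try wraps the list() call rather than the function call alone.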


Playing in the REPL:

>>> iterable = [1,2,3]
>>> first, *rest = iterable
>>> first, rest
(1, [2, 3])
# rest is a list

>>> iterable = [1,2]
>>> first, *rest = iterable
>>> first, rest
(1, [2])
# rest is (technically) a list

>>> iterable = [1]
>>> first, *rest = iterable
>>> first, rest
(1, [])
# rest is an empty list

>>> iterable = []
>>> first, *rest = iterable
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: not enough values to unpack (expected at least 1, got 0)
# nothing to see here: no duplicates - and no 'originals' either!


> >>> list(rm_duplicates(["", "", 42, "a", "a", ""]))
> [42, 'a', '']   # oops, we lost the initial empty string
> 
> Manprit avoided that in his similar solution by using a special value 
> that will compare false except in pathological cases:
> 
>> val = object()
>> [(val := ele) for ele in lst if ele != val]
> 
> Another fix is to yield the first item unconditionally:
> 
> def rm_duplicates(iterable):
>     it = iter(iterable)
>     try:
>         last = next(it)
>     except StopIteration:
>         return
>     yield last
>     for item in it:
>         if item != last:
>             yield item
>             last = item
> 
> If you think that this doesn't look very elegant you may join me in 
> the https://peps.python.org/pep-0479/ haters' club ;)

This does indeed qualify as 'ugly'. However, it doesn't need to be expressed in such an ugly fashion!
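
One possible less ugly spelling, offered only as a sketch and not necessarily the one meant here, leans on itertools.groupby. It groups consecutive equal items, so taking each group's key gives one item per run. It needs no initial value for 'last' and copes with an empty iterable.

```python
from itertools import groupby

def rm_duplicates(iterable):
    """Yield one item for each run of consecutive equal items."""
    for key, _group in groupby(iterable):
        yield key

print(list(rm_duplicates(["", "", 42, "a", "a", ""])))  # ['', 42, 'a', '']
print(list(rm_duplicates([])))                          # []
```

This stays lazy, so it also works on generators without copying them into a list first.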

--
Regards,
=dn