count consecutive elements

Tue Jan 19 05:50:31 EST 2021

On 19/01/2021 10:42, Bischoop wrote:
> On 2021-01-19, Peter Otten <__peter__ at web.de> wrote:
>> On 19/01/2021 04:45, Bischoop wrote:
>>
>>> I sat to it again and solved it.
>>
>> Congratulations!
>>
>>> lil = tuple(set(s)) # list of characters in s
>>>
>>> li=[0,0,0,0,0,0]  # list for counted repeats
>>
>> I see a minor problem here. What happens if s contains more than len(li)
>> characters?
>>
> 
> Thanks, I've created it when came to idea how to solve the problem and
> then forgot that part, will have to update it with two additional line
> of code.
> 
>>> import timeit
>>
>> Since you seem interested in performance: imagine that the input is a
>> very long string. Let's call the length N.
>>
>> How often do you iterate over its characters?
>>
> 
> Does it take time :-) I actually left it because seen guy in thread
> were comparing their time results, I'm pretty much aware that mine
> solution is time consuming and there are better ways to do it but
> I wanted to finish what I started without any additional libraries
> like itertools etc in the way that my knowledge allows. The idea with
> for this tuple just came up in a first seconds when I look at that
> and was not bothering about time or performance especially that later
> on as you seen I had different problem to solve and which took me quite
> a time and when I look at it today I think how couldn't I came with it
> earlier and let myself stuck on it.
> 
> I'm happy that despite I've asked here I've finish it practically wthout
> additional help and yeah ( Thanks to Stefan who just pointed me to look
> at it from different prespective instead pointing what was wrong),
> I'll iterate here n = x [x for x in lil], with big string it can make
> difference. Now, when you asked me that question I came indeed for better
> idea to do this. One loop during which I can append character if not in lil.
> 
>> How often does Tim's solution?
> 
> oh here Im stuck and dont understand what you mean?

[Tim Chase]

>   def consecutive_counter(seq):
>       # I'm not sure if there's something
>       # like this already in itertools
>       cur = nada = object()
>       count = 0
>       for x in seq:
>           if x == cur:
>               count += 1
>           else:
>               if cur is not nada:
>                   yield cur, count
>               cur = x
>               count = 1
>       if cur is not nada:
>           yield cur, count
> 
>   def longest(seq):
>       results = []
>       biggest = 0
>       for item, count in seq:
>           if count > biggest:
>               results = [item]
>               biggest = count
>           elif count == biggest:
>               results.append(item)
>       return results, biggest

Tim's code iterates over the string once whereas you iterate multiple 
times. While this is mainly a question of elegance performance is 
affected when you have *nested* loops:

> for i in lil:
>     c = 0
>     h= lil.index(i)
>     for letter in s:

If you are unlucky and run into a string with 1000 different characters 
the code in your inner loop will execute 1000*1000 times while Tim's 
will run 1000 times. This is called O(N*N) and O(N).

Not all loops are obvious,

> lil.index(letter)

contains another loop, and you now have O(N*N*N) behavior.

This is the worst case, usually the alphabet (li1) will be smaller, but 
avoiding nested loops when possible is still a good habit to get into.