Extracting subsequences composed of the same character

Tim Chase python.list at tim.thechases.com
Thu Mar 31 22:20:29 EDT 2011


On 03/31/2011 07:43 PM, candide wrote:
> "pyyythhooonnn --->  ++++"
>
> and you search for the subsequences composed of the same character, here
> you get:
>
> 'yyy', 'hh', 'ooo', 'nnn', '---', '++++'

Or, if you want to do it with itertools instead of the "re" module:

 >>> s = "pyyythhooonnn ---> ++++"
 >>> from itertools import groupby
 >>> [c*length
 ...  for c, length in ((k, len(list(g))) for k, g in groupby(s))
 ...  if length > 1]
['yyy', 'hh', 'ooo', 'nnn', '---', '++++']
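
For reuse, the same idea can be wrapped in a function and set beside a regex version. This is only a sketch: the names runs_groupby, runs_regex and min_len are made up here for illustration and are not from the original thread, and the regex is one plausible way to do it with "re", not necessarily the one posted earlier.

```python
import re
from itertools import groupby


def runs_groupby(s, min_len=2):
    """Return runs of a single repeated character at least min_len long.

    Hypothetical helper: groupby() yields one group per maximal run of
    equal adjacent characters, so joining a group rebuilds that run.
    """
    runs = ("".join(group) for _, group in groupby(s))
    return [run for run in runs if len(run) >= min_len]


def runs_regex(s):
    """Return runs of two or more identical characters using re.

    (.) captures one character and \\1+ requires one or more further
    copies of it; re.DOTALL lets '.' match newlines as well, so both
    helpers treat every character the same way.
    """
    return [m.group(0) for m in re.finditer(r"(.)\1+", s, flags=re.DOTALL)]


if __name__ == "__main__":
    s = "pyyythhooonnn ---> ++++"
    print(runs_groupby(s))
    print(runs_regex(s))
```

Both helpers treat whitespace like any other character, so two or more adjacent spaces in the input would show up as a run of their own.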


-tkc




