Extracting subsequences composed of the same character

Tim Chase python.list at tim.thechases.com
Thu Mar 31 21:58:42 EDT 2011


On 03/31/2011 07:43 PM, candide wrote:
> Suppose you have a string, for instance
>
> "pyyythhooonnn --->  ++++"
>
> and you search for the subsequences composed of the same character; here
> you get:
>
> 'yyy', 'hh', 'ooo', 'nnn', '---', '++++'

 >>> import re
 >>> s = "pyyythhooonnn ---> ++++"
 >>> [m.group(0) for m in re.finditer(r"(.)\1+", s)]
['yyy', 'hh', 'ooo', 'nnn', '---', '++++']
 >>> [(m.group(0),m.group(1)) for m in re.finditer(r"(.)\1+", s)]
[('yyy', 'y'), ('hh', 'h'), ('ooo', 'o'), ('nnn', 'n'), ('---', '-'), ('++++', '+')]
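
For reference, the pattern reads as follows: (.) captures any single
character, and \1+ requires that same character to repeat one or more
times, so only runs of length 2 or more are reported. Whitespace counts
too, so a string with two consecutive spaces would also yield '  '. Below
is a minimal sketch that wraps the same idea in a function and shows an
itertools.groupby variant for comparison. The names runs_regex,
runs_groupby and min_len are made up for illustration, and re.DOTALL is
an assumption added here so that runs of newlines are found as well.

```python
import re
from itertools import groupby

def runs_regex(s):
    """Return runs (length >= 2) of one repeated character (hypothetical helper)."""
    # (.) captures one character; \1+ demands one or more repeats of it.
    # re.DOTALL (an assumption, not in the session above) lets "." match "\n".
    return [m.group(0) for m in re.finditer(r"(.)\1+", s, re.DOTALL)]

def runs_groupby(s, min_len=2):
    """Same runs via itertools.groupby; min_len is an illustrative parameter."""
    # groupby splits the string into maximal runs of equal adjacent characters.
    runs = ("".join(group) for _, group in groupby(s))
    return [r for r in runs if len(r) >= min_len]

s = "pyyythhooonnn ---> ++++"
print(runs_regex(s))
print(runs_groupby(s))
```

Both calls should print the same list as the session above. The groupby
version avoids regular expressions entirely and makes the minimum run
length easy to change.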

-tkc