Split a string based on change of character

Andrew Savige ajsavige at yahoo.com.au
Sun Jul 29 02:36:49 EDT 2007


--- "attn.steven.kuo at gmail.com" <attn.steven.kuo at gmail.com> wrote:
> Using itertools:
> 
> import itertools
> 
> s = 'ABBBCC'
> print [''.join(grp) for key, grp in itertools.groupby(s)]

Nice.
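
For anyone on Python 3, where print is a function, the same idea can be sketched as below. The function name split_runs is only an illustrative choice, not something from the quoted post. groupby with no key function groups consecutive equal items, so each group is one run of a repeated character.

```python
from itertools import groupby


def split_runs(text):
    """Return the maximal runs of identical consecutive characters."""
    # With no key function, groupby starts a new group each time the
    # character changes, so joining each group gives one run.
    return [''.join(group) for _key, group in groupby(text)]


print(split_runs('ABBBCC'))  # ['A', 'BBB', 'CC']
```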

> Using re:
> 
> import re
> 
> pat = re.compile(r'((\w)\2*)')
> print [t[0] for t in re.findall(pat, s)]

Also nice. Especially nice that it only returns the outer parens. :-)
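
To see what the t[0] is doing: with two groups in the pattern, findall returns a tuple per match holding both groups, and the list comprehension keeps just the outer one. A small Python 3 sketch of that (the variable names pairs and runs are only illustrative):

```python
import re

pat = re.compile(r'((\w)\2*)')
s = 'ABBBCC'

# Group 1 is the whole run, group 2 is the single repeated character.
pairs = pat.findall(s)

# Keep only the outer group, i.e. the full run.
runs = [whole for whole, _char in pairs]

print(pairs)  # [('A', 'A'), ('BBB', 'B'), ('CC', 'C')]
print(runs)   # ['A', 'BBB', 'CC']
```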

> By the way, your pattern seems to work in perl:
> 
> $ perl -le '$, = " "; print split(/(?<=(.))(?!\1)/, "ABBBCC");'
> A A BBB B CC C
> 
> Was that the type of regular expressions you were expecting?

Yes. Here's a simpler example without any backreferences:

s = re.split(r'(?<=\d)(?=\D)', '1B2D3')

That works in Perl but not in Python.
Is it that "chaining" assertions together like this is not supported in
Python's re module? Or is that the case only in the split function?
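
As far as I can tell, the assertions themselves match fine. The difference seems to be that re.split in Python releases of this era does not split on empty (zero-width) matches, while Python 3.7 and later do. A rough workaround sketch that avoids re.split altogether: collect the zero-width match positions with re.finditer and slice the string by hand. The helper name split_on_boundaries is made up for this example.

```python
import re


def split_on_boundaries(pattern, text):
    """Cut text at every position where pattern matches with zero width."""
    # Only empty matches are treated as cut points; matches that consume
    # characters are ignored in this sketch.
    cuts = [m.start() for m in re.finditer(pattern, text)
            if m.start() == m.end()]
    bounds = [0] + cuts + [len(text)]
    return [text[a:b] for a, b in zip(bounds, bounds[1:])]


print(split_on_boundaries(r'(?<=\d)(?=\D)', '1B2D3'))  # ['1', 'B2', 'D3']
```

This gives the same pieces that the Perl split produces for the digit/non-digit example.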

Thanks,
/-\






More information about the Python-list mailing list