how to split a string (or sequence) into pairs of characters?

Eddie Corns eddie at holyrood.ed.ac.uk
Fri Aug 16 06:12:22 EDT 2002


Andrew Koenig <ark at research.att.com> writes:

>Berthold> Andrew Koenig <ark at research.att.com> writes:
>Jason> Can anyone come up with a better way of performing these
>Jason> operations? Extra kudos if it easily extends to any sublength
>Jason> and not just pairs.
>>> 
>>> >>> import re
>>> >>> re.findall('..', 'aabbccddee')
>>> ['ab', 'cd', 'ef']

>Berthold> Aehm,

>>>>> re.findall('..', 'aabbccddee')
>Berthold> ['aa', 'bb', 'cc', 'dd', 'ee']

I suspect the OP didn't want to use REs, but after going to all the effort to
think about it, I'll post my variation anyway.

>>> import re
>>> x = 'aabbccdd'
>>> y = [lh for lh,rh in re.findall(r'((.)\2*)', x)]
>>> print(y)
['aa', 'bb', 'cc', 'dd']

This basically finds all contiguous blocks of identical characters.  You could
force it to match just pairs with \2 instead of \2*; using \2+ gets only
sequences that are longer than 1.
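
To make the difference between the three variants concrete, here is a small
sketch run against a string that has a single character and a run of three in
it. The helper name runs and the sample string are only made up for
illustration; they are not from the thread.

```python
import re

def runs(pattern, text):
    # findall returns (whole_match, single_char) tuples because the pattern
    # has two groups; keep only the whole match (hypothetical helper).
    return [whole for whole, _ in re.findall(pattern, text)]

sample = 'aabcccdd'  # illustrative input, not from the original post

blocks = runs(r'((.)\2*)', sample)  # every run, single characters included
pairs = runs(r'((.)\2)', sample)    # exactly two identical characters at a time
longer = runs(r'((.)\2+)', sample)  # only runs of two or more

print(blocks)  # ['aa', 'b', 'ccc', 'dd']
print(pairs)   # ['aa', 'cc', 'dd']
print(longer)  # ['aa', 'ccc', 'dd']
```

Note that with \2 the third 'c' in 'ccc' is left over and dropped, so this only
splits cleanly into pairs when every run has even length.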

The (other) list comprehension answer was probably the simplest.
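
That list comprehension is not quoted in this message, so the following is
only a guess at its general shape: slice the sequence at fixed steps. The
function name chunks and its size parameter are assumptions for illustration.
It ignores the contents entirely, so it extends to any sublength and to lists
as well as strings.

```python
def chunks(seq, size=2):
    # Take consecutive slices of length size; the last slice may be shorter
    # when len(seq) is not a multiple of size. (Hypothetical helper.)
    return [seq[i:i + size] for i in range(0, len(seq), size)]

two = chunks('aabbccddee')       # default: pairs
three = chunks('aabbccddee', 3)  # any other sublength

print(two)    # ['aa', 'bb', 'cc', 'dd', 'ee']
print(three)  # ['aab', 'bcc', 'dde', 'e']
```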

Eddie


