Split a string based on change of character

attn.steven.kuo at gmail.com
Sun Jul 29 12:34:13 EDT 2007


On Jul 28, 11:36 pm, Andrew Savige <ajsav... at yahoo.com.au> wrote:

(snipped)

>
> Yes. Here's a simpler example without any backreferences:
>
> s = re.split(r'(?<=\d)(?=\D)', '1B2D3')
>
> That works in Perl but not in Python.
> Is it that "chaining" assertions together like this is not supported in Python
> re?
> Or is that the case only in the split function?
>


The match objects yielded by finditer report
the expected span positions:

>>> import re
>>> pat = re.compile(r'(?<=\d)(?=\D)')
>>> s = '1B2D3'
>>> for mobj in pat.finditer(s):
...     print(mobj.span())
...
(1, 1)
(3, 3)


From your original post:

>>> pat = re.compile(r'(?<=(.))(?!\1)')
>>> s = 'ABBBCC'
>>> for mobj in pat.finditer(s):
...     print(mobj.span())
...
(1, 1)
(4, 4)
(6, 6)


So the patterns themselves match where expected;
it seems that split simply does not split on a
zero-width (empty) match, which is all these
assertions produce.  I couldn't find this stated
in a quick look at the documentation, however.
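
If split won't cut at empty matches, one workaround is to take the
positions from finditer and slice the string by hand.  The sketch
below is only an illustration: the helper name split_at_zero_width
is made up for this post (it is not part of the re module), and it
assumes the pattern only ever produces zero-width matches.

```python
import re

def split_at_zero_width(pattern, text):
    """Cut text at each position where a zero-width pattern matches.

    Hypothetical helper, not part of the re module. It assumes every
    match is empty (start == end), as with pure lookaround patterns.
    """
    pieces = []
    last = 0
    for mobj in re.finditer(pattern, text):
        pos = mobj.start()
        # Ignore matches at the very start or end of the string,
        # and repeated positions, so no empty pieces are produced.
        if pos == last or pos == len(text):
            continue
        pieces.append(text[last:pos])
        last = pos
    # Whatever follows the final cut point is the last piece.
    pieces.append(text[last:])
    return pieces

print(split_at_zero_width(r'(?<=\d)(?=\D)', '1B2D3'))
print(split_at_zero_width(r'(?<=(.))(?!\1)', 'ABBBCC'))
```

With the two strings from this thread, that gives ['1', 'B2', 'D3']
and ['A', 'BBB', 'CC'].  Note that Python 3.7 and later changed
re.split so that it does split on empty matches, so a helper like
this mainly matters on older versions.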

--
Hope this helps,
Steven