[Python-Dev] Re: re.split on empty patterns

Mike Coleman mkc at mathdogs.com
Mon Aug 23 00:53:34 CEST 2004


"Brett C." <bac at OCF.Berkeley.EDU> writes:
> Mike Coleman wrote:
> 
> [SNIP]
> >     # alternative 2:
> >     re.structmatch(r'xxx|(?=abc)', 'zzxxxabczz') --> ['zz', 'bbczz']
>                                                                 ^
> >     re.structmatch(r'xxx|(?=abc)', 'zzxxxbbczz') --> ['zz', 'bbczz']
> >     # alternative 3:
> >     re.structmatch(r'xxx|(?=abc)', 'zzxxxabczz') --> ['zz', '', 'bbczz']
>                                                                     ^
> >     re.structmatch(r'xxx|(?=abc)', 'zzxxxbbczz') --> ['zz', 'bbczz']
> >
> 
> I take it the first 'b' in the first example for each alternative
> was supposed to be 'a'?

Yes, that's correct.  Oops.  The result in the first example under each
alternative should end in 'abczz', not 'bbczz'.
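
To make the difference concrete, here is a rough pure-Python sketch of the
two alternatives.  The names split_allowing_empty and keep_adjacent_empty are
made up for this illustration only; they are not the proposed API, and the
sketch makes no attempt to settle corner cases such as empty matches at the
very start or end of the string.

```python
import re

def split_allowing_empty(pattern, string, keep_adjacent_empty):
    """Split `string` on every match of `pattern`, empty matches included.

    Hypothetical helper, for illustration only.

    keep_adjacent_empty=False sketches "alternative 2": an empty match that
    starts exactly where the previous match ended is ignored.
    keep_adjacent_empty=True sketches "alternative 3": such a match still
    splits, which yields an extra '' in the result.
    """
    pieces = []
    piece_start = 0   # where the piece currently being collected begins
    prev_end = None   # end offset of the last match we split on, if any
    for m in re.finditer(pattern, string):
        is_empty = m.start() == m.end()
        if is_empty and m.start() == prev_end and not keep_adjacent_empty:
            continue  # alternative 2: drop the adjacent empty match
        pieces.append(string[piece_start:m.start()])
        piece_start = prev_end = m.end()
    pieces.append(string[piece_start:])
    return pieces

pat = r'xxx|(?=abc)'
alt2 = split_allowing_empty(pat, 'zzxxxabczz', keep_adjacent_empty=False)
alt3 = split_allowing_empty(pat, 'zzxxxabczz', keep_adjacent_empty=True)
print(alt2)   # ['zz', 'abczz']
print(alt3)   # ['zz', '', 'abczz']
```

For 'zzxxxbbczz' the lookahead never matches, so both settings give
['zz', 'bbczz'].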

> And as for which version, I actually like Mike's version more than the one AMK
> and Tim like.  The reason is that the '' in the middle of the example in
> question in the OP tells you where the split would have occurred had split0 (I
> like that or 'split_empty') not been turned on. That way there is no real loss
> of data between the two, but a gain with the new feature being used.


Is there something we can do to move this forward?  It seems like a couple of
people like one option and a couple the other, but I think at least we all
agree that the general feature would be a good idea.  So, should we take a
vote?  Or just go with the more conservative option, in order to get something
in the tree for 2.4?

Mike



