[Python-Dev] sre.split question
Chris King
colanderman at gmail.com
Tue Jul 20 19:13:49 CEST 2004
I'm curious as to this bit of code in pattern_split() in Modules/_sre.c:
    if (state.start == state.ptr) {
        if (last == state.end)
            break;
        /* skip one character */
        state.start = (void*) ((char*) state.ptr + state.charsize);
        continue;
    }
This precludes use of patterns that can successfully match zero-length
strings (e.g. r'(?<=[A-Za-z])(?=[^A-Za-z])'). Skipping one character
is of course the correct behaviour, but what purpose do the break and
continue serve? The only one I can think of is to stop silly patterns
like r'\s*' from returning a list of characters, but there may be
other reasons I haven't thought of.
(Yes, I know I can get the effect I want by using finditer() ;))
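For reference, a minimal sketch of the finditer() workaround, written in
Python. The helper name split_on_empty_matches and the sample string are
made up for illustration; only re.finditer() and the match object's
start()/end() methods come from the standard library. It cuts the string
at every match, including zero-length ones, which is the effect I was
hoping to get from split():

```python
import re

def split_on_empty_matches(pattern, text):
    """Split text at every match of pattern, zero-length matches included.

    Hypothetical helper for illustration; not part of the re module.
    """
    pieces = []
    last = 0  # end of the previous match
    for m in re.finditer(pattern, text):
        # Keep the text between the previous match and this one.
        pieces.append(text[last:m.start()])
        last = m.end()
    # Keep whatever follows the final match.
    pieces.append(text[last:])
    return pieces

# Zero-length boundary between a letter and a following non-letter.
boundary = r'(?<=[A-Za-z])(?=[^A-Za-z])'
print(split_on_empty_matches(boundary, "abc123 def!"))
# ['abc', '123 def', '!']
```

Unlike split(), this sketch does not insert captured groups into the
result; it only handles the plain splitting case.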