split on blank lines

Duncan Booth duncan at NOSPAMrcp.co.uk
Tue Dec 2 04:12:42 EST 2003


Hans Nowak <hans at zephyrfalcon.org> wrote in
news:mailman.12.1070300390.16879.python-list at python.org: 

> Duncan Booth wrote:
> 
>> Given that re.compile("^$",
>> re.MULTILINE).findall("foo\n\nbar\n\nbaz") returns ['', ''] I would
>> agree this looks like a bug. You could submit a bug report on
>> Sourceforge. 
> 
> I may be wrong, but I would think that the behavior is correct. "^$"
> matches an empty line.  This is exactly what findall returns... two
> empty lines. 
> 
Perhaps you trimmed too much of the original context, but you have 
misunderstood the original poster's intent.

The original post said:

> can somebody tell me why (using Python 2.3.2)
> 
> >>> import re
> >>> re.compile(r"^$", re.MULTILINE).split("foo\n\nbar\n\nbaz")
> ['foo\n\nbar\n\nbaz']

Notice that the string they are splitting contains two empty lines. I 
pointed out that re.findall correctly finds both empty lines, so you would 
expect split to break the string at those same two points. Instead it 
returns the string unsplit.

For the avoidance of doubt: there is an inconsistency of behaviour between 
re.findall and re.split. It looks to me like a bug in re.split.
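
For anyone who wants to reproduce this, here is a minimal sketch. The names
text, blank, positions and paragraphs are mine, not the original poster's.
It shows where the zero-width matches of "^$" fall, and then splits on a
pattern that consumes the newlines around the blank line, so it does not
depend on how split treats zero-width matches. I have left the zero-width
split call itself out of the sketch because its result is not the same in
every Python release; the 2.3.2 result is the one quoted above.

```python
import re

text = "foo\n\nbar\n\nbaz"
blank = re.compile(r"^$", re.MULTILINE)

# findall reports one zero-width match per empty line.
print(blank.findall(text))        # ['', '']

# finditer shows where those zero-width matches sit in the string.
positions = [m.start() for m in blank.finditer(text)]
print(positions)                  # [4, 9]

# Assumption: a "blank line" may also hold spaces or tabs. This pattern
# consumes the surrounding newlines, so every match has non-zero width.
paragraphs = re.split(r"\n[ \t]*\n", text)
print(paragraphs)                 # ['foo', 'bar', 'baz']
```

This may or may not be what the original poster was after, since it drops
the newlines around each blank line rather than keeping them in the pieces.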

-- 
Duncan Booth                                             duncan at rcp.co.uk
int month(char *p){return(124864/((p[0]+p[1]-p[2]&0x1f)+1)%12)["\5\x8\3"
"\6\7\xb\1\x9\xa\2\0\4"];} // Who said my code was obscure?



