split on blank lines
Duncan Booth
duncan at NOSPAMrcp.co.uk
Mon Dec 1 09:35:50 EST 2003
jburgy at hotmail.com (Jan Burgy) wrote in
news:807692de.0312010610.4461c0e3 at posting.google.com:
> can somebody tell me why (using Python 2.3.2)
>
> >>> import re
> >>> re.compile(r"^$", re.MULTILINE).split("foo\n\nbar\n\nbaz")
> ['foo\n\nbar\n\nbaz']
>
> ? Being used to Perl semantics, I expect
>
> ['foo\n', 'bar\n', 'baz']
>
> or something equivalent without the '\n' characters in the result
> strings. I have found that
>
> >>> re.compile(r"^\n", re.MULTILINE).split("foo\n\nbar\n\nbaz")
> ['foo\n', 'bar\n', 'baz']
>
> I prefer the first version however because my intent is stated more
> clearly. Could this be a bug in sre.py (I looked at the code for a
> good two minutes but then my head started hurting)
>
Given that re.compile("^$", re.MULTILINE).findall("foo\n\nbar\n\nbaz")
returns ['', ''], I would agree this looks like a bug: the pattern does
match at both blank lines, yet split() does not split on those empty
matches. You could submit a bug report on SourceForge.
Of course, if you really want to state your intentions, you could just use:
>>> "foo\n\nbar\n\nbaz".split('\n\n')
['foo', 'bar', 'baz']
as you aren't doing anything here that obviously benefits from regex
obfuscation.
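If the blank lines might contain stray spaces or tabs, or come in runs of
more than one, a plain '\n\n' split is no longer enough. The following is
only a sketch under that assumption; split_paragraphs is a hypothetical
helper name, not anything in the standard library. Its pattern always
consumes at least two characters, so the result should not depend on how
re.split treats empty matches:

```python
import re


def split_paragraphs(text):
    """Split text on runs of blank (empty or space/tab-only) lines.

    Hypothetical helper for illustration only.
    """
    # A newline, followed by one or more lines that hold nothing but
    # optional spaces/tabs and their terminating newline.
    return re.split(r"\n(?:[ \t]*\n)+", text)


text = "foo\n\nbar\n\nbaz"

# Plain string split: enough when separators are exactly two newlines.
print(text.split("\n\n"))

# Regex split: also copes with whitespace-only and repeated blank lines.
print(split_paragraphs(text))
print(split_paragraphs("foo\n \n\nbar"))
```

For the input in the original question both approaches give the same
three pieces, so the regex version only earns its keep when the input is
messier than that.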
--
Duncan Booth duncan at rcp.co.uk
int month(char *p){return(124864/((p[0]+p[1]-p[2]&0x1f)+1)%12)["\5\x8\3"
"\6\7\xb\1\x9\xa\2\0\4"];} // Who said my code was obscure?