[ python-Bugs-852532 ] ^$ won't split on empty line
SourceForge.net
noreply at sourceforge.net
Tue Dec 2 10:20:27 EST 2003
Bugs item #852532, was opened at 2003-12-02 06:01
Message generated for change (Comment added) made by tim_one
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=852532&group_id=5470
Category: Regular Expressions
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Jan Burgy (jburgy)
Assigned to: Fredrik Lundh (effbot)
Summary: ^$ won't split on empty line
Initial Comment:
Python 2.3.2 (#49, Oct 2 2003, 20:02:00) [MSC v.1200 32 bit (Intel)] on win32
>>> import re
>>> re.compile('^$', re.MULTILINE).split('foo\n\nbar')
['foo\n\nbar']
I expect ['foo\n', '\nbar'], since, according to the
documentation, $ "in MULTILINE mode also matches
before a newline".
Thanks, Jan
----------------------------------------------------------------------
>Comment By: Tim Peters (tim_one)
Date: 2003-12-02 10:20
Message:
Confirmed on Pythons 2.1.3, 2.2.3, 2.3.2, and current CVS.
More generally, split() doesn't appear to split on any empty
(0-length) match. For example,
>>> pat = re.compile(r'\b')
>>> pat.split('(a b)')
['(a b)']
>>> pat.findall('(a b)') # but the pattern matches 4 places
['', '', '', '']
>>>
That's probably a deliberate design constraint, but it isn't documented.
For example, if you split "abc" by the pattern x*, what do you
expect? The pattern matches (with length 0) at 4 places,
but I bet most people would be surprised to get
['', 'a', 'b', 'c', '']
back instead of (as they do get)
['abc']
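For anyone who does want a split at every match, empty ones included, the
behavior can be approximated by walking the matches with finditer() and
slicing the string by hand. The following is only a minimal sketch: the
helper name split_on_all_matches is hypothetical (it is not part of the re
module), and it ignores capturing groups and maxsplit, which the real split()
handles.

```python
import re

def split_on_all_matches(pattern, text):
    """Hypothetical helper: cut text at every match of a compiled pattern,
    including 0-length matches. Not part of the re module."""
    pieces = []
    last = 0  # index just past the previous match
    for m in pattern.finditer(text):
        pieces.append(text[last:m.start()])
        last = m.end()
    pieces.append(text[last:])  # whatever follows the final match
    return pieces

# The three patterns discussed in this report:
by_empty_line = split_on_all_matches(re.compile('^$', re.MULTILINE), 'foo\n\nbar')
by_x_star = split_on_all_matches(re.compile('x*'), 'abc')
by_boundary = split_on_all_matches(re.compile(r'\b'), '(a b)')
```

With this sketch, by_empty_line comes out as ['foo\n', '\nbar'] (what the
submitter expected), and by_x_star comes out as ['', 'a', 'b', 'c', ''], which
is the surprising result described above.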
----------------------------------------------------------------------