To re or not to re ... ( word wrap function?)
Skip Montanaro
skip at pobox.com
Mon Sep 24 15:02:37 EDT 2001
Chris> I did notice that no one used an re-based solution, except Skip
Chris> using
Chris> re.split(r'\s+', s)
Chris> Is this any different than string.split(s) ?
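Not for text without leading or trailing whitespace. (When s starts or ends
with whitespace, re.split returns an empty string at that end, while
string.split(s) does not.) It was just a brain fart on my part when I wrote
the wrap function ages ago. (I don't use it heavily, and then only to format
the occasional paragraph in Musi-Cal's stored query responses.) You could use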
re.split(r'(\s+)', s)
to retain all your white space:
>>> s = "Now is the time for all good men. Hey, what about me?"
>>> re.split(r'\s+', s)
['Now', 'is', 'the', 'time', 'for', 'all', 'good', 'men.', 'Hey,', 'what', 'about', 'me?']
>>> re.split(r'(\s+)', s)
['Now', ' ', 'is', ' ', 'the', ' ', 'time', ' ', 'for', ' ', 'all', ' ', 'good', ' ', 'men.', ' ', 'Hey,', ' ', 'what', ' ', 'about', ' ', 'me?']
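For anyone comparing the two approaches, here is a small sketch of how the splits behave at the edges of a string. The sample string is made up for illustration, and s.split() is used as the string-method equivalent of string.split(s).

```python
import re

# Made-up sample with leading and trailing whitespace.
s = "  Now is the time  "

# split() with no arguments drops leading and trailing whitespace entirely.
plain = s.split()

# re.split leaves an empty string wherever the pattern matches at an edge.
by_re = re.split(r'\s+', s)

# With a capturing group the separators are returned too, so joining the
# pieces back together reproduces the original string exactly.
kept = re.split(r'(\s+)', s)
rejoined = ''.join(kept)

print(plain)
print(by_re)
print(rejoined == s)
```

The round trip through ''.join is what makes the capturing form attractive for a wrap function: no spacing information is lost until you decide to drop it.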
The start-of-line initialization might be a bit more complex (you'd have to
toss out any whitespace at the beginning or end of the line being
constructed). Also, you'd probably have to do something like
s = re.sub(r'(\r\n|\r|\n)', ' ', s)
to at least convert newlines to spaces before the split.
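Putting those pieces together, a wrap function along these lines might look like the sketch below. This is not the original wrap function; the name wrap, the width parameter and the greedy line-filling strategy are all assumptions made for illustration.

```python
import re

def wrap(s, width=40):
    """Hypothetical greedy wrap that keeps the original spacing inside a line."""
    # Convert any newline convention to a single space before the split.
    s = re.sub(r'\r\n|\r|\n', ' ', s)
    lines = []
    current = ''
    for piece in re.split(r'(\s+)', s):
        if not piece:
            # Empty strings appear when s starts or ends with whitespace.
            continue
        if piece.isspace():
            # Toss out whitespace at the start of a line; keep it otherwise.
            if current:
                current += piece
            continue
        if current and len(current) + len(piece) > width:
            # The word does not fit: finish the line without trailing whitespace.
            lines.append(current.rstrip())
            current = piece
        else:
            current += piece
    if current.strip():
        lines.append(current.rstrip())
    return lines

text = "Now is the time for all good men. Hey, what about me?"
for line in wrap(text, 20):
    print(line)
```

A word longer than width simply gets a line to itself in this sketch; splitting or hyphenating long words is left out.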
--
Skip Montanaro (skip at pobox.com)
http://www.mojam.com/
http://www.musi-cal.com/