To re or not to re ... ( word wrap function?)

Skip Montanaro skip at pobox.com
Mon Sep 24 15:02:37 EDT 2001


    Chris> I did notice that no one used an re-based solution, except Skip
    Chris> using

    Chris> re.split(r'\s+', s)

    Chris> Is this any different than string.split(s) ?

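Probably not, as long as s has no leading or trailing whitespace (there
re.split returns empty strings at the ends and string.split(s) doesn't).
Using re was just a brain fart on my part when I wrote the wrap function
ages ago.  (I don't use it heavily, and then only to format the occasional
paragraph in Musi-Cal's stored query responses.)  You could use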

    re.split(r'(\s+)', s)

to retain all your whitespace (the capturing group makes re.split return
the separators as well):

    >>> s = "Now is the time for all good men.  Hey, what about me?"
    >>> re.split(r'\s+', s)
    ['Now', 'is', 'the', 'time', 'for', 'all', 'good', 'men.', 'Hey,', 'what', 'about', 'me?']
    >>> re.split(r'(\s+)', s)
    ['Now', ' ', 'is', ' ', 'the', ' ', 'time', ' ', 'for', ' ', 'all', ' ', 'good', ' ', 'men.', '  ', 'Hey,', ' ', 'what', ' ', 'about', ' ', 'me?']
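
On the original question, the one place the two splits visibly differ is at the edges of the string. Here is a minimal sketch, assuming Python 3, where s.split() stands in for string.split(s); the helper names split_with_re and split_plain are made up for illustration:

```python
import re

def split_with_re(s):
    # re.split yields an empty string before leading (and after trailing)
    # whitespace, because the separator matches at the very edge.
    return re.split(r'\s+', s)

def split_plain(s):
    # split() with no arguments discards leading and trailing whitespace.
    return s.split()

s = "  Hey, what about me? "
print(split_with_re(s))   # ['', 'Hey,', 'what', 'about', 'me?', '']
print(split_plain(s))     # ['Hey,', 'what', 'about', 'me?']
```

For a string with no whitespace at either end, both return the same list.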

The start-of-line initialization might be a bit more complex (you'd have to
toss out any whitespace at the beginning or end of the line being
constructed).  Also, you'd probably have to do something like

    s = re.sub(r'(\r\n|\r|\n)', ' ', s)

to at least convert newlines to spaces before the split.
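
Putting those pieces together might look roughly like the sketch below. It assumes Python 3 and is not the wrap function discussed in this thread; the name wrap, the width parameter, and the list-of-lines return value are all assumptions made for illustration.

```python
import re

def wrap(s, width=40):
    """Hypothetical wrap helper that keeps interior whitespace runs."""
    # Convert every newline flavor to a space before the split.
    s = re.sub(r'(\r\n|\r|\n)', ' ', s)
    lines = []
    line = ''
    # The capturing group keeps each whitespace run as its own list item.
    for piece in re.split(r'(\s+)', s):
        if len(line) + len(piece) > width and line.strip():
            # Toss whitespace at the end of the line just finished.
            lines.append(line.rstrip())
            line = ''
        if not line and piece.isspace():
            # Toss whitespace at the beginning of a new line.
            continue
        line += piece
    if line.strip():
        lines.append(line.rstrip())
    return lines

s = "Now is the time for all good men.  Hey, what about me?"
for line in wrap(s, 20):
    print(line)
```

With a width of 20 this prints three lines, and the double space after "men." survives because it falls in the middle of a line. A word longer than the width is left on a line of its own rather than broken.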

-- 
Skip Montanaro (skip at pobox.com)
http://www.mojam.com/
http://www.musi-cal.com/



