What is built-in method sub

Phlip phlip2005 at gmail.com
Mon Jan 11 19:46:32 EST 2010


>>>>     trailingPattern = '(\S*)\ +?\n'
>>>>     line = re.sub(trailingPattern, '\\1\n', line)

What happens with this?

      trailingPattern = r'\s+$'
      line = re.sub(trailingPattern, '', line)

I'm guessing that $ terminates \s+'s greediness without snarfing the underlying 
\n. Then I'm guessing that the lack of a \1 replacer will help the sub work 
faster with less internal string shuffling.
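
A quick check of the first guess, as a minimal sketch (the sample line is made up for illustration). In Python's re, \s matches \n as well, so a greedy \s+ anchored by $ appears to take the trailing newline along with the spaces, whereas the original pattern puts the newline back. Nothing here measures speed.

```python
import re

# Hypothetical sample line: text, trailing spaces, then a newline.
line = "some text   \n"

# Original pattern: keep the captured non-space run and re-add the newline.
original = re.sub(r'(\S*)\ +?\n', '\\1\n', line)

# Suggested pattern: no capture group, anchored at the end with $.
suggested = re.sub(r'\s+$', '', line)

print(repr(original))   # 'some text\n'
print(repr(suggested))  # 'some text'
```

So the two substitutions are not drop-in equivalents when the caller still needs the newline.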

>>> line = line.rstrip()?

is probably faster still, but there might be a technical reason to avoid it.

But these uncertainties are why I write unit tests, including tests for the edge 
cases. (What if it's a \r\n? What if the \n is missing? etc.) That way I don't 
need to memorize re's exact behavior, and if I find a reason to swap in a 
.rstrip(), I can pass all the tests and make sure the substitution works the same.
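
A minimal sketch of such tests, assuming the goal is to remove all trailing whitespace including the line ending. The names strip_with_re, strip_with_rstrip, CASES and TrailingWhitespaceTest are made up for illustration; they are not from the original code.

```python
import re
import unittest

TRAILING_WS = re.compile(r'\s+$')


def strip_with_re(line):
    # Regex version: delete the whitespace run at the end of the string.
    return TRAILING_WS.sub('', line)


def strip_with_rstrip(line):
    # Candidate replacement: the built-in string method.
    return line.rstrip()


# (input, expected) pairs covering the edge cases mentioned above.
CASES = [
    ("abc   \n", "abc"),     # plain \n ending
    ("abc \t\r\n", "abc"),   # \r\n ending
    ("abc   ", "abc"),       # the \n is missing
    ("abc", "abc"),          # nothing to strip
    ("   \n", ""),           # whitespace only
    ("", ""),                # empty string
]


class TrailingWhitespaceTest(unittest.TestCase):
    def test_both_versions_agree(self):
        for line, expected in CASES:
            with self.subTest(line=line):
                self.assertEqual(strip_with_re(line), expected)
                self.assertEqual(strip_with_rstrip(line), expected)
```

Saved in a module, this can be run with python -m unittest. If a case ever needs the newline preserved, the expected values change and both versions would have to be adjusted together.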

-- 
   Phlip
   http://c2.com/cgi/wiki?ZeekLand


