whitespace within a string
Jeff Epler
jepler at unpythonic.net
Mon Feb 23 20:44:13 EST 2004
You can use the magic of no-arg split() to do this:
def canonize_whitespace(s):
    return " ".join(s.split())
>>> canonize_whitespace("a b\t\tc\td\t e")
'a b c d e'
A regular expression substitution can do the job too:
import re
def canonize_whitespace(s):
    return re.sub(r'\s+', ' ', s)
>>> canonize_whitespace("a b\t\tc\td\t e")
'a b c d e'
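One difference between the two versions, as a small sketch: the split/join form also drops leading and trailing whitespace, while the re.sub form leaves a single space at each end. The names via_split and via_regex are made up here just to put both side by side.

```python
import re

def via_split(s):
    # no-arg split() discards leading and trailing whitespace as well
    return " ".join(s.split())

def via_regex(s):
    # every whitespace run, including those at the ends, becomes one space
    return re.sub(r'\s+', ' ', s)

print(repr(via_split("  a \t b\n")))   # 'a b'
print(repr(via_regex("  a \t b\n")))   # ' a b '
```

If the ends matter, calling .strip() on the re.sub result should make the two agree.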
Of course, if 'x=y' is accepted just like 'x = y', then neither of
these approaches is good enough: they only collapse whitespace that is
already there, and never add any around the '='.
def canonize_config_line(s):
    if '=' not in s:
        return s
    a, b = s.split("=", 1)
    return "%s = %s" % (a.strip(), b.strip())
>>> [canonize_config_line(s) for s in
... ['x=y', 'x\t= y', ' x = y ', "#z"]]
['x = y', 'x = y', 'x = y', '#z']
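One case the list above does not cover is a comment line that itself contains '=', such as '#z=1', which canonize_config_line would rewrite to '#z = 1'. If that is unwanted, a variation might look like the sketch below. It assumes comments are lines whose first non-blank character is '#', which is only a guess about the config format, and the function name is made up.

```python
def canonize_config_line_skip_comments(s):
    # Assumption: a line starting with '#' (after optional whitespace) is a comment
    if s.lstrip().startswith('#') or '=' not in s:
        return s
    # split on the first '=' only, so later '=' signs stay inside the value
    a, b = s.split("=", 1)
    return "%s = %s" % (a.strip(), b.strip())

print([canonize_config_line_skip_comments(s)
       for s in ['x=y', '#z=1', 'x = a=b']])
# ['x = y', '#z=1', 'x = a=b']
```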
Jeff