Stripping whitespace
John Machin
sjmachin at lexicon.net
Wed Jan 23 17:47:35 EST 2008
On Jan 24, 7:57 am, "Reedick, Andrew" <jr9... at ATT.COM> wrote:
>
> Why is it that so many Python people are regex averse? Use the dashed
> line as a regex. Convert the dashes to dots. Wrap the dots in
> parentheses. Convert the whitespace chars to '\s'. Presto! Simpler,
> cleaner code.
Woo-hoo! Yesterday was HTML day, today is code review day. Yee-haa!
>
> import re
>
> state = 0
> header_line = ''
> pattern = ''
> f = open('a.txt', 'r')
> for line in f:
>     if line[-1:] == '\n':
>         line = line[:-1]
>
>     if state == 0:
>         header_line = line
>         state += 1
state = 1
>     elif state == 1:
>         pattern = re.sub(r'-', r'.', line)
>         pattern = re.sub(r'\s', r'\\s', pattern)
>         pattern = re.sub(r'([.]+)', r'(\1)', pattern)
Consider this:
pattern = ' '.join('(.{%d})' % len(x) for x in line.split())
>         print pattern
>         state += 1
state = 2
>
>         headers = re.match(pattern, header_line)
>         if headers:
>             print headers.groups()
>     else:
>         state = 2
assert state == 2
>         m = re.match(pattern, line)
>         if m:
>             print m.groups()
>
> f.close()
>
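The join-based pattern suggested above can be tried on its own, without the file loop. What follows is only a sketch under a few assumptions: the sample header, dashed line and data row are made up for illustration, the helper names pattern_from_dashes and split_fields are hypothetical, and the dash runs are assumed to be separated by exactly one space (which is what ' '.join produces). Each matched group is stripped afterwards, since fixed-width fields carry their padding with them.

```python
import re

def pattern_from_dashes(dash_line):
    """Build a regex with one fixed-width group per run of dashes."""
    # Assumption: the runs of dashes are separated by exactly one space.
    return ' '.join('(.{%d})' % len(run) for run in dash_line.split())

def split_fields(pattern, line):
    """Return the stripped column values, or None if the line does not fit."""
    m = re.match(pattern, line)
    if m is None:
        return None
    return [field.strip() for field in m.groups()]

# Hypothetical sample data, built with format strings so the widths are exact.
widths = (10, 3, 8)
dash_line = ' '.join('-' * w for w in widths)
header = '%-10s %-3s %-8s' % ('Name', 'Age', 'City')
row = '%-10s %3d %-8s' % ('Alice', 30, 'Paris')

pattern = pattern_from_dashes(dash_line)
print(pattern)
print(split_fields(pattern, header))
print(split_fields(pattern, row))
```

A line shorter than the dashed line does not match at all, so split_fields returns None for it; whether to pad such lines or skip them depends on the data.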