Text parsing via regex

Paul McGuire ptmcg at austin.rr.com
Mon Dec 8 14:21:00 EST 2008


On Dec 8, 12:13 pm, Robocop <btha... at physics.ucsd.edu> wrote:
> I'm having a little text parsing problem that I think would be really
> quick to troubleshoot for someone more versed in Python and regexes.
> I need to write a simple script that parses some arbitrarily long
> string every 50 characters, and does not parse text in the middle of
> words

Are you just wrapping text?  If so, then use the textwrap module.

import textwrap

source_string = ("a bunch of nonsense that could be really long, or "
                 "really short depending on the situation")

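# fill() returns one string with newlines; wrap() returns a list of lines
print(textwrap.fill(source_string, 50))
print(textwrap.wrap(source_string, 50))

# length of each wrapped line
print(list(map(len, textwrap.wrap(source_string, 50))))
# pad each line with spaces to exactly 50 columns
pad50 = lambda s: (s + " " * 50)[:50]
print('|\n'.join(map(pad50, textwrap.wrap(source_string, 50))))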

Prints:

a bunch of nonsense that could be really long, or
really short depending on the situation
['a bunch of nonsense that could be really long, or',
 'really short depending on the situation']
[49, 39]
a bunch of nonsense that could be really long, or |
really short depending on the situation
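
One caveat if "never split a word" is a hard requirement: by default textwrap will cut a single word that is longer than the width. The sketch below shows one way to turn that off. The names wrap_keep_words, long_word and sample are made up here for illustration, and whether hyphenated words should also stay whole is an assumption about what you want.

```python
import textwrap

def wrap_keep_words(text, width=50):
    """Wrap text to `width` columns without ever splitting a word."""
    # break_long_words=False leaves a word longer than `width` on its own
    # (over-long) line instead of cutting it. break_on_hyphens=False also
    # stops textwrap from breaking hyphenated words at the hyphen
    # (assumption: hyphenated words should stay whole too).
    return textwrap.wrap(text, width,
                         break_long_words=False,
                         break_on_hyphens=False)

long_word = "x" * 60  # made-up 60-character "word" for illustration
sample = "short words then " + long_word + " and more short words"

for line in wrap_keep_words(sample):
    # show each line's length next to its text
    print("%3d %s" % (len(line), line))
```

The trade-off is that a line can then be longer than 50 characters, so any later padding or truncation to a fixed width needs to allow for that.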

-- Paul




