Pythonic way to normalize vertical whitespace

Stephen Hansen apt.shansen at gmail.com
Sun May 10 20:21:56 EDT 2009


On Fri, May 8, 2009 at 8:53 AM, <python at bdurham.com> wrote:

>  I'm looking for suggestions on technique (not necessarily code) about the
> most pythonic way to normalize vertical whitespace in blocks of text so that
> there is never more than 1 blank line between paragraphs. Our source text
> has newlines normalized to single newlines (\n vs. combinations of \r and
> \n), but there may be leading and trailing whitespace around each newline.
>

I'm not sure what's more Pythonic in this case, but the approach I'd use is
a generator. Like so:

--- code to follow ---

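def cleaner(body):
    # True while we are inside a run of blank lines
    empty = False
    for line in body.splitlines():
        # Drop trailing whitespace so whitespace-only lines count as blank
        line = line.rstrip()
        if line:
            empty = False
            yield line
        elif not empty:
            # First blank line of a run: emit it, swallow the rest
            empty = True
            yield ''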

Text = """
This is my first paragraph.

This is my second.


This is my third.
This is my fourth.

This is my fifth.



This is my sixth.





This is my seventh.

Okay?
"""

print('\n'.join(cleaner(Text)))

--- end code ---

You may want to tweak the logic a bit depending on how you want to handle
initial blank lines and the like, and I'm not sure how you want to treat two
paragraphs w/o a blank line between them (you talk about going from \n\n\n to
\n\n, so I'm not sure if you want to merely allow a blank line separating
paragraphs or require one).
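
As one possible tweak for the initial-blank-line case, here is a rough sketch
that also drops blank lines at the very start and end of the text. The name
cleaner_strict and the sample string are made up for illustration; they are
not from the original question, and this is only one way to do it.

```python
def cleaner_strict(body):
    """Collapse runs of blank lines to one and drop leading/trailing blanks."""
    pending_blank = False   # a blank run was seen but not yet emitted
    seen_text = False       # at least one non-blank line has been yielded
    for line in body.splitlines():
        line = line.rstrip()
        if line:
            # Emit a single separator only between two non-blank lines
            if pending_blank and seen_text:
                yield ''
            pending_blank = False
            seen_text = True
            yield line
        else:
            pending_blank = True

# Hypothetical sample: leading blanks, trailing blanks, whitespace-only line
sample = "\n\nfirst  \n\n\n\nsecond\nthird\n   \n\n"
result = '\n'.join(cleaner_strict(sample))
print(result)
```

Deferring the blank line until the next non-blank line shows up is what keeps
a trailing run of blanks from being emitted. Adjacent paragraphs with no blank
line between them ("second" and "third" above) are left as they are.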

I guess this is basically your first approach, just specifically going with
"use a generator to do it" instead of regular list building, because that
should be faster and cleaner in this case... and I don't think the third
approach is really needed. Python can do this pretty quickly.

The second approach seems distinctly unPythonic. I use regular expressions
in several places, mind, but if there's a straightforward, clean way to do
something without them, that always seems like the Pythonic thing to me. :)

But that's just me :)
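
For comparison, a regular-expression version might look something like the
sketch below. This is my guess at what such an approach would involve, not
necessarily the code from the original question; the name collapse_blank_lines
and the sample input are hypothetical. It assumes newlines are already
normalized to \n, as stated in the question.

```python
import re

def collapse_blank_lines(body):
    # Strip trailing spaces/tabs from every line (MULTILINE makes $ match
    # at each line end), so whitespace-only lines become truly empty
    body = re.sub(r'[ \t]+$', '', body, flags=re.MULTILINE)
    # Three or more newlines in a row means two or more blank lines;
    # squeeze them down to exactly one blank line
    return re.sub(r'\n{3,}', '\n\n', body)

# Hypothetical sample input
collapsed = collapse_blank_lines("a  \n   \n\n\nb\nc")
print(collapsed)
```

It is shorter, but you have to read two patterns to see what it does, where
the generator spells out the same rule line by line.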

--S