splitting by double newline

Peter Otten __peter__ at web.de
Mon Feb 7 13:20:38 EST 2011


Nikola Skoric wrote:

> Hello everybody,
> 
> I'd like to split a file by double newlines, but portably. Now,
> splitting by one or more newlines is relatively easy:
> 
> self.tables = re.split("[\r\n]+", bulk)
> 
> But, how can I split on double newlines? I tried several approaches,
> but none worked...

If you open the file in universal newline mode with

with open(filename) as f:  # Python 3 text mode; on Python 2 use open(filename, "U")
    bulk = f.read()

your data will only contain "\n". You can then split with 

blocks = bulk.split("\n\n") # exactly one empty line

or 

blocks = re.compile(r"\n{2,}").split(bulk) # one or more empty lines
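
To make the difference between the two splits concrete, here is a minimal
sketch. The sample string is invented for illustration and stands in for
the contents of a file that has already been read with newline translation:

```python
import re

# Hypothetical sample: "a/b" and "c" are separated by one empty line,
# "c" and "d" by two empty lines.
bulk = "a\nb\n\nc\n\n\nd\n"

# Splits on every exact "\n\n"; the third newline of the longer run
# is left at the start of the next block.
exact = bulk.split("\n\n")

# Treats any run of two or more newlines as a single separator.
relaxed = re.split(r"\n{2,}", bulk)

print(exact)    # ['a\nb', 'c', '\nd\n']
print(relaxed)  # ['a\nb', 'c', 'd\n']
```

So the plain str.split() is only a good fit when blocks are always
separated by exactly one empty line.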

One last variant that doesn't read in the whole file and accepts lines with 
only whitespace as empty:

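import itertools

with open(filename) as f:  # on Python 2: open(filename, "U")
    # lazy generator: iterate over it inside this block, while f is open
    blocks = ("".join(group) for empty, group
              in itertools.groupby(f, key=str.isspace) if not empty)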
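
A self-contained sketch of the groupby approach, in case you want to try it
without a file on disk. The helper name iter_blocks and the sample text are
made up for this example; io.StringIO merely stands in for an open text
file:

```python
import io
import itertools


def iter_blocks(lines):
    """Yield blocks of consecutive non-blank lines, one block at a time."""
    # groupby() bundles consecutive lines that share the same
    # str.isspace() result, so blank and non-blank runs alternate.
    for empty, group in itertools.groupby(lines, key=str.isspace):
        if not empty:
            yield "".join(group)


# Separators here are an empty line, a line of spaces and a line with a tab.
sample = io.StringIO("a\nb\n\n   \nc\n\t\nd\n")
blocks = list(iter_blocks(sample))
print(blocks)  # ['a\nb\n', 'c\n', 'd\n']
```

Note that each block keeps its trailing newline, and that only one block is
held in memory at a time as long as you iterate instead of calling list().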


