How to remove empty lines with re?

Anand Pillai pythonguy at Hotpop.com
Fri Oct 10 12:49:28 EDT 2003


To do this, you only need to change your regular expression to this:

empty=re.compile('^$')

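This, of course, matches only where the end of the line comes immediately
after the beginning, i.e. the line is empty :-) Note that '$' also matches
just before a trailing newline, so a line read from a file as '\n' still
counts as empty.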
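To check what the pattern treats as empty, here is a small sketch. The helper name is_empty is made up for illustration and is not part of the original code, and the print() calls assume Python 3. A line that contains only spaces or tabs is not matched by '^$'.

```python
import re

empty = re.compile('^$')

def is_empty(line):
    """Return True if the line holds nothing except an optional trailing newline."""
    return empty.match(line) is not None

# '$' matches at the end of the string or just before a final newline.
print(is_empty('\n'))       # bare newline, as read from a file
print(is_empty(''))         # empty string
print(is_empty('   \n'))    # whitespace only
print(is_empty('text\n'))   # ordinary line
```

If whitespace-only lines should also be dropped, a pattern such as r'^\s*$' would be one option.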

Here is the complete code.

import re

empty=re.compile('^$')
for line in open('test.txt').readlines():
    if empty.match(line):
        continue
    else:
        print line,

The comma at the end of the print is to avoid printing another newline,
since the 'readlines()' method gives you the line with a '\n' at the end.
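For anyone on Python 3, where print is a function, the same loop can be sketched roughly as follows. The generator name non_empty_lines and the sample data are assumptions for illustration only; end='' plays the role of the trailing comma.

```python
import re

empty = re.compile('^$')

def non_empty_lines(lines):
    """Yield only the lines that the '^$' pattern does not match."""
    for line in lines:
        if not empty.match(line):
            yield line

# Stand-in for the contents of a file such as test.txt.
sample = ['first\n', '\n', 'second\n', '   \n', 'third\n']

for line in non_empty_lines(sample):
    # Each line already ends with '\n', so suppress print's own newline.
    print(line, end='')
```

An open file object can be passed in place of the sample list, since iterating over a file yields its lines one at a time. The whitespace-only line is kept here, because '^$' does not match it.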

Also, don't forget to compile your regexps for efficiency's sake.
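The compile-once idea also carries over to the tag-stripping case in the quoted question. The following is only a sketch under my own assumptions, written for Python 3: the function name and sample lines are invented, the tag pattern is the one from the question, and r'^\s*$' is swapped in for '^$' so that lines left with only whitespace after tag removal are dropped too.

```python
import re

# Compile both patterns once, outside the loop.
tag = re.compile(r'<.*?>')      # tag pattern taken from the quoted question
blank = re.compile(r'^\s*$')    # empty or whitespace-only line

def strip_tags_and_blanks(lines):
    """Remove tags from each line, then skip lines that end up blank."""
    for line in lines:
        line = tag.sub('', line)
        if blank.match(line):
            continue
        yield line

# Stand-in for the lines of an HTML file.
sample = ['<html>\n', '<p>Hello</p>\n', '   \n', '<br>\n', 'world\n']
print(''.join(strip_tags_and_blanks(sample)), end='')
```

This works line by line, so a tag that spans more than one line would not be removed.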

HTH

-Anand Pillai


"ted" <tedNOSPAM94107 at yahoo.com> wrote in message news:<vocoudjtp6vv25 at corp.supernews.com>...
> I'm having trouble using the re module to remove empty lines in a file.
> 
> Here's what I thought would work, but it doesn't:
> 
> import re
> f = open("old_site/index.html")
> for line in f:
>     line = re.sub(r'^\s+$|\n', '', line)
>     print line
> 
> Also, when I try to remove some HTML tags, I get even more empty lines:
> 
> import re
> f = open("old_site/index.html")
> for line in f:
>     line = re.sub('<.*?>', '', line)
>     line = re.sub(r'^\s+$|\n', '', line)
>     print line
> 
> I don't know what I'm doing. Any help appreciated.
> 
> TIA,
> Ted



