How to remove empty lines with re?
Anand Pillai
pythonguy at Hotpop.com
Fri Oct 10 12:49:28 EDT 2003
To do this, simplify your regular expression to just this:
empty=re.compile('^$')
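This of course looks for a pattern where the end of the line comes
immediately after the beginning, i.e. the line is empty :-)
Note that '$' also matches just before a trailing newline, so a line
consisting only of '\n' still counts as empty.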
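As a quick check of what this pattern does and does not match, here is a
small sketch. It is written for Python 3 (hence print() as a function), and
the helper name is_empty and the sample strings are made up for illustration:

```python
import re

# Same pattern as above; the name `empty` follows the post.
empty = re.compile('^$')

def is_empty(line):
    """Hypothetical helper: True if the pattern matches at the start of line."""
    return empty.match(line) is not None

# An empty string, a bare newline, a whitespace-only line, and a text line.
samples = ['', '\n', ' \n', 'text\n']
results = [is_empty(s) for s in samples]
print(results)
```

Only the first two samples match. A line that holds spaces or tabs before
the newline is not matched by '^$', which matters if the file has such lines.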
Here is the complete code.
import re

empty=re.compile('^$')
for line in open('test.txt').readlines():
    if empty.match(line):
        continue
    else:
        print line,
The comma at the end of the print is to avoid printing another newline,
since the 'readlines()' method gives you the line with a '\n' at the end.
Also, don't forget to compile your regexps for efficiency's sake.
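If lines that contain only spaces or tabs should be dropped as well (which
is what the '^\s+$' in the original question seems to be after), one
possible variation is sketched below. It is written for Python 3, and it
assumes that whitespace-only lines count as empty. The names blank and
remove_blank_lines are made up for this example, and io.StringIO stands in
for the real file so the snippet runs on its own:

```python
import io
import re

# Assumption: whitespace-only lines should also be treated as empty.
blank = re.compile(r'^\s*$')

def remove_blank_lines(lines):
    """Hypothetical helper: keep lines that hold something besides whitespace."""
    return [line for line in lines if not blank.match(line)]

# io.StringIO stands in for open('test.txt') to keep the sketch self-contained.
sample = io.StringIO('first\n\n   \nsecond\n\t\nthird\n')
kept = remove_blank_lines(sample)

# end='' plays the role of the trailing comma in the Python 2 print statement:
# each kept line already ends with its own '\n'.
print(''.join(kept), end='')
```

The pattern is still compiled once, outside the loop, as suggested above.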
HTH
-Anand Pillai
"ted" <tedNOSPAM94107 at yahoo.com> wrote in message news:<vocoudjtp6vv25 at corp.supernews.com>...
> I'm having trouble using the re module to remove empty lines in a file.
>
> Here's what I thought would work, but it doesn't:
>
> import re
> f = open("old_site/index.html")
> for line in f:
>     line = re.sub(r'^\s+$|\n', '', line)
>     print line
>
> Also, when I try to remove some HTML tags, I get even more empty lines:
>
> import re
> f = open("old_site/index.html")
> for line in f:
>     line = re.sub('<.*?>', '', line)
>     line = re.sub(r'^\s+$|\n', '', line)
>     print line
>
> I don't know what I'm doing. Any help appreciated.
>
> TIA,
> Ted