How to remove empty lines with re?

ted tedNOSPAM94107 at yahoo.com
Sat Oct 11 03:08:19 EDT 2003


Thanks Anand, works great.


"Anand Pillai" <pythonguy at Hotpop.com> wrote in message
news:84fc4588.0310100849.4546e804 at posting.google.com...
> To do this, you need to modify your re to just
> this
>
> empty=re.compile('^$')
>
> This of course looks for a pattern where the end of the line comes
> immediately after the beginning, i.e. the line is empty :-)
>
> Here is the complete code.
>
> import re
>
> empty=re.compile('^$')
> for line in open('test.txt').readlines():
>     if empty.match(line):
>         continue
>     else:
>         print line,
>
> The comma at the end of the print is to avoid printing another newline,
> since the 'readlines()' method gives you the line with a '\n' at the end.
>
> Also, don't forget to compile your regexps for efficiency's sake.
>
> HTH
>
> -Anand Pillai
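
A note on the approach quoted above: '^$' only matches lines that are truly
empty, so a line holding just spaces or tabs still gets printed. Below is a
minimal sketch of the same idea for Python 3 that also skips whitespace-only
lines. The names BLANK and drop_blank_lines are made up for illustration and
are not from the quoted code, and the sample list stands in for the contents
of a file.

```python
import re

# Matches a line that is empty or contains only whitespace.
# The trailing '\n' that file iteration leaves on each line counts as whitespace.
BLANK = re.compile(r'^\s*$')

def drop_blank_lines(lines):
    """Return only the lines that have at least one non-whitespace character."""
    return [line for line in lines if not BLANK.match(line)]

# Hypothetical sample data standing in for the lines of a file.
sample = ["first\n", "\n", "   \t\n", "second\n", ""]

# end="" plays the role of the trailing comma in the Python 2 print statement:
# the kept lines already carry their own '\n'.
print("".join(drop_blank_lines(sample)), end="")
```

For a real file, passing the open file object (for example from
open('test.txt')) in place of sample should behave the same way, since
iterating over a file yields lines with their newline still attached.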
>
>
> "ted" <tedNOSPAM94107 at yahoo.com> wrote in message
> news:<vocoudjtp6vv25 at corp.supernews.com>...
> > I'm having trouble using the re module to remove empty lines in a file.
> >
> > Here's what I thought would work, but it doesn't:
> >
> > import re
> > f = open("old_site/index.html")
> > for line in f:
> >     line = re.sub(r'^\s+$|\n', '', line)
> >     print line
> >
> > Also, when I try to remove some HTML tags, I get even more empty lines:
> >
> > import re
> > f = open("old_site/index.html")
> > for line in f:
> >     line = re.sub('<.*?>', '', line)
> >     line = re.sub(r'^\s+$|\n', '', line)
> >     print line
> >
> > I don't know what I'm doing. Any help appreciated.
> >
> > TIA,
> > Ted
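
For anyone hitting the same problem as in the original question: the blank
lines survive because 'print line' still emits a newline even after re.sub
has reduced the line to an empty string, and stripping tags turns every
tag-only line into exactly such an empty string. A rough sketch for Python 3
that removes tags first and then skips whatever is left blank follows. The
names TAG and strip_tags_and_blanks and the sample data are assumptions for
illustration, and the pattern is the simple one from the question, so it will
not handle tags that span several lines the way a real HTML parser would.

```python
import re

# Same non-greedy tag pattern as in the original question.
# '.' does not match '\n', so a tag split across lines is left untouched.
TAG = re.compile(r'<.*?>')

def strip_tags_and_blanks(lines):
    """Yield each line with tags removed, skipping lines that end up blank."""
    for line in lines:
        # rstrip() drops the trailing newline along with other trailing whitespace.
        text = TAG.sub('', line).rstrip()
        if text.strip():
            yield text

# Hypothetical sample data standing in for the lines of an HTML file.
html = ["<html>\n", "<p>Hello</p>\n", "\n", "  <br>\n", "<b>World</b>  \n"]

for text in strip_tags_and_blanks(html):
    print(text)  # print() supplies the single newline per kept line
```

The point of the design is to decide whether to print at all rather than to
substitute the line down to nothing and print it anyway.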





