[newbie] how to remove empty lines from webpage/file

jenswaelkens at gmail.com
Tue Feb 27 05:50:19 EST 2018


Dear all,
I am trying to extract the numerical data from the following webpage:
http://www.astro.oma.be/GENERAL/INFO/nzon/zon_2018.html

Using the following code fragment, I have already managed to get a partial result:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# memo: install bs4 as follows: sudo easy_install bs4
# Python 2 script: fetch the page and write its text content to testfile.txt
import io
import urllib2

from bs4 import BeautifulSoup

source = "http://www.astro.oma.be/GENERAL/INFO/nzon/zon_2018.html"
page = urllib2.urlopen(source)
soup = BeautifulSoup(page, 'lxml')
lines = soup.get_text()  # unicode text of the page, including the empty lines
# opening the file with an explicit encoding avoids the encoding problem
# without needing reload(sys) and sys.setdefaultencoding('utf8')
with io.open("testfile.txt", "w", encoding="utf-8") as outfile:
    outfile.write(lines)

I have tried to delete the empty lines from the output, but I am completely stuck at the moment. Can anyone help me further?
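
One possible approach, given only as a minimal sketch: split the text returned by get_text() into lines and keep only the lines that contain something other than whitespace. The helper name remove_empty_lines and the sample string are hypothetical and not part of the script above, and the sketch assumes that an "empty" line means one that is blank or whitespace-only.

```python
def remove_empty_lines(text):
    """Return text with empty and whitespace-only lines dropped."""
    # splitlines() handles \n and \r\n endings; strip() is falsy for blank lines
    kept = [line for line in text.splitlines() if line.strip()]
    return "\n".join(kept)


# Made-up stand-in for the string returned by soup.get_text()
sample = "header\n\n   \n1 2 3\n\n4 5 6\n"
print(remove_empty_lines(sample))
```

In the script, the result of remove_empty_lines(lines) would be written to the file in place of lines. Joining with "\n" drops the trailing newline, so one may need to be added back if the file should end with it.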

Thanks in advance,
jens


