[Tutor] scraping and saving in file

Steven D'Aprano steve at pearwood.info
Wed Dec 29 11:38:49 CET 2010


Tommy Kaas wrote:

> I have uploaded a simple table on my web page and try to scrape it and will
> save the result in a text file. I will separate the columns in the file with
> #.
> 
> It works fine but besides # I also get spaces between the columns in the
> text file. How do I avoid that?

The print statement puts a space between each output object:

 >>> print 1, 2, 3  # Three objects being printed.
1 2 3

To prevent this, use a single output object. There are many ways to do 
this; here are three:

 >>> print "%d%d%d" % (1, 2, 3)
123
 >>> print str(1) + str(2) + str(3)
123
 >>> print ''.join('%s' % n for n in (1, 2, 3))
123
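
Applied to your case, the same idea means building the whole line, 
separators included, as one string before you output it. A small sketch 
(the cell values here are made up for illustration, they are not from 
your table):

```python
# Hypothetical cell values standing in for one row of the table.
cells = ['1', 'Tommy', 'Kaas', 'DK']

# join() puts '#' between the items and nothing else, so no stray spaces.
line = '#'.join(cells)

# If the items are not already strings, convert them first.
numbers_line = '#'.join(str(n) for n in (1, 2, 3))
```

Here line is '1#Tommy#Kaas#DK' and numbers_line is '1#2#3'.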


But in your case, the best way is not to use print at all. You are 
writing to a file -- write to the file directly, don't mess about with 
print. Untested:


import urllib2
from BeautifulSoup import BeautifulSoup  # assuming BeautifulSoup 3

f = open('tabeltest.txt', 'w')
url = 'http://www.kaasogmulvad.dk/unv/python/tabeltest.htm'
soup = BeautifulSoup(urllib2.urlopen(url).read())
rows = soup.findAll('tr')
for tr in rows:
    cols = tr.findAll('td')
    # Build each row as a single string, with '#' between the four cells.
    output = "#".join(cols[i].string for i in (0, 1, 2, 3))
    f.write(output + '\n')  # don't forget the newline after each row
f.close()
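
Two things might trip this up, depending on your table: a row with no 
<td> cells (a header row made of <th>, say) has no cols[0], and .string 
gives None for a cell that contains more than one child. Here is a rough 
sketch of one way to guard against both. The name write_rows and the 
sample data are my own invention, and it takes plain lists of cell text 
so you can try it without fetching the page:

```python
import io


def write_rows(rows, out, sep='#'):
    """Write each row to the file-like object `out` as one sep-joined line."""
    for cells in rows:
        if not cells:
            continue  # skip rows with no data cells, e.g. a header row
        # Treat None (a cell with no single string) as an empty field.
        fields = ['' if c is None else c.strip() for c in cells]
        out.write(sep.join(fields) + '\n')


# Hypothetical rows: an empty header row, a row with a None cell, a normal row.
rows = [[], ['1', ' a ', None, 'x'], ['2', 'b', 'c', 'y']]
buf = io.StringIO()
write_rows(rows, buf)
result = buf.getvalue()
```

With the real page you would pass the open file instead of buf, and 
build each inner list from the .string of the cells in that row.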



-- 
Steven


