[Tutor] scraping and saving in file
Steven D'Aprano
steve at pearwood.info
Wed Dec 29 11:38:49 CET 2010
Tommy Kaas wrote:
> I have uploaded a simple table on my web page and try to scrape it and will
> save the result in a text file. I will separate the columns in the file with
> #.
>
> It works fine but besides # I also get spaces between the columns in the
> text file. How do I avoid that?
The print statement puts a space between each output object:
>>> print 1, 2, 3 # Three objects being printed.
1 2 3
To prevent this, use a single output object. There are many ways to do
this; here are three:
>>> print "%d%d%d" % (1, 2, 3)
123
>>> print str(1) + str(2) + str(3)
123
>>> print ''.join('%s' % n for n in (1, 2, 3))
123
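Applied to the '#' separator from your question, the same idea might look like the sketch below. The cell values are made up for illustration, and the code is written so that it runs unchanged on Python 2 and Python 3:

```python
# Hypothetical cell values standing in for one scraped table row.
cells = ["a", "b", "c", "d"]

# What `print cells[0], "#", cells[1]` would show: print inserts a
# space between every pair of objects it is given.
spaced = " ".join([cells[0], "#", cells[1]])

# Joining first produces a single string, so there is nothing for
# print to put spaces between.
line = "#".join(cells)
print(line)
```

The point is that the separator is supplied by join, not by print, so only the characters you asked for end up in the output.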
But in your case, the best way is not to use print at all. You are
writing to a file -- write to the file directly, don't mess about with
print. Untested:
import urllib2
from BeautifulSoup import BeautifulSoup  # assumes BeautifulSoup 3 on Python 2

f = open('tabeltest.txt', 'w')
url = 'http://www.kaasogmulvad.dk/unv/python/tabeltest.htm'
soup = BeautifulSoup(urllib2.urlopen(url).read())
rows = soup.findAll('tr')
for tr in rows:
    cols = tr.findAll('td')
    # Join the first four cells into one string, with no spaces around '#'.
    output = "#".join(cols[i].string for i in (0, 1, 2, 3))
    f.write(output + '\n')  # don't forget the newline after each row
f.close()
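If you want to try the file-writing part without fetching the page, here is a minimal sketch that uses made-up rows in place of the scraped cells. The function name write_rows and the sample data are my own inventions, not part of your script. It also replaces a missing cell value with an empty field, on the assumption that .string can come back as None for a cell that does not hold exactly one piece of text:

```python
import os
import tempfile


def write_rows(path, rows, sep="#"):
    """Write each row as one line, cells joined by sep with no padding."""
    with open(path, "w") as f:
        for cells in rows:
            # Treat a missing cell value (None) as an empty field.
            f.write(sep.join(cell or "" for cell in cells) + "\n")


# Made-up rows standing in for the text of each <td> in the table.
rows = [["1", "2", "3", "4"], ["a", None, "c", "d"]]
path = os.path.join(tempfile.mkdtemp(), "tabeltest.txt")
write_rows(path, rows)

# Read the file back to check what was written.
with open(path) as f:
    content = f.read()
print(content)
```

Using the with statement closes the file even if one of the rows raises an error, which the explicit f.close() above does not guarantee.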
--
Steven