Suitable Python code to scrape specific details from web pages.

Roy Smith roy at panix.com
Tue Aug 12 20:30:39 EDT 2014


In article <53eaab7d$0$29979$c3e8da3$5496439d at news.astraweb.com>,
 Steven D'Aprano <steve+comp.lang.python at pearwood.info> wrote:

> By studying how other scraping programs work, and studying how your racing
> pages store data, you should be able to put the two together and see how to
> get the data you want.

It's also worth mentioning that some web sites *want* you to have their 
data, and make it easy to do so by exposing it via public APIs or other 
download methods: Wikipedia, many government web sites, Twitter, 
Facebook, Reddit.

Whenever you start thinking about web scraping, it's always worth 
spending a little time investigating whether such an API exists.  If it 
does, that's where you want to go.  If not, well, there's always 
Beautiful Soup :-)
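
To make the "check for an API first" idea concrete, here is a rough 
sketch of asking Wikipedia's MediaWiki API for the plain-text intro of 
an article instead of scraping the HTML.  The function names, the 
User-Agent string and the "Horse racing" title are my own placeholders, 
and I'm assuming the default JSON layout where pages are keyed by page 
id.  Treat it as a starting point, not as the one right way to do it.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = "https://en.wikipedia.org/w/api.php"


def build_query_url(title):
    """Build a query URL asking for the plain-text intro of one article."""
    params = {
        "action": "query",
        "prop": "extracts",
        "exintro": 1,       # only the section before the first heading
        "explaintext": 1,   # plain text rather than HTML
        "format": "json",
        "titles": title,
    }
    return API_URL + "?" + urlencode(params)


def extract_intro(payload):
    """Pull the first extract out of a decoded API response, or None."""
    # Assumed layout: {"query": {"pages": {"<pageid>": {"extract": ...}}}}
    pages = payload.get("query", {}).get("pages", {})
    for page in pages.values():
        if "extract" in page:
            return page["extract"]
    return None


def fetch_intro(title, timeout=10):
    """Fetch and decode the intro text for `title` (needs network access)."""
    # Placeholder User-Agent; put your own contact details here.
    headers = {"User-Agent": "example-script/0.1 (you@example.com)"}
    request = Request(build_query_url(title), headers=headers)
    with urlopen(request, timeout=timeout) as response:
        return extract_intro(json.load(response))


if __name__ == "__main__":
    print(fetch_intro("Horse racing"))
```

Keeping the URL building and the response parsing separate from the 
network call means you can test the parsing against a saved response 
without hitting the site every time.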


