[Tutor] Guidance if possible

Alan Gauld alan.gauld at btinternet.com
Fri Apr 12 01:43:59 CEST 2013


On 11/04/13 23:33, Scurvy Scott wrote:

> the other for something like this. I have no intention of doing
> anything professional/shady/annoying with this code and want to write
> it purely for my own amusement as well as to learn and obviously to
> perhaps win something cool.

Which is fine, but you should still check the terms and conditions of the
web sites, because many such sites explicitly prohibit the use of web
scrapers. Using one could disqualify you from winning, and disguising
the fact that you are using one is non-trivial.

> Every day the program visits the site and scrapes the links for all the contests.
> The program visits each contest page and verifies there is an entry
> form, indicating that the contest is active
> If the contest is active at that moment, it adds the title of the page
> to a text file, if the contest is inactive it adds the title of the
> page to a text file.
> If the contest is active, it fills out the form with my details and sends it off
> If the contest is inactive the title of the page is added to the
> permanently blacklisted text file and never messed with again.
>
> This might be a bit convoluted as well and any pointers are appreciated.


Seems reasonable to me.
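
As a rough sketch only, the daily loop you describe could be laid out
something like this. The file names and the has_entry_form/submit_entry
callables are placeholders I made up for illustration; you would supply
the real fetching and form-filling code yourself.

```python
from pathlib import Path


def load_titles(path):
    """Return the set of titles stored one per line in path (empty if missing)."""
    p = Path(path)
    if not p.exists():
        return set()
    lines = p.read_text(encoding="utf-8").splitlines()
    return {line.strip() for line in lines if line.strip()}


def append_title(path, title):
    """Append a single title to the text file at path."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(title + "\n")


def daily_run(contests, has_entry_form, submit_entry,
              active_file="active.txt", blacklist_file="blacklist.txt"):
    """Process one day's contests and return the titles that were entered.

    contests is an iterable of (title, url) pairs scraped from the site.
    has_entry_form(url) -> bool and submit_entry(url) are hypothetical
    hooks supplied by the caller.
    """
    blacklist = load_titles(blacklist_file)
    entered = []
    for title, url in contests:
        if title in blacklist:
            continue  # blacklisted contests are never visited again
        if has_entry_form(url):
            append_title(active_file, title)
            submit_entry(url)
            entered.append(title)
        else:
            append_title(blacklist_file, title)
            blacklist.add(title)
    return entered
```

Passing the two hooks in as arguments means you can test the
bookkeeping with dummy functions before going anywhere near the
real site.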

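Try looking at the http, urllib and cookie modules in the stdlib
(http.client, urllib.request and http.cookiejar in Python 3; httplib,
urllib2 and cookielib in Python 2).
Then look at tools like Beautiful Soup and ElementTree for the
content scraping bits.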
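
To give a flavour of the scraping side, here is a minimal sketch in
Python 3 that uses only the stdlib. Note that it uses html.parser
rather than Beautiful Soup or ElementTree, purely so it runs without
installing anything; Beautiful Soup is generally more convenient for
messy real-world HTML. The names PageScanner, scan_html and fetch are
my own, and treating "page contains a <form> tag" as "contest is
active" is an assumption you would need to check against the real
pages.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class PageScanner(HTMLParser):
    """Collect link targets, the page title, and whether a <form> is present."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self.has_form = False
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, href))
        elif tag == "form":
            self.has_form = True
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def scan_html(html, base_url):
    """Parse an HTML string and return the populated PageScanner."""
    scanner = PageScanner(base_url)
    scanner.feed(html)
    scanner.close()
    return scanner


def fetch(url):
    """Download a page as text, assuming UTF-8 if no charset is declared."""
    with urlopen(url) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")
```

You would call scan_html(fetch(url), url) on the contest index page
to get the links, then again on each contest page to read its title
and has_form flag.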


-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/


