[Tutor] Retrieving data from a web site
Dave Angel
davea at davea.name
Sat May 18 04:49:56 CEST 2013
On 05/17/2013 07:57 PM, Phil wrote:
> I'd like to "download" eight digits from a web site where the digits are
> stored as individual graphics. Is this possible, using perhaps, one of
> the countless number of Python modules? Is this the function of a web
> scraper?
>
Anything's possible. But if these "digits" are purposely hard to read,
perhaps to deter spammers, then the likelihood of your reading them
algorithmically is vanishingly small. "Captcha" pictures are the usual
example.

There are libraries to "scrape" textual information from a web page,
no sweat. But that information might not even point directly to the 8
image files. There could be many layers of indirection, through
JavaScript and other tricks.
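
As a rough illustration of the easy case, here is a minimal sketch that uses only the standard library to list the image URLs named in a page's HTML. It assumes the page puts the digits in plain <img> tags. The names ImgSrcCollector, find_image_urls and SAMPLE_PAGE, and the example.com address, are made up for the example. If the images are inserted by JavaScript, this will not find them.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImgSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag, in document order."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url  # used to resolve relative links
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:  # skip <img> tags that carry no src attribute
                self.sources.append(urljoin(self.base_url, src))


def find_image_urls(html_text, base_url):
    """Return absolute URLs for all <img> tags found in html_text."""
    parser = ImgSrcCollector(base_url)
    parser.feed(html_text)
    parser.close()
    return parser.sources


# Hypothetical page content standing in for the real site.
SAMPLE_PAGE = """
<html><body>
<img src="/digits/a1.png"><img src="digits/b2.png">
<img alt="no source here">
</body></html>
"""

urls = find_image_urls(SAMPLE_PAGE, "http://example.com/counter/")
for url in urls:
    print(url)
```

For a live page you would first fetch the HTML (for instance with urllib.request.urlopen) and pass the decoded text to find_image_urls.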

But most importantly, if the images are deliberately distorted parodies
of digits, most of us would be stymied, and I don't know of any library
anywhere that's intended to "break" such coding.

As a result, I'd recommend starting with the image reading itself. Visit
the page in a regular browser, use screen capture techniques to capture
each of the displayed images, and try to decode them. If you have no luck
with those, there's no point in writing the rest of the code, which could
be anything from easy to very hard.
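
If the digits turn out not to be distorted at all, one simple thing to try is a lookup table. The sketch below assumes the site serves a byte-identical image file every time it shows a given digit, and that you have saved the files themselves (not cropped screenshots, which rarely match byte for byte) and labelled one sample of each digit by eye. The function names and the stand-in byte strings are invented for the example. It will not help with captcha-style images.

```python
import hashlib


def fingerprint(image_bytes):
    """Return a stable identifier for the exact contents of an image file."""
    return hashlib.sha256(image_bytes).hexdigest()


def build_digit_table(labelled_samples):
    """Map fingerprints to digits.

    labelled_samples is an iterable of (digit_char, image_bytes) pairs
    that a person has identified by looking at each image.
    """
    return {fingerprint(data): digit for digit, data in labelled_samples}


def read_digits(images, table):
    """Return the digit string, with '?' for any image not in the table."""
    return "".join(table.get(fingerprint(data), "?") for data in images)


# Stand-in byte strings; real use would read the saved image files
# in binary mode, e.g. open(path, "rb").read().
samples = [("3", b"fake-png-bytes-for-3"), ("7", b"fake-png-bytes-for-7")]
table = build_digit_table(samples)

captured = [b"fake-png-bytes-for-7", b"fake-png-bytes-for-3", b"unknown"]
result = read_digits(captured, table)
print(result)
```

Any '?' in the result means the image did not match a known sample exactly, which is the signal that this approach is not enough for that site.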
--
DaveA