[newbie] Is Python what I'm looking for?

Cameron Laird claird at starbase.neosoft.com
Fri May 24 22:40:39 EDT 2002


In article <19eteugnbcppsh0s1kj7gvjr3jvtvfldt2 at 4ax.com>,
Giulio Cespuglio  <giulio.agostini.remove.this at libero.it> wrote:
>Hi there,
>
>My aim is to automatically get specific pieces of information from a
>website, simulating the behaviour of a user filling in HTML forms and
>clicking buttons (a web robot?), then embed them in my HTML page.
>In other words, the pages I need to access are not accessible from a
>standard URL.
>The other part of the problem is of course parsing the resulting HTML
>and extracting the pieces of info I need.
>
>Does Python provide libraries that could help me? Could you please
>give me some keywords/pointers? I'm completely new to Python.
>I would of course set up my web server under windows (Apache?) and the
>necessary plugin.
>
>Can you think of a better way of doing this? Another scripting
>language perhaps?
			.
			.
			.
Most of the common scripting languages are comparably
adept at this, which I most often call "Web scraping".
I don't know of anyone who's written an effective tutorial
on Web scraping; I outline architectural aspects in
<URL: http://cedar.intel.com/cgi-bin/ids.dll/content/content.jsp?cntKey=GenericEditorial::ws_scraping&cntType=IDS_EDITORIAL&catCode=CJA >.

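The pages you want might well be accessible from a
conventional URL, although this isn't always apparent:
a form submitted with GET, for instance, simply encodes
its fields in the query string of the URL it requests.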
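
To make that concrete, here is a minimal sketch of how a form's
fields can be turned into an ordinary URL (for a GET form) or a
request body (for a POST form) with Python's standard library.
The action URL and the field names "q" and "page" are made-up
placeholders, not taken from any real site; you would read the
real ones out of the <form> element of the page in question.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_get_url(action, fields):
    """Return the URL a browser would request for a GET form submission."""
    return action + "?" + urlencode(fields)


def build_post_request(action, fields):
    """Return a Request that carries the same fields as a POST body."""
    body = urlencode(fields).encode("ascii")
    # Supplying data makes urllib treat the request as a POST.
    return Request(action, data=body)


# Hypothetical form: the action URL and field names are placeholders.
url = build_get_url("http://example.com/search",
                    {"q": "web scraping", "page": "2"})
req = build_post_request("http://example.com/search",
                         {"q": "web scraping"})
```

Passing either the URL or the Request object to
urllib.request.urlopen() would then fetch the result page, much
as if a user had filled in the form and pressed the button.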

Python's a fine language to use for such automation.
While there are also special-purpose languages that make
such chores even easier, they're available only for
stiff license fees.
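
For the parsing half of the problem, one possible starting point
is the standard library's HTML parser. The sketch below assumes,
purely for illustration, that the information you want sits in
table cells (<td> elements); the class name CellExtractor, the
helper extract_cells, and the sample markup are all invented
here, and a real page will need its own extraction rules.

```python
from html.parser import HTMLParser


class CellExtractor(HTMLParser):
    """Collect the text found inside every <td> element of a page."""

    def __init__(self):
        super().__init__()
        self.cells = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased, so "TD" and "td" both match.
        if tag == "td":
            self._in_cell = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        # Text may arrive in several pieces; append to the current cell.
        if self._in_cell:
            self.cells[-1] += data


def extract_cells(html):
    """Return the stripped text of each table cell in the given HTML."""
    parser = CellExtractor()
    parser.feed(html)
    parser.close()
    return [cell.strip() for cell in parser.cells]


# Made-up sample standing in for a fetched result page.
sample = "<table><tr><td>Python</td><td> 2.2 </td></tr></table>"
cells = extract_cells(sample)
```

The extracted strings can then be written into your own HTML
page with ordinary string formatting.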
-- 

Cameron Laird <Cameron at Lairds.com>
Business:  http://www.Phaseit.net
Personal:  http://starbase.neosoft.com/~claird/home.html


