[Tutor] python module to search a website

Alan Gauld alan.gauld at btinternet.com
Sun Feb 27 10:07:51 CET 2011


"vineeth" <vineethrakesh at gmail.com> wrote

> looking for scraping. I am looking to obtain the html page that my
> query is going to return.

I'm still not completely sure what you mean.
What "query" are you talking about? The HTTP GET request?
Or a query transaction on the remote site?

> Just like when you type in a site like Amazon you
> get a bunch of product listing

When I visit Amazon I get a home page which has
a bunch of products on it. Those products are provided
by Amazon's web application and I have no control over it.
If I type a string into the search box, Amazon's app goes
off to search their database and returns a bunch of links.
Again, I have no control over which links it returns;
that is done by Amazon's application logic.

> the module has to search the website and
> return the html link.

It is impossible for any Python module to search a
remote website; that can only be done by code on
that website's server. The best a Python module could
do would be to initiate the search by posting the
appropriate search string. But that is just standard
HTML parsing and urllib.
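
As a rough sketch of what "initiating the search" could look like
with the Python 3 urllib modules: the address
http://www.example.com/search and the parameter name "q" below are
placeholders I made up, so you would have to look at the real site's
search form to find the right values. The function names are mine too.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_search_url(base_url, query, param="q"):
    """Return the GET URL that submits `query` to a site's search form.

    `param` is the name of the form field; "q" is only a guess and
    differs from site to site.
    """
    return base_url + "?" + urlencode({param: query})


def fetch_results_page(base_url, query, param="q", timeout=10):
    """Send the request and return the HTML the server sends back.

    The search itself is run by the server. All this code does is
    ask for the results page and decode the response.
    """
    url = build_search_url(base_url, query, param)
    with urlopen(url, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


if __name__ == "__main__":
    # Hypothetical site and form field, for illustration only.
    print(build_search_url("http://www.example.com/search", "python books"))
```

Calling fetch_results_page() gives you back whatever page the server
chooses to return. Pulling the links out of that page is then a
separate HTML parsing job.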

If I understand what you are asking for then I think
it is impossible. And I suspect you are a bit confused
about how web sites work. As a user of a web site
you are reliant on the functions provided by the server.

If the web site is purely static, like my tutorial for
example, you could do a search if you knew the
file structure and had access to the folders where
the HTML is stored. But when the pages are created
dynamically, as on Amazon, eBay etc., it is
impossible to search the site yourself. You would need
access to their database.
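
For the static case, a minimal sketch might look like the following.
It assumes you have the site's files in a local folder and that they
are plain .html/.htm files readable as UTF-8. The function name
search_static_site is just something I made up for the example.

```python
import os


def search_static_site(root_dir, term):
    """Return relative paths of HTML files under root_dir containing term.

    The match is a simple case-insensitive substring test on the raw
    file text, so it will also match text inside tags and attributes.
    """
    needle = term.lower()
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in filenames:
            if not name.lower().endswith((".html", ".htm")):
                continue  # skip images, stylesheets and so on
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as f:
                if needle in f.read().lower():
                    matches.append(os.path.relpath(path, root_dir))
    return sorted(matches)
```

Note that this only works because the files are sitting on your own
disk. Nothing like it is possible against a site whose pages are
generated on demand from a database you cannot see.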

HTH,

-- 
Alan Gauld
Author of the Learn to Program web site
http://www.alan-g.me.uk/



