internet searching program

greg greg at cosc.canterbury.ac.nz
Sat Aug 9 21:12:53 EDT 2008


Michael Tobis wrote:
> I think you are talking about "screen scraping".
> 
> Your program can get the html for the page, and search for an
> appropriate pattern.

However, it wouldn't be "really fast", because you
still have to fetch all the pages that might contain
data you're looking for.

Google searches are fast because Google has already
fetched and indexed the pages ahead of time, so a
query only has to consult the index.

You might get somewhere using a program that does
a site-specific google search to find potentially
relevant pages, then goes and looks at those pages
for further information.
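
As a rough sketch of the second half of that idea (looking at
the candidate pages once you have their URLs), something like
the following might do. How you obtain the URL list is left
open, since it depends on the search service you use. The names
fetch_html and find_matches are made up for this example, and
the regular expression is whatever pattern fits your data.

```python
import re
import urllib.request


def fetch_html(url, timeout=10):
    """Download a page and return its text (undecodable bytes are replaced)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def find_matches(urls, pattern, fetch=fetch_html):
    """Return {url: [matched strings]} for every page where the pattern occurs.

    `fetch` is a parameter so the network step can be swapped out,
    e.g. for a cache or for testing (an assumption of this sketch).
    """
    regex = re.compile(pattern, re.IGNORECASE)
    results = {}
    for url in urls:
        try:
            html = fetch(url)
        except OSError:
            continue  # skip pages that cannot be fetched
        hits = [m.group(0) for m in regex.finditer(html)]
        if hits:
            results[url] = hits
    return results
```

Each URL still costs one HTTP request, so this is only as fast
as the list of candidate pages is short.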

Another possibility might be to crawl the site and
build your own index based on the information you're
interested in.
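
A minimal sketch of that approach, assuming a small site and
using only the standard library, could look like this. The
names build_index and lookup are invented for the example, the
index is a plain word-to-URLs dictionary, and the crawl ignores
robots.txt and rate limiting, which a real crawler should not.

```python
import re
import urllib.request
from collections import defaultdict, deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class _PageParser(HTMLParser):
    """Collects link targets and text nodes from one HTML page.

    For simplicity, script and style content is treated as text too.
    """

    def __init__(self):
        super().__init__()
        self.links = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        self.text.append(data)


def fetch_html(url, timeout=10):
    """Download a page and return its text (undecodable bytes are replaced)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def build_index(start_url, max_pages=50, fetch=fetch_html):
    """Breadth-first crawl of one host; returns {word: set of URLs}."""
    host = urlparse(start_url).netloc
    index = defaultdict(set)
    seen = {start_url}
    queue = deque([start_url])
    fetched = 0
    while queue and fetched < max_pages:
        url = queue.popleft()
        try:
            html = fetch(url)
        except OSError:
            continue  # skip pages that cannot be fetched
        fetched += 1
        parser = _PageParser()
        parser.feed(html)
        # Index every word found in the page's text nodes.
        for word in re.findall(r"\w+", " ".join(parser.text).lower()):
            index[word].add(url)
        # Queue links that stay on the same host and are not yet seen.
        for href in parser.links:
            link = urldefrag(urljoin(url, href)).url
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return index


def lookup(index, *words):
    """Return the set of URLs whose text contains every given word."""
    sets = [index.get(w.lower(), set()) for w in words]
    return set.intersection(*sets) if sets else set()
```

The crawl is the slow part and only has to be done once; after
that, lookup(index, "some", "words") is just a dictionary
lookup, which is where the speed comes from.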

-- 
Greg



More information about the Python-list mailing list