[Web-SIG] Support tools for analyzing pages on the Web

Dave Kuhlman dkuhlman at rexx.com
Sat Feb 3 00:39:42 CET 2007


I'd like to implement and explore tools for analyzing Web pages.  I
have in mind things like:

- Tracing links from a Web page.  Building a tree structure of
  links to a specified depth.

- Tracing links to a Web page.  Showing incoming links to a
  specified depth.

- Word count, word frequency analysis, words in context, etc.

- Etc.
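
To make the first item concrete, here is a rough sketch of what I mean by a
tree of outgoing links. It uses only the standard library (html.parser and
urllib.parse). The names LinkCollector, extract_links, link_tree, and the
`fetch` callable are placeholders I made up for illustration, and the pages
are a small in-memory dictionary rather than real sites, so treat it as a
starting point and not a finished crawler.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(urljoin(self.base_url, href))


def extract_links(base_url, html):
    """Return the absolute URLs linked from one page of HTML."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


def link_tree(url, fetch, depth, ancestors=()):
    """Return a nested dict of the pages linked from `url`, `depth` levels deep.

    `fetch` is any callable mapping a URL to HTML text (hypothetical here;
    something built on urllib.request would do for real pages).
    """
    children = {}
    if depth > 0:
        path = ancestors + (url,)
        for link in extract_links(url, fetch(url)):
            # Skip links back to a page on the current path (avoids cycles)
            # and repeated links on the same page.
            if link not in path and link not in children:
                children[link] = link_tree(link, fetch, depth - 1, path)
    return children


# Made-up pages standing in for the Web.
PAGES = {
    "http://example.com/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.com/a": '<a href="/b">B</a> <a href="/">home</a>',
    "http://example.com/b": "",
}


def fetch(url):
    return PAGES.get(url, "")


tree = link_tree("http://example.com/", fetch, 2)
```

Incoming links are the harder direction, since they cannot be read off a
single page. One option would be to crawl a set of pages this way and then
invert the resulting mapping.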

Basically, I'm interested in looking at the structure of the Web
and trying to help make it useful.

So, my question: Are there existing tools (in Python, of course) for
this kind of thing?  I'd like (1) not to reinvent what is already
there and (2) to make use of what already exists.

I've done a few Web searches, but have not found that much of
interest.

I plan to start with BeautifulSoup.py at a minimum.
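
As a baseline for the word count and word frequency item, here is a minimal
sketch that gets by with only the standard library, before bringing in
BeautifulSoup. The names TextCollector and word_frequencies and the sample
HTML are made up for illustration, and the tokenizing rule (lowercase runs
of letters and apostrophes) is just an assumption about what counts as a
word.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Gather page text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def word_frequencies(html):
    """Return a Counter mapping each lowercased word to its count."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    # Assumed definition of a word: a run of letters and apostrophes.
    return Counter(re.findall(r"[a-z']+", text))


# Made-up sample page.
SAMPLE = (
    "<html><head><style>p {color: red}</style></head>"
    "<body><p>The web is a web of links.</p>"
    "<script>var web = 1;</script></body></html>"
)

freq = word_frequencies(SAMPLE)
```

From there, freq.most_common(n) gives the n most frequent words, and the
same list of words could feed a words-in-context listing.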

Thanks for any help.

And, I'd be interested in any ideas and suggestions.

Dave

-- 
Dave Kuhlman
http://www.rexx.com/~dkuhlman

