HTML search engine written in Python - is there one?

Michal Wallace (sabren) sabren at manifestation.com
Fri May 19 12:43:38 EDT 2000


On Fri, 19 May 2000, Simon Brunning wrote:

> I need something that will build an index of the text content of a
> number of HTML files, and allow you to run queries on the index.
> Does anyone know of such a thing, or am I going to have to write my
> own?

Well, since you said in your other email that you'd like to tinker with
it, check out http://ransacker.sourceforge.net/ . There's an Index
class that lets you index arbitrary chunks of text, but you'll have
to write the program that actually reads the HTML files (and strips
the HTML tags, if that's what you mean by "text content").
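
Here's a rough sketch of what that reader program might look like, assuming
the standard library's html.parser is good enough for stripping tags. The
names TextExtractor, html_to_text and build_index are made up for
illustration, and the dict-based index is only a stand-in, not ransacker's
Index class:

```python
from collections import defaultdict
from html.parser import HTMLParser
import re


class TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip = 0      # depth of script/style elements we are inside
        self.chunks = []    # text fragments seen so far

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def html_to_text(html):
    """Return the text content of an HTML string with whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


def build_index(pages):
    """pages maps a name to an HTML string; returns word -> {name: count}.

    Hypothetical stand-in for a real indexer.
    """
    index = defaultdict(dict)
    for name, html in pages.items():
        for word in re.findall(r"[a-z0-9]+", html_to_text(html).lower()):
            index[word][name] = index[word].get(name, 0) + 1
    return index
```

To index files on disk you'd read each one into a string first and pass
the resulting mapping of filename to contents into build_index.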

It also does ranked searches, but you'll have to wrap those, too, if
you want the output to show up on the web.
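
For the wrapping part, something along these lines might do as a starting
point. It's only a sketch: ranked_search and results_to_html are
hypothetical names, the index is assumed to be a plain dict of
word -> {document: count}, and the scoring (summed word counts) is a
simple placeholder rather than whatever ranking ransacker uses:

```python
from html import escape


def ranked_search(index, query):
    """Score each document by the summed counts of the query words.

    Returns (name, score) pairs, best first; ties are broken by name.
    """
    scores = {}
    for word in query.lower().split():
        for name, count in index.get(word, {}).items():
            scores[name] = scores.get(name, 0) + count
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


def results_to_html(results):
    """Render (name, score) pairs as an ordered list of links."""
    items = "".join(
        '<li><a href="%s">%s</a> (score %d)</li>'
        % (escape(name, quote=True), escape(name), score)
        for name, score in results
    )
    return "<ol>%s</ol>" % items
```

A CGI script or any other web front end could then print the string
returned by results_to_html inside its page template.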

Cheers,

- Michal
-------------------------------------------------------------------------
http://www.manifestation.com/         http://www.linkwatcher.com/metalog/
-------------------------------------------------------------------------




