[Pydotorg-redesign] how to search the site
Simon Willison
cs1spw at bath.ac.uk
Sat Sep 13 21:35:47 EDT 2003
Barry Warsaw wrote:
> If I were to cast my vote <wink> I'd go for the thing that takes the
> least amount of effort to set up and maintain, that doesn't suck. Bonus
> points if we include the mailing list archives as a search corpus.
I just took a look at the mailing list archives and they total just over
700 MB(!) - the largest is Python-Dev at 111 MB. Loading that lot into
a search engine could be a painful task. It looks like Google has
indexed them all (incredibly) so a targeted Google search limited to
the mail.python.org domain would probably suffice for mailing lists.
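
For what it's worth, a domain-limited search can be expressed with
Google's "site:" operator in an ordinary query string. A rough sketch
only: the helper name site_search_url is made up for illustration, and I
am assuming the usual www.google.com/search?q=... URL form.

```python
from urllib.parse import urlencode


def site_search_url(query, domain="mail.python.org"):
    """Build a Google search URL restricted to one domain.

    Hypothetical helper: it just appends a site: restriction to the
    user's query and URL-encodes the result.
    """
    params = {"q": f"{query} site:{domain}"}
    return "https://www.google.com/search?" + urlencode(params)


if __name__ == "__main__":
    print(site_search_url("new-style classes"))
```

A search form on the site would only need to redirect to that URL, so
there is nothing to index or maintain on our side.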
I still think there is a big advantage to be had in rolling a custom
search engine for the site though - the ability to highlight certain
site areas for specific keywords for example. I wonder if it would be
possible to use the Google web services API to power a Python.org search
engine? The API terms and conditions (www.google.com/apis/api_terms.html)
say this:
"""
The Google Web APIs service is made available to you for your personal,
non-commercial use only (at home or at work) [ ... ] And you may not use
the search results provided by the Google Web APIs service with an
existing product or service that competes with products or services
offered by Google.
"""
I have no idea if a search engine for Python.org would count as
"competing with products or services offered by Google". If it doesn't,
a Google API powered search engine would give us all of the benefits of
Google while still allowing the Python site to apply a custom template
to the results and other enhancements (such as recommended site areas for
specific keywords).
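
To make the "recommended site areas" idea concrete, here is a minimal
sketch of how it might work, independent of whichever engine supplies
the main results. The table entries and paths are invented for
illustration and are not the real site layout.

```python
# Hypothetical keyword table: the keywords and paths are examples only.
RECOMMENDED_AREAS = {
    "download": "/download/",
    "docs": "/doc/",
    "documentation": "/doc/",
    "pep": "/peps/",
}


def recommended_areas(query, table=RECOMMENDED_AREAS):
    """Return site areas to highlight for a query, in order, no duplicates."""
    areas = []
    for word in query.lower().split():
        area = table.get(word)
        if area and area not in areas:
            areas.append(area)
    return areas


if __name__ == "__main__":
    print(recommended_areas("Download the docs"))
```

The recommended areas would be shown above the ordinary results, and
the table could be maintained by hand since it only needs a handful of
entries.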
Cheers,
Simon