[Tutor] parsing search engine keywords from search referrer log

Danny Yoo dyoo at hkn.eecs.berkeley.edu
Wed Oct 15 17:40:41 EDT 2003



On Wed, 15 Oct 2003, Jaco Smuts (ZA) wrote:

> I'm about to start work on a little program to assist me with analyzing
> my web server logs (in mysql using mod_log_sql).
>
> I want to parse out Search phrases from the referrer fields (where
> referrer is a search engine). Does any one have any ideas on how best to
> approach this, or some code that already does this ?

Hi Jaco,

Do you have examples of some of what these referrer fields look like?

I did a quick Google search: it does look like the Python Community Server
Project

    http://pycs.sourceforge.net/

is writing some interesting software.  For example, they have a
referrer-log analyzer:

    http://cvs.sourceforge.net/viewcvs.py/*checkout*/pycs/pycs/analyse_logs.py?content-type=text%2Fplain&rev=1.9

Their approach appears to be to use regular expressions to parse the
request message in their logs.  I'm not sure how easy it would be to adopt
their code exactly, but the idea seems right.
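
To make the regular-expression idea concrete, here is a rough sketch.  It
is not the PyCS code, just an illustration.  It assumes the search phrase
travels in a query parameter named "q" or "p"; that is a guess that fits
several engines, and other engines may use different names.  The function
name search_phrase() and the pattern are made up for this example.

```python
import re
from urllib.parse import unquote_plus

# Assumption: the search phrase is carried in a "q" or "p" query parameter.
# Other search engines may use different parameter names.
SEARCH_PARAM = re.compile(r"[?&](?:q|p)=([^&#]*)")


def search_phrase(referrer):
    """Return the decoded search phrase from a referrer URL, or None."""
    match = SEARCH_PARAM.search(referrer)
    if match is None:
        # No recognized search parameter in this referrer.
        return None
    # unquote_plus turns "+" into spaces and decodes %XX escapes.
    return unquote_plus(match.group(1))


if __name__ == "__main__":
    print(search_phrase("http://www.google.com/search?q=python+tutor&hl=en"))
```

A referrer without one of those parameters gives back None, so rows that
did not come from a search can be skipped.  Since any site can have a "q"
parameter, it would probably also be worth checking the host name against
a list of known search engines before trusting the result.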

Are you familiar with regular expressions?  If not, there's an
introduction to them here:

    http://www.amk.ca/python/howto/regex/

Please feel free to ask questions about them on Tutor; we'll be happy to
help.  Good luck to you!



