[Mailman-Users] Using robots.txt

Steff Watkins s.watkins at nhm.ac.uk
Mon Nov 9 11:56:15 CET 2009


> -----Original Message-----
> From: mailman-users-bounces+s.watkins=nhm.ac.uk at python.org 
> [mailto:mailman-users-bounces+s.watkins=nhm.ac.uk at python.org] 
> On Behalf Of Max Pyziur
> Sent: 09 November 2009 01:09
> To: mailman-users at python.org
> Subject: [Mailman-Users] Using robots.txt
> 
> Greetings
> 
> Mailman's email lists are visible to search engine spiders from 
> http://www.somedomain.com/pipermail/emaillistname/etc
> 
> However, /pipermail is an alias of /var/mailman/archives/public/
> 
> per mailman.conf
> 
> I've tried placing a basic robots.txt file at 
> /var/mailman/archives/public/ and set permissions to 644. 
> However, my lists still get spidered.
> 
> Any suggestions on where to place the robots.txt file to 
> prevent spidering?
> 
> Thanks!
> 
> Max Pyziur
> pyz at brama.com

Hello Max,

 AFAIK your robots.txt file should be in the TOP-level directory of your
website, so that it is browsable via
http://www.somedomain.com/robots.txt . That is the standard location,
and 'good' spiders will look for it there; a robots.txt placed further
down the tree (e.g. in /var/mailman/archives/public/) will simply be
ignored.

It should contain the allow/deny details for the whole of your website
and in your case would look something like this:

----
User-agent: *
Disallow: /pipermail/
----

.. which tells all robots (that's spiders, not browsers) not to fetch
any URL whose path starts with "/pipermail/".
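If you want to sanity-check the rules before waiting for the crawlers
to come back, Python's standard urllib.robotparser module can parse the
same two lines locally. This is just a quick sketch; the somedomain.com
URLs are the placeholder ones from your message:

```python
from urllib.robotparser import RobotFileParser

# The same rules as the robots.txt above, fed to the parser directly
# instead of being fetched over HTTP.
rules = [
    "User-agent: *",
    "Disallow: /pipermail/",
]

rp = RobotFileParser()
rp.parse(rules)

# Archive pages under /pipermail/ should be blocked for any crawler,
# while the rest of the site stays fetchable.
print(rp.can_fetch("*", "http://www.somedomain.com/pipermail/emaillistname/"))
print(rp.can_fetch("*", "http://www.somedomain.com/index.html"))
```

Running that should print False for the archive URL and True for the
ordinary page, which is exactly the behaviour a well-behaved spider
will follow.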

Give that a whirl and see how it does.

Regards,
Steff Watkins
=======