[BangPypers] Harvestman error

JAGANADH G jaganadhg at gmail.com
Mon May 31 12:01:38 CEST 2010


On Mon, May 31, 2010 at 3:16 PM, Anand Balachandran Pillai
<abpillai at gmail.com> wrote:

> On Sun, May 30, 2010 at 9:56 PM, JAGANADH G <jaganadhg at gmail.com> wrote:
>
> > Dear All, I was trying to run Harvestman (a Python tool for web
> > harvesting).
> > I got the following error:
> > http://pastebin.com/uPzUs0Xw
> >
> > My configuration file is http://pastebin.com/dfhiy2Q6
> >
> > Can anybody help me with this?
> >
> > I was trying to harvest my blog with the word filter 'Python'.
> >
>
>  There is no word filter anymore. You hit upon a bug which seems to
>  still apply the word-filter code :)
>
>  For filtering based on words or regular expressions on the page content,
>  you can implement a custom crawler. It is pretty easy and a sample
>  already exists. Just modify the code to suit the keyword(s) you want
>  to filter.
>
>  Look for "searchingcrawler.py" inside the apps/samples folder and
>  modify the code.
>
>
Thanks, Anand.
I will try this.
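
For reference, the kind of content filtering described above can be
sketched in plain Python. This is only an illustration of keyword
matching with the standard "re" module. It does not use HarvestMan's API
or the code in searchingcrawler.py; the function name matches_keywords
and the sample pages and URLs are made up for the example.

```python
import re


def matches_keywords(page_text, keywords):
    """Return True if any keyword occurs as a whole word in page_text.

    Matching is case-insensitive. Hypothetical helper, not part of HarvestMan.
    """
    if not keywords:
        return False
    # Escape each keyword so it is matched literally, then join as alternatives.
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(k) for k in keywords) + r")\b",
        re.IGNORECASE,
    )
    return pattern.search(page_text) is not None


# Made-up page contents keyed by made-up URLs, standing in for crawled pages.
pages = {
    "http://example.com/post1": "Notes on Python and web harvesting",
    "http://example.com/post2": "A post about cooking",
}

# Keep only the pages whose text mentions the keyword.
kept = [url for url, text in pages.items() if matches_keywords(text, ["Python"])]
```

A custom crawler would presumably apply a check like this to each
fetched page before deciding whether to save it.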


-- 
**********************************
JAGANADH G
http://jaganadhg.freeflux.net/blog

