[Ncr-Python.in] developing a data mining tool, along with a webserver

aruntheguy at gmail.com
Thu Feb 24 19:32:44 CET 2011


Hi,

I have been doing data mining and scraping work in Python for some time. I do
not know what kind of documents you are going to handle. Since you mentioned a
webserver, I assume they will mostly be HTML and XML documents (maybe I am
wrong). If that is the case, BeautifulSoup is an important library for
traversing the DOM.
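
As a rough sketch of what traversing a document with BeautifulSoup looks like:
this assumes the current bs4 package (installed with "pip install
beautifulsoup4") and Python's built-in "html.parser" backend. The sample HTML
and the function name extract_links are made up for illustration and are not
from any real project.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page; in practice this would come from a file or a web request.
SAMPLE_HTML = """
<html><body>
  <h1>Reports</h1>
  <ul>
    <li><a href="/reports/2010.html">Annual report 2010</a></li>
    <li><a href="/reports/2011.html">Annual report 2011</a></li>
    <li><a name="no-href">Not a link</a></li>
  </ul>
</body></html>
"""


def extract_links(html):
    """Return (link text, href) pairs for every <a> tag that has an href."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        (a.get_text(strip=True), a["href"])
        for a in soup.find_all("a", href=True)
    ]


if __name__ == "__main__":
    for text, href in extract_links(SAMPLE_HTML):
        print(text, href)
```

The same find_all() approach works for XML, although a dedicated XML parser
backend is usually a better choice there.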

There is also a website, http://www.scraperwiki.com, which supports scraping
data from the web and storing the data you collect in their online datastore.
You can license the data as you wish. It supports multiple languages such as
Python, PHP, and Ruby. The site is helpful because it saves you the burden of
setting up your own database. If you need the data in other applications, they
provide an API for datastore access.

I hope I guessed your tool correctly and that this information is useful.

--
P.Arunmozhi
Twitter: @tecoholic
Website: http://www.arunmozhi.in

On Thu, Feb 24, 2011 at 5:03 PM, satyaakam goswami <satyaakam at gmail.com> wrote:

>  On Thu, Feb 24, 2011 at 1:30 PM, vijay shanker <deontics at gmail.com> wrote:
>>>>
>>>>> hi rohan,
>>>>>
>>>>> I want to use Python for developing a data mining tool, along with a
>>>>> webserver. I will be using MySQL. Please give me some inputs on how to
>>>>> start with this problem.
>>>>> regards
>>>>> vijay shanker
>>>>>
>>>>
>
> What license is this project going to be under? Have you put up a wiki page
> or some write-up somewhere?
>
> -Satya
> PS: Guys, please adhere to
> http://lug-iitd.org/Mailing_List_Guidelines. I do sound like a broken
> record, but I have to ask.