From ajkrell at yahoo.com Tue Jan 21 01:10:56 2014
From: ajkrell at yahoo.com (Aj Krell)
Date: Mon, 20 Jan 2014 16:10:56 -0800 (PST)
Subject: [omaha] Dev Help
Message-ID: <1390263056.36669.YahooMailNeo@web125604.mail.ne1.yahoo.com>

Greetings All.

My name is Andy and I'm a new developer to Omaha and the Python community. I'm also new to this list-serve, so apologies if this is a completely inappropriate question for the thread.

I'm reaching out to see if anyone would be interested in collaborating on a startup idea. It's dealing with (in my opinion) some pretty interesting stuff that I am (in everyone's opinion) very novice with:

* Multi-threaded web-scraping (with Urllib)
* Natural Language processing (using NLTK)
* Pattern Recognition / Clustering Algorithms (using numpy and matplotlib)
* Google App Engine back end using Web2Py framework (or maybe Django if I continue to struggle)

At this point, I'm prototyping and looking for concept / development help... it is very hard to find. If anyone has a good chunk of extra time and a desire to collaborate, I would love to sit down and talk.

Thanks
Andy
ajkrell at yahoo.com

From dragonfyre13 at gmail.com Tue Jan 21 05:07:29 2014
From: dragonfyre13 at gmail.com (Tim Alexander)
Date: Mon, 20 Jan 2014 22:07:29 -0600
Subject: [omaha] Dev Help
In-Reply-To: <1390263056.36669.YahooMailNeo@web125604.mail.ne1.yahoo.com>
References: <1390263056.36669.YahooMailNeo@web125604.mail.ne1.yahoo.com>
Message-ID:

I have a reasonable (in my opinion) amount of experience with the libraries listed, as well as a number of others in those spaces, but I'm not presently interested in working on a startup. I'm willing to grab coffee or something and talk over technical concepts/approach, though (talking Python is always interesting, especially in the graph theory/big data/natural language spaces).

_______________________________________________
Omaha Python Users Group mailing list
Omaha at python.org
https://mail.python.org/mailman/listinfo/omaha
http://www.OmahaPython.org

From wes.turner at gmail.com Tue Jan 21 05:26:56 2014
From: wes.turner at gmail.com (Wes Turner)
Date: Mon, 20 Jan 2014 22:26:56 -0600
Subject: [omaha] Dev Help
In-Reply-To: <1390263056.36669.YahooMailNeo@web125604.mail.ne1.yahoo.com>
References: <1390263056.36669.YahooMailNeo@web125604.mail.ne1.yahoo.com>
Message-ID:

These links may be of help to you in your endeavor:

http://www.reddit.com/r/Python/comments/1qnbq3/webscraping_selenium_vs_conventional_tools/cdeq2t7
https://developers.google.com/appengine/docs/python/urlfetch/
https://developers.google.com/appengine/docs/python/taskqueue/
http://docs.celeryproject.org/en/latest/index.html

Multiprocess is probably what you're after. Celery and the App Engine task queue have rate-limiting support.
The PRAW reddit API shows how to work with multiprocess and respectable levels of concurrent requests:

https://praw.readthedocs.org/en/latest/pages/multiprocess.html

Wes Turner

From bkealey at unomaha.edu Tue Jan 21 04:30:38 2014
From: bkealey at unomaha.edu (Burch Kealey)
Date: Tue, 21 Jan 2014 03:30:38 +0000
Subject: [omaha] Dev Help
In-Reply-To: <1390263056.36669.YahooMailNeo@web125604.mail.ne1.yahoo.com>
References: <1390263056.36669.YahooMailNeo@web125604.mail.ne1.yahoo.com>
Message-ID: <6aa9e3327b1b4a01a1ebed2024b621de@DM2PR07MB606.namprd07.prod.outlook.com>

Hi Andy

What are you trying to do? I guess that is a silly question, maybe, since your goal is to do a startup, so if you don't want to answer in particular, maybe you could describe the corpus you want to attack.

Burch

From wes.turner at gmail.com Tue Jan 21 12:27:31 2014
From: wes.turner at gmail.com (Wes Turner)
Date: Tue, 21 Jan 2014 05:27:31 -0600
Subject: [omaha] Dev Help
In-Reply-To: <1390263056.36669.YahooMailNeo@web125604.mail.ne1.yahoo.com>
References: <1390263056.36669.YahooMailNeo@web125604.mail.ne1.yahoo.com>
Message-ID:

...

Welcome to Omaha!

> * Multi-threaded web-scraping (with Urllib)

I already mentioned multiprocess, the App Engine task queue, Celery, and the PRAW reddit API (requests, urllib3). (Multithreading within a single process won't gain you much for web scraping.)
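As a rough illustration of the multi-process approach mentioned above, here is a minimal sketch using only the Python 3 standard library (multiprocessing.Pool and urllib.request). The names fetch and fetch_all, the pool size, and the timeout are assumptions made up for this example; they do not come from PRAW, Celery, or App Engine, and the sketch does no rate limiting.

```python
import multiprocessing
import urllib.request


def fetch(url, timeout=10):
    """Download one URL (hypothetical helper for this sketch).

    Returns (url, body_bytes), or (url, None) if the request fails
    or the URL is malformed.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return url, resp.read()
    except (OSError, ValueError):
        # URLError and socket timeouts are OSError subclasses;
        # a URL without a scheme raises ValueError.
        return url, None


def fetch_all(urls, worker=fetch, processes=4):
    """Fetch every URL using a pool of worker processes.

    Pool.map returns results in the same order as the input list.
    """
    with multiprocessing.Pool(processes=processes) as pool:
        return pool.map(worker, urls)
```

Calling fetch_all(list_of_urls) returns one (url, body) pair per input URL, with None as the body for any download that failed. On platforms that start workers by re-importing the main module, the call should sit under an if __name__ == "__main__": guard.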
Requests wraps urllib3 with a nice API; urllib3 reuses sockets and supports redirects and compression:

* http://requests.readthedocs.org/en/latest/
* http://urllib3.readthedocs.org/en/latest/

> * Natural Language processing (using NLTK)

TextBlob wraps NLTK and pattern:

* http://textblob.readthedocs.org/en/latest/
* http://www.reddit.com/r/compsci/comments/1gpdb9/nlp_how_can_i_get_the_computer_to_understand_the/camicro

> * Pattern Recognition / Clustering Algorithms (using numpy and matplotlib)

For learning numpy, scipy, matplotlib, etc., these are great:

https://github.com/jrjohansson/scientific-python-lectures (IPython notebooks)
https://github.com/scipy-lectures/scipy-lecture-notes (Sphinx)

scikit-learn and statsmodels may also be helpful:

http://www.reddit.com/r/pystats/comments/1s1scv/a_tutorial_on_statisticallearning_for_scientific/
http://www.reddit.com/r/opendata/comments/1o8agq/examples_of_data_analysis/ccpqcc2

As far as matplotlib, there's now mpld3, a D3js implementation, and the webagg backend, which requires IPython on the server side:

https://github.com/jakevdp/mpld3/issues/45
http://nbviewer.ipython.org/github/jrjohansson/scientific-python-lectures/blob/master/Lecture-4-Matplotlib.ipynb
https://github.com/jrjohansson/scientific-python-lectures/commits/master/Lecture-4-Matplotlib.ipynb

> * Google App Engine back end using Web2Py framework (or maybe Django if I continue to struggle)

The django-nonrel fork (+ django-appengine, django-dbindexer) makes working with Django and App Engine easier to start with:

https://github.com/django-nonrel
http://djangoappengine.readthedocs.org/en/latest/

...

https://github.com/conda (and miniconda or Continuum's anaconda distribution) may be helpful for creating repeatable installations of many/all of these packages. (conda install )

--
Wes Turner

From ajkrell at yahoo.com Tue Jan 21 05:19:21 2014
From: ajkrell at yahoo.com (Aj Krell)
Date: Mon, 20 Jan 2014 20:19:21 -0800 (PST)
Subject: [omaha] Dev Help
In-Reply-To:
Message-ID: <1390277961.93838.YahooMailAndroidMobile@web125605.mail.ne1.yahoo.com>

Hi Tim.

Thanks for the quick response and the offer. I would love to grab a coffee and discuss. Would sometime this week work? I'm free Wed and Thurs after 4:30.

Andy