From davidkunio at gmail.com Tue Nov 3 07:29:42 2015 From: davidkunio at gmail.com (David Matsumura) Date: Tue, 3 Nov 2015 06:29:42 -0600 Subject: [Chicago] Finance SIG: Meeting this Friday Message-ID: <1D8F4794-964D-42C8-A677-0024CB61231E@gmail.com> Hi ChiPy, As a reminder there is a meeting this Friday. Dr Jess Stauth of Quantopian will be in town to tell us what it takes to write a winning algorithm. Pizza and Beer @ 6PM at Blue1647. Hope to see you there. http://meetu.ps/2Q1lZP Best Regards, -- David Matsumura 773.230.1761 From brianhray at gmail.com Fri Nov 6 01:44:52 2015 From: brianhray at gmail.com (Brian Ray) Date: Fri, 6 Nov 2015 01:44:52 -0500 Subject: [Chicago] \welcome Scientifc SIG In-Reply-To: References:

Message-ID: See you sunday if you can make it http://goo.gl/forms/SQ10kI9cN5 On Tue, Oct 13, 2015 at 8:36 AM, Daniel Galtieri wrote: > Probably too much of Python newbie to contribute much content wise, but if > there's anything else I can help out with let me know. Looking forward to > these meetings. > > Best, > > Dan > > > ? > -- > Daniel Galtieri > Ph.D. Candidate > Northwestern University > Interdepartmental Neuroscience Program > Laboratory of Dr. D. James Surmeier > Chicago, IL > > > On Mon, Oct 12, 2015 at 8:43 PM, Joshua Herman > wrote: > >> Pinged off means that they will email you not on a thread that will >> message anyone on the list. >> >> On Mon, Oct 12, 2015 at 8:11 PM Lewit, Douglas wrote: >> >>> All this techie talk has got me slightly confused! What does it mean to >>> get "pinged off" a list? :-) >>> >>> Yes, I'm interested in the numerical and scientific applications of >>> Python (i.e. numpy, scipy, matplotlib, sympy, et al ) and would like to >>> learn more.... within my time constraints! Thanks Brian. >>> >>> On Mon, Oct 12, 2015 at 7:41 PM, Brian Ray wrote: >>> >>>> fantastic. Will ping you off the list as we get closer. Meanwhile, >>>> looking for: venue, sponsors, speakers, love. 
>>>> >>>> Cheers, Brian >>>> >>>> _______________________________________________ >>>> Chicago mailing list >>>> Chicago at python.org >>>> https://mail.python.org/mailman/listinfo/chicago >>>> >>>> >>> _______________________________________________ >>> Chicago mailing list >>> Chicago at python.org >>> https://mail.python.org/mailman/listinfo/chicago >>> >> >> _______________________________________________ >> Chicago mailing list >> Chicago at python.org >> https://mail.python.org/mailman/listinfo/chicago >> >> > > _______________________________________________ > Chicago mailing list > Chicago at python.org > https://mail.python.org/mailman/listinfo/chicago > > -- Brian Ray @brianray (773) 669-7717 -------------- next part -------------- An HTML attachment was scrubbed... URL: From brianhray at gmail.com Fri Nov 6 01:57:21 2015 From: brianhray at gmail.com (Brian Ray) Date: Fri, 6 Nov 2015 01:57:21 -0500 Subject: [Chicago] Fwd: Scientific Lightening Talks In-Reply-To: <047d7b4141285c0d540523d978c2@google.com> References: <047d7b4141285c0d540523d978c2@google.com> Message-ID: Making an executive decision here to get the Scientific SIG rolling Sunday I have decided to collect lightning talks for first meeting. Let's do this! If you have trouble viewing or submitting this form, you can fill it out in Google Forms . Scientific Lightening Talks http://www.meetup.com/_ChiPy_/events/226013805/ Sunday, November 8, 2015 4:00 PM to 6:00 PM DePaul CDM Building, Room 924 243 S. Wabash Ave, Chicago, IL (edit map) ChiPy now has SIG (Special Interest Group) just for those interested in using Python for Scientific Computing and Research. Yes, we need a venue, speakers, and other founding organizers. Tentatively setting this to the First Thurs. * Required What is your talk about? * How long? * - 5 min - 10 min - 15 min - 20 min Who are you * Never submit passwords through Google Forms. 
-- Brian Ray @brianray (773) 669-7717 -------------- next part -------------- An HTML attachment was scrubbed... URL: From brianhray at gmail.com Fri Nov 6 01:36:57 2015 From: brianhray at gmail.com (brianhray at gmail.com) Date: Fri, 06 Nov 2015 06:36:57 +0000 Subject: [Chicago] Scientific Lightening Talks Message-ID: <089e013a1a20597c880523d97805@google.com> Making an executive decision here to get the Scientific SIG rolling Sunday I have decided to collect lightning talks for first meeting. Let's do this! http://www.meetup.com/_ChiPy_/events/226013805/ Sunday, November 8, 2015 4:00 PM to 6:00 PM DePaul CDM Building, Room 924 243 S. Wabash Ave, Chicago, IL (edit map) ChiPy now has SIG (Special Interest Group) just for those interested in using Python for Scientific Computing and Research. Yes, we need a venue, speakers, and other founding organizers. Tentatively setting this to the First Thurs. I've invited you to fill out the form Scientific Lightening Talks. To fill it out, visit: https://docs.google.com/forms/d/1S_YbIWhLuvWOp1Dng8dDYUB-bcj13dtwd2POwC1gB0o/viewform?c=0&w=1&usp=mail_form_link -------------- next part -------------- An HTML attachment was scrubbed... URL: From davidkunio at gmail.com Fri Nov 6 10:38:46 2015 From: davidkunio at gmail.com (David Matsumura) Date: Fri, 6 Nov 2015 09:38:46 -0600 Subject: [Chicago] Finance SIG: Meeting tonight Message-ID: Looking forward to the talks tonight. Pizza and trading algorithms - no better way to spend a Friday night! http://meetu.ps/2Q1lZP -- David Matsumura 773.230.1761 From szybalski at gmail.com Fri Nov 6 15:02:52 2015 From: szybalski at gmail.com (Lukasz Szybalski) Date: Fri, 6 Nov 2015 14:02:52 -0600 Subject: [Chicago] [job] Software Engineer / Programmer Message-ID: Hello, I work for "Producers National Corporation". We support all IT work for 3 different insurance companies. 
We have been growing every year, and we just acquired additional companies, bringing our total to 6 subsidiaries of various kinds, all in the auto/commercial auto insurance spectrum. We are looking for a software engineer/programmer to help us meet the demands of our enterprise users (see job description). 90% of the programs we write are written in Python. We are looking for somebody with Python web experience. We are currently programming in the Pyramid web framework, but as long as you are familiar with other Python web frameworks and JavaScript you should fit right in. We are looking for somebody with 1-3 years of experience, but we are also open to hiring an entry-level programmer. Send your resume to "resume at producersnational.com" with the subject "Software Engineer Resume - Your name", or email me off list for additional information. Thank you Lucas -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Software Engineer_JobDescription_20151103.pdf Type: application/pdf Size: 39714 bytes Desc: not available URL:
From joe.jasinski at gmail.com Mon Nov 9 23:48:50 2015 From: joe.jasinski at gmail.com (Joe Jasinski) Date: Mon, 9 Nov 2015 22:48:50 -0600 Subject: [Chicago] ChiPy November 12th Meeting Message-ID:
All, November ChiPy is fast approaching. This month, we are meeting at the Nokia office in the loop. All are welcome! You can find more information about ChiPy at our website http://www.chipy.org/ Hope to see you there! *When:* Thursday November 12th, 7:00pm *How:* You can rsvp at chipy.org or via our Meetup group. *Where:* HERE (Nokia) 100 North Riverside Chicago, IL 60606 *What:* - *Python at Nokia (by MacGregor Felix)* (1:00:00 Minutes) By: Python is known to be a multi-purpose and multi-paradigm programming language. Come see how the Reality Capture & Processing (RCP) group of Nokia HERE is making use of Python's versatility. 
We will show you how HERE RCP uses Python's Object Oriented constructs to represent business models in production systems. You will see how Python's functional lambdas are used to elegantly facilitate the handling of big data. We will discuss the use of Python not only in production code but also in test code. We not only use Python for production purposes but also to build utilities. We hope to show you how we utilize Python's versatility and closeness to the operating system to build sophisticated tools for development and operational productivity. You'll see our Test Driven Development effort while building Python products and how we use Python in Behavior Driven Development to code language-agnostic acceptance tests for the evolution of software and services. We will also give you a peek at our Python packaging and distribution. - *Python-fu in the GIMP* (0:42:00 Minutes) By: Tanya Schlusser GIMP (the GNU Image Manipulation Program) is great all by itself but is even better with Python-fu. This talk demonstrates a little Python-fu to manipulate images in GIMP, with a little (slightly ugly) hacking to add external libraries. -------------- next part -------------- An HTML attachment was scrubbed... URL:
From d-lewit at neiu.edu Mon Nov 9 20:44:56 2015 From: d-lewit at neiu.edu (Lewit, Douglas) Date: Mon, 9 Nov 2015 19:44:56 -0600 Subject: [Chicago] Need advice on this project. Message-ID:
Hey guys, I need some advice on this one. I'm attaching the homework assignment so that you understand what I'm trying to do. I went as far as the construction of the Similarity Matrix, which is a matrix of Pearson correlation coefficients. My problem is this. u1.base (which is also attached) contains Users (first column), Items (second column), Ratings (third column) and finally the time stamp in the 4th and final column. (Just discard the 4th column. We're not using it for anything. ) It's taking HOURS for Python to build the similarity matrix. 
So what I did was: *head -n 5000 u1.base > practice.base* and I also downloaded the PyPy interpreter for Python 3. Then using PyPy (or pypy or whatever) I ran my program on the first ten thousand lines of data from u1.base stored in the new text file, practice.base. Not a problem!!! I still had to wait a couple minutes, but not a couple hours!!! Is there a way to make this program work for such a large set of data? I know my program successfully constructs the Similarity Matrix (i.e. similarity between users) for 5,000, 10,000, 20,000 and even 25,000 lines of data. But for 80,000 lines of data the program becomes very slow and overtaxes my CPU. (The fan turns on and the bottom of my laptop starts to get very hot.... a bad sign! ) Does anyone have any recommendations? ( I'm supposed to meet with my prof on Tuesday. I may just explain the problem to him and request a smaller data set to work with. And unfortunately he knows very little about Python. He's primarily a C++ and Java programmer. ) I appreciate the feedback. Thank you!!! Best, Douglas Lewit -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Homework3_Revision2.py Type: text/x-python Size: 3141 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: u1.base Type: application/octet-stream Size: 1586544 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: practice2.base Type: application/octet-stream Size: 391523 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: HW3.pdf Type: application/pdf Size: 230712 bytes Desc: not available URL:
From d-lewit at neiu.edu Tue Nov 10 03:10:12 2015 From: d-lewit at neiu.edu (Lewit, Douglas) Date: Tue, 10 Nov 2015 02:10:12 -0600 Subject: [Chicago] Can this be done with a yield statement and generator object? Message-ID:
Hey guys, I'm attaching a simple class that I created in Python.... Python 3 to be specific, but I think it should work in Python 2 as well, maybe. Anyhow, is there a way to implement the same concept using a *yield statement* in a function to create a generator object? Just wondering. Let me know, thanks! Best, Douglas Lewit P.S. Obviously if you use a generator object to do this then the generator object would never produce the StopIteration error. But I'm kind of confused about how to create and define a generator object that would produce this cyclical behavior in an array or list. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Circular_List_in_Python.png Type: image/png Size: 277517 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: CircularList.py Type: text/x-python Size: 373 bytes Desc: not available URL:
From sunhwanj at gmail.com Tue Nov 10 09:52:58 2015 From: sunhwanj at gmail.com (Sunhwan Jo) Date: Tue, 10 Nov 2015 08:52:58 -0600 Subject: [Chicago] Need advice on this project. In-Reply-To: References: Message-ID: <5698E9D1-E709-4B18-BE2B-7C60A279DE2A@gmail.com>
1. Your "correlation" function takes most of the execution time.
> def Correlation(p, q):
>     global PQ_Ratings
>     sum1 = 0
>     sum2 = 0
>     numeratorProduct = 1
>     denominatorProduct1 = 1
>     denominatorProduct2 = 1
>     for key in filter( lambda x: x[0] == p or x[0] == q, PQ_Ratings.keys( ) ):
>         if key[0] == p:
>             sum1+= PQ_Ratings[key] - AverageRatingsOfItems[key[1]]
>         else:
>             sum2+= PQ_Ratings[key] - AverageRatingsOfItems[key[1]]
>     numeratorProduct+= sum1*sum2
>     denominatorProduct1+= sum1**2
>     denominatorProduct2+= sum2**2
>     return numeratorProduct/(math.sqrt(denominatorProduct1)*math.sqrt(denominatorProduct2))

Rewriting sum1 and sum2 as list comprehensions can speed up execution by about 10x (a rough estimate using your code). In addition, the denominator is wrong: it should be the *sum of squared differences*, not the *square of the sum of differences*, but I'm not concerned with that yet.

> def Correlation(p, q):
>     global PQ_Ratings
>     sum1 = 0
>     sum2 = 0
>     numeratorProduct = 1
>     denominatorProduct1 = 1
>     denominatorProduct2 = 1
>     keys = [key for key in PQ_Ratings.keys() if key[0] == p or key[0] == q]
>     sum1 = sum([PQ_Ratings[key] - AverageRatingsOfItems[key[1]] for key in keys if key[0] == p])
>     sum2 = sum([PQ_Ratings[key] - AverageRatingsOfItems[key[1]] for key in keys if key[0] == q])
>     numeratorProduct+= sum1*sum2
>     denominatorProduct1+= sum1**2
>     denominatorProduct2+= sum2**2
>     return numeratorProduct/(math.sqrt(denominatorProduct1)*math.sqrt(denominatorProduct2))

2. You don't have to recalculate sum1 each time. "sum1" depends only on "p", so you can calculate it once in the outer loop and reuse it.
> keys = PQ_Ratings.keys()
> for i in range(1, len(SimilarityMatrix)):
>     sum1 = sum([PQ_Ratings[key] - AverageRatingsOfItems[key[1]] for key in keys if key[0] == i])
>
>     for j in range(i + 1, len(SimilarityMatrix)):
>         sum2 = sum([PQ_Ratings[key] - AverageRatingsOfItems[key[1]] for key in keys if key[0] == j])
>         numeratorProduct = sum1*sum2 + 1
>         denominatorProduct1 = sum1**2 + 1
>         denominatorProduct2 = sum2**2 + 1
>         SimilarityMatrix[i][j] = numeratorProduct/(math.sqrt(denominatorProduct1)*math.sqrt(denominatorProduct2))

This speeds things up again, but the total execution time is still about 200 minutes with 900+ users.

3. Is there any reason not to use a NumPy array? Using NumPy, it finishes in a fraction of a minute. Notice I also fixed the bug in the numerator and the denominator.

> import numpy as np
> nitems = max(AverageRatingsOfItems.keys())
> nusers = max([key[0] for key in PQ_Ratings.keys()])
> avg_rating = np.zeros(nitems)
> pq_rating = np.zeros((nusers, nitems))
> keys = PQ_Ratings.keys()
> for key in keys:
>     pq_rating[key[0]-1, key[1]-1] = PQ_Ratings[key]
> keys = AverageRatingsOfItems.keys()
> for key in keys:
>     avg_rating[key-1] = AverageRatingsOfItems[key]
>
> startTime = time.time( )
>
> #### Let's finish building up our similarity matrix for this problem.
> keys = PQ_Ratings.keys()
> for i in range(1, len(SimilarityMatrix)):
>     #sum1 = sum([PQ_Ratings[key] - AverageRatingsOfItems[key[1]] for key in keys if key[0] == i])
>     # Keep the per-item differences as a vector; summing here would collapse them to a scalar.
>     # Note: unrated items are stored as 0 in pq_rating and still enter these sums.
>     diff1 = pq_rating[i-1] - avg_rating
>
>     for j in range(i + 1, len(SimilarityMatrix)):
>         #sum2 = sum([PQ_Ratings[key] - AverageRatingsOfItems[key[1]] for key in keys if key[0] == j])
>         diff2 = pq_rating[j-1] - avg_rating
>         numeratorProduct = np.sum(diff1*diff2)
>         denominatorProduct1 = np.sum(diff1**2)
>         denominatorProduct2 = np.sum(diff2**2)
>         SimilarityMatrix[i][j] = numeratorProduct/(math.sqrt(denominatorProduct1)*math.sqrt(denominatorProduct2))

> On Nov 9, 2015, at 7:44 PM, Lewit, Douglas wrote: > > Hey guys, > > I need some advice on this one. I'm attaching the homework assignment so that you understand what I'm trying to do. I went as far as the construction of the Similarity Matrix, which is a matrix of Pearson correlation coefficients. > > My problem is this. u1.base (which is also attached) contains Users (first column), Items (second column), Ratings (third column) and finally the time stamp in the 4th and final column. (Just discard the 4th column. We're not using it for anything. ) > > It's taking HOURS for Python to build the similarity matrix. So what I did was: > > head -n 5000 u1.base > practice.base > > and I also downloaded the PyPy interpreter for Python 3. Then using PyPy (or pypy or whatever) I ran my program on the first ten thousand lines of data from u1.base stored in the new text file, practice.base. Not a problem!!! I still had to wait a couple minutes, but not a couple hours!!! > > Is there a way to make this program work for such a large set of data? I know my program successfully constructs the Similarity Matrix (i.e. similarity between users) for 5,000, 10,000, 20,000 and even 25,000 lines of data. But for 80,000 lines of data the program becomes very slow and overtaxes my CPU. (The fan turns on and the bottom of my laptop starts to get very hot.... 
a bad sign! ) > > Does anyone have any recommendations? ( I'm supposed to meet with my prof on Tuesday. I may just explain the problem to him and request a smaller data set to work with. And unfortunately he knows very little about Python. He's primarily a C++ and Java programmer. ) > > I appreciate the feedback. Thank you!!! > > Best, > > Douglas Lewit > > > _______________________________________________ > Chicago mailing list > Chicago at python.org > https://mail.python.org/mailman/listinfo/chicago -------------- next part -------------- An HTML attachment was scrubbed... URL: From heflin.rosst at gmail.com Tue Nov 10 09:57:28 2015 From: heflin.rosst at gmail.com (Ross Heflin) Date: Tue, 10 Nov 2015 08:57:28 -0600 Subject: [Chicago] Need advice on this project. In-Reply-To: References: Message-ID: Might be time to profile. Run your similarity matrix builder with the large dataset against cProfile (or whatever works on PyPy) for some time (30 min) and see where its spending the majority of its time. -Ross On Mon, Nov 9, 2015 at 7:44 PM, Lewit, Douglas wrote: > Hey guys, > > I need some advice on this one. I'm attaching the homework assignment so > that you understand what I'm trying to do. I went as far as the > construction of the Similarity Matrix, which is a matrix of Pearson > correlation coefficients. > > My problem is this. u1.base (which is also attached) contains Users > (first column), Items (second column), Ratings (third column) and finally > the time stamp in the 4th and final column. (Just discard the 4th column. > We're not using it for anything. ) > > It's taking HOURS for Python to build the similarity matrix. So what I > did was: > > *head -n 5000 u1.base > practice.base* > > and I also downloaded the PyPy interpreter for Python 3. Then using PyPy > (or pypy or whatever) I ran my program on the first ten thousand lines of > data from u1.base stored in the new text file, practice.base. Not a > problem!!! 
I still had to wait a couple minutes, but not a couple hours!!! > > > Is there a way to make this program work for such a large set of data? I > know my program successfully constructs the Similarity Matrix (i.e. > similarity between users) for 5,000, 10,000, 20,000 and even 25,000 lines > of data. But for 80,000 lines of data the program becomes very slow and > overtaxes my CPU. (The fan turns on and the bottom of my laptop starts to > get very hot.... a bad sign! ) > > Does anyone have any recommendations? ( I'm supposed to meet with my prof > on Tuesday. I may just explain the problem to him and request a smaller > data set to work with. And unfortunately he knows very little about > Python. He's primarily a C++ and Java programmer. ) > > I appreciate the feedback. Thank you!!! > > Best, > > Douglas Lewit > > > > _______________________________________________ > Chicago mailing list > Chicago at python.org > https://mail.python.org/mailman/listinfo/chicago > >
--
From the "desk" of Ross Heflin phone number: (847) <23,504,826th decimal place of pi> -------------- next part -------------- An HTML attachment was scrubbed... URL:
From wesclemens at gmail.com Tue Nov 10 10:02:42 2015 From: wesclemens at gmail.com (William E. S. Clemens) Date: Tue, 10 Nov 2015 09:02:42 -0600 Subject: [Chicago] Can this be done with a yield statement and generator object? In-Reply-To: References: Message-ID:
I didn't test, but something like this should do the same.

def circular_list(array):
    counter = -1  # initialize once, outside the loop, so the position persists
    while True:
        if counter == len(array) - 1:
            counter = -1  # wrap around after the last element
        counter += 1
        yield array[counter]

On Tue, Nov 10, 2015 at 2:10 AM, Lewit, Douglas wrote: > Hey guys, > > I'm attaching a simple class that I created in Python.... Python 3 to be > specific, but I think it should work in Python 2 as well, maybe. Anyhow, > is there a way to implement the same concept using a *yield statement* in > a function to create a generator object? Just wondering. Let me know, > thanks! 
> > Best, > > Douglas Lewit > > P.S. Obviously if you use a generator object to do this then the > generator object would never produce the StopIteration error. But I'm kind > of confused about how to create and define a generator object that would > produce this cyclical behavior in an array or list. > > > > _______________________________________________ > Chicago mailing list > Chicago at python.org > https://mail.python.org/mailman/listinfo/chicago > > -- William Clemens Phone: 847.485.9455 E-mail: wesclemens at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From adam at adamforsyth.net Tue Nov 10 10:46:22 2015 From: adam at adamforsyth.net (Adam Forsyth) Date: Tue, 10 Nov 2015 09:46:22 -0600 Subject: [Chicago] Need advice on this project. In-Reply-To: References: Message-ID: Hi Douglas, You seem to post interesting homework assignments when I'm looking for a fun problem, thanks. The issue definitely isn't the performance of either Python (the language) or CPython (the implementation). I did the assignment last night, and calculating the matrix for "u1.base" took my code less than 10 seconds. For readability in your Correlation function, try to avoid: globals; creating lambdas inside loops; and indexing with constant keys rather than using argument unpacking (i.e. key[0]). It also helps to follow PEP8 if you want other Python programmers to be able to read your code easily. You probably have an algorithmic error in there somewhere -- it's hard for me to tell for sure because your code is difficult to follow. Read the assignment carefully, and only do what it tells you. For performance, are there different data structures you could use? Are there "batteries included" in Python that could combine some of those individual arithmetic operations? I don't want to be too specific here because implementing the algorithm is the point of the assignment. 
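One way to read the data-structure hint above, sketched generically rather than as a solution to the assignment: index the ratings by user once, so that comparing two users only touches their own ratings instead of scanning every (user, item) key for every pair. The names index_by_user and pearson and the sample rows are hypothetical, and the formula is the textbook Pearson correlation over co-rated items, which may differ from what the assignment specifies.

```python
import math
from collections import defaultdict


def index_by_user(rows):
    """Group (user, item, rating) rows into {user: {item: rating}} in one pass."""
    by_user = defaultdict(dict)
    for user, item, rating in rows:
        by_user[user][item] = rating
    return by_user


def pearson(ratings_a, ratings_b):
    """Textbook Pearson correlation over the items both users rated."""
    common = ratings_a.keys() & ratings_b.keys()
    if len(common) < 2:
        return 0.0  # assumption: too little overlap counts as "no similarity"
    mean_a = sum(ratings_a[i] for i in common) / len(common)
    mean_b = sum(ratings_b[i] for i in common) / len(common)
    num = sum((ratings_a[i] - mean_a) * (ratings_b[i] - mean_b) for i in common)
    den_a = sum((ratings_a[i] - mean_a) ** 2 for i in common)
    den_b = sum((ratings_b[i] - mean_b) ** 2 for i in common)
    if den_a == 0 or den_b == 0:
        return 0.0  # constant ratings make the correlation undefined
    return num / math.sqrt(den_a * den_b)


# Made-up sample data: (user, item, rating)
rows = [(1, 10, 5), (1, 11, 3), (1, 12, 4),
        (2, 10, 4), (2, 11, 2), (2, 12, 3),
        (3, 10, 1), (3, 11, 5)]
by_user = index_by_user(rows)
print(pearson(by_user[1], by_user[2]))
print(pearson(by_user[1], by_user[3]))
```

The point of the sketch is the lookup cost: each pair of users costs time proportional to the number of items they rated, not to the size of the whole data set.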
It looks like you still have two weeks to complete the project, so I'd recommend taking your time, and don't be afraid to start a new version -- it can help you break out of bad patterns you've started in your existing code. Best, Adam On Mon, Nov 9, 2015 at 7:44 PM, Lewit, Douglas wrote: > Hey guys, > > I need some advice on this one. I'm attaching the homework assignment so > that you understand what I'm trying to do. I went as far as the > construction of the Similarity Matrix, which is a matrix of Pearson > correlation coefficients. > > My problem is this. u1.base (which is also attached) contains Users > (first column), Items (second column), Ratings (third column) and finally > the time stamp in the 4th and final column. (Just discard the 4th column. > We're not using it for anything. ) > > It's taking HOURS for Python to build the similarity matrix. So what I > did was: > > *head -n 5000 u1.base > practice.base* > > and I also downloaded the PyPy interpreter for Python 3. Then using PyPy > (or pypy or whatever) I ran my program on the first ten thousand lines of > data from u1.base stored in the new text file, practice.base. Not a > problem!!! I still had to wait a couple minutes, but not a couple hours!!! > > > Is there a way to make this program work for such a large set of data? I > know my program successfully constructs the Similarity Matrix (i.e. > similarity between users) for 5,000, 10,000, 20,000 and even 25,000 lines > of data. But for 80,000 lines of data the program becomes very slow and > overtaxes my CPU. (The fan turns on and the bottom of my laptop starts to > get very hot.... a bad sign! ) > > Does anyone have any recommendations? ( I'm supposed to meet with my prof > on Tuesday. I may just explain the problem to him and request a smaller > data set to work with. And unfortunately he knows very little about > Python. He's primarily a C++ and Java programmer. ) > > I appreciate the feedback. Thank you!!! 
> > Best, > > Douglas Lewit > > > > _______________________________________________ > Chicago mailing list > Chicago at python.org > https://mail.python.org/mailman/listinfo/chicago > > -------------- next part -------------- An HTML attachment was scrubbed... URL:
From len_wanger at hotmail.com Tue Nov 10 10:51:39 2015 From: len_wanger at hotmail.com (Len Wanger) Date: Tue, 10 Nov 2015 09:51:39 -0600 Subject: [Chicago] Chicago Digest, Vol 123, Issue 6 In-Reply-To: References: Message-ID:
Try replacing your next routine with an __iter__ generator like this (a yield inside __next__ would make every next() call return a new generator object rather than an element):

def __iter__(self):
    while True:
        if self.counter == len(self.array) - 1:
            self.counter = -1  # wrap around after the last element
        self.counter += 1
        yield self.array[self.counter]

Len

> From: chicago-request at python.org > Subject: Chicago Digest, Vol 123, Issue 6 > To: chicago at python.org > Date: Tue, 10 Nov 2015 09:54:39 -0500 > > Send Chicago mailing list submissions to > chicago at python.org > > To subscribe or unsubscribe via the World Wide Web, visit > https://mail.python.org/mailman/listinfo/chicago > or, via email, send a message with subject or body 'help' to > chicago-request at python.org > > You can reach the person managing the list at > chicago-owner at python.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of Chicago digest..." > > > Today's Topics: > > 1. Can this be done with a yield statement and generator object? > (Lewit, Douglas) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Tue, 10 Nov 2015 02:10:12 -0600 > From: "Lewit, Douglas" > To: The Chicago Python Users Group > Subject: [Chicago] Can this be done with a yield statement and > generator object? > Message-ID: > > Content-Type: text/plain; charset="utf-8" > > Hey guys, > > I'm attaching a simple class that I created in Python.... Python 3 to be > specific, but I think it should work in Python 2 as well, maybe. 
Anyhow, > is there a way to implement the same concept using a *yield statement* in a > function to create a generator object? Just wondering. Let me know, > thanks! > > Best, > > Douglas Lewit > > P.S. Obviously if you use a generator object to do this then the generator > object would never produce the StopIteration error. But I'm kind of > confused about how to create and define a generator object that would > produce this cyclical behavior in an array or list. > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: > -------------- next part -------------- > A non-text attachment was scrubbed... > Name: Circular_List_in_Python.png > Type: image/png > Size: 277517 bytes > Desc: not available > URL: > -------------- next part -------------- > A non-text attachment was scrubbed... > Name: CircularList.py > Type: text/x-python > Size: 373 bytes > Desc: not available > URL: > > ------------------------------ > > Subject: Digest Footer > > _______________________________________________ > Chicago mailing list > Chicago at python.org > https://mail.python.org/mailman/listinfo/chicago > > > ------------------------------ > > End of Chicago Digest, Vol 123, Issue 6 > *************************************** -------------- next part -------------- An HTML attachment was scrubbed... URL: From d-lewit at neiu.edu Tue Nov 10 12:09:27 2015 From: d-lewit at neiu.edu (Lewit, Douglas) Date: Tue, 10 Nov 2015 11:09:27 -0600 Subject: [Chicago] Can this be done with a yield statement and generator object? In-Reply-To: References:

Message-ID:
That's very logical, Will, thanks! :-) On Tue, Nov 10, 2015 at 9:02 AM, William E. S. Clemens wrote:
> I didn't test, but something like this should do the same.
>
> def circular_list(array):
>     counter = -1  # initialize once, outside the loop, so the position persists
>     while True:
>         if counter == len(array) - 1:
>             counter = -1  # wrap around after the last element
>         counter += 1
>         yield array[counter]
>
> On Tue, Nov 10, 2015 at 2:10 AM, Lewit, Douglas wrote: > >> Hey guys, >> >> I'm attaching a simple class that I created in Python.... Python 3 to be >> specific, but I think it should work in Python 2 as well, maybe. Anyhow, >> is there a way to implement the same concept using a *yield statement* >> in a function to create a generator object? Just wondering. Let me know, >> thanks! >> >> Best, >> >> Douglas Lewit >> >> P.S. Obviously if you use a generator object to do this then the >> generator object would never produce the StopIteration error. But I'm kind >> of confused about how to create and define a generator object that would >> produce this cyclical behavior in an array or list. >> >> >> >> _______________________________________________ >> Chicago mailing list >> Chicago at python.org >> https://mail.python.org/mailman/listinfo/chicago >> >> > > > -- > William Clemens > Phone: 847.485.9455 > E-mail: wesclemens at gmail.com > > _______________________________________________ > Chicago mailing list > Chicago at python.org > https://mail.python.org/mailman/listinfo/chicago > > -------------- next part -------------- An HTML attachment was scrubbed... URL:
From d-lewit at neiu.edu Tue Nov 10 12:11:30 2015 From: d-lewit at neiu.edu (Lewit, Douglas) Date: Tue, 10 Nov 2015 11:11:30 -0600 Subject: [Chicago] Chicago Digest, Vol 123, Issue 6 In-Reply-To: References:

Message-ID:
Thanks Len, pretty simple and straightforward! I'm still getting the feel for the "yield" statement. On Tue, Nov 10, 2015 at 9:51 AM, Len Wanger wrote:
> Try replacing your next routine with an __iter__ generator like this (a yield inside __next__ would make every next() call return a new generator object rather than an element):
>
> def __iter__(self):
>     while True:
>         if self.counter == len(self.array) - 1:
>             self.counter = -1  # wrap around after the last element
>         self.counter += 1
>         yield self.array[self.counter]
>
> Len
>
> > From: chicago-request at python.org > > Subject: Chicago Digest, Vol 123, Issue 6 > > To: chicago at python.org > > Date: Tue, 10 Nov 2015 09:54:39 -0500 > > > > Send Chicago mailing list submissions to > > chicago at python.org > > > > To subscribe or unsubscribe via the World Wide Web, visit > > https://mail.python.org/mailman/listinfo/chicago > > or, via email, send a message with subject or body 'help' to > > chicago-request at python.org > > > > You can reach the person managing the list at > > chicago-owner at python.org > > > > When replying, please edit your Subject line so it is more specific > > than "Re: Contents of Chicago digest..." > > > > > > Today's Topics: > > > > 1. Can this be done with a yield statement and generator object? > > (Lewit, Douglas) > > > > > > ---------------------------------------------------------------------- > > > > Message: 1 > > Date: Tue, 10 Nov 2015 02:10:12 -0600 > > From: "Lewit, Douglas" > > To: The Chicago Python Users Group > > Subject: [Chicago] Can this be done with a yield statement and > > generator object? > > Message-ID: > > > > Content-Type: text/plain; charset="utf-8" > > > > Hey guys, > > > > I'm attaching a simple class that I created in Python.... Python 3 to be > > specific, but I think it should work in Python 2 as well, maybe. Anyhow, > > is there a way to implement the same concept using a *yield statement* > in a > > function to create a generator object? Just wondering. Let me know, > > thanks! > > > > Best, > > > > Douglas Lewit > > > > P.S. 
Obviously if you use a generator object to do this then the > generator > > object would never produce the StopIteration error. But I'm kind of > > confused about how to create and define a generator object that would > > produce this cyclical behavior in an array or list. > > -------------- next part -------------- > > An HTML attachment was scrubbed... > > URL: < > http://mail.python.org/pipermail/chicago/attachments/20151110/b012ba59/attachment.html > > > > -------------- next part -------------- > > A non-text attachment was scrubbed... > > Name: Circular_List_in_Python.png > > Type: image/png > > Size: 277517 bytes > > Desc: not available > > URL: < > http://mail.python.org/pipermail/chicago/attachments/20151110/b012ba59/attachment.png > > > > -------------- next part -------------- > > A non-text attachment was scrubbed... > > Name: CircularList.py > > Type: text/x-python > > Size: 373 bytes > > Desc: not available > > URL: < > http://mail.python.org/pipermail/chicago/attachments/20151110/b012ba59/attachment.py > > > > > > ------------------------------ > > > > Subject: Digest Footer > > > > _______________________________________________ > > Chicago mailing list > > Chicago at python.org > > https://mail.python.org/mailman/listinfo/chicago > > > > > > ------------------------------ > > > > End of Chicago Digest, Vol 123, Issue 6 > > *************************************** > > _______________________________________________ > Chicago mailing list > Chicago at python.org > https://mail.python.org/mailman/listinfo/chicago > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tottinge at gmail.com Tue Nov 10 12:14:47 2015 From: tottinge at gmail.com (Tim Ottinger) Date: Tue, 10 Nov 2015 11:14:47 -0600 Subject: [Chicago] Can this be done with a yield statement and generator object? In-Reply-To: References:

Message-ID:

You mean, like itertools.cycle?

On Tue, Nov 10, 2015 at 11:09 AM, Lewit, Douglas wrote:

> That's very logical, Will, thanks! :-)

--
Tim Ottinger, Anzeneer, Industrial Logic
-------------------------------------
http://www.industriallogic.com/
http://agileotter.blogspot.com/
From d-lewit at neiu.edu Tue Nov 10 12:20:33 2015
From: d-lewit at neiu.edu (Lewit, Douglas)
Date: Tue, 10 Nov 2015 11:20:33 -0600
Subject: [Chicago] Need advice on this project.
In-Reply-To: <5698E9D1-E709-4B18-BE2B-7C60A279DE2A@gmail.com>
References: <5698E9D1-E709-4B18-BE2B-7C60A279DE2A@gmail.com>
Message-ID:

This was amazingly helpful, thanks! I'll check the denominator of my
correlation, but I'm pretty sure that's correct. But it won't hurt to
double check it. When I took a slice of my similarity matrix all the
correlations were floats in between -1 and +1, so that's a good sign that
my computation was correct albeit very time consuming. The reason I didn't
use Numpy arrays is because the professor for this class doesn't know a lot
of Python, and he uses Microsoft Visual Studio to run my Python programs.
I don't know if Numpy is a part of that installation. Numpy is not part of
the standard Python installation, so if I submit a program that contains
anything from the Numpy library then he won't be able to run my code. I
emailed him and asked him about his Python installation, but he didn't get
back to me.

Thanks for the feedback! Very much appreciated!!!

Best,

Douglas.

On Tue, Nov 10, 2015 at 8:52 AM, Sunhwan Jo wrote:

> 1. Your "correlation" function takes most of the execution time.
>
> def Correlation(p, q):
>     global PQ_Ratings
>     sum1 = 0
>     sum2 = 0
>     numeratorProduct = 1
>     denominatorProduct1 = 1
>     denominatorProduct2 = 1
>     for key in filter( lambda x: x[0] == p or x[0] == q, PQ_Ratings.keys( ) ):
>         if key[0] == p:
>             sum1+= PQ_Ratings[key] - AverageRatingsOfItems[key[1]]
>         else:
>             sum2+= PQ_Ratings[key] - AverageRatingsOfItems[key[1]]
>     numeratorProduct+= sum1*sum2
>     denominatorProduct1+= sum1**2
>     denominatorProduct2+= sum2**2
>     return numeratorProduct/(math.sqrt(denominatorProduct1)*math.sqrt(denominatorProduct2))
>
>
> Changing sum1 and sum2 to list comprehensions can increase the execution
> speed about 10x (rough estimate using your code).
> In addition, the denominator is also wrong. It should be *sum of squared
> differences* not *square of sum of differences*, but I'm not concerned
> about this yet.
>
> def Correlation(p, q):
>     global PQ_Ratings
>     sum1 = 0
>     sum2 = 0
>     numeratorProduct = 1
>     denominatorProduct1 = 1
>     denominatorProduct2 = 1
>     keys = [key for key in PQ_Ratings.keys() if key[0] == p or key[0] == q]
>     sum1 = sum([PQ_Ratings[key] - AverageRatingsOfItems[key[1]] for key in keys if key[0] == p])
>     sum2 = sum([PQ_Ratings[key] - AverageRatingsOfItems[key[1]] for key in keys if key[0] == q])
>     numeratorProduct+= sum1*sum2
>     denominatorProduct1+= sum1**2
>     denominatorProduct2+= sum2**2
>     return numeratorProduct/(math.sqrt(denominatorProduct1)*math.sqrt(denominatorProduct2))
>
>
> 2. You don't have to re-calculate sum1 each time. "sum1" only depends on
> "p". So, you can calculate that only in the outer loop and reuse it.
>
> keys = PQ_Ratings.keys()
> for i in range(1, len(SimilarityMatrix)):
>     sum1 = sum([PQ_Ratings[key] - AverageRatingsOfItems[key[1]] for key in keys if key[0] == i])
>
>     for j in range(i + 1, len(SimilarityMatrix)):
>         sum2 = sum([PQ_Ratings[key] - AverageRatingsOfItems[key[1]] for key in keys if key[0] == j])
>         numeratorProduct = sum1*sum2 + 1
>         denominatorProduct1 = sum1**2 + 1
>         denominatorProduct2 = sum2**2 + 1
>         SimilarityMatrix[i][j] = numeratorProduct/(math.sqrt(denominatorProduct1)*math.sqrt(denominatorProduct2))
>
>
> This will again speed things up, but the total execution time is still
> about 200 minutes with 900+ users.
>
> 3. Is there any reason not to use NumPy arrays? Using NumPy it finishes in
> a fraction of a minute. Notice I also fixed the bug in the numerator and
> the denominator.
>
> import numpy as np
> nitems = max(AverageRatingsOfItems.keys())
> nusers = max([key[0] for key in PQ_Ratings.keys()])
> avg_rating = np.zeros(nitems)
> pq_rating = np.zeros((nusers, nitems))
> keys = PQ_Ratings.keys()
> for key in keys:
>     pq_rating[key[0]-1, key[1]-1] = PQ_Ratings[key]
> keys = AverageRatingsOfItems.keys()
> for key in keys:
>     avg_rating[key-1] = AverageRatingsOfItems[key]
>
> startTime = time.time( )
>
> #### Let's finish building up our similarity matrix for this problem.
> keys = PQ_Ratings.keys()
> for i in range(1, len(SimilarityMatrix)):
>     #sum1 = sum([PQ_Ratings[key] - AverageRatingsOfItems[key[1]] for key in keys if key[0] == i])
>     # keep the per-item differences as a vector; summing here would collapse them to a scalar
>     diff1 = pq_rating[i-1] - avg_rating
>
>     for j in range(i + 1, len(SimilarityMatrix)):
>         #sum2 = sum([PQ_Ratings[key] - AverageRatingsOfItems[key[1]] for key in keys if key[0] == j])
>         diff2 = pq_rating[j-1] - avg_rating
>         numeratorProduct = np.sum(diff1*diff2)
>         denominatorProduct1 = np.sum(diff1**2)
>         denominatorProduct2 = np.sum(diff2**2)
>         SimilarityMatrix[i][j] = numeratorProduct/(math.sqrt(denominatorProduct1)*math.sqrt(denominatorProduct2))
>
>
> On Nov 9, 2015, at 7:44 PM, Lewit, Douglas wrote:
>
> Hey guys,
>
> I need some advice on this one. I'm attaching the homework assignment so
> that you understand what I'm trying to do. I went as far as the
> construction of the Similarity Matrix, which is a matrix of Pearson
> correlation coefficients.
>
> My problem is this. u1.base (which is also attached) contains Users
> (first column), Items (second column), Ratings (third column) and finally
> the time stamp in the 4th and final column. (Just discard the 4th column.
> We're not using it for anything. )
>
> It's taking HOURS for Python to build the similarity matrix. So what I
> did was:
>
> *head -n 5000 u1.base > practice.base*
>
> and I also downloaded the PyPy interpreter for Python 3.
> Then using PyPy (or pypy or whatever) I ran my program on the first ten
> thousand lines of data from u1.base stored in the new text file,
> practice.base. Not a problem!!! I still had to wait a couple minutes, but
> not a couple hours!!!
>
> Is there a way to make this program work for such a large set of data? I
> know my program successfully constructs the Similarity Matrix (i.e.
> similarity between users) for 5,000, 10,000, 20,000 and even 25,000 lines
> of data. But for 80,000 lines of data the program becomes very slow and
> overtaxes my CPU. (The fan turns on and the bottom of my laptop starts to
> get very hot.... a bad sign! )
>
> Does anyone have any recommendations? ( I'm supposed to meet with my prof
> on Tuesday. I may just explain the problem to him and request a smaller
> data set to work with. And unfortunately he knows very little about
> Python. He's primarily a C++ and Java programmer. )
>
> I appreciate the feedback. Thank you!!!
>
> Best,
>
> Douglas Lewit

From d-lewit at neiu.edu Tue Nov 10 12:21:16 2015
From: d-lewit at neiu.edu (Lewit, Douglas)
Date: Tue, 10 Nov 2015 11:21:16 -0600
Subject: [Chicago] Can this be done with a yield statement and generator object?
In-Reply-To:
References:

Message-ID:

Well that just makes it too easy!!! ;-)

On Tue, Nov 10, 2015 at 11:14 AM, Tim Ottinger wrote:

> You mean, like itertools.cycle?
>
> --
> Tim Ottinger, Anzeneer, Industrial Logic
> -------------------------------------
> http://www.industriallogic.com/
> http://agileotter.blogspot.com/

From d-lewit at neiu.edu Tue Nov 10 12:25:57 2015
From: d-lewit at neiu.edu (Lewit, Douglas)
Date: Tue, 10 Nov 2015 11:25:57 -0600
Subject: [Chicago] Need advice on this project.
In-Reply-To:
References:

Message-ID:

10 seconds???!!! Wow!!! Okay then, I'll buy you dinner if you finish my
homework for me!!! ;-)

Argument unpacking? What's that? As for lambdas, I just LOVE them! They
are so cool, and make certain procedures so much easier. What is PEP8? It
sounds like a nutritional supplement or an energy drink! ;-)

On Tue, Nov 10, 2015 at 9:46 AM, Adam Forsyth wrote:

> Hi Douglas,
>
> You seem to post interesting homework assignments when I'm looking for a
> fun problem, thanks.
>
> The issue definitely isn't the performance of either Python (the language)
> or CPython (the implementation). I did the assignment last night, and
> calculating the matrix for "u1.base" took my code less than 10 seconds.
>
> For readability in your Correlation function, try to avoid: globals;
> creating lambdas inside loops; and indexing with constant keys rather than
> using argument unpacking (i.e. key[0]). It also helps to follow PEP8 if you
> want other Python programmers to be able to read your code easily.
>
> You probably have an algorithmic error in there somewhere -- it's hard for
> me to tell for sure because your code is difficult to follow. Read the
> assignment carefully, and only do what it tells you. For performance, are
> there different data structures you could use? Are there "batteries
> included" in Python that could combine some of those individual arithmetic
> operations? I don't want to be too specific here because implementing the
> algorithm is the point of the assignment.
>
> It looks like you still have two weeks to complete the project, so I'd
> recommend taking your time, and don't be afraid to start a new version --
> it can help you break out of bad patterns you've started in your existing
> code.
>
> Best,
> Adam

From adam at adamforsyth.net Tue Nov 10 12:54:34 2015
From: adam at adamforsyth.net (Adam Forsyth)
Date: Tue, 10 Nov 2015 11:54:34 -0600
Subject: [Chicago] Need advice on this project.
In-Reply-To:
References:

Message-ID:

I can't tell if you're joking or not (which is a problem when you're asking
for help), but if you're actually not familiar with one or more of those
terms, I suggest you Google them. Both are important to understand as a
Python programmer.

On Tue, Nov 10, 2015 at 11:25 AM, Lewit, Douglas wrote:

> 10 seconds???!!! Wow!!! Okay then, I'll buy you dinner if you finish my
> homework for me!!! ;-)
>
> Argument unpacking? What's that? As for lambdas, I just LOVE them! They
> are so cool, and make certain procedures so much easier. What is PEP8? It
> sounds like a nutritional supplement or an energy drink! ;-)

From mgraves87 at gmail.com Tue Nov 10 10:37:36 2015
From: mgraves87 at gmail.com (Mark Graves)
Date: Tue, 10 Nov 2015 09:37:36 -0600
Subject: [Chicago] Need advice on this project.
In-Reply-To:
References:

Message-ID:

I think I must have screwed this up, can someone point out my errors?

I worked based off Doug's code, then attempted to dictify the results to
minimize lookup times in that filter function.

Full disclosure: I was only working based off no errors, with no knowledge
of the algorithm implementation.

code:

https://gist.github.com/gravesmedical/58a6b665b553c1294b56

On Tue, Nov 10, 2015 at 8:57 AM, Ross Heflin wrote:

> Might be time to profile.
> Run your similarity matrix builder with the large dataset against cProfile
> (or whatever works on PyPy) for some time (30 min) and see where it's
> spending the majority of its time.
>
> -Ross
>
> --
> From the "desk" of Ross Heflin
> phone number: (847) <23,504,826th decimal place of pi>

From len_wanger at hotmail.com Tue Nov 10 15:15:39 2015
From: len_wanger at hotmail.com (Len Wanger)
Date: Tue, 10 Nov 2015 14:15:39 -0600
Subject: [Chicago] Chicago Digest, Vol 123, Issue 8
In-Reply-To:
References:
Message-ID:

Doing it in the dunder function __next__ makes it so you can call it with
the built-in next() function or as an iterator. You have to watch out for
infinite loops in your case, but it's nice. E.g.

    c_list = circular_list( ('a', 'b', 'c') )
    for i in range(10):
        print( next(c_list) )

-or-

    for i in circular_list( ('a', 'b', 'c') ): # warning: I'm in an infinite loop!
        print(i)

Note: This is also already in the standard library. Look at cycle in the
itertools module.

One more note: be careful to use an immutable sequence (like a tuple)
instead of a list in your call to circular_list or you'll open yourself up
to nasty side effects and a long night of debugging!
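
For reference, the cyclic behaviour discussed in this thread can be sketched as a plain generator function. This is only a minimal illustration, not anyone's posted code: the name `circular` is hypothetical, and the sketch wraps around by indexing modulo the length. The loop lives in a generator function because a `__next__` method is expected to return one item per call. `itertools.cycle` from the standard library gives the same sequence, and `itertools.islice` is used here to put a bound on the otherwise endless iteration.

```python
from itertools import cycle, islice


def circular(seq):
    """Yield the items of seq forever, wrapping around at the end.

    `circular` is a hypothetical name used only for this sketch.
    """
    if not seq:
        return  # an empty sequence has nothing to cycle through
    index = 0
    while True:
        yield seq[index]
        index = (index + 1) % len(seq)  # wrap back to 0 after the last item


# An endless generator needs an explicit bound; islice takes the first n items.
first_seven = list(islice(circular(("a", "b", "c")), 7))

# itertools.cycle from the standard library yields the same sequence.
same_seven = list(islice(cycle(("a", "b", "c")), 7))
```

Passing a tuple, as suggested above, keeps the sequence from changing underneath the generator while it is being consumed.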
Len > From: chicago-request at python.org > Subject: Chicago Digest, Vol 123, Issue 8 > To: chicago at python.org > Date: Tue, 10 Nov 2015 12:26:05 -0500 > > Send Chicago mailing list submissions to > chicago at python.org > > To subscribe or unsubscribe via the World Wide Web, visit > https://mail.python.org/mailman/listinfo/chicago > or, via email, send a message with subject or body 'help' to > chicago-request at python.org > > You can reach the person managing the list at > chicago-owner at python.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of Chicago digest..." > > > Today's Topics: > > 1. Re: Can this be done with a yield statement and generator > object? (Lewit, Douglas) > 2. Re: Chicago Digest, Vol 123, Issue 6 (Lewit, Douglas) > 3. Re: Can this be done with a yield statement and generator > object? (Tim Ottinger) > 4. Re: Need advice on this project. (Lewit, Douglas) > 5. Re: Can this be done with a yield statement and generator > object? (Lewit, Douglas) > 6. Re: Need advice on this project. (Lewit, Douglas) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Tue, 10 Nov 2015 11:09:27 -0600 > From: "Lewit, Douglas" > To: The Chicago Python Users Group > Subject: Re: [Chicago] Can this be done with a yield statement and > generator object? > Message-ID: > > Content-Type: text/plain; charset="utf-8" > > That's very logical, Will, thanks! :-) > > On Tue, Nov 10, 2015 at 9:02 AM, William E. S. Clemens > wrote: > > > I didn't test, but something like this should do the same. > > > > def circular_list(array): > > while True: > > counter = -1 > > if counter == len(array) - 1: > > counter = -1 > > counter+=1 > > yield array[counter] > > > > > > On Tue, Nov 10, 2015 at 2:10 AM, Lewit, Douglas wrote: > > > >> Hey guys, > >> > >> I'm attaching a simple class that I created in Python.... 
Python 3 to be > >> specific, but I think it should work in Python 2 as well, maybe. Anyhow, > >> is there a way to implement the same concept using a *yield statement* > >> in a function to create a generator object? Just wondering. Let me know, > >> thanks! > >> > >> Best, > >> > >> Douglas Lewit > >> > >> P.S. Obviously if you use a generator object to do this then the > >> generator object would never produce the StopIteration error. But I'm kind > >> of confused about how to create and define a generator object that would > >> produce this cyclical behavior in an array or list. > >> > >> > >> > >> _______________________________________________ > >> Chicago mailing list > >> Chicago at python.org > >> https://mail.python.org/mailman/listinfo/chicago > >> > >> > > > > > > -- > > William Clemens > > Phone: 847.485.9455 > > E-mail: wesclemens at gmail.com > > > > _______________________________________________ > > Chicago mailing list > > Chicago at python.org > > https://mail.python.org/mailman/listinfo/chicago > > > > > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: > > ------------------------------ > > Message: 2 > Date: Tue, 10 Nov 2015 11:11:30 -0600 > From: "Lewit, Douglas" > To: The Chicago Python Users Group > Subject: Re: [Chicago] Chicago Digest, Vol 123, Issue 6 > Message-ID: > > Content-Type: text/plain; charset="utf-8" > > Thanks Len, pretty simple and straightforward! I'm still getting the feel > for the "yield" statement. 
> > On Tue, Nov 10, 2015 at 9:51 AM, Len Wanger wrote: > > > Try changing your next routine to this: > > > > def __next__(self): > > while True: > > if self.counter == len(self.array) - 1: > > self.counter = -1 > > self.counter+=1 > > yield self.array[self.counter] > > > > > > Len > > > > > From: chicago-request at python.org > > > Subject: Chicago Digest, Vol 123, Issue 6 > > > To: chicago at python.org > > > Date: Tue, 10 Nov 2015 09:54:39 -0500 > > > > > > Send Chicago mailing list submissions to > > > chicago at python.org > > > > > > To subscribe or unsubscribe via the World Wide Web, visit > > > https://mail.python.org/mailman/listinfo/chicago > > > or, via email, send a message with subject or body 'help' to > > > chicago-request at python.org > > > > > > You can reach the person managing the list at > > > chicago-owner at python.org > > > > > > When replying, please edit your Subject line so it is more specific > > > than "Re: Contents of Chicago digest..." > > > > > > > > > Today's Topics: > > > > > > 1. Can this be done with a yield statement and generator object? > > > (Lewit, Douglas) > > > > > > > > > ---------------------------------------------------------------------- > > > > > > Message: 1 > > > Date: Tue, 10 Nov 2015 02:10:12 -0600 > > > From: "Lewit, Douglas" > > > To: The Chicago Python Users Group > > > Subject: [Chicago] Can this be done with a yield statement and > > > generator object? > > > Message-ID: > > > > > > Content-Type: text/plain; charset="utf-8" > > > > > > Hey guys, > > > > > > I'm attaching a simple class that I created in Python.... Python 3 to be > > > specific, but I think it should work in Python 2 as well, maybe. Anyhow, > > > is there a way to implement the same concept using a *yield statement* > > in a > > > function to create a generator object? Just wondering. Let me know, > > > thanks! > > > > > > Best, > > > > > > Douglas Lewit > > > > > > P.S. 
Obviously if you use a generator object to do this then the > > generator > > > object would never produce the StopIteration error. But I'm kind of > > > confused about how to create and define a generator object that would > > > produce this cyclical behavior in an array or list. > > > -------------- next part -------------- > > > An HTML attachment was scrubbed... > > > URL: < > > http://mail.python.org/pipermail/chicago/attachments/20151110/b012ba59/attachment.html > > > > > > -------------- next part -------------- > > > A non-text attachment was scrubbed... > > > Name: Circular_List_in_Python.png > > > Type: image/png > > > Size: 277517 bytes > > > Desc: not available > > > URL: < > > http://mail.python.org/pipermail/chicago/attachments/20151110/b012ba59/attachment.png > > > > > > -------------- next part -------------- > > > A non-text attachment was scrubbed... > > > Name: CircularList.py > > > Type: text/x-python > > > Size: 373 bytes > > > Desc: not available > > > URL: < > > http://mail.python.org/pipermail/chicago/attachments/20151110/b012ba59/attachment.py > > > > > > > > > ------------------------------ > > > > > > Subject: Digest Footer > > > > > > _______________________________________________ > > > Chicago mailing list > > > Chicago at python.org > > > https://mail.python.org/mailman/listinfo/chicago > > > > > > > > > ------------------------------ > > > > > > End of Chicago Digest, Vol 123, Issue 6 > > > *************************************** > > > > _______________________________________________ > > Chicago mailing list > > Chicago at python.org > > https://mail.python.org/mailman/listinfo/chicago > > > > > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: > > ------------------------------ > > Message: 3 > Date: Tue, 10 Nov 2015 11:14:47 -0600 > From: Tim Ottinger > To: The Chicago Python Users Group > Subject: Re: [Chicago] Can this be done with a yield statement and > generator object? 
> Message-ID: > > Content-Type: text/plain; charset="utf-8" > > You mean, like itertools.cycle? > > > On Tue, Nov 10, 2015 at 11:09 AM, Lewit, Douglas wrote: > > > That's very logical, Will, thanks! :-) > > > > On Tue, Nov 10, 2015 at 9:02 AM, William E. S. Clemens < > > wesclemens at gmail.com> wrote: > > > >> I didn't test, but something like this should do the same. > >> > >> def circular_list(array): > >> while True: > >> counter = -1 > >> if counter == len(array) - 1: > >> counter = -1 > >> counter+=1 > >> yield array[counter] > >> > >> > >> On Tue, Nov 10, 2015 at 2:10 AM, Lewit, Douglas wrote: > >> > >>> Hey guys, > >>> > >>> I'm attaching a simple class that I created in Python.... Python 3 to be > >>> specific, but I think it should work in Python 2 as well, maybe. Anyhow, > >>> is there a way to implement the same concept using a *yield statement* > >>> in a function to create a generator object? Just wondering. Let me know, > >>> thanks! > >>> > >>> Best, > >>> > >>> Douglas Lewit > >>> > >>> P.S. Obviously if you use a generator object to do this then the > >>> generator object would never produce the StopIteration error. But I'm kind > >>> of confused about how to create and define a generator object that would > >>> produce this cyclical behavior in an array or list. 
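A minimal sketch of the generator asked about above, assuming the goal is simply to walk a list forever and wrap around at the end. The name `circular` is made up for illustration and is not taken from the attached CircularList.py. The index is set once, before the loop, so the position survives between `yield`s; `itertools.cycle`, mentioned in the thread, produces the same sequence.

```python
from itertools import cycle, islice


def circular(items):
    """Yield the elements of items forever, wrapping around at the end."""
    if not items:
        return  # nothing to cycle over; avoids spinning on an empty list
    index = 0  # initialised once, outside the loop, so position is remembered
    while True:
        yield items[index]
        index = (index + 1) % len(items)


gen = circular(["a", "b", "c"])
first_seven = [next(gen) for _ in range(7)]
print(first_seven)

# The standard-library equivalent gives the same values.
print(list(islice(cycle(["a", "b", "c"]), 7)))
```

Because the generator never raises StopIteration on a non-empty list, callers need their own stopping condition (here `range(7)` and `islice`).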
> >>> > >>> > >>> > >>> _______________________________________________ > >>> Chicago mailing list > >>> Chicago at python.org > >>> https://mail.python.org/mailman/listinfo/chicago > >>> > >>> > >> > >> > >> -- > >> William Clemens > >> Phone: 847.485.9455 > >> E-mail: wesclemens at gmail.com > >> > >> _______________________________________________ > >> Chicago mailing list > >> Chicago at python.org > >> https://mail.python.org/mailman/listinfo/chicago > >> > >> > > > > _______________________________________________ > > Chicago mailing list > > Chicago at python.org > > https://mail.python.org/mailman/listinfo/chicago > > > > > > > -- > Tim Ottinger, Anzeneer, Industrial Logic > ------------------------------------- > http://www.industriallogic.com/ > http://agileotter.blogspot.com/ > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: > > ------------------------------ > > Message: 4 > Date: Tue, 10 Nov 2015 11:20:33 -0600 > From: "Lewit, Douglas" > To: The Chicago Python Users Group > Subject: Re: [Chicago] Need advice on this project. > Message-ID: > > Content-Type: text/plain; charset="utf-8" > > This was amazingly helpful, thanks! I'll check the denominator of my > correlation, but I'm pretty sure that's correct. But it won't hurt to > double check it. When I took a slice of my similarity matrix all the > correlations were floats in between -1 and +1, so that's a good sign that > my computation was correct albeit very time consuming. The reason I didn't > use Numpy arrays is because the professor for this class doesn't know a lot > of Python, and he uses Microsoft Visual Studio to run my Python programs. > I don't know if Numpy is a part of that installation. Numpy is not part of > the standard Python installation, so if I submit a program that contains > anything from the Numpy library then he won't be able to run my code. I > emailed him and asked him about his Python installation, but he didn't get > back to me. 
> > Thanks for the feedback! Very much appreciated!!! > > Best, > > Douglas. > > > On Tue, Nov 10, 2015 at 8:52 AM, Sunhwan Jo wrote: > > > 1. Your "correlation" function takes most of the execution time. > > > > def Correlation(p, q): > > global PQ_Ratings > > sum1 = 0 > > sum2 = 0 > > numeratorProduct = 1 > > denominatorProduct1 = 1 > > denominatorProduct2 = 1 > > for key in filter( lambda x: x[0] == p or x[0] == q, > > PQ_Ratings.keys( ) ): > > if key[0] == p: > > sum1+= PQ_Ratings[key] - AverageRatingsOfItems[key[1]] > > else: > > sum2+= PQ_Ratings[key] - AverageRatingsOfItems[key[1]] > > numeratorProduct+= sum1*sum2 > > denominatorProduct1+= sum1**2 > > denominatorProduct2+= sum2**2 > > return > > numeratorProduct/(math.sqrt(denominatorProduct1)*math.sqrt(denominatorProduct2)) > > > > > > Computing sum1 and sum2 with list comprehensions can increase the execution > > speed about 10x (rough estimate using your code). In addition, the > > denominator is also wrong. It should be *sum of squared differences* not > > *square of sum of differences*, but I'm not concerned with this yet. > > > > def Correlation(p, q): > > global PQ_Ratings > > sum1 = 0 > > sum2 = 0 > > numeratorProduct = 1 > > denominatorProduct1 = 1 > > denominatorProduct2 = 1 > > keys = [key for key in PQ_Ratings.keys() if key[0] == p or key[0] == > > q] > > sum1 = sum([PQ_Ratings[key] - AverageRatingsOfItems[key[1]] for key > > in keys if key[0] == p]) > > sum2 = sum([PQ_Ratings[key] - AverageRatingsOfItems[key[1]] for key > > in keys if key[0] == q]) > > numeratorProduct+= sum1*sum2 > > denominatorProduct1+= sum1**2 > > denominatorProduct2+= sum2**2 > > return > > numeratorProduct/(math.sqrt(denominatorProduct1)*math.sqrt(denominatorProduct2)) > > > > > > 2. You don't have to re-calculate sum1 each time. "sum1" only depends on > > "p". So, you can calculate that only in the outer loop and reuse it.
> > > > keys = PQ_Ratings.keys() > > for i in range(1, len(SimilarityMatrix)): > > sum1 = sum([PQ_Ratings[key] - AverageRatingsOfItems[key[1]] for key > > in keys if key[0] == i]) > > > > for j in range(i + 1, len(SimilarityMatrix)): > > sum2 = sum([PQ_Ratings[key] - AverageRatingsOfItems[key[1]] > > for key in keys if key[0] == j]) > > numeratorProduct = sum1*sum2 + 1 > > denominatorProduct1 = sum1**2 + 1 > > denominatorProduct2 = sum2**2 + 1 > > SimilarityMatrix[i][j] = > > numeratorProduct/(math.sqrt(denominatorProduct1)*math.sqrt(denominatorProduct2)) > > > > > > This speeds things up again, but the total execution time is still about 200 minutes > > with 900+ users. > > > > 3. Is there any reason not to use NumPy arrays? Using NumPy it finishes > > in a fraction of a minute. Notice I also fixed the bug in the > > numerator and the denominator. > > > > import numpy as np > > nitems = max(AverageRatingsOfItems.keys()) > > nusers = max([key[0] for key in PQ_Ratings.keys()]) > > avg_rating = np.zeros(nitems) > > pq_rating = np.zeros((nusers, nitems)) > > keys = PQ_Ratings.keys() > > for key in keys: > > pq_rating[key[0]-1, key[1]-1] = PQ_Ratings[key] > > keys = AverageRatingsOfItems.keys() > > for key in keys: > > avg_rating[key-1] = AverageRatingsOfItems[key] > > > > startTime = time.time( ) > > > > #### Let's finish building up our similarity matrix for this problem.
> > keys = PQ_Ratings.keys() > > for i in range(1, len(SimilarityMatrix)): > > #sum1 = sum([PQ_Ratings[key] - AverageRatingsOfItems[key[1]] for key > > in keys if key[0] == i]) > > diff1 = np.sum(pq_rating[i-1] - avg_rating) > > > > for j in range(i + 1, len(SimilarityMatrix)): > > #sum2 = sum([PQ_Ratings[key] - AverageRatingsOfItems[key[1]] > > for key in keys if key[0] == j]) > > diff2 = np.sum(pq_rating[j-1] - avg_rating) > > numeratorProduct = np.sum(diff1*diff2) > > denominatorProduct1 = np.sum(diff1**2) > > denominatorProduct2 = np.sum(diff2**2) > > SimilarityMatrix[i][j] = > > numeratorProduct/(math.sqrt(denominatorProduct1)*math.sqrt(denominatorProduct2)) > > > > > > > > > > On Nov 9, 2015, at 7:44 PM, Lewit, Douglas wrote: > > > > Hey guys, > > > > I need some advice on this one. I'm attaching the homework assignment so > > that you understand what I'm trying to do. I went as far as the > > construction of the Similarity Matrix, which is a matrix of Pearson > > correlation coefficients. > > > > My problem is this. u1.base (which is also attached) contains Users > > (first column), Items (second column), Ratings (third column) and finally > > the time stamp in the 4th and final column. (Just discard the 4th column. > > We're not using it for anything. ) > > > > It's taking HOURS for Python to build the similarity matrix. So what I > > did was: > > > > *head -n 5000 u1.base > practice.base* > > > > and I also downloaded the PyPy interpreter for Python 3. Then using PyPy > > (or pypy or whatever) I ran my program on the first ten thousand lines of > > data from u1.base stored in the new text file, practice.base. Not a > > problem!!! I still had to wait a couple minutes, but not a couple hours!!! > > > > > > Is there a way to make this program work for such a large set of data? I > > know my program successfully constructs the Similarity Matrix (i.e. > > similarity between users) for 5,000, 10,000, 20,000 and even 25,000 lines > > of data. 
But for 80,000 lines of data the program becomes very slow and > > overtaxes my CPU. (The fan turns on and the bottom of my laptop starts to > > get very hot.... a bad sign! ) > > > > Does anyone have any recommendations? ( I'm supposed to meet with my prof > > on Tuesday. I may just explain the problem to him and request a smaller > > data set to work with. And unfortunately he knows very little about > > Python. He's primarily a C++ and Java programmer. ) > > > > I appreciate the feedback. Thank you!!! > > > > Best, > > > > Douglas Lewit > > > > > > > > _______________________________________________ > > Chicago mailing list > > Chicago at python.org > > https://mail.python.org/mailman/listinfo/chicago > > > > > > > > _______________________________________________ > > Chicago mailing list > > Chicago at python.org > > https://mail.python.org/mailman/listinfo/chicago > > > > > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: > > ------------------------------ > > Message: 5 > Date: Tue, 10 Nov 2015 11:21:16 -0600 > From: "Lewit, Douglas" > To: The Chicago Python Users Group > Subject: Re: [Chicago] Can this be done with a yield statement and > generator object? > Message-ID: > > Content-Type: text/plain; charset="utf-8" > > Well that just makes it too easy!!! ;-) > > On Tue, Nov 10, 2015 at 11:14 AM, Tim Ottinger wrote: > > > You mean, like itertools.cycle? > > > > > > On Tue, Nov 10, 2015 at 11:09 AM, Lewit, Douglas wrote: > > > >> That's very logical, Will, thanks! :-) > >> > >> On Tue, Nov 10, 2015 at 9:02 AM, William E. S. Clemens < > >> wesclemens at gmail.com> wrote: > >> > >>> I didn't test, but something like this should do the same. 
> >>> > >>> def circular_list(array): > >>> while True: > >>> counter = -1 > >>> if counter == len(array) - 1: > >>> counter = -1 > >>> counter+=1 > >>> yield array[counter] > >>> > >>> > >>> On Tue, Nov 10, 2015 at 2:10 AM, Lewit, Douglas > >>> wrote: > >>> > >>>> Hey guys, > >>>> > >>>> I'm attaching a simple class that I created in Python.... Python 3 to > >>>> be specific, but I think it should work in Python 2 as well, maybe. > >>>> Anyhow, is there a way to implement the same concept using a *yield > >>>> statement* in a function to create a generator object? Just > >>>> wondering. Let me know, thanks! > >>>> > >>>> Best, > >>>> > >>>> Douglas Lewit > >>>> > >>>> P.S. Obviously if you use a generator object to do this then the > >>>> generator object would never produce the StopIteration error. But I'm kind > >>>> of confused about how to create and define a generator object that would > >>>> produce this cyclical behavior in an array or list. > >>>> > >>>> > >>>> > >>>> _______________________________________________ > >>>> Chicago mailing list > >>>> Chicago at python.org > >>>> https://mail.python.org/mailman/listinfo/chicago > >>>> > >>>> > >>> > >>> > >>> -- > >>> William Clemens > >>> Phone: 847.485.9455 > >>> E-mail: wesclemens at gmail.com > >>> > >>> _______________________________________________ > >>> Chicago mailing list > >>> Chicago at python.org > >>> https://mail.python.org/mailman/listinfo/chicago > >>> > >>> > >> > >> _______________________________________________ > >> Chicago mailing list > >> Chicago at python.org > >> https://mail.python.org/mailman/listinfo/chicago > >> > >> > > > > > > -- > > Tim Ottinger, Anzeneer, Industrial Logic > > ------------------------------------- > > http://www.industriallogic.com/ > > http://agileotter.blogspot.com/ > > > > _______________________________________________ > > Chicago mailing list > > Chicago at python.org > > https://mail.python.org/mailman/listinfo/chicago > > > > > -------------- next part 
-------------- > An HTML attachment was scrubbed... > URL: > > ------------------------------ > > Message: 6 > Date: Tue, 10 Nov 2015 11:25:57 -0600 > From: "Lewit, Douglas" > To: The Chicago Python Users Group > Subject: Re: [Chicago] Need advice on this project. > Message-ID: > > Content-Type: text/plain; charset="utf-8" > > 10 seconds???!!! Wow!!! Okay then, I'll buy you dinner if you finish my > homework for me!!! ;-) > > Argument unpacking? What's that? As for lambdas, I just LOVE them! They > are so cool, and make certain procedures so much easier. What is PEP8? It > sounds like a nutritional supplement or an energy drink! ;-) > > On Tue, Nov 10, 2015 at 9:46 AM, Adam Forsyth wrote: > > > Hi Douglas, > > > > You seem to post interesting homework assignments when I'm looking for a > > fun problem, thanks. > > > > The issue definitely isn't the performance of either Python (the language) > > or CPython (the implementation). I did the assignment last night, and > > calculating the matrix for "u1.base" took my code less than 10 seconds. > > > > For readability in your Correlation function, try to avoid: globals; > > creating lambdas inside loops; and indexing with constant keys rather than > > using argument unpacking (i.e. key[0]). It also helps to follow PEP8 if you > > want other Python programmers to be able to read your code easily. > > > > You probably have an algorithmic error in there somewhere -- it's hard for > > me to tell for sure because your code is difficult to follow. Read the > > assignment carefully, and only do what it tells you. For performance, are > > there different data structures you could use? Are there "batteries > > included" in Python that could combine some of those individual arithmetic > > operations? I don't want to be too specific here because implementing the > > algorithm is the point of the assignment. 
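Since the question "Argument unpacking? What's that?" comes up above, here is a small generic sketch. The `ratings` dictionary and the `describe` function are invented for illustration and are not the assignment's data or code; the point is only the contrast between indexing a tuple with constants (`key[0]`, `key[1]`) and giving each part a name.

```python
# Hypothetical ratings keyed by (user, item); not the assignment's data.
ratings = {(1, 10): 4, (1, 11): 3, (2, 10): 5}

# Indexing with constant keys: the reader has to remember what [0] and [1] mean.
by_index = [key[1] for key in ratings if key[0] == 1]

# Tuple unpacking: each part of the key gets a name right in the loop header.
by_name = [item for (user, item) in ratings if user == 1]


def describe(user, item, rating):
    """Format one rating record as a short sentence."""
    return "user %d rated item %d as %d" % (user, item, rating)


# Argument unpacking with *: the tuple is spread across the parameters.
record = (1, 10, 4)
print(describe(*record))
print(by_index == by_name)
```

Both list comprehensions select the same items; the second just reads more easily, which is the readability point made in the advice above.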
> > > > It looks like you still have two weeks to complete the project, so I'd > > recommend taking your time, and don't be afraid to start a new version -- > > it can help you break out of bad patterns you've started in your existing > > code. > > > > Best, > > Adam > > > > > > > > On Mon, Nov 9, 2015 at 7:44 PM, Lewit, Douglas wrote: > > > >> Hey guys, > >> > >> I need some advice on this one. I'm attaching the homework assignment so > >> that you understand what I'm trying to do. I went as far as the > >> construction of the Similarity Matrix, which is a matrix of Pearson > >> correlation coefficients. > >> > >> My problem is this. u1.base (which is also attached) contains Users > >> (first column), Items (second column), Ratings (third column) and finally > >> the time stamp in the 4th and final column. (Just discard the 4th column. > >> We're not using it for anything. ) > >> > >> It's taking HOURS for Python to build the similarity matrix. So what I > >> did was: > >> > >> *head -n 5000 u1.base > practice.base* > >> > >> and I also downloaded the PyPy interpreter for Python 3. Then using PyPy > >> (or pypy or whatever) I ran my program on the first ten thousand lines of > >> data from u1.base stored in the new text file, practice.base. Not a > >> problem!!! I still had to wait a couple minutes, but not a couple hours!!! > >> > >> > >> Is there a way to make this program work for such a large set of data? I > >> know my program successfully constructs the Similarity Matrix (i.e. > >> similarity between users) for 5,000, 10,000, 20,000 and even 25,000 lines > >> of data. But for 80,000 lines of data the program becomes very slow and > >> overtaxes my CPU. (The fan turns on and the bottom of my laptop starts to > >> get very hot.... a bad sign! ) > >> > >> Does anyone have any recommendations? ( I'm supposed to meet with my > >> prof on Tuesday. I may just explain the problem to him and request a > >> smaller data set to work with. 
And unfortunately he knows very little > >> about Python. He's primarily a C++ and Java programmer. ) > >> > >> I appreciate the feedback. Thank you!!! > >> > >> Best, > >> > >> Douglas Lewit > >> > >> > >> > >> _______________________________________________ > >> Chicago mailing list > >> Chicago at python.org > >> https://mail.python.org/mailman/listinfo/chicago > >> > >> > > > > _______________________________________________ > > Chicago mailing list > > Chicago at python.org > > https://mail.python.org/mailman/listinfo/chicago > > > > > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: > > ------------------------------ > > Subject: Digest Footer > > _______________________________________________ > Chicago mailing list > Chicago at python.org > https://mail.python.org/mailman/listinfo/chicago > > > ------------------------------ > > End of Chicago Digest, Vol 123, Issue 8 > *************************************** -------------- next part -------------- An HTML attachment was scrubbed... URL: From adam at adamforsyth.net Tue Nov 10 23:47:10 2015 From: adam at adamforsyth.net (Adam Forsyth) Date: Tue, 10 Nov 2015 22:47:10 -0600 Subject: [Chicago] Need advice on this project. In-Reply-To: References:

Message-ID: Please keep in mind this a graded project for a course, so we shouldn't post answers, only general help and advice. On Tue, Nov 10, 2015 at 9:37 AM, Mark Graves wrote: > I think I must have screwed this up, can someone point out my errors? > > I worked based off Doug's code, then attempted to dictify the results to > minimize lookup times in that filter function. > > Full disclosure: I was only working based off no errors, with no knowledge > of the algorithm implementation. > > code: > > https://gist.github.com/gravesmedical/58a6b665b553c1294b56 > > On Tue, Nov 10, 2015 at 8:57 AM, Ross Heflin > wrote: > >> Might be time to profile. >> Run your similarity matrix builder with the large dataset against >> cProfile (or whatever works on PyPy) for some time (30 min) and see where >> its spending the majority of its time. >> >> -Ross >> >> On Mon, Nov 9, 2015 at 7:44 PM, Lewit, Douglas wrote: >> >>> Hey guys, >>> >>> I need some advice on this one. I'm attaching the homework assignment >>> so that you understand what I'm trying to do. I went as far as the >>> construction of the Similarity Matrix, which is a matrix of Pearson >>> correlation coefficients. >>> >>> My problem is this. u1.base (which is also attached) contains Users >>> (first column), Items (second column), Ratings (third column) and finally >>> the time stamp in the 4th and final column. (Just discard the 4th column. >>> We're not using it for anything. ) >>> >>> It's taking HOURS for Python to build the similarity matrix. So what I >>> did was: >>> >>> *head -n 5000 u1.base > practice.base* >>> >>> and I also downloaded the PyPy interpreter for Python 3. Then using >>> PyPy (or pypy or whatever) I ran my program on the first ten thousand lines >>> of data from u1.base stored in the new text file, practice.base. Not a >>> problem!!! I still had to wait a couple minutes, but not a couple hours!!! >>> >>> >>> Is there a way to make this program work for such a large set of data? 
>>> I know my program successfully constructs the Similarity Matrix (i.e. >>> similarity between users) for 5,000, 10,000, 20,000 and even 25,000 lines >>> of data. But for 80,000 lines of data the program becomes very slow and >>> overtaxes my CPU. (The fan turns on and the bottom of my laptop starts to >>> get very hot.... a bad sign! ) >>> >>> Does anyone have any recommendations? ( I'm supposed to meet with my >>> prof on Tuesday. I may just explain the problem to him and request a >>> smaller data set to work with. And unfortunately he knows very little >>> about Python. He's primarily a C++ and Java programmer. ) >>> >>> I appreciate the feedback. Thank you!!! >>> >>> Best, >>> >>> Douglas Lewit >>> >>> >>> >>> _______________________________________________ >>> Chicago mailing list >>> Chicago at python.org >>> https://mail.python.org/mailman/listinfo/chicago >>> >>> >> >> >> -- >> From the "desk" of Ross Heflin >> phone number: (847) <23,504,826th decimal place of pi> >> >> _______________________________________________ >> Chicago mailing list >> Chicago at python.org >> https://mail.python.org/mailman/listinfo/chicago >> >> > > _______________________________________________ > Chicago mailing list > Chicago at python.org > https://mail.python.org/mailman/listinfo/chicago > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From carl at personnelware.com Wed Nov 11 04:11:11 2015 From: carl at personnelware.com (Carl Karsten) Date: Wed, 11 Nov 2015 03:11:11 -0600 Subject: [Chicago] django streaming zip Message-ID: I have a django app and nginx serving up static files. I want the user to be able to get a set of files. it will be 5 to 10 files, most of them will be 5-10 meg, so about 50-100 meg total. The thought is provide a zip link, click the link, get a zip file. I am trying to avoid creating zip files on the local fs, either ahead of time or on the fly. 
I am hoping I can pass something a list of file names and it will return the a file like object that I can pass on to whatever and the client will get is zip file as the server creates it. btw, it is a cut list (xml of file names and cut points) 2 png, 2-10 .webm files. -- Carl K -------------- next part -------------- An HTML attachment was scrubbed... URL: From d-lewit at neiu.edu Wed Nov 11 07:10:57 2015 From: d-lewit at neiu.edu (Lewit, Douglas) Date: Wed, 11 Nov 2015 06:10:57 -0600 Subject: [Chicago] Need advice on this project. In-Reply-To: References:

Message-ID: Hey Mark, Please don't sweat it too much! I ran my program overnight. With Pypy it took slightly more than 2 hours. Then I wrote my matrix to a file. Read it back in just to make sure it worked--and it did!!!! My prof said that the data won't change, so the important thing is to just save the matrix in a file and then reuse that matrix (which comes from the training set) and then use it to make some "guesses" about the test set. In other words, I don't have to keep generating the matrix over and over and over again, thank god!!! Basically a one-shot deal and then I can just store the results in a text file and read the results back into a list of lists for future applications. But thanks for the help! I'll check out what you did on Github.... but first a little sleep! Stay warm, Doug. On Tue, Nov 10, 2015 at 9:37 AM, Mark Graves wrote: > I think I must have screwed this up, can someone point out my errors? > > I worked based off Doug's code, then attempted to dictify the results to > minimize lookup times in that filter function. > > Full disclosure: I was only working based off no errors, with no knowledge > of the algorithm implementation. > > code: > > https://gist.github.com/gravesmedical/58a6b665b553c1294b56 > > On Tue, Nov 10, 2015 at 8:57 AM, Ross Heflin > wrote: > >> Might be time to profile. >> Run your similarity matrix builder with the large dataset against >> cProfile (or whatever works on PyPy) for some time (30 min) and see where >> its spending the majority of its time. >> >> -Ross >> >> On Mon, Nov 9, 2015 at 7:44 PM, Lewit, Douglas wrote: >> >>> Hey guys, >>> >>> I need some advice on this one. I'm attaching the homework assignment >>> so that you understand what I'm trying to do. I went as far as the >>> construction of the Similarity Matrix, which is a matrix of Pearson >>> correlation coefficients. >>> >>> My problem is this. 
u1.base (which is also attached) contains Users >>> (first column), Items (second column), Ratings (third column) and finally >>> the time stamp in the 4th and final column. (Just discard the 4th column. >>> We're not using it for anything. ) >>> >>> It's taking HOURS for Python to build the similarity matrix. So what I >>> did was: >>> >>> *head -n 5000 u1.base > practice.base* >>> >>> and I also downloaded the PyPy interpreter for Python 3. Then using >>> PyPy (or pypy or whatever) I ran my program on the first ten thousand lines >>> of data from u1.base stored in the new text file, practice.base. Not a >>> problem!!! I still had to wait a couple minutes, but not a couple hours!!! >>> >>> >>> Is there a way to make this program work for such a large set of data? >>> I know my program successfully constructs the Similarity Matrix (i.e. >>> similarity between users) for 5,000, 10,000, 20,000 and even 25,000 lines >>> of data. But for 80,000 lines of data the program becomes very slow and >>> overtaxes my CPU. (The fan turns on and the bottom of my laptop starts to >>> get very hot.... a bad sign! ) >>> >>> Does anyone have any recommendations? ( I'm supposed to meet with my >>> prof on Tuesday. I may just explain the problem to him and request a >>> smaller data set to work with. And unfortunately he knows very little >>> about Python. He's primarily a C++ and Java programmer. ) >>> >>> I appreciate the feedback. Thank you!!! 
>>> >>> Best, >>> >>> Douglas Lewit >>> >>> >>> >>> _______________________________________________ >>> Chicago mailing list >>> Chicago at python.org >>> https://mail.python.org/mailman/listinfo/chicago >>> >>> >> >> >> -- >> From the "desk" of Ross Heflin >> phone number: (847) <23,504,826th decimal place of pi> >> >> _______________________________________________ >> Chicago mailing list >> Chicago at python.org >> https://mail.python.org/mailman/listinfo/chicago >> >> > > _______________________________________________ > Chicago mailing list > Chicago at python.org > https://mail.python.org/mailman/listinfo/chicago > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris at sinchok.com Wed Nov 11 10:34:15 2015 From: chris at sinchok.com (Chris Sinchok) Date: Wed, 11 Nov 2015 09:34:15 -0600 Subject: [Chicago] django streaming zip In-Reply-To: References: Message-ID: A couple thoughts on this: 1. You could use the `zipfile` module to create a temporary file, and then stream that, deleting it once the streaming is done. 2. You could try something like this, which appears to do almost exactly what you want: https://github.com/allanlei/python-zipstream (I've personally never tried this package before, though) - Chris On Wed, Nov 11, 2015 at 3:11 AM, Carl Karsten wrote: > I have a django app and nginx serving up static files. > > I want the user to be able to get a set of files. it will be 5 to 10 > files, most of them will be 5-10 meg, so about 50-100 meg total. > > The thought is provide a zip link, click the link, get a zip file. > > I am trying to avoid creating zip files on the local fs, either ahead of > time or on the fly. I am hoping I can pass something a list of file names > and it will return the a file like object that I can pass on to whatever > and the client will get is zip file as the server creates it. > > btw, it is a cut list (xml of file names and cut points) 2 png, 2-10 .webm > files. 
> > > > > -- > Carl K > > > _______________________________________________ > Chicago mailing list > Chicago at python.org > https://mail.python.org/mailman/listinfo/chicago > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From carl at personnelware.com Wed Nov 11 11:37:35 2015 From: carl at personnelware.com (Carl Karsten) Date: Wed, 11 Nov 2015 10:37:35 -0600 Subject: [Chicago] django streaming zip In-Reply-To: References: Message-ID: "Like Python's ZipFile module, except it works as a generator that provides the file in many small chunks." Neat! and yes, exactly what I was hoping for. On Wed, Nov 11, 2015 at 9:34 AM, Chris Sinchok wrote: > A couple thoughts on this: > > 1. You could use the `zipfile` module to create a temporary file, and then > stream that, deleting it once the streaming is done. > 2. You could try something like this, which appears to do almost exactly > what you want: https://github.com/allanlei/python-zipstream (I've > personally never tried this package before, though) > > - Chris > > On Wed, Nov 11, 2015 at 3:11 AM, Carl Karsten > wrote: > >> I have a django app and nginx serving up static files. >> >> I want the user to be able to get a set of files. it will be 5 to 10 >> files, most of them will be 5-10 meg, so about 50-100 meg total. >> >> The thought is provide a zip link, click the link, get a zip file. >> >> I am trying to avoid creating zip files on the local fs, either ahead of >> time or on the fly. I am hoping I can pass something a list of file names >> and it will return the a file like object that I can pass on to whatever >> and the client will get is zip file as the server creates it. >> >> btw, it is a cut list (xml of file names and cut points) 2 png, 2-10 >> .webm files. 
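For reference, a rough standard-library sketch of the "zip as a generator" idea discussed above. It is not the python-zipstream API; `ChunkBuffer` and `stream_zip` are names invented here. It assumes a recent Python 3, where `zipfile.ZipFile` can write to a non-seekable file-like object, and it buffers one member at a time in memory, which seems tolerable for files of 5-10 MB but is still an assumption about the workload.

```python
import os
import zipfile


class ChunkBuffer:
    """Write-only, non-seekable sink that collects whatever zipfile writes."""

    def __init__(self):
        self._chunks = []

    def write(self, data):
        self._chunks.append(bytes(data))
        return len(data)  # zipfile uses the return value to track its offset

    def flush(self):
        pass  # nothing to flush; data is handed out via drain()

    def drain(self):
        """Return everything written since the last drain and reset."""
        data = b"".join(self._chunks)
        self._chunks = []
        return data


def stream_zip(paths):
    """Yield the bytes of a zip archive of paths, one piece per member."""
    buf = ChunkBuffer()
    # ZIP_STORED: no compression, a guess that suits already-compressed media.
    with zipfile.ZipFile(buf, mode="w", compression=zipfile.ZIP_STORED) as zf:
        for path in paths:
            zf.write(path, arcname=os.path.basename(path))
            yield buf.drain()
    # Closing the archive writes the central directory; emit that last.
    yield buf.drain()
```

In a Django view, a generator like this could presumably be passed to `StreamingHttpResponse` with a `application/zip` content type, so nothing is written to the local filesystem; that wiring is untested here.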
>> >> >> >> >> -- >> Carl K >> >> >> _______________________________________________ >> Chicago mailing list >> Chicago at python.org >> https://mail.python.org/mailman/listinfo/chicago >> >> > > _______________________________________________ > Chicago mailing list > Chicago at python.org > https://mail.python.org/mailman/listinfo/chicago > > -- Carl K -------------- next part -------------- An HTML attachment was scrubbed... URL: From walkersam at gmail.com Thu Nov 12 17:21:28 2015 From: walkersam at gmail.com (Sam Walker) Date: Thu, 12 Nov 2015 16:21:28 -0600 Subject: [Chicago] ChiPy November 12th Meeting In-Reply-To: References: Message-ID: Sorry to spam the list with this, but the RSVP on the chipy.org site has not been working for me. Tried on multiple OS's and browsers and the Recaptcha checkbox just spins and spins..... Is there some other way I can get added to the list for tonight's meeting? --Samuel Walker On Mon, Nov 9, 2015 at 10:48 PM, Joe Jasinski wrote: > All, > > November ChiPy is fast approaching. This month, we are meeting at the > Nokia office in the loop. All are welcome! You can find more > information about ChiPy at our website http://www.chipy.org/ > > Hope to see you there! > > *When:* > Thursday November 12th, 7:00pm > > *How:* > You can rsvp at chipy.org or via our Meetup > group. > > *Where:* > HERE (Nokia) > 100 North Riverside Chicago, IL 60606 > > *What:* > > - *Python at Nokia (by MacGregor Felix)* > (1:00:00 Minutes) > By: > Python is known to be a multi-purpose and multi-paradigm programming > language. Come see how the Reality Capture & Processing (RCP) group of > Nokia HERE is making use of Python's versatility. We will show you how HERE > RCP uses Python's Object Oriented constructs to represent business models > in production systems. You will see how Python's functional lambdas are > used to elegantly facilitate the handling of big data. We will discuss the > use of Python not only in production code but also in test code. We
We not > only use Python for production purposes but also to build utilities. We > hope to show you how we utilize Python's versatility and closeness to the > operating system to build sophisticated tools for development and > operational productivity. You?ll see our Test Driven development effort > while building Python products and how we use Python in Behavior Driven > Development to code language-agnostic acceptance tests for the evolution of > software and services. We will also give you a pick at our Python packaging > and distribution. > - *Python-fu in the GIMP* > (0:42:00 Minutes) > By: Tanya Schlusser > GIMP (the GNU Image Manipulation Program) is great all by itself but > is even better with Python-fu. This talk demonstrates a little Python-fu to > manipulate images in GIMP, with a little (slightly ugly) hacking to add > external libraries. > > > _______________________________________________ > Chicago mailing list > Chicago at python.org > https://mail.python.org/mailman/listinfo/chicago > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From adam at adamforsyth.net Thu Nov 12 17:46:10 2015 From: adam at adamforsyth.net (Adam Forsyth) Date: Thu, 12 Nov 2015 16:46:10 -0600 Subject: [Chicago] ChiPy November 12th Meeting In-Reply-To: References: Message-ID: Generally, you should RSVP by noon on the day of the meeting. If Chipy.org doesn't work for you, you can use http://www.meetup.com/_ChiPy_/ You should be OK to get in tonight, we'll make sure you're on the list. On Nov 12, 2015 16:41, "Sam Walker" wrote: > Sorry to spam the list with this, but the RSVP on the chipy.org site has > not been working for me. Tried on multiple OS's and browsers and the > Recaptcha checkbox just spins and spins..... > > Is there some other way I can get added to the list for tonight's meeting? > > --Samuel Walker > > On Mon, Nov 9, 2015 at 10:48 PM, Joe Jasinski > wrote: > >> All, >> >> November ChiPy is fast approaching. 
This month, we are meeting at the >> Nokia office in the loop. All are welcome! You can find more >> information about ChiPy at our website http://www.chipy.org/ >> >> Hope to see you there! >> >> *When:* >> Thursday November 12th, 7:00pm >> >> *How:* >> You can rsvp at chipy.org or via our Meetup >> group. >> >> *Where:* >> HERE (Nokia) >> 100 North Riverside Chicago, IL 60606 >> >> *What:* >> >> - *Python at Nokia (by MacGregor Felix)* >> (1:00:00 Minutes) >> By: >> Python is known to be a multi-purpose and multi-paradigm programming >> language. Come see how the Reality Capture & Processing (RCP) group of >> Nokia HERE is making use of Python?s versatility. We will show you how HERE >> RCP uses Python?s Object Oriented constructs to represent business models >> in production systems. You will see how Python?s functional lambdas are >> used to elegantly facilitate the handling of big data. We will discuss the >> use of Python not only in production code but also in test code. We not >> only use Python for production purposes but also to build utilities. We >> hope to show you how we utilize Python's versatility and closeness to the >> operating system to build sophisticated tools for development and >> operational productivity. You?ll see our Test Driven development effort >> while building Python products and how we use Python in Behavior Driven >> Development to code language-agnostic acceptance tests for the evolution of >> software and services. We will also give you a pick at our Python packaging >> and distribution. >> - *Python-fu in the GIMP* >> (0:42:00 Minutes) >> By: Tanya Schlusser >> GIMP (the GNU Image Manipulation Program) is great all by itself but >> is even better with Python-fu. This talk demonstrates a little Python-fu to >> manipulate images in GIMP, with a little (slightly ugly) hacking to add >> external libraries. 
>> >> >> _______________________________________________ >> Chicago mailing list >> Chicago at python.org >> https://mail.python.org/mailman/listinfo/chicago >> >> > > _______________________________________________ > Chicago mailing list > Chicago at python.org > https://mail.python.org/mailman/listinfo/chicago > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mwbogucki at gmail.com Thu Nov 12 17:47:54 2015 From: mwbogucki at gmail.com (Michael Bogucki) Date: Thu, 12 Nov 2015 16:47:54 -0600 Subject: [Chicago] ChiPy November 12th Meeting In-Reply-To: References: Message-ID: Hi Adam, What's the best way to get into the building? On the Washington or Randolph side? Thanks. --Mike On Thu, Nov 12, 2015 at 4:46 PM, Adam Forsyth wrote: > Generally, you should RSVP by noon on the day of the meeting. If Chipy.org > doesn't work for you, you can use http://www.meetup.com/_ChiPy_/ > > You should be OK to get in tonight, we'll make sure you're on the list. > On Nov 12, 2015 16:41, "Sam Walker" wrote: > >> Sorry to spam the list with this, but the RSVP on the chipy.org site has >> not been working for me. Tried on multiple OS's and browsers and the >> Recaptcha checkbox just spins and spins..... >> >> Is there some other way I can get added to the list for tonight's meeting? >> >> --Samuel Walker >> >> On Mon, Nov 9, 2015 at 10:48 PM, Joe Jasinski >> wrote: >> >>> All, >>> >>> November ChiPy is fast approaching. This month, we are meeting at the >>> Nokia office in the loop. All are welcome! You can find more >>> information about ChiPy at our website http://www.chipy.org/ >>> >>> Hope to see you there! >>> >>> *When:* >>> Thursday November 12th, 7:00pm >>> >>> *How:* >>> You can rsvp at chipy.org or via our Meetup >>> group. 
>>> >>> *Where:* >>> HERE (Nokia) >>> 100 North Riverside Chicago, IL 60606 >>> >>> *What:* >>> >>> - *Python at Nokia (by MacGregor Felix)* >>> (1:00:00 Minutes) >>> By: >>> Python is known to be a multi-purpose and multi-paradigm programming >>> language. Come see how the Reality Capture & Processing (RCP) group of >>> Nokia HERE is making use of Python?s versatility. We will show you how HERE >>> RCP uses Python?s Object Oriented constructs to represent business models >>> in production systems. You will see how Python?s functional lambdas are >>> used to elegantly facilitate the handling of big data. We will discuss the >>> use of Python not only in production code but also in test code. We not >>> only use Python for production purposes but also to build utilities. We >>> hope to show you how we utilize Python's versatility and closeness to the >>> operating system to build sophisticated tools for development and >>> operational productivity. You?ll see our Test Driven development effort >>> while building Python products and how we use Python in Behavior Driven >>> Development to code language-agnostic acceptance tests for the evolution of >>> software and services. We will also give you a pick at our Python packaging >>> and distribution. >>> - *Python-fu in the GIMP* >>> (0:42:00 Minutes) >>> By: Tanya Schlusser >>> GIMP (the GNU Image Manipulation Program) is great all by itself but >>> is even better with Python-fu. This talk demonstrates a little Python-fu to >>> manipulate images in GIMP, with a little (slightly ugly) hacking to add >>> external libraries. 
>>> >>> >>> _______________________________________________ >>> Chicago mailing list >>> Chicago at python.org >>> https://mail.python.org/mailman/listinfo/chicago >>> >>> >> >> _______________________________________________ >> Chicago mailing list >> Chicago at python.org >> https://mail.python.org/mailman/listinfo/chicago >> >> > _______________________________________________ > Chicago mailing list > Chicago at python.org > https://mail.python.org/mailman/listinfo/chicago > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From adam at adamforsyth.net Thu Nov 12 17:49:11 2015 From: adam at adamforsyth.net (Adam Forsyth) Date: Thu, 12 Nov 2015 16:49:11 -0600 Subject: [Chicago] ChiPy November 12th Meeting In-Reply-To: References:

Message-ID: I believe the entrance is by Randolph but you can get there from either street. From mwbogucki at gmail.com Thu Nov 12 17:50:06 2015 From: mwbogucki at gmail.com (Michael Bogucki) Date: Thu, 12 Nov 2015 16:50:06 -0600 Subject: [Chicago] ChiPy November 12th Meeting In-Reply-To: References:

Message-ID: Thank you sir. ^_^ From thomas.j.johnson at gmail.com Fri Nov 13 15:07:44 2015 From: thomas.j.johnson at gmail.com (Thomas Johnson) Date: Fri, 13 Nov 2015 20:07:44 +0000 Subject: [Chicago] XBRL Message-ID: If anyone out there has experience dealing with the SEC's XBRL & XML data using Python (or anything else, really), I'd love to buy you a coffee and have a quick chat with you to get an overview of what's available and the best ways to access it. From bob.haugen at gmail.com Fri Nov 13 15:21:38 2015 From: bob.haugen at gmail.com (Bob Haugen) Date: Fri, 13 Nov 2015 14:21:38 -0600 Subject: [Chicago] XBRL In-Reply-To: References: Message-ID: Thomas, if you get anything cooking, please report back. Thanks.
From zitterbewegung at gmail.com Fri Nov 13 19:00:42 2015 From: zitterbewegung at gmail.com (Joshua Herman) Date: Sat, 14 Nov 2015 00:00:42 +0000 Subject: [Chicago] XBRL In-Reply-To: References: Message-ID: Have you seen this Python package? https://github.com/greedo/python-xbrl From thomas.j.johnson at gmail.com Fri Nov 13 19:04:53 2015 From: thomas.j.johnson at gmail.com (Thomas Johnson) Date: Sat, 14 Nov 2015 00:04:53 +0000 Subject: [Chicago] XBRL In-Reply-To: References:

Message-ID: Yeah I've been taking a look at different parsing packages. I'm interested more in potential pitfalls - like how often I should expect the data to be wrong, what data is available (e.g., it's not clear to me if forms other than 10-Q and 10-K are available), etc. From mikaeltamillow96 at gmail.com Fri Nov 13 20:24:41 2015 From: mikaeltamillow96 at gmail.com (Mike Tamillow) Date: Fri, 13 Nov 2015 19:24:41 -0600 Subject: [Chicago] XBRL In-Reply-To: References:

Message-ID: <43BE4E6A-A7AF-4822-9505-CCA4CFEDD471@gmail.com> Just out of curiosity, are you trying to do anything particular with the data, or just looking to master a skill? I know Sentdex briefly shows how to access SEC data from Python on YouTube. From thomas.j.johnson at gmail.com Fri Nov 13 20:40:03 2015 From: thomas.j.johnson at gmail.com (Thomas Johnson) Date: Sat, 14 Nov 2015 01:40:03 +0000 Subject: [Chicago] XBRL In-Reply-To: <43BE4E6A-A7AF-4822-9505-CCA4CFEDD471@gmail.com> References:

<43BE4E6A-A7AF-4822-9505-CCA4CFEDD471@gmail.com> Message-ID: What I'm really trying to do with the data is replace one of our existing data vendors. Basically my startup (FactorWave.com) uses peer-reviewed research from academia and hedge funds to score stocks on various factors. A lot of that research depends on fundamental data. I think our current vendor is mostly just scraping and normalizing the XBRL data, so if I can do that myself it would save us money as well as give us access to more timely and more fine-grained data. From mikaeltamillow96 at gmail.com Sat Nov 14 02:08:18 2015 From: mikaeltamillow96 at gmail.com (Michael Tamillow) Date: Sat, 14 Nov 2015 01:08:18 -0600 Subject: [Chicago] XBRL In-Reply-To: References:

<43BE4E6A-A7AF-4822-9505-CCA4CFEDD471@gmail.com> Message-ID: I know this is a little away from the technical aspects, but what about the data vendor do you not like? Is it the price? If so, have you done a cost/benefit analysis? Or are they going under, producing subpar data, or delaying the data too much? I am sure you could figure out everything it takes to scrape and normalize the data, but if you can estimate what it would cost you to do it on your own, I would assume you could use that figure to negotiate with the vendor, if price is the underlying issue. From tanya at tickel.net Sat Nov 14 12:34:17 2015 From: tanya at tickel.net (Tanya Schlusser) Date: Sat, 14 Nov 2015 11:34:17 -0600 Subject: [Chicago] XBRL Message-ID: > Yeah I've been taking a look at different parsing packages. I'm > interested more in potential pitfalls - like how often I should expect the > data to be wrong, what data is available (e.g., it's not clear to me if > forms other than 10-Q and 10-K are available), etc. I've written stuff for a friend in whatever the Google spreadsheet scripting language is.
I don't have advice worthy of coffee except to say that:
- The data seem correct, or else missing; not too much garbling.
- There is a link to the actual PDF as well if you want the raw data.
- Fields that can have multiple entries can _really_ have a zillion things in them.
And there are many, many companies -- as far as I can tell on average more than 1500 new companies registered with the SEC every day -- so be ready for a lot of data if you don't have a specific target you're looking for. There is a consortium dedicated to promoting the use of XBRL: http://xbrl.us/ They announce webinars on their mailing list. Here's one on how to use XBRL to analyze a company's fundamentals. And apparently next Thursday (the 19th) there's an introductory webinar: XBRL 101: a solid foundation. And you probably know the SEC's page, but for grins: http://xbrl.sec.gov/ From thomas.j.johnson at gmail.com Sat Nov 14 13:14:45 2015 From: thomas.j.johnson at gmail.com (Thomas Johnson) Date: Sat, 14 Nov 2015 18:14:45 +0000 Subject: [Chicago] XBRL In-Reply-To: References: Message-ID: Thanks, those webinars look really helpful! From shekay at pobox.com Mon Nov 16 14:52:00 2015 From: shekay at pobox.com (sheila miguez) Date: Mon, 16 Nov 2015 13:52:00 -0600 Subject: [Chicago] Python Project Night and Office Hours this week Message-ID: Hi all, a 2 for 1 post! Python Office Hours, Wednesday, 18th: http://www.meetup.com/ChicagoPythonistas/events/224880686/ Python Project Night, Thursday, 19th: http://www.meetup.com/ChicagoPythonistas/events/224880571/ I apologize for the duplicate RSVP questions on the meetup event. I've submitted an issue about the problem a couple of times now but it is still an issue. What is this? This is a chance for people of all experience levels to get together for programming, socializing, and moral support while working on things. If you prefer to work on things on your own, that is okay too! If someone asks for help, let them know that you are working on something and not available for help at the moment.
If you don't already have something to work on or study, look through the resources on the wiki. https://wiki.pumpingstationone.org/Python_Office_Hours Thanks all -- shekay at pobox.com