From jek at discorporate.us Tue Mar 10 20:27:10 2009 From: jek at discorporate.us (jason kirtland) Date: Tue, 10 Mar 2009 12:27:10 -0700 Subject: [portland] Meeting TONIGHT! Machine learning, PyParsing, & PyTyrant Message-ID: <49B6BF0E.20202@discorporate.us> Portland Pythonistas, A reminder that we're meeting tonight (Tuesday March 10th) at 7pm at CubeSpace. On the agenda is: * Machine Learning topics by John Melesky * PyParsing examples by Brett Carter * Short intro to PyTyrant/Tokyo Tyrant/Tokyo Cabinet by Michael Schurter ...and as always, discussion on topics of interest, then beer after and more discussion. Hope to see everyone tonight! Addresses, maps etc.: http://www.meetup.com/pdxpython/ Cheers, Jason From markgross at thegnar.org Wed Mar 11 15:34:05 2009 From: markgross at thegnar.org (mgross) Date: Wed, 11 Mar 2009 07:34:05 -0700 Subject: [portland] python bridge talk questions / advice Message-ID: <20090311143405.GA12737@thegnar.org> Last nights pdxpython meeting was pretty much the best I've ever been too. I woke up thinking about some stuff related to python and a possible bridge talk I'm starting to consider doing. One of the many things that stuck in my head from last meeting (besides Machine learning--which was uber cool) was the pytyrant and friends talk hitting on the performance of the database. What struck me is the performance delta in database throughput on Michael's simple sample losing 2 orders of magnitude int run time by using a loop-back network over direct to the file access. It got me thinking, hmm, I know a little about how to drill down on performance and scaling problems and, I know some experts in the performance area I can ask questions of, and I have a few questions on what exactly are the performance issues with python and django workloads? (like, does my cheap-oh ISP have a legitimate point regarding its refusal to support Django sites?) 
So my question to the list is can folks feed me some workloads I can use in a side project I'm thinking of starting to drill down on the performance and scaling issues with python and Django use for both web applications and python code in general? My thought is that this investigation and results would make an interesting bridge talk. Thanks for any advice or help! --mgross -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: Digital signature URL: From kirby.urner at gmail.com Wed Mar 11 17:00:30 2009 From: kirby.urner at gmail.com (kirby urner) Date: Wed, 11 Mar 2009 09:00:30 -0700 Subject: [portland] python bridge talk questions / advice In-Reply-To: <20090311143405.GA12737@thegnar.org> References: <20090311143405.GA12737@thegnar.org> Message-ID: Ditto re last night's group being thought provoking. Having joined JavaScript Admirers, also meeting at CubeSpace, I'm well aware of CouchDB these days, which is something like Tokyo Cabinet. We've been lucky enough to hear directly from one of the Apache developers. I'm giving a link to my write-up. As a long time RDBMS guy (SQL engineer, though not under the hood to the pyParser level (yet)) I'm quite aware that a generic medical record (mine, yours, anyones) is going to be wildly different per the individual, which in SQL terms looks like a blindingly huge number of tables, many of them completely unoccupied by a given individuals data (e.g. I'm not an XX, but might have XY problems). These map/reduce stowages, on the other hand, are much more "liberal" in terms of being "semi-structured" as we say. That doesn't mean they're not fast or hard to consult. And in terms of each LMR being its own work of art, that's possible too (LMR = legal medical record, in contrast to CRR = clinical research record). 
If I were on the phone to OHSU right now (an affiliate), I'd be saying the lag in PyTyrant and it's non-inclusion on the Sourceforge page for that project, adds grains of sand to the pan on the side of an Erlang approach over this one. CouchDB is also about replication, large data centers, speed... My thinking these days is CouchDB for LMRs (as a pilot), with national registries (e.g. NCDR) sticking with SQL, meaning the client hospitals using SQL for research as well. You'd suppose they all do that already but actually only some hospitals do a lot of outcomes research (the teaching ones) and some of those still use pre-SQL architectures such as MUMPS (believe or not <-- Jack Palance voice). Anyway, an enjoyable gathering, would've liked to have beer but I'm tightly scheduled these days, not that much wiggle room, maybe next time. Links: http://mybizmo.blogspot.com/2009/03/ppug-2009310.html (last night) http://controlroom.blogspot.com/2009/02/wanderer-cubespace.html (CouchDB presentation) http://mybizmo.blogspot.com/2009/01/admiring-javascript.html (previous meeting) If you want to say "hi" in real time, I'm currently on rtsp://server1.isepp.org/mystream.sdp (I'm representing ISEPP @ Pycon this time around, last time I had ISEPP on my nametag was 1st Buckminsterfullerene Conference @ Santa Barbara, met Harold Kroto and like that, was a lot younger back then, web wrangling for BFI.org). Kirby On Wed, Mar 11, 2009 at 7:34 AM, mgross wrote: > Last nights pdxpython meeting was pretty much the best I've ever been > too. ?I woke up thinking about some stuff related to python and a > possible bridge talk I'm starting to consider doing. > > One of the many things that stuck in my head from last meeting > (besides Machine learning--which was uber cool) was the pytyrant and > friends talk hitting on the performance of the database. 
> > What struck me is the performance delta in database throughput on > Michael's simple sample losing 2 orders of magnitude int run time by > using a loop-back network over direct to the file access. > > It got me thinking, hmm, I know a little about how to drill down on performance > and scaling problems and, I know some experts in the performance area > I can ask questions of, and I have a few questions on what exactly are > the performance issues with python and django workloads? > (like, does my cheap-oh ISP have a legitimate point regarding its > refusal to support Django sites?) > > So my question to the list is can folks feed me some workloads I can > use in a side project I'm thinking of starting to drill down on the > performance and scaling issues with python and Django use for both web > applications and python code in general? > > My thought is that this investigation and results would make an > interesting bridge talk. > > Thanks for any advice or help! > > --mgross > > -------------- next part -------------- > A non-text attachment was scrubbed... > Name: not available > Type: application/pgp-signature > Size: 189 bytes > Desc: Digital signature > URL: > _______________________________________________ > Portland mailing list > Portland at python.org > http://mail.python.org/mailman/listinfo/portland > From python at dylanreinhardt.com Wed Mar 11 17:15:10 2009 From: python at dylanreinhardt.com (Dylan Reinhardt) Date: Wed, 11 Mar 2009 09:15:10 -0700 Subject: [portland] python bridge talk questions / advice In-Reply-To: <20090311143405.GA12737@thegnar.org> References: <20090311143405.GA12737@thegnar.org> Message-ID: <4c645a720903110915k47d64e8x91b68d7b84184b69@mail.gmail.com> On Wed, Mar 11, 2009 at 7:34 AM, mgross wrote: > (like, does my cheap-oh ISP have a legitimate point regarding its > refusal to support Django sites?) They may have a point... if their point is to use a better provider. 
;-) The main issue with Django is that it virtually requires shell access to be useful and it's difficult to host without providing what amounts to full shell access anyway. Many discount hosting providers are reluctant to provide shell access, ergo, no Django. WebFaction is not only a very Python-friendly shop, but has gone to the trouble of setting up SELinux-based hosting. So you can have free reign, but only within the correct context. It's worth a good look if you're interested in doing Django dev in a shared hosting environment. For what that's worth... > So my question to the list is can folks feed me some workloads I can > use in a side project I'm thinking of starting to drill down on the > performance and scaling issues with python and Django use for both web > applications and python code in general? If you have specific bottlenecks *after* you've done the standard deployment/caching stuff, I'd love to hear about them. Dylan -------------- next part -------------- An HTML attachment was scrubbed... URL: From michael at susens-schurter.com Wed Mar 11 17:23:11 2009 From: michael at susens-schurter.com (Michael Schurter) Date: Wed, 11 Mar 2009 09:23:11 -0700 Subject: [portland] python bridge talk questions / advice In-Reply-To: <20090311143405.GA12737@thegnar.org> References: <20090311143405.GA12737@thegnar.org> Message-ID: <240b71640903110923r141f71a9ofcfb1003cb4244af@mail.gmail.com> On Wed, Mar 11, 2009 at 7:34 AM, mgross wrote: > Last nights pdxpython meeting was pretty much the best I've ever been > too. ?I woke up thinking about some stuff related to python and a > possible bridge talk I'm starting to consider doing. I've been trying to think of a talk as well... I might be posting some ideas to the list for feedback as well. :) > One of the many things that stuck in my head from last meeting > (besides Machine learning--which was uber cool) was the pytyrant and > friends talk hitting on the performance of the database. 
> > What struck me is the performance delta in database throughput on > Michael's simple sample losing 2 orders of magnitude int run time by > using a loop-back network over direct to the file access. > > It got me thinking, hmm, I know a little about how to drill down on performance > and scaling problems and, I know some experts in the performance area > I can ask questions of, and I have a few questions on what exactly are > the performance issues with python and django workloads? > (like, does my cheap-oh ISP have a legitimate point regarding its > refusal to support Django sites?) How to identify bottlenecks is something I'd love to hear a talk on. While putting "scaling" in the title will probably double your audience, I care more about the step before you scale: figuring out whats running slowly. I think focusing on the bottlenecks between your application and backend data storage would be *really* interesting as the Internet is already full of people comparing mostly meaningless HTTP benchmarks against their frontend servers. Also, focusing on that area might appeal more to non-web people such as DBAs and sysadmins. Just a thought. :) From michael at susens-schurter.com Wed Mar 11 18:34:45 2009 From: michael at susens-schurter.com (Michael Schurter) Date: Wed, 11 Mar 2009 10:34:45 -0700 Subject: [portland] Tokyo Talk Slides Message-ID: <240b71640903111034k44eb75e9sf7ccf1db6908b021@mail.gmail.com> Come and get 'em: http://michael.susens-schurter.com/blog/2009/03/11/tokyo-cabinet-pytyrant-talk/ Someone asked me about data integrity last night, and I told a long story about TCP backoff algorithm issues. However, I forgot the punchline (aka solution). Here's a better explanation: Lets say we have packets PA, PB, and PC: PA - sent at 10:00:00 am from application node: NA PB - sent at 10:00:01 am from application node: NB PC - sent at 10:01:00 am from application node: NC Unfortunately Mr. 
Sysadmin was doing a massive rsync while those packets were trying to make their way from the application server to the database (Tokyo Tyrant) server. "Luckily" the C in TCP stands for Control[1], so instead of losing data, the database server's operating system tells senders to back off for a second and try again later.

Now if only 1 connection were being used between the application nodes and the database server, the operating system would ensure all TCP packets are processed in the order they were sent, regardless of the order in which they were received[2].

Unfortunately we have 3 nodes, and therefore 3 separate TCP connections. No spiffy guaranteed ordering for us. So here's the order the database server receives the packets after telling our nodes to back off because the rsync backup is saturating its NIC:

PB, PC, PA

All 3 have the same key (say, a user's session key), so PA's data ends up being the data written last. When you read from this key again, you'd expect to get PC's value, but instead you get PA's.

Hilarity ensues. And by hilarity, I mean user data is seemingly randomly lost and users see very strange behavior in their browser.

The Solution: a Lua extension to automatically timestamp when each key was written. However, this takes cooperation from the client side as well. The client writes a timestamp as the first X digits of the *value* for every key it puts (sends to Tokyo Tyrant). The Lua extension reads this timestamp and saves it in a field named "timestamp.$key" (where $key is the key being saved).

The trick is that if the saved timestamp for that key is *newer* than the timestamp on the data that just came in, the Lua extension returns an error and does *not* save the data (because it's old). In practice the client just silently drops the error, because if newer data has already been sent, there's really nothing it needs to do.
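The timestamp check can be sketched in a few lines of Python. This is only an illustration of the rule, not the actual Lua extension: GuardedStore, StaleWrite, stamp and the 13-digit prefix width are all made-up names and values, and a plain dict stands in for the Tokyo Cabinet database.

```python
# Hypothetical sketch of the "drop stale writes" rule, using an in-memory
# dict instead of Tokyo Cabinet. None of these names come from Tokyo Tyrant,
# its Lua extension API, or PyTyrant.

TS_DIGITS = 13  # assumed width of the timestamp prefix on every value


class StaleWrite(Exception):
    """Raised when incoming data is older than what is already stored."""


class GuardedStore:
    def __init__(self):
        self.data = {}

    def put(self, key, stamped_value):
        # The client prefixes every value with a fixed-width timestamp.
        incoming = int(stamped_value[:TS_DIGITS])
        saved = self.data.get("timestamp." + key)
        if saved is not None and int(saved) > incoming:
            # Saved data is newer: refuse the write and keep what we have.
            raise StaleWrite(key)
        # Otherwise update both the timestamp key and the real key.
        self.data["timestamp." + key] = str(incoming)
        self.data[key] = stamped_value[TS_DIGITS:]


def stamp(millis, value):
    """Client side: prepend a zero-padded timestamp to the value."""
    return str(millis).zfill(TS_DIGITS) + value


store = GuardedStore()
# Sent in the order PA (t=0), PB (t=1000), PC (t=60000),
# but received in the order PB, PC, PA.
arrivals = [(1000, "PB"), (60000, "PC"), (0, "PA")]
dropped = []
for millis, payload in arrivals:
    try:
        store.put("session", stamp(millis, payload))
    except StaleWrite:
        dropped.append(payload)  # the client silently ignores the error

print(store.data["session"], dropped)  # PC ['PA']
```

With the arrival order from the example, the late PA write is rejected and the key keeps PC's value.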
If the timestamp for incoming data is *newer* than the saved timestamp, the Lua extension updates both the timestamp key and the actual key we're trying to safely store. And that's what happens 99.999999999999% of the time.

It's worth mentioning the Lua extension is *very* fast since it's running right on top of the local Tokyo Cabinet database. So saving 2 key/value pairs instead of 1 does not in fact halve your performance, since the bottleneck is between PyTyrant and Tokyo Tyrant.

Lessons learned:

1. Saturating a network connection can cause very, very strange things to happen.
2. All of TCP's fancy congestion control and ordering algorithms are only beneficial if you pipe everything through 1 connection.

Hope that makes sense!

[1] http://en.wikipedia.org/wiki/Transmission_Control_Protocol#Congestion_control
[2] http://en.wikipedia.org/wiki/Transmission_Control_Protocol#Ordered_data_transfer.2C_retransmission_of_lost_packets_and_discarding_duplicate_packets

From markgross at thegnar.org Thu Mar 12 02:32:38 2009
From: markgross at thegnar.org (mgross)
Date: Wed, 11 Mar 2009 18:32:38 -0700
Subject: [portland] python bridge talk questions / advice
In-Reply-To: <4c645a720903110915k47d64e8x91b68d7b84184b69@mail.gmail.com>
References: <20090311143405.GA12737@thegnar.org> <4c645a720903110915k47d64e8x91b68d7b84184b69@mail.gmail.com>
Message-ID: <20090312013238.GA13557@thegnar.org>

On Wed, Mar 11, 2009 at 09:15:10AM -0700, Dylan Reinhardt wrote:
> On Wed, Mar 11, 2009 at 7:34 AM, mgross wrote:
>
> > (like, does my cheap-oh ISP have a legitimate point regarding its
> > refusal to support Django sites?)
>
>
> They may have a point... if their point is to use a better provider. ;-)
>
> The main issue with Django is that it virtually requires shell access to be
> useful and it's difficult to host without providing what amounts to full
> shell access anyway. Many discount hosting providers are reluctant to
> provide shell access, ergo, no Django.
> > WebFaction is not only a very Python-friendly shop, but has gone to the > trouble of setting up SELinux-based hosting. So you can have free reign, > but only within the correct context. It's worth a good look if you're > interested in doing Django dev in a shared hosting environment. > > For what that's worth... > > > > So my question to the list is can folks feed me some workloads I can > > use in a side project I'm thinking of starting to drill down on the > > performance and scaling issues with python and Django use for both web > > applications and python code in general? > > > If you have specific bottlenecks *after* you've done the standard > deployment/caching stuff, I'd love to hear about them. I be happy to share, but I was hoping for someone to point out some for me to use as starting points ;) --mgross -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: Digital signature URL: From markgross at thegnar.org Thu Mar 12 02:36:10 2009 From: markgross at thegnar.org (mgross) Date: Wed, 11 Mar 2009 18:36:10 -0700 Subject: [portland] python bridge talk questions / advice In-Reply-To: <240b71640903110923r141f71a9ofcfb1003cb4244af@mail.gmail.com> References: <20090311143405.GA12737@thegnar.org> <240b71640903110923r141f71a9ofcfb1003cb4244af@mail.gmail.com> Message-ID: <20090312013610.GB13557@thegnar.org> On Wed, Mar 11, 2009 at 09:23:11AM -0700, Michael Schurter wrote: > On Wed, Mar 11, 2009 at 7:34 AM, mgross wrote: > > Last nights pdxpython meeting was pretty much the best I've ever been > > too. ?I woke up thinking about some stuff related to python and a > > possible bridge talk I'm starting to consider doing. > > I've been trying to think of a talk as well... I might be posting some > ideas to the list for feedback as well. 
:) > > > One of the many things that stuck in my head from last meeting > > (besides Machine learning--which was uber cool) was the pytyrant and > > friends talk hitting on the performance of the database. > > > > What struck me is the performance delta in database throughput on > > Michael's simple sample losing 2 orders of magnitude int run time by > > using a loop-back network over direct to the file access. > > > > It got me thinking, hmm, I know a little about how to drill down on performance > > and scaling problems and, I know some experts in the performance area > > I can ask questions of, and I have a few questions on what exactly are > > the performance issues with python and django workloads? > > (like, does my cheap-oh ISP have a legitimate point regarding its > > refusal to support Django sites?) > > How to identify bottlenecks is something I'd love to hear a talk on. > While putting "scaling" in the title will probably double your > audience, I care more about the step before you scale: figuring out > whats running slowly. I was thinking of scaling with respect to number of cores. I think I'll just focus on bottlenecks and avoid getting anyones hopes I that I'll figure anything too interesting out. > > I think focusing on the bottlenecks between your application and > backend data storage would be *really* interesting as the Internet is > already full of people comparing mostly meaningless HTTP benchmarks > against their frontend servers. > Thanks! I was thinking of starting with the measurements you showed at the meeting last night, I want to understand why it going though the loop back network device caused a 100x slow down. > Also, focusing on that area might appeal more to non-web people such > as DBAs and sysadmins. > > Just a thought. :) Thanks! --mgross -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL:

From python at dylanreinhardt.com Thu Mar 12 04:20:55 2009
From: python at dylanreinhardt.com (Dylan Reinhardt)
Date: Wed, 11 Mar 2009 20:20:55 -0700
Subject: [portland] python bridge talk questions / advice
In-Reply-To: <20090312013238.GA13557@thegnar.org>
References: <20090311143405.GA12737@thegnar.org> <4c645a720903110915k47d64e8x91b68d7b84184b69@mail.gmail.com> <20090312013238.GA13557@thegnar.org>
Message-ID: <4c645a720903112020o1e69f159r424c72dca453649d@mail.gmail.com>

On Wed, Mar 11, 2009 at 6:32 PM, mgross wrote:
> On Wed, Mar 11, 2009 at 09:15:10AM -0700, Dylan Reinhardt wrote:
> > On Wed, Mar 11, 2009 at 7:34 AM, mgross wrote:
> > If you have specific bottlenecks *after* you've done the standard
> > deployment/caching stuff, I'd love to hear about them.
>
> I be happy to share, but I was hoping for someone to point out some
> for me to use as starting points ;)

Honestly, I don't know of many. The ORM and templating systems are pretty lightweight and you can swap in different solutions if you want to. The major gripes I've heard are with the architecture of the admin module... but it's optional and typically not used under load anyway.

The problems I've seen with Django deployments are typically configuration issues. Things like using memcached on a virtual host system where memory is scarce. Or forgetting to turn off the DEBUG flag.

That's why I'd love to hear if you encounter any issues. I'm just not aware of that many.

Dylan
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From kirby.urner at gmail.com Thu Mar 12 04:48:59 2009
From: kirby.urner at gmail.com (kirby urner)
Date: Wed, 11 Mar 2009 20:48:59 -0700
Subject: [portland] python bridge talk questions / advice
In-Reply-To: <4c645a720903112020o1e69f159r424c72dca453649d@mail.gmail.com>
References: <20090311143405.GA12737@thegnar.org> <4c645a720903110915k47d64e8x91b68d7b84184b69@mail.gmail.com> <20090312013238.GA13557@thegnar.org> <4c645a720903112020o1e69f159r424c72dca453649d@mail.gmail.com>
Message-ID:

>>
>> I be happy to share, but I was hoping for someone to point out some
>> for me to use as starting points ;)
>

This is the kind of diagram that scares me:
http://jayant7k.blogspot.com/2007/04/livejournal-system-architecture.html

This one is actually enlightening, but also suggests complexity:
http://blogs.nuxeo.com/sections/blogs/fermigier/2006_01_22_updated-megaframeworks/downloadFile/attachedFile_1_f0/megaframeworks-v2.png
(hope it pulls up for ya).

Not much help probably, out of date no doubt.

Dylan's endorsement is heartening:

>
> Honestly, I don't know of many. The ORM and templating systems are pretty
> lightweight and you can swap in different solutions if you want to. The
> major gripes I've heard are with the architecture of the admin module... but
> it's optional and typically not used under load anyway.
> The problems I've seen with Django deployments are typically configuration
> issues. Things like using memcached on a virtual host system where memory
> is scarce. Or forgetting to turn off the DEBUG flag.
>
> That's why I'd love to hear if you encounter any issues. I'm just not aware
> of that many.
>
> Dylan

So I've been working with MySQL and PostgreSQL, but wonder if the much ballyhooed Django->SQLServer (MSFT) is under someone's belt around here (not mine).
Kirby

From kirby.urner at gmail.com Sat Mar 14 06:19:32 2009
From: kirby.urner at gmail.com (kirby urner)
Date: Fri, 13 Mar 2009 22:19:32 -0700
Subject: [portland] Manga Code (CSN)
Message-ID:

So "manga code" is different from "pseudo-code" in that it actually runs (given necessary infrastructure), in this case in Python. It's "manga" or "comic book" in the sense of being a caricature of a real production snippet, not "dumbed down" so much as "reduced" (anyone for "shrunk"?). Or call it a "cave painting".

Some of you may recall a presentation where I kicked off this idea of back office source code that'd help galvanize the coffee shop biz around a new business model. The business model is what's open source and I discuss it in some detail, with variations, in this blog:

http://mybizmo.blogspot.com/search?q=writhe
(announcement of CSN as a project)

http://coffeeshopsnet.blogspot.com/2009/03/open-source.html
(specific post about the source code below)

Not everything about this franchise need be open source. Picture the game DVDs (for the juke box) being delivered by armored car, directly from skunkworks, just to be exaggerated about it (seriously though, might be closed source like HalfLife2 from Valve but much shorter playing time). Part of the draw is you find these cool games at a CSN affiliate fairly exclusively sometimes, an incentive to visit (novelty is part of the appeal).

Probably it's the whole idea of "manga code" that should be the subject of comments, don't need to talk coffee shops a lot (this is PPUG, not a CSN list). I'd be happy to do another presentation though, sometime later this year.

The basic idea: you steer bonus points to worthy causes of your choosing as an optional part of the entertainment you get when buying a drink or other menu item. It's a little complicated, hence this cartoon, which is also incomplete (note this is Python 3.x but it takes 30 seconds to recast in 2.x).

"""
Simulation: CSN
Vendor axis fixed: Mars
Action: customer selects good, nets bonus to invest towards worthy cause,
with skill level rewarded (not pure chance), remainder to vendor.
House gets 20% profit (markup). Vendor gets demographic data (not shown).
Customer gets kudos (not shown). People who just play for 'Me!' all the
time will come across as selfish.
"""

from random import randint

house_gains = 0  # Fine Grind

thevendors = {'Mars':0, 'Red Bull':0, 'Jack Daniels':0}

# technically a "salon"
thecauses = {'GreenPeace':0,
             'MercyCorps':0,
             'USG':0,  # help Uncle Sam sometimes?
             'Me!':0}

# house games menu
thegames = {'Scary Clown':0,
            'Teddy Bear Tilt':0,
            'Governator':0,
            'No Game':0}

class Packet:

    def __init__(self, good = "Spearmint Gum", bonus = 10):
        self.good = good
        self.bonus = bonus

class Sale:

    def __init__(self, packet):
        self.packet = packet

    def select_game(self):
        for option,title in enumerate(thegames.keys()):
            print(option,' --- ',title)
        self.selection = list(thegames.keys())[int(input(""" Whaddya wanna play genius? """ ))]

    def select_cause(self):
        for option,title in enumerate(thecauses.keys()):
            print(option,' --- ',title)
        self.worthy = list(thecauses.keys())[int(input(""" What do you care about? """ ))]

class Game:

    def __init__(self, thesale):
        self.game = thesale.selection
        self.cause = thesale.worthy
        self.bonus = thesale.packet.bonus

    def play(self):
        """ real game goes here (might be a hard one) """
        print("Playing... ", self.game)
        the_house = .20
        the_cause = .01 * randint(21,80)
        vendor_profit = 1 - the_house - the_cause
        return [self.bonus * it for it in (the_house, the_cause, vendor_profit)]

def purchase():
    global house_gains
    mybuy = Packet("Mars Bar",50)
    thesale = Sale(mybuy)
    thesale.select_game()
    thesale.select_cause()
    thegame = Game(thesale)
    proceeds = thegame.play()  # real game goes here
    house_gains = house_gains + proceeds[0]
    thecauses[thesale.worthy] += proceeds[1]
    thevendors['Mars'] += proceeds[2]

def tester(n):
    global house_gains
    for i in range(n):
        purchase()
    print(house_gains)
    print(thecauses)

if __name__ == "__main__":
    tester(10)

From markgross at thegnar.org Sun Mar 15 16:28:12 2009
From: markgross at thegnar.org (mgross)
Date: Sun, 15 Mar 2009 08:28:12 -0700
Subject: [portland] python bridge talk questions / advice
In-Reply-To: <20090311143405.GA12737@thegnar.org>
References: <20090311143405.GA12737@thegnar.org>
Message-ID: <20090315152812.GA23671@thegnar.org>

On Wed, Mar 11, 2009 at 07:34:05AM -0700, mgross wrote:
> Last nights pdxpython meeting was pretty much the best I've ever been
> too. I woke up thinking about some stuff related to python and a
> possible bridge talk I'm starting to consider doing.
>
> One of the many things that stuck in my head from last meeting
> (besides Machine learning--which was uber cool) was the pytyrant and
> friends talk hitting on the performance of the database.
>
> What struck me is the performance delta in database throughput on
> Michael's simple sample losing 2 orders of magnitude int run time by
> using a loop-back network over direct to the file access.
>
> It got me thinking, hmm, I know a little about how to drill down on performance
> and scaling problems and, I know some experts in the performance area
> I can ask questions of, and I have a few questions on what exactly are
> the performance issues with python and django workloads?
> (like, does my cheap-oh ISP have a legitimate point regarding its > refusal to support Django sites?) > > So my question to the list is can folks feed me some workloads I can > use in a side project I'm thinking of starting to drill down on the > performance and scaling issues with python and Django use for both web > applications and python code in general? > > My thought is that this investigation and results would make an > interesting bridge talk. I'm chickening out on this talk idea. The following is approximately what I was thinking of proposing for a talk, but I think its a bigger topic than I feel good about doing alone. Also, I think the following is more of a long form tutorial talk. ---------- Python web workload performance deep dive with examples. Ever wonder where the MIPS are going? Or if you are getting as much as you should be from your hardware and software? This talk will go over example analyses and results for investigations of where the CPU cycles are going and where latencies are from in a few example workloads. The examples will include, compute bound, network data base related, and a couple of Django workloads. Linux will be the assumed OS for hosting these workloads. ---------- If anyone wants to use this idea, or derivative, for a talk that would be great! If a few folks want to team up on this idea that would be cool too. Note: I've submitted a non python talk proposal already http://opensourcebridge.org/proposals/40 so I don't want to the main guy on this. Thanks, --mgross -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: Digital signature URL: From markgross at thegnar.org Mon Mar 16 04:14:06 2009 From: markgross at thegnar.org (mgross) Date: Sun, 15 Mar 2009 20:14:06 -0700 Subject: [portland] ctypes to call sched_setscheduler advice. 
Message-ID: <20090316031406.GB24666@thegnar.org>

I know that I'm trying to do something a bit wrong, but that's ok with me. Anyway, I would like to call sched_setscheduler from python using the ctypes module. My question is: is there an easy way to extract the struct sched_param definition from the sched.h header file for use within a python program?

Thanks for any advice or other ideas on the easiest ways to call external non-python code.

--mgross

From pacopablo at pacopablo.com Mon Mar 16 15:19:54 2009
From: pacopablo at pacopablo.com (John Hampton)
Date: Mon, 16 Mar 2009 07:19:54 -0700
Subject: [portland] ctypes to call sched_setscheduler advice.
In-Reply-To: <20090316031406.GB24666@thegnar.org>
References: <20090316031406.GB24666@thegnar.org>
Message-ID: <49BE600A.4070900@pacopablo.com>

mgross wrote:
> I know that I'm trying to do something a bit wrong, but that's ok with
> me. Anyway, I would like to call sched_setscheduler from python using
> the ctypes module. My question is: is there an easy way to extract the
> struct sched_param definition from the sched.h header file for use
> within a python program?

I am not aware of one. If there is, that is definitely a tool that I would like. I've always ended up translating the structures by hand.

-John

From mcantelon at gmail.com Wed Mar 18 07:27:16 2009
From: mcantelon at gmail.com (Mike Cantelon)
Date: Tue, 17 Mar 2009 23:27:16 -0700
Subject: [portland] Portland Pythonistas: Open Web Vancouver would love to hear your talk ideas
Message-ID: <8fb626080903172327h24f7c741u58da21cc9c0aa349@mail.gmail.com>

'Allo Portland folks,

We up north are having a conference on June 11th & 12th focused on open web technologies, ethics, and ideas.
We did a similar conference last year, with speakers including Tim Bray and Chris Messina, and this year our first confirmed keynote is the leader of the Swedish Pirate Party. Anyways, we're looking for speakers and would love to see some Portland Pythonistas come up. If you're interested in presenting and want to get a sense of the conference's flavour, check out last year's list of sessions: http://openwebvancouver.ca/last-years-talks You can submit your talk ideas here: http://openwebvancouver.ca/node/add/talk Full conference details are here: http://openwebvancouver.ca/details We look forward to hearing your ideas! Mike Cantelon, Open Web Vancouver organizing crew -------------- next part -------------- An HTML attachment was scrubbed... URL: From freyley at gmail.com Fri Mar 20 18:29:28 2009 From: freyley at gmail.com (Jeff Schwaber) Date: Fri, 20 Mar 2009 10:29:28 -0700 Subject: [portland] Fwd: New O'Reilly Training Events: Master Classes In-Reply-To: References: Message-ID: <8db4a1910903201029y67f85cb8ob223efe8b720dfcb@mail.gmail.com> Some folks out there are interested in Google's tricks for speedy sites, right? Assuming the answer isn't "have ten billion servers, it helps." Jeff ---------- Forwarded message ---------- From: O'Reilly Conferences Date: Fri, Mar 20, 2009 at 11:00 AM Subject: New O'Reilly Training Events: Master Classes To: freyley at gmail.com As a friend of O'Reilly Conferences, we wanted to alert you to a new set of events taking place in San Francisco on March 30: O'Reilly Training Master Classes. The first three full-day Master Class topics are JavaScript, Creating High Performance Websites, and Project Management. These immersive, expert-led training sessions are perfect if you're short on time--you'll be able to fine tune your skills quickly to stay competitive. Class size is limited to make sure you receive personalized instruction. You'll save 40% by using discount code ORMCC when registering. 
Additional bonuses to tempt you: A complimentary copy of the instructor's book, plus group registrations of three or more receive a $150 discount per attendee. A snapshot of the Master Classes, which take place at the Mission Bay Conference Center at UCSF, is below; complete details, including information on two iPhone Master Classes coming up in May, are at: http://training.oreilly.com/ JavaScript: The Good Parts Douglas Crockford In this class, you'll discover good JavaScript you can use to create truly extensible code. The author of "JavaScript: The Good Parts" (O'Reilly Media), Crockford is a regular speaker at conferences on advanced JavaScript topics, and will be a featured speaker at Velocity 2009. He serves on the JavaScript 2.0 committee at ECMA. http://training.oreilly.com/javascript Leading and Managing Breakthrough Projects Scott Berkun Author of the bestselling "The Myths of Innovation" (O'Reilly Media), Berkun uses valuable insights and challenging in-class exercises to help you develop the leadership skills and knowledge you need to manage innovative people and projects. http://training.oreilly.com/projectmanagement Creating High Performance Web Sites Steve Souders Working at Google and Yahoo!, Velocity co-chair Steve Souders developed rules that cut up to 25% off response time for page requests. In this class, Souders, the author of the bestselling O'Reilly book, "High Performance Web Sites," explains those rules and shows you how you can greatly improve the performance of your existing web pages. http://training.oreilly.com/highperformancesites Our best, The O'Reilly Conferences Team +++++++++++++++++++++++++++++++++++++++++++++++++++ If you would like to stop receiving promotional offers from O'Reilly, send a blank email to nomail at oreilly.com. O'Reilly Media, Inc. 
1005 Gravenstein Highway North, Sebastopol, CA 95472 (707) 827-7000 / (800) 998-9938 ++++++++++++++++++++++++++++++++++++++++++++++++++ From michael.schurter at gmail.com Mon Mar 30 07:43:03 2009 From: michael.schurter at gmail.com (Michael Schurter) Date: Sun, 29 Mar 2009 22:43:03 -0700 Subject: [portland] Request for comments on my OS Bridge proposal Message-ID: <240b71640903292243oeab3230n4fcd84bed74c2b7@mail.gmail.com> Well, OS Bridge proposals are just about due, so I decided to finally write one. I'd love your comments and feedback on the proposal I discuss here: http://michael.susens-schurter.com/blog/2009/03/29/crowdsourcing-my-os-bridge-talk-proposal/ I think I'm also going to submit a proposal on Django because Python seems under-represented on the proposals list: http://opensourcebridge.org/events/2009/proposals/ Any and all comments welcome! Especially if you think it's a terrible idea and I should give up. :-) Also, if anyone else was thinking of doing a Django talk please let me know (is Python used for anything but Django? I kid I kid). I'm sure there are more qualified/experienced people out there to give a talk on Django. Thanks in advance and see everyone on the 14th! From markgross at thegnar.org Mon Mar 30 15:51:44 2009 From: markgross at thegnar.org (mgross) Date: Mon, 30 Mar 2009 06:51:44 -0700 Subject: [portland] Request for comments on my OS Bridge proposal In-Reply-To: <240b71640903292243oeab3230n4fcd84bed74c2b7@mail.gmail.com> References: <240b71640903292243oeab3230n4fcd84bed74c2b7@mail.gmail.com> Message-ID: <20090330135144.GB8179@thegnar.org> On Sun, Mar 29, 2009 at 10:43:03PM -0700, Michael Schurter wrote: > Well, OS Bridge proposals are just about due, so I decided to finally > write one. I'd love your comments and feedback on the proposal I > discuss here: > > http://michael.susens-schurter.com/blog/2009/03/29/crowdsourcing-my-os-bridge-talk-proposal/ It looks cool to me.
--mgross > > I think I'm also going to submit a proposal on Django because Python > seems under-represented on the proposals list: > > http://opensourcebridge.org/events/2009/proposals/ > > Any and all comments welcome! Especially if you think it's a terrible > idea and I should give up. :-) > > Also, if anyone else was thinking of doing a Django talk please let me > know (is Python used for anything but Django? I kid I kid). I'm sure > there are more qualified/experienced people out there to give a talk > on Django. > > Thanks in advance and see everyone on the 14th! > _______________________________________________ > Portland mailing list > Portland at python.org > http://mail.python.org/mailman/listinfo/portland From mde at micahelliott.com Mon Mar 30 19:10:54 2009 From: mde at micahelliott.com (Micah Elliott) Date: Mon, 30 Mar 2009 10:10:54 -0700 Subject: [portland] Open Source Bridge registration Message-ID: <1edb3c420903301010i377b0f65ub269b36827ce8ffb@mail.gmail.com> Looks like tomorrow is the last day to register for OSB at the discounted rate. I expect most folks here already know about the event (or are already volunteering?). I hadn't seen any mention of the "group discount" here, so I thought I should share what I saw the XPDX group discussing. Here's a $25 discount link that should be applicable to PDX Python: http://opensourcebridge.org/volunteer/for-user-groups/ So if you haven't already signed up, now is a good time to do it. -- @MicahElliott | mde at MicahElliott.com | http://MicahElliott.com Sent from: Beaverton OR United States.
From kirby.urner at gmail.com Mon Mar 30 20:49:29 2009 From: kirby.urner at gmail.com (kirby urner) Date: Mon, 30 Mar 2009 11:49:29 -0700 Subject: [portland] Request for comments on my OS Bridge proposal In-Reply-To: <20090330135144.GB8179@thegnar.org> References: <240b71640903292243oeab3230n4fcd84bed74c2b7@mail.gmail.com> <20090330135144.GB8179@thegnar.org> Message-ID: Looks like a good proposal. I'm well aware that Apache might be overkill sometimes, plus other stuff just works out of the box. I imagine many listeners would be hungry for this knowledge at different levels, i.e. I expect at least a few people at OS Bridge to be in that "what's Apache?" category. && If you wanna be ambitious, you could design a taxonomy for web servers based on whether they use multiple processes and threads or whether they're more asynchronous like Medusa (Zope), Twisted. In addition to comparing load/speed benchmarks, there are historical/lineage ways of packaging the material, which includes use cases, e.g. Twisted comes from the multi-user gaming world (where it is still). In terms of performance, passing on from the "Django in the real world" talk by some people who know their stuff, mod_wsgi by itself is probably best for high performance; you don't need mod_python per se, which is less predictable in load handling. Hope I got that right -- just passing through, this isn't based on my own tinkering. I've got this upcoming talk for GIS community mentioned in MOTD at osgarden.appspot.com , before which I hope to have geodjango running on my laptop, in addition to URLs in the great beyond (outside 127.0.0.1 **). Thanks to Micah Elliott for the heads up on registering for OS Bridge, just did that. Welcome back to Jason, Michel, Michelle... Adam. I got in last night, bag still in Cedar Rapids (I had one of those Expedia cheapos aka a "bag loser" -- got a lecture from Frontier on how bag loss is par for the course when you do it that way). Kirby PSF member <-- new!
&& http://mail.python.org/pipermail/edu-sig/2008-December/008884.html (several kinds of Apache) ** http://www.flickr.com/photos/17157315@N00/3386372968/ (bumper sticker, not mine) From igal at pragmaticraft.com Mon Mar 30 22:38:53 2009 From: igal at pragmaticraft.com (Igal Koshevoy) Date: Mon, 30 Mar 2009 13:38:53 -0700 Subject: [portland] Request for comments on my OS Bridge proposal In-Reply-To: <240b71640903292243oeab3230n4fcd84bed74c2b7@mail.gmail.com> References: <240b71640903292243oeab3230n4fcd84bed74c2b7@mail.gmail.com> Message-ID: <49D12DDD.50107@pragmaticraft.com> I'm glad to see the recently-added Python proposals for the Open Source Bridge conference, and will be glad to see more. Michael Schurter wrote: > Well OS Bridge proposals are just about due, so I decided to finally > write one. I'd love your comments and feedbacks on the proposal I > discuss here: > > http://michael.susens-schurter.com/blog/2009/03/29/crowdsourcing-my-os-bridge-talk-proposal/ > Your talk's description is quite thorough and I like the suggested evaluation metrics. I also like Kirby's suggestion of providing taxonomy to differentiate the various servers. You might find some more dimensions to cover listed in the various tables at: http://en.wikipedia.org/wiki/Comparison_of_web_servers That all said, I find that I can pick the appropriate web server by seeing how well my needs fit some very simple criteria: * Apache: Do I need a very easy way to run apps written in PHP, Ruby (mod_passenger), Python (mod_wsgi), Perl (mod_perl), FastCGI, etc? * Nginx: Do I need a very fast, super efficient, totally reliable server for static content or a simple proxy? * Lighttpd: Why would I choose a server that's inferior in all ways to Apache and Nginx? * HAproxy: Do I need a sophisticated but finicky high-availability proxy server? * CherryPy: Do I need to run CherryPy apps, e.g., TurboGears 1.x? * Thin: Do I need to run Ruby apps on a server where I can't install Passenger?
* Mongrel: Why would I choose a server that's inferior in all ways to Thin and Passenger? I'm currently using all the above servers, other than Lighttpd. > I think I'm also going to submit a proposal on Django because Python > seems under-represented on the proposals list: > > http://opensourcebridge.org/events/2009/proposals/ I think that intro and advanced Django talks would be well-received and well-attended. -igal From michael at susens-schurter.com Mon Mar 30 22:55:45 2009 From: michael at susens-schurter.com (Michael Schurter) Date: Mon, 30 Mar 2009 13:55:45 -0700 Subject: [portland] Request for comments on my OS Bridge proposal In-Reply-To: <49D12DDD.50107@pragmaticraft.com> References: <240b71640903292243oeab3230n4fcd84bed74c2b7@mail.gmail.com> <49D12DDD.50107@pragmaticraft.com> Message-ID: <240b71640903301355o4df4fd6fq24a539815d735914@mail.gmail.com> On Mon, Mar 30, 2009 at 1:38 PM, Igal Koshevoy wrote: > Michael Schurter wrote: >> http://michael.susens-schurter.com/blog/2009/03/29/crowdsourcing-my-os-bridge-talk-proposal/ >> > Your talk's description is quite thorough and I like the suggested > evaluation metrics. I also like Kirby's suggestion of providing taxonomy > to differentiate the various servers. You might find some more > dimensions to cover listed in the various tables at: > http://en.wikipedia.org/wiki/Comparison_of_web_servers Thanks for the link. Seems it covers my portability metric well and does a pretty good job on the "features" one as well. > That all said, I find that I can pick the appropriate web server by > seeing how well my needs fit some very simple criteria: > > * Apache: Do I need a very easy way to run apps written in PHP, Ruby > (mod_passenger), Python (mod_wsgi), Perl (mod_perl), FastCGI, etc? > * Nginx: Do I need a very fast, super efficient, totally reliable server > for static content or a simple proxy? > * Lighttpd: Why would I choose a server that's inferior in all ways to > Apache and Nginx? 
> * HAproxy: Do I need a sophisticated but finicky high-availability proxy > server? > * CherryPy: Do I need to run CherryPy apps, e.g., TurboGears 1.x? > * Thin: Do I need to run Ruby apps on a server where I can't install > Passenger? > * Mongrel: Why would I choose a server that's inferior in all ways to > Thin and Passenger? Excellent! This is exactly the sort of info about Ruby platforms I was looking for! Thanks. :) Good questions to address in general. >> I think I'm also going to submit a proposal on Django because Python >> seems under-represented on the proposals list: >> >> http://opensourcebridge.org/events/2009/proposals/ > I think that intro and advanced Django talks would be well-received and > well-attended. I'd like to be an active contributor or at least a more active community member before I give an advanced talk. Despite being the more mundane proposal, I've gotten the most positive feedback on an intro to Django (especially *not* a blog-in-20-minutes howto, but rather more of a "when to use Django" intro). Thanks igal!