From tim_one@email.msn.com Thu Jul 1 05:30:30 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 1 Jul 1999 00:30:30 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <199906291201.IAA02535@eric.cnri.reston.va.us> Message-ID: <000101bec37a$7465af00$309e2299@tim> [Guido] > I guess it's all in the perspective. 99.99% of all thread apps I've > ever written use threads primarily to overlap I/O -- if there wasn't > I/O to overlap I wouldn't use a thread. I think I share this > perspective with most of the thread community (after all, threads > originate in the OS world where they were invented as a replacement > for I/O completion routines). Different perspective indeed! Where I've been, you never used something as delicate as a thread to overlap I/O, you instead used the kernel-supported asynch Fortran I/O extensions <0.7 wink>. Those days are long gone, and I've adjusted to that. Time for you to leave the past too : by sheer numbers, most of the "thread community" *today* is to be found typing at a Windows box, where cheap & reliable threads are a core part of the programming culture. They have better ways to overlap I/O there too. Throwing explicit threads at this is like writing a recursive Fibonacci number generator in Scheme, but building the recursion yourself by hand out of explicit continuations . > ... > As far as I can tell, all the examples you give are easily done using > coroutines. Can we call whatever you're asking for coroutines instead > of fake threads? I have multiple agendas, of course. What I personally want for my own work is no more than Icon's generators, formally "semi coroutines", and easily implemented in the interpreter (although not the language) as it exists today. Coroutines, fake threads and continuations are much stronger than generators, and I expect you can fake any of the first three given either of the others. 
Generators fall out of any of them too (*you* implemented generators once using Python threads, and I implemented general coroutines -- "fake threads" are good enough for either of those). So, yes, for that agenda any means of suspending/resuming control flow can be made to work. I seized on fake threads because Python already has a notion of threads. A second agenda is that Python could be a lovely language for *learning* thread programming; the threading module helps, but fake threads could likely help more by e.g. detecting deadlocks (and pointing them out) instead of leaving a thread newbie staring at a hung system without a clue. A third agenda is related to Mark & Greg's, making Python's threads "real threads" under Windows. The fake thread agenda doesn't tie into that, except to confuse things even more if you take either agenda seriously <0.5 frown>. > I think that when you mention threads, green or otherwise colored, > most people who are at all familiar with the concept will assume they > provide I/O overlapping, except perhaps when they grew up in the > parallel machine world. They didn't suggest I/O to me at all, but I grew up in the disqualified world ; doubt they would to a Windows programmer either (e.g., my employer ships heavily threaded Windows apps of various kinds, and overlapped I/O isn't a factor in any of them; it's mostly a matter of algorithm factoring to keep the real-time incestuous subsystems from growing impossibly complex, and in some of the very expensive apps also a need to exploit multiple processors). BTW, I called them "fake" threads to get away from whatever historical baggage comes attached to "green". > Certainly all examples I give in my never-completed thread tutorial > (still available at > http://www.python.org/doc/essays/threads.html) use I/O as the primary > motivator -- The preceding "99.99% of all thread apps I've ever written use threads primarily to overlap I/O" may explain this . 
BTW, there is only one example there, which rather dilutes the strength of the rhetorical "all" ...

> this kind of example appeals to simple souls (e.g. downloading more than
> one file in parallel, which they probably have already seen in action in
> their web browser), as opposed to generators or pipelines or coroutines
> (for which you need to have some programming theory background to
> appreciate the powerful abstraction possibilities they give).

I don't at all object to using I/O as a motivator, but the latter point is off base. There is *nothing* in Comp Sci harder to master than thread programming! It's the pinnacle of perplexity, the depth of despair, the king of confusion (stop before I exaggerate ).

Generators in particular get re-invented often as a much simpler approach to suspending a subroutine's control flow; indeed, Icon's primary audience is still among the humanities, and even dumb linguists don't seem to have notable problems picking it up. Threads have all the complexities of the other guys, plus races, deadlocks, starvation, load imbalance, non-determinism and non-reproducibility.

Threads simply aren't simple-soul material, no matter how pedestrian a motivating *example* may be. I suspect that's why your tutorial remains unfinished: you had no trouble describing the problem to be solved, but got bogged down in mushrooming complications describing how to use threads to solve it.

Even so, the simple example at the end is already flawed ("print" isn't atomic in Python, so the

    print len(text), url

may print the len(text) from one thread followed by the url from another).
It's not hard to find simple-soul examples for generators either (coroutines & continuations *are* hard to motivate!), especially since Python's for/__getitem__ protocol is already a weak form of generator, and xrange *is* a full-blown generator; e.g., a common question on c.l.py is how to iterate over a sequence backwards:

    for x in backwards(sequence):
        print x

    def backwards(s):
        for i in xrange(len(s)-1, -1, -1):
            suspend s[i]

Nobody needs a comp sci background to understand what that *does*, or why it's handy. Try iterating over a tree structure instead & then the *power* becomes apparent; this isn't comp-sci-ish either, unless we adopt a "if they've heard of trees, they're impractical dreamers" stance . BTW, iterating over a tree is what os.path.walk does, and a frequent source of newbie confusion (they understand directory trees, they don't grasp the callback-based interface; generating (dirname, names) pairs instead would match their mental model at once). *This* is the stuff for simple souls!

> Another good use of threads (suggested by Sam) is for GUI programming.
> An old GUI system, News by David Rosenthal at Sun, used threads
> programmed in PostScript -- very elegant (and it failed for other
> reasons -- if only he had used Python instead :-).
>
> On the other hand, having written lots of GUI code using Tkinter, the
> event-driven version doesn't feel so bad to me. Threads would be nice
> when doing things like rubberbanding, but I generally agree with
> Ousterhout's premise that event-based GUI programming is more reliable
> than thread-based. Every time your Netscape freezes you can bet
> there's a threading bug somewhere in the code.

I don't use Netscape, but I can assure you the same is true of Internet Explorer -- except there the threading bug is now somewhere in the OS <0.5 wink>.

Anyway,

1) There are lots of good uses for threads, and especially in the Windows and (maybe) multiprocessor NumPy worlds.
Those would really be happier with "free-threaded threads", though. 2) Apart from pedagogical purposes, there probably isn't a use for my "fake threads" that couldn't be done easier & better via a more direct (coroutine, continuation) approach; and if I had fake threads, the first thing I'd do for me is rewrite the generator and coroutine packages to use them. So, yes: you win . 3) Python's current threads are good for overlapping I/O. Sometimes. And better addressed by Sam's non-threaded "select" approach when you're dead serious about overlapping lots of I/O. They're also beaten into service under Windows, but not without cries of anguish from Greg and Mark. I don't know, Guido -- if all you wanted threads for was to speed up a little I/O in as convoluted a way as possible, you may have been witness to the invention of the wheel but missed that ox carts weren't the last application . nevertheless-ox-carts-may-be-the-best-ly y'rs - tim From tim_one@email.msn.com Thu Jul 1 08:45:54 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 1 Jul 1999 03:45:54 -0400 Subject: [Python-Dev] ob_refcnt access In-Reply-To: <003301bec11e$d0cfc6d0$0801a8c0@bobcat> Message-ID: <000901bec395$c042cfa0$309e2299@tim> [Mark Hammond] > Im a little unhappy as this [stackless Python] will break the Active > Debugging stuff ... > ... > So the solution MS came up with was, surprise surprise, the machine stack! > :-) The assumption is that all languages will make _some_ use of the > stack, so they ask a language to report its "stack base address" > and "stack size". Using this information, the debugger sorts into the > correct call sequence. Mark, you can't *really* believe Chris is incapable of hacking around this, right? 
It's not even clear there's something to be hacked around, since Python is only Python and there's nothing Christian can do to stop other languages that call into Python from using the machine stack, or to call other languages from Python without using the machine stack. So Python "shows up on the stack" no matter what, cross-language. > ... > Bit I also understand completely the silence on this issue. When the > thread started, there was much discussion about exactly what the > hell these continuation/coroutine thingies even were. The Fuchs paper Sam referenced explained it in simple C terms: a continuation is exactly what C setjmp/longjmp would do if setjmp saved (& longjmp restored) the C stack *in addition* to the program counter and machine registers (which they already save/restore). That's all there is to it, at heart: objects capture data state, continuations capture control flow state. Whenever the OS services an interrupt and drops into kernel mode, it captures a continuation for user mode -- they don't *call* it that, but that's what they're doing, and it's as practical as a pencil (well, *more* practical, actually ). > However, there were precious few real-world examples where they could > be used. Nobody asked for any before now <0.5 wink> -- and I see Sam provided some marvelous ones in response to this. > A few acedemic, theoretical places, I think you undervalue those: people working on the underpinnings of languages strive very hard to come up with the simplest possible examples that don't throw away the core of the problem to be solved. That doesn't mean the theoreticians are too air-headed to understand "real world problems"; it's much more that, e.g., "if you can't compare the fringes of two trees cleanly, you can't possibly do anything harder than that cleanly either -- but if you can do this little bit cleanly, we have strong reason to believe there's a large class of difficult real problems you can also do cleanly". 
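[Editorial sketch, not part of the original message.] The fringe comparison mentioned above takes only a few lines once generators are available. This uses present-day Python 3 (`yield` and `yield from`), which did not exist when this thread was written. Representing a tree as nested lists, and the names `fringe` and `same_fringe`, are assumptions made for illustration only:

```python
from itertools import zip_longest


def fringe(tree):
    """Yield the leaves of a nested-list tree, left to right."""
    for node in tree:
        if isinstance(node, list):
            # Recurse into a subtree, passing its leaves straight through.
            yield from fringe(node)
        else:
            yield node


def same_fringe(a, b):
    """Return True if both trees have the same leaves in the same order."""
    # A unique sentinel pads the shorter fringe, so trees with different
    # numbers of leaves compare unequal instead of being silently truncated.
    missing = object()
    pairs = zip_longest(fringe(a), fringe(b), fillvalue=missing)
    return all(x == y for x, y in pairs)


if __name__ == "__main__":
    print(same_fringe([1, [2, 3]], [[1, 2], 3]))
    print(same_fringe([1, [2, 3]], [1, [3, 2]]))
```

Both trees are walked lazily, so the comparison stops at the first differing leaf and neither tree is ever flattened into a list.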
If you need a "practical" example of that, picture e.g. a structure-based diff engine for HTML source. Which are really trees defined by tags, and where text-based comparison can be useless (you don't care if "<LI>" moved from column 12 of line 16 to column 1 of line 17, but you care a lot if the *number* of <LI> tags changed -- so you have to compare two trees *as* trees). But that's a problem easy enough for generators to solve cleanly.

Knuth writes a large (for his books) elevator-simulation program to illustrate coroutines (which are more powerful than generators), and complains that he can't come up with a simpler example that illustrates any point worth making. And he's right! The "literature standard" text-manipulation example at the end of my coroutine module illustrates what Sam was talking about wrt writing straightforward "pull" algorithms for a "push" process, but even that one can be solved with simpler pipeline control flow. At least for *that*, nobody who ever used Unix would doubt the real-world utility of the pipeline model for a nanosecond <1e-9 wink>.

If you want a coroutine example, go to a restaurant and order a meal. When you leave, glance back *real quick*. If everyone in the restaurant is dead, they were a meal-generating subroutine; but if they're still serving other customers, your meal-eating coroutine and their meal-generating coroutine worked to mutual benefit .

> but the only real contender I have seen brought up was Medusa. There were
> certainly no clear examples of "as soon as we have this, I could change
> abc to take advantage, and this would give us the very cool xyz"
>
> So, if anyone else is feeling at all like me about this issue, they are
> feeling all warm and fuzzy knowing that a few smart people are giving us
> the facility to do something we hope we never, ever have to do. :-)

Actually, you'll want to do it a lot . Christian & I have bantered about this a few times a year in pvt, usually motivated by some horrendous kludge posted to c.l.py to solve a problem that any Assistant Professor of Medieval English could solve without effort in Icon. The *uses* aren't esoteric at all.
or-at-least-not-more-than-you-make-'em-ly y'rs - tim From MHammond@skippinet.com.au Thu Jul 1 09:18:25 1999 From: MHammond@skippinet.com.au (Mark Hammond) Date: Thu, 1 Jul 1999 18:18:25 +1000 Subject: [Python-Dev] ob_refcnt access In-Reply-To: <000901bec395$c042cfa0$309e2299@tim> Message-ID: <00b901bec39a$4baf6b80$0801a8c0@bobcat> [Tim tells me it will all be obvious if I just think a little harder ] Your points about "acedemic examples" is well taken. The reality is that, even given these simple examples (which I dared deride as acedemic), the simple fact is Im not seeing "the point". I seriously dont doubt all all you say. However, as Sam and Chris have said many times, it is just a matter of changing the way to you think. Interestingly: Chris said it recently, and continues to say it. Sam said it to me _years_ ago, and said it repeatedly, but hasnt said it recently. Tim hasnt really said it yet :-) This is almost certainly because when your brain does switch, it is a revelation, and really not too hard at all. But after a while, you forget the switch ever took place. Closest analogy I can think of is OO programming. In my experience trying to _learn_ OO programming from a few misc examples and texts was pointless and very hard. You need a language to play with it in. And when you have one, your brain makes the switch, you see the light, and you can't see what was ever mysterious about it. And you tell everyone its easy; "just change the way you think about data" :-) But to all us here, OO programming is just so obvious it goes without saying. Occasionaly a newbie will have trouble with OO concepts in Python, and I personally have trouble seeing what could _possibly_ be difficult about understanding these very simple concepts. 
So Im just as guilty, just not in this particular case :-) So, short of all us here going and discovering the light using a different language (perish the thought :), my original point stands that until Chris' efforts give us something we can easily play with, some of use _still_ wont see what all the fuss is about. (Although I admit it has nothing to do with either the examples or the applicability of the technology to all sorts of things) Which leaves you poor guys in a catch 22 - without noise of some sort from the rest of us, its hard to keep the momentum going, but without basically a fully working Python with continuations, we wont be making much noise. But-I-will-thank-you-all-personally-and-profusely-when-I-do-see-the-light, ly Mark. From jack@oratrix.nl Thu Jul 1 17:05:50 1999 From: jack@oratrix.nl (Jack Jansen) Date: Thu, 01 Jul 1999 18:05:50 +0200 Subject: [Python-Dev] Paul Prescod: add Expat to 1.6 In-Reply-To: Message by Skip Montanaro , Mon, 28 Jun 1999 16:24:46 -0400 (EDT) , <14199.55737.544299.718558@cm-24-29-94-19.nycap.rr.com> Message-ID: <19990701160555.5D44512B0F@oratrix.oratrix.nl> Recently, Skip Montanaro said: > > Andrew> My personal leaning is that we can get more bang for the buck by > Andrew> working on the Distutils effort, so that installing a package > Andrew> like PyExpat becomes much easier, rather than piling more things > Andrew> into the core distribution. > > Amen to that. See Guido's note and my response regarding soundex in the > Doc-SIG. Perhaps you could get away with a very small core distribution > that only contained the stuff necessary to pull everything else from the net > via http or ftp... I don't know whether this subject belongs on the python-dev list (is there a separate distutils list?), but let's please be very careful with this. 
The Perl people apparently think that their auto-install stuff is so easy to use that if you find a tool on the net that needs Perl they'll just give you a few incantations you need to build the "correct" perl to run the tool, but I've never managed to do so. My last try was when I spent 2 days to try and get the perl-based Palm software for unix up and running. With various incompatilble versions of perl installed in /usr/local by the systems staff and knowing nothing about perl I had to give up at some point, because it was costing far more time (and diskspace:-) than the whole thing was worth. Something like mailman is (afaik) easy to install for non-pythoneers because it only depends on a single, well-defined Python distribution. -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From skip@mojam.com (Skip Montanaro) Thu Jul 1 20:54:14 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Thu, 1 Jul 1999 15:54:14 -0400 (EDT) Subject: [Python-Dev] Paul Prescod: add Expat to 1.6 In-Reply-To: <19990701160555.5D44512B0F@oratrix.oratrix.nl> References: <14199.55737.544299.718558@cm-24-29-94-19.nycap.rr.com> <19990701160555.5D44512B0F@oratrix.oratrix.nl> Message-ID: <14203.50921.870411.353490@cm-24-29-94-19.nycap.rr.com> Skip> Amen to that. See Guido's note and my response regarding soundex Skip> in the Doc-SIG. Perhaps you could get away with a very small core Skip> distribution that only contained the stuff necessary to pull Skip> everything else from the net via http or ftp... Jack> I don't know whether this subject belongs on the python-dev list Jack> (is there a separate distutils list?), but let's please be very Jack> careful with this. The Perl people apparently think that their Jack> auto-install stuff is so easy to use ... I suppose I should have added a <0.5 wink> to my note. 
Still, knowing what Guido does and doesn't feel comfortable with in the core distribution would be a good start at seeing where we might like the core to wind up. Skip Montanaro | http://www.mojam.com/ skip@mojam.com | http://www.musi-cal.com/~skip/ 518-372-5583 From tim_one@email.msn.com Fri Jul 2 03:33:23 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 1 Jul 1999 22:33:23 -0400 Subject: [Python-Dev] Paul Prescod: add Expat to 1.6 In-Reply-To: <19990701160555.5D44512B0F@oratrix.oratrix.nl> Message-ID: <000a01bec433$41a410c0$6da02299@tim> [large vs small distributions] [Jack Jansen] > I don't know whether this subject belongs on the python-dev list (is > there a separate distutils list?), but let's please be very careful > with this. [and recounts his problems with Perl] I must say the idea of a minimal distribution sounds very appealing. But then I consider that Guido never got me to even try Tk until he put it into the std Windows distribution, and I've never given anyone any code that won't work with a fresh-from-the-box distribution either. FrankS's snappy "batteries included" wouldn't carry quite the same punch if it got reduced to "coupons for batteries hidden in the docs" . OTOH, I've got about as much use for XML as MarkH has for continuations , and here-- as in many other places --we've been saved so far by Guido's good judgment about what goes in & what stays out. So it's a good thing he can't ever resign this responsibility . if-20%-of-users-need-something-i'd-include-it-else-not-ly y'rs - tim From guido@CNRI.Reston.VA.US Sun Jul 4 02:56:31 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Sat, 03 Jul 1999 21:56:31 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: Your message of "Thu, 01 Jul 1999 00:30:30 EDT." 
<000101bec37a$7465af00$309e2299@tim> References: <000101bec37a$7465af00$309e2299@tim> Message-ID: <199907040156.VAA10874@eric.cnri.reston.va.us> > [Guido] > > I guess it's all in the perspective. 99.99% of all thread apps I've > > ever written use threads primarily to overlap I/O -- if there wasn't > > I/O to overlap I wouldn't use a thread. I think I share this > > perspective with most of the thread community (after all, threads > > originate in the OS world where they were invented as a replacement > > for I/O completion routines). [Tim] > Different perspective indeed! Where I've been, you never used something as > delicate as a thread to overlap I/O, you instead used the kernel-supported > asynch Fortran I/O extensions <0.7 wink>. > > Those days are long gone, and I've adjusted to that. Time for you to leave > the past too : by sheer numbers, most of the "thread community" > *today* is to be found typing at a Windows box, where cheap & reliable > threads are a core part of the programming culture. No quibble so far... > They have better ways to overlap I/O there too. Really? What are they? The non-threaded primitives for overlapping I/O look like Medusa to me: very high performance, but a pain to use -- because of the event-driven programming model! (Or worse, callback functions.) But maybe there are programming techniques that I'm not even aware of? (Maybe I should define what I mean by overlapping I/O -- basically every situation where disk or network I/O or GUI event handling goes on in parallel with computation or with each other. For example, in my book copying a set of files while at the same time displaying some silly animation of sheets of paper flying through the air *and* watching a Cancel button counts as overlapping I/O, and if I had to code this it would probably be a lot simpler to do using threads. 
> Throwing explicit threads at this is like writing a recursive > Fibonacci number generator in Scheme, but building the recursion > yourself by hand out of explicit continuations . Aren't you contradicting yourself? You say that threads are ubiquitous and easy on Windows (and I agree), yet you claim that threads are overkill for doing two kinds of I/O or one kind of I/O and some computation in parallel...? I'm also thinking of Java threads. Yes, the GC thread is one of those computational threads you are talking about, but I think the examples I've seen otherwise are mostly about having one GUI component (e.g. an applet) independent from other GUI components (e.g. the browser). To me that's overlapping I/O, since I count GUI events as I/O. > > ... > > As far as I can tell, all the examples you give are easily done using > > coroutines. Can we call whatever you're asking for coroutines instead > > of fake threads? > > I have multiple agendas, of course. What I personally want for my own work > is no more than Icon's generators, formally "semi coroutines", and easily > implemented in the interpreter (although not the language) as it exists > today. > > Coroutines, fake threads and continuations are much stronger than > generators, and I expect you can fake any of the first three given either of > the others. Coroutines, fake threads and continuations? Can you really fake continuations given generators? > Generators fall out of any of them too (*you* implemented > generators once using Python threads, and I implemented general > coroutines -- "fake threads" are good enough for either of those). Hm. Maybe I'm missing something. Why didn't you simply say "you can fake each of the others given any of these"? > So, yes, for that agenda any means of suspending/resuming control flow can > be made to work. I seized on fake threads because Python already has a > notion of threads. 
> > A second agenda is that Python could be a lovely language for *learning* > thread programming; the threading module helps, but fake threads could > likely help more by e.g. detecting deadlocks (and pointing them out) instead > of leaving a thread newbie staring at a hung system without a clue. Yes. > A third agenda is related to Mark & Greg's, making Python's threads "real > threads" under Windows. The fake thread agenda doesn't tie into that, > except to confuse things even more if you take either agenda seriously <0.5 > frown>. What makes them unreal except for the interpreter lock? Python threads are always OS threads, and that makes them real enough for most purposes... (I'm not sure if there are situations on uniprocessors where the interpreter lock screws things up that aren't the fault of the extension writer -- typically, problems arise when an extension does some blocking I/O but doesn't place Py_{BEGIN,END}_ALLOW_THREADS macros around the call.) > > I think that when you mention threads, green or otherwise colored, > > most people who are at all familiar with the concept will assume they > > provide I/O overlapping, except perhaps when they grew up in the > > parallel machine world. > > They didn't suggest I/O to me at all, but I grew up in the disqualified > world ; doubt they would to a Windows programmer either (e.g., my > employer ships heavily threaded Windows apps of various kinds, and > overlapped I/O isn't a factor in any of them; it's mostly a matter of > algorithm factoring to keep the real-time incestuous subsystems from growing > impossibly complex, and in some of the very expensive apps also a need to > exploit multiple processors). Hm, you admit that they sometimes want to use multiple CPUs, which was explcitly excluded from our discussion (since fake threads don't help there), and I bet that they are also watching some kind of I/O (e.g. whether the user says some more stuff). 
> BTW, I called them "fake" threads to get away > from whatever historical baggage comes attached to "green". Agreed -- I don't understand where green comes from at all. Does it predate Java? > > Certainly all examples I give in my never-completed thread tutorial > > (still available at > > http://www.python.org/doc/essays/threads.html) use I/O as the primary > > motivator -- > > The preceding "99.99% of all thread apps I've ever written use threads > primarily to overlap I/O" may explain this . BTW, there is only one > example there, which rather dilutes the strength of the rhetorical "all" ... OK, ok. I was planning on more along the same lines. I may have borrowed this idea from a Java book I read. > > this kind of example appeals to simples souls (e.g. downloading more than > > one file in parallel, which they probably have already seen in action in > > their web browser), as opposed to generators or pipelines or coroutines > > (for which you need to have some programming theory background to > > appreciate the powerful abstraction possibillities they give). > > I don't at all object to using I/O as a motivator, but the latter point is > off base. There is *nothing* in Comp Sci harder to master than thread > programming! It's the pinnacle of perplexity, the depth of despair, the > king of confusion (stop before I exaggerate ). I dunno, but we're probably both pretty poor predictors for what beginning programmers find hard. Randy Pausch (of www.alice.org) visited us this week; he points out that we experienced programmers are very bad at gauging what newbies find hard, because we've been trained "too much". He makes the point very eloquently. He also points out that in Alice, users have no problem at all with parallel activities (e.g. the bunny's head rotating while it is also hopping around, etc.). 
> Generators in particular get re-invented often as a much simpler approach to > suspending a subroutine's control flow; indeed, Icon's primary audience is > still among the humanities, and even dumb linguists don't seem to > have notable problems picking it up. Threads have all the complexities of > the other guys, plus races, deadlocks, starvation, load imbalance, > non-determinism and non-reproducibility. Strange. Maybe dumb linguists are better at simply copying examples without thinking too much about them; personally I had a hard time understanding what Icon was doing when I read about it, probably because I tried to understand how it was done. For threads, I have a simple mental model. For coroutines, my head explodes each time. > Threads simply aren't simple-soul material, no matter how pedestrian a > motivating *example* may be. I suspect that's why your tutorial remains > unfinished: you had no trouble describing the problem to be solved, but got > bogged down in mushrooming complications describing how to use threads to > solve it. No, I simply realized that I had to finish the threading module and release the thread-safe version of urllib.py before I could release the tutorial; and then I was distracted and never got back to it. > Even so, the simple example at the end is already flawed ("print" > isn't atomic in Python, so the > > print len(text), url > > may print the len(text) from one thread followed by the url from another). Fine -- that's a great excuse to introduce locks in the next section. (Most threading tutorials I've seen start by showing flawed examples to create an appreciation for the need of locks.) 
> It's not hard to find simple-soul examples for generators either (coroutines > & continuations *are* hard to motivate!), especially since Python's > for/__getitem__ protocol is already a weak form of generator, and xrange > *is* a full-blown generator; e.g., a common question on c.l.py is how to > iterate over a sequence backwards: > > for x in backwards(sequence): > print x > > def backwards(s): > for i in xrange(len(s)-1, -1, -1): > suspend s[i] But backwards() also returns, when it's done. What happens with the return value? > Nobody needs a comp sci background to understand what that *does*, or why > it's handy. Try iterating over a tree structure instead & then the *power* > becomes apparent; this isn't comp-sci-ish either, unless we adopt a "if > they've heard of trees, they're impractical dreamers" stance . BTW, > iterating over a tree is what os.path.walk does, and a frequent source of > newbie confusion (they understand directory trees, they don't grasp the > callback-based interface; generating (dirname, names) pairs instead would > match their mental model at once). *This* is the stuff for simple souls! Probably right, although I think that os.path.walk just has a bad API (since it gives you a whole directory at a time instead of giving you each file). > > Another good use of threads (suggested by Sam) is for GUI programming. > > An old GUI system, News by David Rosenthal at Sun, used threads > > programmed in PostScript -- very elegant (and it failed for other > > reasons -- if only he had used Python instead :-). > > > > On the other hand, having written lots of GUI code using Tkinter, the > > event-driven version doesn't feel so bad to me. Threads would be nice > > when doing things like rubberbanding, but I generally agree with > > Ousterhout's premise that event-based GUI programming is more reliable > > than thread-based. Every time your Netscape freezes you can bet > > there's a threading bug somewhere in the code. 
> > I don't use Netscape, but I can assure you the same is true of Internet > Explorer -- except there the threading bug is now somewhere in the OS <0.5 > wink>. > > Anyway, > > 1) There are lots of goods uses for threads, and especially in the Windows > and (maybe) multiprocessor NumPy worlds. Those would really be happier with > "free-threaded threads", though. > > 2) Apart from pedagogical purposes, there probably isn't a use for my "fake > threads" that couldn't be done easier & better via a more direct (coroutine, > continuation) approach; and if I had fake threads, the first thing I'd do > for me is rewrite the generator and coroutine packages to use them. So, > yes: you win . > > 3) Python's current threads are good for overlapping I/O. Sometimes. And > better addressed by Sam's non-threaded "select" approach when you're dead > serious about overlapping lots of I/O. This is independent of Python, and is (I think) fairly common knowledge -- if you have 10 threads this works fine, but with 100s of them the threads themselves become expensive resources. But then you end up with contorted code which is why high-performance systems require experts to write them. > They're also beaten into service > under Windows, but not without cries of anguish from Greg and Mark. Not sure what you mean here. > I don't know, Guido -- if all you wanted threads for was to speed up a > little I/O in as convoluted a way as possible, you may have been witness to > the invention of the wheel but missed that ox carts weren't the last > application . What were those applications of threads again you were talking about that could be serviced by fake threads that weren't coroutines/generators? 
--Guido van Rossum (home page: http://www.python.org/~guido/) From gmcm@hypernet.com Sun Jul 4 04:41:32 1999 From: gmcm@hypernet.com (Gordon McMillan) Date: Sat, 3 Jul 1999 22:41:32 -0500 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <199907040156.VAA10874@eric.cnri.reston.va.us> References: Your message of "Thu, 01 Jul 1999 00:30:30 EDT." <000101bec37a$7465af00$309e2299@tim> Message-ID: <1281066233-51948648@hypernet.com> Hmmm. I jumped back into this one, but never saw my post show up... Threads (real or fake) are useful when more than one thing is "driving" your processing. It's just that in the real world (a place Tim visited, once, but didn't like - or was it vice versa?) those "drivers" are normally I/O. Guido complained that to do it right would require gathering up all the fds and doing a select. I don't think that's true (at least, for a decent fake thread). You just have to select on the one (to see if the I/O will work) and swap or do it accordingly. Also makes it a bit easier for portability (I thought I heard that Mac's select is limited to sockets). I see 2 questions. First, is there enough of an audience (Mac, mostly, I think) without native threads to make them worthwhile? Second, do we want to introduce yet more possibilities for brain-explosions by enabling coroutines / continuations / generators or some such? There is practical value there (as Sam has pointed out, and I now concur, watching my C state machine grow out of control with each new client request). I think the answer to both is probably "yes", and though they have a lot in common technically, they have totally different rationales. - Gordon From tim_one@email.msn.com Sun Jul 4 09:46:09 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sun, 4 Jul 1999 04:46:09 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <1281066233-51948648@hypernet.com> Message-ID: <000d01bec5f9$a95fa9a0$ea9e2299@tim> [Gordon McMillan] > Hmmm. 
I jumped back into this one, but never saw my post show up... Me neither! An exclamation point because I see there's a recent post of yours in the Python-Dev archives, but I didn't get it in the mail either. > Threads (real or fake) are useful when more than one thing is > "driving" your processing. It's just that in the real world (a place > Tim visited, once, but didn't like - or was it vice versa?) those > "drivers" are normally I/O. Yes, but that's the consensus view of "real", and so suffers from "ten billion flies can't be wrong" syndrome . If you pitch a parallel system to the NSA, they give you a long list of problems and ask you to sketch the best way to solve them on your platform; as I recall, none had anything to do with I/O even under Guido's definition; instead tons of computation with difficult twists, and enough tight coupling to make threads the natural approach in most cases. If I said any more they'd terminate me with extreme prejudice, and the world doesn't get any realer than that . > Guido complained that to do it right would require gathering up all > the fds and doing a select. I don't think that's true (at least, for > a decent fake thread). You just have to select on the one (to see if > the I/O will work) and swap or do it accordingly. Also makes it a bit > easier for portability (I thought I heard that Mac's select is > limited to sockets). Can you flesh out the "swap" part more? That is, we're in the middle of some C code, so the C stack is involved in the state that's being swapped, and under fake threads we don't have a real thread to magically capture that. > I see 2 questions. First, is there enough of an audience (Mac, > mostly, I think) without native threads to make them worthwhile? > Second, do we want to introduce yet more possibilities for > brain-explosions by enabling coroutines / continuations / generators > or some such? 
There is practical value there (as Sam has pointed out, > and I now concur, watching my C state machine grow out of control > with each new client request). > > I think the answer to both is probably "yes", and though they have a > lot in common technically, they have totally different rationales. a) Generators aren't enough for Sam's designs. b) Fake threads are roughly comparable to coroutines and continuations wrt power (depending on implementation details, continuations may be strictly most powerful, and coroutines least). c) Christian's stackless Python can, I believe, already do full coroutines, and is close to doing full continuations. So soon we can kick the tires instead of each other . or-what-the-heck-we-can-akk-kick-chris-ly y'rs - tim From tim_one@email.msn.com Sun Jul 4 09:45:58 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sun, 4 Jul 1999 04:45:58 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <199907040156.VAA10874@eric.cnri.reston.va.us> Message-ID: <000c01bec5f9$a3e86e80$ea9e2299@tim> [Guido and Tim, Guido and Tim] Ouch! This is getting contentious. Let's unwind the "you said, I said, you said" business a bit. Among the three {coroutines, fake threads, continuations}, I expect any could be serviceably simulated via either of the others. There: just saved a full page of sentence diagramming . All offer a strict superset of generator semantics. It follows that, *given* either coroutines or continuations, I indeed see no semantic hole that would be plugged by fake threads. But Python doesn't have any of the three now, and there are two respects in which fake threads may have an advantage over the other two: 1) Pedagogical, a friendlier sandbox for learning "real threads". 2) Python already has *a* notion of threads. So fake threads could be seen as building on that (variation of an existing concept, as opposed to something unprecedented). 
I'm the only one who seems to see merit in #2, so I won't mention it
again: fake threads may be an aid to education, but other than that
they're useless crap, and probably cause stains if not outright disk
failure .

About newbies, I've seen far too many try to learn threads to
entertain the notion that they're easier than I think. Threads !=
parallel programming, though! Approaches like Gelernter's Linda, or
Klappholz's "refined languages", *are* easy for newbies to pick up,
because they provide clear abstractions that prevent the worst
parallelism bugs by offering primitives that *can't* e.g. deadlock.
threading.py is a step in the right direction (over the "thread"
module) too. And while I don't know what Alice presents as a
parallelism API, I'd bet 37 cents unseen that the Alice user doesn't
build "parallel activities" out of thread.start_new_thread and raw
mutii .

About the rest, I think you have a more expansive notion of I/O than I
do, although I can squint and see what you mean; e.g., I *could* view
almost all of what Dragon's products do as I/O, although it's a real
stretch for the thread that just polls the other threads making sure
they're still alive .

Back to quoting:

>> Throwing explicit threads at this is like writing a recursive
>> Fibonacci number generator in Scheme, but building the recursion
>> yourself by hand out of explicit continuations .

> Aren't you contradicting yourself? You say that threads are
> ubiquitous and easy on Windows (and I agree), yet you claim that
> threads are overkill for doing two kinds of I/O or one kind of I/O and
> some computation in parallel...?

They're a general approach (like continuations) but, yes, given an
asynch I/O interface most times I'd much rather use the latter (like
I'd rather use recursion directly when it's available).

BTW, I didn't say threads were "easy" under Windows: cheap, reliable &
ubiquitous, yes. They're easier than under many other OSes thanks to a
rich collection of system-supplied thread gimmicks that actually work,
but no way are they "easy". Like you did wrt hiding "thread" under
"threading", even under Windows real projects have to create usable
app-specific thread-based abstractions (cf. your on-target remark
about Netscape & thread bugs).

> I'm also thinking of Java threads. Yes, the GC thread is one of those
> computational threads you are talking about, but I think the examples
> I've seen otherwise are mostly about having one GUI component (e.g. an
> applet) independent from other GUI components (e.g. the browser). To
> me that's overlapping I/O, since I count GUI events as I/O.

Whereas I don't. So let's just agree to argue about this one with
ever-increasing intensity .

> ...
> What makes them unreal except for the interpreter lock? Python
> threads are always OS threads, and that makes them real enough for
> most purposes...

We should move this part to the Thread-SIG; Mark & Greg are doubtless
chomping at the bit to rehash the headaches the global lock causes
under Windows ; I'm not so keen either to brush off the potential
benefits of multiprocessor parallelism, particularly not with the
price of CPUs falling into spare-change range.

> (I'm not sure if there are situations on uniprocessors where the
> interpreter lock screws things up that aren't the fault of the
> extension writer -- typically, problems arise when an extension does
> some blocking I/O but doesn't place Py_{BEGIN,END}_ALLOW_THREADS
> macros around the call.)

Hmm! What kinds of problems happen then? Just a lack of hoped-for
overlap, or actual deadlock (the I/O thread needing another thread to
proceed for itself to make progress)? If the latter, the extension
writer's view of who's at fault may differ from ours .

>> (e.g., my employer ships heavily threaded Windows apps of various
>> kinds, and overlapped I/O isn't a factor in any of them; it's mostly
>> a matter of algorithm factoring to keep the real-time incestuous
>> subsystems from growing impossibly complex, and in some of the very
>> expensive apps also a need to exploit multiple processors).

> Hm, you admit that they sometimes want to use multiple CPUs, which was
> explicitly excluded from our discussion (since fake threads don't help
> there),

I've been ranting about both fake threads and real threads, and don't
recall excluding anything; I do think I *should* have, though .

> and I bet that they are also watching some kind of I/O (e.g. whether the
> user says some more stuff).

Sure, and whether the phone rings, and whether text-to-speech is in
progress, and tracking the mouse position, and all sorts of other
arguably I/O-like stuff too.

Some of the subsystems are thread-unaware legacy or 3rd-party code,
and need to run in threads dedicated to them because they believe they
own the entire machine (working via callbacks). The coupling is too
tight to afford IPC mechanisms, though (i.e., running these in a
separate process is not an option).

Mostly it's algorithm-factoring, though: text-to-speech and
speech-to-text both require mondo complex processing, and the "I/O
part" of each is a small link at an end of a massive chain.

Example: you say something, and you expect to see "the result" the
instant you stop speaking. But the CPU cycles required to recognize 10
seconds of speech consume, alas, about 10 seconds. So we *have* to
overlap the speech collection with the signal processing, the acoustic
feature extraction, the acoustic scoring, the comparison with canned
acoustics for many tens of thousands of words, the language modeling
("sounded most like 'Guido', but considering the context they probably
said 'ghee dough'"), and so on.
You simply can't write all that as a monolithic algorithm and have a
hope of it working; it's most naturally a pipeline, severely
complicated in that what pops out of the end of the first stage can
have a profound effect on what "should have come out" at the start of
the last stage.

Anyway, thread-based pseudo-concurrency is a real help in structuring
all that. It's *necessary* to overlap speech collection (input) with
computation and result-so-far display (output), but it doesn't stop
there.

> ...
> Agreed -- I don't understand where green comes from at all. Does it
> predate Java?

Don't know, but I never heard of it before Java or outside of Solaris.

[about generators & dumb linguists]
> Strange. Maybe dumb linguists are better at simply copying examples
> without thinking too much about them; personally I had a hard time
> understanding what Icon was doing when I read about it, probably
> because I tried to understand how it was done. For threads, I have a
> simple mental model. For coroutines, my head explodes each time.

Yes, I expect the trick for "dumb linguists" is that they don't try to
understand. They just use it, and it works or it doesn't. BTW,
coroutines are harder to understand because of (paradoxically!) the
symmetry; generators are slaves, so you don't have to bifurcate your
brain to follow what they're doing .

>>     print len(text), url
>>
>> may print the len(text) from one thread followed by the url
>> from another).

> Fine -- that's a great excuse to introduce locks in the next section.
> (Most threading tutorials I've seen start by showing flawed examples
> to create an appreciation for the need of locks.)

Even better, they start with an endless sequence of flawed examples
that makes the reader wonder if there's *any* way to get this stuff to
work .

>> for x in backwards(sequence):
>>     print x
>>
>> def backwards(s):
>>     for i in xrange(len(s)-1, -1, -1):
>>         suspend s[i]

> But backwards() also returns, when it's done.
What happens with the > return value? I don't think a newbie would think to ask that: it would "just work" . Seriously, in Icon people quickly pick up that generators have a "natural lifetime", and when they return their life is over. It hangs together nicely enough that people don't have to think about it. Anyway, "return" and "suspend" both return a value; the only difference is that "return" kills the generator (it can't be resumed again after a return). The pseudo-Python above assumed that a generator signals the end of its life by returning None. Icon uses a different mechanism. > ... > Probably right, although I think that os.path.walk just has a bad API > (since it gives you a whole directory at a time instead of giving you > each file). Well, in Ping's absence I've generally fielded the c.l.py questions about tokenize.py too, and there's a pattern: non-GUI people simply seem to find callbacks confusing! os.path.walk has some other UI glitches (like "arg" is the 3rd argument to walk but the 1st arg to the callback, & people don't know what its purpose is anyway), but I think the callback is the core of it (& "arg" is an artifact of the callback interface). I can't help but opine that part of what people find so confusing about call/cc in Scheme is that it calls a function taking a callback argument too. Generators aren't strong enough to replace call/cc, but they're exactly what's needed to make tokenize's interface match the obvious mental model ("the program is a stream of tokens, and I want to iterate over that"); c.f. Sam's comments too about layers of callbacks vs "normal control flow". >> 3) Python's current threads are good for overlapping I/O. >> Sometimes. And better addressed by Sam's non-threaded "select" >> approach when you're dead serious about overlapping lots of I/O. 
> This is independent of Python, and is (I think) fairly common > knowledge -- if you have 10 threads this works fine, but with 100s of > them the threads themselves become expensive resources. I think people with a Unix background understand that, but not sure about Windows natives. Windows threads really are cheap, which easily slides into abuse; e.g., the recently-fixed electron-width hole in cleaning up thread states required extreme rates of thread death to provoke, and has been reported by multiple Windows users. An SGI guy was kind enough to confirm the test case died for him too, but did any non-Windows person ever report this bug? > But then you end up with contorted code which is why high-performance > systems require experts to write them. Which feeds back into Sam's agenda: the "advanced" control-flow gimmicks can be used by an expert to implement a high-performance system that doesn't require expertise to use. Fake threads would be good enough for that purpose too (while real threads aren't), although he's got his heart set on one of the others. >> I don't know, Guido -- if all you wanted threads for was to speed up a >> little I/O in as convoluted a way as possible, you may have been witness >> to the invention of the wheel but missed that ox carts weren't the last >> application . > What were those applications of threads again you were talking about > that could be serviced by fake threads that weren't coroutines/generators? First, let me apologize for the rhetorical excess there -- it went too far. Forgive me, or endure more of the same . Second, the answer is (of course) "none", but that was a rant about real threads, not fake ones. 
so-close-you-can-barely-tell-'em-apart-ly y'rs - tim From gmcm@hypernet.com Sun Jul 4 14:23:31 1999 From: gmcm@hypernet.com (Gordon McMillan) Date: Sun, 4 Jul 1999 08:23:31 -0500 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <000d01bec5f9$a95fa9a0$ea9e2299@tim> References: <1281066233-51948648@hypernet.com> Message-ID: <1281031342-54048300@hypernet.com> [I jump back into a needlessly contentious thread]: [Gordon McMillan - me] > > Threads (real or fake) are useful when more than one thing is > > "driving" your processing. It's just that in the real world (a place > > Tim visited, once, but didn't like - or was it vice versa?) those > > "drivers" are normally I/O. [Tim] > Yes, but that's the consensus view of "real", and so suffers from > "ten billion flies can't be wrong" syndrome . If you pitch a > parallel system to the NSA, I can assure you that gov't work isn't "real", even when the problem domain appears to be, which in this case is assuredly not true . But the point really is that (1) Guido's definition of "I/O" is very broad and (2) given that definition, it probably does account for 99% of the cases. Which is immaterial, if the fix for one fixes the others. > > Guido complained that to do it right would require gathering up all > > the fds and doing a select. I don't think that's true (at least, for > > a decent fake thread). You just have to select on the one (to see if > > the I/O will work) and swap or do it accordingly. Also makes it a bit > > easier for portability (I thought I heard that Mac's select is > > limited to sockets). > > Can you flesh out the "swap" part more? That is, we're in the > middle of some C code, so the C stack is involved in the state > that's being swapped, and under fake threads we don't have a real > thread to magically capture that. Sure - it's spelled "T I S M E R". 
IIRC, this whole thread started with Guido dumping cold water on the
comment that perhaps Chris's work could yield green (er, "fake")
threads.

> > I see 2 questions. First, is there enough of an audience (Mac,
> > mostly, I think) without native threads to make them worthwhile?
> > Second, do we want to introduce yet more possibilities for
> > brain-explosions by enabling coroutines / continuations / generators
> > or some such? There is practical value there (as Sam has pointed out,
> > and I now concur, watching my C state machine grow out of control
> > with each new client request).
> >
> > I think the answer to both is probably "yes", and though they have a
> > lot in common technically, they have totally different rationales.
>
> a) Generators aren't enough for Sam's designs.

OK, but they're still (minorly) mind expanding for someone from the
orthodox C / Python world...

> b) Fake threads are roughly comparable to coroutines and
> continuations wrt power (depending on implementation details,
> continuations may be strictly most powerful, and coroutines least).
>
> c) Christian's stackless Python can, I believe, already do full
> coroutines, and is close to doing full continuations. So soon we
> can kick the tires instead of each other .

So then we're down to Tim faking the others from whatever Chris comes
up with? Sounds dandy to me! (Yah, bitch and moan Tim; you'd do it
anyway...).

(And yes, we're on the "dev" list; this is all experimental; so Guido
can just live with being a bit uncomfortable with it ).

The rambling arguments have had to do with "reasons" for doing this
stuff. I was just trying to point out that there are a couple valid
but very different reasons:

1) Macs.
2) Sam.
almost-a-palindrome-ly y'rs

- Gordon

From tismer@appliedbiometrics.com Sun Jul 4 15:06:01 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Sun, 04 Jul 1999 16:06:01 +0200
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
References: <000c01bec5f9$a3e86e80$ea9e2299@tim>
Message-ID: <377F6A49.B48E8000@appliedbiometrics.com>

Just a few clarifications.

I have no time, but need to share what I learned.

Tim Peters wrote:
>
> [Guido and Tim, Guido and Tim]
...
> Among the three {coroutines, fake threads, continuations}, I expect any
> could be serviceably simulated via either of the others. There: just saved
> a full page of sentence diagramming . All offer a strict superset of
> generator semantics.

I have just proven that this is not true. Full continuations
cannot be expressed by coroutines. All the rest is true.

Coroutines and fake threads just need the absence of the
C stack. To be more exact: It needs that the current
state of the C stack is independent from executing bound
Python code (which frames are).
Now the big surprise: This *can* be done without removing
the C stack. It can give more speed to let the stack
wind up to some degree and wind down later. Even some
Scheme implementations are doing this.
But the complexity to make this work correctly is even higher
than to be stackless whenever possible. So this is the basement,
but improvements are possible and likely to appear.

Anyway, with this, you can build fake threads, coroutines
and generators. They all do need a little extra treatment.
Switching of context, how to stop a coroutine, how to
catch exceptions and so on. You can do all that with some
C code.
I just believe that even that can be done with Python.
Here the unsayable continuation word appears. You must
have them if you want to try the above *in* Python.

Reason why continuations are the hardest of the above to
implement and cannot be expressed by them:
A continuation is the future of some computation.
It allows the order of execution of a frame to be changed in a
radical way. A frame can have as many as one dormant continuation
per function call which appears lexically, and it cannot
predict which of these is actually a continuation.

From that follows: The Python stack, which keeps intermediate
values under the (now wrong) assumption that it knows the
execution order, is in the way. Just think of a for loop,
but I found other examples where this is the case.
Simple stack capturing does just half the job.

Conclusion: In order to have full continuations (and believe me,
just to try it, without pushing to get it into Python), one
needs to identify the stack locations which carry values
which are needed in more than one continuation. In other
words, we have to find hidden registers.
I discovered this rather late when I claimed to have them already,
since my examples all worked fine. And in most cases,
this will stay so. Just to make every situation consistent,
the above extra analysis is a must, IMO.

This is my path and the end of this journey:

- Refine stackless Python to be strong enough to even stand
  continuations. This is done.

- Write the whole co-anything stuff as a pure extension module.
  Also done. Without this extension, Python is *not* affected.

- Solve the last issue of shared hidden registers. I wrote the
  analyzer in C already. Now I need time to do the wind/unwind
  stuff. Also in the extension, no change to Python!

When this is done, what do we have?
We have Python with continuations, where I'm not sure if
that is what we finally want. No question that this is
an issue.
BUT: We can now implement the real things like fake-threads,
coroutines, generators and whatever else we come up with,
given these continuations.

I hope every listener sees what I really want:
When the above sandbox is there, then we can try a lot and
find the best, nicest, easiest to grasp, most natural and
straightforward way to implement the high level functions
and objects.
When that is done, and not before, then we should code "the right thing"(TM) in C, and we can forget about continuations if nobody needs them any longer. Sorry that my path got longer than I planned. I cannot understand certain things without implementing them. sincerely - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From klm@digicool.com Sun Jul 4 15:30:00 1999 From: klm@digicool.com (Ken Manheimer) Date: Sun, 4 Jul 1999 10:30:00 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) References: <000c01bec5f9$a3e86e80$ea9e2299@tim> <377F6A49.B48E8000@appliedbiometrics.com> Message-ID: <002601bec629$b38eedc0$5a57a4d8@erols.com> I have to say thank you, christian! I think your intent - provide the basis for designers of python's advanced control mechanisms to truly explore, and choose the direction in a well informed way - is ideal, and it's a rare and wonderful opportunity to be able to pursue something like an ideal course. Thanks to your hard work. Whatever comes of this, i think we all have at least refined our understandings of the issues - i know i have. (Thanks also to the ensuing discussion's clarity and incisiveness - i need to thank everyone involved for that...) I may not be able to contribute particularly to the implementation, but i'm glad to be able to grasp the implications as whatever proceeds, proceeds. And i actually expect that the outcome will be much better informed than it would have been without your following through on your own effort to understand. Yay! 
Ken klm@digicool.com From gmcm@hypernet.com Sun Jul 4 19:25:20 1999 From: gmcm@hypernet.com (Gordon McMillan) Date: Sun, 4 Jul 1999 13:25:20 -0500 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <377F6A49.B48E8000@appliedbiometrics.com> Message-ID: <1281013244-55137551@hypernet.com> I'll second Ken's congratulations to Christian! [Christian] > ... Full continuations > cannot be expressed by coroutines. All the rest is true. I beg enlightenment from someone more familiar with these high-falutin' concepts. Would the following characterization be accurate? All these beasts (continuations, coroutines, generators) involve the idea of "resumable", but: A generator's state is wholly self-contained A coroutines's state is not necessarily self-contained but it is stable Continuations may have volatile state. Is this right, wrong, necessary, sufficient...?? goto-beginning-to-look-attractive-ly y'rs - Gordon From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Sun Jul 4 23:14:36 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw) Date: Sun, 4 Jul 1999 18:14:36 -0400 (EDT) Subject: [Python-Dev] Mail getting lost? (was RE: Fake threads) References: <1281066233-51948648@hypernet.com> <000d01bec5f9$a95fa9a0$ea9e2299@tim> Message-ID: <14207.56524.360202.939414@anthem.cnri.reston.va.us> >>>>> "TP" == Tim Peters writes: TP> Me neither! An exclamation point because I see there's a TP> recent post of yours in the Python-Dev archives, but I didn't TP> get it in the mail either. A bad performance problem in Mailman was causing cpu starvation and (I'm surmising) lost messages. I believe I've fixed this in the version currently running on python.org. If you think messages are showing up in the archives but you are still not seeing them delivered to you, please let me know via webmaster@python.org! 
-Barry

From guido@CNRI.Reston.VA.US Mon Jul 5 13:12:41 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Mon, 05 Jul 1999 08:12:41 -0400
Subject: [Python-Dev] Welcome Jean-Claude Wippler
Message-ID: <199907051212.IAA11729@eric.cnri.reston.va.us>

We have a new python-dev member. Welcome, Jean-Claude! (It seems you
are mostly interested in lurking, since you turned on digest mode :-)

Remember, the list's archives and member list are public; both are
accessible via

    http://www.python.org/mailman/listinfo/python-dev

I would welcome more members -- please suggest names and addresses to
me!

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido@CNRI.Reston.VA.US Mon Jul 5 13:06:03 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Mon, 05 Jul 1999 08:06:03 -0400
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
In-Reply-To: Your message of "Sun, 04 Jul 1999 13:25:20 CDT." <1281013244-55137551@hypernet.com>
References: <1281013244-55137551@hypernet.com>
Message-ID: <199907051206.IAA11699@eric.cnri.reston.va.us>

> [Christian]
> > ... Full continuations
> > cannot be expressed by coroutines. All the rest is true.

[Gordon]
> I beg enlightenment from someone more familiar with these
> high-falutin' concepts. Would the following characterization be
> accurate?
>
> All these beasts (continuations, coroutines, generators) involve the
> idea of "resumable", but:
>
> A generator's state is wholly self-contained
> A coroutines's state is not necessarily self-contained but it is stable
> Continuations may have volatile state.
>
> Is this right, wrong, necessary, sufficient...??

I still don't understand all of this (I have not much of an idea of
what Christian's search for hidden registers is about and what kind of
analysis he needs) but I think of continuations as requiring
(theoretically) copying the current stack (to and from), while
generators and coroutines just need their own piece of stack set
aside.
The difference between any of these and threads (fake or real) is that
they pass control explicitly, while threads (typically) presume
pre-emptive scheduling, i.e. they make independent parallel progress
without explicit synchronization. (Hmm, how do you do this with fake
threads? Or are these only required to switch whenever you touch a
mutex?)

I'm not sure if there's much of a difference between generators and
coroutines -- it seems just the termination convention. (Hmm...
would/should a generator be able to raise an exception in its caller?
A coroutine?)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From tim_one@email.msn.com Mon Jul 5 07:55:02 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Mon, 5 Jul 1999 02:55:02 -0400
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
In-Reply-To: <1281013244-55137551@hypernet.com>
Message-ID: <000101bec6b3$4e752be0$349e2299@tim>

[Gordon]
> I beg enlightenment from someone more familiar with these
> high-falutin' concepts. Would the following characterization be
> accurate?
>
> All these beasts (continuations, coroutines, generators) involve the
> idea of "resumable", but:
>
> A generator's state is wholly self-contained
> A coroutines's state is not necessarily self-contained but it is
> stable
> Continuations may have volatile state.
>
> Is this right, wrong, necessary, sufficient...??
>
> goto-beginning-to-look-attractive-ly y'rs

"goto" is deliciously ironic, for a reason to become clear .

Here's my biased short course.

NOW

First, I have the feeling most people would panic if we simply
described Python's current subroutine mechanism under a new name <0.9
wink>. I'll risk that. When Python makes a call, it allocates a frame
object. Attached to the frame is the info everyone takes for granted
so thinks is "simple & obvious" .
Chiefly,

    "the locals" (name -> object bindings)

    a little evaluation stack for holding temps and dynamic
    block-nesting info

    the offset to the current bytecode instruction, relative to the
    start of the code object's fixed (immutable) bytecode vector

When a subroutine returns, it decrefs the frame and then the frame
typically goes away; if it returns because of an exception, though,
traceback objects may keep the frame alive.

GENERATORS

Generators add two new abstract operations, "suspend" and "resume".
When a generator suspends, it's exactly like a return today except we
simply decline to decref the frame. That's it! The locals, and where
we are in the computation, aren't thrown away. A "resume" then
consists of *re*starting the frame at its next bytecode instruction,
with the retained frame's locals and eval stack just as they were.

Some generator properties:

+ In implementation terms a trivial variation on what Python
  currently does.

+ They're asymmetric: "suspend" is something only a generator can do,
  and "resume" something only its caller can do (this does not
  preclude a generator from being "the caller" wrt some other
  generator, though, and indeed that's very useful in practice).

+ A generator always returns control directly to its caller, at the
  point the caller invoked the generator. And upon resumption, a
  generator always picks up where it left off.

+ Because a generator remembers where it is and what its locals are,
  its state and "what to do next" don't have to be encoded in global
  data structures then decoded from scratch upon entry. That is,
  whenever you build a little (or large!) state machine to figure out
  "what to do next" from a collection of persistent flags and state
  vrbls, chances are good there's a simple algorithm dying to break
  free of that clutter .

COROUTINES

Coroutines add only one new abstract operation, "transfer". They're
fully symmetric so can get away with only one. "transfer" names a
coroutine to transfer to, and gives a value to deliver to it (there
are variations, but this one is common & most useful). When A
transfers to B, it acts like a generator "suspend" wrt A and like a
generator "resume" wrt B. So A remembers where it is, and what its
locals etc are, and B gets restarted from the point *it* last
transferred to someone else.

Coroutines grew up in simulation languages because they're an
achingly natural way to model independent objects that interact with
feedback. There each object (which may itself be a complex system of
other stuff) is written as an infinite loop, transferring control to
other objects when it has something to tell them, and transferred to
by other objects when they have something to tell it.

A Unix pipeline "A | B | C | D" doesn't exploit the full power but is
suggestive. A may be written as

    while 1:
        x = compute my next output
        B.transfer(x)      # resume B with my output

B as

    while 1:
        x = A.transfer()   # resume A to get my input
        y = compute something from x and my own history
        C.transfer(y)      # resume C with my output

C as

    while 1:
        x = B.transfer()   # resume B to get my input
        y = compute something from x and my own history
        D.transfer(y)      # resume D with my output

and D as

    while 1:
        x = C.transfer()   # resume C to get my input
        y = compute something from x and my own history
        print y

If e.g. C collapses pairs of values from B, it can be written instead
as

    while 1:
        # get a pair of B's
        x = B.transfer()
        y = B.transfer()
        z = f(x, y, whatever)
        D.transfer(z)      # resume D with my output

It's a local modification to C: B doesn't know and shouldn't need to
know. This keeps complex algorithms manageable as things evolve.

Initialization and shutdown can be delicate, but once the pipe is set
up it doesn't even matter which of {A, B, C, D} first gets control!
You can view A as pushing results through the pipe, or D as pulling
them, or whatever. In reality they're all equal partners.
Why these are so much harder to implement than generators: "transfer" *names* who next gets control, while generators always return to their (unnamed) caller. So a generator simply "pops the stack" when it suspends, while coroutine flow need not be (and typically isn't) stack-like. In Python this is currently a coroutine-killer, because the C stack gets intertwined. So if coroutine A merely calls (in the regular sense) function F, and F tries to transfer to coroutine B, the info needed to resume A includes the chunk of the C stack between A and F. And that's why the Python coroutine implementation I referenced earlier uses threads under the covers (where capturing pieces of the C stack isn't a problem). Early versions of coroutines didn't allow for this, though! At first coroutines could only transfer *directly* to other coroutines, and as soon as a coroutine made "a regular call" transfers were prohibited until the call returned (unless the called function kicked off a brand new collection of coroutines, which could then transfer among themselves -- making the distinction leads to convoluted rules, so modern practice is to generalize from the start). Then the current state of each coroutine was contained in a single frame, and it's really no harder to implement than generators. Knuth seems to have this restricted flavor of coroutine in mind when he describes generator behavior as "semi-coroutine". CONTINUATIONS Given the pedagogical structure so far, you're primed to view continuations as an enhancement of coroutines. And that's exactly what will get you nowhere . Continuations aren't more elaborate than coroutines, they're simpler. Indeed, they're simpler than generators, and even simpler than "a regular call"! That's what makes them so confusing at first: they're a different *basis* for *all* call-like behavior. Generators and coroutines are variations on what you already know; continuations challenge your fundamental view of the universe. 
Legend has it they were discovered when theorists were trying to find a solid reason for why goto statements suck: the growth of "denotational semantics" (DS) boomed at the same time "structured programming" took off. The former is a solid & fruitful approach to formally specifying the semantics of programming languages, built on the lambda calculus (and so dear to the Lisp/Scheme community -- this all ties together, of course ). The early hope was that goto statements would prove to present intractable problems for formal specification, and then "that's why they suck: we can't even sort them out on paper, let alone in practice". But in one of God's cleverer tricks on the programming world , the semantics of goto turned out to be trivial: at a branch point, you can go one of two ways. Represent one of those ways by a function f that computes what happens if you branch one way, and the other way by a function g. Then an if+goto simply picks one of f or g as "the continuation" of the program, depending on whether the "if" condition is true or false. And a plain goto simply replaces the current continuation with a different one (representing what happens at the branch target) unconditionally. So goto turned out to be simpler (from the DS view) than even an assignment stmt! I've often suspected theorists were *surprised* (and maybe appalled <0.7 wink>) when the language folks went on to *implement* the continuation idea. Don't really know, but suppose it doesn't matter anyway. The fact is we're stuck with them now . In theory a continuation is a function that computes "the rest of the program", or "its future". And it really is like a supercharged goto! It's the formal DS basis for all control flow, from goto stmts to exception handling, subsuming vanilla call flow, recursion, generators, coroutines, backtracking, and even loops along the way. 
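
[Editorial sketch, not part of the original post.] The "if+goto picks f or g as the continuation" idea can be played with in plain Python by representing each continuation as an ordinary function. This is only an illustration under stated assumptions: the little trampoline loop `run`, the helper `branch`, and the state layout are all invented here, and none of it is how denotational semantics or any real implementation is written down.

```python
def branch(condition, if_true, if_false):
    # An "if + goto": pick one of two functions as the continuation.
    return if_true if condition else if_false


def loop_head(state):
    n, acc = state
    # if n == 0 goto done, else goto loop_body
    return branch(n == 0, done, loop_body), state


def loop_body(state):
    n, acc = state
    # A plain goto: unconditionally replace the current continuation.
    return loop_head, (n - 1, acc + n)


def done(state):
    n, acc = state
    # No continuation left: the program's "future" is empty.
    return None, acc


def run(continuation, value):
    # Trampoline: keep evaluating "the rest of the program" until none is left.
    while continuation is not None:
        continuation, value = continuation(value)
    return value


result = run(loop_head, (4, 0))  # sums 4 + 3 + 2 + 1
print(result)
```

Note that the "loop" here contains no Python loop construct of its own; it is just two continuations handing control to each other.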
To a certain frame of mind (like Sam's, and Christian is temporarily under his evil influence ), this relentless uniformity & consistency of approach is very appealing. Guido tends to like his implementations to mirror his surface semantics, though, and if he has ten constructs they're likely to be implemented ten ways. View that as a preview of future battles that have barely been hinted at so far <0.3 wink>. Anyway, in implementation terms a continuation "is like" what a coroutine would be if you could capture its resumption state at any point (even without the coroutine's knowledge!) and assign that state to a vrbl. So we could say it adds an abstract operation "capture", which essentially captures the program counter, call stack, and local (in Python terms) "block stack" at its point of invocation, and packages all that into a first-class "continuation object". IOW, a building block on top of which a generator's suspend, and the suspend half of a coroutine transfer, can be built. In a pure vision, there's no difference at all between a regular return and the "resume" half of a coroutine transfer: both amount to no more than picking some continuation to evaluate next. A continuation can be captured anywhere (even in the middle of an expression), and any continuation can be invoked at will from anywhere else. Note that "invoking a continuation" is *not* like "a call", though: it's abandoning the current continuation, *replacing* it with another one. In formal DS this isn't formally true (it's still "a call" -- a function application), but in practice it's a call that never returns to its caller so the implementation takes a shortcut. Like a goto, this is as low-level as it gets, and even hard-core continuation fans don't use them directly except as a means to implement better-behaved abstractions. As to whether continuations have "volatile state", I'm not sure what that was asking. 
If a given continuation is invoked more than once (which is something that's deliberately done when e.g. implementing backtracking searches), then changes made to the locals by the first invocation are visible to the second (& so on), so maybe the answer is "yes". It's more accurate to think of a continuation as being immutable, though: it holds a reference to the structure that implements name bindings, but does not copy (save or restore) the bindings.

Quick example, given:

    (define continuation 0)

    (define (test)
      (let ((i 0))
        (call/cc (lambda (k)
                   (set! continuation k)))
        (set! i (+ i 1))
        i))

That's like the Python:

    def test():
        i = 0
        global continuation
        continuation = magic to resume at the start of the next line
        i = i + 1
        return i

Then (this is interactive output from a Scheme shell):

> (test) ; Python "test()"
1
> (continuation) ; Python "continuation()"
2
> (continuation)
3
> (define thisguy continuation) ; Python "thisguy = continuation"
> (test)
1
> (continuation)
2
> (thisguy)
4
>

too-simple-to-be-obvious?-ly y'rs - tim

From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Mon Jul 5 17:55:01 1999
From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw)
Date: Mon, 5 Jul 1999 12:55:01 -0400 (EDT)
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
References: <1281013244-55137551@hypernet.com> <000101bec6b3$4e752be0$349e2299@tim>
Message-ID: <14208.58213.449486.917974@anthem.cnri.reston.va.us>

Wow. That was by far the clearest tutorial on the subject I think I've read. I guess we need (for Tim to have) more 3 day holiday weekends.
i-vote-we-pitch-in-and-pay-tim-to-take-/every/-monday-off-so-he-can-write- more-great-stuff-like-this-ly y'rs, -Barry From skip@mojam.com (Skip Montanaro) Mon Jul 5 18:54:45 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Mon, 5 Jul 1999 13:54:45 -0400 (EDT) Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <14208.58213.449486.917974@anthem.cnri.reston.va.us> References: <1281013244-55137551@hypernet.com> <000101bec6b3$4e752be0$349e2299@tim> <14208.58213.449486.917974@anthem.cnri.reston.va.us> Message-ID: <14208.61767.893387.713711@cm-24-29-94-19.nycap.rr.com> Barry> Wow. That was by far the clearest tutorial on the subject I Barry> think I've read. I guess we need (for Tim to have) more 3 day Barry> holiday weekends. What he said. Skip From MHammond@skippinet.com.au Tue Jul 6 02:16:45 1999 From: MHammond@skippinet.com.au (Mark Hammond) Date: Tue, 6 Jul 1999 11:16:45 +1000 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <000101bec6b3$4e752be0$349e2299@tim> Message-ID: <000401bec74d$37c8d370$0801a8c0@bobcat> > NOW No problems, fine sailing... > GENERATORS Cruising along - nice day to be out! > COROUTINES Such a pleasant day! > CONTINUATIONS Are they clouds I see? > Given the pedagogical structure so far, you're primed to view > continuations > as an enhancement of coroutines. And that's exactly what will get you > nowhere . Continuations aren't more elaborate than coroutines, > they're simpler. Indeed, they're simpler than generators, A storm warning... > Legend has it they were discovered when theorists were trying > to find a > solid reason for why goto statements suck: the growth of > "denotational > semantics" (DS) boomed at the same time "structured > programming" took off. > The former is a solid & fruitful approach to formally specifying the She's taking on water! > In theory a continuation is a function that computes "the rest of the > program", or "its future". 
OK - before I abandon ship, I might need my hand-held. Before I start, let me echo Skip and Barry - an excellent precis of a topic I knew nothing about (as I made you painfully aware!) And I will avoid asking you to explain the above paragraph again for now :-)

I'm a little confused by how these work in practice. I can see how continuations provide the framework to do all these control things.

It is clear to me how you can capture the "state" of a running program. Indeed, this is exactly what it seems generators and coroutines do.

With continuations, how is the state captured or created? Eg, in the case of implementing a goto or a function call, there doesn't seem to me to be a state available. Does the language supporting continuations allow you to explicitly create one from any arbitrary position? I think you sort-of answered that below:

> Anyway, in implementation terms a continuation "is like" what
> a coroutine
> would be if you could capture its resumption state at any point (even
> without the coroutine's knowledge!) and assign that state to
> a vrbl. So we

This makes sense, although it implies a "running state" is necessary for this to work. In the case of transferring control to somewhere you have never been before (eg, a goto or a new function call) how does this work?

Your example:
> def test():
>     i = 0
>     global continuation
>     continuation = magic to resume at the start of the next line
>     i = i + 1
>     return i

My main problem is that this looks closer to your description of a kind-of one-sided coroutine - ie, instead of only being capable of transferring control, you can assign the state. I can understand that fine. But in the example, the function _is_ aware its state is being captured - indeed, it is explicitly capturing it.

My only other slight conceptual problem was how you implement functions, as I don't understand how the concept of return values fits in at all. But I'm sure that would become clearer when the rest of the mud is wiped from my eyes.
And one final question: In the context of your tutorial, what do Chris' latest patches arm us with? Given my new-found expertise in this matter I would guess that the guts is there to have at least co-routines, as capturing the state of a running Python program, and restarting it later is possible. Im still unclear about continuations WRT "without the co-routines knowledge", so really unsure what is needed here... The truly final question:-) Assuming Chris' patches were 100% bug free and reliable (Im sure they are very close :-) what would the next steps be to take advantage of it in a "clean" way? ie, assuming Guido blesses them, what exactly could I do in Python? (All I really know is that the C stack has gone - thats it!) Thanks for the considerable time it must be taking to enlightening us! Mark. From jcw@equi4.com Tue Jul 6 10:27:13 1999 From: jcw@equi4.com (Jean-Claude Wippler) Date: Tue, 06 Jul 1999 11:27:13 +0200 Subject: [Python-Dev] Re: Welcome Jean-Claude Wippler Message-ID: <3781CBF1.B360D466@equi4.com> Thank you Guido, for admitting this newbie to Python-dev :) [Guido: ... you are mostly interested in lurking ... digest mode ...] Fear of being flooded by email, a little shy (who, me?), and yes, a bit of curiosity. Gosh, I got to watch my steps, you figured it all out :) Thanks again. I went through the last month or so of discussion, and am fascinated by the topics and issues you guys are dealing with. And now, seeing Tim's generator/coroutine/continuations description is fantastic. Makes it obvious that I'm already wasting way too much bandwidth. When others come to mind, I'll let them know about this list. But so far, everyone I can come up with already is a member, it seems. 
-- Jean-Claude

From guido@CNRI.Reston.VA.US Tue Jul 6 16:08:37 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 06 Jul 1999 11:08:37 -0400
Subject: [Python-Dev] Welcome to Chris Petrilli
Message-ID: <199907061508.LAA12663@eric.cnri.reston.va.us>

Chris, would you mind posting a few bits about yourself? Most of the people on this list have met each other at one point or another (with the big exception of the elusive Tim Peters :-); it's nice to know more than a name...

--Guido van Rossum (home page: http://www.python.org/~guido/)

From petrilli@amber.org Tue Jul 6 16:16:10 1999
From: petrilli@amber.org (Christopher Petrilli)
Date: Tue, 6 Jul 1999 11:16:10 -0400
Subject: [Python-Dev] Welcome to Chris Petrilli
In-Reply-To: <199907061508.LAA12663@eric.cnri.reston.va.us>; from Guido van Rossum on Tue, Jul 06, 1999 at 11:08:37AM -0400
References: <199907061508.LAA12663@eric.cnri.reston.va.us>
Message-ID: <19990706111610.A4585@amber.org>

On Tue, Jul 06, 1999 at 11:08:37AM -0400, Guido van Rossum wrote:
> Chris, would you mind posting a few bits about yourself? Most of the
> people on this list have met each other at one point or another (with
> the big exception of the elusive Tim Peters :-); it's nice to know
> more than a name...

As we are all aware, Tim is simply the graduate project of an AI student, running on a network of Symbolics machines :-)

Honestly though, about me? Um, well, I'm now (along with Brian Lloyd) the Product Management side of Digital Creations, and Zope, so I have a very vested interest in seeing Python succeed---besides my general belief that the better language SHOULD win. My background is actually in architecture, but I've spent the past 5 years working in the cryptography world, mostly in smart cards and PKI. My computer background is bizarre, twisted and quite nefarious... having grown up on a PDP-8/e, rather than PCs. And if the fact that I own 4 Lisp machines means anything, I'm afraid to ask what!
For now, I'm just going to watch the masters at work. :-)

Chris
--
| Christopher Petrilli        ``Television is bubble-gum for
| petrilli@amber.org            the mind.''-Frank Lloyd Wright

From tim_one@email.msn.com Wed Jul 7 02:52:15 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Tue, 6 Jul 1999 21:52:15 -0400
Subject: [Python-Dev] Fancy control flow
Message-ID: <000301bec81b$56e87660$c99e2299@tim>

Responding to a msg of Guido's that shows up in the archives but didn't come across the mail link (the proper authorities have been notified, and I'm sure appropriate heads will roll at the appropriate pace ...).

> From guido@CNRI.Reston.VA.US Mon, 05 Jul 1999 08:06:03 -0400
> Date: Mon, 05 Jul 1999 08:06:03 -0400
> From: Guido van Rossum guido@CNRI.Reston.VA.US
> Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)

[generators, coroutines, continuations]
> I still don't understand all of this (I have not much of an idea of
> what Christian's search for hidden registers is about and what kind of
> analysis he needs) but I think of continuations as requiring
> (theoretically) copying the current stack (to and from), while
> generators and coroutines just need their own piece of stack set aside.

A generator needs one frame, period (its own!). "Modern" coroutines can get more involved, mixing regular calls with coroutine transfers in arbitrary ways.

About Christian's mysterious quest, we've been pursuing it offline. By "hidden registers" I think he means stuff on the eval stack that should *not* be saved/restored as part of a continuation's state. It's not clear to me that this isn't the empty set. The most debatable case I've seen is the Python "for" loop, which hides an anonymous loop counter on the stack.

    for i in seq:
        if func1(i):
            func2(i)

This is more elaborate than necessary, but since it's the one we've discussed offline I'll just stick with it. Suppose func1 saves a continuation on the first iteration, and func2 invokes that continuation on the fifth.
Does the resumed continuation "see" the loop as being on its first iteration or as being on its fifth?

In favor of the latter is that the loop above "should be" equivalent to this:

    hidden = 0
    while 1:
        try:
            temp = seq[hidden]
        except IndexError:
            break
        hidden = hidden + 1
        i = temp
        if func1(i):
            func2(i)

since that's what "for" *does* in Python. With the latter spelling, it's clear that the continuation should see the loop as being on its fifth iteration (continuations see changes in bindings, and making the loop counter a named local exposes it to that rule). But if the entire eval stack is (conceptually) saved/restored, the loop counter is part of it, so the continuation will see the loop counter at its old value.

I think it's arguable either way, and argued in favor of "fifth" initially. Now I'm uncertain, but leaning toward "first".

> The difference between any of these and threads (fake or real) is that
> they pass control explicitly, while threads (typically) presume
> pre-emptive scheduling, i.e. they make independent parallel progress
> without explicit synchronization.

Yes.

> (Hmm, how do you do this with fake threads? Or are these only required
> to switch whenever you touch a mutex?)

I'd say they're only *required* to switch when one tries to acquire a mutex that's already locked. It would be nicer to switch them as ceval already switches "real threads", that is give another one a shot every N bytecodes.

> I'm not sure if there's much of a difference between generators and
> coroutines -- it seems just the termination convention.

A generator is a semi-coroutine, but is the easier half.

> (Hmm... would/should a generator be able to raise an exception in its
> caller?

Definitely. This is all perfectly clear for a generator -- it has a unique & guaranteed still-active place to return *to*. Years ago I tried to rename them "resumable functions" to get across what a trivial variation of plain functions they really are ...

> A coroutine?)

This one is muddier.
A (at line A1) transfers to B (at line B1), which transfers at line B2 to A (at line A2), which at line A3 transfers to B (at line B3), and B raises an exception at line B4. The obvious thing to do is to pass it on to line A3+1, but what if that doesn't catch it either? We got to A3 from A2 from B2 from B1, but B1 is long gone.

That's a real difference with generators: resuming a generator is stack-like, while a co-transfer is just moving control around a flat graph, like pushing a pawn around a chessboard.

The coroutine implementation I posted 5 years ago almost punted on this one: if any coroutine suffered an unhandled exception, all coroutines were killed and an EarlyExit exception was raised in "the main coroutine" (the name given to the thread of your code that created the coroutine objects to begin with).

Deserves more thought than that, though.

or-maybe-it-doesn't-ly y'rs - tim

From tim_one@email.msn.com Wed Jul 7 05:18:13 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Wed, 7 Jul 1999 00:18:13 -0400
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
In-Reply-To: <000401bec74d$37c8d370$0801a8c0@bobcat>
Message-ID: <000001bec82f$bc171500$089e2299@tim>

[Mark Hammond]
> ...
> Thanks for the considerable time it must be taking to enlighten us!

You're welcome, but the holiday weekend is up and so is my time. Thank (all of) *you* for the considerable time it must take to endure all this!

Let's hit the highlights (or lowlights, depending on your view):

> ...
> I'm a little confused by how these [continuations] work in practice.

Very delicately. Earlier I posted a continuation-based implementation of generators in Scheme, and Sam did the same for a hypothetical Python with a call/cc equivalent. Those are typical enough. Coroutines are actually easier to implement (using continuations), thanks to their symmetry.

Again, though, you never want to muck with continuations directly! They're too wild.
You get an expert to use them in the bowels of an implementation of something else. > ... > It is clear to me how you can capture the "state" of a running program. > Indeed, this is exactly what it seems generators and coroutines do. Except that generators need only worry about their own frame. Another way to view it is to think of the current computation being run by a (real ) thread -- then capturing a continuation is very much like making a frozen clone of that thread, stuffing it away somewhere for later thawing. > With continuations, how is the state captured or created? There are, of course, many ways to implement these things. Christian is building them on top of the explicit frame objects Python already creates, and that's a fine way for Python. Guido views it as cloning the call stack, and that's accurate too. >> Anyway, in implementation terms a continuation "is like" what >> a coroutine would be if you could capture its resumption state at >> any point (even without the coroutine's knowledge!) and assign that >> state to a vrbl. > This makes sense, although it implies a "running state" is necessary for > this to work. In implementations (like Chris's) that do it all dynamically at runtime, you bet: you not only need a "running state", you can only capture a continuation at the exact point (the specific bytecode) you run the code to capture it. In fact, there *is* "a continuation" at *every* dynamic instance of every bytecode, and the question is then simply which of those you want to save . > In the case of transfering control to somewhere you have never been > before (eg, a goto or a new function call) how does this work? Good eye: it doesn't in this scheme. The "goto" business is a theoretical transformation, in a framework where *every* operation is modeled as a function application, and an entire program's life is modeled as a single function call. Some things are very easy to do in theory . 
> Your example:
>> def test():
>>     i = 0
>>     global continuation
>>     continuation = magic to resume at the start of the next line
>>     i = i + 1
>>     return i

> My main problem is that this looks closer to your description of a kind-of
> one-sided coroutine - ie, instead of only being capable of transferring
> control, you can assign the state. I can understand that fine.

Good!

> But in the example, the function _is_ aware its state is being
> captured - indeed, it is explicitly capturing it.

In real life, "magic to resume at the start of the next line" may be spelled concretely as e.g.

    xyz(7)

or even

    a.b

That is, anywhere in "test" any sort of (explicit or implicit) call is made *may* be part of a saved continuation, because the callee can capture one -- with or without test's knowledge.

> My only other slight conceptual problem was how you implement functions,
> as I don't understand how the concept of return values fits in at all.

Ya, I didn't mention that. In Scheme, the act of capturing a continuation returns a value. Like so:

    (define c #f)  ; senseless, but Scheme requires definition before reference

    (define (test)
      (print
       (+ 1
          (call/cc
           (lambda (k)
             (set! c k)
             42))))
      (newline))

The function called by call/cc there does two things:

1) Stores call/cc's continuation into the global "c".
2) Returns the int 42.

> (test)
43
>

Is that clear? The call/cc expression returns 42. Then (+ 1 42) is 43; then (print 43) prints the string "43"; then (newline) displays a newline; then (test) returns to *its* caller, which is the Scheme shell's read/eval/print loop.

Now that whole sequence of operations -- what happens to the 42 and beyond -- *is* "call/cc's continuation", which we stored into the global c. A continuation is itself "a function", that returns its argument to the context where the continuation was captured.

So now e.g.
> (c 12)
13
>

c's argument (12) is used in place of the original call/cc expression; then (+ 1 12) is 13; then (print 13) prints the string "13"; then (newline) displays a newline; then (test) returns to *its* caller, which is *not* (c 12), but just as originally is still the Scheme shell's read/eval/print loop.

That last point is subtle but vital, and maybe this will make it clearer:

> (begin (c 12) (display "Tim lied!"))
13
>

The continuation of (c 12) includes printing "Tim lied!", but invoking a continuation *abandons* the current continuation in favor of the invoked one. Printing "Tim lied!" wasn't part of c's future, so that nasty slur about Tim never gets executed. But:

> (define liar #f)
> (begin
    (call/cc (lambda (k)
               (set! liar k)
               (c 12)))
    (display "Tim lied!")
    (newline))
13
> (liar 666)
Tim lied!
>

This is why I stick to trivial examples.

> And one final question: In the context of your tutorial, what do Chris'
> latest patches arm us with? Given my new-found expertise in this matter
> I would guess that the guts is there to have at least co-routines,
> as capturing the state of a running Python program, and restarting it
> later is possible. I'm still unclear about continuations WRT "without the
> co-routine's knowledge", so really unsure what is needed here...

Christian is taking his work very seriously here, and as a result is flailing a bit trying to see whether it's possible to do "the 100% right thing". I think he's a lot closer than he thinks he is <0.7 wink>, but in any case he's at worst very close to having full-blown continuations working. Coroutines already work.

> The truly final question:-) Assuming Chris' patches were 100% bug free and
> reliable (I'm sure they are very close :-) what would the next steps be to
> take advantage of it in a "clean" way? ie, assuming Guido blesses them,
> what exactly could I do in Python?

Nothing. What would you like to do?
Sam & I tossed out a number of intriguing possibilities, but all of those build *on* what Christian is doing. You won't get anything useful out of the box unless somebody does the work to implement it. I personally have wanted generators in Python since '91, because they're extremely useful in the useless things that I do . There's a thread-based generator interface (Generator.py) in the source distribution that I occasionally use, but that's so slow I usually recode in Icon (no, I'm not a Scheme fan -- I *admire* it, but I rarely use it). I expect rebuilding that on Christian's work will yield a factor of 10-100 speedup for me (beyond losing the thread/mutex overhead, as Chris just pointed out on c.l.py resumption should be much faster than a Python call, since the frame is already set up and raring to go). Would be nice if the language grew some syntax to make generators pleasant as well as fast, but the (lack of) speed is what's really killing it for me now. BTW, I've never tried to "sell" coroutines -- let alone continuations. Just generators. I expect Sam will do a masterful job of selling those. send-today-don't-delay-couldn't-give-or-receive-a-finer-gift-ly y'rs - tim From tismer@appliedbiometrics.com Wed Jul 7 14:11:44 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Wed, 07 Jul 1999 15:11:44 +0200 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) References: <000001bec82f$bc171500$089e2299@tim> Message-ID: <37835210.F22A7EC9@appliedbiometrics.com> Tim Peters wrote: > > [Mark Hammond] > > ... > > Thanks for the considerable time it must be taking to enlightening us! > > You're welcome, but the holiday weekend is up and so is my time. Thank (all > of) *you* for the considerable time it must take to endure all this ! Just to let you know that I'm still there, thinking, not coding, still hesitating, but maybe I can conclude now and send it off. This discussion, and especially Tim's input was extremely helpful. 
He has spent considerable time reading my twisted examples, writing his own, hitting my chin, kicking my -censored-, and proving to me that the truth I was searching for doesn't exist.

...
> Again, though, you never want to muck with continuations directly! They're
> too wild. You get an expert to use them in the bowels of an implementation
> of something else.

Maybe with one exception: With careful coding, you can use a continuation at the head of a very deep recursion and use it as an early break if the algorithm fails. The effect is the same as bailing out with an exception, despite the fact that no "finally" clauses would be obeyed. It is just an incredibly fast jump out of something if you know what you are doing.

> > With continuations, how is the state captured or created?
>
> There are, of course, many ways to implement these things. Christian is
> building them on top of the explicit frame objects Python already creates,
> and that's a fine way for Python. Guido views it as cloning the call stack,
> and that's accurate too.

Actually, it is both! What I use (and it works fine) are so-called "push-back frames". My continuations are always appearing in some call. In order to make the caller able to be resumed, I create a push-back frame *from* it. That means, my caller frame is duplicated behind his "f_back" pointer. The original frame stays in place but now becomes a continuation frame with the current stack state preserved. All other locals and stuff are moved to the clone in the f_back which is now the real one. This always works fine, since references to the original caller frame are all intact, just the frame's meaning is modified a little.

Well, I will have to write a good paper...

...
> I personally have wanted generators in Python since '91, because they're
> extremely useful in the useless things that I do.
> There's a thread-based generator interface (Generator.py) in the source
> distribution that I occasionally use, but that's so slow I usually recode
> in Icon (no, I'm not a Scheme fan -- I *admire* it, but I rarely use it).
> I expect rebuilding that on Christian's work will yield a factor of 10-100
> speedup for me (beyond losing the thread/mutex overhead, as Chris just
> pointed out on c.l.py resumption should be much faster than a Python call,
> since the frame is already set up and raring to go).

I believe so. Well, I admit that the continuation approach is
slightly too much for the coroutine/generator case, since they
don't have exactly the problem from which continuations
suffer a little: switching between frames which cannot be
reached more than once at a time doesn't need the stack
copying/pushback at all. I'm still staying on the safe side
for now. But since I have all refcounting accurate already,
we can use it to figure out if a frame needs to be copied at all.

> Would be nice if the language grew some syntax to make generators pleasant
> as well as fast, but the (lack of) speed is what's really killing it for me
> now.

How about "amb"? :-)
(see "Teach Yourself Scheme in Fixnum Days", chapter 14, at
http://www.cs.rice.edu/~dorai/t-y-scheme/t-y-scheme-Z-H-15.html#%_chap_14)

About my last problems:
The hard decision is:
- Either I just stop and I'm done already, and loops are funny.
- Or I do the hidden register search, which makes things more
  complicated and also partially voids the pushback trick,
  since then I would manage all stack stuff in one frame.
- Or, and that's what I will do in the end:
  For now, I will really just correct the loops.

Well, that *is* a change to Python again, but no semantic change.
The internal loop counter will no longer be an integer object,
but a mutable integer box. I will just create a one-element
integer array and count with its zero element.
This is correct, since the stack value isn't popped off,
so all alive stack copies share this one element.

As a side effect, I save the Object/Integer conversion, so
I guess it will be faster. *and* this solution does not involve
any other change, since the stack layout is identical to before.

--
Christian Tismer             :^)
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home

From klm@digicool.com Wed Jul 7 16:40:15 1999
From: klm@digicool.com (Ken Manheimer)
Date: Wed, 7 Jul 1999 11:40:15 -0400
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
References: <000001bec82f$bc171500$089e2299@tim>
Message-ID: <00e001bec88f$02f45c80$5a57a4d8@erols.com>

Hokay. I *think* i have this, and i have a question to followup.

First, i think the crucial distinction i needed to make was the fact that
the stuff inside the body of the call/cc is evaluated only when the call/cc
is initially evaluated. What constitutes the "future" of the continuation
is the context immediately following the call/cc expression. Your final
example is where that's most apparent for me:

Tim presented:
> [...]
> The continuation of (c 12) includes printing "Tim lied!", but invoking a
> continuation *abandons* the current continuation in favor of the invoked
> one.  Printing "Tim lied!" wasn't part of c's future, so that nasty slur
> about Tim never gets executed.  But:
>
> > (define liar #f)
> > (begin
>     (call/cc (lambda (k)
>                (set! liar k)
>                (c 12)))
>     (display "Tim lied!")
>     (newline))
> 13
> > (liar 666)
> Tim lied!
> >
>
> This is why I stick to trivial examples.

Though not quite as simple, i think this nailed the distinction for me.
(Too bad that i'm probably mistaken:-)

In any case, one big unknown for me is the expense of continuations.
Just how expensive is squirreling away the future, anyway? (:-) If we're deep in a call stack, seems like there can be a lot of lexical-bindings baggage, plus whatever i-don't-know-how-big-it-is control logic there is/was/will be pending. Does the size of these things (continuations) vary extremely, and is the variation anticipatable? I'm used to some surprises about the depth to which some call or other may go, i don't expect as much uncertainty about my objects - and it seems like continuations directly transform the call depth/complexity into data size/complexity... ?? unfamiliar-territory,how-far-can-i-fall?-ly, Ken klm@digicool.com From tismer@appliedbiometrics.com Wed Jul 7 17:12:22 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Wed, 07 Jul 1999 18:12:22 +0200 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) References: <000001bec82f$bc171500$089e2299@tim> <00e001bec88f$02f45c80$5a57a4d8@erols.com> Message-ID: <37837C66.1E5C33B9@appliedbiometrics.com> Ken Manheimer wrote: > > Hokay. I *think* i have this, and i have a question to followup. ... > In any case, one big unknown for me is the expense of continuations. Just > how expensive is squirreling away the future, anyway? (:-) The future costs at most to create *one* extra frame with a copy of the original frame's local stack. By keeping the references to all the other frames which were intact, the real cost is of course bigger, since we keep the whole frame path from this one up to the topmost frame alive. As soon as we drop the handle, everything winds up and vanishes. I also changed the frame refcounting to guarantee exactly that behavior. (before, unwinding was explicitly done). > If we're deep in a call stack, seems like there can be a lot of > lexical-bindings baggage, plus whatever i-don't-know-how-big-it-is control > logic there is/was/will be pending. Does the size of these things > (continuations) vary extremely, and is the variation anticipatable? 
> I'm used to some surprises about the depth to which some call or other
> may go, i don't expect as much uncertainty about my objects - and it
> seems like continuations directly transform the call depth/complexity
> into data size/complexity... ??

Really, no concern necessary. The state is not saved at all
(except for one frame), it is just not dropped. :-)

Example:
You have some application running, in a nesting level of, say,
four function calls. This makes four frames.
The bottom function now decides to spawn 10 coroutines
in a loop and puts them into an array.

Your array now holds 10 continuations, where each one is just
one frame, which points back to your frame. Now assume you
are running one of the coroutines/generators/whatever, and
this one calls another function "bottom", just to have
some scenario.

Looking from "bottom", there is just a usual frame chain,
now 4+1 frames long.

To shorten this: The whole story is nothing more than a tree,
where exactly one leaf is active at any time, and its view
of the call chain is always linear.
Continuation jumps are possible to every other frame in the tree.
It now only depends on whether or not you keep a reference to the
leaf which you just left. If the jump removes the last remaining
reference to your current frame, then the corresponding chain will
ripple away up to the next branch point. If you held a reference,
as you will do with a coroutine to resume it, this chain stays
as a possible jump target.

for-me-it's-a-little-like-Tarzan-ly y'rs - chris

--
Christian Tismer :^) Applied Biometrics GmbH : Have a break!
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From klm@digicool.com Wed Jul 7 19:00:56 1999 From: klm@digicool.com (Ken Manheimer) Date: Wed, 7 Jul 1999 14:00:56 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) Message-ID: <613145F79272D211914B0020AFF640191D1BF2@gandalf.digicool.com> Christian wrote: > Really, no concern necessary. The state is not saved at all > (despite one frame), it is just not dropped. :-) I have to say, that's not completely reassuring.-) While little or nothing additional is created, stuff that normally would be quite transient remains around. > To shorten this: The whole story is nothing more than a tree, > where exactly one leaf is active at any time, and its view > of the call chain is always linear. That's wonderful - i particularly like that multiple continuations from the same frame only amount to a single retention of the stack for that frame. My concern is not alleviated, however. My concern is the potential, but often-realized hairiness of computation trees. Eg, looped calls to a function amount to nodes with myriad branches - one for each iteration - and each branch can be an arbitrary computation. If there were a continuation retained each time around the loop, worse, somewhere down the call stack within the loop, you could quickly amass a lot of stuff that would otherwise be reaped immediately. So it seems like use of continuations *can* be surprisingly expensive, with the expense commensurate with, and as hard (or easy) to predict as the call dynamics of the call tree. (Boy, i can see how continuations would be useful for backtracking-style chess algorithms and such. 
Of course, discretion about what parts of the computation are retained
at each branch would probably be an important economy for large
computations, while stashing the continuation retains everything...)

(It's quite possible that i'm missing something - i hope i'm not being
thick headed.)

Note that i do not raise this to argue against continuations. In fact,
they seem to me to be at least the right conceptual foundation for these
advanced control structures (i happen to "like" stream abstractions,
which i gather is what generators are). It just seems like it may be a
concern, something about which people experienced with continuations
(eg, the scheme community) would have some lore - accumulated wisdom...

ken
klm@digicool.com

From da@ski.org Wed Jul 7 23:37:09 1999
From: da@ski.org (David Ascher)
Date: Wed, 7 Jul 1999 15:37:09 -0700 (Pacific Daylight Time)
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
In-Reply-To: <1281421591-30373695@hypernet.com>
Message-ID:

[Tim]
> Threads can be very useful purely as a means for algorithm
> structuring, due to independent control flows.

FWIW, I've been following the coroutine/continuation/generator bit with
'academic' interest -- the CS part of my brain likes to read about them.
Prompted by Tim's latest mention of Demo/threads/Generator.py, I looked at
it (again?) and *immediately* grokked it and realized how it'd fit into a
tool I'm writing. Nothing to do with concurrency, I/O, etc -- just
compartmentalization of stateful iterative processes (details too baroque
to go over). More relevantly, that tool would be useful on thread-less
Pythons (well, when it reaches usefulness on threaded Pythons =).

Consider me pro-generator, and still agnostic on the co* things.
--david

From guido@CNRI.Reston.VA.US Thu Jul 8 06:08:44 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 08 Jul 1999 01:08:44 -0400
Subject: [Python-Dev] Generator details
In-Reply-To: Your message of "Tue, 06 Jul 1999 21:52:15 EDT."
             <000301bec81b$56e87660$c99e2299@tim>
References: <000301bec81b$56e87660$c99e2299@tim>
Message-ID: <199907080508.BAA00623@eric.cnri.reston.va.us>

I have a few questions/suggestions about generators.

Tim writes that a suspended generator has exactly one stack frame.
I'm not sure I like that. The Demo/thread/Generator.py version has no
such restriction; anything that has a reference to the generator can
put() the next value. Is the restriction really necessary? I can see a
good use for a recursive generator, e.g. one that generates a tree
traversal:

    def inorder(node):
        if node.left: inorder(node.left)
        suspend node
        if node.right: inorder(node.right)

If I understand Tim, this could not work because there's more than one
stack frame involved. On the other hand, he seems to suggest that
something like this *is* allowed when using "modern" coroutines. Am I
missing something? I thought that tree traversal was one of Tim's first
examples of generators; would I really have to use an explicit stack
to create the traversal?

Next, I want more clarity about the initialization and termination
conditions.

The Demo/thread/Generator.py version is very explicit about
initialization: you instantiate the Generator class, passing it a
function that takes a Generator instance as an argument; the function
executes in a new thread. (I guess I would've used a different
interface now -- perhaps inheriting from the Generator class
overriding a run() method.) For termination, the normal way to stop
seems to be for the generator function to return (rather than calling
g.put()); the consumer then gets an EOFError exception the next time it
calls g.get(). There's also a way for either side to call g.kill() to
stop the generator prematurely.
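
As a rough illustration of the put()/get() protocol just described, here
is a minimal thread-backed sketch in present-day Python. It is not the
code of Generator.py: the class name ThreadGenerator, the semaphore
handshake and the _DONE sentinel are assumptions made for this example,
and kill() is left out.

```python
import queue
import threading

_DONE = object()  # sentinel (assumption): the producer function has returned


class ThreadGenerator:
    """Hypothetical stand-in for a thread-backed generator object."""

    def __init__(self, func):
        self._results = queue.Queue(maxsize=1)  # producer -> consumer values
        self._go = threading.Semaphore(0)       # consumer -> producer "next, please"
        self._finished = False
        # Daemon thread, so an abandoned generator cannot block interpreter exit.
        self._thread = threading.Thread(target=self._run, args=(func,), daemon=True)
        self._thread.start()

    def _run(self, func):
        self._go.acquire()        # do nothing until the first get()
        func(self)                # func calls self.put(value) for each result
        self._results.put(_DONE)  # returning from func ends the sequence

    def put(self, value):
        """Called by the producer: deliver a value, then wait to be asked again."""
        self._results.put(value)
        self._go.acquire()

    def get(self):
        """Called by the consumer: return the next value or raise EOFError."""
        if self._finished:
            raise EOFError
        self._go.release()
        value = self._results.get()
        if value is _DONE:
            self._finished = True
            raise EOFError
        return value


def count_to_three(g):
    for i in (1, 2, 3):
        g.put(i)


def drain(gen):
    """Collect everything a ThreadGenerator produces into a list."""
    out = []
    while True:
        try:
            out.append(gen.get())
        except EOFError:
            return out
```

Each get() costs a thread switch in both directions, which is the
overhead the thread-free designs discussed in this thread try to avoid.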
Let me try to translate that to a threadless implementation. We could
declare a simple generator as follows:

    generator reverse(seq):
        i = len(seq)
        while i > 0:
            i = i-1
            suspend seq[i]

This could be translated by the Python translator into the following,
assuming a system class generator which provides the machinery for
generators:

    class reverse(generator):
        def run(self, seq):
            i = len(seq)
            while i > 0:
                i = i-1
                self.suspend(seq[i])

(Perhaps the identifiers generator, run and suspend would be spelled
with __...__, but that's just clutter for now.)

Now where Tim was writing examples like this:

    for c in reverse("Hello world"):
        print c,
    print

I'd like to guess what the underlying machinery would look like. For
argument's sake, let's assume the for loop recognizes that it's using
a generator (or better, it always needs a generator, and when it's not
a generator it silently implies a sequence-iterating generator).
So the translator could generate the following:

    g = reverse("Hello world") # instantiate class reverse
    while 1:
        try:
            c = g.resume()
        except EOGError: # End Of Generator
            break
        print c,
    print

(Where g should really be a unique temporary local variable.)

In this model, the g.resume() and g.suspend() calls have all the magic.
They should not be accessible to the user. They are written in C so
they can play games with frame objects.

I guess that the *first* call to g.resume(), for a particular
generator instance, should start the generator's run() method; run()
is not activated by the instantiation of the generator. Then run()
runs until the first suspend() call, which causes the return from the
resume() call to happen. Subsequent resume() calls know that there
already is a frame (it's stored in the generator instance) and simply
continue its execution where it was. If the run() method returns from
the frame, the resume() call is made to raise EOGError (blah, bogus
name) which signals the end of the loop.
(The user may write this code explicitly if they want to consume the
generated elements in a different way than through a for loop.)

Looking at this machinery, I think the recursive generator that I
wanted could be made to work, by explicitly declaring a generator
subclass (instead of using the generator keyword, which is just
syntactic sugar) and making calls to methods of self, e.g.:

    class inorder(generator):
        def run(self, node):
            if node.left: self.run(node.left)
            self.suspend(node)
            if node.right: self.run(node.right)

The generator machinery would (ab)use the fact that Python frames
don't necessarily have to be linked in a strict stack order; the
generator gets a pointer to the frame to resume from resume(), and
there's a "bottom" frame which, when hit, raises the EOGError
exception. All currently active frames belonging to the generator
stay alive while another resume() is possible.

All this is possible by the introduction of an explicit generator
object. I think Tim had an implementation in mind where the standard
return pointer in the frame is the only thing necessary; actually, I
think the return pointer is stored in the calling frame, not in the
called frame (Christian? Is this so in your version?). That
shouldn't make a difference, except that it's not clear to me how to
reference the frame (in the explicitly coded version, which has to
exist at least at the bytecode level).

With classic coroutines, I believe that there's no difference between
the first call and subsequent calls to the coroutine. This works in
the Knuth world where coroutines and recursion don't go together; but
at least for generators I would hope that it's possible for multiple
instances of the same generator to be active simultaneously (e.g. I
could be reversing over a list of files and then reverse each of the
lines in the file; this uses separate instances of the reverse()
generator). So we need a way to reference the generator instance
separately from the generator constructor.
The machinery I sketched above solves this.

After Tim has refined or rebutted this, I think I'll be able to
suggest what to do for coroutines.

(I'm still baffled by continuations. The question whether the saved
and restored for loop should find itself in the 1st or 5th iteration
surprises me. Doesn't this cleanly map into some Scheme code that
tells us what to do? Or is it unclear because Scheme does all loops
through recursion? I presume that if you save the continuation of the
1st iteration and restore it in the 5th, you'd find yourself back in
the 1st iteration? But this is another thread.)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From tim_one@email.msn.com Thu Jul 8 06:59:24 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Thu, 8 Jul 1999 01:59:24 -0400
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
In-Reply-To: <613145F79272D211914B0020AFF640191D1BF2@gandalf.digicool.com>
Message-ID: <002001bec907$07934a80$1d9e2299@tim>

[Christian]
> Really, no concern necessary. The state is not saved at all
> (despite one frame), it is just not dropped. :-)

[Ken]
> I have to say, that's not completely reassuring.-)  While little or
> nothing additional is created, stuff that normally would be quite
> transient remains around.

I don't think this is any different from the way keeping a reference to
a class instance alive keeps all the attributes of that object alive,
and all the objects reachable from them too, even though you may never
again actually reference any of them. If you save a continuation, the
implementation *has* to support your doing anything that's *possible*
to do from the saved control-flow state -- and if that's a whole big
giant gob o' stuff, that's on you.

> ...
> So it seems like use of continuations *can* be surprisingly expensive,
> with the expense commensurate with, and as hard (or easy) to predict as
> the call dynamics of the call tree.
> > (Boy, i can see how continuations would be useful for backtracking-style > chess algorithms and such. It comes with the territory, though: backtracking searches are *inherently* expensive and notoriously hard to predict, whether you implement them via continuations, or via clever hand-coded assembler using explicit stacks. The number of nodes at a given depth is typically exponential in the depth, and that kills every approach at shallow levels. Christian posted a reference to an implementation of "amb" in Scheme using continuations, and that's a very cute function: given a list of choices, "amb" guarantees to return (if any such exists) that particular list element that allows the rest of the program to "succeed". So if indeed chess is a forced win for white, amb(["P->KR3", "P->KR4", ...]) as the first line of your chess program will return "the" winning move! Works great in theory . > Of course, discretion about what parts of the computation is retained > at each branch would probably be an important economy for large > computations, while stashing the continuation retains everything...) You bet. But if you're not mucking with exponential call trees-- and, believe me, you're usually not --it's not a big deal. > Note that i do not raise this to argue against continuations. In fact, > they seem to me to be at least the right conceptual foundation for these > advanced control structures (i happen to "like" stream abstractions, > which i gather is what generators are). Generators are an "imperative" flavor of stream, yes, potentially useful whenever you have an abstraction that can deliver a sequence of results (from all the lines in a file, to all the digits of pi). A very common occurrence! Heck, without it, Python's "for x in s:" wouldn't be any fun at all . 
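
To make "a sequence of results" concrete, here is a small sketch in
present-day Python of the digits-of-pi case mentioned above, written as
a generator. The recurrence is a well-known unbounded spigot, usually
attributed to Gibbons; the function name pi_digits and the use of
itertools.islice are choices made for this example, not anything
proposed in this thread.

```python
from itertools import islice


def pi_digits():
    """Yield the decimal digits of pi one at a time, without end."""
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4 * q + r - t < n * t:
            yield n  # the next digit is settled; hand it to the consumer
            q, r, t, k, n, l = (10 * q, 10 * (r - n * t), t, k,
                                (10 * (3 * q + r)) // t - 10 * n, l)
        else:
            # Not enough information yet: fold in one more term of the series.
            q, r, t, k, n, l = (q * k, (2 * q + r) * l, t * l, k + 1,
                                (q * (7 * k + 2) + r * l) // (t * l), l + 2)


# The consumer decides how much of the infinite stream it wants.
first_digits = list(islice(pi_digits(), 5))
```

The producer keeps all of its state in local variables between results,
which is the property that makes generators pleasant for this kind of
job.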
how-do-i-love-thee?-let-me-generate-the-ways-ly y'rs - tim From tim_one@email.msn.com Thu Jul 8 06:59:15 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 8 Jul 1999 01:59:15 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <00e001bec88f$02f45c80$5a57a4d8@erols.com> Message-ID: <001d01bec907$024e6a00$1d9e2299@tim> [Ken Manheimer] > First, i think the crucial distinction i needed to make was the fact that > the stuff inside the body of the call/cc is evaluated only when > the call/cc is initially evaluated. What constitutes the "future" of the > continuation is the context immediately following the call/cc expression. Right! call/cc is short for call-with-current-continuation, and "current" refers to the continuation of call/cc itself. call/cc takes a function as an argument, and passes to it its (call/cc's) *own* continuation. This is maximally clever and maximally confusing at first. Christian has a less clever way of spelling it that's likely to be less confusing too. Note that it has to be a *little* tricky, because the obvious API k = gimme_a_continuation_for_here() doesn't work. The future of "gimme_a_..." includes binding k to the result, so you could never invoke the continuation without stomping on k's binding. k = gimme_a_continuation_for_n_bytecodes_beyond_here(n) could work, but is a bit hard to explain coherently . > ... > In any case, one big unknown for me is the expense of continuations. > Just how expensive is squirreling away the future, anyway? (:-) Christian gave a straight answer, so I'll give you the truth : it doesn't matter provided that you don't pay the price if you don't use it. A more interesting question is how much everyone will pay all the time to support the possibility even if they don't use it. But that question is premature since Chris isn't yet aiming to optimize. Even so, the answer so far appears to be "> 0 but not much". 
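
Python has no first-class continuations, so the shape of call/cc
described above ("takes a function as an argument, and passes to it its
own continuation") can only be approximated. The sketch below, in
present-day Python, is a one-shot, escape-only version built on
exceptions. The names call_ec and find_first_negative are hypothetical,
and this is not Christian's implementation: the continuation k is only
valid while call_ec is still running, and, unlike a real continuation
jump, "finally" clauses do run on the way out.

```python
def call_ec(func):
    """Call func(k); calling k(value) makes call_ec return value at once."""

    class _Escape(Exception):
        """Private to this call, so nested call_ec uses cannot be confused."""

    def k(value):
        raise _Escape(value)

    try:
        return func(k)
    except _Escape as exc:
        return exc.args[0]


def find_first_negative(items):
    """Return the first negative number in a nested list, or None."""

    def search(k):
        def walk(seq):
            for x in seq:
                if isinstance(x, list):
                    walk(x)
                elif x < 0:
                    k(x)  # abandon the whole recursion in one jump
        walk(items)
        return None

    return call_ec(search)
```

This covers only the "early break out of a deep recursion" use; it
cannot re-enter a computation that has already finished, which is what
the (liar 666) example relies on.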
in-bang-for-the-buck-continuations-are-cheap-ly y'rs - tim From tim_one@email.msn.com Thu Jul 8 06:59:18 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 8 Jul 1999 01:59:18 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <37835210.F22A7EC9@appliedbiometrics.com> Message-ID: <001e01bec907$03f9a900$1d9e2299@tim> >> Again, though, you never want to muck with continuations directly! >> They're too wild. You get an expert to use them in the bowels of an >> implementation of something else. [Christian] > Maybe with one exception: With careful coding, you can use > a continuation at the head of a very deep recursion and use > it as an early break if the algorithm fails. The effect is > the same as bailing out with an exception, despite the fact > that no "finally" causes would be obeyed. It is just a > incredibly fast jump out of something if you know what > you are doing. You don't need continuations for this, though; e.g., in Icon I've done this often, by making the head of the deep recursion a co-expression, doing the recursion via straight calls, and then doing a coroutine resumption of &main when I want to break out. At that point I set the coexp to &null, and GC reclaims the stack frames (the coexp is no longer reachable from outside) when it feels like it . This is a particularly simple application of coroutines that could be packaged up in a simpler way for its own sake; so, again, while continuations may be used fruitfully under the covers here, there's still no reason to make a poor end user wrestle with them. > ... Well, I admit that the continuation approach is slightly too much > for the coroutine/generator case, It's good that you admit that, because generators alone could have been implemented with a 20-line patch . BTW, I expect that by far the bulk of your changes *still* amount to what's needed for disentangling the C stack, right? 
The continuation implementation has been subtle, but so far I've gotten the impression that it requires little code beyond that required for stacklessness. > ... > How about "amb"? :-) > (see "teach youself schem in fixnum days, chapter 14 at > http://www.cs.rice.edu/~dorai/t-y-scheme/t-y-scheme-Z-H-15.html#%_chap_14) That's the point at which I think continuations get insane: it's an unreasonably convoluted implementation of a straightforward (via other means) backtracking framework. In a similar vein, I've read 100 times that continuations can be used to implement a notion of (fake) threads, but haven't actually seen an implementation that wasn't depressingly subtle & long-winded despite being just a feeble "proof of concept". These have the *feeling* of e.g. implementing generators on top of real threads: ya, you can do it, but nobody in their right mind is fooled by it . > About my last problems: > The hard decision is: > - Either I just stop and I'm ready already, and loops are funny. OK by me -- forgetting implementation, I still can't claim to know what's the best semantic here. > - Or I do the hidden register search, which makes things more > complicated and also voidens the pushback trick partially, > since then I would manage all stack stuff in one frame. Bleech. > - Or, and that's what I will do finally: > For now, I will really just correct the loops. > > Well, that *is* a change to Python again, but no semantic change. > The internal loop counter will no longer be an integer object, > but a mutable integer box. I will just create a one-element > integer array and count with its zero element. > This is correct, since the stack value isn't popped off, > so all alive stack copies share this one element. Ah, very clever! Yes, that will fly -- the continuations will share a reference to the value rather than the value itself. Perfect! > As a side effect, I save the Object/Integer conversion, so > I guess it will be faster. 
> *and* this solution does not involve any other change, since the
> stack layout is identical to before.

Right, no downside at all. Except that Guido will hate it.

there's-a-disturbance-in-the-force-ly y'rs - tim

From tim_one@email.msn.com Thu Jul 8 07:45:51 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Thu, 8 Jul 1999 02:45:51 -0400
Subject: [Python-Dev] Generator details
In-Reply-To: <199907080508.BAA00623@eric.cnri.reston.va.us>
Message-ID: <002101bec90d$851a3e40$1d9e2299@tim>

I'm out of time for tonight so will just address the first one:

[Guido van Rossum]
> I have a few questions/suggestions about generators.
>
> Tim writes that a suspended generator has exactly one stack frame.
> I'm not sure I like that.  The Demo/thread/Generator.py version has no
> such restriction; anything that has a reference to the generator can
> put() the next value.  Is the restriction really necessary?

It can simplify the implementation, and (not coincidentally) the user's
mental model of how they work.

> I can see a good use for a recursive generator, e.g. one that generates
> a tree traversal:

Definitely; in fact, recursive generators are particularly useful in both
traversals and enumeration of combinatorial objects (permutations, subsets,
and so on).

>     def inorder(node):
>         if node.left: inorder(node.left)
>         suspend node
>         if node.right: inorder(node.right)
>
> If I understand Tim, this could not work because there's more than one
> stack frame involved.

That's right. It would be written like this instead:

    def inorder(node):
        if node.left: suspend inorder(node.left)
        suspend node
        if node.right: suspend inorder(node.right)

Now there may be many instances of the "inorder" generator active (as many
as the tree is deep), but each one returns directly to its caller, and all
but the bottom-most one is "the caller" wrt the generator it invokes. This
implies that "suspend expr" treats expr like a generator in much the same
way that "for x in expr" does (or may ...).
I realize there's some muddiness in that. > On the other hand, he seems to suggest that something like this *is* > allowed when using "modern" coroutines. Yes, and then your original version can be made to work, delivering its results directly to the ultimate consumer instead of (in effect) crawling up the stack each time there's a result. > Am I missing something? Only that I've been pushing generators for almost a decade, and have always pushed the simplest possible version that's sufficient for my needs. However, every time I've made a micron's progress in selling this notion, it's been hijacked by someone else pushing continuations. So I keep pushing the simplest possible version of generators ("resumable function"), in the hopes that someday somebody will remember they don't need to turn Python inside out to get just that much . [much worth discussion skipped for now] > ... > (I'm still baffled by continuations. Actually not, I think! > The question whether the for saved and restored loop should find itself > in the 1st or 5th iteration surprises me. Doesn't this cleanly map into > some Scheme code that tells us what to do? Or is it unclear because > Scheme does all loops through recursion? Bingo: Scheme has no loops. I can model Python's "for" in Scheme in such a way that the continuation sees the 1st iteration, or the 5th, but neither way is obviously right -- or wrong (they both reproduce Python's behavior in the *absence* of continuations!). > I presume that if you save the continuation of the 1st iteration and > restore it in the 5th, you'd find yourself in the back 1st iteration? > But this is another thread.) The short course here is just that any way I've tried to model Python's "for" in *Python* shares the property of the "while 1:" way I posted: the continuation sees the 5th iteration. And some hours I think it probably should , since the bindings of all the locals it sees will be consistent with the 5th iteration's values but not the 1st's. 
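
For comparison, the rewritten inorder traversal earlier in this message
can be sketched with the generator syntax Python later grew, where
"yield from" plays roughly the role given above to "suspend expr" when
expr is itself a generator. The Node class and the sample tree are
hypothetical, added only so the sketch runs on its own.

```python
class Node:
    """Hypothetical binary-tree node used only for this example."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def inorder(node):
    """Yield the nodes of the tree rooted at `node` in left-root-right order."""
    if node.left:
        yield from inorder(node.left)   # re-deliver each result to our caller
    yield node
    if node.right:
        yield from inorder(node.right)


tree = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5), Node(7)))
values = [n.value for n in inorder(tree)]
```

As in the description above, one generator instance is active per tree
level, and each result is handed up the chain of callers one level at a
time.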
could-live-with-it-either-way-but-"correct"-is-debatable-ly y'rs - tim

From tismer@appliedbiometrics.com Thu Jul 8 15:23:11 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Thu, 08 Jul 1999 16:23:11 +0200
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
References: <001e01bec907$03f9a900$1d9e2299@tim>
Message-ID: <3784B44F.C2F76E8A@appliedbiometrics.com>

Tim Peters wrote:
...

> This is a particularly simple application of coroutines that could be
> packaged up in a simpler way for its own sake; so, again, while
> continuations may be used fruitfully under the covers here, there's still no
> reason to make a poor end user wrestle with them.

Well.

def longcomputation(prog, *args, **kw):
    return quickreturn(prog, args, kw)

# prog must be something that takes a return function as its first arg
# quickreturn could be done like so:

def quickreturn(prog, args, kw):
    cont = getpcc() # get parent's continuation
    def jumpback(val=None, cont=cont):
        putcc(cont, val) # jump to continuation
    # apply() wants the positional args as one tuple, jumpback first
    return apply(prog, (jumpback,) + args, kw)

# and if they want to jump out, they call jumpback with
# an optional return value.

Can't help it, it still is continuation-ish.

> > ... Well, I admit that the continuation approach is slightly too much
> > for the coroutine/generator case,
>
> It's good that you admit that, because generators alone could have been
> implemented with a 20-line patch.  BTW, I expect that by far the bulk
> of your changes *still* amount to what's needed for disentangling the C
> stack, right?  The continuation implementation has been subtle, but so far
> I've gotten the impression that it requires little code beyond that required
> for stacklessness.

Right. You will see soon. The only bit which cont's need
more than coro's is to save more than one stack state
for a frame.
So, basically, it is just the frame copy operation.
If I was heading just for coroutines, then I could save that, but then I
need to handle special cases like exception, what to do on return, and so
on. Easier to do that one stuff once right. Then I will never dump code
for an unforeseen coro-effect, since with cont's, I *may* jump in and bail
out wherever I want or don't want.

The special cases come later and will be optimized, and naturally they
will reduce themselves to what's needed.

Example: If I just want to switch to a different coro, I just have to swap
two frames. This leads to a data structure which can hold a frame and
exchange it with another one. The cont-implementation does something like

    fetch my current continuation  # and this does the frame copy stuff
    save into local state variable
    fetch cont from other coro's local state variable
    jump to new cont

Now, if the source and target frames are guaranteed to be different, and
if the source frame has no dormant extra cont attached, then it is safe to
merge the above steps into one operation, without the need to save local
state.

In the end, two coro's will jump to each other by doing nothing more than
this. Exactly that is what Sam's prototype does right now. What he's
missing is treatment of the return case. If a coro returns towards the
place where it was forked off, then we want to have a cont which is able
to handle it properly. That's why exceptions work fine with my stuff: You
can put one exception handler on top of all your coroutines which you
create. It works without special knowledge of coroutines. After I realized
that, I knew the way to go.

>
> > ...
> > How about "amb"? :-)
> > (see "Teach Yourself Scheme in Fixnum Days", chapter 14 at
> > http://www.cs.rice.edu/~dorai/t-y-scheme/t-y-scheme-Z-H-15.html#%_chap_14)
>
> That's the point at which I think continuations get insane: it's an
> unreasonably convoluted implementation of a straightforward (via other
> means) backtracking framework.
> In a similar vein, I've read 100 times that
> continuations can be used to implement a notion of (fake) threads, but
> haven't actually seen an implementation that wasn't depressingly subtle &
> long-winded despite being just a feeble "proof of concept".

Maybe this is a convoluted implementation. But the principle? Return a
value to your caller, but stay able to continue and do this again. Two
continuations, and with the optimizations from above, it will be nothing.

I will show you the code in a few, and you will realize that we are
discussing the empty set. The frames have to be used, and the frames are
already continuations. Only if they can be reached twice, they will have
to be armed for that.

Moving back to my new "more code - less words" principle.

[mutable ints as loop counters]

> Ah, very clever! Yes, that will fly -- the continuations will share a
> reference to the value rather than the value itself. Perfect!

Actually I'm copying some code out of Marc's counterobject which is
nothing more than a mutable integer and hide it in ceval.c, since that
doesn't introduce another module for a thing which isn't needed elsewhere,
after Guido's hint.

Better than to use the array module which doesn't publish its internals
and might not always be linked in.

> Right, no downside at all. Except that Guido will hate it.

I made sure that this is what he hates the least.

off-for-coding-ly y'rs - chris

--
Christian Tismer             :^)
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home

From tim_one@email.msn.com Fri Jul 9 08:47:36 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Fri, 9 Jul 1999 03:47:36 -0400
Subject: [Python-Dev] Generator details
In-Reply-To: <199907080508.BAA00623@eric.cnri.reston.va.us>
Message-ID: <000c01bec9df$4f935c20$c49e2299@tim>

Picking up where we left off, I like Guido's vision of generators fine.
The "one frame" version I've described is in fact what Icon provides, and
what Guido is doing requires using coroutines instead in that language.
Guido's is more flexible, and I'm not opposed to that.

OTOH, I *have* seen many a person (including me!) confused by the
semantics of coroutines in Icon, so I don't know how much of the
additional flexibility converts into additional confusion. One thing I am
sure of: having debated the fine points of continuations recently, I'm
incapable of judging it harshly today <0.5 wink>.

> ...
>     def inorder(node):
>         if node.left: inorder(node.left)
>         suspend node
>         if node.right: inorder(node.right)

The first thing that struck me there is that I'm not sure to whom the
suspend transfers control. In the one-frame flavor of generator, it's
always to the caller of the function that (lexically) contains the
"suspend". Is it possible to keep this all straight if the "suspend" above
is changed to e.g.

    pass_it_back(node)

where

    def pass_it_back(x):
        suspend x

? I'm vaguely picturing some kind of additional frame state, a pointer to
the topmost frame that's "expecting" to receive a suspend. (I see you
resolve this in a different way later, though.)

> ...
> I thought that tree traversal was one of Tim's first examples of
> generators; would I really have to use an explicit stack to create
> the traversal?
As before, still no, but the one-frame version does require an unbroken
*implicit* chain back to the intended receiver, with an explicit "suspend"
at every step back to that. Let me rewrite the one-frame version in a way
that assumes less semantics from "suspend", instead building on the
already-assumed new smarts in "for":

    def inorder(node):
        if node:
            for child in inorder(node.left):
                suspend child
            suspend node
            for child in inorder(node.right):
                suspend child

I hope this makes it clearer that the one-frame version spawns two *new*
generators for every non-None node, and in purely stack-like fashion (both
"recursing down" and "suspending up").

> Next, I want more clarity about the initialization and termination
> conditions.

Good idea.

> The Demo/thread/Generator.py version is very explicit about
> initialization: you instantiate the Generator class, passing it a
> function that takes a Generator instance as an argument; the function
> executes in a new thread. (I guess I would've used a different
> interface now -- perhaps inheriting from the Generator class
> overriding a run() method.)

I would change my coroutine implementation similarly.

> For termination, the normal way to stop seems to be for the generator
> function to return (rather than calling g.put()), the consumer then gets
> an EOFError exception the next time it calls g.get(). There's also a
> way for either side to call g.kill() to stop the generator prematurely.

A perfectly serviceable interface, but "feels clumsy" in comparison to
normal for loops and e.g. reading lines from a file, where *visible*
exceptions aren't raised at the end. I expect most sequences to terminate
before I do, so (visible) try/except isn't the best UI here.

> Let me try to translate that to a threadless implementation.
> We could declare a simple generator as follows:
>
>     generator reverse(seq):
>         i = len(seq)
>         while i > 0:
>             i = i-1
>             suspend seq[i]
>
> This could be translated by the Python translator into the following,
> assuming a system class generator which provides the machinery for
> generators:
>
>     class reverse(generator):
>         def run(self, seq):
>             i = len(seq)
>             while i > 0:
>                 i = i-1
>                 self.suspend(seq[i])
>
> (Perhaps the identifiers generator, run and suspend would be spelled
> with __...__, but that's just clutter for now.)
>
> Now where Tim was writing examples like this:
>
>     for c in reverse("Hello world"):
>         print c,
>     print
>
> I'd like to guess what the underlying machinery would look like. For
> argument's sake, let's assume the for loop recognizes that it's using
> a generator (or better, it always needs a generator, and when it's not
> a generator it silently implies a sequence-iterating generator).

In the end I expect these concepts could be unified, e.g. via a new class
__iterate__ method. Then

    for i in 42:

could fail simply because ints don't have a value in that slot, while
lists and tuples could inherit from SequenceIterator, pushing the
generation of the index range into the type instead of explicitly
constructed by the eval loop.

> So the translator could generate the following:
>
>     g = reverse("Hello world") # instantiate class reverse
>     while 1:
>         try:
>             c = g.resume()
>         except EOGError: # End Of Generator
>             break
>         print c,
>     print
>
> (Where g should really be a unique temporary local variable.)
>
> In this model, the g.resume() and g.suspend() calls have all the magic.
> They should not be accessible to the user.

This seems at odds with the later:

> (The user may write this code explicitly if they want to consume the
> generated elements in a different way than through a for loop.)

Whether it's at odds or not, I like the latter better. When the machinery
is clean & well-designed, expose it!
Else in 2002 we'll be subjected to a generatorhacks module.

> They are written in C so they can play games with frame objects.
>
> I guess that the *first* call to g.resume(), for a particular
> generator instance, should start the generator's run() method; run()
> is not activated by the instantiation of the generator.

This can work either way. If it's more convenient to begin run() as part
of instantiation, the code for run() can start with an equivalent of

    if self.first_time:
        self.first_time = 0
        return

where self.first_time is set true by the constructor. Then "the frame"
will exist from the start. The first resume() will skip over that block
and launch into the code, while subsequent resume()s will never even see
this block: almost free.

> Then run() runs until the first suspend() call, which causes the return
> from the resume() call to happen. Subsequent resume() calls know that
> there already is a frame (it's stored in the generator instance) and simply
> continue its execution where it was. If the run() method returns from
> the frame, the resume() call is made to raise EOGError (blah, bogus
> name) which signals the end of the loop. (The user may write this
> code explicitly if they want to consume the generated elements in a
> different way than through a for loop.)

Yes, that parenthetical comment bears repeating.

> Looking at this machinery, I think the recursive generator that I
> wanted could be made to work, by explicitly declaring a generator
> subclass (instead of using the generator keyword, which is just
> syntactic sugar) and making calls to methods of self, e.g.:
>
>     class inorder(generator):
>         def run(self, node):
>             if node.left: self.run(node.left)
>             self.suspend(node)
>             if node.right: self.run(node.right)

Going way back to the top, this implies the

    def pass_it_back(x):
        suspend x

indirection couldn't work -- unless pass_it_back were also a method of
inorder. Not complaining, just trying to understand.
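
For comparison, here is a minimal sketch of the pass_it_back question in
present-day Python. That is an assumption of the sketch, not something
from this thread: `yield` stands in for the proposed `suspend`, and `Node`
is a hypothetical class made up for the illustration. Under one-frame
semantics a helper that contains `yield` is a generator in its own right,
so the indirection only delivers values if every caller relays them
explicitly, here with `yield from`.

```python
class Node:
    """Hypothetical binary-tree node, used only for this illustration."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def pass_it_back(x):
    # Containing a yield makes this helper a generator itself: the value
    # goes to whoever iterates pass_it_back(), not to inorder()'s consumer.
    yield x


def inorder(node):
    if node is not None:
        # Each level relays the values produced one level further down.
        yield from inorder(node.left)
        yield from pass_it_back(node.value)
        yield from inorder(node.right)


tree = Node(2, Node(1), Node(3))
result = list(inorder(tree))
```

Calling pass_it_back(node.value) without relaying its result would merely
create a generator object and drop it, which matches the worry above that
the plain indirection "couldn't work" in the one-frame flavor.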
Once you generalize, it's hard to know when to stop.

> The generator machinery would (ab)use the fact that Python frames
> don't necessarily have to be linked in a strict stack order;

If you call *this* abuse, what words remain to vilify what Christian is
doing?

> the generator gets a pointer to the frame to resume from resume(),

Ah! That addresses my first question. Are you implicitly assuming a
"stackless" eval loop here? Else resuming the receiving frame would appear
to push another C stack frame for each value delivered, ever deeper. The
"one frame" version of generators doesn't have this headache (since a
suspend *returns* to its immediate caller there -- it doesn't *resume* its
caller).

> and there's a "bottom" frame which, when hit, raises the EOGError
> exception.

Although described at the end, this is something set up at the start,
right? To trap a plain return from the topmost invocation of the
generator.

> All currently active frames belonging to the generator stay alive
> while another resume() is possible.

And those form a linear chain from the most-recent suspend() back to the
primal resume(). Which appears to address an earlier issue not brought up
in this message: this provides a well-defined & intuitively clear path for
exceptions to follow, yes? I'm not sure about coroutines, but there's
something wrong with a generator implementation if the guy who kicks it
off can't see errors raised by the generator's execution! This doesn't
appear to be a problem here.

> All this is possible by the introduction of an explicit generator
> object. I think Tim had an implementation in mind where the standard
> return pointer in the frame is the only thing necessary; actually, I
> think the return pointer is stored in the calling frame, not in the
> called frame

What I've had in mind is what Majewski implemented 5 years ago, but lost
interest in because it couldn't be extended to those blasted
continuations.
The called frame points back to the calling frame via f->f_back (of
course), and I think that's all the return info the one-frame version
needs. I expect I'm missing your meaning here.

> (Christian? Is this so in your version?). That shouldn't make a
> difference, except that it's not clear to me how to reference the frame
> (in the explicitly coded version, which has to exist at least at the
> bytecode level).

"The" frame being which frame specifically, and referenced from where?
Regardless, it must be solvable, since if Christian can (& he thinks he
can, & I believe him) expose a call/cc variant, the generator class could
be coded entirely in Python.

> With classic coroutines, I believe that there's no difference between
> the first call and subsequent calls to the coroutine. This works in
> the Knuth world where coroutines and recursion don't go together;

That's also a world where co-transfers are implemented via funky
self-modifying assembler, custom-crafted for the exact number of
coroutines you expect to be using -- I don't recommend Knuth as a guide to
*implementing* these beasts <0.3 wink>.

That said, yes, provided the coroutine objects all exist, there's nothing
special about the first call. About "provided that": if your coroutine
objects A and B have "run" methods, you dare not invoke A.run() before B
has been constructed (else the first instance of B.transfer() in A chokes
-- there's no object to transfer *to*). So, in practice, I think
instantiation is still divorced from initiation. One possibility is to
hide all that in a

    cobegin(list_of_coroutine_classes_to_instantiate_and_run)

function. But then naming the instances is a puzzle.

> but at least for generators I would hope that it's possible for multiple
> instances of the same generator to be active simultaneously (e.g. I
> could be reversing over a list of files and then reverse each of the
> lines in the file; this uses separate instances of the reverse()
> generator).
Since that's the trick the "one frame" generators *rely* on for recursion,
it's surely not a problem in your stronger version.

Note that my old coroutine implementation did allow for multiple instances
of a coroutine, although the examples posted with it didn't illustrate
that.

The weakness of coroutines in practice is (in my experience) the
requirement that you *name* the target of a transfer. This is brittle;
e.g., in the pipeline example I posted, each stage had to know the names
of the stages on either side of it. By adopting a

    target.transfer(optional_value)

primitive it's possible to *pass in* the target object as an argument to
the coroutine doing the transfer. Then "the names" are all in the setup,
and don't pollute the bodies of the coroutines (e.g., each coroutine in
the pipeline example could have arguments named "stdin" and "stdout"). I
haven't seen a system that *does* this, but it's so obviously the right
thing to do it's not worth saying any more about.

> So we need a way to reference the generator instance separately from
> the generator constructor. The machinery I sketched above solves this.
>
> After Tim has refined or rebutted this, I think I'll be able to
> suggest what to do for coroutines.

Please do. Whether or not it's futile, it's fun.

hmm-haven't-had-enough-of-that-lately!-ly y'rs - tim

From tismer@appliedbiometrics.com Fri Jul 9 13:22:05 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Fri, 09 Jul 1999 14:22:05 +0200
Subject: [Python-Dev] Generator details
References: <000301bec81b$56e87660$c99e2299@tim> <199907080508.BAA00623@eric.cnri.reston.va.us>
Message-ID: <3785E96D.A1641530@appliedbiometrics.com>

Guido van Rossum wrote:

[snipped all what's addressed to Tim]

> All this is possible by the introduction of an explicit generator
> object.
> I think Tim had an implementation in mind where the standard
> return pointer in the frame is the only thing necessary; actually, I
> think the return pointer is stored in the calling frame, not in the
> called frame (Christian? Is this so in your version?). That
> shouldn't make a difference, except that it's not clear to me how to
> reference the frame (in the explicitly coded version, which has to
> exist at least at the bytecode level).

No, it isn't. It is still as it was. I didn't change the frame machinery
at all. The callee finds his caller in its f_back field.

[...]

> (I'm still baffled by continuations. The question whether the for
> saved and restored loop should find itself in the 1st or 5th iteration
> surprises me. Doesn't this cleanly map into some Scheme code that
> tells us what to do? Or is it unclear because Scheme does all loops
> through recursion? I presume that if you save the continuation of the
> 1st iteration and restore it in the 5th, you'd find yourself in the
> back 1st iteration? But this is another thread.)

In Scheme, Python's for-loop would be a tail-recursive expression, it
would especially be its own extra lambda. Doesn't fit. Tim is right when
he says that Python isn't Scheme.

Yesterday I built your suggested change to for-loops, and it works fine.
By turning the loop counter into a mutable object, every reference to it
shares the current value, and it behaves like Tim pointed out it should.

About Tim's reply to this post:

[Gui-do]
> The generator machinery would (ab)use the fact that Python frames
> don't necessarily have to be linked in a strict stack order;

[Tim-bot]
If you call *this* abuse, what words remain to vilify what Christian is
doing?

As a matter of fact, I have been thinking quite long about this *abuse*.
At the moment I do not do this. The frame stack becomes a frame tree, and
you can jump like Tarzan from leaf to leaf, but I never change the order.
Perhaps this can make sense too, but this is currently where *my* brain
explodes. Right now I'm happy that there is *always* a view of the top
level, and an exception always knows where to wind up.

From that point of view, I'm even more conservative than Guido (above) and
Sam (replacing whole frame chains). In a sense, since I don't change the
frame chain but only change the current frame, this is like a functional
way to use weak references.

The continuation approach is to build new paths in a tree, and lose those
which are unreachable. Modifying the tree is not part of my model at the
moment. This may be interesting to study after we know everything about
this tree and we need even more freedom.

ciao - chris

--
Christian Tismer             :^)
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home

From guido@CNRI.Reston.VA.US Sat Jul 10 15:28:13 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Sat, 10 Jul 1999 10:28:13 -0400
Subject: [Python-Dev] Generator details
In-Reply-To: Your message of "Fri, 09 Jul 1999 14:22:05 +0200." <3785E96D.A1641530@appliedbiometrics.com>
References: <000301bec81b$56e87660$c99e2299@tim> <199907080508.BAA00623@eric.cnri.reston.va.us> <3785E96D.A1641530@appliedbiometrics.com>
Message-ID: <199907101428.KAA04364@eric.cnri.reston.va.us>

[Christian]
> The frame stack becomes a frame tree, and you can jump like Tarzan
> from leaf to leaf [...].

Christian, I want to kiss you! (OK, just a hug. We're both Europeans. :-)

This one remark suddenly made me understand much better what continuations
do -- it was the one missing piece of insight I still needed after Tim's
explanation and skimming the Scheme tutorial a bit.
I'll have to think more about the consequences but this finally made me
understand better how to interpret the mysterious words ``the continuation
represents "the rest of the program"''.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido@CNRI.Reston.VA.US Sat Jul 10 16:48:43 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Sat, 10 Jul 1999 11:48:43 -0400
Subject: [Python-Dev] Generator details
In-Reply-To: Your message of "Fri, 09 Jul 1999 03:47:36 EDT." <000c01bec9df$4f935c20$c49e2299@tim>
References: <000c01bec9df$4f935c20$c49e2299@tim>
Message-ID: <199907101548.LAA04399@eric.cnri.reston.va.us>

I've been thinking some more about Tim's single-frame generators, and I
think I understand better how to implement them now.

(And yes, it was a mistake of me to write that the suspend() and resume()
methods shouldn't be accessible to the user! Also thanks for the
clarification of how to write a recursive generator.)

Let's say we have a generator function like this:

    generator reverse(l):
        i = len(l)
        while i > 0:
            i = i-1
            suspend l[i]

and a for loop like this:

    for i in reverse(range(10)):
        print i

What is the expanded version of the for loop? I think this will work:

    __value, __frame = call_generator(reverse, range(10))
    while __frame:
        i = __value
        # start of original for loop body
        print i
        # end of original for loop body
        __value, __frame = resume_frame(__frame)

(Note that when the original for loop body contains 'continue', this
should jump to the resume_frame() call. This is just pseudo code.)

Now we must define two new built-in functions: call_generator() and
resume_frame().

- call_generator() is like apply() but it returns a pair (result, frame)
  where result is the function result and frame is the frame, *if* the
  function returned via suspend. If it returned via return,
  call_generator() returns None for the frame.

- resume_frame() does exactly what its name suggests. It has the same
  return convention as call_generator().
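
The return convention just described can be tried out with a small sketch
in present-day Python. Everything here is an assumption layered on the
pseudo code, not part of the proposal: `yield` stands in for `suspend`, a
generator object stands in for the resumable frame, and the helper name
`expanded_for_loop` is made up for the illustration.

```python
def resume_frame(frame):
    """Resume a suspended 'frame' and return (value, frame).

    The frame slot is None once the function has finished via return,
    mirroring the (result, None) convention described in the message.
    """
    try:
        return next(frame), frame
    except StopIteration as stop:
        # stop.value carries the final (non-suspend) return value.
        return stop.value, None


def call_generator(func, *args):
    """Like apply(), but using the (value, frame) return convention."""
    return resume_frame(func(*args))


def reverse(l):
    i = len(l)
    while i > 0:
        i = i - 1
        yield l[i]          # plays the role of "suspend l[i]"


def expanded_for_loop(seq):
    """Hand-expanded form of: for i in reverse(seq): collect i."""
    collected = []
    value, frame = call_generator(reverse, seq)
    while frame is not None:
        i = value
        collected.append(i)             # original for loop body
        value, frame = resume_frame(frame)
    return collected
```

The loop tests `frame is not None` rather than plain truthiness only to
make the end-of-generator signal explicit in the sketch.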
Note that the for loop throws away the final (non-suspend) return value of
the generator -- this just signals the end of the loop.

How to translate the generator itself? I've come up with two versions.

First version: add a new bytecode SUSPEND, which does the same as RETURN
but also marks the frame as resumable. call_generator() then calls the
function using a primitive which allows it to specify the frame (e.g. a
variant of eval_code2 taking a frame argument). When the call returns, it
looks at the resumable bit of the frame to decide whether to return
(value, frame) or (value, None). resume_frame() simply marks the frame as
non-resumable and continues its execution; upon return it does the same
thing as call_generator().

Alternative translation version: introduce a new builtin get_frame() which
returns the current frame. The statement "suspend x" gets translated to
"return x, get_frame()" and the statement "return x" (including the
default "return None" at the end of the function) gets translated to
"return x, None". So our example turns into:

    def reverse(l):
        i = len(l)
        while i > 0:
            i = i-1
            return l[i], get_frame()
        return None, None

This of course means that call_generator() can be exactly the same as
apply(), and in fact we better get rid of it, so the for loop translation
becomes:

    __value, __frame = reverse(range(10))
    while __frame:
        ...same as before...

In a real implementation, get_frame() could be a new bytecode; but it
doesn't have to be (making for easier experimentation).

(get_frame() makes a fine builtin; there's nothing inherently dangerous to
it, in fact people get it all the time, currently using horrible hacks!).

I'm not sure which is better; the version without call_generator() allows
you to create your own generator without using the 'generator' and
'suspend' keywords, calling get_frame() explicitly.

Loose end: what to do when there's a try/finally around a suspend? E.g.
    generator foo(l):
        try:
            for i in l:
                suspend i+1
        finally:
            print "Done"

The second translation variant would cause "Done" to be printed on each
suspend *and* on the final return. This is confusing (and in fact I think
resuming the frame would be a problem since the return breaks down the
try-finally blocks).

So I guess the SUSPEND bytecode is a better implementation -- it can
suspend the frame without going through try-finally clauses.

Then of course we create another loose end: what if the for loop contains
a break? Then the frame will never be resumed and its finally clause will
never be executed! This sounds bad. Perhaps the destructor of the frame
should look at the 'resumable' bit and if set, resume the frame with a
system exception, "Killed", indicating an abortion? (This is like the
kill() call in Generator.py.) We can increase the likelihood that the
frame's destructor is called at the expected time (right when the for loop
terminates), by deleting __frame at the end of the loop. If the resumed
frame raises another exception, we ignore it. Its return value is ignored.
If it suspends itself again, we resume it with the "Killed" exception
again until it dies (thoughts of the Black Knight come to mind).

I am beginning to like this idea. (Not that I have time for an
implementation... But it could be done without Christian's patches.)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From tim_one@email.msn.com Sat Jul 10 22:09:48 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Sat, 10 Jul 1999 17:09:48 -0400
Subject: [Python-Dev] Generator details
In-Reply-To: <199907101428.KAA04364@eric.cnri.reston.va.us>
Message-ID: <000501becb18$8ae0e240$e69e2299@tim>

[Christian]
> The frame stack becomes a frame tree, and you can jump like Tarzan
> from leaf to leaf [...].

[Guido]
> Christian, I want to kiss you! (OK, just a hug. We're both
> Europeans.
> :-)

Not in America, pal -- the only male hugging allowed here is in the two
seconds after your team wins the Superbowl -- and even then only so long
as you haven't yet taken off your helmets.

> This one remark suddenly made me understand much better what
> continuations do -- it was the one missing piece of insight I still
> needed after Tim's explanation and skimming the Scheme tutorial a bit.

It's an insight I was missing too -- continuations are often *invoked* in
general directed-graph fashion, and before Christian said that I hadn't
realized the *implementation* never sees anything worse than a tree.

So next time I see Christian, I'll punch him hard in the stomach, and
mumble "good job" loudly enough so that he hears it, but indistinctly
enough so I can plausibly deny it in case any other guy overhears us.
*That's* the American Way.

first-it's-hugging-then-it's-song-contests-ly y'rs - tim

From MHammond@skippinet.com.au Sun Jul 11 01:52:22 1999
From: MHammond@skippinet.com.au (Mark Hammond)
Date: Sun, 11 Jul 1999 10:52:22 +1000
Subject: [Python-Dev] Win32 Extensions Registered Users
Message-ID: <003c01becb37$a369c3d0$0801a8c0@bobcat>

Hi all,

As you may or may not have noticed, I have recently begun offering a
"Registered Users" program where people who use my Windows extensions can
pay $50.00 per 2 years, and get a range of benefits. The primary benefits
are:

* Early access to binary versions.
* Registered Users only mailing list (very low volume to date)
* Better support from me.

The last benefit really isn't to this list - anyone here will obviously
get (and hopefully does get) a pretty good response should they need to
mail me.

The early access to binary versions may be of interest. As everyone on
this list spends considerable and worthwhile effort helping Python, I
would like to offer everyone here a free registration.

If you would like to take advantage, just send me a quick email.
I will email you the "top secret" location of the Registered Users page
(where the very slick and very new Pythonwin can be found).

Also, feel free to join the registered users mailing list at
http://mailman.pythonpros.com/mailman/listinfo/win32-reg-users. This is
low volume, and once volume does increase an announce list will be
created, so you can join without fear of more swamping of your mailbox.

And just FYI, I am very pleased with the registration process to date. In
about 3 weeks I have around 20 paid users! If I can keep that rate up I
will be very impressed (although that already looks highly unlikely :-)
Even still, I consider it going well.

Mark.

From tim_one@email.msn.com Sun Jul 11 20:49:57 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Sun, 11 Jul 1999 15:49:57 -0400
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
In-Reply-To:
Message-ID: <000201becbd6$8df90660$569e2299@tim>

[David Ascher]
> FWIW, I've been following the coroutine/continuation/generator bit with
> 'academic' interest -- the CS part of my brain likes to read about them.
> Prompted by Tim's latest mention of Demo/threads/Generator.py, I looked at
> it (again?) and *immediately* grokked it and realized how it'd fit into a
> tool I'm writing. Nothing to do with concurrency, I/O, etc -- just
> compartmentalization of stateful iterative processes (details too baroque
> to go over).

"stateful iterative process" is a helpful characterization of where these
guys can be useful! State captured in variables is the obvious one, but
simply "where you are" in a mass of nested loops and conditionals is also
"state" -- and a kind of state especially clumsy to encode as data state
instead (ever rewrite a hairy recursive routine to use iteration with an
explicit stack? it's a transformation that can be mechanized, but the
result is usually ugly & often hard to understand).
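
To make the "where you are is also state" point concrete, here is a small
sketch in present-day Python (an assumption of the sketch; the thread's
`suspend` is spelled `yield` there). The function names are made up for
the illustration. The first version leaves the position inside the nested
loops to the suspended frames; the second encodes the same position as
explicit (sequence, index) pairs on a stack.

```python
def flatten(items):
    """Recursive generator: each suspended frame remembers its own place."""
    for item in items:
        if isinstance(item, list):
            for leaf in flatten(item):
                yield leaf      # relay one step up, one-frame style
        else:
            yield item


def flatten_with_stack(items):
    """The same traversal with 'where am I' encoded as data state."""
    result = []
    stack = [(items, 0)]        # pairs of (sequence, next index to visit)
    while stack:
        seq, index = stack.pop()
        if index == len(seq):
            continue            # this level is finished
        stack.append((seq, index + 1))   # remember where to resume in seq
        item = seq[index]
        if isinstance(item, list):
            stack.append((item, 0))      # descend into the nested list
        else:
            result.append(item)
    return result


nested = [1, [2, [3, 4], 5], [], 6]
```

Both produce the leaves in the same left-to-right order; the second needs
the bookkeeping spelled out by hand, which is the clumsiness the paragraph
above is pointing at.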
Once it sinks in that it's *possible* to implement a stateful iterative
process in this other way, I think you'll find examples popping up all
over the place.

> More relevantly, that tool would be useful on thread-less
> Python's (well, when it reaches usefulness on threaded Pythons =).

As Guido pointed out, the API provided by Generator.py is less restrictive
than any that can be built with the "one frame" flavor of generator
("resumable function"). Were you able to make enough sense of the long
discussion that ensued to guess whether the particular use you had in mind
required Generator.py's full power? If you couldn't tell, post the baroque
details & I'll tell you.

not-putting-too-fine-a-point-on-possible-vs-natural-ly y'rs - tim

From da@ski.org Sun Jul 11 21:14:04 1999
From: da@ski.org (David Ascher)
Date: Sun, 11 Jul 1999 13:14:04 -0700 (Pacific Daylight Time)
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
In-Reply-To: <000201becbd6$8df90660$569e2299@tim>
Message-ID:

On Sun, 11 Jul 1999, Tim Peters wrote:

> As Guido pointed out, the API provided by Generator.py is less restrictive
> than any that can be built with the "one frame" flavor of generator
> ("resumable function"). Were you able to make enough sense of the long
> discussion that ensued to guess whether the particular use you had in mind
> required Generator.py's full power? If you couldn't tell, post the baroque
> details & I'll tell you.

I'm pretty sure the use I mentioned would fit in even the simplest version
of a generator. As to how much sense I made of the discussion, let's just
say I'm glad there's no quiz at the end. I did shudder at the mention of
unmentionables (male public displays of affection -- yeaach!), yodel at
the mention of Lord Greystoke swinging among stack branches and chuckled
at the vision of him being thrown back in a traceback (ouch! ouch! ouch!,
"most painful last"...).
--david From tim_one@email.msn.com Mon Jul 12 03:26:44 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sun, 11 Jul 1999 22:26:44 -0400 Subject: [Python-Dev] Generator details In-Reply-To: <199907101548.LAA04399@eric.cnri.reston.va.us> Message-ID: <000001becc0d$fcb64f40$229e2299@tim> [Guido, sketches 112 ways to implement one-frame generators today ] I'm glad you're having fun too! I won't reply in detail here; it's enough for now to happily agree that adding a one-frame generator isn't much of a stretch for the current implementation of the PVM. > Loose end: what to do when there's a try/finally around a suspend? > E.g. > > generator foo(l): > try: > for i in l: > suspend i+1 > finally: > print "Done" > > The second translation variant would cause "Done" to be printed on > each suspend *and* on the final return. This is confusing (and in > fact I think resuming the frame would be a problem since the return > breaks down the try-finally blocks). There are several things to be said about this: + A suspend really can't ever go thru today's normal "return" path, because (among other things) that wipes out the frame's value stack! while (!EMPTY()) { v = POP(); Py_XDECREF(v); } A SUSPEND opcode would let it do what it needs to do without mixing that into the current return path. So my answer to: > I'm not sure which is better; the version without call_generator() > allows you to create your own generator without using the 'generator' > and 'suspend' keywords, calling get_frame() explicitly. is "both" : get_frame() is beautifully clean, but it still needs something like SUSPEND to keep everything straight. Maybe this just amounts to setting "why" to a new WHY_SUSPEND and sorting it all out after the eval loop; OTOH, that code is pretty snaky already. + I *expect* the example code to print "Done" len(l)+1 times! 
The generator mechanics are the same as the current for/__getitem__ protocol in this respect: if you have N items to enumerate, the enumeration routine will get called N+1 times, and that's life. That is, the fact is that the generator "gets to" execute code N+1 times, and the only reason your original example seems surprising at first is that it doesn't happen to do anything (except exit the "try" block) on the last of those times. Change it to

    generator foo(l):
        try:
            for i in l:
                suspend i+1
            cleanup()  # new line
        finally:
            print "Done"

and then you'd be surprised *not* to see "Done" printed len(l)+1 times. So I think the easiest thing is also the right thing in this case.

OTOH, the notion that the "finally" clause should get triggered at all the first len(l) times is debatable. If I picture it as a "resumable function" then, sure, it should; but if I picture the caller as bouncing control back & forth with the generator, coroutine style, then suspension is just a pause in the generator's execution. The latter is probably the more natural way to picture it, eh?

Which feeds into:

> Then of course we create another loose end: what if the for loop
> contains a break? Then the frame will never be resumed and its
> finally clause will never be executed! This sounds bad. Perhaps the
> destructor of the frame should look at the 'resumable' bit and if set,
> resume the frame with a system exception, "Killed", indicating an
> abortion? (This is like the kill() call in Generator.py.) We can
> increase the likelihood that the frame's destructor is called at the
> expected time (right when the for loop terminates), by deleting
> __frame at the end of the loop. If the resumed frame raises another
> exception, we ignore it. Its return value is ignored. If it suspends
> itself again, we resume it with the "Killed" exception again until it
> dies (thoughts of the Black Knight come to mind).
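For comparison, here is how the two questions above play out with the generators that later Python versions actually provide. This is only an analogy to the proposal in this thread: "yield" plays the role of the hypothetical "suspend", and the built-in GeneratorExit exception raised by close() plays the role of "Killed". The helper names run_to_exhaustion and break_then_close are invented for the sketch. One difference from the proposal: a modern generator that suspends again in response to close() is not killed repeatedly; close() raises RuntimeError instead.

```python
def foo(items, log):
    """Modern spelling of the example; 'yield' stands in for 'suspend'."""
    try:
        for i in items:
            yield i + 1
    finally:
        log.append("Done")


def run_to_exhaustion(items):
    """Drive the generator by hand and count how often its body is entered."""
    log, resumes = [], 0
    gen = foo(items, log)
    while True:
        resumes += 1
        try:
            next(gen)
        except StopIteration:
            return resumes, log


def break_then_close(items):
    """Abandon the generator after one value, then shut it down explicitly."""
    log = []
    gen = foo(items, log)
    for value in gen:
        break                  # generator is left paused inside the try block
    before = list(log)         # snapshot: has the finally clause run yet?
    gen.close()                # raises GeneratorExit at the paused yield
    return before, log
```

With three items, run_to_exhaustion returns (4, ['Done']): the body is entered N+1 times, but the finally clause fires once, which is the coroutine-style picture. break_then_close returns ([], ['Done']): nothing is cleaned up at the break while a reference is still held, and the explicit shutdown is what triggers the finally clause.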
This may leave another loose end: what if the for loop doesn't contain a break, but dies because of an exception in some line unrelated to the generator? Or someone has used an explicit get_frame() in any case and that keeps a ref to the frame alive? If the semantic is that the generator must be shut down no matter what, then the invoker needs code more like

    value, frame = generator(args)
    try:
        while frame:
            etc
            value, frame = resume_frame(frame)
    finally:
        if frame:
            shut_frame_down(frame)

OTOH, the possibility that someone *can* do an explicit get_frame suggests that "for" shouldn't assume it's the master of the universe. Perhaps the user's intent was to generate the first 100 values in a for loop, then break out, analyze the results, and decide whether to resume it again by hand (I've done stuff like that ...). So there's also a case to be made for saying that a "finally" clause wrapping a generator body will only be executed if the generator body raises an exception or the generator itself decides it's done; i.e. iff it triggers while the generator is actively running.

Just complicating things there.

It actually sounds pretty good to raise a Killed exception in the frame destructor! The destructor has to do *something* to trigger the code that drains the frame's value stack anyway, "finally" blocks or not (frame_dealloc doesn't do that now, since there's currently no way to get out of eval_code2 with a non-empty stack).

> ...
> I am beginning to like this idea. (Not that I have time for an
> implementation... But it could be done without Christian's patches.)

Or with them too. If stuff is implemented via continuations, the same concerns about try/finally blocks pop up everywhere a continuation is invoked: you (probably) leave the current frame, and may or may not ever come back. So if there's a "finally" clause pending and you don't ever come back, it's a surprise there too.
So while you thought you were dealing with dirt-simple one-frame generators, you were *really* thinking about how to make general continuations play nice . solve-one-mystery-and-you-solve-'em-all-ly y'rs - tim From guido@CNRI.Reston.VA.US Mon Jul 12 04:01:04 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Sun, 11 Jul 1999 23:01:04 -0400 Subject: [Python-Dev] Generator details In-Reply-To: Your message of "Sun, 11 Jul 1999 22:26:44 EDT." <000001becc0d$fcb64f40$229e2299@tim> References: <000001becc0d$fcb64f40$229e2299@tim> Message-ID: <199907120301.XAA06001@eric.cnri.reston.va.us> [Tim seems to be explaining why len(l)+1 and not len(l) -- but I was really thinking about len(l)+1 vs. 1.] > OTOH, the notion that the "finally" clause should get triggered at all the > first len(l) times is debatable. If I picture it as a "resumable function" > then, sure, it should; but if I picture the caller as bouncing control back > & forth with the generator, coroutine style, then suspension is a just a > pause in the generator's execution. The latter is probably the more natural > way to picture it, eh? *This* is what I was getting at, and it points in favor of a SUSPEND opcode since I don't know how to do that in the multiple-return. As you point out, there can be various things on the various in-frame stacks (value stack and block stack) that all get discarded by a return, and that no restart_frame() can restore (unless get_frame() returns a *copy* of the frame, which seems to be defeating the purpose). > OTOH, the possibility that someone *can* do an explicit get_frame suggests > that "for" shouldn't assume it's the master of the universe . Perhaps > the user's intent was to generate the first 100 values in a for loop, then > break out, analyze the results, and decide whether to resume it again by > hand (I've done stuff like that ...). 
So there's also a case to be made for > saying that a "finally" clause wrapping a generator body will only be > executed if the generator body raises an exception or the generator itself > decides it's done; i.e. iff it triggers while the generator is actively > running. Hmm... I think that if the generator is started by a for loop, it's okay for the loop to assume it is the master of the universe -- just like there's no force in the world (apart from illegal C code :) that can change the hidden loop counter in present-day for loop. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@CNRI.Reston.VA.US Mon Jul 12 04:36:05 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Sun, 11 Jul 1999 23:36:05 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: Your message of "Sun, 11 Jul 1999 15:49:57 EDT." <000201becbd6$8df90660$569e2299@tim> References: <000201becbd6$8df90660$569e2299@tim> Message-ID: <199907120336.XAA06056@eric.cnri.reston.va.us> [Tim] > "stateful iterative process" is a helpful characterization of where these > guys can be useful! State captured in variables is the obvious one, but > simply "where you are" in a mass of nested loops and conditionals is also > "state" -- and a kind of state especially clumsy to encode as data state > instead (ever rewrite a hairy recursive routine to use iteration with an > explicit stack? it's a transformation that can be mechanized, but the > result is usually ugly & often hard to understand). This is another key description of continuations (maybe not quite worth a hug :). The continuation captures exactly all state that is represented by "position in the program" and no state that is represented by variables. But there are many hairy details. In antiquated assembly, there might not be a call stack, and a continuation could be represented by a single value: the program counter. 
But now we have a call stack, a value stack, a block stack (in Python) and who knows what else.

I'm trying to understand whether we can get away with saving just a pointer to a frame, whether we need to copy the frame, or whether we need to copy the entire frame stack.

(In regular Python, the frame stack also contains local variables. These are explicitly exempted from being saved by a continuation. I don't know how Christian does this, but I presume he uses the dictionary which can be shared between frames.)

Let's see... Say we have this function:

    def f(x):
        try:
            return 1 + (-x)
        finally:
            print "boo"

The bytecode (simplified) looks like:

        SETUP_FINALLY (L1)
        LOAD_CONST (1)
        LOAD_FAST (x)
        UNARY_NEGATIVE
        BINARY_ADD
        RETURN_VALUE

    L1: LOAD_CONST ("boo")
        PRINT_ITEM
        PRINT_NEWLINE
        END_FINALLY

Now suppose that the unary minus operator saves its continuation (e.g. because x was a type with a __neg__ method). At this point there is an entry on the block stack pointing to L1 as the try-finally block, and the value stack has the value 1 pushed on it.

Clearly if that saved continuation is ever invoked (called? used? activated? What do you call what you do to a continuation?) it should substitute whatever value was passed into the continuation for the result of the unary minus, and the program should continue by pushing it on top of the value stack, adding it to 1, and returning the result, executing the block of code at L1 on the way out. So clearly when the continuation is used, 1 should be on the value stack and L1 should be on the block stack.

Assuming that the unary minus function initially returns just fine, the value stack and the block stack of the frame will be popped. So I conclude that saving a continuation must save at least the value and block stack of the frame being saved.

Is it safe not to save the frame and block stacks of frames further down on the call stack?
I don't think so -- these are all destroyed when frames are popped off the call stack (even if the frame is kept alive, its value and block stack are always empty when the function has returned). So I hope that Christian has code that saves the frame and block stacks! (It would be fun to try and optimize this by doing it lazily, so that frames which haven't returned yet aren't copied yet.) How does Scheme do this? I don't know if it has something like the block stack, but surely it has a value stack! Still mystified, --Guido van Rossum (home page: http://www.python.org/~guido/) From tim_one@email.msn.com Mon Jul 12 08:03:59 1999 From: tim_one@email.msn.com (Tim Peters) Date: Mon, 12 Jul 1999 03:03:59 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <199907120336.XAA06056@eric.cnri.reston.va.us> Message-ID: <000201becc34$b79f7900$9b9e2299@tim> [Guido wonders about continuations -- must be a bad night for sleep ] Paul Wilson's book-in-progress has a (large) page of HTML that you can digest quickly and that will clear up many mysteries: ftp://ftp.cs.utexas.edu/pub/garbage/cs345/schintro-v14/schintro_142.html Scheme may be the most-often implemented language on Earth (ask 100 Schemers what they use Scheme for, persist until you get the truth, and 81 will eventually tell you that mostly they putz around writing their own Scheme interpreter <0.51 wink>); so there are a *lot* of approaches out there. Wilson describes a simple approach for a compiler. A key to understanding it is that continuations aren't "special" in Scheme: they're the norm. Even plain old calls are set up by saving the caller's continuation, then handing control to the callee. In Wilson's approach, "the eval stack" is a globally shared stack, but at any given moment contains only the eval temps relevant to the function currently executing. 
In preparation for a call, the caller saves away its state in "a continuation", a record which includes: the current program counter a pointer to the continuation record it inherited a pointer to the structure supporting name resolution (locals & beyond) the current eval stack, which gets drained (emptied) at this point There isn't anything akin to Python's block stack (everything reduces to closures, lambdas and continuations). Note: the continuation is immutable; once constructed, it's never changed. Then the callees' arguments are pushed on the eval stack, a pointer to the continuation as saved above is stored in "the continuation register", and control is transferred to the callee. Then a function return is exactly the same operation as "invoking a continuation": whatever is in the continuation register at the time of the return/invoke is dereferenced, and the PC, continuation register, env pointer and eval stack values are copied out of the continuation record. The return value is passed back in another "virtual register", and pushed onto the eval stack first thing after the guts of the continuation are restored. So this copies the eval stack all the time, at every call and every return/invoke. Kind of. This is partly why "tail calls" are such a big deal in Scheme: a tail call need not (*must* not, in std Scheme) create a new continuation. The target of a tail call simply inherits the continuation pointer inherited by its caller. Of course many Scheme implementations optimize beyond this. > I'm trying to understand whether we can get away with saving just a > pointer to a frame, whether we need to copy the frame, or whether we > need to copy the entire frame stack. In the absence of tail calls, the approach above saves the stack on every call and restores it on every return, so there's no "extra" copying needed when capturing, or invoking, a continuation (cold comfort, I agree ). 
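A toy model of the scheme just described, as one reading of the summary above and not code from Wilson's book or from any Scheme implementation. All names (Continuation, Machine, call, tail_call, invoke, ret) are invented for the sketch, and the "program counter" is just an integer. The point it tries to show is that a return and a continuation invocation are the same operation, and that an immutable record can therefore be re-entered more than once.

```python
from typing import NamedTuple, Optional, Tuple


class Continuation(NamedTuple):
    """Immutable record built by the caller just before a call."""
    pc: int                            # where the caller resumes
    parent: Optional["Continuation"]   # continuation the caller inherited
    env: dict                          # name-resolution structure (shared, not copied)
    saved_stack: Tuple                 # caller's eval temps, drained from the stack


class Machine:
    def __init__(self):
        self.pc = 0
        self.cont = None   # the "continuation register"
        self.env = {}
        self.stack = []    # shared eval stack: current function's temps only

    def call(self, target_pc, args, callee_env):
        # Save the caller's state, draining the eval stack into the record.
        self.cont = Continuation(self.pc + 1, self.cont, self.env,
                                 tuple(self.stack))
        self.stack = list(args)
        self.env = callee_env
        self.pc = target_pc

    def tail_call(self, target_pc, args, callee_env):
        # No new record: the target inherits the current continuation register.
        self.stack = list(args)
        self.env = callee_env
        self.pc = target_pc

    def invoke(self, k, value):
        # Copy everything back out of the record, then push the passed value.
        self.pc, self.cont, self.env = k.pc, k.parent, k.env
        self.stack = list(k.saved_stack)
        self.stack.append(value)

    def ret(self, value):
        # A plain return is just invoking whatever the register holds.
        self.invoke(self.cont, value)
```

Because the record is never mutated, holding on to one (k = m.cont) and calling m.invoke(k, v) later restores exactly the state the caller had at the call, however many times it is done.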
About Christian's code, we'd better let it speak for itself -- I'm not clear on the details of what he's doing today. Generalities: > ... > So I hope that Christian has code that saves the frame and block > stacks! Yes, but nothing gets copied until a continuation gets captured, and at the start of that I believe only one frame gets cloned. > (It would be fun to try and optimize this by doing it lazily, > so that frames which haven't returned yet aren't copied yet.) He's aware of that . > How does Scheme do this? I don't know if it has something like the > block stack, but surely it has a value stack! Stacks and registers and such aren't part of the language spec, but, you bet -- however it may be spelled in a given implementation, "a value stack" is there. BTW, many optimizing Schemes define a weaker form of continuation too (call/ec, for "escaping continuation"). Skipping the mumbo jumbo definition <0.9 wink>, you can only invoke one of those if its target is on the path back from the invoker to the root of the call tree (climb up tree like Cheetah, not leap across branches like Tarzan). This amounts to a setjmp/longjmp in C -- and may be implemented that way! i-say-do-it-right-or-not-at-all-ly y'rs - tim From tismer@appliedbiometrics.com Mon Jul 12 10:44:06 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Mon, 12 Jul 1999 11:44:06 +0200 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) References: <000201becbd6$8df90660$569e2299@tim> <199907120336.XAA06056@eric.cnri.reston.va.us> Message-ID: <3789B8E6.C4CB6840@appliedbiometrics.com> Guido van Rossum wrote: ... > I'm trying to understand whether we can get away with saving just a > pointer to a frame, whether we need to copy the frame, or whether we > need to copy the entire frame stack. You need to preserve the stack and the block stack of a frame, if and only if it can be reached twice. I make this dependent from its refcount. 
Every frame monitors itself before and after every call_function, if a handler field in the frame "f_callguard" has been set. If so, the callguard is called. Its task is to see whether we must preserve the current state of the frame and to carry this out.

The idea is to create a shadow frame "on demand". When I touch a frame with a refcount > 1, I duplicate it at its f_back pointer. By that it is turned into a "continuation frame" which is nothing more than the stack copy, IP, and the block stack.

By that, the frame stays in place where it was, all pointers are still fine. The "real" one is now in the back, and the continuation frame's purpose when called is only to restore the state of the "real one" and run it (after doing a new save if necessary).

I call this technique "push back frames".

>
> (In regular Python, the frame stack also contains local variables.
> These are explicitly exempted from being saved by a continuation. I
> don't know how Christian does this, but I presume he uses the
> dictionary which can be shared between frames.)

I keep the block stack and a stack copy. All the locals are only existing once. The frame is also only one frame. Actually always a new one (due to push back), but virtually it is "the frame", with multiple continuation frames pointing at it.

...

> Clearly if that saved continuation is ever invoked (called? used?
> activated? What do you call what you do to a continuation?)

I think of throwing. Mine are thrown. The executive of standard frames is "eval_code2_loop(f, passed_retval)", where the executive of a continuation frame is "throw_continuation(f, passed_retval)".

...

> Is it safe not to save the frame and block stacks of frames further
> down on the call stack? I don't think so -- these are all destroyed
> when frames are popped off the call stack (even if the frame is kept
> alive, its value and block stack are always empty when the function
> has returned).
> > So I hope that Christian has code that saves the frame and block > stacks! (It would be fun to try and optimize this by doing it lazily, > so that frames which haven't returned yet aren't copied yet.) :-) I have exactly that, and I do it lazily already. Unless somebody saves a continuation, nothing special happens. But if he does, the push back process follows his path like a zip (? Reißverschluß) and ensures that the path can be walked again. Tarzan has now the end of this liane in his hand. He might use it to swing over, or he might drop it, and it ribbles away and vanishes as if it never existed. Give me some final testing, and you will be able to try it out in a few days. ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From tismer@appliedbiometrics.com Mon Jul 12 10:56:00 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Mon, 12 Jul 1999 11:56:00 +0200 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) References: <000201becc34$b79f7900$9b9e2299@tim> Message-ID: <3789BBB0.39F6BD20@appliedbiometrics.com> Tim Peters wrote: ... > BTW, many optimizing Schemes define a weaker form of continuation too > (call/ec, for "escaping continuation"). Skipping the mumbo jumbo definition > <0.9 wink>, you can only invoke one of those if its target is on the path > back from the invoker to the root of the call tree (climb up tree like > Cheetah, not leap across branches like Tarzan). This amounts to a > setjmp/longjmp in C -- and may be implemented that way! Right, maybe this would do enough. We will throw away what's not needed, when we know what we actually need... 
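The weaker "escaping continuation" mentioned in the quote can be approximated in plain Python with an exception, much as the setjmp/longjmp comparison suggests. This is a sketch under that analogy only; call_ec, _Escape and first_negative are made-up names, not part of any module discussed here.

```python
class _Escape(BaseException):
    """Carries a value up the call chain to the matching call_ec."""

    def __init__(self, tag, value):
        self.tag = tag
        self.value = value


def call_ec(func):
    """Call func(escape); calling escape(v) makes call_ec return v at once."""
    tag = object()      # identifies this particular call_ec activation
    active = True

    def escape(value=None):
        # Only legal while call_ec is still on the path back to the root.
        if not active:
            raise RuntimeError("escape used after its call_ec returned")
        raise _Escape(tag, value)

    try:
        return func(escape)
    except _Escape as exc:
        if exc.tag is tag:
            return exc.value
        raise               # belongs to an outer call_ec, keep unwinding
    finally:
        active = False


def first_negative(rows):
    """Leave two nested loops in one jump as soon as a negative shows up."""
    def search(escape):
        for row in rows:
            for x in row:
                if x < 0:
                    escape(x)
        return None
    return call_ec(search)
```

The escape can only climb: once call_ec has returned, its frame is gone and there is nothing left to jump back into, which is exactly the restriction that separates this from a full continuation.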
> i-say-do-it-right-or-not-at-all-ly y'rs - tim ...and at the moment I think it was right to take it all. just-fixing-continuations-spun-off-in-an-__init__-which- -is-quite-hard-since-still-recursive,-and-I-will-ship-it-ly y'rs - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Mon Jul 12 16:42:14 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw) Date: Mon, 12 Jul 1999 11:42:14 -0400 (EDT) Subject: [Python-Dev] Generator details References: <199907101548.LAA04399@eric.cnri.reston.va.us> <000001becc0d$fcb64f40$229e2299@tim> Message-ID: <14218.3286.847367.125679@anthem.cnri.reston.va.us> | value, frame = generator(args) | try: | while frame: | etc | value, frame = resume_frame(frame) | finally: | if frame: | shut_frame_down(frame) Minor point, but why not make resume() and shutdown() methods on the frame? Isn't this much cleaner? value, frame = generator(args) try: while frame: etc value, frame = frame.resume() finally: if frame: frame.shutdown() -Barry From tismer@appliedbiometrics.com Mon Jul 12 20:39:40 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Mon, 12 Jul 1999 21:39:40 +0200 Subject: [Python-Dev] continuationmodule.c preview Message-ID: <378A447C.D4DD24D8@appliedbiometrics.com> This is a multi-part message in MIME format. --------------EDD81E724667AA03453E0C67 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Howdy, please find attached my latest running version of continuationmodule.c which is really able to do continuations. You need stackless Python 0.3 for it, which I just submitted. This module is by no means ready. 
The central functions are getpcc() and putcc. Call/cc is at the moment to be done like: def callcc(fun, *args, **kw): cont = getpcc() return apply(fun, (cont,)+args, kw) getpcc(level=1) gets a parent's current continuation. putcc(cont, val) throws a continuation. At the moment, these are still frames (albeit special ones) which I will change. They should be turned into objects which have a link to the actual frame, which can be unlinked after a shot or by hand. This makes it easier to clean up circular references. I have a rough implementation of this in Python, also a couple of generators and coroutines, but all not pleasing me yet. Due to the fact that my son is ill, my energy has dropped a little for the moment, so I thought I'd better release something now. I will make the module public when things have been settled a little more. ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home --------------EDD81E724667AA03453E0C67 Content-Type: application/x-unknown-content-type-cfile; name="continuationmodule.c" Content-Transfer-Encoding: base64 Content-Disposition: inline; filename="continuationmodule.c" Ci8qIENvbnRpbnVhdGlvbiBvYmplY3RzICovCgovKgoKICBDVCA5OTA2MTkKCiAgQWZ0ZXIg bG90cyBvZiB0aG91Z2h0LCBJIGNhbWUgdXAgd2l0aCB0aGlzOgogIEZyYW1lcyBjYW4gYmUg c3dpdGNoZWQgYW5kIHRyZWF0ZWQgbGlrZQogIGNvbnRpbnVhdGlvbnMgaW4gbWFueSBjYXNl cy4KICBUaGlzIHdvcmtzIGZpbmUsIHVudGlsIGEgZnJhbWUgbmVlZHMgdG8KICBiZSByZXR1 cm5lZCB0byBtb3JlIHRoYW4gb25jZS4KICBUaGlzIGlzIHRoZSByZWFsIHNpdHVhdGlvbiwg d2hlcmUgd2UgbmVlZAogIHRvIHVzZSAicmVhbCIgY29udGludWF0aW9ucywgdGhhdCBtZWFu czoKICBUaGUgZnJhbWUncyBleGVjdXRpb24gc3RhdGUgbXVzdCBiZSBzZXQgYnkgdGhlCiAg cmV0dXJuaW5nIGNhbGxlZS4KCiAgSXQgd291bGQgYmUgYSBtYWpvciBkcmF3YmFjayB0byBh 
bHdheXMgaGF2ZSBjYWxsZWVzCiAgdG8gY2FycnkgdGhlaXIgY3VycmVudCBiYWNrIGNvbnRp bnVhdGlvbiBhcm91bmQuCiAgVGhpcyB3b3VsZCBiZSBhIGxvdCBvZiB1bm5lY2Vzc2FyeSBv dmVyaGVhZCBhbmQgd291bGQKICBzbG93IGRvd24gbW9zdCBvZiB0aGUgZnVuY3Rpb24gY2Fs bHMgd2l0aG91dCBhbnkgYmVuZWZpdC4KCiAgU29sdXRpb24gdG8gdGhpcyBwcm9ibGVtOgog IFdoZW5ldmVyIHRoZSBjdXJyZW50IGNvbnRpbnVhdGlvbiBpcyByZXF1ZXN0ZWQgZnJvbSBQ eXRob24gY29kZSwKICB3ZSB0YWtlIHRoZSBjdXJyZW50IGZyYW1lIGFuZCBkdXBsaWNhdGUg aXQuIER1cGxpY2F0aW9uCiAgb2Njb3VycyBhdCB0aGUgZl9iYWNrIHBvaW50ZXI6IEEgY29w eSBvZiB0aGUgZnJhbWUgaXMgc3BsaWNlZAogIGluIHRoZXJlLiBUaGlzIGtlZXBzIGFsbCBy ZWZlcmVuY2VzIGNvcnJlY3QuCiAgVGhlIGZvcm1lciBjdXJyZW50IGZyYW1lIGlzIG5vdyB0 cmFuc2Zvcm1lZCBpbnRvIGEgY29udGludWF0aW9uCiAgZnJhbWUuCiAgQ29udGludWF0aW9u IGZyYW1lcyBoYXZlIGEgc3BlY2lhbCBmX2V4ZWN1dGUgZW50cnkuCiAgVGhlaXIgb25seSBw dXJwb3NlIGlzIHRvIHBhcmFtZXRlcml6ZSB0aGUgdHJ1ZSBmcmFtZQogIGFuZCBydW4gaXQu CiAgVGhlIGNvcHktaW50by10aGUtYmVoaW5kIHRyaWNrIGlzIGFsd2F5cyBwZXJmb3JtZWQs IHdoZW4KICBhIGNvbnRpbnVhdGlvbiBmcmFtZSBpcyBydW4uIEF0IHRoZSBzYW1lIHRpbWUs IGl0IGlzIGVuc3VyZWQKICB0aGF0IGFueSB0cnVlIGZyYW1lIHdoaWNoIGlzIGludm9rZWQg YnkgYSBjb250aW51YXRpb24gaXMKICBpdHNlbGYgcG9pbnRpbmcgYmFjayB0byBhIGNvbnRp bnVhdGlvbi4KICBCeSBtZWFucyBvZiB0aGlzLCBpdCBhcHBlYXJzIGFzIGlmIHRoZSB3aG9s ZSBleGVjdXRpb24gdHJlZQogIHdlcmUgYnVpbHQgd2l0aCBjb250aW51YXRpb25zIHNsaWNl ZCBiZXR3ZWVuIG5vcm1hbCBmcmFtZXMuCiAgQnV0IHRoaXMgaGFwcGVucyBvbmx5IG9uIGRl bWFuZCwgaW4gYSBsYXp5IG1hbm5lci4KCiAgSSdtIHN1cmUgSSBoYXZlIHJlYWNoZWQgdGhl IG5leHQgbGV2ZWwgb2YgbWFkbmVzcyBieSB0aGF0LgoKICBDaHJpc3RpYW4gVGlzbWVyCiAg SnVuZSAxOSwgMTk5OQoKICBBZGRlbmR1bSA5OTA2MjA6IEl0IHdhcyBjcnVjaWFsIHRvIGNo YW5nZSB0aGUgZnJhbWUgcmVmY291bnRpbmcKICBwb2xpY3ksIGluIG9yZGVyIHRvIGdldCB0 aGlzIHRvIHdvcmsgcGFpbmxlc3NseS4KICBGcmFtZXMgbm93IGhhdmUgdXN1YWxseSBhbHdh eXMgYSByZWZjb3VudCBvZiAxLgogIE9uIHJldHVybiwgZl9iYWNrIGlzIGluY3JlZidkLCB0 aGVuIHRoZSBjdXJyZW50IGZyYW1lIGRlY3JlZidkLgoKICA5OTA2MjQgc3RpbGwgaHVudGlu ZyB0aGUgbGFzdCByZWZjb3VudCBmYXVsdHMgc2luY2UgNSBkYXlzKysuCgogIDk5MDYyNSBy 
ZWZjb3VudCBmYXVsdHMgc29sdmVkLiBJIGRpZCBub3QgdW5kZXJzdGFuZCB0aGF0CiAgYSBm cmFtZSdzIHN0YWNrc2l6ZSBmaWVsZCBzaG91bGQgbmV2ZXIgYmUgd3JpdHRlbi4KICBTZWUg ZnJhbWVvYmplY3QuYyBmb3IgR3VpZG8ncyBjb21tZW50cy4KCiAgQWRkZWQgcHJvcGVyIGhh bmRsaW5nIG9mIHJlY3Vyc2lvbiBkZXB0aC4gSXQgaXMgc2ltcGxlCiAgYW5kIGNoZWFwOiBT YXZlIGFuZCByZXN0b3JlIHRoZSByZWN1cnNpb24gbGV2ZWwgaW4gdGhlCiAgY29udGludWF0 aW9uIGZyYW1lcy4KCiAgOTkwNjI2IGl0IHNlZW1zIHRvIGJlIHZlcnkgc3RhYmxlIG5vdy4g QnV0OgoKICAtLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t LS0tLS0tLS0tLS0tLS0tCiAgVmVyc2lvbiAwLjIKCiAgQWxsIHRoZSBwcm9ibGVtcyBoYXZl IGJlZW4gc29sdmVkIHNvIGZhci4KICBBZnRlciBzb21lIHRlc3RpbmcgYW5kIG1vcmUgcmVh ZGluZyBhYm91dCBTY2hlbWUsIEkKICByZWFsaXplZCB0aGF0IHRoZSBldmFsdWF0aW9uIHN0 YWNrIGlzIGEgcmVhbGx5IGJhZCB0aGluZyEKICBCeSBjb3B5aW5nIHN0YWNrcyBiYWNrLCB3 ZSBhcmUgcmVzdG9yaW5nIG9sZCBpbmZvcm1hdGlvbgogIHdoaWNoIGlzIHdyb25nLiBXZSBo YXZlIHRvIHVzZSB0aGUgY3VycmVudCBpbmZvcm1hdGlvbi4KICBFeGFtcGxlOiBMb29wcyBj YXJyeSB0aGVpciBzdGF0ZSBvbiB0aGUgc3RhY2suIEl0IGlzCiAgYSBiYWQgaWRlYSB0byBy ZXN0b3JlIHRoYXQgdXNpbmcgb2xkIHZhbHVlcy4KCiAgQWxzbywgY3JlYXRpbmcgbmV3IHB1 c2ggYmFjayBmcmFtZXMgYWxsIHRoZSB0aW1lIGlzIG5vdAogIHRoZSBiZXN0IGlkZWEsIHNp bmNlIHRoZXJlIHNob3VsZCBiZSBvbmx5IG9uZSBjb250aW51YXRpb24KICBmcmFtZSBmb3Ig ZXZlcnkgY29udGludWF0aW9uIGVudHJ5IHBvaW50LgoKICBJbiBhbiBpZGVhbCB3b3JsZCwg ZXZlcnkgZW50cnkgcG9pbnQgd291bGQgbm90IGhhdmUgYQogIHN0YWNrLCBidXQgYSBudW1i ZXIgb2YgcmVnaXN0ZXJzIHdoaWNoIGFyZSBhc3NvY2lhdGVkCiAgd2l0aCB0aGUgcHJvZ3Jh bSBsb2NhdGlvbi4gVGhlc2UgcmVnaXN0ZXJzIHNob3VsZCBhbHdheXMKICBjYXJyeSBjdXJy ZW50IHZhbHVlcy4gSG93IGNhbiB3ZSBkbyB0aGlzPwoKICBBIGNvbXBsZXRlIHNvbHV0aW9u IHNlZW1zIHRvIGJlIGltcG9zc2libGUgd2l0aG91dAogIHJld3JpdGluZyBoYWxmIG9mIHRo ZSBQeXRob24ga2VybmVsLiBBbGwgd2UgY2FuIGRvCiAgaXMgdHJ5aW5nIHRvIG1ha2Ugc3Vy ZSB0aGF0IGV2ZXJ5IHByb2dyYW0gbG9jYXRpb24KICB3aGljaCBjYW4gYmUgcmVhY2hlZCBi eSBpdHMgY29udGludWF0aW9uIHVzZXMgdGhlCiAgc2FtZSwgZnJlc2hlc3QgdmVyc2lvbiBv ZiBkYXRhLgoKICA5OTA2MjcgTm8sIGl0ICppcyogcG9zc2libGUuIFdlIG5lZWQgYSBzbWFs 
bCBjaGFuZ2UKICBvZiBldmFsX2NvZGUgdG8gYWxsb3cgZm9yIHNvbWV0aGluZyBiZWluZyBj YWxsZWQKICBiZWZvcmUgYSBuZXcgZnJhbWUgaXMgcnVuLiBUaGlzIG1lYW5zIHRvIGFkZAog IG9uZSBuZXcgZmllbGQgdG8gZnJhbWVzOiBmX2NhbGxndWFyZC4KCiAgRGlkIGl0LiBJdCB3 b3Jrcy4gSXQgaXMgbm8gY29tcGxldGUgc29sdXRpb24sIGJ1dCBhdCBsZWFzdAogICJmb3Ii IGxvb3BzIGJlaGF2ZSBjb3JyZWN0bHkgaW4gc2ltcGxlIGNhc2VzLgogIFRoZSB0cnVlIHNv bHV0aW9uIGlzOiBFaXRoZXIgdHVybiB0aGUgZXZhbCBpbnRlcnByZXRlcgogIGludG8gYSBy ZWdpc3RlciBtYWNoaW5lLCBvciBkbyB0aGUgZm9sbG93aW5nOgogIE9uIGRlbWFuZCwgYSBk YXRhIGZsb3cgYW5hbHlzaXMgbXVzdCBiZSBkb25lIGZvciB0aGUKICBzdGFjayBpbiBvcmRl ciB0byBmaW5kIHRoZSBsaWZldGltZSBvZiBzdGFjayB2YXJpYWJsZXMuCiAgVGhpcyBpcyBu b3QgdG9vIGhhcmQuIEkgd2lsbCBub3QgZG8gdGhpcyB1bnRpbCBJIGNoYW5nZWQKICB0aGUg b3ZlcmFsbCBjb25jZXB0IGFnYWluOiBGcmFtZSBpbnRlcnByZXRlcnMgbXVzdCBiZWNvbWUK ICBvYmplY3QgYnkgdGhlbXNlbHZlcy4gTmVjZXNzYXJ5IGFjdGlvbnMgZm9yIGNvbnRpbnVh dGlvbnMKICB3aWxsIHRoZW4gYmVjb21lIG1ldGhvZHMgb2YgdGhlc2Ugb2JqZWN0cy4KICBU aGlzIGlzIHRoZSB3YXkgdG8gZ28sIGZvciBWZXJzaW9uIDAuMy4KCiAgLS0tLS0tLS0tLS0t LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLQoKICA5 OTA3MTAgZHJvcHBlZCB0aGUgZXh0cmEgaGlkZGVuIHJlZ2lzdGVyIGRldGVjdGlvbiBtZXNz LgogIExldCdzIGZvcmdldCBhYm91dCB0aGUgYWJvdmUgY29uc2lkZXJhdGlvbnMuCgogIFN0 YWNrbGVzcyBQeXRob24gZ290IGEgbmV3IG11dGFibGUgY291bnRlciBvYmplY3QgZm9yIHRo ZQogIGZvci1sb29wcywgYW5kIHdlJ3JlIGRvbmUuIEFsc28gcmV3b3JrZWQgdGhlIGRldGVj dGlvbgogIG9mIGRpZmZlcmVudCBkaXNwYXRjaGVyIGluc3RhbmNlcyBxdWl0ZSBtdWNoLgog IEdvaW5nIHRvIHJlbGVhc2Ugc29tZSBzdHVmZiBzb29uLgoKICA5OTA3MTEgTm93IEkgKnRo b3VnaHQqIGl0IHNob3VsZCBhbHNvIHdvcmsgdG8gc2F2ZQogIGNvbnRpbnVhdGlvbnMgaW5z aWRlIG9mIG5lc3RlZCBkaXNwYXRjaGVycy4gQWN0dWFsbHksCiAgaXQgZG9lc24ndC4gTXkg cHJldmVudGl2ZSBjb250aW51YXRpb24gZnJhbWUgY3JlYXRpb24KICB3aGljaCBJIGRvIGZv ciBjYWxsZXJzIGlzIGJyb2tlbiwgaWYgYW4gZXZhbF9jb2RlIGluc3RhbmNlCiAgY2FsbHMg YSBmdW5jdGlvbiB3aGljaCBkb2Vzbid0IGdlbmVyYXRlIHVud2luZCB0b2tlbnMuCiAgVGhh dCBtZWFucywgZXZhbF9jb2RlIGNvbnRpbnVlcyB0byBydW4gdGhpcyBmcmFtZSwgYWx0aG91 
Z2gKICBpdCBoYWQgYmVlbiB0dXJuZWQgaW50byBhIGNvbnRpbnVhdGlvbiBmcmFtZS4gVGhp cyBsZWFkcyB0bwogIGZ1bm55IGVmZmVjdHMgYW5kIGlzIGJhZC4KCiAgQnV0IG5vdyBJIHJl YWxpemVkIHdoYXQgdGhlICpyZWFsKiB3YXkgaXMsIGFuZCB0aGF0IGl0IGlzCiAgc28gbXVj aCBlYXNpZXIgdGhhbiBldmVyeXRoaW5nLiBXZSB1c2UgdGhlIGZfY2FsbGd1YXJkLAogIHdo aWNoIG5vdyBtYWtlcyB2ZXJ5IG11Y2ggc2Vuc2UuIFRoZSBjYWxsZ3VhcmQgYmVjb21lcwog IGEgaGFuZGxlciB3aGljaCBjaGVja3Mgd2V0aGVyIHRoZSBjdXJyZW50IGZyYW1lIGhhcyBh CiAgcmVmY291bnQgb2YgbW9yZSB0aGFuIG9uZS4gSWYgc28sIHRoZW4gaXQgaXMgdHVybmVk IGludG8KICBhIGNvbnRpbnVhdGlvbiBmcmFtZSwgYW5kIHdlIHVud2luZCB3aXRoIHRoZSBw dXNoZWQgYmFjawogIGZyYW1lLgoKICA5OTA3MTIgVGhpcyBpcyBkb25lIG5vdy4gVGhlIGNh bGxndWFyZCBpcyBub3cgYWxzbyB1c2VkIHRvCiAgY2hlY2sgZm9yIGZ1bmN0aW9uIGNhbGxz IHdoaWNoIGRpZCBub3QgdW53aW5kIHRoZSBzdGFjay4KICBJbiBvdGhlciB3b3JkczogV2Ug Y2FuIGNhcHR1cmUgdGhlIGNvbnRpbnVhdGlvbnMgY29ycmVjdGx5LAogIGFsc28gaWYgd2Ug YXJlIGNhbGxlZCByZWN1cnNpdmVseSBieSBhbiBfX2luaXRfXyBjb250ZXh0LgoKICAqLwoK I2luY2x1ZGUgIlB5dGhvbi5oIgoKI2luY2x1ZGUgImNvbXBpbGUuaCIKI2luY2x1ZGUgImZy YW1lb2JqZWN0LmgiCgpzdGF0aWMgUHlPYmplY3QgKkVycm9yT2JqZWN0OwoJCgovKiBmb3J3 YXJkICovClB5T2JqZWN0ICogdGhyb3dfY29udGludWF0aW9uKFB5RnJhbWVPYmplY3QgKiwg UHlPYmplY3QgKik7CgpQeU9iamVjdCAqIHN0YWNrX3RvX3R1cGxlKGYpCglQeUZyYW1lT2Jq ZWN0ICpmOwp7CglQeU9iamVjdCAqKnAgPSBmLT5mX3N0YWNrcG9pbnRlcjsKCWludCBzaXpl ID0gcCAtIGYtPmZfdmFsdWVzdGFjazsKCVB5T2JqZWN0ICpyZXQgPSBQeVR1cGxlX05ldyhz aXplKTsKCWlmIChyZXQ9PU5VTEwpCgkJcmV0dXJuIE5VTEw7Cglmb3IgKCA7IHNpemU+MCA7 KSB7CgkJUHlPYmplY3QgKml0ZW0gPSAqLS1wOwoJCVB5X0lOQ1JFRihpdGVtKTsKCQlQeVR1 cGxlX1NFVF9JVEVNKHJldCwgLS1zaXplLCBpdGVtKTsKCX0KCXJldHVybiByZXQ7Cn0KCgoJ ClB5RnJhbWVPYmplY3QgKiBzZXRfZXhlY3V0aW9uX3N0YXRlKGYsIGZiKQoJUHlGcmFtZU9i amVjdCAqIGY7CglQeUZyYW1lT2JqZWN0ICogZmI7CnsKCVB5T2JqZWN0ICoqIGZzcDsKCVB5 T2JqZWN0ICogaXRlbTsKCglpZiAoZi0+Zl9leGVjdXRlICE9IHRocm93X2NvbnRpbnVhdGlv bikgewoJCVB5RXJyX1NldFN0cmluZyAoUHlFeGNfU3lzdGVtRXJyb3IsICJ3cm9uZyBleGVj dXRpb24gc291cmNlIGZyYW1lIik7CgkJcmV0dXJuIE5VTEw7Cgl9CgoJaWYgKGZiLT5mX2V4 
ZWN1dGUgPT0gdGhyb3dfY29udGludWF0aW9uKSB7CgkJUHlFcnJfU2V0U3RyaW5nIChQeUV4 Y19TeXN0ZW1FcnJvciwgIndyb25nIGV4ZWN1dGlvbiB0YXJnZXQgZnJhbWUiKTsKCQlyZXR1 cm4gTlVMTDsKCX0KCgkvKiBjb3B5IGV4Y2VwdGlvbiBibG9ja3MgKi8KCWlmIChmLT5mX2li bG9jayA+IDApIHsKCQlmYi0+Zl9pYmxvY2sgPSBmLT5mX2libG9jazsKCQltZW1jcHkoJihm Yi0+Zl9ibG9ja3N0YWNrKSwgJihmLT5mX2Jsb2Nrc3RhY2spLCBmLT5mX2libG9jaypzaXpl b2YoUHlUcnlCbG9jaykpOwoJfQoKCS8qIGNsZWFyIHRoZSB0YXJnZXQgc3RhY2sgKi8KCXdo aWxlIChmYi0+Zl9zdGFja3BvaW50ZXIgPiBmYi0+Zl92YWx1ZXN0YWNrKSB7CgkJaXRlbSA9 ICotLShmYi0+Zl9zdGFja3BvaW50ZXIpOwoJCVB5X1hERUNSRUYoaXRlbSk7Cgl9CgkvKiBj b3B5IHRoZSBzdGFjayAqLwoJZnNwID0gZi0+Zl92YWx1ZXN0YWNrOwoJd2hpbGUgKGZzcCA8 IGYtPmZfc3RhY2twb2ludGVyKSB7CgkJaXRlbSA9ICooZnNwKyspOwoJCVB5X0lOQ1JFRihp dGVtKTsKCQkqKGZiLT5mX3N0YWNrcG9pbnRlcisrKSA9IGl0ZW07Cgl9CgkvKiBjb3B5IGFk ZGl0aW9uYWwgaW5mbyAqLwoJZmItPmZfbGFzdGkgPSBmLT5mX2xhc3RpOwoJZmItPmZfbGlu ZW5vID0gZi0+Zl9saW5lbm87CgkvKiBpbnN0cnVjdGlvbiBwb2ludGVyICovCglmYi0+Zl9u ZXh0X2luc3RyID0gZi0+Zl9uZXh0X2luc3RyOwoJLyogY29weSBhdXhpbGlhcnkgcmVnaXN0 ZXJzICovCglmYi0+Zl9zdGF0dXNmbGFncyA9IGYtPmZfc3RhdHVzZmxhZ3M7CgkvKiBidXQg bm90IHRoZSByZWdpc3RlcnMgKi8KCXJldHVybiBmYjsgLyogc3VjY2VzcyBmbGFnICovCn0K CgovKgpUaGUgaWRlYSBvZiBidWlsZF9jb250aW51YXRpb25fZnJhbWUgaXMKdG8gdGFrZSBh IGZyYW1lIGFuZCBkdXBsaWNhdGUgaXQgYXQgaXRzIGZfYmFjayBwb2ludGVyLgpUaGUgZXhl Y3V0aW9uIHBvaW50ZXIgaXMgY2hhbmdlZCB0byB0aHJvd19jb250aW51YXRpb24uClNpbmNl IHRoZSBmcmFtZSAqYmVjb21lcyogdGhlIGNvbnRpbnVhdGlvbiwgd2hpY2ggbGF0ZXIKZmFs bHMgYmFjayB0byB0aGUgdHJ1ZSBhY3Rpb24sIGFsbCByZWZlcmVuY2VzIHRvIHRoZQpmcmFt ZSBzdGF5IGludGFjdC4gClRoaXMgaXMgYSAiY3JlYXRlIGNvbnRpbnVhdGlvbiBvbiBkZW1h bmQiIGNvbmNlcHQgOi0pCiovCgppbnQgZmluZF9yZWN1cnNpb25fZGVwdGgoZikKCVB5RnJh bWVPYmplY3QgKiBmOwp7CgkvKiAKCSAgd2UgZWl0aGVyIGNvdW50IHVudGlsIHRoZSB0b3Ag b2YgdGhlIGNoYWluLCAKCSAgb3IgdW50aWwgd2UgZmluZCBhbm90aGVyIGNvbnRpbnVhdGlv biBmcmFtZSAKCSAgd2hpY2ggdGVsbHMgdXMgYnkgaXRzIGZfcmVnMSBmaWVsZC4gCgkqLwoJ aW50IGRlcHRoID0gMDsKCXdoaWxlIChmICE9IE5VTEwpIHsKCQlpZiAoZi0+Zl9leGVjdXRl 
ID09IHRocm93X2NvbnRpbnVhdGlvbikgewoJCQlkZXB0aCArPSBmLT5mX3JlZzE7CgkJCWJy ZWFrOwoJCX0KCQlmID0gZi0+Zl9iYWNrOwoJCWRlcHRoKys7Cgl9CglyZXR1cm4gZGVwdGg7 Cn0KClB5RnJhbWVPYmplY3QgKiBidWlsZF9jb250aW51YXRpb25fZnJhbWUoZikKCVB5RnJh bWVPYmplY3QgKiBmOwp7CglQeUZyYW1lT2JqZWN0ICogZmI7IC8qIGZyYW1lIGluIHRoZSBi YWNrICovCglpbnQgaTsKCglpZiAoZj09TlVMTCkKCQlyZXR1cm4gTlVMTDsKCWlmIChmLT5m X2V4ZWN1dGUgPT0gdGhyb3dfY29udGludWF0aW9uKQoJCXJldHVybiBmOyAvKiBhbHJlYWR5 IGEgY29udGludWF0aW9uICovCgoJZmIgPSBQeUZyYW1lX05ldyhmLT5mX3RzdGF0ZSwgZi0+ Zl9jb2RlLCBmLT5mX2dsb2JhbHMsIGYtPmZfbG9jYWxzKSA7CglpZiAoZmIgPT0gTlVMTCkK CQlyZXR1cm4gTlVMTCA7CgoJLyogbGluayB0aGUgbmV3IGZyYW1lIGJldHdlZW4gdGhpcyBh bmQgYmFjayAqLwoJZmItPmZfYmFjayA9IGYtPmZfYmFjazsKCWYtPmZfYmFjayA9IGZiOwoK CS8qIHRyYWNlYmFja3M6IG5vdCBjb3BpZWQsIHRoZXkgc3RheSBpbiB0aGUgY29udGludWF0 aW9uICovCgoJLyogCgkgIGZfYnVpbHRpbnMsIGZfZ2xvYmFscywgZl9sb2NhbHMsIGZfdHN0 YXRlLCBmX3Jlc3RyaWN0ZWQsIGZfbmxvY2FscyAKCSAgYW5kIGZfZmlyc3RfaW5zdHIgaGF2 ZSBiZWVuIHNldCBieSBQeUZyYW1lX05ldy4KCSAKCSAgbmV2ZXIgdG91Y2ggZl9zdGFja3Np emUhIEl0IG1heSBoYXZlIGJlZW4gc2V0IGxhcmdlciB0aGFuCgkgIG5lZWRlZC4gVGhpcyBj YXVzZWQgbWUgaG91cnMgYW5kIGhvdXJzIG9mIGRlYnVnZ2luZy4KCSovCgoJZm9yIChpPTA7 IGk8Zi0+Zl9ubG9jYWxzOyBpKyspIHsKCQlQeU9iamVjdCAqIGl0ZW0gPSBmLT5mX2xvY2Fs c3BsdXNbaV07CgkJUHlfWElOQ1JFRihpdGVtKTsKCQlmYi0+Zl9sb2NhbHNwbHVzW2ldID0g aXRlbTsKCX0KCgoJUHlfSU5DUkVGKGYtPmZfZGlzcGF0Y2hlcik7CglmYi0+Zl9kaXNwYXRj aGVyID0gZi0+Zl9kaXNwYXRjaGVyOwoJZmItPmZfaG9sZF9yZWYgPSBmLT5mX2hvbGRfcmVm OyBmLT5mX2hvbGRfcmVmID0gTlVMTDsKCWZiLT5mX21lbW9yeSA9IGYtPmZfbWVtb3J5OyBm LT5mX21lbW9yeSA9IE5VTEw7CglmYi0+Zl9jYWxsZ3VhcmQgPSBmLT5mX2NhbGxndWFyZDsg Zi0+Zl9jYWxsZ3VhcmQgPSBOVUxMOwoKCS8qIHJlZ2lzdGVycywgb25seSBvbmNlICovCglm Yi0+Zl9yZWcxID0gZi0+Zl9yZWcxOwoJZmItPmZfcmVnMiA9IGYtPmZfcmVnMjsKCWZiLT5m X3JlZzMgPSBmLT5mX3JlZzM7CgoJLyogY29weSBleGVjdXRvciBhbmQgcHV0IG91cnMgKi8K CWZiLT5mX2V4ZWN1dGUgPSBmLT5mX2V4ZWN1dGU7CglmLT5mX2V4ZWN1dGUgPSB0aHJvd19j b250aW51YXRpb247CgoJLyogcHV0IHJlY3Vyc2lvbl9kZXB0aCBpbnRvIGZfcmVnMSAqLwoJ 
Zi0+Zl9yZWcxID0gZmluZF9yZWN1cnNpb25fZGVwdGgoZmIpOwoKCWlmICghc2V0X2V4ZWN1 dGlvbl9zdGF0ZShmLCBmYikpCgkJcmV0dXJuIE5VTEw7CglyZXR1cm4gZjsKfQoKClB5RnJh bWVPYmplY3QgKiBmaW5kX2NvZGVfZnJhbWUoZikKCVB5RnJhbWVPYmplY3QgKmY7CnsKCWlm IChmPT1OVUxMKQoJCXJldHVybiBOVUxMOwoJd2hpbGUgKGYtPmZfZXhlY3V0ZSA9PSB0aHJv d19jb250aW51YXRpb24pIHsKCQlmID0gZi0+Zl9iYWNrOwoJCWlmIChmPT1OVUxMKQoJCQly ZXR1cm4gTlVMTDsKCX0KCXJldHVybiBmOwp9CgoKUHlGcmFtZU9iamVjdCAqIG5vcm1hbGl6 ZV9jb250aW51YXRpb24oZikKCVB5RnJhbWVPYmplY3QgKmY7CnsKCS8qIHdhbGsgYmFjayB0 aHJvdWdoIGEgcG9zc2libGUgcGlsZSBvZiBjb250aW51YXRpb25zIGFuZCByZWxpbmsgdG8g dGhlIHJlYWwgY29kZSAqLwoJUHlGcmFtZU9iamVjdCAqZmIgPSBmLT5mX2JhY2s7CglpZiAo ZmI9PU5VTEwpCgkJcmV0dXJuIE5VTEw7CglpZiAoZmItPmZfZXhlY3V0ZSA9PSB0aHJvd19j b250aW51YXRpb24pIHsKCQlmYiA9IG5vcm1hbGl6ZV9jb250aW51YXRpb24oZmIpOwoJCWlm IChmYj09TlVMTCkKCQkJcmV0dXJuIE5VTEw7Cgl9CglQeV9JTkNSRUYoZmIpOwoJUHlfREVD UkVGKGYtPmZfYmFjayk7CglmLT5mX2JhY2sgPSBmYjsKCS8qIHJldHVybiB0aGUgcmVhbCBj b2RlIGZyYW1lICovCglyZXR1cm4gZmI7Cn0KCgpQeUZyYW1lT2JqZWN0ICogZmluZF9jYWxs ZXJfZW50cnkoZikKCVB5RnJhbWVPYmplY3QgKmY7CnsKCS8qIHRoZSBvcHBvc2l0ZSBvZiBm aW5kX2NvZGVfZnJhbWUuIGZpbmQgdGhlIHRydWUgcmV0dXJuLiAqLwoJZiA9IGZpbmRfY29k ZV9mcmFtZShmKTsKCWlmIChmICE9IE5VTEwpCgkJZiAgPSBmLT5mX2JhY2s7CglyZXR1cm4g ZjsKfQoKCmludCBwcm90ZWN0X3RoaXNfZnJhbWVfbmV4dF90aW1lKGYsIHBoYXNlKTsgLyog Zm9yd2FyZCAqLwoKLyoKCUEgZnJhbWUgd2lsbCBjaGVjayB3ZXRoZXIgaXQgY2FuIGJlIHJl YWNoZWQgdHdpY2UgYmVmb3JlIGl0CglpcyBydW4uIElmIHNvLCBpdCBjcmVhdGVzIGEgY29u dGludWF0aW9uIGZyYW1lIGZyb20gaXRzZWxmLAoJcHV0cyB0aGUgcHVzaGVkIGJhY2sgYmVy c2lvbiBvbiB0aGUgZnJhbWUgc3RhY2sgYW5kIGp1bXBzIG9mZi4KCglwaGFzZSAwOiBmcmFt ZSBpcyBhYm91dCB0byBsZWF2ZSB3aXRoIHRoZSB1bndpbmQgdG9rZW4KCXBoYXNlIDE6IGZy YW1lIGlzIGFib3V0IHRvIHN0YXJ0IHRoZSBpbnRlcnByZXRlciBsb29wCglwaGFzZSAyOiBm cmFtZSBoYXMganVzdCBkb25lIGEgdHJ1ZSBmdW5jdGlvbiBjYWxsCiovCgppbnQgcHJvdGVj dF90aGlzX2ZyYW1lKGYsIHBoYXNlKQoJUHlGcmFtZU9iamVjdCAqZjsKCWludCBwaGFzZTsK ewoJUHlGcmFtZU9iamVjdCAqYmFjazsKCglpZiAoZi0+b2JfcmVmY250ID09IDEpIHsgLyog 
Y2Fubm90IGJlIHJlYWNoZWQgdHdpY2UgKi8KCQlpZiAocGhhc2UgIT0gMCkgLyogd2UgYXJl IG5vdCBvbiBleGl0ICovCgkJCWYtPmZfdGVtcF92YWwgPSBOVUxMOwoJCXJldHVybiAwOwoJ fQoKCWYgPSBidWlsZF9jb250aW51YXRpb25fZnJhbWUoZik7CglpZiAoZj09TlVMTCkKCQly ZXR1cm4gLTE7CgoJLyogb24gZXhpdCwgd2UganVzdCBwcm90ZWN0ICovCglpZiAocGhhc2Ug PT0gMCkKCQlyZXR1cm4gMDsKCgkvKiBvbiBlbnRyeSwgd2UgY2F1c2UgYSByZXN0YXJ0ICov CgkvKiB3ZSBhcmUgY2FsbGVkIHdpdGggCgkgICBwaGFzZSA9PSAxIHdoaWNoIGlzICJiZWZv cmUgZnJhbWUgc3RhcnQiCgkgICBwaGFzZSA9PSAyIHdoaWNoIGlzICJhZnRlciBmdW5jdGlv biBjYWxsIgoJICAgYnV0IHRoZSBhY3Rpb24gYXBwZWFycyB0byBiZSBpZGVudGljYWwuCgkq LwoJYmFjayA9IGZpbmRfY29kZV9mcmFtZShmKTsKCWJhY2stPmZfdGVtcF92YWwgPSBmLT5m X3RlbXBfdmFsOyBmLT5mX3RlbXBfdmFsID0gTlVMTDsKCWJhY2stPmZfY2FsbGd1YXJkID0g cHJvdGVjdF90aGlzX2ZyYW1lX25leHRfdGltZTsKCVB5X0lOQ1JFRihiYWNrKTsKCWYtPmZf dHN0YXRlLT5mcmFtZSA9IGJhY2s7CglQeV9ERUNSRUYoZik7CglyZXR1cm4gLTQyOyAvKiBk aXNwYXRjaCB0aGUgbmV3IGZyYW1lICovCn0KCgovKiAKCUEgZnJhbWUgaXMgYWN0aXZhdGVk IGJ5IGEgY29udGludWF0aW9uIGZyYW1lIG9yCglyZXN0YXJ0ZWQgZm9yIHNvbWUgb3RoZXIg cmVhc29uLiBUaGVyZWZvcmUKCWl0IHNob3VsZCBydW4gd2l0aG91dCBidWlsZGluZyBhIG5l dyBjb250aW51YXRpb24KCWZyYW1lIGltbWVkaWF0ZWx5LCBidXQgYmUgYXJtZWQgdG8gZG8g c28gbmV4dCB0aW1lLgoqLwoKaW50IHByb3RlY3RfdGhpc19mcmFtZV9uZXh0X3RpbWUoZiwg cGhhc2UpCglQeUZyYW1lT2JqZWN0ICpmOwoJaW50IHBoYXNlOwp7CglpZiAocGhhc2UgIT0g MSkgLyoganVzdCBvbiBlbnRyeSAqLwoJCXJldHVybiAwOwoJZi0+Zl9jYWxsZ3VhcmQgPSBw cm90ZWN0X3RoaXNfZnJhbWU7CglmLT5mX3RlbXBfdmFsID0gTlVMTDsKCXJldHVybiAwOwp9 CgoKUHlPYmplY3QgKiB0aHJvd19jb250aW51YXRpb24oZiwgcGFzc2VkX3JldHZhbCkKCVB5 RnJhbWVPYmplY3QgKmY7CglQeU9iamVjdCAqcGFzc2VkX3JldHZhbDsJLyogcGFzc2luZyBh IGZ1bmN0aW9uIHJldHVybiB2YWx1ZSBiYWNrIGluICovCnsKCS8qIAoJICB3ZSBhcmUgdGhl IGV4ZWN1dG9yIG9mIGEgY29udGludWF0aW9uIGZyYW1lLgoJICBPdXIgdGFzayBpcyB0byBl bnN1cmUgdGhhdCB0aGUgY3VycmVudCBzdGF0ZSBvZiB0aGUKCSAgdGFyZ2V0IGZyYW1lIGlz IGFnYWluIHNhdmVkIGFzIGEgY29udGludWF0aW9uLgoJICBUaGUgY2FsbGVyIG9mIHRoZSBy ZWFsIGZyYW1lIHdpbGwgYmUgdHVybmVkIGludG8KCSAgYSBjb250aW51YXRpb24gYXMgd2Vs 
bC4KCSovCglQeUZyYW1lT2JqZWN0ICpiYWNrOwoKCWJhY2sgPSBub3JtYWxpemVfY29udGlu dWF0aW9uKGYpOwoKCS8qIGVuc3VyZSB0aGF0IGJhY2sgaXMgc2F2ZWQgYW5kIHByb3RlY3Rl ZCAqLwoJaWYgKGJhY2sgIT0gTlVMTCkgewoJCWlmIChwcm90ZWN0X3RoaXNfZnJhbWUoYmFj aywgMCkpIHsKCQkJYmFjayA9IE5VTEw7CgkJfQoJCWVsc2UgewoJCQliYWNrID0gbm9ybWFs aXplX2NvbnRpbnVhdGlvbihmKTsKCQkJLyogZW5zdXJlIHRoYXQgYmFjaydzIGNhbGxlciBp cyBwcm90ZWN0ZWQgKi8KCQkJaWYgKGJhY2stPmZfYmFjayAhPSBOVUxMKSB7CgkJCQlQeUZy YW1lT2JqZWN0ICogb3RoZXIgPSBmaW5kX2NvZGVfZnJhbWUoYmFjay0+Zl9iYWNrKTsKCQkJ CW90aGVyLT5mX2NhbGxndWFyZCA9IHByb3RlY3RfdGhpc19mcmFtZTsKCQkJfQoJCQliYWNr ID0gc2V0X2V4ZWN1dGlvbl9zdGF0ZShmLCBiYWNrKTsKCQl9Cgl9CgkKICAgIGlmIChiYWNr PT1OVUxMKSB7CgkJUHlFcnJfU2V0U3RyaW5nIChQeUV4Y19TeXN0ZW1FcnJvciwgImJyb2tl biBjb250aW51YXRpb24iKTsKCQkvKiB0cnkgdG8gcmVwYWlyICovCgkJYmFjayA9IGZpbmRf Y29kZV9mcmFtZShmKTsKCQlQeV9YSU5DUkVGKGJhY2spOwoJCWYtPmZfdHN0YXRlLT5mcmFt ZSA9IGJhY2s7CgkJUHlfREVDUkVGKGYpOwoJCVB5X1hERUNSRUYocGFzc2VkX3JldHZhbCk7 CgkJcmV0dXJuIE5VTEw7Cgl9IGVsc2UgewoJCVB5X0lOQ1JFRihiYWNrKTsKCQlmLT5mX3Rz dGF0ZS0+ZnJhbWUgPSBiYWNrOwoJCWYtPmZfdHN0YXRlLT5yZWN1cnNpb25fZGVwdGggPSBm LT5mX3JlZzE7CgkJUHlfREVDUkVGKGYpOwoJCXJldHVybiBwYXNzZWRfcmV0dmFsOwoJfQp9 CgoKLyogTm93IFNhbSdzIGNvbnRpbnVhdGlvbiBtZXRob2RzICovCgovKiAKICA5OTA3MDEg QWZ0ZXIgbXkgZnJhbWUgZGVhbGxvY2F0aW9uIHdhcyBjb3JyZWN0ZWQgdG8KICBhbHNvIGNs ZWFuIHVwIHN0YWNrcywgSSByZWFsaXplZCB0aGF0IGFsc28gcHV0Y2MKICBpcyBub3Qgc28g dHJpdmlhbCwgc2luY2UgaXQgaXMgcnVuIGluIGEgY29udGV4dCB3aGVyZQogIHRoZSBleGVj dXRvciB3YW50cyB0byBmaW5pc2ggaXRzIGZyYW1lLCBidXQgcHV0Y2MKICBtaWdodCBkZWFs bG9jYXRlIGl0IHRvbyBlYXJseS4KCiAgT25lIGNvdWxkIG9mIGNvdXJzZSB3YWxrIGFyb3Vu ZCB0aGlzIGJ5IGNyZWF0aW5nCiAgYW4gZXh0cmEgZnJhbWUgd2hpY2ggaGFzIGEgY2FsbGJh Y2sgd2hpY2guLi4KCiAgQnV0IHdlIGhhdmUgdGhpcyBjYWxsZ3VhcmQgcG9pbnRlciBpbiBm cmFtZXMgYWxyZWFkeS4KICBUaGlzIGlzIHRoZSB3YXkgdG8gZ28hCiAgV2hlbmV2ZXIgYSBm cmFtZSBpcyBhYm91dCB0byBiZSBsZWZ0IGZpbmFsbHkgd2l0aAogIGFuIHVud2luZCB0b2tl biwgd2UgbGV0IGl0IGdlbmVyYXRlIGEgY2FsbGJhY2sKICB3aGljaCBkZXN0cm95cyBpdCBp 
bW1lZGlhdGVseSBiZWZvcmUgcmV0dXJuaW5nIHRvCiAgdGhlIGRpc3BhdGNoZXIuIFBoZXcg Oi0pCiovCgppbnQgZGVzdHJveV90aGlzX2ZyYW1lKGYsIHBoYXNlKQoJUHlGcmFtZU9iamVj dCAqZjsKCWludCBwaGFzZTsKewoJaWYgKGYtPm9iX3JlZmNudCAhPSAxIHx8IHBoYXNlICE9 IDApIHsKCQlQeUVycl9TZXRTdHJpbmcgKFB5RXhjX1N5c3RlbUVycm9yLCAid3JvbmcgYXR0 ZW1wdCB0byBkZXN0cm95IHRoaXMgZnJhbWUiKTsKCQlyZXR1cm4gLTE7Cgl9CglQeV9ERUNS RUYoZik7CglyZXR1cm4gMDsKfQoKCmludCBhY3F1aXJlX2RlYWRfZGlzcGF0Y2hlcnNfZnJh bWVzKGssIGYpCglQeUZyYW1lT2JqZWN0ICprOwoJUHlGcmFtZU9iamVjdCAqZjsKewoJUHlE aXNwYXRjaGVyT2JqZWN0ICpkaywgKmRmOwoJZGsgPSBrLT5mX2Rpc3BhdGNoZXI7CglpZiAo ZGstPmRfYWxpdmUpCgkJcmV0dXJuIC0xOwoJZGYgPSBmLT5mX2Rpc3BhdGNoZXI7CglkbyB7 CgkJUHlfREVDUkVGKGRrKTsKCQlQeV9JTkNSRUYoZGYpOwoJCWstPmZfZGlzcGF0Y2hlciA9 IGRmOwoJCWsgPSBrLT5mX2JhY2s7Cgl9IHdoaWxlIChrICE9IE5VTEwgJiYgay0+Zl9kaXNw YXRjaGVyPT1kayk7CglyZXR1cm4gMDsKfQoKCnN0YXRpYyBQeU9iamVjdCAqCmJ1aWx0aW5f cHV0Y2MgKHNlbGYsIGFyZ3MpCglQeU9iamVjdCAqc2VsZjsKCVB5T2JqZWN0ICphcmdzOwp7 CglQeUZyYW1lT2JqZWN0ICogaywgKiBmOwoJUHlPYmplY3QgKiB2OwoJaWYgKCFQeUFyZ19Q YXJzZVR1cGxlIChhcmdzLCAiT08iLCAmaywgJnYpKSB7CgkJcmV0dXJuIE5VTEw7Cgl9IGVs c2UgaWYgKCFQeUZyYW1lX0NoZWNrIChrKSB8fCBrLT5mX2V4ZWN1dGUgIT0gdGhyb3dfY29u dGludWF0aW9uKSB7CgkJUHlFcnJfU2V0U3RyaW5nIChQeUV4Y19UeXBlRXJyb3IsICJhcmd1 bWVudCBtdXN0IGJlIGEgY29udGludWF0aW9uIGZyYW1lIik7CgkJcmV0dXJuIE5VTEw7Cgl9 IGVsc2UgewoJCVB5VGhyZWFkU3RhdGUgKnRzdGF0ZSA9IFB5VGhyZWFkU3RhdGVfR0VUKCk7 CgkJZiA9IHRzdGF0ZS0+ZnJhbWU7CgkJaWYgKGstPmZfZGlzcGF0Y2hlciAhPSBmLT5mX2Rp c3BhdGNoZXIpIHsKCQkJaWYgKGFjcXVpcmVfZGVhZF9kaXNwYXRjaGVyc19mcmFtZXMoaywg ZikpIHsKCQkJCVB5RXJyX1NldFN0cmluZyAoUHlFeGNfVHlwZUVycm9yLCAiZnJhbWUgb2Jq ZWN0cyBhcmUgaW5jb21wYXRpYmxlIik7CgkJCQlyZXR1cm4gTlVMTDsKCQkJfQoJCX0KCQlQ eV9JTkNSRUYodik7CgkJay0+Zl90ZW1wX3ZhbCA9IHY7CgkJay0+Zl9jYWxsZ3VhcmQgPSBw cm90ZWN0X3RoaXNfZnJhbWU7CgkJUHlfSU5DUkVGKGspOwoJCXRzdGF0ZS0+ZnJhbWUgPSBr OwoJCWlmIChmLT5vYl9yZWZjbnQgPiAxKSB7CgkJCVB5X0RFQ1JFRiAoZik7CgkJfSBlbHNl IHsKCQkJZi0+Zl9jYWxsZ3VhcmQgPSBkZXN0cm95X3RoaXNfZnJhbWU7IC8qIHJlYXNvbjog 
aXQgaXMgc3RpbGwgcnVubmluZyAqLwoJCX0KCQlQeV9ERUNSRUYoZi0+Zl9kaXNwYXRjaGVy KTsKCQlyZXR1cm4gUHlfVW53aW5kVG9rZW47Cgl9Cn0KCgovKiBmb3J3YXJkICovClB5T2Jq ZWN0ICogZ2V0Y2NfY2F0Y2hfZnJhbWUoUHlGcmFtZU9iamVjdCAqLCBQeU9iamVjdCAqKTsK Ci8qCmdldGNjIGlzIGEgbGl0dGxlIHRyaWNreS4gU2FtIHRyaWVkIHRvIGZldGNoIHRoZSBj dXJyZW50IGZyYW1lCmRpcmVjdGx5IGluIHRoZSBmdW5jdGlvbi4gVGhlIHByb2JsZW0gaXMs IHRoYXQgaW4gdGhpcyBjb250ZXh0LAp0aGUgcnVubmluZyBleGVjdXRvciBpcyBub3QgZG9u ZSB3aXRoIHRoZSBmcmFtZS4gV2hpbGUgd2UgYXJlCmluIHRoZSBleGVjdXRvcidzIGNvbnRl eHQsIHRoZXJlIGlzIHN0aWxsIHNvbWUgc3RhdGUgaW4gdGhlIEMgc3RhY2ssCnNpbmNlIHRo ZSBleGVjdXRvciB3YW50cyB0byBwdXNoIGEgcmVzdWx0IHRvIHRoZSBvYmplY3Qgc3RhY2su CgpUbyBnZXQgdGhpcyBjb3JyZWN0LCB3ZSBoYXZlIGluc3RlYWQgdG8gYmFpbCBvdXQgb2Yg dGhlCmV4ZWN1dG9yLiBUaGUgZnJhbWUgaXMgdGhlbiBpbiBhIGNvbnNpc3RlbnQgc3RhdGUs IGV4YWN0bHkKdGhlIG9uZSB3ZSBuZWVkOgpJdCB3YW50cyB0byBiZSByZS1lbnRlcmVkIHdp dGggYSB2YWx1ZS4KKi8KCnN0YXRpYyBQeU9iamVjdCAqCmJ1aWx0aW5fZ2V0Y2MgKHNlbGYs IGFyZ3MpCglQeU9iamVjdCAqc2VsZjsKCVB5T2JqZWN0ICphcmdzOwp7CglpZiAoIVB5QXJn X1BhcnNlVHVwbGUgKGFyZ3MsICIiKSkgewoJCXJldHVybiBOVUxMOwoJfSBlbHNlIHsKCQlQ eUZyYW1lT2JqZWN0ICpmLCAqYmFjazsKCQlQeVRocmVhZFN0YXRlICp0c3RhdGUgPSBQeVRo cmVhZFN0YXRlX0dFVCgpOwoJCWYgPSBidWlsZF9jb250aW51YXRpb25fZnJhbWUodHN0YXRl LT5mcmFtZSk7CgkJaWYgKGY9PU5VTEwpCgkJCXJldHVybiBOVUxMOwoKCQkvKiBlbnN1cmUg dGhhdCBiYWNrJ3MgY2FsbGVyIGlzIGEgY29udGludWF0aW9uIGFzIHdlbGwgKi8KCQliYWNr ID0gZi0+Zl9iYWNrOwoJCWlmIChiYWNrICE9IE5VTEwgJiYgYmFjay0+Zl9iYWNrICE9IE5V TEwpIHsKCQkJaWYgKGJ1aWxkX2NvbnRpbnVhdGlvbl9mcmFtZShiYWNrLT5mX2JhY2spID09 IE5VTEwpCgkJCQlyZXR1cm4gTlVMTDsKCQl9CgkKCQlQeV9JTkNSRUYgKGYpOwoJCS8qIGdv b2Qgc28gZmFyLCBqdXN0IHRoYXQgd2UgbmVlZCB0byBsZXQgdGhlIGV4ZWN1dGlvbiBmaW5p c2guICovCgkJLyogdGhlcmVmb3JlLCB3ZSBjaGFuZ2UgdGhlIGV4ZWN1dG9yIGFnYWluLCB0 byBnZXQgYSBjYWxsYmFjayAqLwoJCWYtPmZfZXhlY3V0ZSA9IGdldGNjX2NhdGNoX2ZyYW1l IDsKCQlmLT5mX3RlbXBfdmFsID0gKFB5T2JqZWN0ICopIGY7CgkJUHlfREVDUkVGKGYtPmZf ZGlzcGF0Y2hlcik7CgkJcmV0dXJuIFB5X1Vud2luZFRva2VuOwoJfQp9CgpQeU9iamVjdCAq 
IGdldGNjX2NhdGNoX2ZyYW1lKGYsIHBhc3NlZF9yZXR2YWwpCglQeUZyYW1lT2JqZWN0ICog ZjsKCVB5T2JqZWN0ICogcGFzc2VkX3JldHZhbDsKewoJUHlGcmFtZU9iamVjdCAqZmI7Cgkv KgoJICBIZXJlIHdlIGFyZSBhZ2FpbiwgYWZ0ZXIgdGhlIGN1cnJlbnQgZXhlY3V0b3IgaGFz IGZpbmlzaGVkCgkgIHRoZSBsYXN0IGZ1bmN0aW9uIGNhbGwgdG8gZ2V0Y2MuIFdlIGZpbmFs aXplIG91ciBmcmFtZXMKCSAgbm93IGFuZCByZXR1cm4gb3VyIGNvbnRpbnVhdGlvbiBmcmFt ZS4KCSovCgoJLyogYmUgc3VyZSB0aGF0IGYgYW5kIHBhc3NlZF9yZXR2YWwgYXJlIGlkZW50 aWNhbCAqLwoJaWYoZiE9KFB5RnJhbWVPYmplY3QqKXBhc3NlZF9yZXR2YWwpIHsKCQlQeUVy cl9TZXRTdHJpbmcgKFB5RXhjX1N5c3RlbUVycm9yLCAid3JvbmcgdmFsdWUgaW4gZ2V0Y2Mh Iik7CgkJcmV0dXJuIE5VTEw7IC8qIGNhbm5vdCBoYXBwZW4sIGJ1dCBiZSBzYWZlICovCgl9 CgkvKiBub3cgZmluYWxpemUgdGhpcyBjb250aW51YXRpb24gKi8KCWYtPmZfZXhlY3V0ZSA9 IHRocm93X2NvbnRpbnVhdGlvbjsKCWZiID0gZi0+Zl9iYWNrOwoJUHlfSU5DUkVGKGZiKTsg Lyogc2luY2UgZXZhbF9jb2RlIHdpbGwgZWF0IG9uZSAqLwoJaWYgKHNldF9leGVjdXRpb25f c3RhdGUoZiwgZmIpID09IE5VTEwpCgkJcmV0dXJuIDA7CgkvKiBnaXZlIHRoZSBmcmFtZSBi YWNrIG5vdywgYnV0IHJ1biB0aGUgY29weSAqLwoJZi0+Zl90c3RhdGUtPmZyYW1lID0gZmI7 CglmYi0+Zl90ZW1wX3ZhbCA9IHBhc3NlZF9yZXR2YWw7CglQeV9ERUNSRUYoZi0+Zl9kaXNw YXRjaGVyKTsgLyogd2lsbCBiZSBhc3NpZ25lZCBhbmQgaW5jcmVmJ2Qgb24gdW53aW5kICov CglyZXR1cm4gUHlfVW53aW5kVG9rZW47Cn0KCgovKiAKdmVyeSBtdWNoIHNpbXBsZXIgYW5k IGVhc2llciB0byB1c2UgaXMgdGhpcwpnZXRwY2MsIGdldCBhIHBhcmVudCdzIGN1cnJlbnQg Y29udGludWF0aW9uCiovCgpzdGF0aWMgUHlPYmplY3QgKgpidWlsdGluX2dldHBjYyAoc2Vs ZiwgYXJncykKCVB5T2JqZWN0ICpzZWxmOwoJUHlPYmplY3QgKmFyZ3M7CnsKCWludCBsZXZl bCA9IDE7CglpZiAoIVB5QXJnX1BhcnNlVHVwbGUgKGFyZ3MsICJ8aSIsICZsZXZlbCkpIHsK CQlyZXR1cm4gTlVMTDsKCX0gZWxzZSBpZiAobGV2ZWwgPD0gMCkgewoJCS8qIHVzZSB0aGUg dHJpY2t5IGdldGNjIHZlcnNpb24uICovCgkJUHlPYmplY3QgKnJldDsKCQlhcmdzID0gUHlU dXBsZV9OZXcoMCk7IC8qIG5ldmVyIGZhaWxzICovCgkJcmV0ID0gYnVpbHRpbl9nZXRjYyhz ZWxmLCBhcmdzKTsKCQlQeV9ERUNSRUYoYXJncyk7CgkJcmV0dXJuIHJldDsKCX0gZWxzZSB7 CgkJUHlGcmFtZU9iamVjdCAqZjsKCQlQeVRocmVhZFN0YXRlICp0c3RhdGUgPSBQeVRocmVh ZFN0YXRlX0dFVCgpOwoJCS8qIG1ha2UgYSBjb250aW51YXRpb24gZnJvbSB0aGUgdHJ1ZSBj 
YWxsZXIgKi8KCQlmID0gdHN0YXRlLT5mcmFtZTsKCQl3aGlsZSAoZiAhPSBOVUxMICYmIGxl dmVsLS0pIAoJCQlmID0gZmluZF9jYWxsZXJfZW50cnkoZik7CgkJaWYgKGY9PU5VTEwpIHsK CQkJUHlFcnJfU2V0U3RyaW5nIChQeUV4Y19WYWx1ZUVycm9yLCAicGFyYW1ldGVyIGV4Y2Vl ZHMgY3VycmVudCBuZXN0aW5nIGxldmVsIik7CgkJCXJldHVybiBOVUxMOwoJCX0KCQkvKiBt YWtlIHN1cmUgdGhpcyBpcyBhIGNvbnRpbnVhdGlvbiAqLwoJCWYgPSBidWlsZF9jb250aW51 YXRpb25fZnJhbWUoZik7CgkJUHlfWElOQ1JFRiAoZik7CgkJewoJCQkvKiBhY3RpdmF0ZSB0 aGUgY2FsbGd1YXJkIHNlY3VyaXR5IGZvciB0aGUgd2hvbGUgY2hhaW4gKi8KCQkJUHlGcmFt ZU9iamVjdCAqeCA9IGY7CgkJCXdoaWxlICh4ICE9IE5VTEwgJiYgeC0+Zl9jYWxsZ3VhcmQ9 PU5VTEwpIHsKCQkJCXgtPmZfY2FsbGd1YXJkID0gcHJvdGVjdF90aGlzX2ZyYW1lOwoJCQkJ eD14LT5mX2JhY2s7CgkJCX0KCQl9CgkJcmV0dXJuIChQeU9iamVjdCAqKSBmOwoJfQp9Cgov KiB2aWV3aW5nIHRoZSBzdGFjayBvZiBhIGZyYW1lICovCgpzdGF0aWMgUHlPYmplY3QgKgpi dWlsdGluX2dldHN0YWNrIChzZWxmLCBhcmdzKQoJUHlPYmplY3QgKnNlbGY7CglQeU9iamVj dCAqYXJnczsKewoJUHlGcmFtZU9iamVjdCAqZjsKCWlmICghUHlBcmdfUGFyc2VUdXBsZSAo YXJncywgIk8hIiwgJlB5RnJhbWVfVHlwZSwgJmYpKSB7CgkJcmV0dXJuIE5VTEw7Cgl9IGVs c2UgewoJCXJldHVybiBzdGFja190b190dXBsZShmKTsKCX0gCn0KCgoKc3RhdGljIFB5TWV0 aG9kRGVmIGNvbnRpbnVhdGlvbl9tZXRob2RzW10gPSB7CiAgeyJwdXRjYyIsCShQeUNGdW5j dGlvbilidWlsdGluX3B1dGNjLCAxfSwKICB7ImdldGNjIiwJKFB5Q0Z1bmN0aW9uKWJ1aWx0 aW5fZ2V0Y2MsIDF9LAogIHsiZ2V0cGNjIiwJKFB5Q0Z1bmN0aW9uKWJ1aWx0aW5fZ2V0cGNj LCAxfSwKICB7ImdldHN0YWNrIiwJKFB5Q0Z1bmN0aW9uKWJ1aWx0aW5fZ2V0c3RhY2ssIDF9 LAogIHtOVUxMLAkJTlVMTH0JCS8qIHNlbnRpbmVsICovCn07CgoKLyogSW5pdGlhbGl6YXRp b24gZnVuY3Rpb24gZm9yIHRoZSBtb2R1bGUgKCptdXN0KiBiZSBjYWxsZWQgaW5pdGNvbnRp bnVhdGlvbikgKi8KCi8qIHZlcnNpb24gY2hlY2sgKi8KCmludCBjaGVja192ZXJzaW9uKCkK ewoJY29uc3QgY2hhciAqc3lzOwoJc3lzID0gUHlfR2V0VmVyc2lvbigpOwoJaWYgKHN0cm5j bXAoc3lzLCBQWV9WRVJTSU9OLCBzdHJsZW4oUFlfVkVSU0lPTikpICE9IDApIHsKCQlyZXR1 cm4gMDsKCX0KCXJldHVybiAxOwp9CgoKI2lmZGVmIF9NU0NfVkVSCl9kZWNsc3BlYyhkbGxl eHBvcnQpCiNlbmRpZgp2b2lkCmluaXRjb250aW51YXRpb24oKQp7CglQeU9iamVjdCAqbSwg KmQ7CgoKCWlmICghY2hlY2tfdmVyc2lvbigpKSB7CgkJUHlFcnJfU2V0U3RyaW5nIChQeUV4 
Y19JbXBvcnRFcnJvciwgImFyZ3VtZW50IG11c3QgYmUgYSBmcmFtZSBvYmplY3QiKTsKCQly ZXR1cm47Cgl9CgoJLyogQ3JlYXRlIHRoZSBtb2R1bGUgYW5kIGFkZCB0aGUgZnVuY3Rpb25z ICovCgltID0gUHlfSW5pdE1vZHVsZSgiY29udGludWF0aW9uIiwgY29udGludWF0aW9uX21l dGhvZHMpOwoKCS8qIEFkZCBzb21lIHN5bWJvbGljIGNvbnN0YW50cyB0byB0aGUgbW9kdWxl ICovCglkID0gUHlNb2R1bGVfR2V0RGljdChtKTsKCUVycm9yT2JqZWN0ID0gUHlFcnJfTmV3 RXhjZXB0aW9uKCJjb250aW51YXRpb24uZXJyb3IiLCBOVUxMLCBOVUxMKTsKCVB5RGljdF9T ZXRJdGVtU3RyaW5nKGQsICJlcnJvciIsIEVycm9yT2JqZWN0KTsKfQo= --------------EDD81E724667AA03453E0C67-- From guido@CNRI.Reston.VA.US Mon Jul 12 21:04:21 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 12 Jul 1999 16:04:21 -0400 Subject: [Python-Dev] Python bugs database started Message-ID: <199907122004.QAA09348@eric.cnri.reston.va.us> Barry has installed Jitterbug on python.org and now we can use it to track Python bugs. I already like it much better than the todo wizard, because the response time is much better (the CGI program is written in C). Please try it out -- submit bugs, search for bugs, etc. The URL is http://www.python.org/python-bugs/. Some of you already subscribed to the mailing list (python-bugs-list) -- beware that this list receives a message for each bug reported and each followup. The HTML is preliminary -- it is configurable (somewhat) and I would like to make it look nicer, but don't have the time right now. There are certain features (such as moving bugs to different folders) that are only accessible to authorized users. If you have a good reason I might authorize you. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim_one@email.msn.com Tue Jul 13 05:03:25 1999 From: tim_one@email.msn.com (Tim Peters) Date: Tue, 13 Jul 1999 00:03:25 -0400 Subject: [Python-Dev] Python bugs database started In-Reply-To: <199907122004.QAA09348@eric.cnri.reston.va.us> Message-ID: <000701becce4$a973c920$31a02299@tim> > Please try it out -- submit bugs, search for bugs, etc. 
> The URL is
> http://www.python.org/python-bugs/.

Cool!

About those "Jitterbug bugs" (repeated submissions): those popped up for
me, DA, and MH. The first and the last are almost certainly using IE5 as
their browser, and that DA shows increasing signs of becoming a Windows
Mutant too.

The first time I submitted a bug, I backed up to the entry page and hit
Refresh to get the category counts updated (never saw Jitterbug before, so
must play!). IE5 whined about something-or-other being out of date, and
would I like to "repost the data"? I said sure.

I did that a few other times after posting other bugs, and-- while I don't
know for sure --it looks likely that you got a number of resubmissions
equal to the number of times I told IE5 "ya, ya, repost whatever you want".

Next time I post a bug I'll just close the browser and come back an hour
later. If "the repeat bug" goes away then, it's half IE5's fault for being
confused about which page it's on, and half mine for assuming IE5 knows what
it's doing.

meta-bugging-ly y'rs - tim

From tim_one@email.msn.com Tue Jul 13 05:03:30 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Tue, 13 Jul 1999 00:03:30 -0400
Subject: [Python-Dev] Generator details
In-Reply-To: <199907120301.XAA06001@eric.cnri.reston.va.us>
Message-ID: <000801becce4$aafd7660$31a02299@tim>

[Guido]
> ...
> Hmm... I think that if the generator is started by a for loop, it's
> okay for the loop to assume it is the master of the universe -- just
> like there's no force in the world (apart from illegal C code :) that
> can change the hidden loop counter in present-day for loop.

If it comes to a crunch, me too. I think your idea of forcing an exception
in the frame's destructor (to get the stacks cleaned up, and any suspended
"finally" blocks executed) renders this a non-issue, though (it will "just
work", and if people resort to illegal C code, it will *still* work).
hadn't-noticed-you-can't-spell-"illegal-code"-without-"c"-ly y'rs - tim

From tim_one@email.msn.com Tue Jul 13 05:03:33 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Tue, 13 Jul 1999 00:03:33 -0400
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
In-Reply-To: <199907120336.XAA06056@eric.cnri.reston.va.us>
Message-ID: <000901becce4$ac88aa40$31a02299@tim>

Backtracking a bit:

[Guido]
> This is another key description of continuations (maybe not quite
> worth a hug :).

I suppose a kiss is out of the question, then.

> The continuation captures exactly all state that is represented by
> "position in the program" and no state that is represented by variables.

Right!

> But there are many hairy details. In antiquated assembly, there might
> not be a call stack, and a continuation could be represented by a
> single value: the program counter. But now we have a call stack, a
> value stack, a block stack (in Python) and who knows what else.
>
> I'm trying to understand whether we can get away with saving just a
> pointer to a frame, whether we need to copy the frame, or whether we
> need to copy the entire frame stack.

As you convinced yourself in following paragraphs, for 1st-class
continuations "the entire frame stack" *may* be necessary.

> ...
> How does Scheme do this?

I looked up R. Kent Dybvig's doctoral dissertation, at

    ftp://ftp.cs.indiana.edu/pub/scheme-repository/doc/pubs/3imp.ps.gz

He gives detailed explanations of 3 Scheme implementations there (from
whence "3imp", I guess). The first is all heap-based, and looks much like
the simple Wilson implementation I summarized yesterday. Dybvig profiled it
and discovered it spent half its time in, together, function call overhead
and name resolution.

So he took a different approach: Scheme is, at heart, just another
lexically scoped language, like Algol or Pascal. So how about implementing
it with a perfectly conventional shared, contiguous stack?
Because that doesn't work: the advanced features (lexical closures with
indefinite extent, and user-captured continuations) aren't stack-like.
Tough, forget those at the start, and do whatever it takes later to *make*
'em work.

So he did. When his stack implementation hit a user's call/cc, it made a
physical copy of the entire stack. And everything ran much faster! He
points out that "real programs" come in two flavors:

1) Very few, or no, call/cc thingies. Then most calls are no worse than
   Algol/Pascal/C functions, and the stack implementation runs them at
   Algol/Pascal/C speed (if we knew of anything faster than a plain stack,
   the latter would use it).

2) Lots of call/cc thingies. Then "the stack" is likely to be shallow (the
   program is spending most of its time co-transferring, not recursing
   deeply), and because the stack is contiguous he can exploit the
   platform's fastest block-copy operation (no need to chase pointer
   links, etc).

So, in some respects, Dybvig's stack implementation of Scheme was more
Pythonic than Python's current implementation.

His third implementation was for some propeller-head theoretical "string
machine", so I won't even mention it.

worrying-about-the-worst-case-can-hurt-the-normal-cases-ly y'rs - tim

From da@ski.org Tue Jul 13 05:15:28 1999
From: da@ski.org (David Ascher)
Date: Mon, 12 Jul 1999 21:15:28 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] Python bugs database started
In-Reply-To: <000701becce4$a973c920$31a02299@tim>
Message-ID:

> About those "Jitterbug bugs" (repeated submissions): those popped up for
> me, DA, and MH. The first and the last are almost certainly using IE5 as
> their browser, and that DA shows increasing signs of becoming a Windows
> Mutant too.
>
> Next time I post a bug I'll just close the browser and come back an hour
> later. If "the repeat bug" goes away then, it's half IE5's fault for being
> confused about which page it's on, and half mine for assuming IE5 knows what
> it's doing.
FYI, I did the same thing but w/ Communicator. (I do use Windows, but
refuse to use IE =). This one's not specifically MS' fault.

From tim_one@email.msn.com Tue Jul 13 05:47:43 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Tue, 13 Jul 1999 00:47:43 -0400
Subject: [Python-Dev] Generator details
In-Reply-To: <14218.3286.847367.125679@anthem.cnri.reston.va.us>
Message-ID: <001501beccea$d83f6740$31a02299@tim>

[Barry]
> Minor point, but why not make resume() and shutdown() methods on the
> frame? Isn't this much cleaner?
>
>     value, frame = generator(args)
>     try:
>         while frame:
>             etc
>             value, frame = frame.resume()
>     finally:
>         if frame:
>             frame.shutdown()

Yes -- and at least it's better than arguing over what to name them.

btw-tabs-in-email-don't-look-the-way-you-expect-them-to-ly y'rs - tim

From tim_one@email.msn.com Tue Jul 13 07:47:43 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Tue, 13 Jul 1999 02:47:43 -0400
Subject: [Python-Dev] End of the line
In-Reply-To: <378A447C.D4DD24D8@appliedbiometrics.com>
Message-ID: <000001beccfb$9beb4fa0$2f9e2299@tim>

The latest versions of the Icon language (9.3.1 & beyond) sprouted an
interesting change in semantics: if you open a file for reading in
"translated" (text) mode now, it normalizes Unix, Mac and Windows line
endings to plain \n. Writing in text mode still produces what's natural for
the platform.

Anyone think that's *not* a good idea?

c-will-never-get-fixed-ly y'rs - tim

From Vladimir.Marangozov@inrialpes.fr Tue Jul 13 12:54:00 1999
From: Vladimir.Marangozov@inrialpes.fr (Vladimir Marangozov)
Date: Tue, 13 Jul 1999 12:54:00 +0100 (NFT)
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
In-Reply-To: <000101bec6b3$4e752be0$349e2299@tim> from "Tim Peters" at "Jul 5, 99 02:55:02 am"
Message-ID: <199907131154.MAA22698@pukapuka.inrialpes.fr>

After a short vacation, I'm trying to swallow the latest discussion about
control flow management & derivatives.
Could someone help me please by answering two naive questions that popped
up spontaneously in my head:

Tim Peters wrote:

[a biased short course on generators, continuations, coroutines]

>
> ...
>
> GENERATORS
>
> Generators add two new abstract operations, "suspend" and "resume". When a
> generator suspends, it's exactly like a return today except we simply
> decline to decref the frame. That's it! The locals, and where we are in
> the computation, aren't thrown away. A "resume" then consists of
> *re*starting the frame at its next bytecode instruction, with the retained
> frame's locals and eval stack just as they were.
>
> ...
>
> too-simple-to-be-obvious?-ly y'rs - tim

Yes. I'm trying to understand the following:

1. What does a generator generate?

2. Clearly, what's the difference between a generator and a thread?

--
       Vladimir MARANGOZOV          | Vladimir.Marangozov@inrialpes.fr
http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252

From tismer@appliedbiometrics.com Tue Jul 13 12:41:32 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Tue, 13 Jul 1999 13:41:32 +0200
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
References: <199907131154.MAA22698@pukapuka.inrialpes.fr>
Message-ID: <378B25EC.2739BCE3@appliedbiometrics.com>

Vladimir Marangozov wrote:
...
> > too-simple-to-be-obvious?-ly y'rs - tim
>
> Yes. I'm trying to understand the following:
>
> 1. What does a generator generate?

Trying my little understanding.
A generator generates a series of results if you ask for them.
That's done by a resume call (generator, resume your computation),
and the generator continues until it either comes to a suspend
(return a value, but be prepared to continue from here) or it
does a final return.

> 2. Clearly, what's the difference between a generator and a thread?

Threads can be scheduled automatically, and they don't return
values to each other, natively.
Generators are asymmetric to their callers; they're much like
functions.
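
[Archive note: a minimal sketch of the suspend/resume pair described above,
written with the `yield` statement and `next()` that later Python versions
provide. This is an illustration under that assumption, not the API being
proposed in this thread; the name `countdown` is made up for the example.]

```python
def countdown(n):
    """Hand back n, n-1, ..., 1, one value per resume."""
    while n > 0:
        yield n      # "suspend": return a value but keep the frame and its locals
        n -= 1
    # falling off the end is the "final return"


gen = countdown(3)   # nothing runs yet; the caller decides when to resume
first = next(gen)    # "resume": run the frame up to its first suspend
rest = list(gen)     # keep resuming until the final return
```

The caller drives every step and always gets control back, which is the
asymmetry with threads mentioned above.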
Coroutines are more symmetric. They "return" values to each other.
They are not determined as caller and callee, but they cooperate
on the same level.
Therefore, threads and coroutines look more similar, just that
coroutines usually aren't scheduled automatically. Add a scheduler,
don't pass values, and you have threads, nearly.
(Of course I dropped the I/O blocking stuff which doesn't
apply and isn't the intent of fake threads.)

ciao - chris

--
Christian Tismer             :^)
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home

From guido@CNRI.Reston.VA.US Tue Jul 13 13:53:52 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 13 Jul 1999 08:53:52 -0400
Subject: [Python-Dev] End of the line
In-Reply-To: Your message of "Tue, 13 Jul 1999 02:47:43 EDT." <000001beccfb$9beb4fa0$2f9e2299@tim>
References: <000001beccfb$9beb4fa0$2f9e2299@tim>
Message-ID: <199907131253.IAA10730@eric.cnri.reston.va.us>

> The latest versions of the Icon language (9.3.1 & beyond) sprouted an
> interesting change in semantics: if you open a file for reading in
> "translated" (text) mode now, it normalizes Unix, Mac and Windows line
> endings to plain \n. Writing in text mode still produces what's natural for
> the platform.
>
> Anyone think that's *not* a good idea?

I've been thinking about this myself -- exactly what I would do.

Not clear how easy it is to implement (given that I'm not so enthused
about the idea of rewriting the entire I/O system without using stdio
-- see archives).

The implementation must be as fast as the current one -- people used
to complain bitterly when readlines() or read() were just a tad slower
than they *could* be.
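
[Archive note: a rough Python sketch of the read-side normalization under
discussion, to make the problem concrete. It is not how Icon or Python's
stdio-based file objects do it; `normalize_newlines` and `chunk_size` are
hypothetical names. The only subtle part is a carriage return at the end of
a chunk: it cannot be classified until the next chunk shows whether a
linefeed follows.]

```python
import io


def normalize_newlines(stream, chunk_size=4096):
    """Yield bytes from a binary stream with CRLF and lone CR turned into LF."""
    pending_cr = False                    # previous chunk ended in a CR
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        if pending_cr:
            # Re-attach the held-back CR so a split CRLF pair is seen whole.
            chunk = b"\r" + chunk
            pending_cr = False
        if chunk.endswith(b"\r"):
            # Hold the CR back until the next chunk has been looked at.
            pending_cr = True
            chunk = chunk[:-1]
        yield chunk.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    if pending_cr:
        yield b"\n"                       # a final lone CR is a line ending too


data = b"unix\nmac\rwin\r\nend\r"
result = b"".join(normalize_newlines(io.BytesIO(data), chunk_size=3))
```

Working on whole chunks keeps the per-character cost low, which matters
given the speed concern above; a character-at-a-time version would be
simpler but slower.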
There's a lookahead of 1 character needed -- ungetc() might be
sufficient except that I think it's not guaranteed to work on
unbuffered files.

Should also do this for the Python parser -- there it would be a lot
easier.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Tue Jul 13 15:41:25 1999
From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw)
Date: Tue, 13 Jul 1999 10:41:25 -0400 (EDT)
Subject: [Python-Dev] Python bugs database started
References: <199907122004.QAA09348@eric.cnri.reston.va.us> <000701becce4$a973c920$31a02299@tim>
Message-ID: <14219.20501.697542.358579@anthem.cnri.reston.va.us>

>>>>> "TP" == Tim Peters writes:

    TP> The first time I submitted a bug, I backed up to the entry
    TP> page and hit Refresh to get the category counts updated (never
    TP> saw Jitterbug before, so must play!). IE5 whined about
    TP> something-or-other being out of date, and would I like to
    TP> "repost the data"? I said sure.

This makes perfect sense, and explains exactly what's going on. Let's
call it "poor design"[1] instead of "user error". A quick scan last
night of the Jitterbug site shows no signs of fixes or workarounds.
What would Jitterbug have to do to avoid these kinds of problems?
Maybe keep a checksum of the current submission and check it against
the next one to make sure it's not a re-submit. Maybe a big warning
sign reading "Do not repost this form!"

Hmm. I think I'll complain on the Jitterbug mailing list.

-Barry

[1] In the midst of re-reading D. Norman's "The Design of Everyday
Things", otherwise I would have said you guys were just incompetent
Webweenies :)

From da@ski.org Tue Jul 13 17:01:55 1999
From: da@ski.org (David Ascher)
Date: Tue, 13 Jul 1999 09:01:55 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] Python bugs database started
In-Reply-To: <14219.20501.697542.358579@anthem.cnri.reston.va.us>
Message-ID:

On Tue, 13 Jul 1999, Barry A.
Warsaw wrote: > > This makes perfect sense, and explains exactly what's going on. Let's > call it "poor design"[1] instead of "user error". A quick scan last > night of the Jitterbug site shows no signs of fixes or workarounds. > What would Jitterbug have to do to avoid these kinds of problems? > Maybe keep a checksum of the current submission and check it against > the next one to make sure it's not a re-submit. That'd be good -- alternatively, insert a 'safe' CGI script after the validation -- "Thanks for submitting the bug. Click here to go back to the home page". From guido@CNRI.Reston.VA.US Tue Jul 13 17:09:48 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 13 Jul 1999 12:09:48 -0400 Subject: [Python-Dev] Python bugs database started In-Reply-To: Your message of "Tue, 13 Jul 1999 09:01:55 PDT." References: Message-ID: <199907131609.MAA11208@eric.cnri.reston.va.us> > That'd be good -- alternatively, insert a 'safe' CGI script after the > validation -- "Thanks for submitting the bug. Click here to go back to > the home page". That makes a lot of sense! I'm now quite sure that I had the same "Repost form data?" experience, and just didn't realize that it mattered, because I was staring at the part of the form that was showing the various folders. The Jitterbug software is nice for tracking bugs, but its user interface *SUCKS*. I wish I had the time to redesign that part -- unfortunately it's probably totally integrated with the rest of the code... --Guido van Rossum (home page: http://www.python.org/~guido/) From bwarsaw@python.org Tue Jul 13 17:19:26 1999 From: bwarsaw@python.org (Barry A. Warsaw) Date: Tue, 13 Jul 1999 12:19:26 -0400 (EDT) Subject: [Python-Dev] Python bugs database started References: <199907131609.MAA11208@eric.cnri.reston.va.us> Message-ID: <14219.26382.122095.608613@anthem.cnri.reston.va.us> >>>>> "Guido" == Guido van Rossum writes: Guido> The Jitterbug software is nice for tracking bugs, but its Guido> user interface *SUCKS*.
I wish I had the time to redesign Guido> that part -- unfortunately it's probably totally integrated Guido> with the rest of the code... There is an unsupported fork that some guy did that totally revamped the interface: http://lists.samba.org/listproc/jitterbug/0095.html Still not great tho'. -Barry From MHammond@skippinet.com.au Wed Jul 14 03:25:50 1999 From: MHammond@skippinet.com.au (Mark Hammond) Date: Wed, 14 Jul 1999 12:25:50 +1000 Subject: [Python-Dev] Interrupting a thread Message-ID: <006d01becda0$318035e0$0801a8c0@bobcat> I've struck this a number of times, and the simple question is "can we make it possible to interrupt a thread without the thread's knowledge" or otherwise stated "how can we asynchronously raise an exception in another thread?" The specific issue is that quite often, I find it necessary to interrupt one thread from another. One example is Pythonwin - rather than use the debugger hooks as IDLE does, I use a secondary thread. But how can I use that thread to interrupt the code executing in the first? (With magic that only works sometimes is how :-) Another example came up on the newsgroup recently - discussion about making Medusa a true Windows NT Service. A trivial solution would be to have a "service thread", that simply runs Medusa's loop in a separate thread. When the "service thread" receives a shut-down request from NT, how can it interrupt Medusa? I probably should not have started with a Medusa example - it may have a solution. Pretend I said "any arbitrary script written to run similarly to a Unix daemon". There are one or 2 other cases where I have wanted to execute existing code that assumes it runs stand-alone, and can really only be stopped with a KeyboardInterrupt. I can't see a decent way to do this. [I guess this ties into the "signals and threads" limitations - I believe you can't direct signals at threads either?] Is it desirable?
Unfortunately, I can see that it might be hard :-( But-sounds-pretty-easy-under-those-fake-threads-ly, Mark. From tim_one@email.msn.com Wed Jul 14 04:56:20 1999 From: tim_one@email.msn.com (Tim Peters) Date: Tue, 13 Jul 1999 23:56:20 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <199907131154.MAA22698@pukapuka.inrialpes.fr> Message-ID: <000d01becdac$d4dee900$7d9e2299@tim> [Vladimir Marangozov] > Yes. I'm trying to understand the following: > > 1. What does a generator generate? Any sequence of objects: the lines in a file, the digits of pi, a postorder traversal of the nodes of a binary tree, the files in a directory, the machines on a LAN, the critical bugs filed before 3/1/1995, the set of builtin types, all possible ways of matching a regexp to a string, the 5-card poker hands beating a pair of deuces, ... anything! Icon uses the word "generators", and it's derived from that language's ubiquitous use of the beasts to generate paths in a backtracking search space. In OO languages it may be better to name them "iterators", after the closest common OO concept. The CLU language had full-blown (semi-coroutine, like Icon generators) iterators 20 years ago, and the idea was copied & reinvented by many later languages. Sather is probably the best known of those, and also calls them iterators. > 2. Clearly, what's the difference between a generator and a thread? If you can clearly explain what "a thread" is, I can clearly explain the similarities and differences. Well? I'm holding my breath here . Generators/iterators are simpler than threads, whether looked at from a user's viewpoint or an implementor's. Their semantics are synchronous and deterministic. 
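To make the "postorder traversal of the nodes of a binary tree" case concrete, here is a minimal sketch. It assumes generator syntax (`yield`, `yield from`) that Python only acquired well after this thread; the `Node` class and the `postorder` name are hypothetical, invented for the illustration.

```python
class Node:
    """Hypothetical binary tree node, used only for this illustration."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def postorder(node):
    """Yield values in postorder: left subtree, right subtree, then the node."""
    if node is None:
        return
    yield from postorder(node.left)
    yield from postorder(node.right)
    yield node.value


tree = Node(1, Node(2, Node(4), Node(5)), Node(3))
print(list(postorder(tree)))  # [4, 5, 2, 3, 1]
```

The consumer pulls one value at a time and the traversal resumes exactly where it left off, which is the synchronous, deterministic behaviour described above.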
Python's for/__getitem__ protocol *is* an iterator protocol already, but if I ask you which is the 378th 5-card poker hand beating a pair of deuces, and ask you a new question like that every hour, you may start to suspect there may be a better way to *approach* coding enumerations in general . then-again-there-may-not-be-ly y'rs - tim From tim_one@email.msn.com Wed Jul 14 04:56:15 1999 From: tim_one@email.msn.com (Tim Peters) Date: Tue, 13 Jul 1999 23:56:15 -0400 Subject: [Python-Dev] End of the line In-Reply-To: <199907131253.IAA10730@eric.cnri.reston.va.us> Message-ID: <000c01becdac$d2ad6300$7d9e2299@tim> [Tim] > ... Icon ... sprouted an interesting change in semantics: if you open > a file for reading in ...text mode ... it normalizes Unix, Mac and > Windows line endings to plain \n. Writing in text mode still produces > what's natural for the platform. [Guido] > I've been thinking about this myself -- exactly what I would do. Me too . > Not clear how easy it is to implement (given that I'm not so enthused > about the idea of rewriting the entire I/O system without using stdio > -- see archives). The Icon implementation is very simple: they *still* open the file in stdio text mode. "What's natural for the platform" on writing then comes for free. On reading, libc usually takes care of what's needed, and what remains is to check for stray '\r' characters that stdio glossed over. That is, in fileobject.c, replacing

    if ((*buf++ = c) == '\n') {
        if (n < 0)
            buf--;
        break;
    }

with a block like (untested!)

    *buf++ = c;
    if (c == '\n' || c == '\r') {
        if (c == '\r') {
            *(buf-1) = '\n';
            /* consume following newline, if any */
            c = getc(fp);
            if (c != '\n')
                ungetc(c, fp);
        }
        if (n < 0)
            buf--;
        break;
    }

Related trickery needed in readlines. Of course the '\r' business should be done only if the file was opened in text mode.
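The same idea can be sketched at the Python level. This is only an illustration of the one-character-lookahead technique, not the fileobject.c change itself: `read_normalized_line` is a hypothetical helper, it works on a seekable binary file object, and `seek(-1, ...)` stands in for ungetc().

```python
import io


def read_normalized_line(fp):
    """Read one line from a seekable binary stream, mapping \\r and \\r\\n to \\n."""
    chars = []
    while True:
        c = fp.read(1)
        if not c:
            break  # end of file; return whatever was collected
        if c == b"\r":
            # Look ahead one character and swallow a following newline, if any.
            nxt = fp.read(1)
            if nxt and nxt != b"\n":
                fp.seek(-1, io.SEEK_CUR)  # push it back, like ungetc()
            chars.append(b"\n")
            break
        chars.append(c)
        if c == b"\n":
            break
    return b"".join(chars)


stream = io.BytesIO(b"unix\nmac\rdos\r\nlast")
lines = []
while True:
    line = read_normalized_line(stream)
    if not line:
        break
    lines.append(line)
print(lines)  # [b'unix\n', b'mac\n', b'dos\n', b'last']
```

Relying on `seek` is the weak spot of the sketch, in the same way that ungetc() is the weak spot of the C version.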
> The implementation must be as fast as the current one -- people used > to complain bitterly when readlines() or read() were just a tad > slower than they *could* be. The above does add one compare per character. Haven't timed it. readlines may be worse. BTW, people complain bitterly anyway, but it's in comparison to Perl text mode line-at-a-time reads!

    D:\Python>wc a.c
    1146880 3023873 25281537 a.c
    D:\Python>

Reading that via

    def g():
        f = open("a.c")
        while 1:
            line = f.readline()
            if not line:
                break

and using python -O took 51 seconds. Running the similar Perl (although it's not idiomatic Perl to assign each line to an explicit var, or to test that var in the loop, or to use "if !" instead of "unless" -- did all those to make it more like the Python):

    open(DATA, "<a.c");
    while ($line = <DATA>) {last if ! $line;}

took 17 seconds. So when people are complaining about a factor of 3, I'm not inclined to get excited about a few percent . > There's a lookahead of 1 character needed -- ungetc() might be > sufficient except that I think it's not guaranteed to work on > unbuffered files. Don't believe I've bumped into that. *Have* bumped into problems with ungetc not playing nice with fseek/ftell, and that's probably enough to kill it right there (alas). > Should also do this for the Python parser -- there it would be a lot > easier. And probably the biggest bang for the buck. the-problem-with-exposing-libc-is-that-libc-isn't-worth-exposing Message-ID: <007401becdb6$22445c80$0801a8c0@bobcat> I asked Guido to provide comments on one of the chapters in our book: I was discussing appending the mode ("t" or "b") to the open() call > p.10, bottom: text mode is the default -- I've never seen the 't' > option described! (So even if it exists, better be silent about it.) > You need to append 'b' to get binary mode instead. This brings up an interesting issue. MSVC exposes a global variable that contains the default mode - ie, you can change the default to binary.
(_fmode for those with the docs) This has some implications and questions: * Will Guido ever bow to pressure (when it arrives :) to expose this via the "msvcrt" module? I can imagine where it may be useful in a limited context. A reasonable argument would be that, like _setmode and other MS specific stuff, if it exists it should be exposed. * But even if not, due to the shared CRTL, in COM and other worlds we really cant predict what the default is. Although Python does not touch it, that does not stop someone else touching it. A web-server built using MSVC on Windows may use it? Thus, it appears that to be 100% sure what mode you are using, you should not rely on the default, but should _always_ use "b" or "t" on the file mode. Any thoughts or comments? The case for abandoning the CRTL's text mode gets stronger and stronger! Mark. From tim_one@email.msn.com Wed Jul 14 07:35:31 1999 From: tim_one@email.msn.com (Tim Peters) Date: Wed, 14 Jul 1999 02:35:31 -0400 Subject: [Python-Dev] Interrupting a thread In-Reply-To: <006d01becda0$318035e0$0801a8c0@bobcat> Message-ID: <000901becdc3$119be9e0$a09e2299@tim> [Mark Hammond] > Ive struck this a number of times, and the simple question is "can we > make it possible to interrupt a thread without the thread's knowledge" > or otherwise stated "how can we asynchronously raise an exception in > another thread?" I don't think there's any portable way to do this. Even restricting the scope to Windows, forget Python for a moment: can you do this reliably with NT threads from C, availing yourself of every trick in the SDK? Not that I know of; not without crafting a new protocol that the targeted threads agree to in advance. > ... > But-sounds-pretty-easy-under-those-fake-threads-ly, Yes, piece o' cake! Fake threads can do anything, because unless we write every stick of their implementation they can't do anything at all . 
odd-how-solutions-create-more-problems-than-they-solve-ly y'rs - tim From tim_one@email.msn.com Wed Jul 14 07:35:33 1999 From: tim_one@email.msn.com (Tim Peters) Date: Wed, 14 Jul 1999 02:35:33 -0400 Subject: [Python-Dev] RE: Python on Windows chapter. In-Reply-To: <007401becdb6$22445c80$0801a8c0@bobcat> Message-ID: <000a01becdc3$12d94be0$a09e2299@tim> [Mark Hammond] > ... > MSVC exposes a global variable that contains the default [fopen] mode - > ie, you can change the default to binary. (_fmode for those with the > docs) > > This has some implications and questions: > * Will Guido ever bow to pressure (when it arrives :) to expose this via > the "msvcrt" module? No. It changes the advertised semantics of Python builtins, and no option ever does that. If it went in at all, it would have to be exposed as a Python-level feature that changed the semantics similarly on all platforms -- and even then Guido wouldn't put it in . > ... > Thus, it appears that to be 100% sure what mode you are using, you should > not rely on the default, but should _always_ use "b" or "t" on the file > mode. And on platforms that have libc options to treat "t" as if it were "b"? There's no limit to how perverse platform options can get! There's no fully safe ground to stand on, so Python stands on the minimal guarantees libc provides. If a user violates those, tough, they can't use Python. Unless, of course, they contribute a lot of money to the PSA . > ... > Any thoughts or comments? The case for abandoning the CRTL's text mode > gets stronger and stronger! C's text mode is, alas, a bad joke. The only thing worse is Microsoft's half-assed implementation of it <0.5 wink>. 
ctrl-z-=-eof-even-gets-in-the-way-under-windows!-ly y'rs - tim From MHammond@skippinet.com.au Wed Jul 14 07:58:25 1999 From: MHammond@skippinet.com.au (Mark Hammond) Date: Wed, 14 Jul 1999 16:58:25 +1000 Subject: [Python-Dev] Interrupting a thread In-Reply-To: <000901becdc3$119be9e0$a09e2299@tim> Message-ID: <007e01becdc6$45982490$0801a8c0@bobcat> > I don't think there's any portable way to do this. Even > restricting the > scope to Windows, forget Python for a moment: can you do > this reliably with > NT threads from C, availing yourself of every trick in the > SDK? Not that I Nope - not if I forget Python. However, when I restrict myself _to_ Python, I find this nice little ceval.c loop and nice little per-thread structures - even with nice-looking exception place-holders ;-) Something tells me that it wont be quite as easy as filling these in (while you have the lock, of course!), but it certainly seems far more plausible than if we consider it a C problem :-) > odd-how-solutions-create-more-problems-than-they-solve-ly y'rs - tim Only because they often open your eyes to a whole new class of problem . Continuations/generators/co-routines (even threads themselves!) would appear to be a good example - for all their power, I shudder to think at the number of questions they will generate! If I understand correctly, it is a recognised deficiency WRT signals and threads - so its all Guido's fault for adding these damn threads in the first place :-) just-more-proof-there-is-no-such-thing-as-a-free-lunch-ly, Mark. From jack@oratrix.nl Wed Jul 14 09:07:59 1999 From: jack@oratrix.nl (Jack Jansen) Date: Wed, 14 Jul 1999 10:07:59 +0200 Subject: [Python-Dev] Python bugs database started In-Reply-To: Message by Guido van Rossum , Tue, 13 Jul 1999 12:09:48 -0400 , <199907131609.MAA11208@eric.cnri.reston.va.us> Message-ID: <19990714080759.D49B2303120@snelboot.oratrix.nl> > The Jitterbug software is nice for tracking bugs, but its user > interface *SUCKS*. 
I wish I had the time to redseign that part -- > unfortunately it's probably totally integrated with the rest of the > code... We looked into bug tracking systems recently, and basically they all suck. We went with gnats in the end, but it has pretty similar problems on the GUI side. But maybe we could convince some people with too much time on their hands to do a Python bug reporting system:-) -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From jack@oratrix.nl Wed Jul 14 09:21:16 1999 From: jack@oratrix.nl (Jack Jansen) Date: Wed, 14 Jul 1999 10:21:16 +0200 Subject: [Python-Dev] End of the line In-Reply-To: Message by "Tim Peters" , Tue, 13 Jul 1999 23:56:15 -0400 , <000c01becdac$d2ad6300$7d9e2299@tim> Message-ID: <19990714082116.6DE96303120@snelboot.oratrix.nl> > The Icon implementation is very simple: they *still* open the file in stdio > text mode. "What's natural for the platform" on writing then comes for > free. On reading, libc usually takes care of what's needed, and what > remains is to check for stray '\r' characters that stdio glossed over. This'll work for Unix and PC conventions, but not for the Mac. Mac end of line is \r, so reading a line from a mac file on unix will give you the whole file. -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From tismer@appliedbiometrics.com Wed Jul 14 13:13:10 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Wed, 14 Jul 1999 14:13:10 +0200 Subject: [Python-Dev] Interrupting a thread References: <006d01becda0$318035e0$0801a8c0@bobcat> Message-ID: <378C7ED6.F0DB4E6E@appliedbiometrics.com> Mark Hammond wrote: ... 
> Another example came up on the newsgroup recently - discussion about making > Medusa a true Windows NT Service. A trivial solution would be to have a > "service thread", that simply runs Medusa's loop in a seperate thread. Ah, thanks, that was what I'd like to know :-) > When the "service thread" recieves a shut-down request from NT, how can it > interrupt Medusa? Very simple. I do this shutdown stuff already, at a user request. Medusa has its polling loop which is so simple (wait until a timeout, then run again) that I pulled it out of Medusa, and added a polling function. I have even simulated timer objects by this, which do certain tasks from time to time (at the granularity of the loop of course). One of these looks if there is a global object in module __main__ with a special name which is executable. This happens to be the shutdown, which may be injected by another thread as well. I can send you an example. > I probably should not have started with a Medusa example - it may have a > solution. Pretend I said "any arbitary script written to run similarly to > a Unix daemon". There are one or 2 other cases where I have wanted to > execute existing code that assumes it runs stand-alone, and can really only > be stopped with a KeyboardInterrupt. I can't see a decent way to do this. Well, yes, I would want to have this too, and see also no way. > [I guess this ties into the "signals and threads" limitations - I believe > you cant direct signals at threads either?] > > Is it desirable? Unfortunately, I can see that it might be hard :-( > > But-sounds-pretty-easy-under-those-fake-threads-ly, You mean you would catch every signal in the one thread, and redirect it to the right fake thread. Given exactly two real threads, one always sitting waiting in a multiple select, the other running any number of fake threads. Would this be enough to do everything which is done with threads today? 
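The polling-loop arrangement described above might look roughly like the following sketch. It is not Medusa's code: `PollingLoop`, `check_shutdown` and the `shutdown_hook` attribute are hypothetical names, and the attribute stands in for the specially named executable object in module __main__.

```python
import time


class PollingLoop:
    """Hypothetical loop: wait for a timeout, run the pollers, repeat."""

    def __init__(self, timeout=0.01):
        self.timeout = timeout
        self.pollers = []          # callables run once per pass
        self.running = False
        self.shutdown_hook = None  # may be injected from another thread
        self.passes = 0

    def run(self, wait=time.sleep):
        self.running = True
        while self.running:
            wait(self.timeout)     # a real server would wait in select() here
            self.passes += 1
            for poller in list(self.pollers):
                poller(self)


def check_shutdown(loop):
    """Poller: if a shutdown callable has been injected, run it and stop."""
    hook = loop.shutdown_hook
    if callable(hook):
        hook()
        loop.running = False


events = []
loop = PollingLoop()
loop.pollers.append(check_shutdown)


def request_after_three(loop):
    # Stands in for "another thread" injecting the shutdown request.
    if loop.passes == 3:
        loop.shutdown_hook = lambda: events.append("shutdown")


loop.pollers.append(request_after_three)
loop.run()
print(loop.passes, events)  # 4 ['shutdown']
```

The request is only noticed at the granularity of the loop, so this helps code that is written around such a loop and does nothing for an arbitrary stand-alone script.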
maybe-almost-ly y'rs - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From guido@CNRI.Reston.VA.US Wed Jul 14 13:24:53 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 14 Jul 1999 08:24:53 -0400 Subject: [Python-Dev] RE: Python on Windows chapter. In-Reply-To: Your message of "Wed, 14 Jul 1999 15:10:38 +1000." <007401becdb6$22445c80$0801a8c0@bobcat> References: <007401becdb6$22445c80$0801a8c0@bobcat> Message-ID: <199907141224.IAA12211@eric.cnri.reston.va.us> > I asked Guido to provide comments on one of the chapters in our book: > > I was discussing appending the mode ("t" or "b") to the open() call > > > p.10, bottom: text mode is the default -- I've never seen the 't' > > option described! (So even if it exists, better be silent about it.) > > You need to append 'b' to get binary mode instead. In addition, 't' probably isn't even supported on many Unix systems! > This brings up an interesting issue. > > MSVC exposes a global variable that contains the default mode - ie, you can > change the default to binary. (_fmode for those with the docs) The best thing to do with this variable is to ignore it. In large programs like Python that link together pieces of code that never ever heard about each other, making global changes to the semantics of standard library functions is a bad thing. Code that sets it or requires you to set it is broken. > This has some implications and questions: > * Will Guido ever bow to pressure (when it arrives :) to expose this via > the "msvcrt" module? I can imagine where it may be useful in a limited > context. A reasonable argument would be that, like _setmode and other MS > specific stuff, if it exists it should be exposed. No. 
(And I've never bought that argument before -- I always use "is there sufficient need and no other way.") > * But even if not, due to the shared CRTL, in COM and other worlds we > really cant predict what the default is. Although Python does not touch > it, that does not stop someone else touching it. A web-server built using > MSVC on Windows may use it? But would be stupid for it to do so, and I would argue that the web server was broken. Since they should know better than this, I doubt they do this (this option is more likely to be used in small, self-contained programs). Until you find a concrete example, let's ignore the possibility. > Thus, it appears that to be 100% sure what mode you are using, you should > not rely on the default, but should _always_ use "b" or "t" on the file > mode. Stop losing sleep over it. > Any thoughts or comments? The case for abandoning the CRTL's text mode > gets stronger and stronger! OK, you write the code :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From akuchlin@mems-exchange.org Wed Jul 14 14:03:07 1999 From: akuchlin@mems-exchange.org (Andrew M. Kuchling) Date: Wed, 14 Jul 1999 09:03:07 -0400 (EDT) Subject: [Python-Dev] Python bugs database started In-Reply-To: <19990714080759.D49B2303120@snelboot.oratrix.nl> References: <199907131609.MAA11208@eric.cnri.reston.va.us> <19990714080759.D49B2303120@snelboot.oratrix.nl> Message-ID: <14220.35467.644552.307210@amarok.cnri.reston.va.us> Jack Jansen writes: >But maybe we could convince some people with too much time on their hands to >do a Python bug reporting system:-) Digicool has a relatively simple bug tracking system for Zope which you can try out at http://www.zope.org/Collector/ . -- A.M. Kuchling http://starship.python.net/crew/amk/ I'm going to dance now, I'm afraid. 
-- Ishtar ends it all, in SANDMAN #45: "Brief Lives:5" From gmcm@hypernet.com Wed Jul 14 15:02:22 1999 From: gmcm@hypernet.com (Gordon McMillan) Date: Wed, 14 Jul 1999 09:02:22 -0500 Subject: [Python-Dev] RE: Python on Windows chapter. In-Reply-To: <007401becdb6$22445c80$0801a8c0@bobcat> References: <199907121650.MAA06687@eric.cnri.reston.va.us> Message-ID: <1280165369-10624337@hypernet.com> [Mark] > I asked Guido to provide comments on one of the chapters in our > book: > > I was discussing appending the mode ("t" or "b") to the open() call [Guido] > > p.10, bottom: text mode is the default -- I've never seen the 't' > > option described! (So even if it exists, better be silent about it.) > > You need to append 'b' to get binary mode instead. I hadn't either, until I made the mistake of helping Mr took-6-exchanges-before-he-used-the-right-DLL Embedder, who used it in his code. Certainly not mentioned in man fopen on my Linux box. > This brings up an interesting issue. > > MSVC exposes a global variable that contains the default mode - ie, > you can change the default to binary. (_fmode for those with the > docs) Mentally prepend another underscore. This is something for that other p-language. >... The case for abandoning the CRTL's text > mode gets stronger and stronger! If you're tying this in with Tim's Icon worship, note that in these days of LANS, the issue is yet more complex. It would be dandy if I could read text any old text file and have it look sane, but I may be writing it to a different machine without any way of knowing that. When I bother to manipulate these things, I usually choose to use *nix style text files. But I don't deal with Macs, and the only common Windows tool that can't deal with plain \n is Notepad. 
and-stripcr.py-is-everywhere-available-on-my-Linux-box-ly y'rs - Gordon From guido@CNRI.Reston.VA.US Wed Jul 14 16:05:04 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 14 Jul 1999 11:05:04 -0400 Subject: [Python-Dev] Interrupting a thread In-Reply-To: Your message of "Wed, 14 Jul 1999 12:25:50 +1000." <006d01becda0$318035e0$0801a8c0@bobcat> References: <006d01becda0$318035e0$0801a8c0@bobcat> Message-ID: <199907141505.LAA12313@eric.cnri.reston.va.us> > Ive struck this a number of times, and the simple question is "can we make > it possible to interrupt a thread without the thread's knowledge" or > otherwise stated "how can we asynchronously raise an exception in another > thread?" > > The specific issue is that quite often, I find it necessary to interrupt > one thread from another. One example is Pythonwin - rather than use the > debugger hooks as IDLE does, I use a secondary thread. But how can I use > that thread to interrupt the code executing in the first? (With magic that > only works sometimes is how :-) > > Another example came up on the newsgroup recently - discussion about making > Medusa a true Windows NT Service. A trivial solution would be to have a > "service thread", that simply runs Medusa's loop in a seperate thread. > When the "service thread" recieves a shut-down request from NT, how can it > interrupt Medusa? > > I probably should not have started with a Medusa example - it may have a > solution. Pretend I said "any arbitary script written to run similarly to > a Unix daemon". There are one or 2 other cases where I have wanted to > execute existing code that assumes it runs stand-alone, and can really only > be stopped with a KeyboardInterrupt. I can't see a decent way to do this. > > [I guess this ties into the "signals and threads" limitations - I believe > you cant direct signals at threads either?] > > Is it desirable? 
Unfortunately, I can see that it might be hard :-( > > But-sounds-pretty-easy-under-those-fake-threads-ly, Hmm... Forget about signals -- they're twisted Unixisms (even if they are nominally supported on NT). The interesting thing is that you can interrupt the "main" thread easily (from C) using Py_AddPendingCall() -- this registers a function that will be invoked by the main thread the next time it gets to the top of the VM loop. But the mechanism here was designed with a specific purpose in mind, and it doesn't allow you to aim at a specific thread -- it only works for the main thread. It might be possible to add an API that allows you to specify a thread id though... Of course if the thread to be interrupted is blocked waiting for I/O, this is not going to interrupt the I/O. (On Unix, that's what signals do; is there an equivalent on NT? I don't think so.) Why do you say that your magic only works sometimes? You mailed me your code once and the Python side of it looks okay to me: it calls PyErr_SetInterrupt(), which calls Py_AddPendingCall(), which is threadsafe. Of course it only works if the thread you try to interrupt is recognized by Python as the main thread -- perhaps this is not always under your control, e.g. when COM interferes? Where is this going? Is the answer "provide a C-level API like Py_AddPendingCall() that takes a thread ID" good enough? Note that for IDLE, I have another problem -- how to catch the ^C event when Tk is processing events? --Guido van Rossum (home page: http://www.python.org/~guido/) From tim_one@email.msn.com Wed Jul 14 16:42:14 1999 From: tim_one@email.msn.com (Tim Peters) Date: Wed, 14 Jul 1999 11:42:14 -0400 Subject: [Python-Dev] End of the line In-Reply-To: <19990714082116.6DE96303120@snelboot.oratrix.nl> Message-ID: <000101bece0f$72095c80$f7a02299@tim> [Tim] > On reading, libc usually takes care of what's needed, and what > remains is to check for stray '\r' characters that stdio glossed over. 
[Jack Jansen] > This'll work for Unix and PC conventions, but not for the Mac. > Mac end of line is \r, so reading a line from a mac file on unix will > give you the whole file. I don't see how. Did you look at the code I posted? It treats '\r' the same as '\n', except that when it sees an '\r' it eats a following '\n' (if any) too, and replaces the '\r' with '\n' regardless. Maybe you're missing that Python reads lines one character at a time? So e.g. the behavior of the platform libc fgets is irrelevant. From tim_one@email.msn.com Wed Jul 14 16:53:46 1999 From: tim_one@email.msn.com (Tim Peters) Date: Wed, 14 Jul 1999 11:53:46 -0400 Subject: [Python-Dev] Interrupting a thread In-Reply-To: <007e01becdc6$45982490$0801a8c0@bobcat> Message-ID: <000301bece11$0f0f9d40$f7a02299@tim> [Tim sez there's no portable way to violate another thread "even in C"] [Mark Hammond] > Nope - not if I forget Python. However, when I restrict myself _to_ > Python, I find this nice little ceval.c loop and nice little per-thread > structures - even with nice-looking exception place-holders ;-) Good point! Python does have its own notion of threads. > Something tells me that it wont be quite as easy as filling these > in (while you have the lock, of course!), but it certainly seems far > more plausible than if we consider it a C problem :-) Adding a scheme that builds on the global lock and Python-controlled thread switches may not be prudent if your life's goal is to make Python free-threaded . But if "if you can't beat 'em, join 'em" rules the day, making Py_AddPendingCall thread safe, adding a target thread argument, and fleshing out the XXX Darn! With the advent of thread state, we should have an array of pending calls per thread in the thread state! Later... comment before it, could go a long way toward facilitating groping in the back seat of dad's car . 
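A cooperative, pure-Python analogue of "an array of pending calls per thread" can be sketched as follows. It is not the C-level Py_AddPendingCall mechanism: `PendingCalls`, `Interrupted` and the worker are hypothetical, and the target thread has to call `run_pending()` itself at safe points, so a thread blocked in I/O is still not interrupted.

```python
import threading
import time
from collections import defaultdict, deque


class Interrupted(Exception):
    """Hypothetical exception raised inside the target thread."""


class PendingCalls:
    """One queue of pending callables per thread id, guarded by a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._queues = defaultdict(deque)

    def add(self, thread_id, func):
        with self._lock:
            self._queues[thread_id].append(func)

    def run_pending(self):
        """Run the calls queued for the calling thread, outside the lock."""
        me = threading.get_ident()
        while True:
            with self._lock:
                queue = self._queues.get(me)
                if not queue:
                    return
                func = queue.popleft()
            func()  # may raise; the exception surfaces in the target thread


pending = PendingCalls()
result = []


def worker():
    try:
        while True:
            pending.run_pending()  # the "top of the VM loop" in this sketch
            time.sleep(0.001)
    except Interrupted:
        result.append("interrupted")


def interrupt():
    raise Interrupted()


t = threading.Thread(target=worker)
t.start()
pending.add(t.ident, interrupt)  # aim the call at one specific thread
t.join(timeout=5)
print(result)  # ['interrupted']
```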
cheaper-than-renting-a-motel-room-for-sure-ly y'rs - tim From jack@oratrix.nl Wed Jul 14 16:53:36 1999 From: jack@oratrix.nl (Jack Jansen) Date: Wed, 14 Jul 1999 17:53:36 +0200 Subject: [Python-Dev] End of the line In-Reply-To: Message by "Tim Peters" , Wed, 14 Jul 1999 11:42:14 -0400 , <000101bece0f$72095c80$f7a02299@tim> Message-ID: <19990714155336.94DA8303120@snelboot.oratrix.nl> > [Jack Jansen] > > This'll work for Unix and PC conventions, but not for the Mac. > > Mac end of line is \r, so reading a line from a mac file on unix will > > give you the whole file. > [...] > > Maybe you're missing that Python reads lines one character at a time? So > e.g. the behavior of the platform libc fgets is irrelevant. You're absolutely right... -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From guido@CNRI.Reston.VA.US Wed Jul 14 17:15:12 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 14 Jul 1999 12:15:12 -0400 Subject: [Python-Dev] Python bugs database started In-Reply-To: Your message of "Wed, 14 Jul 1999 09:03:07 EDT." <14220.35467.644552.307210@amarok.cnri.reston.va.us> References: <199907131609.MAA11208@eric.cnri.reston.va.us> <19990714080759.D49B2303120@snelboot.oratrix.nl> <14220.35467.644552.307210@amarok.cnri.reston.va.us> Message-ID: <199907141615.MAA12513@eric.cnri.reston.va.us> > Digicool has a relatively simple bug tracking system for Zope which > you can try out at http://www.zope.org/Collector/ . I asked, and Collector is dead -- but the new offering (Tracker) isn't ready for prime time yet. I'll suffer through Jitterbug until Tracker is out of beta (the first outsider who submitted a bug also did the Reload thing :-). 
--Guido van Rossum (home page: http://www.python.org/~guido/) From da@ski.org Wed Jul 14 17:14:47 1999 From: da@ski.org (David Ascher) Date: Wed, 14 Jul 1999 09:14:47 -0700 (Pacific Daylight Time) Subject: [Python-Dev] Interrupting a thread In-Reply-To: <1280166671-10546045@hypernet.com> Message-ID: On Wed, 14 Jul 1999, Gordon McMillan wrote: a reply to the python-dev thread on python-list. You didn't really intend to do that, did you Gordon? =) --david From tim_one@email.msn.com Thu Jul 15 05:21:10 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 15 Jul 1999 00:21:10 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <000901becce4$ac88aa40$31a02299@tim> Message-ID: <000001bece79$77edf240$51a22299@tim> Just so Guido doesn't feel like the question is being ignored : > ... > How does Scheme do this? [continuations] One more reference here. Previously sketched Wilson's simple heap implementation and Dybvig's simple stack one. They're easy to understand, but are (heap) slow all the time, or (stack) fast most of the time but horribly slow in some cases. For the other extreme end of things, check out: Representing Control in the Presence of First-Class Continuations Robert Hieb, R. Kent Dybvig, and Carl Bruggeman PLDI, June 1990 http://www.cs.indiana.edu/~dyb/papers/stack.ps In part: In this paper we show how stacks can be used to implement activation records in a way that is compatible with continuation operations, multiple control threads, and deep recursion. Our approach allows a small upper bound to be placed on the cost of continuation operations and stack overflow and underflow recovery. ... ordinary procedure calls and returns are not adversely affected. ... One important feature of our method is that the stack is not copied when a continuation is captured.
Consequently, capturing a continuation is very efficient, and objects that are known to have dynamic extent can be stack-allocated and modified since they remain in the locations in which they were originally allocated. By copying only a small portion of the stack when a continuation is reinstated, reinstatement costs are bounded by a small constant. The basic gimmick is a segmented stack, where large segments are heap-allocated and each contains multiple contiguous frames (across their code base, only 1% of frames exceeded 30 machine words). But this is a complicated approach, best suited for industrial-strength native-code compilers (speed at any cost -- the authors go thru hell to save an add here, a pointer store there, etc). At least at the time the paper was written, it was the approach implemented by Dybvig's Chez Scheme (a commercial native-code Scheme compiler noted for high speed). Given that Python allocates frames from the heap, I doubt there's a much faster approach than the one Christian has crafted out of his own sweat and blood! It's worth a paper of its own. or-at-least-two-hugs-ly y'rs - tim From tim_one@email.msn.com Thu Jul 15 08:00:14 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 15 Jul 1999 03:00:14 -0400 Subject: [Python-Dev] RE: Python on Windows chapter. In-Reply-To: <199907141224.IAA12211@eric.cnri.reston.va.us> Message-ID: <000301bece8f$b0dd7060$51a22299@tim> >> I was discussing appending the mode ("t" or "b") to the open() call > In addition, 't' probably isn't even supported on many Unix systems! 't' is not ANSI C, so there's no guarantee that it's portable. Hate to say it, but Python should really strip t out before passing a mode string to fopen! From tim_one@email.msn.com Thu Jul 15 08:00:18 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 15 Jul 1999 03:00:18 -0400 Subject: [Python-Dev] RE: Python on Windows chapter.
In-Reply-To: <1280165369-10624337@hypernet.com> Message-ID: <000401bece8f$b2810e40$51a22299@tim> [Mark] >> ... The case for abandoning the CRTL's text mode gets stronger >> and stronger! [Gordon] > If you're tying this in with Tim's Icon worship, Icon inherits stdio behavior-- for the most part --too. It does define its own mode string characters, though (like "t" for translated and "u" for untranslated); Icon has been ported to platforms that can't even spell libc, let alone support it. > note that in these days of LANS, the issue is yet more complex. It would > be dandy if I could read text any old text file and have it look sane, but > I may be writing it to a different machine without any way of knowing that. So where's the problem? No matter *what* machine you end up on, Python could read the thing fine. Or are you assuming some fantasy world in which people sometimes run software other than Python ? Caveat: give the C std a close reading. It guarantees much less about text mode than anyone who hasn't studied it would believe; e.g., text mode doesn't guarantee to preserve chars with the high bit set, or most control chars either (MS's treatment of CTRL-Z as EOF under text mode conforms to the std!). Also doesn't guarantee to preserve a line-- even if composed of nothing but printable chars --if it's longer than 509(!) characters. That's what I mean when I say stdio's text mode is a bad joke. > When I bother to manipulate these things, I usually choose to use > *nix style text files. But I don't deal with Macs, and the only > common Windows tool that can't deal with plain \n is Notepad. I generally create text files in binary mode, faking the \n convention by hand. Of course, I didn't do this before I became a Windows Guy <0.5 wink>. > and-stripcr.py-is-everywhere-available-on-my-Linux-box-ly y'rs A plug for my linefix.py (Python FTP contrib, under System), which converts among Unix/Windows/Mac in any direction (by default, from any to Unix). 
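
For illustration, a minimal sketch of the default direction described above (any convention to Unix), working on bytes read in binary mode. This is not linefix.py itself; the function name to_unix_newlines is made up here, and the sketch assumes a bare \r only ever appears as an old-Mac line ending.

```python
def to_unix_newlines(data: bytes) -> bytes:
    # Handle \r\n first, so the bare-\r pass below does not turn one
    # Windows line ending into two newlines.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
```

Reading with open(path, "rb") and writing with open(path, "wb") keeps the C library's text mode out of the picture, which is the point of doing the conversion by hand.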
who-needs-linux-when-there's-a-python-in-the-window-ly y'rs - tim From MHammond@skippinet.com.au Thu Jul 15 08:16:32 1999 From: MHammond@skippinet.com.au (Mark Hammond) Date: Thu, 15 Jul 1999 17:16:32 +1000 Subject: [Python-Dev] RE: Python on Windows chapter. In-Reply-To: <000301bece8f$b0dd7060$51a22299@tim> Message-ID: <000801bece91$f80576c0$0801a8c0@bobcat> > 't' is not ANSI C, so there's no guarantee that it's > portable. Hate to say > it, but Python should really strip t out before passing a > mode string to > fopen! OK - thanks all - it is clear that this MS aberration is not, and never will be supported by Python. Not being a standards sort of guy I must admit I assumed both the "t" and "b" were standards. Thanks for the clarifications! Mark. From gstein@lyra.org Thu Jul 15 08:15:20 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 15 Jul 1999 00:15:20 -0700 Subject: [Python-Dev] RE: Python on Windows chapter. References: <000301bece8f$b0dd7060$51a22299@tim> Message-ID: <378D8A88.583A4DBF@lyra.org> Tim Peters wrote: > > >> I was discussing appending the mode ("t" or "b") to the open() call > > > In addition, 't' probably isn't even supported on many Unix systems! > > 't' is not ANSI C, so there's no guarantee that it's portable. Hate to say > it, but Python should really strip t out before passing a mode string to > fopen! Should we also filter the socket type when creating sockets? Or the address family? What if I pass "bamboozle" as the fopen mode? Should that become "bab" after filtering? Oh, but what about those two "b" characters? Maybe just reduce it to one? We also can't forget to filter chmod() arguments... can't have unknown bits set. etc etc In other words, I think the idea of "stripping out the t" is bunk. Python is not fatherly. It gives you the rope and lets you figure it out for yourself. 
You should know that :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From tim_one@email.msn.com Thu Jul 15 09:59:56 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 15 Jul 1999 04:59:56 -0400 Subject: [Python-Dev] RE: Python on Windows chapter. In-Reply-To: <378D8A88.583A4DBF@lyra.org> Message-ID: <000001becea0$69df8ca0$aea22299@tim> [Tim] > 't' is not ANSI C, so there's no guarantee that it's portable. > Hate to say it, but Python should really strip t out before passing > a mode string to fopen! [Greg Stein] > Should we also filter the socket type when creating sockets? Or the > address family? Filtering 't' is a matter of increasing portability by throwing out an option that doesn't do anything on the platforms that accept it, yet can cause a program to die on platforms that don't -- despite that it says nothing. So it's helpful to toss it, not restrictive. > What if I pass "bamboozle" as the fopen mode? Should that become "bab" > after filtering? Oh, but what about those two "b" characters? Those go far beyond what I suggested, Greg. Even so , it would indeed help a great many non-C programmers if Python defined the mode strings it accepts & barfed on others by default. The builtin open is impossible for a non-C weenie to understand from the docs (as a frustrated sister delights in reminding me). It should be made friendlier. Experts can use a new os.fopen if they need to pass "bamboozle"; fine by me; I do think the builtins should hide as much ill-defined libc crap as possible (btw, "open" is unique in this respect). > Maybe just reduce it to one? We also can't forget to filter chmod() > arguments... can't have unknown bits set. I at least agree that chmod has a miserable UI . > etc etc > > In other words, I think the idea of "stripping out the t" is bunk. > Python is not fatherly. It gives you the rope and lets you figure it out > for yourself. 
You should know that :-) So should Mark -- but we have his testimony that, like most other people, he has no idea what's "std C" and what isn't. In this case he should have noticed that Python's "open" docs don't admit to "t"'s existence either, but even so I see no reason to take comfort in the expectation that he'll eventually be hanged for this sin. ypu-i'd-rather-"open"-died-when-passed-"t"-ly y'rs - tim From guido@cnri.reston.va.us Thu Jul 15 23:29:54 1999 From: guido@cnri.reston.va.us (Guido van Rossum) Date: 15 Jul 1999 18:29:54 -0400 Subject: [Python-Dev] ISPs and Python Message-ID: <5lu2r5czrx.fsf@eric.cnri.reston.va.us> Remember the days when the big problem was to find an ISP who would install Python? Apparently that problem has gone away... The problem is now to get one that installs a decent set of Python extensions :-) See attached c.l.py post. This is similar to the evolution of Python's name recognition -- used to be, managers would say "what's Python?"; then they said "nobody else uses Python"; now presumably they will have to make up some kind ad-hoc no-Python company policy :-) --Guido van Rossum (home page: http://www.python.org/~guido/) ------- Start of forwarded message ------- From: Sim & Golda Zacks Newsgroups: comp.lang.python Subject: Re: htmllib, cgi, HTMLfmt, genCGI, HTMLgen, html, Zope, ... Date: Wed, 14 Jul 1999 00:00:25 -0400 Organization: ExecPC Internet - Milwaukee, WI Message-ID: <7mh1qu$c6m@newsops.execpc.com> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit I am in the exact same situation as you are. I am a web programmer and I'm trying to implement the CGI and database stuff with Python. I am using the HTMLFMT module from the INTERNET PROGRAMMING book and the cgi module from the standard library. 
What the HTMLFMT library does for you is just that you don't have to type in all the tags, basically it's nothing magical, if I didn't have it I would have to make something up and it probably wouldn't be half as good. the standard cgi unit gives you all the fields from the form, and I haven't looked at the cgi modules from the book yet to see if they give me any added benefit. The big problem I came across was my web host, and all of the other ones I talked to, refused to install the mysql interface to Python, and it has to be included in the build (or something like that) So I just installed gadfly, which seems to be working great for me right now. I'm still playing with it not in production yet. I have no idea what ZOPE does, but everyone who talks about it seems to love it. Hope this helps Sim Zacks [...] ------- End of forwarded message ------- From mhammond@bigpond.net.au Fri Jul 16 00:21:40 1999 From: mhammond@bigpond.net.au (Mark Hammond) Date: Fri, 16 Jul 1999 09:21:40 +1000 Subject: [Python-Dev] ISPs and Python In-Reply-To: <5lu2r5czrx.fsf@eric.cnri.reston.va.us> Message-ID: <001001becf18$cb850610$0801a8c0@bobcat> > Remember the days when the big problem was to find an ISP who would > install Python? Apparently that problem has gone away... The problem > is now to get one that installs a decent set of Python extensions :-) he he. Yes, hence I believe the general agreement exists that we should begin to focus on these more external issues than the language itself. Pity we all agree, but are still such hackers :-) > looked at the cgi modules from the book yet to see if they > give me any added > benefit. The big problem I came across was my web host, and > all of the other From the ISP's POV, this is reasonable. I wouldnt be surprised to find they started with the same policy for Perl. The issue is less likely to be anything to do with Python, but to do with stability. If every client was allowed to install their own extension, then that could wreak havoc. 
Some ISPs will allow a private Python build, but some only allow you to use their shared version, which they obviously want kept pretty stable. The answer would seem to be to embrace MALs efforts. Not only should we be looking at pre-compiled (as I believe his effort is) but also towards "batteries included, plus spare batteries, wall charger, car charger and solar panels". ISP targetted installations with _many_ extensions installed could be very useful - who cares if it is 20MB - if they dont want that, let then do it manually with the standard installation like everyone else. There could almost be commercial scope here for a support company. Offering ISP/Corporate specific CDs and support. Installations targetted at machines shared among a huge number of users, with almost every common Python extension any of these users would need. Corporates and ISPs may pay far more handsomly than individuals for this kind of stuff. I know I am ranting still, but I repeat my starting point that addressing issues like this are IMO the single best thing we could do for Python. We could leave the language along for 2 years, and come back to it when this shite is better under control :-) of-course-you-should-all-do-that-while-I-continue-to-hack-ly, Mark. From mal@lemburg.com Fri Jul 16 08:44:20 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 16 Jul 1999 09:44:20 +0200 Subject: [Python-Dev] ISPs and Python References: <001001becf18$cb850610$0801a8c0@bobcat> Message-ID: <378EE2D4.A67F5BD@lemburg.com> Mark Hammond wrote: > > > Remember the days when the big problem was to find an ISP who would > > install Python? Apparently that problem has gone away... The problem > > is now to get one that installs a decent set of Python extensions :-) > > he he. Yes, hence I believe the general agreement exists that we should > begin to focus on these more external issues than the language itself. 
> Pity we all agree, but are still such hackers :-) > > > looked at the cgi modules from the book yet to see if they > > give me any added > > benefit. The big problem I came across was my web host, and > > all of the other > > >From the ISP's POV, this is reasonable. I wouldnt be surprised to find > they started with the same policy for Perl. The issue is less likely to be > anything to do with Python, but to do with stability. If every client was > allowed to install their own extension, then that could wreak havoc. Some > ISPs will allow a private Python build, but some only allow you to use > their shared version, which they obviously want kept pretty stable. > > The answer would seem to be to embrace MALs efforts. Not only should we be > looking at pre-compiled (as I believe his effort is) but also towards > "batteries included, plus spare batteries, wall charger, car charger and > solar panels". ISP targetted installations with _many_ extensions > installed could be very useful - who cares if it is 20MB - if they dont > want that, let then do it manually with the standard installation like > everyone else. mxCGIPython is a project aimed at exactly this situation. The only current caveat with it is that the binaries are not capable of loading shared extensions (maybe some linker guru could help here). In summary the cgipython binaries are complete Python interpreters with a frozen Python standard lib included. This means that you only need to install a single file on your ISP account and you're set for CGI/Python. More infos + the binaries are available here: http://starship.skyport.net/~lemburg/mxCGIPython.html The package could also be tweaked to include a set of common extensions, I suppose, since it uses freeze.py to do most of the job. > There could almost be commercial scope here for a support company. > Offering ISP/Corporate specific CDs and support. 
Installations targetted > at machines shared among a huge number of users, with almost every common > Python extension any of these users would need. Corporates and ISPs may > pay far more handsomly than individuals for this kind of stuff. > > I know I am ranting still, but I repeat my starting point that addressing > issues like this are IMO the single best thing we could do for Python. We > could leave the language along for 2 years, and come back to it when this > shite is better under control :-) Naa, that would spoil all the fun ;-) But anyways, going commercial with Python is not that far-fetched anymore nowadays... something like what the Linux distributors are doing for Linux could probably also be done with Python. Which brings us back to the package name topic or better the import mechanism... -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 168 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From skip@mojam.com (Skip Montanaro) Fri Jul 16 19:04:58 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Fri, 16 Jul 1999 14:04:58 -0400 (EDT) Subject: [Python-Dev] Python bugs database started In-Reply-To: <14219.20501.697542.358579@anthem.cnri.reston.va.us> References: <199907122004.QAA09348@eric.cnri.reston.va.us> <000701becce4$a973c920$31a02299@tim> <14219.20501.697542.358579@anthem.cnri.reston.va.us> Message-ID: <14223.29664.66832.630010@94.chicago-33-34rs.il.dial-access.att.net> TP> The first time I submitted a bug, I backed up to the entry page and TP> hit Refresh to get the category counts updated (never saw Jitterbug TP> before, so must play!). IE5 whined about something-or-other being TP> out of date, and would I like to "repost the data"? I said sure. Barry> This makes perfect sense, and explains exactly what's going on. Barry> Let's call it "poor design"[1] instead of "user error". 
A quick Barry> scan last night of the Jitterbug site shows no signs of fixes or Barry> workarounds. What would Jitterbug have to do to avoid these Barry> kinds of problems? If the submission form uses METHOD=GET instead of METHOD=POST, the backup problem should go away. Skip (finally hobbling through my email after the move to Illinois...) From tim_one@email.msn.com Sun Jul 18 08:06:16 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sun, 18 Jul 1999 03:06:16 -0400 Subject: [Python-Dev] End of the line In-Reply-To: <199907131253.IAA10730@eric.cnri.reston.va.us> Message-ID: <000b01bed0ec$075c47a0$36a02299@tim> > The latest versions of the Icon language [convert \r\n, \r and \n to > plain \n in text mode upon read, and convert \n to the platform convention > on write] It's a trend : the latest version of the REBOL language also does this. The Java compiler does it for Java source files, but I don't know how runtime file read/write work in Java. Anyone know offhand if there's a reliable way to determine whether an open file descriptor (a C FILE*) is seekable? if-i'm-doomed-to-get-obsessed-by-this-may-as-well-make-it-faster- too-ly y'rs - tim From mal@lemburg.com Sun Jul 18 21:29:43 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Sun, 18 Jul 1999 22:29:43 +0200 Subject: [Python-Dev] End of the line References: <000b01bed0ec$075c47a0$36a02299@tim> Message-ID: <37923937.4E73E8D8@lemburg.com> Tim Peters wrote: > > Anyone know offhand if there's a reliable way to determine whether an open > file descriptor (a C FILE*) is seekable? 
I'd simply use trial&error:

    if (fseek(stream,0,SEEK_CUR) < 0) {
        if (errno != EBADF) {
            /* Not seekable */
            errno = 0;
        }
        else
            /* Error */ ;
    }
    else
        /* Seekable */ ;

How to get this thread safe is left as exercise to the interested reader ;) Cheers, -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 166 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From da@ski.org Thu Jul 22 00:41:28 1999 From: da@ski.org (David Ascher) Date: Wed, 21 Jul 1999 16:41:28 -0700 (Pacific Daylight Time) Subject: [Python-Dev] Perl 5.6 'feature list' Message-ID: Not all that exciting, but good to know what they're doing: http://www.perl.com/cgi-bin/pace/pub/1999/06/perl5-6.html From tim_one@email.msn.com Thu Jul 22 03:52:26 1999 From: tim_one@email.msn.com (Tim Peters) Date: Wed, 21 Jul 1999 22:52:26 -0400 Subject: [Python-Dev] Perl 5.6 'feature list' In-Reply-To: Message-ID: <000f01bed3ed$3b509800$642d2399@tim> [David Ascher] > Not all that exciting, but good to know what they're doing: > > http://www.perl.com/cgi-bin/pace/pub/1999/06/perl5-6.html It is good to know, and I didn't, so thanks for passing that on! I see they're finally stealing Python's version numbering scheme . In other news, I just noticed that REBOL threw 1st-class continuations *out* of the language, leaving just the "escape up the current call chain" exception-handling (throw/catch) kind. This isn't an open project, so it's hard to second-guess why. Or easy, depending on how you look at it .
i-suggest-looking-at-it-the-right-way-ly y'rs - tim From jim@digicool.com Thu Jul 22 13:15:08 1999 From: jim@digicool.com (Jim Fulton) Date: Thu, 22 Jul 1999 08:15:08 -0400 Subject: [Python-Dev] I'd like list.pop to accept an optional second argument giving a default value Message-ID: <37970B4C.8E8C741E@digicool.com> I like the list pop method because it provides a way to use lists as thread safe queues and stacks (since append and pop are protected by the global interpreter lock). With pop, you can essentially test whether the list is empty and get a value if it isn't in one atomic operation:

    try:
        foo=queue.pop(0)
    except IndexError:
        ... empty queue case
    else:
        ... non-empty case, do something with foo

Unfortunately, this incurs exception overhead. I'd rather do something like:

    foo=queue.pop(0,marker)
    if foo is marker:
        ... empty queue case
    else:
        ... non-empty case, do something with foo

I'd be happy to provide a patch. Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From fredrik@pythonware.com Thu Jul 22 14:14:50 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Thu, 22 Jul 1999 15:14:50 +0200 Subject: [Python-Dev] Perl 5.6 'feature list' References: Message-ID: <001501bed444$2f5dbe90$f29b12c2@secret.pythonware.com> David Ascher wrote: > Not all that exciting, but good to know what they're doing: > > http://www.perl.com/cgi-bin/pace/pub/1999/06/perl5-6.html well, "unicode all the way down" and "language level event loop" sounds pretty exciting to me... (but christian's work beats it all, of course...)
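
As a rough sketch of what Jim Fulton's list.pop proposal in this digest asks for, the two-argument behaviour can be imitated today with a helper function. The names pop_or_default and _MISSING are hypothetical, not part of any proposed patch, and the helper still pays the exception cost the proposal wants to avoid; it only shows the intended calling convention.

```python
_MISSING = object()  # private sentinel, so None remains a usable default


def pop_or_default(items, index=-1, default=_MISSING):
    """Pop items[index]; if that fails and a default was given, return it."""
    try:
        # The single pop call is the only operation that touches the list.
        return items.pop(index)
    except IndexError:
        if default is _MISSING:
            raise  # no default supplied: keep the existing IndexError
        return default
```

Called as pop_or_default(queue, 0, marker), it mirrors the "foo=queue.pop(0,marker)" form; called without a default, it behaves like plain queue.pop(0).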
From skip@mojam.com (Skip Montanaro) Thu Jul 22 15:24:53 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Thu, 22 Jul 1999 09:24:53 -0500 (CDT) Subject: [Python-Dev] I'd like list.pop to accept an optional second argument giving a default value In-Reply-To: <37970B4C.8E8C741E@digicool.com> References: <37970B4C.8E8C741E@digicool.com> Message-ID: <14231.10515.423401.512972@153.chicago-41-42rs.il.dial-access.att.net> Jim> I like the list pop method because it provides a way to use lists Jim> as thread safe queues and stacks (since append and pop are Jim> protected by the global interpreter lock). The global interpreter lock is a property of the current implementation of Python, not of the language itself. At one point in the past Greg Stein created a set of patches that eliminated the lock. While it's perhaps convenient to use now, it may not always exist. I'm not so sure that it should be used as a motivator for changes to libraries in the standard distribution. Skip From jim@digicool.com Thu Jul 22 15:47:13 1999 From: jim@digicool.com (Jim Fulton) Date: Thu, 22 Jul 1999 10:47:13 -0400 Subject: [Python-Dev] I'd like list.pop to accept an optional second argument giving a defaultvalue References: <37970B4C.8E8C741E@digicool.com> <14231.10515.423401.512972@153.chicago-41-42rs.il.dial-access.att.net> Message-ID: <37972EF1.372C2CB1@digicool.com> Skip Montanaro wrote: > > Jim> I like the list pop method because it provides a way to use lists > Jim> as thread safe queues and stacks (since append and pop are > Jim> protected by the global interpreter lock). > > The global interpreter lock is a property of the current implementation of > Python, not of the language itself. At one point in the past Greg Stein > created a set of patches that eliminated the lock. While it's perhaps > convenient to use now, it may not always exist. I'm not so sure that it > should be used as a motivator for changes to libraries in the standard > distribution. 
If the global interpreter lock goes away, then some other locking mechanism will be used to make built-in object operations atomic. For example, in Greg's changes, each list was protected by a list lock. The key is that pop combines checking for an empty list and removing an element into a single operation. As long as the operations append and pop are atomic, then lists can be used as thread-safe stacks and queues. The benefit of the proposal does not really depend on the global interpreter lock. It only depends on list operations being atomic. Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From gmcm@hypernet.com Thu Jul 22 17:07:31 1999 From: gmcm@hypernet.com (Gordon McMillan) Date: Thu, 22 Jul 1999 11:07:31 -0500 Subject: [Python-Dev] I'd like list.pop to accept an optional second In-Reply-To: <37970B4C.8E8C741E@digicool.com> Message-ID: <1279466648-20991135@hypernet.com> Jim Fulton writes: > With pop, you can essentially test whether the list is > empty and get a value if it isn't in one atomic operation: > > try: > foo=queue.pop(0) > except IndexError: > ... empty queue case > else: > ... non-empty case, do something with foo > > Unfortunately, this incurs exception overhead. I'd rather do > something like: > > foo=queue.pop(0,marker) > if foo is marker: > ... empty queue case > else: > ... non-empty case, do something with foo I'm assuming you're asking for the equivalent of: def pop(self, default=None): much like dict.get? Then how do I get the old behavior? 
(I've been known to do odd things - like change behavior based on the number of args - in extension modules, but this ain't an extension). - Gordon From fredrik@pythonware.com Thu Jul 22 16:23:00 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Thu, 22 Jul 1999 17:23:00 +0200 Subject: [Python-Dev] End of the line References: <000001beccfb$9beb4fa0$2f9e2299@tim> Message-ID: <009901bed456$161a4950$f29b12c2@secret.pythonware.com> Tim Peters wrote: > The latest versions of the Icon language (9.3.1 & beyond) sprouted an > interesting change in semantics: if you open a file for reading in > "translated" (text) mode now, it normalizes Unix, Mac and Windows line > endings to plain \n. Writing in text mode still produces what's natural for > the platform. > > Anyone think that's *not* a good idea? if we were to change this, how would you tell Python to open a file in text mode? From jim@digicool.com Thu Jul 22 16:30:22 1999 From: jim@digicool.com (Jim Fulton) Date: Thu, 22 Jul 1999 11:30:22 -0400 Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <1279466648-20991135@hypernet.com> Message-ID: <3797390E.50972562@digicool.com> Gordon McMillan wrote: > > Then how do I get the old behavior? Just pass 0 or 1 argument. >(I've been known to do odd > things - like change behavior based on the number of args - in > extension modules, but this ain't an extension). It *is* a built-in method. It will be handled just like dictionaries handle the second argument to get. Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. 
From gmcm@hypernet.com Thu Jul 22 17:33:06 1999 From: gmcm@hypernet.com (Gordon McMillan) Date: Thu, 22 Jul 1999 11:33:06 -0500 Subject: [Python-Dev] I'd like list.pop to accept an optional second In-Reply-To: <3797390E.50972562@digicool.com> Message-ID: <1279465114-21083404@hypernet.com> Jim Fulton wrote: > Gordon McMillan wrote: > > > > Then how do I get the old behavior? > > Just pass 0 or 1 argument. > > >(I've been known to do odd > > things - like change behavior based on the number of args - in > > extension modules, but this ain't an extension). > > It *is* a built-in method. It will be handled just like > dictionaries handle the second argument to get. d.get(nonexistantkey) does not throw an exception, it returns None. If list.pop() does not throw an exception when list is empty, it's new behavior. Which are you asking for: breaking code that expects IndexError Violating Pythonic expectations by, in effect, creating 2 methods list.pop(void) list.pop(default_return) - Gordon From jim@digicool.com Thu Jul 22 16:44:22 1999 From: jim@digicool.com (Jim Fulton) Date: Thu, 22 Jul 1999 11:44:22 -0400 Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <1279465114-21083404@hypernet.com> Message-ID: <37973C56.5ACFFDEC@digicool.com> Gordon McMillan wrote: > > Jim Fulton wrote: > > > Gordon McMillan wrote: > > > > > > Then how do I get the old behavior? > > > > Just pass 0 or 1 argument. > > > > >(I've been known to do odd > > > things - like change behavior based on the number of args - in > > > extension modules, but this ain't an extension). > > > > It *is* a built-in method. It will be handled just like > > dictionaries handle the second argument to get. > > d.get(nonexistantkey) does not throw an exception, it returns None. Oops, I'd forgotten that. > If list.pop() does not throw an exception when list is empty, it's > new behavior. > > Which are you asking for: > breaking code that expects IndexError No. 
> Violating Pythonic expectations by, in effect, creating 2 methods > list.pop(void) > list.pop(default_return) Yes, except that I disagree that this is non-pythonic. Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From mal@lemburg.com Thu Jul 22 18:27:53 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 22 Jul 1999 19:27:53 +0200 Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <1279465114-21083404@hypernet.com> <37973C56.5ACFFDEC@digicool.com> Message-ID: <37975499.FB61E4E3@lemburg.com> Jim Fulton wrote: > > > Violating Pythonic expectations by, in effect, creating 2 methods > > list.pop(void) > > list.pop(default_return) > > Yes, except that I disagree that this is non-pythonic. Wouldn't a generic builtin for these kinds of things be better, e.g. a function returning a default value in case an exception occurs... something like: tryexcept(list.pop(), IndexError, default) which returns default in case an IndexError occurs. Don't think this would be much faster that the explicit try:...except: though... -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 162 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From gmcm@hypernet.com Thu Jul 22 17:54:58 1999 From: gmcm@hypernet.com (Gordon McMillan) Date: Thu, 22 Jul 1999 11:54:58 -0500 Subject: [Python-Dev] I'd like list.pop to accept an optional second In-Reply-To: <37973C56.5ACFFDEC@digicool.com> Message-ID: <1279463517-21179480@hypernet.com> Jim Fulton wrote: > > Gordon McMillan wrote: ... 
> > Violating Pythonic expectations by, in effect, creating 2 methods > > list.pop(void) > > list.pop(default_return) > > Yes, except that I disagree that this is non-pythonic. > I'll leave the final determination to Mr. Python, but I disagree. Offhand I can't think of a built-in that can't be expressed in normal Python notation, where "optional" args are really defaulted args. Which would lead us to either a new list method, or redefining pop: def pop(usedefault=0, default=None) and making you use 2 args. But maybe I've missed a precedent because I'm so used to it. (Hmm, I guess string.split is a sort-of precedent, because the first default arg behaves differently than anything you could pass in). - Gordon From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Thu Jul 22 19:33:57 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw) Date: Thu, 22 Jul 1999 14:33:57 -0400 (EDT) Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <1279465114-21083404@hypernet.com> <37973C56.5ACFFDEC@digicool.com> <37975499.FB61E4E3@lemburg.com> Message-ID: <14231.25621.888844.205034@anthem.cnri.reston.va.us> >>>>> "M" == M writes: M> Wouldn't a generic builtin for these kinds of things be M> better, e.g. a function returning a default value in case M> an exception occurs... something like: M> tryexcept(list.pop(), IndexError, default) M> which returns default in case an IndexError occurs. Don't think M> this would be much faster that the explicit try:...except: M> though... Don't know if this would be better (or useful, etc.), but it could possibly be faster than explicit try/except, because with try/except you have to instantiate the exception object. Presumably tryexcept() -- however it was spelled -- would catch the exception in C, thus avoiding the overhead of exception object instantiation. -Barry From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Thu Jul 22 19:36:09 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. 
Warsaw)
Date: Thu, 22 Jul 1999 14:36:09 -0400 (EDT)
Subject: [Python-Dev] I'd like list.pop to accept an optional second
References: <3797390E.50972562@digicool.com> <1279465114-21083404@hypernet.com>
Message-ID: <14231.25753.710299.405579@anthem.cnri.reston.va.us>

>>>>> "Gordo" == Gordon McMillan writes:

    Gordo> Which are you asking for: breaking code that expects
    Gordo> IndexError Violating Pythonic expectations by, in effect,
    Gordo> creating 2 methods
    Gordo> list.pop(void)
    Gordo> list.pop(default_return)

The docs /do/ say that list.pop() is experimental, so that probably
gives Guido all the out he'd need to change the semantics :). I
myself have yet to use list.pop() so I don't know how disastrous the
change in semantics would be to existing code.

-Barry

From jim@digicool.com Thu Jul 22 17:49:33 1999
From: jim@digicool.com (Jim Fulton)
Date: Thu, 22 Jul 1999 12:49:33 -0400
Subject: [Python-Dev] I'd like list.pop to accept an optional second
References: <1279463517-21179480@hypernet.com>
Message-ID: <37974B9D.59C2D45E@digicool.com>

Gordon McMillan wrote:
>
> Offhand I can't think of a built-in that can't be expressed in normal
> Python notation, where "optional" args are really defaulted args.

I can define the pop I want in Python as follows:

  _marker=[]   # unique sentinel meaning "no default supplied"

  class list:
     ...
     def pop(self, index=-1, default=_marker):
        try: v=self[index]
        except IndexError:
            # only swallow the error when a default was given
            if default is not _marker:
                return default
            if self: m='pop index out of range'
            else:    m='pop from empty list'
            raise IndexError, m
        del self[index]
        return v

Although I'm not sure why the "pythonicity" of an interface
should depend on its implementation.

Jim

--
Jim Fulton           mailto:jim@digicool.com   Python Powered!
Technical Director   (888) 344-4332            http://www.python.org
Digital Creations    http://www.digicool.com   http://www.zope.org

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email
address may not be added to any commercial mail list with out my
permission.
Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From jim@digicool.com Thu Jul 22 17:53:26 1999 From: jim@digicool.com (Jim Fulton) Date: Thu, 22 Jul 1999 12:53:26 -0400 Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <1279463517-21179480@hypernet.com> Message-ID: <37974C86.2EC53BE7@digicool.com> BTW, a good precedent for what I want is getattr. getattr(None,'spam') raises an error, but: getattr(None,'spam',1) returns 1 Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Thu Jul 22 20:02:21 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw) Date: Thu, 22 Jul 1999 15:02:21 -0400 (EDT) Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <1279463517-21179480@hypernet.com> <37974C86.2EC53BE7@digicool.com> Message-ID: <14231.27325.387718.435420@anthem.cnri.reston.va.us> >>>>> "JF" == Jim Fulton writes: JF> BTW, a good precedent for what I want JF> is getattr. JF> getattr(None,'spam') JF> raises an error, but: JF> getattr(None,'spam',1) JF> returns 1 Okay, how did this one sneak in, huh? I didn't even realize this had been added to getattr()! CVS reveals it was added b/w 1.5.1 and 1.5.2a1, so maybe I just missed the checkin message. Fred, the built-in-funcs doc needs updating: http://www.python.org/doc/current/lib/built-in-funcs.html FWIW, the CVS log message says this feature is experimental too. 
:) -Barry From jim@digicool.com Thu Jul 22 20:20:46 1999 From: jim@digicool.com (Jim Fulton) Date: Thu, 22 Jul 1999 15:20:46 -0400 Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <1279463517-21179480@hypernet.com> <37974C86.2EC53BE7@digicool.com> <14231.27325.387718.435420@anthem.cnri.reston.va.us> Message-ID: <37976F0E.DFB4067B@digicool.com> "Barry A. Warsaw" wrote: > > >>>>> "JF" == Jim Fulton writes: > > JF> BTW, a good precedent for what I want > JF> is getattr. > > JF> getattr(None,'spam') > > JF> raises an error, but: > > JF> getattr(None,'spam',1) > > JF> returns 1 > > Okay, how did this one sneak in, huh? I don't know. Someone told me about it. I find it wildly useful. > I didn't even realize this had > been added to getattr()! CVS reveals it was added b/w 1.5.1 and > 1.5.2a1, so maybe I just missed the checkin message. > > Fred, the built-in-funcs doc needs updating: > > http://www.python.org/doc/current/lib/built-in-funcs.html > > FWIW, the CVS log message says this feature is experimental too. :) Eek! I want it to stay! I also really like list.pop. :) Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From Fred L. Drake, Jr." References: <1279463517-21179480@hypernet.com> <37974C86.2EC53BE7@digicool.com> <14231.27325.387718.435420@anthem.cnri.reston.va.us> Message-ID: <14231.28776.160422.442859@weyr.cnri.reston.va.us> >>>>> "JF" == Jim Fulton writes: JF> BTW, a good precedent for what I want JF> is getattr. JF> getattr(None,'spam') JF> raises an error, but: JF> getattr(None,'spam',1) JF> returns 1 Barry A. 
Warsaw writes: > Fred, the built-in-funcs doc needs updating: This is done in the CVS repository; thanks for pointing out the oversight! Do people realize that pop() already has an optional parameter? That *is* in the docs: http://www.python.org/docs/current/lib/typesseq-mutable.html See note 4 below the table. -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From bwarsaw@python.org Thu Jul 22 20:37:20 1999 From: bwarsaw@python.org (Barry A. Warsaw) Date: Thu, 22 Jul 1999 15:37:20 -0400 (EDT) Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <1279463517-21179480@hypernet.com> <37974C86.2EC53BE7@digicool.com> <14231.27325.387718.435420@anthem.cnri.reston.va.us> <37976F0E.DFB4067B@digicool.com> Message-ID: <14231.29424.569863.149366@anthem.cnri.reston.va.us> >>>>> "JF" == Jim Fulton writes: JF> I don't know. Someone told me about it. I find it JF> wildly useful. No kidding! :) From mal@lemburg.com Thu Jul 22 21:32:23 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 22 Jul 1999 22:32:23 +0200 Subject: [Python-Dev] Importing extension modules Message-ID: <37977FD7.BD7A9826@lemburg.com> I'm currently testing a pure Python version of mxDateTime (my date/time package), which uses a contributed Python version of the C extension. Now, to avoid problems with pickled DateTime objects (they include the complete module name), I would like to name *both* the Python and the C extension version mxDateTime. With the current lookup scheme (shared mods are searched before Python modules) this is no problem since the shared mod is found before the Python version and used instead, so getting this working is rather simple. The question is: will this setup remain a feature in future versions of Python ? (Does it work this way on all platforms ?) 
Cheers,

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 162 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From mal@lemburg.com Thu Jul 22 21:45:24 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 22 Jul 1999 22:45:24 +0200
Subject: [Python-Dev] I'd like list.pop to accept an optional second
References: <1279463517-21179480@hypernet.com> <37974C86.2EC53BE7@digicool.com>
 <14231.27325.387718.435420@anthem.cnri.reston.va.us>
 <37976F0E.DFB4067B@digicool.com>
Message-ID: <379782E4.7DC79460@lemburg.com>

Jim Fulton wrote:
> > [getattr(obj,name[,default])]
> > Okay, how did this one sneak in, huh?
>
> I don't know. Someone told me about it. I find it
> wildly useful.

Me too... ;-)

> > I didn't even realize this had
> > been added to getattr()! CVS reveals it was added b/w 1.5.1 and
> > 1.5.2a1, so maybe I just missed the checkin message.

http://www.deja.com/getdoc.xp?AN=366635977

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 162 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From tismer@appliedbiometrics.com Thu Jul 22 21:50:42 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Thu, 22 Jul 1999 22:50:42 +0200
Subject: [Python-Dev] I'd like list.pop to accept an optional second
References: <1279463517-21179480@hypernet.com> <37974C86.2EC53BE7@digicool.com>
 <14231.27325.387718.435420@anthem.cnri.reston.va.us>
 <37976F0E.DFB4067B@digicool.com>
Message-ID: <37978422.F36BB130@appliedbiometrics.com>

> > Fred, the built-in-funcs doc needs updating:
> >
> > http://www.python.org/doc/current/lib/built-in-funcs.html
> >
> > FWIW, the CVS log message says this feature is experimental too. :)
>
> Eek! I want it to stay!
>
> I also really like list.pop. :)

Seconded!

Also, things which appeared between some alphas and made it
up to the final, are just there.
It would be fair to update the CVS tree and say the features made it into the dist, even if it just was a mistake not to remove them in time. It was time enough. ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Thu Jul 22 21:50:36 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw) Date: Thu, 22 Jul 1999 16:50:36 -0400 (EDT) Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <1279463517-21179480@hypernet.com> <37974C86.2EC53BE7@digicool.com> <14231.27325.387718.435420@anthem.cnri.reston.va.us> <37976F0E.DFB4067B@digicool.com> <379782E4.7DC79460@lemburg.com> Message-ID: <14231.33820.422195.45250@anthem.cnri.reston.va.us> >>>>> "M" == M writes: M> http://www.deja.com/getdoc.xp?AN=366635977 Ah, thanks! Your rationale was exactly the reason why I added dict.get(). I'm still not 100% sure about list.pop() though, since it's not exactly equivalent -- list.pop() modifies the list as a side-effect :) Makes me think you might want an alternative spelling for list[s], call it list.get() and put the optional default on that method. Then again, maybe list.pop() with an optional default is good enough. 
-Barry From jim@digicool.com Thu Jul 22 21:55:05 1999 From: jim@digicool.com (Jim Fulton) Date: Thu, 22 Jul 1999 16:55:05 -0400 Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <1279463517-21179480@hypernet.com> <37974C86.2EC53BE7@digicool.com> <14231.27325.387718.435420@anthem.cnri.reston.va.us> <37976F0E.DFB4067B@digicool.com> <379782E4.7DC79460@lemburg.com> <14231.33820.422195.45250@anthem.cnri.reston.va.us> Message-ID: <37978529.B1AC5273@digicool.com> "Barry A. Warsaw" wrote: > > >>>>> "M" == M writes: > > M> http://www.deja.com/getdoc.xp?AN=366635977 > > Ah, thanks! Your rationale was exactly the reason why I added > dict.get(). I'm still not 100% sure about list.pop() though, since > it's not exactly equivalent -- list.pop() modifies the list as a > side-effect :) Makes me think you might want an alternative spelling > for list[s], call it list.get() and put the optional default on that > method. Then again, maybe list.pop() with an optional default is good > enough. list.get and list.pop are different, since get wouldn't modify the list and pop would. Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From bwarsaw@python.org Thu Jul 22 22:13:49 1999 From: bwarsaw@python.org (Barry A. 
Warsaw)
Date: Thu, 22 Jul 1999 17:13:49 -0400 (EDT)
Subject: [Python-Dev] I'd like list.pop to accept an optional second
References: <1279463517-21179480@hypernet.com> <37974C86.2EC53BE7@digicool.com>
 <14231.27325.387718.435420@anthem.cnri.reston.va.us>
 <37976F0E.DFB4067B@digicool.com> <379782E4.7DC79460@lemburg.com>
 <14231.33820.422195.45250@anthem.cnri.reston.va.us>
 <37978529.B1AC5273@digicool.com>
Message-ID: <14231.35214.1590.898304@anthem.cnri.reston.va.us>

>>>>> "JF" == Jim Fulton writes:

    JF> list.get and list.pop are different, since get wouldn't modify
    JF> the list and pop would.

Right. Would we need them both?

From jim@digicool.com Thu Jul 22 22:36:03 1999
From: jim@digicool.com (Jim Fulton)
Date: Thu, 22 Jul 1999 17:36:03 -0400
Subject: [Python-Dev] I'd like list.pop to accept an optional second
References: <1279463517-21179480@hypernet.com> <37974C86.2EC53BE7@digicool.com>
 <14231.27325.387718.435420@anthem.cnri.reston.va.us>
 <37976F0E.DFB4067B@digicool.com> <379782E4.7DC79460@lemburg.com>
 <14231.33820.422195.45250@anthem.cnri.reston.va.us>
 <37978529.B1AC5273@digicool.com>
 <14231.35214.1590.898304@anthem.cnri.reston.va.us>
Message-ID: <37978EC3.CAAF2632@digicool.com>

"Barry A. Warsaw" wrote:
>
> >>>>> "JF" == Jim Fulton writes:
>
> JF> list.get and list.pop are different, since get wouldn't modify
> JF> the list and pop would.
>
> Right. Would we need them both?

Sure. Since a sequence is sort of a special kind of mapping,
get makes sense.

I definitely want pop.

Jim

--
Jim Fulton           mailto:jim@digicool.com   Python Powered!
Technical Director   (888) 344-4332            http://www.python.org
Digital Creations    http://www.digicool.com   http://www.zope.org

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email
address may not be added to any commercial mail list with out my
permission. Violation of my privacy with advertising or SPAM will
result in a suit for a MINIMUM of $500 damages/incident, $1500 for
repeats.
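
The difference between the two spellings discussed above (a non-mutating get versus a mutating pop, each taking a default) can be sketched as follows. This is only an illustration in present-day Python 3, not code from the thread: the DefaultList class, its get method and the _MISSING sentinel are hypothetical names, and built-in lists offer neither a get method nor a default argument to pop.

```python
_MISSING = object()  # hypothetical sentinel: "caller supplied no default"


class DefaultList(list):
    """Hypothetical list subclass showing get-with-default vs pop-with-default."""

    def get(self, index, default=None):
        # Non-mutating lookup: the list is left untouched either way.
        try:
            return self[index]
        except IndexError:
            return default

    def pop(self, index=-1, default=_MISSING):
        # Mutating removal: falls back to default only if one was supplied,
        # otherwise the usual IndexError propagates.
        try:
            return super().pop(index)
        except IndexError:
            if default is _MISSING:
                raise
            return default


queue = DefaultList(["job-1"])
peeked = queue.get(0, "nothing")   # looks at the item, queue still holds it
taken = queue.pop(0, "nothing")    # removes the item
after = queue.pop(0, "nothing")    # queue is empty now, so the default comes back
```

Using a private sentinel rather than None as the "no default" marker keeps None available as a legitimate default value, which is the same trick Jim's _marker uses earlier in the thread.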
From tim_one@email.msn.com Fri Jul 23 04:08:05 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 22 Jul 1999 23:08:05 -0400 Subject: [Python-Dev] I'd like list.pop to accept an optional second In-Reply-To: <37976F0E.DFB4067B@digicool.com> Message-ID: <000201bed4b8$951f9ae0$2c2d2399@tim> [Barry] > FWIW, the CVS log message says this feature [3-arg getattr] is > experimental too. :) [Jim] > Eek! I want it to stay! > > I also really like list.pop. :) Don't panic: Guido has never removed a feature explicitly called "experimental"; he's only removed non-experimental ones. that's-why-we-call-stackless-python-"an-experiment"-ly y'rs - tim From tim_one@email.msn.com Fri Jul 23 04:08:07 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 22 Jul 1999 23:08:07 -0400 Subject: [Python-Dev] End of the line In-Reply-To: <009901bed456$161a4950$f29b12c2@secret.pythonware.com> Message-ID: <000301bed4b8$964492e0$2c2d2399@tim> [Tim] > The latest versions of the Icon language ... normalizes Unix, Mac > and Windows line endings to plain \n. Writing in text mode still > produces what's natural for the platform. [/F] > if we were to change this, how would you > tell Python to open a file in text mode? Meaning whatever it is the platform libc does? In Icon or REBOL, you don't. Icon is more interesting because they changed the semantics of their "t" (for "translated") mode without providing any way to go back to the old behavior (REBOL did this too, but didn't have Icon's 15 years of history to wrestle with). Curiously (I doubt Griswold *cared* about this!), the resulting behavior still conforms to ANSI C, because that std promises little about text mode semantics in the presence of non-printable characters. Nothing of mine would miss C's raw text mode (lack of) semantics, so I don't care. 
I *would* like Python to define portable semantics for the mode strings it
accepts in the builtin open regardless, and push platform-specific silliness
(including raw C text mode, if someone really wants that; or MS's "c" mode,
etc) into a new os.fopen function. Push random C crap into expert modules,
where it won't baffle my sister <0.7 wink>.

I expect Python should still open non-binary files in the platform's text
mode, though, to minimize surprises for C extensions mucking with the
underlying stream object (Icon/REBOL don't have this problem, although Icon
opens the file in native libc text mode anyway).

next-step:-define-tabs-to-mean-8-characters-and-drop-unicode-in-
favor-of-7-bit-ascii-ly y'rs - tim

From tim_one@email.msn.com Fri Jul 23 04:08:02 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Thu, 22 Jul 1999 23:08:02 -0400
Subject: [Python-Dev] I'd like list.pop to accept an optional second
In-Reply-To: <37975499.FB61E4E3@lemburg.com>
Message-ID: <000101bed4b8$9395eda0$2c2d2399@tim>

[M.-A. Lemburg]
> Wouldn't a generic builtin for these kinds of things be
> better, e.g. a function returning a default value in case
> an exception occurs... something like:
>
> tryexcept(list.pop(), IndexError, default)
>
> which returns default in case an IndexError occurs. Don't
> think this would be much faster than the explicit try:...except:
> though...

As a function (builtin or not), tryexcept will never get called if
list.pop() raises an exception. tryexcept would need to be a new statement
type, and the compiler would have to generate code akin to

    try:
        whatever = list.pop()
    except IndexError:
        whatever = default

If you want to do it in a C function instead to avoid the Python-level
exception overhead, the compiler would have to wrap list.pop() in a lambda
in order to delay evaluation until the C code got control; and then you've
got worse overhead.

generalization-is-the-devil's-playground-ly y'rs - tim

From tim_one@email.msn.com Fri Jul 23 08:23:27 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Fri, 23 Jul 1999 03:23:27 -0400
Subject: [Python-Dev] I'd like list.pop to accept an optional second argument giving a defaultvalue
In-Reply-To: <37970B4C.8E8C741E@digicool.com>
Message-ID: <000201bed4dc$41c9f240$392d2399@tim>

In a moment of insanity, Guido gave me carte blanche to suggest new list
methods, and list.pop & list.extend were the result. I considered spec'ing
list.pop to take an optional "default on bad index" argument too, but after
playing with it didn't like it (always appeared just as easy & clearer to
use "if list:" / "while list:" etc).

Jim has a novel use I hadn't considered:

> With pop, you can essentially test whether the list is
> empty and get a value if it isn't in one atomic operation:
>
>   try:
>       foo=queue.pop(0)
>   except IndexError:
>       ... empty queue case
>   else:
>       ... non-empty case, do something with foo
>
> Unfortunately, this incurs exception overhead. I'd rather do
> something like:
>
>   foo=queue.pop(0,marker)
>   if foo is marker:
>       ... empty queue case
>   else:
>       ... non-empty case, do something with foo

It's both clever and pretty. OTOH, the original try/except isn't expensive
unless the "except" triggers frequently, in which case (the queue is often
empty) a thread is likely better off with a yielding Queue.get() call.

So this strikes me as useful only for thread micro-optimization, and a kind
of optimization most users should be steered away from anyway. Does anyone
have a real use for this outside of threads? If not, I'd rather it not go
in.
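
The evaluation-order point above can be made concrete with a small sketch in present-day Python 3. The tryexcept helper here is hypothetical (no such builtin exists); it takes a zero-argument callable, which is the "wrap it in a lambda" workaround described, and the comparison with the eager call shows why the original spelling cannot work as an ordinary function.

```python
def tryexcept(thunk, exc_type, default):
    """Hypothetical helper: call thunk(); return default if exc_type is raised."""
    try:
        return thunk()
    except exc_type:
        return default


stack = []

# Eager form, as first proposed: stack.pop() runs while the arguments are
# being evaluated, so the IndexError escapes before tryexcept is entered.
try:
    tryexcept(stack.pop(), IndexError, "empty")
    eager_outcome = "returned"
except IndexError:
    eager_outcome = "raised"

# Delayed form: the lambda postpones the pop until tryexcept calls it,
# so the exception is raised inside the helper's try block.
delayed_outcome = tryexcept(lambda: stack.pop(), IndexError, "empty")
```

The delayed form works, but it pays for building and calling an extra function object on every use, which is the overhead objection raised above.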
For threads that need an optimized non-blocking probe, I'd write it:

    gotone = 0
    if queue:
        try:
            foo = queue.pop(0)
            gotone = 1
        except IndexError:
            pass
    if gotone:
        # use foo
    else:
        # twiddle thumbs

For the IndexError to trigger there, a thread has to lose its bytecode slice
between a successful "if queue" and the queue.pop, and not get another
chance to run until other threads have emptied the queue.

From mal@lemburg.com Fri Jul 23 09:27:47 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 23 Jul 1999 10:27:47 +0200
Subject: [Python-Dev] I'd like list.pop to accept an optional second
References: <000101bed4b8$9395eda0$2c2d2399@tim>
Message-ID: <37982783.E60E9941@lemburg.com>

Tim Peters wrote:
>
> [M.-A. Lemburg]
> > Wouldn't a generic builtin for these kinds of things be
> > better, e.g. a function returning a default value in case
> > an exception occurs... something like:
> >
> > tryexcept(list.pop(), IndexError, default)
> >
> > which returns default in case an IndexError occurs. Don't
> > think this would be much faster than the explicit try:...except:
> > though...
>
> As a function (builtin or not), tryexcept will never get called if
> list.pop() raises an exception.

Dang. You're right...

> tryexcept would need to be a new statement
> type, and the compiler would have to generate code akin to
>
>     try:
>         whatever = list.pop()
>     except IndexError:
>         whatever = default
>
> If you want to do it in a C function instead to avoid the Python-level
> exception overhead, the compiler would have to wrap list.pop() in a lambda
> in order to delay evaluation until the C code got control; and then you've
> got worse overhead.

Oh well, forget the whole idea then. list.pop() is really not
needed that often anyways to warrant the default arg thing, IMHO.
dict.get() and getattr() have the default arg as performance
enhancement and I believe that you wouldn't get all that much
better performance on average by adding a second optional
argument to list.pop().
BTW, there is a generic get() function in mxTools (you know where...) in case someone should be looking for such a beast. It works with all sequences and mappings. Also, has anybody considered writing list.pop(..,default) this way: if list: obj = list.pop() else: obj = default No exceptions, no changes, fast as hell :-) -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 162 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From tismer@appliedbiometrics.com Fri Jul 23 11:39:27 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Fri, 23 Jul 1999 12:39:27 +0200 Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <000101bed4b8$9395eda0$2c2d2399@tim> <37982783.E60E9941@lemburg.com> Message-ID: <3798465F.33A253D4@appliedbiometrics.com> "M.-A. Lemburg" wrote: ... > Also, has anybody considered writing list.pop(..,default) this way: > > if list: > obj = list.pop() > else: > obj = default > > No exceptions, no changes, fast as hell :-) Yes, that's the best way to go, I think. But wasn't the primary question directed on an atomic function which is thread-safe? I'm not sure, this thread has grown too fast :-) -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From mal@lemburg.com Fri Jul 23 12:07:22 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 23 Jul 1999 13:07:22 +0200 Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <000101bed4b8$9395eda0$2c2d2399@tim> <37982783.E60E9941@lemburg.com> <3798465F.33A253D4@appliedbiometrics.com> Message-ID: <37984CEA.1DF062F6@lemburg.com> Christian Tismer wrote: > > "M.-A. Lemburg" wrote: > ... 
> > Also, has anybody considered writing list.pop(..,default) this way: > > > > if list: > > obj = list.pop() > > else: > > obj = default > > > > No exceptions, no changes, fast as hell :-) > > Yes, that's the best way to go, I think. > But wasn't the primary question directed on > an atomic function which is thread-safe? > I'm not sure, this thread has grown too fast :-) I think that was what Jim had in mind in the first place. Hmm, so maybe we're not after lists after all: maybe what we need is access to the global interpreter lock in Python, so that we can write: sys.lock.acquire() if list: obj = list.pop() else: obj = default sys.lock.release() Or maybe we need some general lock in the thread module for these purposes... don't know. It's been some time since I used threads. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 162 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From jim@digicool.com Fri Jul 23 12:58:23 1999 From: jim@digicool.com (Jim Fulton) Date: Fri, 23 Jul 1999 07:58:23 -0400 Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <000101bed4b8$9395eda0$2c2d2399@tim> <37982783.E60E9941@lemburg.com> <3798465F.33A253D4@appliedbiometrics.com> Message-ID: <379858DF.D317A40F@digicool.com> Christian Tismer wrote: > > "M.-A. Lemburg" wrote: > ... > > Also, has anybody considered writing list.pop(..,default) this way: > > > > if list: > > obj = list.pop() > > else: > > obj = default > > > > No exceptions, no changes, fast as hell :-) > > Yes, that's the best way to go, I think. > But wasn't the primary question directed on > an atomic function which is thread-safe? Right. And the above code doesn't solve this problem. Tim's code *does* solve the problem. It's the code we were using. It is a bit verbose though. > I'm not sure, this thread has grown too fast :-) Don't they all? Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! 
Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From Fred L. Drake, Jr." References: <000101bed4b8$9395eda0$2c2d2399@tim> <37982783.E60E9941@lemburg.com> Message-ID: <14232.34105.421424.838212@weyr.cnri.reston.va.us> Tim Peters wrote: > As a function (builtin or not), tryexcept will never get called if > list.pop() raises an exception. M.-A. Lemburg writes: > Oh well, forget the whole idea then. list.pop() is really not Giving up already? Wouldn't you just love this as an expression operator (which could work)? How about: top = list.pop() excepting IndexError, default Hehehe... ;-) -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From skip@mojam.com (Skip Montanaro) Fri Jul 23 17:23:31 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Fri, 23 Jul 1999 11:23:31 -0500 (CDT) Subject: [Python-Dev] I'd like list.pop to accept an optional second In-Reply-To: <14232.34105.421424.838212@weyr.cnri.reston.va.us> References: <000101bed4b8$9395eda0$2c2d2399@tim> <37982783.E60E9941@lemburg.com> <14232.34105.421424.838212@weyr.cnri.reston.va.us> Message-ID: <14232.38398.818044.283682@52.chicago-35-40rs.il.dial-access.att.net> Fred> Giving up already? Wouldn't you just love this as an expression Fred> operator (which could work)? Fred> How about: Fred> top = list.pop() excepting IndexError, default Why not go all the way to Perl with top = list.pop() unless IndexError ??? ;-) Skip From Fred L. Drake, Jr." 
References: <000101bed4b8$9395eda0$2c2d2399@tim> <37982783.E60E9941@lemburg.com> <14232.34105.421424.838212@weyr.cnri.reston.va.us> <14232.38398.818044.283682@52.chicago-35-40rs.il.dial-access.att.net> Message-ID: <14232.39065.687719.135590@weyr.cnri.reston.va.us> Skip Montanaro writes: > Why not go all the way to Perl with > > top = list.pop() unless IndexError Trying to kill me, Skip? ;-) Actually, the semantics are different. If we interpret that using the Perl semantics for "unless", don't we have the same thing as: if not IndexError: top = list.pop() Since IndexError will normally be a non-empty string or a class, this is pretty much: if 0: top = list.pop() which certainly isn't quite as interesting. ;-) -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From skip@mojam.com (Skip Montanaro) Fri Jul 23 21:23:12 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Fri, 23 Jul 1999 15:23:12 -0500 (CDT) Subject: [Python-Dev] I'd like list.pop to accept an optional second In-Reply-To: <14232.39065.687719.135590@weyr.cnri.reston.va.us> References: <000101bed4b8$9395eda0$2c2d2399@tim> <37982783.E60E9941@lemburg.com> <14232.34105.421424.838212@weyr.cnri.reston.va.us> <14232.38398.818044.283682@52.chicago-35-40rs.il.dial-access.att.net> <14232.39065.687719.135590@weyr.cnri.reston.va.us> Message-ID: <14232.52576.746910.229435@227.chicago-26-27rs.il.dial-access.att.net> Fred> Skip Montanaro writes: >> Why not go all the way to Perl with >> >> top = list.pop() unless IndexError Fred> Trying to kill me, Skip? ;-) Nope, just a flesh wound. I'll wait for the resulting infection to really do you in. ;-) Fred> Actually, the semantics are different. If we interpret that using Fred> the Perl semantics for "unless", don't we have the same thing as: Yes, but the flavor is the same. Reading Perl code that uses the unless keyword always seemed counterintuitive to me. Something like x = y unless foo; always reads to me like, "Assign y to x. 
No, wait a minute. I forgot something. Only do that if foo isn't true." What was so bad about if (!foo) { x = y; } That was my initial reaction to the use of the trailing except. We argue a lot in the Python community about whether or not a proposed language feature increases the expressive power of the language or not (which is a good idea in my opinion). The Perl community has apparently never been afflicted with that disease. smiles all 'round... Skip From tismer@appliedbiometrics.com Sat Jul 24 00:36:33 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Sat, 24 Jul 1999 01:36:33 +0200 Subject: [Python-Dev] continuations for the curious Message-ID: <3798FC81.A57E9CFE@appliedbiometrics.com> Howdy, my modules are nearly ready. I will be out of my office for two weeks, but had no time to finalize and publish yet. Stackless Python has reached what I wanted it to reach: A continuation can be saved at every opcode. The continuationmodule has been shrunk heavily. Some extension is still needed, continuations are still frames, but they can be picked like Sam wanted it originally. Sam, I'm pretty sure this is more than enough for coroutines. Just have a look at getpcc(), this is now very easy. All involved frames are armed so that they *can* save themselves, but will do so just if necessary. The cheapest solution I could think of, no other optimization is necessary. If your coroutine functions like to swap two frames, and if they manage to do so that the refcount of the target stays at one, no extra frame will be generated. That's it, really. If someone wants to play, get the stackless module, replace ceval.c, and build continuationmodule.c as a dll or whatever. testct.py contains a lot of crap. The first implementation of class coroutine is working right. The second one is wrong by concept. later - chris ftp://ftp.pns.cc/pub/veryfar.zip -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! 
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From rushing@nightmare.com Sat Jul 24 02:52:00 1999 From: rushing@nightmare.com (Sam Rushing) Date: Fri, 23 Jul 1999 18:52:00 -0700 (PDT) Subject: [Python-Dev] continuations for the curious In-Reply-To: <3798FC81.A57E9CFE@appliedbiometrics.com> References: <3798FC81.A57E9CFE@appliedbiometrics.com> Message-ID: <14233.7163.919863.981628@seattle.nightmare.com> Hey Chris, I think you're missing some include files from 'veryfar.zip'? ceval.c: In function `PyEval_EvalCode': ceval.c:355: warning: return makes pointer from integer without a cast ceval.c: In function `PyEval_EvalCode_nr': ceval.c:375: `Py_UnwindToken' undeclared (first use this function) ceval.c:375: (Each undeclared identifier is reported only once ceval.c:375: for each function it appears in.) ceval.c: In function `eval_code2_setup': ceval.c:490: structure has no member named `f_execute' ceval.c:639: structure has no member named `f_first_instr' ceval.c:640: structure has no member named `f_next_instr' -Sam From tim_one@email.msn.com Sat Jul 24 03:16:16 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 23 Jul 1999 22:16:16 -0400 Subject: [Python-Dev] I'd like list.pop to accept an optional second In-Reply-To: <37984CEA.1DF062F6@lemburg.com> Message-ID: <000c01bed57a$82a79620$832d2399@tim> > ... > Hmm, so maybe we're not after lists after all: maybe what > we need is access to the global interpreter lock in Python, > so that we can write: > > sys.lock.acquire() > if list: > obj = list.pop() > else: > obj = default > sys.lock.release() The thread attempting the sys.lock.acquire() necessarily already owns the global lock, so the attempt to acquire it is a guaranteed deadlock -- arguably not helpful . 
> Or maybe we need some general lock in the thread module for these > purposes... don't know. It's been some time since I used > threads. Jim could easily allocate a list lock for this purpose if that's what he wanted; and wrap it in a class with a nice interface too. He'd eventually end up with the std Queue.py module, though. But if he doesn't want the overhead of an exception when the queue is empty, he sure doesn't want the comparatively huge overhead of a (any flavor of) lock either (which may drag the OS into the picture). There's nothing wrong with wanting a fast thread-safe queue! I just don't like the idea of adding an otherwise-ugly new gimmick to core lists for it; also have to wonder about Jim's larger picture if he's writing stuff in Python that's *so* time-critical that the overhead of an ordinary exception from time to time is a genuine problem. The verbosity of the alternative can be hidden in a lock-free class or function, if it's the clumsiness instead of the time that's grating. From mal@lemburg.com Sat Jul 24 09:38:59 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Sat, 24 Jul 1999 10:38:59 +0200 Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <000c01bed57a$82a79620$832d2399@tim> Message-ID: <37997BA3.B5AB23B4@lemburg.com> Tim Peters wrote: > > > ... > > Hmm, so maybe we're not after lists after all: maybe what > > we need is access to the global interpreter lock in Python, > > so that we can write: > > > > sys.lock.acquire() > > if list: > > obj = list.pop() > > else: > > obj = default > > sys.lock.release() > > The thread attempting the sys.lock.acquire() necessarily already owns the > global lock, so the attempt to acquire it is a guaranteed deadlock -- > arguably not helpful . True, sys.lock.acquire() would have to set a flag *not* to release the lock until the next call to sys.lock.release(), which then clears this flag again. 
Sort of a lock for the unlocking the lock ;-) Could this work, or am I having a mind twister somewhere in there again ? -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 160 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From gmcm@hypernet.com Sat Jul 24 13:41:39 1999 From: gmcm@hypernet.com (Gordon McMillan) Date: Sat, 24 Jul 1999 07:41:39 -0500 Subject: [Python-Dev] I'd like list.pop to accept an optional second In-Reply-To: <37997BA3.B5AB23B4@lemburg.com> Message-ID: <1279306201-30642004@hypernet.com> M.-A. Lemburg writes: > True, sys.lock.acquire() would have to set a flag *not* to release > the lock until the next call to sys.lock.release(), which then > clears this flag again. Sort of a lock for the unlocking the lock > ;-) > > Could this work, or am I having a mind twister somewhere in > there again ? Sounds like a critical section to me. On Windows, those are lightweight and very handy. You can build one with Python thread primitives, but unfortunately, they come out on the heavy side. Locks come in 4 types, categorized by whether they can be released only by the owning thread, and whether they can be acquired recursively. The interpreter lock is in the opposite quadrant from a critical section, so "sys.lock.freeze()" and "sys.lock.thaw()" have little chance of having an efficient implementation on any platform. A shame. That would be pretty cool. - Gordon From tim_one@email.msn.com Sun Jul 25 19:57:50 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sun, 25 Jul 1999 14:57:50 -0400 Subject: [Python-Dev] End of the line In-Reply-To: <000c01becdac$d2ad6300$7d9e2299@tim> Message-ID: <000001bed6cf$984cd8e0$b02d2399@tim> [Tim, notes that Perl line-at-a-time text mode input runs 3x faster than Python's on his platform] And much to my surprise, it turns out Perl reads lines a character at a time too! And they do not reimplement stdio. But they do cheat. 
Perl's internals are written on top of an abstract IO API, with "PerlIO *" instead of "FILE *", "PerlIO_tell(PerlIO *)" instead of "ftell(FILE*)", and so on. Nothing surprising in the details, except maybe that stdin is modeled as a function "PerlIO *PerlIO_stdin(void)" instead of as global data (& ditto for stdout/stderr).

The usual *implementation* of these guys is as straight macro substitution to the corresponding C stdio call. It's possible to implement them some other way, but I don't see anything in the source that suggests anyone has done so, except possibly to build it all on AT&T's SFIO lib.

So where's the cheating? In these API functions:

    int     PerlIO_has_base(PerlIO *);
    int     PerlIO_has_cntptr(PerlIO *);
    int     PerlIO_canset_cnt(PerlIO *);

    char   *PerlIO_get_ptr(PerlIO *);
    int     PerlIO_get_cnt(PerlIO *);
    void    PerlIO_set_cnt(PerlIO *,int);
    void    PerlIO_set_ptrcnt(PerlIO *,char *,int);
    char   *PerlIO_get_base(PerlIO *);
    int     PerlIO_get_bufsiz(PerlIO *);

In almost all platform stdio implementations, the C FILE struct has members that may vary in name but serve the same purpose: an internal buffer, and some way (pointer or offset) to get at "the next" buffer character. The guys above are usually just (after layers & layers of config stuff sets it up) macros that expand into the platform's internal way of spelling these things. For example, the count member is spelled under Windows as fp->_cnt under VC, or as fp->level under Borland.

The payoff is in Perl's sv_gets function, in file sv.c. This is long and very complicated, but at its core has a fast inner loop that copies characters (provided the PerlIO_has/canXXX functions say it's possible) directly from the stdio buffer into a Perl string variable -- in the way a platform fgets function *would* do it if it bothered to optimize fgets. In my experience, platforms usually settle for the same kind of fgetc/EOF?/newline? loop Python uses, as if fgets were a stdio client rather than a stdio primitive.
Perl's keeps everything in registers inside the loop, updates the FILE struct members only at the boundaries, and doesn't check for EOF except at the boundaries (so long as the buffer has unread stuff in it, you can't be at EOF).

If the stdio buffer is exhausted before the input terminator is seen (Perl has "input record separator" and "paragraph mode" gimmicks, so it's hairier than just looking for \n), it calls PerlIO_getc once to force the platform to refill the buffer, and goes back to the screaming loop.

Major hackery, but major payoff (on most platforms) too. The abstract I/O layer is a fine idea regardless. The sad thing is that the real reason Perl is so fast here is that platform fgets is so needlessly slow.

perl-input-is-faster-than-c-input-ly y'rs - tim

From tim_one@email.msn.com Mon Jul 26 05:58:31 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Mon, 26 Jul 1999 00:58:31 -0400
Subject: [Python-Dev] I'd like list.pop to accept an optional second
In-Reply-To: <37982783.E60E9941@lemburg.com>
Message-ID: <000601bed723$81bb1020$492d2399@tim>

[M.-A. Lemburg]
> ...
> Oh well, forget the whole idea then. list.pop() is really not
> needed that often anyways to warrant the default arg thing, IMHO.
> dict.get() and getattr() have the default arg as performance
> enhancement

I like their succinctness too;

    count = dict.get(key, 0)

is helpfully "slimmer" than either of

    try:
        count = dict[key]
    except KeyError:
        count = 0

or

    count = 0
    if dict.has_key(key):
        count = dict[key]

> and I believe that you wouldn't get all that much better performance
> on average by adding a second optional argument to list.pop().

I think you wouldn't at *all*, except in Jim's novel case. That is, when a list is empty, it's usually the signal to get out of a loop, and you can either test

    if list:
        item = list.pop()
    else:
        break

today or

    item = list.pop(-1, marker)
    if item is marker:
        break

tomorrow.
The second way doesn't buy anything to my eye, and the first way is very often the pretty while list: item = list.pop() if-it-weren't-for-jim's-use-i'd-see-no-use-at-all-ly y'rs - tim From mal@lemburg.com Mon Jul 26 09:31:01 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 26 Jul 1999 10:31:01 +0200 Subject: [Python-Dev] Thread locked sections References: <1279306201-30642004@hypernet.com> Message-ID: <379C1CC5.51A89688@lemburg.com> Gordon McMillan wrote: > > M.-A. Lemburg writes: > > > True, sys.lock.acquire() would have to set a flag *not* to release > > the lock until the next call to sys.lock.release(), which then > > clears this flag again. Sort of a lock for the unlocking the lock > > ;-) > > > > Could this work, or am I having a mind twister somewhere in > > there again ? > > Sounds like a critical section to me. On Windows, those are > lightweight and very handy. You can build one with Python thread > primitives, but unfortunately, they come out on the heavy side. > > Locks come in 4 types, categorized by whether they can be released > only by the owning thread, and whether they can be acquired > recursively. The interpreter lock is in the opposite quadrant from a > critical section, so "sys.lock.freeze()" and "sys.lock.thaw()" have > little chance of having an efficient implementation on any platform. Actually, I think all that's needed is another global like the interpreter_lock in ceval.c. Since this lock is only accessed via abstract functions, I presume the unlock flag could easily be added. The locking section would only focus on Python, though: other threads could still be running provided they don't execute Python code, e.g. write data to a spooler. So it's not really the equivalent of a critical section as the one you can define in C. PS: I changed the subject line... 
hope this doesn't kill the thread ;) -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 158 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From Brian@digicool.com Mon Jul 26 14:46:00 1999 From: Brian@digicool.com (Brian Lloyd) Date: Mon, 26 Jul 1999 09:46:00 -0400 Subject: [Python-Dev] End of the line Message-ID: <613145F79272D211914B0020AFF6401914DC02@gandalf.digicool.com> > [Tim, notes that Perl line-at-a-time text mode input runs 3x > faster than > Python's on his platform] > > And much to my surprise, it turns out Perl reads lines a > character at a time > too! And they do not reimplement stdio. But they do cheat. > > [some notes on the cheating and PerlIO api snipped] > > The usual *implementation* of these guys is as straight macro > substitution > to the corresponding C stdio call. It's possible to > implement them some > other way, but I don't see anything in the source that > suggests anyone has > done so, except possibly to build it all on AT&T's SFIO lib. Hmm - speed bonuses not withstanding, an implementation of such a beast in the Python sources would've helped a lot to reduce the ugly hairy gymnastics required to get Python going on Win CE, where (until very recently) there was no concept of most of the things you expect to find in stdio... Brian Lloyd brian@digicool.com Software Engineer 540.371.6909 Digital Creations http://www.digicool.com From mhammond@skippinet.com.au Mon Jul 26 23:49:56 1999 From: mhammond@skippinet.com.au (Mark Hammond) Date: Tue, 27 Jul 1999 08:49:56 +1000 Subject: [Python-Dev] Thread locked sections In-Reply-To: <379C1CC5.51A89688@lemburg.com> Message-ID: <002801bed7b9$2fa8b620$0801a8c0@bobcat> > Actually, I think all that's needed is another global like > the interpreter_lock in ceval.c. Since this lock is only > accessed via abstract functions, I presume the unlock flag could > easily be added. 
Well, my personal opinion is that this is really quite wrong. The most obvious thing to me is that we are exposing an implementation detail we all would dearly like to see removed one day - the global interpreter lock. But even if we ignore that, it seems to me that you are describing an application abstraction, not a language abstraction. This thread started with Jim wanting a thread-safe, atomic list operation. This is not an unusual requirement (ie, a thread-safe, atomic operation), so languages give you access to primitives that let you build this. To my mind, you are asking for the equivilent of a C function that says "suspend all threads except me, cos Im doing something _really_ important". C does not provide that, and I have never thought it should. As Gordon said, Win32 has critical sections, but these are really just lightweight locks. I really dont see how Python is different - it gives you all the tools you need to build these abstractions. I really dont see what you are after that can not be done with a lock. If the performance is a problem, then to paraphrase the Timbot, it may be questionable if you are using Python appropriately in this case. Mark. From tim_one@email.msn.com Tue Jul 27 02:41:17 1999 From: tim_one@email.msn.com (Tim Peters) Date: Mon, 26 Jul 1999 21:41:17 -0400 Subject: [Python-Dev] End of the line In-Reply-To: <613145F79272D211914B0020AFF6401914DC02@gandalf.digicool.com> Message-ID: <000b01bed7d1$1eac5620$eea22299@tim> [Tim, on the cheating PerlIO API] [Brian Lloyd] > Hmm - speed bonuses not withstanding, an implementation of > such a beast in the Python sources would've helped a lot to > reduce the ugly hairy gymnastics required to get Python going > on Win CE, where (until very recently) there was no concept > of most of the things you expect to find in stdio... I don't think it would have helped you there. If e.g. 
ftell is missing, it's no easier to implement it yourself under the name "PerlIO_ftell" than under the name "ftell" ... Back before Larry Wall got it into in his head that Perl is a grand metaphor for freedom and creativity (or whatever), he justifiably claimed that Perl's great achievement was in taming Unix. Which it did! Perl essentially defined yet a 537th variation of libc/shell/tool semantics, but in a way that worked the same across its 536 Unix hosts. The PerlIO API is a great help with *that*: if a platform is a little off kilter in its implementation of one of these functions, Perl can use a corresponding PerlIO wrapper to hide the shortcoming in a platform-specific file, and the rest of Perl blissfully assumes everything works the same everywhere. That's a good, cool idea. Ironically, Perl does more to hide gratuitous platform differences here than Python does! But it's just a pile of names if you've got no stdio to build on. let's-model-PythonIO-on-the-win32-api-ly y'rs - tim From mhammond@skippinet.com.au Tue Jul 27 03:13:09 1999 From: mhammond@skippinet.com.au (Mark Hammond) Date: Tue, 27 Jul 1999 12:13:09 +1000 Subject: [Python-Dev] End of the line In-Reply-To: <000b01bed7d1$1eac5620$eea22299@tim> Message-ID: <002a01bed7d5$93a4a780$0801a8c0@bobcat> > let's-model-PythonIO-on-the-win32-api-ly y'rs - tim Interestingly, this raises a point worth mentioning sans-wink :-) Win32 has quite a nice concept that file handles (nearly all handles really) are "waitable". Indeed, in the Win32 world, this feature usually prevents me from using the "threading" module - I need to wait on objects other than threads or locks (usually files, but sometimes child processes). I also usually need a "wait for the first one of these objects", which threading doesnt provide, but that is a digression... What Im getting at is that a Python IO model should maybe go a little further than "tradtional" IO - asynchronous IO and synchronisation capabilities should also be specified. 
Of course, these would be optional, but it would be excellent if a platform could easily slot into pre-defined Python semantics if possible. Is this reasonable, or really simply too hard to abstract in the manner I an talking!? Mark. From mal@lemburg.com Tue Jul 27 09:31:27 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 27 Jul 1999 10:31:27 +0200 Subject: [Python-Dev] Thread locked sections References: <002801bed7b9$2fa8b620$0801a8c0@bobcat> Message-ID: <379D6E5F.B29251EF@lemburg.com> Mark Hammond wrote: > > > Actually, I think all that's needed is another global like > > the interpreter_lock in ceval.c. Since this lock is only > > access From mal@lemburg.com Tue Jul 27 10:23:05 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 27 Jul 1999 11:23:05 +0200 Subject: [Python-Dev] Thread locked sections References: <002801bed7b9$2fa8b620$0801a8c0@bobcat> Message-ID: <379D7A79.DB97B2C@lemburg.com> [The previous mail got truncated due to insufficient disk space; here is a summary...] Mark Hammond wrote: > > > Actually, I think all that's needed is another global like > > the interpreter_lock in ceval.c. Since this lock is only > > accessed via abstract functions, I presume the unlock flag could > > easily be added. > > Well, my personal opinion is that this is really quite wrong. The most > obvious thing to me is that we are exposing an implementation detail we all > would dearly like to see removed one day - the global interpreter lock. > > But even if we ignore that, it seems to me that you are describing an > application abstraction, not a language abstraction. This thread started > with Jim wanting a thread-safe, atomic list operation. This is not an > unusual requirement (ie, a thread-safe, atomic operation), so languages > give you access to primitives that let you build this. > > To my mind, you are asking for the equivilent of a C function that says > "suspend all threads except me, cos Im doing something _really_ important". 
> C does not provide that, and I have never thought it should. As Gordon
> said, Win32 has critical sections, but these are really just lightweight
> locks. I really dont see how Python is different - it gives you all the
> tools you need to build these abstractions.
>
> I really dont see what you are after that can not be done with a lock. If
> the performance is a problem, then to paraphrase the Timbot, it may be
> questionable if you are using Python appropriately in this case.

The locked section may not be leading in the right direction, but it surely helps in situations where you cannot otherwise enforce usage of an object specific lock, e.g. for builtin file objects (some APIs insist on getting the real thing, not a thread safe wrapper).

Here is a hack that lets you do much the same with an unpatched Python interpreter:

    sys.setcheckinterval(sys.maxint) # *)
    # >=10 Python OPs to flush the ticker counter and have the new
    # check interval setting take effect:
    0==0; 0==0; 0==0; 0==0
    try:

        ...lock section...

    finally:
        sys.setcheckinterval(10)

*) sys.setcheckinterval should really return the previous value so that we can reset the value to the original one afterwards.

Note that the lock section may not call code which uses the Py_*_ALLOW_THREADS macros.
-- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 157 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From rushing@nightmare.com Tue Jul 27 11:33:03 1999 From: rushing@nightmare.com (Sam Rushing) Date: Tue, 27 Jul 1999 03:33:03 -0700 (PDT) Subject: [Python-Dev] continuations for the curious In-Reply-To: <3798FC81.A57E9CFE@appliedbiometrics.com> References: <3798FC81.A57E9CFE@appliedbiometrics.com> Message-ID: <14237.33980.82091.445607@seattle.nightmare.com> --IuOx2/xRLE Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit I've been playing for a bit, trying to write my own coroutine class (obeying the law of "you won't understand it until you write it yourself") based on one I've worked up for 'lunacy'. I think I have it, let me know what you think: >>> from coroutine import * >>> cc = coroutine (counter, 100, 10) >>> cc.resume() 100 >>> cc.resume() 110 >>> Differences: 1) callcc wraps the 'escape frame' with a lambda, so that it can be invoked like any other function. this actually simplifies the bootstrapping, because starting the function is identical to resuming it. 2) the coroutine object keeps track of who resumed it, so that it can resume the caller without having to know who it is. 3) the coroutine class keeps track of which is the currently 'active' coroutine. It's currently a class variable, but I think this can lead to leaks, so it might have to be made a global. 
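
For readers without a Stackless build, the counter session above can be approximated with an ordinary generator. This is only a sketch under loud assumptions: it relies on generators from later Python versions rather than on the continuation module, and the `Coroutine` wrapper and its `resume` method are hypothetical names chosen to mirror the interface shown in the session. It does not track callers or the currently active coroutine (points 2 and 3).

```python
def counter(start=0, step=1):
    # Yield an endless arithmetic sequence; each yield hands control
    # back to whoever resumed us, much like resume_caller() does.
    n = start
    while True:
        yield n
        n = n + step


class Coroutine:
    """Hypothetical wrapper giving a generator a resume() method."""

    def __init__(self, func, *args, **kw):
        # Calling a generator function only creates the generator;
        # nothing runs until the first resume().
        self._gen = func(*args, **kw)

    def resume(self):
        # Run the body up to its next yield and return that value.
        return next(self._gen)


cc = Coroutine(counter, 100, 10)
first = cc.resume()
second = cc.resume()
print(first, second)
```

As in the session, starting the function and resuming it are the same operation from the caller's side.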
+-----------------------------------------------------------------
| For those folks (like me) that were confused about where to get
| all the necessary files for building the latest Stackless Python,
| here's the procedure:
|
| 1) unwrap a fresh copy of 1.5.2
| 2) unzip
|    http://www.pns.cc/anonftp/pub/stackless_990713.zip
|    on top of it
| 3) then, unzip
|    ftp://ftp.pns.cc/pub/veryfar.zip
|    on top of that
| 4) add "continuation continuationmodule.c" to Modules/Setup

-Sam

--IuOx2/xRLE
Content-Type: text/plain
Content-Description: coroutine.py
Content-Disposition: inline; filename="coroutine.py"
Content-Transfer-Encoding: 7bit

# -*- Mode: Python; tab-width: 4 -*-

import continuation

def callcc (fun, *args, **kw):
    k = continuation.getpcc(2) # eerie, that numeral two
    kfun = lambda v,k=k: continuation.putcc (k, v)
    return apply (fun, (kfun,)+args, kw)

class coroutine:

    current = None

    def __init__ (self, f, *a, **kw):
        self.state = lambda v,f=f,a=a,kw=kw: apply (f, a, kw)
        self.caller = None

    def resume (self, value=None):
        caller = coroutine.current
        callcc (caller._save)
        self.caller = caller
        coroutine.current = self
        self.state (value)

    def _save (self, state):
        self.state = state

def resume_caller (value):
    me = coroutine.current
    me.caller.resume (value)

def resume_main (value):
    main.resume (value)

main = coroutine (None)
coroutine.current = main

# counter/generator

def counter (start=0, step=1):
    n = start
    while 1:
        resume_caller (n)
        n = n + step

# same-fringe

def _tree_walker (t):
    if type(t) is type([]):
        for x in t:
            _tree_walker (x)
    else:
        resume_caller (t)

def tree_walker (t):
    _tree_walker (t)
    resume_caller (None)

def same_fringe (t1, t2):
    co1 = coroutine (tree_walker, t1)
    co2 = coroutine (tree_walker, t2)
    while 1:
        leaf1 = co1.resume()
        leaf2 = co2.resume()
        print 'leaf1: %s leaf2: %s' % (leaf1, leaf2)
        if leaf1 == leaf2:
            if leaf1 is None:
                return 1
        else:
            return 0

--IuOx2/xRLE--

From jack@oratrix.nl Tue Jul 27 13:04:39 1999
From: jack@oratrix.nl (Jack Jansen)
Date: Tue, 27 Jul 1999
14:04:39 +0200 Subject: [Python-Dev] End of the line In-Reply-To: Message by "Mark Hammond" , Tue, 27 Jul 1999 12:13:09 +1000 , <002a01bed7d5$93a4a780$0801a8c0@bobcat> Message-ID: <19990727120440.13D5F303120@snelboot.oratrix.nl> > What Im getting at is that a Python IO model should maybe go a little > further than "tradtional" IO - asynchronous IO and synchronisation > capabilities should also be specified. Of course, these would be optional, > but it would be excellent if a platform could easily slot into pre-defined > Python semantics if possible. What Python could do with reasonable ease is a sort of "promise" model, where an I/O operation returns an object that waits for the I/O to complete upon access or destruction. Something like def foo(): obj = stdin.delayed_read() obj2 = stdout.delayed_write("data") do_lengthy_computation() data = obj.get() # Here we wait for the read to complete del obj2 # Here we wait for the write to complete. This gives a fairly nice programming model. -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From mhammond@skippinet.com.au Tue Jul 27 13:10:56 1999 From: mhammond@skippinet.com.au (Mark Hammond) Date: Tue, 27 Jul 1999 22:10:56 +1000 Subject: [Python-Dev] Thread locked sections In-Reply-To: <379D7A79.DB97B2C@lemburg.com> Message-ID: <004201bed829$16211c40$0801a8c0@bobcat> [Marc writes] > The locked section may not be leading in the right direction, > but it surely helps in situations where you cannot otherwise > enforce useage of an object specific lock, e.g. for builtin > file objects (some APIs insist on getting the real thing, not > a thread safe wrapper). Really, all this boils down to is that you want a Python-ish critical section - ie, a light-weight lock. 
This presumably would be desirable if it could be shown Python locks are indeed "heavy" - I know that from the C POV they may be considered as such, but I havent seen many complaints about lock speed from Python. So in an attempt to get _some_ evidence, I wrote a test program that used the Queue module to append 10000 integers then remove them all. I then hacked the queue module to remove all locking, and ran the same test. The results were 2.4 seconds for the non-locking version, vs 3.8 for the standard version. Without time (or really inclination ) to take this further, it _does_ appear a native Python "critical section" could indeed save a few milli-seconds for a few real-world apps. So if we ignore the implementation details Marc started spelling, does the idea of a Python "critical section" appeal? Could simply be a built-in way of saying "no other _Python_ threads should run" (and of-course the "allow them again"). The semantics could be simply to ensure the Python program integrity - it need say nothing about the Python internal "state" as such. Mark. From mal@lemburg.com Tue Jul 27 13:27:55 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 27 Jul 1999 14:27:55 +0200 Subject: [Python-Dev] continuations for the curious References: <3798FC81.A57E9CFE@appliedbiometrics.com> <14237.33980.82091.445607@seattle.nightmare.com> Message-ID: <379DA5CB.B3619365@lemburg.com> Sam Rushing wrote: > > +----------------------------------------------------------------- > | For those folks (like me) that were confused about where to get > | all the necessary files for building the latest Stackless Python, > | here's the procedure: Thanks... this guide made me actually try it ;-) > | > | 1) unwrap a fresh copy of 1.5.2 > | 2) unzip > | http://www.pns.cc/anonftp/pub/stackless_990713.zip > | on top of it > | 3) then, unzip > | ftp://ftp.pns.cc/pub/veryfar.zip > | on top of that It seems that Christian forgot the directory information in this ZIP file. 
You have to move the continuationmodule.c file to Modules/ by hand.

> | 4) add "continuation continuationmodule.c" to Modules/Setup

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 157 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From mhammond@skippinet.com.au Tue Jul 27 15:45:12 1999
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Wed, 28 Jul 1999 00:45:12 +1000
Subject: [Python-Dev] End of the line
In-Reply-To: <19990727120440.13D5F303120@snelboot.oratrix.nl>
Message-ID: <004401bed83e$a9252b70$0801a8c0@bobcat>

[Jack seems to like an asynch IO model]

> def foo():
>     obj = stdin.delayed_read()
>     obj2 = stdout.delayed_write("data")
>     do_lengthy_computation()
>     data = obj.get() # Here we wait for the read to complete
>     del obj2 # Here we wait for the write to complete.
>
> This gives a fairly nice programming model.

Indeed. Taking this a little further, I come up with something like:

    inlock = threading.Lock()
    buffer = stdin.delayed_read(inlock)

    outlock = threading.Lock()
    stdout.delayed_write(outlock, "The data")

    fired = threading.Wait(inlock, outlock) # new fn :-)

    if fired is inlock: # etc.

The idea is we can make everything wait on a single lock abstraction. threading.Wait() could accept lock objects, thread objects, Sockets, etc.

Obviously a bit to work out, but it does make an appealing model. OTOH, I wonder how it fits with continuations etc. Not too badly from my weak understanding. May be an interesting convergence!

Mark.
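
The "wait for whichever finishes first" idea can be tried out in later Python versions with the standard `concurrent.futures` module. The sketch below is not the `threading.Wait()` proposed above, which does not exist; it assumes futures stand in for the lock objects, and `slow_read` and `fast_write` are made-up stand-ins for the delayed read and write.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED


def slow_read():
    # Stand-in for a read that takes a while to complete.
    time.sleep(0.5)
    return "input data"


def fast_write(data):
    # Stand-in for a write that completes almost at once;
    # returns the number of characters "written".
    return len(data)


with ThreadPoolExecutor(max_workers=2) as pool:
    read_job = pool.submit(slow_read)
    write_job = pool.submit(fast_write, "The data")

    # Block until at least one of the two operations has finished.
    done, pending = wait([read_job, write_job], return_when=FIRST_COMPLETED)
    write_finished_first = write_job in done

    # result() blocks until the corresponding operation is complete.
    written = write_job.result()
    data = read_job.result()

print(write_finished_first, written, data)
```

Here `wait()` plays the role of the single call that accepts several waitable things and reports which ones fired.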
From jack@oratrix.nl Tue Jul 27 16:31:13 1999 From: jack@oratrix.nl (Jack Jansen) Date: Tue, 27 Jul 1999 17:31:13 +0200 Subject: [Python-Dev] End of the line In-Reply-To: Message by "Mark Hammond" , Wed, 28 Jul 1999 00:45:12 +1000 , <004401bed83e$a9252b70$0801a8c0@bobcat> Message-ID: <19990727153113.4A2F1303120@snelboot.oratrix.nl> > [Jack seems to like an asynch IO model] > > > def foo(): > > obj = stdin.delayed_read() > > obj2 = stdout.delayed_write("data") > > do_lengthy_computation() > > data = obj.get() # Here we wait for the read to complete > > del obj2 # Here we wait for the write to > > complete. > > > > This gives a fairly nice programming model. > > Indeed. Taking this a little further, I come up with something like: > > inlock = threading.Lock() > buffer = stdin.delayed_read(inlock) > > outlock = threading.Lock() > stdout.delayed_write(outlock, "The data") > > fired = threading.Wait(inlock, outlock) # new fn :-) > > if fired is inlock: # etc. I think this is exactly what I _didn't_ want:-) I'd like the delayed read to return an object that will automatically wait when I try to get the data from it, and the delayed write object to automatically wait when I garbage-collect it. Of course, there's no reason why you couldn't also wait on these objects (or, on unix, pass them to select(), or whatever). On second thought the method of the delayed read should be called read() in stead of get(), of course. 
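
A minimal sketch of the object described here, one that waits automatically when its data is asked for, again using `concurrent.futures` from later Python versions. `DelayedRead` is a hypothetical name, the in-memory `StringIO` stands in for a slow file, and the wait-on-garbage-collection behaviour wanted for writes is left out.

```python
import io
from concurrent.futures import ThreadPoolExecutor


class DelayedRead:
    """Hypothetical promise-like object: the read starts immediately
    in a worker thread, and read() waits for it only when called."""

    def __init__(self, pool, fileobj):
        self._future = pool.submit(fileobj.read)

    def read(self):
        # Blocks until the background read has completed.
        return self._future.result()


def do_lengthy_computation():
    return sum(range(1000))


with ThreadPoolExecutor(max_workers=1) as pool:
    source = io.StringIO("data from a slow device")
    obj = DelayedRead(pool, source)      # read begins in the background
    total = do_lengthy_computation()     # overlaps with the read
    data = obj.read()                    # here we wait for the read

print(total, data)
```

Because the underlying future is itself something one can wait on, an object like this could also be handed to a multiple-wait call.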
-- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From mhammond@skippinet.com.au Tue Jul 27 23:21:19 1999 From: mhammond@skippinet.com.au (Mark Hammond) Date: Wed, 28 Jul 1999 08:21:19 +1000 Subject: [Python-Dev] End of the line In-Reply-To: <19990727153113.4A2F1303120@snelboot.oratrix.nl> Message-ID: <000c01bed87e$5af42060$0801a8c0@bobcat> [I missed Jack's point] > I think this is exactly what I _didn't_ want:-) > > I'd like the delayed read to return an object that will > automatically wait > when I try to get the data from it, and the delayed write object to > automatically wait when I garbage-collect it. OK - that is fine. My driving requirement was that I be able to wait on _multiple_ files at the same time - ie, I dont know which one will complete first. There is no reason then why your initial suggestion can not satisfy my requirement, as long as the "buffer type object" returned from read is itself waitable. I agree there is no driving need for a seperate buffer type object and seperate waitable object necessarily. [OTOH, your scheme could be simply built on top of my scheme as a framework] Unfortunately, this doesnt seem to have grabbed anyone elses interest.. Mark. From da@ski.org Wed Jul 28 22:46:21 1999 From: da@ski.org (David Ascher) Date: Wed, 28 Jul 1999 14:46:21 -0700 (Pacific Daylight Time) Subject: [Python-Dev] Tcl news Message-ID: 8.2b1 is released: Some surprising news: they now use cygwin tools to do the windows build. 
Not surprising news: they still haven't incorporated some bug fixes I submitted eons ago =) http://www.scriptics.com/software/relnotes/tcl8.2b1 --david From tim_one@email.msn.com Thu Jul 29 04:10:40 1999 From: tim_one@email.msn.com (Tim Peters) Date: Wed, 28 Jul 1999 23:10:40 -0400 Subject: [Python-Dev] RE: delayed I/O; multiple waits In-Reply-To: <000c01bed87e$5af42060$0801a8c0@bobcat> Message-ID: <001201bed96f$f06990c0$71a22299@tim> [Mark Hammond] > ... > Unfortunately, this doesnt seem to have grabbed anyone elses interest.. You lost me when you said it should be optional -- that's fine for an extension module, but it sounded like you wanted this to somehow be part of the language core. If WaitForMultipleObjects (which is what you *really* want ) is thought to be a cool enough idea to be in the core, we should think about how to implement it on non-Win32 platforms too. needs-more-words-ly y'rs - tim From mhammond@skippinet.com.au Thu Jul 29 04:52:47 1999 From: mhammond@skippinet.com.au (Mark Hammond) Date: Thu, 29 Jul 1999 13:52:47 +1000 Subject: [Python-Dev] RE: delayed I/O; multiple waits In-Reply-To: <001201bed96f$f06990c0$71a22299@tim> Message-ID: <002e01bed975$d392d910$0801a8c0@bobcat> > You lost me when you said it should be optional -- that's fine for an > extension module, but it sounded like you wanted this to Cool - I admit I knew it was too vague, but left it in anyway. > the language core. If WaitForMultipleObjects (which is what > you *really* Sort-of. IMO, the threading module does need a WaitForMultipleObjects (whatever the spelling) but I also recall the discussion that this is not trivial. But what I _really_ want is an enhanced concept of "waitable" - threading can only wait on locks and threads. If we have this, the WaitForMultiple would become even more pressing, but they are not directly related. So, I see 2 issues, both of which usually prevent me personally from using the threading module in the real world. 
By "optional", I meant a way for a platform to slot into existing "waitable" semantics. Win32 file operations are waitable. I don't really want native win32 file operations to be in the core, but I would like some standard way that, if possible, I could map the waitable semantics to Python waitable semantics.

Thus, although the threading module knows nothing about win32 file objects or handles, it would be nice if it could still wait on them.

> needs-more-words-ly y'rs - tim

Unfortunately, if I knew exactly what I wanted I would be asking for implementation advice rather than grasping at straws :-)

Attempting to move from totally raw to half-baked, I suppose this is what I had in mind:

* Platform optionally defines what a "waitable" object is, in the same way it now defines what a lock is. Locks are currently _required_ only with threading - waitables would never be required.
* Python defines a "waitable" protocol - eg, a new "tp_wait"/"__wait__" slot. If this slot is filled/function exists, it is expected to provide a "waitable" object or NULL/None.
* Threading support, for platforms that support it, defines a tp_wait slot that maps the Thread ID to the "waitable object".
* Ditto lock support for the platform.
* Extensions such as win32 handles also provide this.
* Dream up extensions to file objects a-la Jack's idea. When a file is opened asynch, tp_wait returns non-NULL (via platform specific hooks), or NULL when opened sync (making it not waitable). Non-asynch platforms need zero work here - the asynch open fails, tp_wait slot never filled in.

Thus, for platforms that provide no extra asynch support, threading can still only wait on threads and locks. The threading module could take advantage of the new protocol thereby supporting any waitable object.

Like I said, only half-baked, but I think it expresses a potentially workable idea. Does this get closer to either a) explaining what I meant, or b) confirming I am dribbling?
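The protocol outlined above can be sketched in pure Python. Everything here is an assumption made for illustration: `__wait__` is the hypothetical slot from the proposal, while `Waitable`, `AsyncResult` and `wait_for_any` are invented names, and a `threading.Event` stands in for whatever the platform would really provide.

```python
import threading


class Waitable:
    """Hypothetical stand-in for the platform-level "waitable object"."""

    def __init__(self):
        self._lock = threading.Lock()
        self._fired = False
        self._listeners = []          # Events to set when this object fires

    def fire(self):
        with self._lock:
            self._fired = True
            listeners = list(self._listeners)
        for event in listeners:
            event.set()

    def is_fired(self):
        with self._lock:
            return self._fired

    def _subscribe(self, event):
        with self._lock:
            self._listeners.append(event)
            if self._fired:           # already done: wake the waiter at once
                event.set()

    def _unsubscribe(self, event):
        with self._lock:
            self._listeners.remove(event)


class AsyncResult:
    """Any object takes part by providing __wait__ (hypothetical protocol)."""

    def __init__(self):
        self._waitable = Waitable()
        self.value = None

    def complete(self, value):
        self.value = value
        self._waitable.fire()

    def __wait__(self):
        return self._waitable


def wait_for_any(objects, timeout=None):
    """Return the first object whose waitable has fired, or None on timeout."""
    objects = list(objects)
    waitables = []
    for obj in objects:
        hook = getattr(obj, "__wait__", None)
        waitable = hook() if hook is not None else None
        if waitable is None:
            raise TypeError("%r is not waitable" % (obj,))
        waitables.append(waitable)
    wakeup = threading.Event()
    for waitable in waitables:
        waitable._subscribe(wakeup)
    try:
        wakeup.wait(timeout)
        for obj, waitable in zip(objects, waitables):
            if waitable.is_fired():
                return obj
        return None
    finally:
        for waitable in waitables:
            waitable._unsubscribe(wakeup)


inp, out = AsyncResult(), AsyncResult()
threading.Timer(0.05, out.complete, args=("written",)).start()
fired = wait_for_any([inp, out], timeout=5)   # "out" completes first
```

An object whose `__wait__` returns None is rejected, which mirrors the "opened sync, so not waitable" case in the list.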
Biggest problem I see is that the only platform that may take advantage is Windows, thereby making a platform specific solution (such as win32event I use now) perfectly reasonable. Maybe my focus should simply be on allowing win32event.WaitFor* to accept threading instances and standard Python lock objects!! Mark. From Brian@digicool.com Fri Jul 30 15:23:49 1999 From: Brian@digicool.com (Brian Lloyd) Date: Fri, 30 Jul 1999 10:23:49 -0400 Subject: [Python-Dev] RE: NT select.select? Message-ID: <613145F79272D211914B0020AFF6401914DC19@gandalf.digicool.com> > Is there some low limit on maximum number of sockets you can > have in the > Python-NT's select call? A program that happens to work > perfectly on Linux > seems to die on NT around 64(?) sockets to the 'too many file > descriptors > in call' error. > > Any portable ways to bypass it? > > -Markus Hi Markus, It turns out that NT has a default 64 fd limit on arguments to select(). The good news is that you can actually bump the limit up to whatever number you want by specifying a define when compiling python15.dll. If you have the ability to rebuild your python15.dll, you can add the define: FD_SETSIZE=1024 to the preprocessor options for the python15 project to raise the limit to 1024 fds. The default 64 fd limit is too low for anyone trying to run an async server that handles even a modest load, so I've submitted a bug report to python.org asking that the define above find its way into the next python release... Brian Lloyd brian@digicool.com Software Engineer 540.371.6909 Digital Creations http://www.digicool.com From guido@CNRI.Reston.VA.US Fri Jul 30 16:04:58 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 30 Jul 1999 11:04:58 -0400 Subject: [Python-Dev] RE: NT select.select? In-Reply-To: Your message of "Fri, 30 Jul 1999 10:23:49 EDT." 
<613145F79272D211914B0020AFF6401914DC19@gandalf.digicool.com> References: <613145F79272D211914B0020AFF6401914DC19@gandalf.digicool.com> Message-ID: <199907301504.LAA13183@eric.cnri.reston.va.us> > It turns out that NT has a default 64 fd limit on arguments to > select(). The good news is that you can actually bump the limit up > to whatever number you want by specifying a define when compiling > python15.dll. > > If you have the ability to rebuild your python15.dll, you can add > the define: > > FD_SETSIZE=1024 > > to the preprocessor options for the python15 project to raise the > limit to 1024 fds. > > The default 64 fd limit is too low for anyone trying to run > an async server that handles even a modest load, so I've > submitted a bug report to python.org asking that the define > above find its way into the next python release... Brian, (Also in response to your bug report.) I'm a little worried that upping the limit to 1024 would cause some performance problems if you're making a lot of select() calls. The select allocates three arrays of length FD_SETSIZE+3; each array item is 12 bytes. This is a total allocation of more than 36K for a meager select() call! And all that memory also has to be cleared by the FD_ZERO() call. If you actually have that many sockets, that's worth paying for (the socket objects themselves use up just as much memory, and your Python data structures for the sockets, no matter how small, are probably several times bigger), but for a more typical program, I see this as a lot of overhead. Is there a way that this can be done more dynamically, e.g. by making the set size as big as needed on windows but no bigger? (Before you suggest allocating that memory statically, remember it's possible to call select from multiple threads. Allocating 36K of thread-local space for each thread also doesn't sound too pleasant.) 
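The arithmetic behind the 36K figure can be checked directly. The constants below are taken from the message (three arrays, FD_SETSIZE+3 entries each, 12 bytes per entry); the function name is made up for this sketch.

```python
ARRAYS = 3          # read, write and exceptional sets
ITEM_BYTES = 12     # per-entry size quoted in the message


def select_scratch_bytes(fd_setsize):
    """Bytes allocated per select() call for a given FD_SETSIZE."""
    return ARRAYS * (fd_setsize + 3) * ITEM_BYTES


print(select_scratch_bytes(1024))   # 36972, i.e. a little over 36K
print(select_scratch_bytes(64))     # 2412 with the default limit
```

So raising the limit from 64 to 1024 multiplies the per-call allocation by roughly fifteen, whether or not the program ever uses that many sockets.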
--Guido van Rossum (home page: http://www.python.org/~guido/) From Brian@digicool.com Fri Jul 30 19:25:01 1999 From: Brian@digicool.com (Brian Lloyd) Date: Fri, 30 Jul 1999 14:25:01 -0400 Subject: [Python-Dev] RE: NT select.select? Message-ID: <613145F79272D211914B0020AFF6401914DC1E@gandalf.digicool.com> Guido wrote: > > Brian, > > (Also in response to your bug report.) I'm a little worried that > upping the limit to 1024 would cause some performance problems if > you're making a lot of select() calls. The select allocates three > arrays of length FD_SETSIZE+3; each array item is 12 bytes. This is a > total allocation of more than 36K for a meager select() call! > And all > that memory also has to be cleared by the FD_ZERO() call. > > If you actually have that many sockets, that's worth paying for (the > socket objects themselves use up just as much memory, and your Python > data structures for the sockets, no matter how small, are probably > several times bigger), but for a more typical program, I see > this as a > lot of overhead. > > Is there a way that this can be done more dynamically, e.g. by making > the set size as big as needed on windows but no bigger? > > (Before you suggest allocating that memory statically, remember it's > possible to call select from multiple threads. Allocating 36K of > thread-local space for each thread also doesn't sound too pleasant.) > > --Guido van Rossum (home page: http://www.python.org/~guido/) Hmm - after going through all of the Win32 sdks, it doesn't appear to be possible to do it any other way than as a -D option at compile time, so optimizing for the common case (folks who _don't_ need large numbers of fds) is reasonable. Since we distribute a python15.dll with Zope on windows, this isn't that big a deal for us - we can just compile in a higher limit in our distributed dll. 
I was mostly thinking of the win32 users who don't have the ability to rebuild their dll, but maybe this isn't that much of a problem; I suspect that the people who write significant socket apps that would run into this problem probably have access to a compiler if they need it. Brian Lloyd brian@digicool.com Software Engineer 540.371.6909 Digital Creations http://www.digicool.com From da@ski.org Fri Jul 30 19:59:37 1999 From: da@ski.org (David Ascher) Date: Fri, 30 Jul 1999 11:59:37 -0700 (Pacific Daylight Time) Subject: [Python-Dev] RE: NT select.select? In-Reply-To: <613145F79272D211914B0020AFF6401914DC1E@gandalf.digicool.com> Message-ID: On Fri, 30 Jul 1999, Brian Lloyd wrote: > Since we distribute a python15.dll with Zope on windows, this > isn't that big a deal for us - we can just compile in a higher > limit in our distributed dll. I was mostly thinking of the win32 > users who don't have the ability to rebuild their dll, but > maybe this isn't that much of a problem; I suspect that the > people who write significant socket apps that would run into > this problem probably have access to a compiler if they need it. It's a worthy piece of knowledge to document somehow -- I'm not sure where that should be... From Fred L. Drake, Jr." References: <613145F79272D211914B0020AFF6401914DC1E@gandalf.digicool.com> Message-ID: <14241.63361.737047.998159@weyr.cnri.reston.va.us> David Ascher writes: > It's a worthy piece of knowledge to document somehow -- I'm not sure where > that should be... Perhaps a paragraph in the library reference? If someone can send along a clear bit of text (unformatted is fine), I'll be glad to add it. -Fred -- Fred L. Drake, Jr. 
Corporation for National Research Initiatives From tim_one at email.msn.com Thu Jul 1 06:30:30 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 1 Jul 1999 00:30:30 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <199906291201.IAA02535@eric.cnri.reston.va.us> Message-ID: <000101bec37a$7465af00$309e2299@tim> [Guido] > I guess it's all in the perspective. 99.99% of all thread apps I've > ever written use threads primarily to overlap I/O -- if there wasn't > I/O to overlap I wouldn't use a thread. I think I share this > perspective with most of the thread community (after all, threads > originate in the OS world where they were invented as a replacement > for I/O completion routines). Different perspective indeed! Where I've been, you never used something as delicate as a thread to overlap I/O, you instead used the kernel-supported asynch Fortran I/O extensions <0.7 wink>. Those days are long gone, and I've adjusted to that. Time for you to leave the past too : by sheer numbers, most of the "thread community" *today* is to be found typing at a Windows box, where cheap & reliable threads are a core part of the programming culture. They have better ways to overlap I/O there too. Throwing explicit threads at this is like writing a recursive Fibonacci number generator in Scheme, but building the recursion yourself by hand out of explicit continuations . > ... > As far as I can tell, all the examples you give are easily done using > coroutines. Can we call whatever you're asking for coroutines instead > of fake threads? I have multiple agendas, of course. What I personally want for my own work is no more than Icon's generators, formally "semi coroutines", and easily implemented in the interpreter (although not the language) as it exists today. Coroutines, fake threads and continuations are much stronger than generators, and I expect you can fake any of the first three given either of the others. 
Generators fall out of any of them too (*you* implemented generators once using Python threads, and I implemented general coroutines -- "fake threads" are good enough for either of those). So, yes, for that agenda any means of suspending/resuming control flow can be made to work. I seized on fake threads because Python already has a notion of threads. A second agenda is that Python could be a lovely language for *learning* thread programming; the threading module helps, but fake threads could likely help more by e.g. detecting deadlocks (and pointing them out) instead of leaving a thread newbie staring at a hung system without a clue. A third agenda is related to Mark & Greg's, making Python's threads "real threads" under Windows. The fake thread agenda doesn't tie into that, except to confuse things even more if you take either agenda seriously <0.5 frown>. > I think that when you mention threads, green or otherwise colored, > most people who are at all familiar with the concept will assume they > provide I/O overlapping, except perhaps when they grew up in the > parallel machine world. They didn't suggest I/O to me at all, but I grew up in the disqualified world ; doubt they would to a Windows programmer either (e.g., my employer ships heavily threaded Windows apps of various kinds, and overlapped I/O isn't a factor in any of them; it's mostly a matter of algorithm factoring to keep the real-time incestuous subsystems from growing impossibly complex, and in some of the very expensive apps also a need to exploit multiple processors). BTW, I called them "fake" threads to get away from whatever historical baggage comes attached to "green". > Certainly all examples I give in my never-completed thread tutorial > (still available at > http://www.python.org/doc/essays/threads.html) use I/O as the primary > motivator -- The preceding "99.99% of all thread apps I've ever written use threads primarily to overlap I/O" may explain this . 
BTW, there is only one example there, which rather dilutes the strength of the rhetorical "all" ... > this kind of example appeals to simples souls (e.g. downloading more than > one file in parallel, which they probably have already seen in action in > their web browser), as opposed to generators or pipelines or coroutines > (for which you need to have some programming theory background to > appreciate the powerful abstraction possibillities they give). I don't at all object to using I/O as a motivator, but the latter point is off base. There is *nothing* in Comp Sci harder to master than thread programming! It's the pinnacle of perplexity, the depth of despair, the king of confusion (stop before I exaggerate ). Generators in particular get re-invented often as a much simpler approach to suspending a subroutine's control flow; indeed, Icon's primary audience is still among the humanities, and even dumb linguists don't seem to have notable problems picking it up. Threads have all the complexities of the other guys, plus races, deadlocks, starvation, load imbalance, non-determinism and non-reproducibility. Threads simply aren't simple-soul material, no matter how pedestrian a motivating *example* may be. I suspect that's why your tutorial remains unfinished: you had no trouble describing the problem to be solved, but got bogged down in mushrooming complications describing how to use threads to solve it. Even so, the simple example at the end is already flawed ("print" isn't atomic in Python, so the print len(text), url may print the len(text) from one thread followed by the url from another). 
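One way around that race, sketched in present-day Python: build the whole line first and emit it while holding a lock. The names `print_lock` and `report`, and the example URLs, are invented for this sketch and do not come from the tutorial.

```python
import threading

print_lock = threading.Lock()
lines = []                      # kept so the result can be inspected afterwards


def report(text, url):
    # Format the complete line up front, then emit it under the lock so that
    # pieces of output from different threads cannot interleave.
    line = "%d %s" % (len(text), url)
    with print_lock:
        lines.append(line)
        print(line)


threads = [
    threading.Thread(target=report,
                     args=("x" * n, "http://example.invalid/%d" % n))
    for n in range(1, 6)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The order of the lines still depends on thread scheduling; the lock only guarantees that each length stays next to its own URL.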
It's not hard to find simple-soul examples for generators either (coroutines & continuations *are* hard to motivate!), especially since Python's for/__getitem__ protocol is already a weak form of generator, and xrange *is* a full-blown generator; e.g., a common question on c.l.py is how to iterate over a sequence backwards:

    for x in backwards(sequence):
        print x

    def backwards(s):
        for i in xrange(len(s)-1, -1, -1):
            suspend s[i]

Nobody needs a comp sci background to understand what that *does*, or why it's handy. Try iterating over a tree structure instead & then the *power* becomes apparent; this isn't comp-sci-ish either, unless we adopt an "if they've heard of trees, they're impractical dreamers" stance. BTW, iterating over a tree is what os.path.walk does, and a frequent source of newbie confusion (they understand directory trees, they don't grasp the callback-based interface; generating (dirname, names) pairs instead would match their mental model at once). *This* is the stuff for simple souls!

> Another good use of threads (suggested by Sam) is for GUI programming.
> An old GUI system, News by David Rosenthal at Sun, used threads
> programmed in PostScript -- very elegant (and it failed for other
> reasons -- if only he had used Python instead :-).
>
> On the other hand, having written lots of GUI code using Tkinter, the
> event-driven version doesn't feel so bad to me. Threads would be nice
> when doing things like rubberbanding, but I generally agree with
> Ousterhout's premise that event-based GUI programming is more reliable
> than thread-based. Every time your Netscape freezes you can bet
> there's a threading bug somewhere in the code.

I don't use Netscape, but I can assure you the same is true of Internet Explorer -- except there the threading bug is now somewhere in the OS <0.5 wink>.

Anyway,

1) There are lots of good uses for threads, and especially in the Windows and (maybe) multiprocessor NumPy worlds.
Those would really be happier with "free-threaded threads", though. 2) Apart from pedagogical purposes, there probably isn't a use for my "fake threads" that couldn't be done easier & better via a more direct (coroutine, continuation) approach; and if I had fake threads, the first thing I'd do for me is rewrite the generator and coroutine packages to use them. So, yes: you win . 3) Python's current threads are good for overlapping I/O. Sometimes. And better addressed by Sam's non-threaded "select" approach when you're dead serious about overlapping lots of I/O. They're also beaten into service under Windows, but not without cries of anguish from Greg and Mark. I don't know, Guido -- if all you wanted threads for was to speed up a little I/O in as convoluted a way as possible, you may have been witness to the invention of the wheel but missed that ox carts weren't the last application . nevertheless-ox-carts-may-be-the-best-ly y'rs - tim From tim_one at email.msn.com Thu Jul 1 09:45:54 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 1 Jul 1999 03:45:54 -0400 Subject: [Python-Dev] ob_refcnt access In-Reply-To: <003301bec11e$d0cfc6d0$0801a8c0@bobcat> Message-ID: <000901bec395$c042cfa0$309e2299@tim> [Mark Hammond] > Im a little unhappy as this [stackless Python] will break the Active > Debugging stuff ... > ... > So the solution MS came up with was, surprise surprise, the machine stack! > :-) The assumption is that all languages will make _some_ use of the > stack, so they ask a language to report its "stack base address" > and "stack size". Using this information, the debugger sorts into the > correct call sequence. Mark, you can't *really* believe Chris is incapable of hacking around this, right? 
It's not even clear there's something to be hacked around, since Python is only Python and there's nothing Christian can do to stop other languages that call into Python from using the machine stack, or to call other languages from Python without using the machine stack. So Python "shows up on the stack" no matter what, cross-language. > ... > Bit I also understand completely the silence on this issue. When the > thread started, there was much discussion about exactly what the > hell these continuation/coroutine thingies even were. The Fuchs paper Sam referenced explained it in simple C terms: a continuation is exactly what C setjmp/longjmp would do if setjmp saved (& longjmp restored) the C stack *in addition* to the program counter and machine registers (which they already save/restore). That's all there is to it, at heart: objects capture data state, continuations capture control flow state. Whenever the OS services an interrupt and drops into kernel mode, it captures a continuation for user mode -- they don't *call* it that, but that's what they're doing, and it's as practical as a pencil (well, *more* practical, actually ). > However, there were precious few real-world examples where they could > be used. Nobody asked for any before now <0.5 wink> -- and I see Sam provided some marvelous ones in response to this. > A few acedemic, theoretical places, I think you undervalue those: people working on the underpinnings of languages strive very hard to come up with the simplest possible examples that don't throw away the core of the problem to be solved. That doesn't mean the theoreticians are too air-headed to understand "real world problems"; it's much more that, e.g., "if you can't compare the fringes of two trees cleanly, you can't possibly do anything harder than that cleanly either -- but if you can do this little bit cleanly, we have strong reason to believe there's a large class of difficult real problems you can also do cleanly". 
If you need a "practical" example of that, picture e.g. a structure-based diff engine for HTML source, which is really a tree defined by tags, and where text-based comparison can be useless (you don't care if "<li>" moved from column 12 of line 16 to column 1 of line 17, but you care a lot if the *number* of <li> tags changed -- so you have to compare two trees *as* trees).

But that's a problem easy enough for generators to solve cleanly. Knuth writes a large (for his books) elevator-simulation program to illustrate coroutines (which are more powerful than generators), and complains that he can't come up with a simpler example that illustrates any point worth making. And he's right! The "literature standard" text-manipulation example at the end of my coroutine module illustrates what Sam was talking about wrt writing straightforward "pull" algorithms for a "push" process, but even that one can be solved with simpler pipeline control flow. At least for *that*, nobody who ever used Unix would doubt the real-world utility of the pipeline model for a nanosecond <1e-9 wink>.

If you want a coroutine example, go to a restaurant and order a meal. When you leave, glance back *real quick*. If everyone in the restaurant is dead, they were a meal-generating subroutine; but if they're still serving other customers, your meal-eating coroutine and their meal-generating coroutine worked to mutual benefit.

> but the only real contender I have seen brought up was Medusa. There were
> certainly no clear examples of "as soon as we have this, I could change
> abc to take advantage, and this would give us the very cool xyz"
>
> So, if anyone else is feeling at all like me about this issue, they are
> feeling all warm and fuzzy knowing that a few smart people are giving us
> the facility to do something we hope we never, ever have to do. :-)

Actually, you'll want to do it a lot. Christian & I have bantered about this a few times a year in pvt, usually motivated by some horrendous kludge posted to c.l.py to solve a problem that any Assistant Professor of Medieval English could solve without effort in Icon. The *uses* aren't esoteric at all.
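The fringe comparison mentioned above is short enough to sketch with the generator syntax of present-day Python. Trees are represented as nested lists purely for illustration, and `fringe` and `same_fringe` are names chosen for this sketch.

```python
from itertools import zip_longest

_MISSING = object()     # sentinel marking "this fringe ran out first"


def fringe(tree):
    """Yield the leaves of a nested-list tree, left to right."""
    for node in tree:
        if isinstance(node, list):
            yield from fringe(node)
        else:
            yield node


def same_fringe(a, b):
    """True if both trees have the same leaves in the same order."""
    pairs = zip_longest(fringe(a), fringe(b), fillvalue=_MISSING)
    return all(x == y for x, y in pairs)


print(same_fringe([1, [2, 3]], [[1, 2], 3]))   # True: different shape, same leaves
print(same_fringe([1, 2], [1, 2, 3]))          # False: second fringe is longer
```

Each tree is walked lazily, so the comparison stops at the first mismatch without flattening either tree into a list first.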
or-at-least-not-more-than-you-make-'em-ly y'rs - tim

From MHammond at skippinet.com.au Thu Jul 1 10:18:25 1999
From: MHammond at skippinet.com.au (Mark Hammond)
Date: Thu, 1 Jul 1999 18:18:25 +1000
Subject: [Python-Dev] ob_refcnt access
In-Reply-To: <000901bec395$c042cfa0$309e2299@tim>
Message-ID: <00b901bec39a$4baf6b80$0801a8c0@bobcat>

[Tim tells me it will all be obvious if I just think a little harder]

Your point about "academic examples" is well taken. The reality is that, even given these simple examples (which I dared deride as academic), I'm not seeing "the point".

I seriously don't doubt all you say. However, as Sam and Chris have said many times, it is just a matter of changing the way you think. Interestingly: Chris said it recently, and continues to say it. Sam said it to me _years_ ago, and said it repeatedly, but hasn't said it recently. Tim hasn't really said it yet :-) This is almost certainly because when your brain does switch, it is a revelation, and really not too hard at all. But after a while, you forget the switch ever took place.

Closest analogy I can think of is OO programming. In my experience trying to _learn_ OO programming from a few misc examples and texts was pointless and very hard. You need a language to play with it in. And when you have one, your brain makes the switch, you see the light, and you can't see what was ever mysterious about it. And you tell everyone it's easy; "just change the way you think about data" :-)

But to all us here, OO programming is just so obvious it goes without saying. Occasionally a newbie will have trouble with OO concepts in Python, and I personally have trouble seeing what could _possibly_ be difficult about understanding these very simple concepts.
So Im just as guilty, just not in this particular case :-) So, short of all us here going and discovering the light using a different language (perish the thought :), my original point stands that until Chris' efforts give us something we can easily play with, some of use _still_ wont see what all the fuss is about. (Although I admit it has nothing to do with either the examples or the applicability of the technology to all sorts of things) Which leaves you poor guys in a catch 22 - without noise of some sort from the rest of us, its hard to keep the momentum going, but without basically a fully working Python with continuations, we wont be making much noise. But-I-will-thank-you-all-personally-and-profusely-when-I-do-see-the-light, ly Mark. From jack at oratrix.nl Thu Jul 1 18:05:50 1999 From: jack at oratrix.nl (Jack Jansen) Date: Thu, 01 Jul 1999 18:05:50 +0200 Subject: [Python-Dev] Paul Prescod: add Expat to 1.6 In-Reply-To: Message by Skip Montanaro , Mon, 28 Jun 1999 16:24:46 -0400 (EDT) , <14199.55737.544299.718558@cm-24-29-94-19.nycap.rr.com> Message-ID: <19990701160555.5D44512B0F@oratrix.oratrix.nl> Recently, Skip Montanaro said: > > Andrew> My personal leaning is that we can get more bang for the buck by > Andrew> working on the Distutils effort, so that installing a package > Andrew> like PyExpat becomes much easier, rather than piling more things > Andrew> into the core distribution. > > Amen to that. See Guido's note and my response regarding soundex in the > Doc-SIG. Perhaps you could get away with a very small core distribution > that only contained the stuff necessary to pull everything else from the net > via http or ftp... I don't know whether this subject belongs on the python-dev list (is there a separate distutils list?), but let's please be very careful with this. 
The Perl people apparently think that their auto-install stuff is so easy to use that if you find a tool on the net that needs Perl they'll just give you a few incantations you need to build the "correct" perl to run the tool, but I've never managed to do so. My last try was when I spent 2 days to try and get the perl-based Palm software for unix up and running. With various incompatilble versions of perl installed in /usr/local by the systems staff and knowing nothing about perl I had to give up at some point, because it was costing far more time (and diskspace:-) than the whole thing was worth. Something like mailman is (afaik) easy to install for non-pythoneers because it only depends on a single, well-defined Python distribution. -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From skip at mojam.com Thu Jul 1 21:54:14 1999 From: skip at mojam.com (Skip Montanaro) Date: Thu, 1 Jul 1999 15:54:14 -0400 (EDT) Subject: [Python-Dev] Paul Prescod: add Expat to 1.6 In-Reply-To: <19990701160555.5D44512B0F@oratrix.oratrix.nl> References: <14199.55737.544299.718558@cm-24-29-94-19.nycap.rr.com> <19990701160555.5D44512B0F@oratrix.oratrix.nl> Message-ID: <14203.50921.870411.353490@cm-24-29-94-19.nycap.rr.com> Skip> Amen to that. See Guido's note and my response regarding soundex Skip> in the Doc-SIG. Perhaps you could get away with a very small core Skip> distribution that only contained the stuff necessary to pull Skip> everything else from the net via http or ftp... Jack> I don't know whether this subject belongs on the python-dev list Jack> (is there a separate distutils list?), but let's please be very Jack> careful with this. The Perl people apparently think that their Jack> auto-install stuff is so easy to use ... I suppose I should have added a <0.5 wink> to my note. 
Still, knowing what Guido does and doesn't feel comfortable with in the core distribution would be a good start at seeing where we might like the core to wind up. Skip Montanaro | http://www.mojam.com/ skip at mojam.com | http://www.musi-cal.com/~skip/ 518-372-5583 From tim_one at email.msn.com Fri Jul 2 04:33:23 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 1 Jul 1999 22:33:23 -0400 Subject: [Python-Dev] Paul Prescod: add Expat to 1.6 In-Reply-To: <19990701160555.5D44512B0F@oratrix.oratrix.nl> Message-ID: <000a01bec433$41a410c0$6da02299@tim> [large vs small distributions] [Jack Jansen] > I don't know whether this subject belongs on the python-dev list (is > there a separate distutils list?), but let's please be very careful > with this. [and recounts his problems with Perl] I must say the idea of a minimal distribution sounds very appealing. But then I consider that Guido never got me to even try Tk until he put it into the std Windows distribution, and I've never given anyone any code that won't work with a fresh-from-the-box distribution either. FrankS's snappy "batteries included" wouldn't carry quite the same punch if it got reduced to "coupons for batteries hidden in the docs" . OTOH, I've got about as much use for XML as MarkH has for continuations , and here-- as in many other places --we've been saved so far by Guido's good judgment about what goes in & what stays out. So it's a good thing he can't ever resign this responsibility . if-20%-of-users-need-something-i'd-include-it-else-not-ly y'rs - tim From guido at CNRI.Reston.VA.US Sun Jul 4 03:56:31 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Sat, 03 Jul 1999 21:56:31 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: Your message of "Thu, 01 Jul 1999 00:30:30 EDT." 
<000101bec37a$7465af00$309e2299@tim> References: <000101bec37a$7465af00$309e2299@tim> Message-ID: <199907040156.VAA10874@eric.cnri.reston.va.us> > [Guido] > > I guess it's all in the perspective. 99.99% of all thread apps I've > > ever written use threads primarily to overlap I/O -- if there wasn't > > I/O to overlap I wouldn't use a thread. I think I share this > > perspective with most of the thread community (after all, threads > > originate in the OS world where they were invented as a replacement > > for I/O completion routines). [Tim] > Different perspective indeed! Where I've been, you never used something as > delicate as a thread to overlap I/O, you instead used the kernel-supported > asynch Fortran I/O extensions <0.7 wink>. > > Those days are long gone, and I've adjusted to that. Time for you to leave > the past too : by sheer numbers, most of the "thread community" > *today* is to be found typing at a Windows box, where cheap & reliable > threads are a core part of the programming culture. No quibble so far... > They have better ways to overlap I/O there too. Really? What are they? The non-threaded primitives for overlapping I/O look like Medusa to me: very high performance, but a pain to use -- because of the event-driven programming model! (Or worse, callback functions.) But maybe there are programming techniques that I'm not even aware of? (Maybe I should define what I mean by overlapping I/O -- basically every situation where disk or network I/O or GUI event handling goes on in parallel with computation or with each other. For example, in my book copying a set of files while at the same time displaying some silly animation of sheets of paper flying through the air *and* watching a Cancel button counts as overlapping I/O, and if I had to code this it would probably be a lot simpler to do using threads. 
> Throwing explicit threads at this is like writing a recursive > Fibonacci number generator in Scheme, but building the recursion > yourself by hand out of explicit continuations . Aren't you contradicting yourself? You say that threads are ubiquitous and easy on Windows (and I agree), yet you claim that threads are overkill for doing two kinds of I/O or one kind of I/O and some computation in parallel...? I'm also thinking of Java threads. Yes, the GC thread is one of those computational threads you are talking about, but I think the examples I've seen otherwise are mostly about having one GUI component (e.g. an applet) independent from other GUI components (e.g. the browser). To me that's overlapping I/O, since I count GUI events as I/O. > > ... > > As far as I can tell, all the examples you give are easily done using > > coroutines. Can we call whatever you're asking for coroutines instead > > of fake threads? > > I have multiple agendas, of course. What I personally want for my own work > is no more than Icon's generators, formally "semi coroutines", and easily > implemented in the interpreter (although not the language) as it exists > today. > > Coroutines, fake threads and continuations are much stronger than > generators, and I expect you can fake any of the first three given either of > the others. Coroutines, fake threads and continuations? Can you really fake continuations given generators? > Generators fall out of any of them too (*you* implemented > generators once using Python threads, and I implemented general > coroutines -- "fake threads" are good enough for either of those). Hm. Maybe I'm missing something. Why didn't you simply say "you can fake each of the others given any of these"? > So, yes, for that agenda any means of suspending/resuming control flow can > be made to work. I seized on fake threads because Python already has a > notion of threads. 
> > A second agenda is that Python could be a lovely language for *learning* > thread programming; the threading module helps, but fake threads could > likely help more by e.g. detecting deadlocks (and pointing them out) instead > of leaving a thread newbie staring at a hung system without a clue. Yes. > A third agenda is related to Mark & Greg's, making Python's threads "real > threads" under Windows. The fake thread agenda doesn't tie into that, > except to confuse things even more if you take either agenda seriously <0.5 > frown>. What makes them unreal except for the interpreter lock? Python threads are always OS threads, and that makes them real enough for most purposes... (I'm not sure if there are situations on uniprocessors where the interpreter lock screws things up that aren't the fault of the extension writer -- typically, problems arise when an extension does some blocking I/O but doesn't place Py_{BEGIN,END}_ALLOW_THREADS macros around the call.) > > I think that when you mention threads, green or otherwise colored, > > most people who are at all familiar with the concept will assume they > > provide I/O overlapping, except perhaps when they grew up in the > > parallel machine world. > > They didn't suggest I/O to me at all, but I grew up in the disqualified > world ; doubt they would to a Windows programmer either (e.g., my > employer ships heavily threaded Windows apps of various kinds, and > overlapped I/O isn't a factor in any of them; it's mostly a matter of > algorithm factoring to keep the real-time incestuous subsystems from growing > impossibly complex, and in some of the very expensive apps also a need to > exploit multiple processors). Hm, you admit that they sometimes want to use multiple CPUs, which was explicitly excluded from our discussion (since fake threads don't help there), and I bet that they are also watching some kind of I/O (e.g. whether the user says some more stuff).
> BTW, I called them "fake" threads to get away > from whatever historical baggage comes attached to "green". Agreed -- I don't understand where green comes from at all. Does it predate Java? > > Certainly all examples I give in my never-completed thread tutorial > > (still available at > > http://www.python.org/doc/essays/threads.html) use I/O as the primary > > motivator -- > > The preceding "99.99% of all thread apps I've ever written use threads > primarily to overlap I/O" may explain this . BTW, there is only one > example there, which rather dilutes the strength of the rhetorical "all" ... OK, ok. I was planning on more along the same lines. I may have borrowed this idea from a Java book I read. > > this kind of example appeals to simple souls (e.g. downloading more than > > one file in parallel, which they probably have already seen in action in > > their web browser), as opposed to generators or pipelines or coroutines > > (for which you need to have some programming theory background to > > appreciate the powerful abstraction possibilities they give). > > I don't at all object to using I/O as a motivator, but the latter point is > off base. There is *nothing* in Comp Sci harder to master than thread > programming! It's the pinnacle of perplexity, the depth of despair, the > king of confusion (stop before I exaggerate ). I dunno, but we're probably both pretty poor predictors for what beginning programmers find hard. Randy Pausch (of www.alice.org) visited us this week; he points out that we experienced programmers are very bad at gauging what newbies find hard, because we've been trained "too much". He makes the point very eloquently. He also points out that in Alice, users have no problem at all with parallel activities (e.g. the bunny's head rotating while it is also hopping around, etc.).
> Generators in particular get re-invented often as a much simpler approach to > suspending a subroutine's control flow; indeed, Icon's primary audience is > still among the humanities, and even dumb linguists don't seem to > have notable problems picking it up. Threads have all the complexities of > the other guys, plus races, deadlocks, starvation, load imbalance, > non-determinism and non-reproducibility. Strange. Maybe dumb linguists are better at simply copying examples without thinking too much about them; personally I had a hard time understanding what Icon was doing when I read about it, probably because I tried to understand how it was done. For threads, I have a simple mental model. For coroutines, my head explodes each time. > Threads simply aren't simple-soul material, no matter how pedestrian a > motivating *example* may be. I suspect that's why your tutorial remains > unfinished: you had no trouble describing the problem to be solved, but got > bogged down in mushrooming complications describing how to use threads to > solve it. No, I simply realized that I had to finish the threading module and release the thread-safe version of urllib.py before I could release the tutorial; and then I was distracted and never got back to it. > Even so, the simple example at the end is already flawed ("print" > isn't atomic in Python, so the > > print len(text), url > > may print the len(text) from one thread followed by the url from another). Fine -- that's a great excuse to introduce locks in the next section. (Most threading tutorials I've seen start by showing flawed examples to create an appreciation for the need of locks.) 
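The race in that print statement, and the lock that cures it, can be sketched roughly as follows. This is an illustration in present-day Python 3, not code from the tutorial: the names output_lock, report and fetch_all are made up here, the output is written in two separate steps to mimic the non-atomic print, and the page texts are passed in directly instead of being downloaded.

```python
import io
import threading

# One lock shared by every thread that writes to the output stream (name is hypothetical).
output_lock = threading.Lock()


def report(text, url, stream):
    """Write 'len(text) url' as one line, in two steps like the old print statement."""
    # Holding the lock across both writes keeps another thread from
    # slipping its own output in between the length and the url.
    with output_lock:
        stream.write(str(len(text)))
        stream.write(" " + url + "\n")


def fetch_all(pages, stream):
    """Start one thread per (url, text) pair and wait for all of them to finish."""
    threads = [
        threading.Thread(target=report, args=(text, url, stream))
        for url, text in pages.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


if __name__ == "__main__":
    out = io.StringIO()
    fetch_all({"http://a.example/": "abc", "http://b.example/": "0123456789"}, out)
    print(out.getvalue(), end="")
```

The order of the lines still depends on thread scheduling; the lock only guarantees that each line comes out whole.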
> It's not hard to find simple-soul examples for generators either (coroutines > & continuations *are* hard to motivate!), especially since Python's > for/__getitem__ protocol is already a weak form of generator, and xrange > *is* a full-blown generator; e.g., a common question on c.l.py is how to > iterate over a sequence backwards: > > for x in backwards(sequence): > print x > > def backwards(s): > for i in xrange(len(s)-1, -1, -1): > suspend s[i] But backwards() also returns, when it's done. What happens with the return value? > Nobody needs a comp sci background to understand what that *does*, or why > it's handy. Try iterating over a tree structure instead & then the *power* > becomes apparent; this isn't comp-sci-ish either, unless we adopt a "if > they've heard of trees, they're impractical dreamers" stance . BTW, > iterating over a tree is what os.path.walk does, and a frequent source of > newbie confusion (they understand directory trees, they don't grasp the > callback-based interface; generating (dirname, names) pairs instead would > match their mental model at once). *This* is the stuff for simple souls! Probably right, although I think that os.path.walk just has a bad API (since it gives you a whole directory at a time instead of giving you each file). > > Another good use of threads (suggested by Sam) is for GUI programming. > > An old GUI system, News by David Rosenthal at Sun, used threads > > programmed in PostScript -- very elegant (and it failed for other > > reasons -- if only he had used Python instead :-). > > > > On the other hand, having written lots of GUI code using Tkinter, the > > event-driven version doesn't feel so bad to me. Threads would be nice > > when doing things like rubberbanding, but I generally agree with > > Ousterhout's premise that event-based GUI programming is more reliable > > than thread-based. Every time your Netscape freezes you can bet > > there's a threading bug somewhere in the code. 
> > I don't use Netscape, but I can assure you the same is true of Internet > Explorer -- except there the threading bug is now somewhere in the OS <0.5 > wink>. > > Anyway, > > 1) There are lots of good uses for threads, and especially in the Windows > and (maybe) multiprocessor NumPy worlds. Those would really be happier with > "free-threaded threads", though. > > 2) Apart from pedagogical purposes, there probably isn't a use for my "fake > threads" that couldn't be done easier & better via a more direct (coroutine, > continuation) approach; and if I had fake threads, the first thing I'd do > for me is rewrite the generator and coroutine packages to use them. So, > yes: you win . > > 3) Python's current threads are good for overlapping I/O. Sometimes. And > better addressed by Sam's non-threaded "select" approach when you're dead > serious about overlapping lots of I/O. This is independent of Python, and is (I think) fairly common knowledge -- if you have 10 threads this works fine, but with 100s of them the threads themselves become expensive resources. But then you end up with contorted code, which is why high-performance systems require experts to write them. > They're also beaten into service > under Windows, but not without cries of anguish from Greg and Mark. Not sure what you mean here. > I don't know, Guido -- if all you wanted threads for was to speed up a > little I/O in as convoluted a way as possible, you may have been witness to > the invention of the wheel but missed that ox carts weren't the last > application . What were those applications of threads again you were talking about that could be serviced by fake threads that weren't coroutines/generators?
--Guido van Rossum (home page: http://www.python.org/~guido/) From gmcm at hypernet.com Sun Jul 4 05:41:32 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Sat, 3 Jul 1999 22:41:32 -0500 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <199907040156.VAA10874@eric.cnri.reston.va.us> References: Your message of "Thu, 01 Jul 1999 00:30:30 EDT." <000101bec37a$7465af00$309e2299@tim> Message-ID: <1281066233-51948648@hypernet.com> Hmmm. I jumped back into this one, but never saw my post show up... Threads (real or fake) are useful when more than one thing is "driving" your processing. It's just that in the real world (a place Tim visited, once, but didn't like - or was it vice versa?) those "drivers" are normally I/O. Guido complained that to do it right would require gathering up all the fds and doing a select. I don't think that's true (at least, for a decent fake thread). You just have to select on the one (to see if the I/O will work) and swap or do it accordingly. Also makes it a bit easier for portability (I thought I heard that Mac's select is limited to sockets). I see 2 questions. First, is there enough of an audience (Mac, mostly, I think) without native threads to make them worthwhile? Second, do we want to introduce yet more possibilities for brain-explosions by enabling coroutines / continuations / generators or some such? There is practical value there (as Sam has pointed out, and I now concur, watching my C state machine grow out of control with each new client request). I think the answer to both is probably "yes", and though they have a lot in common technically, they have totally different rationales. 
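One possible reading of "select on the one and swap", sketched in present-day Python 3 with generators standing in for fake threads. Everything here is an assumption made for illustration: the reader and run names, the convention that a task yields the socket it would otherwise block on, and the busy round-robin loop (a real scheduler would have to block somewhere when no task can run). It only covers tasks written in Python and says nothing about C code sitting in the middle of the stack.

```python
import select
import socket
from collections import deque


def reader(name, sock, log):
    """A fake thread: announce the socket it wants, then read once it is resumed."""
    while True:
        yield sock                 # "I would block on this socket"; let the scheduler decide
        data = sock.recv(1024)
        if not data:               # peer closed the connection
            return
        log.append((name, data))


def run(tasks):
    """Round-robin over the tasks, polling only the one socket each is waiting on."""
    waiting = deque()
    for task in tasks:
        try:
            waiting.append((task, next(task)))   # run each task up to its first wait
        except StopIteration:
            pass
    while waiting:
        task, sock = waiting.popleft()
        readable, _, _ = select.select([sock], [], [], 0)   # zero timeout: just a poll
        if readable:
            try:
                sock = next(task)  # the I/O will work now, so resume the task
            except StopIteration:
                continue           # task finished; drop it
        waiting.append((task, sock))   # otherwise swap to the next task


if __name__ == "__main__":
    a_in, a_out = socket.socketpair()
    b_in, b_out = socket.socketpair()
    a_out.sendall(b"hello")
    b_out.sendall(b"world")
    a_out.close()
    b_out.close()
    seen = []
    run([reader("a", a_in, seen), reader("b", b_in, seen)])
    print(seen)
```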
- Gordon From tim_one at email.msn.com Sun Jul 4 10:46:09 1999 From: tim_one at email.msn.com (Tim Peters) Date: Sun, 4 Jul 1999 04:46:09 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <1281066233-51948648@hypernet.com> Message-ID: <000d01bec5f9$a95fa9a0$ea9e2299@tim> [Gordon McMillan] > Hmmm. I jumped back into this one, but never saw my post show up... Me neither! An exclamation point because I see there's a recent post of yours in the Python-Dev archives, but I didn't get it in the mail either. > Threads (real or fake) are useful when more than one thing is > "driving" your processing. It's just that in the real world (a place > Tim visited, once, but didn't like - or was it vice versa?) those > "drivers" are normally I/O. Yes, but that's the consensus view of "real", and so suffers from "ten billion flies can't be wrong" syndrome . If you pitch a parallel system to the NSA, they give you a long list of problems and ask you to sketch the best way to solve them on your platform; as I recall, none had anything to do with I/O even under Guido's definition; instead tons of computation with difficult twists, and enough tight coupling to make threads the natural approach in most cases. If I said any more they'd terminate me with extreme prejudice, and the world doesn't get any realer than that . > Guido complained that to do it right would require gathering up all > the fds and doing a select. I don't think that's true (at least, for > a decent fake thread). You just have to select on the one (to see if > the I/O will work) and swap or do it accordingly. Also makes it a bit > easier for portability (I thought I heard that Mac's select is > limited to sockets). Can you flesh out the "swap" part more? That is, we're in the middle of some C code, so the C stack is involved in the state that's being swapped, and under fake threads we don't have a real thread to magically capture that. > I see 2 questions. 
First, is there enough of an audience (Mac, > mostly, I think) without native threads to make them worthwhile? > Second, do we want to introduce yet more possibilities for > brain-explosions by enabling coroutines / continuations / generators > or some such? There is practical value there (as Sam has pointed out, > and I now concur, watching my C state machine grow out of control > with each new client request). > > I think the answer to both is probably "yes", and though they have a > lot in common technically, they have totally different rationales. a) Generators aren't enough for Sam's designs. b) Fake threads are roughly comparable to coroutines and continuations wrt power (depending on implementation details, continuations may be strictly most powerful, and coroutines least). c) Christian's stackless Python can, I believe, already do full coroutines, and is close to doing full continuations. So soon we can kick the tires instead of each other . or-what-the-heck-we-can-all-kick-chris-ly y'rs - tim From tim_one at email.msn.com Sun Jul 4 10:45:58 1999 From: tim_one at email.msn.com (Tim Peters) Date: Sun, 4 Jul 1999 04:45:58 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <199907040156.VAA10874@eric.cnri.reston.va.us> Message-ID: <000c01bec5f9$a3e86e80$ea9e2299@tim> [Guido and Tim, Guido and Tim] Ouch! This is getting contentious. Let's unwind the "you said, I said, you said" business a bit. Among the three {coroutines, fake threads, continuations}, I expect any could be serviceably simulated via either of the others. There: just saved a full page of sentence diagramming . All offer a strict superset of generator semantics. It follows that, *given* either coroutines or continuations, I indeed see no semantic hole that would be plugged by fake threads.
But Python doesn't have any of the three now, and there are two respects in which fake threads may have an advantage over the other two: 1) Pedagogical, a friendlier sandbox for learning "real threads". 2) Python already has *a* notion of threads. So fake threads could be seen as building on that (variation of an existing concept, as opposed to something unprecedented). I'm the only one who seems to see merit in #2, so I won't mention it again: fake threads may be an aid to education, but other than that they're useless crap, and probably cause stains if not outright disk failure . About newbies, I've seen far too many try to learn threads to entertain the notion that they're easier than I think. Threads != parallel programming, though! Approaches like Gelernter's Linda, or Klappholz's "refined languages", *are* easy for newbies to pick up, because they provide clear abstractions that prevent the worst parallelism bugs by offering primitives that *can't* e.g. deadlock. threading.py is a step in the right direction (over the "thread" module) too. And while I don't know what Alice presents as a parallelism API, I'd bet 37 cents unseen that the Alice user doesn't build "parallel activities" out of thread.start_new_thread and raw mutii . About the rest, I think you have a more expansive notion of I/O than I do, although I can squint and see what you mean; e.g., I *could* view almost all of what Dragon's products do as I/O, although it's a real stretch for the thread that just polls the other threads making sure they're still alive . Back to quoting: >> Throwing explicit threads at this is like writing a recursive >> Fibonacci number generator in Scheme, but building the recursion >> yourself by hand out of explicit continuations . > Aren't you contradicting yourself? You say that threads are > ubiquitous and easy on Windows (and I agree), yet you claim that > threads are overkill for doing two kinds of I/O or one kind of I/O and > some computation in parallel...?
They're a general approach (like continuations) but, yes, given an asynch I/O interface most times I'd much rather use the latter (like I'd rather use recursion directly when it's available). BTW, I didn't say threads were "easy" under Windows: cheap, reliable & ubiquitous, yes. They're easier than under many other OSes thanks to a rich collection of system-supplied thread gimmicks that actually work, but no way are they "easy". Like you did wrt hiding "thread" under "threading", even under Windows real projects have to create usable app-specific thread-based abstractions (c.f. your on-target remark about Netscape & thread bugs). > I'm also thinking of Java threads. Yes, the GC thread is one of those > computational threads you are talking about, but I think the examples > I've seen otherwise are mostly about having one GUI component (e.g. an > applet) independent from other GUI components (e.g. the browser). To > me that's overlapping I/O, since I count GUI events as I/O. Whereas I don't. So let's just agree to argue about this one with ever-increasing intensity . > ... > What makes them unreal except for the interpreter lock? Python > threads are always OS threads, and that makes them real enough for > most purposes... We should move this part to the Thread-SIG; Mark & Greg are doubtless chomping at the bit to rehash the headaches the global lock causes under Windows ; I'm not so keen either to brush off the potential benefits of multiprocessor parallelism, particularly not with the price of CPUs falling into spare-change range. > (I'm not sure if there are situations on uniprocessors where the > interpreter lock screws things up that aren't the fault of the > extension writer -- typically, problems arise when an extension does > some blocking I/O but doesn't place Py_{BEGIN,END}_ALLOW_THREADS > macros around the call.) Hmm! What kinds of problems happen then? 
Just a lack of hoped-for overlap, or actual deadlock (the I/O thread needing another thread to proceed for itself to make progress)? If the latter, the extension writer's view of who's at fault may differ from ours . >> (e.g., my employer ships heavily threaded Windows apps of various >> kinds, and overlapped I/O isn't a factor in any of them; it's mostly >> a matter of algorithm factoring to keep the real-time incestuous >> subsystems from growing impossibly complex, and in some of the very >> expensive apps also a need to exploit multiple processors). > Hm, you admit that they sometimes want to use multiple CPUs, which was > explicitly excluded from our discussion (since fake threads don't help > there), I've been ranting about both fake threads and real threads, and don't recall excluding anything; I do think I *should* have, though . > and I bet that they are also watching some kind of I/O (e.g. whether the > user says some more stuff). Sure, and whether the phone rings, and whether text-to-speech is in progress, and tracking the mouse position, and all sorts of other arguably I/O-like stuff too. Some of the subsystems are thread-unaware legacy or 3rd-party code, and need to run in threads dedicated to them because they believe they own the entire machine (working via callbacks). The coupling is too tight to afford IPC mechanisms, though (i.e., running these in a separate process is not an option). Mostly it's algorithm-factoring, though: text-to-speech and speech-to-text both require mondo complex processing, and the "I/O part" of each is a small link at an end of a massive chain. Example: you say something, and you expect to see "the result" the instant you stop speaking. But the CPU cycles required to recognize 10 seconds of speech consume, alas, about 10 seconds.
So we *have* to overlap the speech collection with the signal processing, the acoustic feature extraction, the acoustic scoring, the comparison with canned acoustics for many tens of thousands of words, the language modeling ("sounded most like 'Guido', but considering the context they probably said 'ghee dough'"), and so on. You simply can't write all that as a monolithic algorithm and have a hope of it working; it's most naturally a pipeline, severely complicated in that what pops out of the end of the first stage can have a profound effect on what "should have come out" at the start of the last stage. Anyway, thread-based pseudo-concurrency is a real help in structuring all that. It's *necessary* to overlap speech collection (input) with computation and result-so-far display (output), but it doesn't stop there. > ... > Agreed -- I don't understand where green comes from at all. Does it > predate Java? Don't know, but I never heard of it before Java or outside of Solaris. [about generators & dumb linguists] > Strange. Maybe dumb linguists are better at simply copying examples > without thinking too much about them; personally I had a hard time > understanding what Icon was doing when I read about it, probably > because I tried to understand how it was done. For threads, I have a > simple mental model. For coroutines, my head explodes each time. Yes, I expect the trick for "dumb linguists" is that they don't try to understand. They just use it, and it works or it doesn't. BTW, coroutines are harder to understand because of (paradoxically!) the symmetry; generators are slaves, so you don't have to bifurcate your brain to follow what they're doing . >> print len(text), url >> >> may print the len(text) from one thread followed by the url >> from another). > Fine -- that's a great excuse to introduce locks in the next section. > (Most threading tutorials I've seen start by showing flawed examples > to create an appreciation for the need of locks.)
Even better, they start with an endless sequence of flawed examples that makes the reader wonder if there's *any* way to get this stuff to work . >> for x in backwards(sequence): >> print x >> >> def backwards(s): >> for i in xrange(len(s)-1, -1, -1): >> suspend s[i] > But backwards() also returns, when it's done. What happens with the > return value? I don't think a newbie would think to ask that: it would "just work" . Seriously, in Icon people quickly pick up that generators have a "natural lifetime", and when they return their life is over. It hangs together nicely enough that people don't have to think about it. Anyway, "return" and "suspend" both return a value; the only difference is that "return" kills the generator (it can't be resumed again after a return). The pseudo-Python above assumed that a generator signals the end of its life by returning None. Icon uses a different mechanism. > ... > Probably right, although I think that os.path.walk just has a bad API > (since it gives you a whole directory at a time instead of giving you > each file). Well, in Ping's absence I've generally fielded the c.l.py questions about tokenize.py too, and there's a pattern: non-GUI people simply seem to find callbacks confusing! os.path.walk has some other UI glitches (like "arg" is the 3rd argument to walk but the 1st arg to the callback, & people don't know what its purpose is anyway), but I think the callback is the core of it (& "arg" is an artifact of the callback interface). I can't help but opine that part of what people find so confusing about call/cc in Scheme is that it calls a function taking a callback argument too. Generators aren't strong enough to replace call/cc, but they're exactly what's needed to make tokenize's interface match the obvious mental model ("the program is a stream of tokens, and I want to iterate over that"); c.f. Sam's comments too about layers of callbacks vs "normal control flow". 
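The backwards() pseudo-code quoted above can be written almost word for word in later versions of Python, where the keyword yield plays the role of suspend. A minimal sketch in Python 3 follows; walk_tree and its (value, children) tuple layout are hypothetical, chosen only to illustrate the tree-iteration case, and the end of a generator's life is signalled by the language itself here, not by returning None as the pseudo-Python assumed.

```python
def backwards(s):
    """Yield the items of sequence s from last to first."""
    for i in range(len(s) - 1, -1, -1):
        yield s[i]                      # suspend here; resume on the next request
    # Falling off the end finishes the generator; a for loop over it simply stops.


def walk_tree(node):
    """Yield every value in a tree of (value, children) tuples, depth first."""
    value, children = node
    yield value
    for child in children:
        yield from walk_tree(child)     # hand control to the sub-generator


if __name__ == "__main__":
    for x in backwards("abc"):
        print(x)
    tree = ("root", [("a", [("a1", [])]), ("b", [])])
    for value in walk_tree(tree):
        print(value)
```

The caller sees a plain stream of values in both cases and never has to supply a callback.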
>> 3) Python's current threads are good for overlapping I/O. >> Sometimes. And better addressed by Sam's non-threaded "select" >> approach when you're dead serious about overlapping lots of I/O. > This is independent of Python, and is (I think) fairly common > knowledge -- if you have 10 threads this works fine, but with 100s of > them the threads themselves become expensive resources. I think people with a Unix background understand that, but not sure about Windows natives. Windows threads really are cheap, which easily slides into abuse; e.g., the recently-fixed electron-width hole in cleaning up thread states required extreme rates of thread death to provoke, and has been reported by multiple Windows users. An SGI guy was kind enough to confirm the test case died for him too, but did any non-Windows person ever report this bug? > But then you end up with contorted code which is why high-performance > systems require experts to write them. Which feeds back into Sam's agenda: the "advanced" control-flow gimmicks can be used by an expert to implement a high-performance system that doesn't require expertise to use. Fake threads would be good enough for that purpose too (while real threads aren't), although he's got his heart set on one of the others. >> I don't know, Guido -- if all you wanted threads for was to speed up a >> little I/O in as convoluted a way as possible, you may have been witness >> to the invention of the wheel but missed that ox carts weren't the last >> application . > What were those applications of threads again you were talking about > that could be serviced by fake threads that weren't coroutines/generators? First, let me apologize for the rhetorical excess there -- it went too far. Forgive me, or endure more of the same . Second, the answer is (of course) "none", but that was a rant about real threads, not fake ones. 
so-close-you-can-barely-tell-'em-apart-ly y'rs - tim From gmcm at hypernet.com Sun Jul 4 15:23:31 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Sun, 4 Jul 1999 08:23:31 -0500 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <000d01bec5f9$a95fa9a0$ea9e2299@tim> References: <1281066233-51948648@hypernet.com> Message-ID: <1281031342-54048300@hypernet.com> [I jump back into a needlessly contentious thread]: [Gordon McMillan - me] > > Threads (real or fake) are useful when more than one thing is > > "driving" your processing. It's just that in the real world (a place > > Tim visited, once, but didn't like - or was it vice versa?) those > > "drivers" are normally I/O. [Tim] > Yes, but that's the consensus view of "real", and so suffers from > "ten billion flies can't be wrong" syndrome . If you pitch a > parallel system to the NSA, I can assure you that gov't work isn't "real", even when the problem domain appears to be, which in this case is assuredly not true . But the point really is that (1) Guido's definition of "I/O" is very broad and (2) given that definition, it probably does account for 99% of the cases. Which is immaterial, if the fix for one fixes the others. > > Guido complained that to do it right would require gathering up all > > the fds and doing a select. I don't think that's true (at least, for > > a decent fake thread). You just have to select on the one (to see if > > the I/O will work) and swap or do it accordingly. Also makes it a bit > > easier for portability (I thought I heard that Mac's select is > > limited to sockets). > > Can you flesh out the "swap" part more? That is, we're in the > middle of some C code, so the C stack is involved in the state > that's being swapped, and under fake threads we don't have a real > thread to magically capture that. Sure - it's spelled "T I S M E R". 
IIRC, this whole thread started with Guido dumping cold water on the comment that perhaps Chris's work could yield green (er, "fake") threads. > > I see 2 questions. First, is there enough of an audience (Mac, > > mostly, I think) without native threads to make them worthwhile? > > Second, do we want to introduce yet more possibilities for > > brain-explosions by enabling coroutines / continuations / generators > > or some such? There is practical value there (as Sam has pointed out, > > and I now concur, watching my C state machine grow out of control > > with each new client request). > > > > I think the answer to both is probably "yes", and though they have a > > lot in common technically, they have totally different rationales. > > a) Generators aren't enough for Sam's designs. OK, but they're still (minorly) mind-expanding for someone from the orthodox C / Python world... > b) Fake threads are roughly comparable to coroutines and > continuations wrt power (depending on implementation details, > continuations may be strictly most powerful, and coroutines least). > > c) Christian's stackless Python can, I believe, already do full > coroutines, and is close to doing full continuations. So soon we > can kick the tires instead of each other . So then we're down to Tim faking the others from whatever Chris comes up with? Sounds dandy to me! (Yah, bitch and moan Tim; you'd do it anyway...). (And yes, we're on the "dev" list; this is all experimental; so Guido can just live with being a bit uncomfortable with it ). The rambling arguments have had to do with "reasons" for doing this stuff. I was just trying to point out that there are a couple valid but very different reasons: 1) Macs. 2) Sam.
almost-a-palindrome-ly y'rs - Gordon From tismer at appliedbiometrics.com Sun Jul 4 16:06:01 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Sun, 04 Jul 1999 16:06:01 +0200 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) References: <000c01bec5f9$a3e86e80$ea9e2299@tim> Message-ID: <377F6A49.B48E8000@appliedbiometrics.com> Just a few clarifications. I have no time, but need to share what I learned. Tim Peters wrote: > > [Guido and Tim, Guido and Tim] ... > Among the three {coroutines, fake threads, continuations}, I expect any > could be serviceably simulated via either of the others. There: just saved > a full page of sentence diagramming . All offer a strict superset of > generator semantics. I have just proven that this is not true. Full continuations cannot be expressed by coroutines. All the rest is true. Coroutines and fake threads just need the absence of the C stack. To be more exact: they need the current state of the C stack to be independent of executing bound Python code (which frames are). Now the big surprise: This *can* be done without removing the C stack. It can give more speed to let the stack wind up to some degree and wind down later. Even some Scheme implementations are doing this. But the complexity of making this work correctly is even higher than that of being stackless whenever possible. So this is the basis, but improvements are possible and likely to appear. Anyway, with this, you can build fake threads, coroutines and generators. They all do need a little extra treatment: switching of context, how to stop a coroutine, how to catch exceptions and so on. You can do all that with some C code. I just believe that even that can be done with Python. Here the unsayable continuation word appears. You must have them if you want to try the above *in* Python. The reason why continuations are the hardest of the above to implement and cannot be expressed by them: A continuation is the future of some computation.
It allows the order of execution of a frame to be changed in a radical way. A frame can have as many as one dormant continuation for every function call that appears lexically, and it cannot predict which of these is actually a continuation.

From klm at digicool.com Sun Jul 4 16:30:00 1999
From: klm at digicool.com (Ken Manheimer)
Date: Sun, 4 Jul 1999 10:30:00 -0400
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
References: <000c01bec5f9$a3e86e80$ea9e2299@tim> <377F6A49.B48E8000@appliedbiometrics.com>
Message-ID: <002601bec629$b38eedc0$5a57a4d8@erols.com>

I have to say thank you, christian!

I think your intent - provide the basis for designers of python's advanced control mechanisms to truly explore, and choose the direction in a well informed way - is ideal, and it's a rare and wonderful opportunity to be able to pursue something like an ideal course. Thanks to your hard work.

Whatever comes of this, i think we all have at least refined our understandings of the issues - i know i have. (Thanks also to the ensuing discussion's clarity and incisiveness - i need to thank everyone involved for that...) I may not be able to contribute particularly to the implementation, but i'm glad to be able to grasp the implications as whatever proceeds, proceeds. And i actually expect that the outcome will be much better informed than it would have been without your following through on your own effort to understand. Yay!

Ken
klm at digicool.com

From gmcm at hypernet.com Sun Jul 4 20:25:20 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Sun, 4 Jul 1999 13:25:20 -0500
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
In-Reply-To: <377F6A49.B48E8000@appliedbiometrics.com>
Message-ID: <1281013244-55137551@hypernet.com>

I'll second Ken's congratulations to Christian!

[Christian]
> ... Full continuations
> cannot be expressed by coroutines. All the rest is true.

I beg enlightenment from someone more familiar with these high-falutin' concepts.
Would the following characterization be accurate?

All these beasts (continuations, coroutines, generators) involve the idea of "resumable", but:

 A generator's state is wholly self-contained
 A coroutine's state is not necessarily self-contained but it is stable
 Continuations may have volatile state.

Is this right, wrong, necessary, sufficient...??

goto-beginning-to-look-attractive-ly y'rs

- Gordon

From bwarsaw at cnri.reston.va.us Mon Jul 5 00:14:36 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Sun, 4 Jul 1999 18:14:36 -0400 (EDT)
Subject: [Python-Dev] Mail getting lost? (was RE: Fake threads)
References: <1281066233-51948648@hypernet.com> <000d01bec5f9$a95fa9a0$ea9e2299@tim>
Message-ID: <14207.56524.360202.939414@anthem.cnri.reston.va.us>

>>>>> "TP" == Tim Peters writes:

    TP> Me neither! An exclamation point because I see there's a
    TP> recent post of yours in the Python-Dev archives, but I didn't
    TP> get it in the mail either.

A bad performance problem in Mailman was causing cpu starvation and (I'm surmising) lost messages. I believe I've fixed this in the version currently running on python.org.

If you think messages are showing up in the archives but you are still not seeing them delivered to you, please let me know via webmaster at python.org!

-Barry

From guido at CNRI.Reston.VA.US Mon Jul 5 14:12:41 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Mon, 05 Jul 1999 08:12:41 -0400
Subject: [Python-Dev] Welcome Jean-Claude Wippler
Message-ID: <199907051212.IAA11729@eric.cnri.reston.va.us>

We have a new python-dev member. Welcome, Jean-Claude! (It seems you are mostly interested in lurking, since you turned on digest mode :-)

Remember, the list's archives and member list are public; both are accessible via

  http://www.python.org/mailman/listinfo/python-dev

I would welcome more members -- please suggest names and addresses to me!
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at CNRI.Reston.VA.US Mon Jul 5 14:06:03 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Mon, 05 Jul 1999 08:06:03 -0400
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
In-Reply-To: Your message of "Sun, 04 Jul 1999 13:25:20 CDT." <1281013244-55137551@hypernet.com>
References: <1281013244-55137551@hypernet.com>
Message-ID: <199907051206.IAA11699@eric.cnri.reston.va.us>

> [Christian]
> > ... Full continuations
> > cannot be expressed by coroutines. All the rest is true.

[Gordon]
> I beg enlightenment from someone more familiar with these
> high-falutin' concepts. Would the following characterization be
> accurate?
>
> All these beasts (continuations, coroutines, generators) involve the
> idea of "resumable", but:
>
>  A generator's state is wholly self-contained
>  A coroutine's state is not necessarily self-contained but it is stable
>  Continuations may have volatile state.
>
> Is this right, wrong, necessary, sufficient...??

I still don't understand all of this (I have not much of an idea of what Christian's search for hidden registers is about and what kind of analysis he needs) but I think of continuations as requiring (theoretically) copying the current stack (to and from), while generators and coroutines just need their own piece of stack set aside.

The difference between any of these and threads (fake or real) is that they pass control explicitly, while threads (typically) presume pre-emptive scheduling, i.e. they make independent parallel progress without explicit synchronization. (Hmm, how do you do this with fake threads? Or are these only required to switch whenever you touch a mutex?)

I'm not sure if there's much of a difference between generators and coroutines -- it seems just the termination convention. (Hmm... would/should a generator be able to raise an exception in its caller? A coroutine?)
--Guido van Rossum (home page: http://www.python.org/~guido/)

From tim_one at email.msn.com Mon Jul 5 08:55:02 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Mon, 5 Jul 1999 02:55:02 -0400
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
In-Reply-To: <1281013244-55137551@hypernet.com>
Message-ID: <000101bec6b3$4e752be0$349e2299@tim>

[Gordon]
> I beg enlightenment from someone more familiar with these
> high-falutin' concepts. Would the following characterization be
> accurate?
>
> All these beasts (continuations, coroutines, generators) involve the
> idea of "resumable", but:
>
>  A generator's state is wholly self-contained
>  A coroutine's state is not necessarily self-contained but it is
>  stable
>  Continuations may have volatile state.
>
> Is this right, wrong, necessary, sufficient...??
>
> goto-beginning-to-look-attractive-ly y'rs

"goto" is deliciously ironic, for a reason to become clear. Here's my biased short course.

NOW

First, I have the feeling most people would panic if we simply described Python's current subroutine mechanism under a new name <0.9 wink>. I'll risk that. When Python makes a call, it allocates a frame object. Attached to the frame is the info everyone takes for granted so thinks is "simple & obvious". Chiefly,

    "the locals" (name -> object bindings)

    a little evaluation stack for holding temps and dynamic block-nesting info

    the offset to the current bytecode instruction, relative to the start of the code object's fixed (immutable) bytecode vector

When a subroutine returns, it decrefs the frame and then the frame typically goes away; if it returns because of an exception, though, traceback objects may keep the frame alive.

GENERATORS

Generators add two new abstract operations, "suspend" and "resume". When a generator suspends, it's exactly like a return today except we simply decline to decref the frame. That's it! The locals, and where we are in the computation, aren't thrown away.
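For readers following along with a later Python: the language did eventually grow generators (the yield statement, which did not exist when this was posted), and they behave much as the suspend description says. The sketch below is an illustration under that assumption, not something that ran in 1999; the name countdown is made up for the example.

```python
import inspect


def countdown(n):
    # The local n and the position after each yield live in the generator's frame.
    while n > 0:
        yield n          # "suspend": hand a value to the caller, keep the frame
        n = n - 1


gen = countdown(3)
first = next(gen)                            # run until the first suspend
saved_locals = dict(gen.gi_frame.f_locals)   # the frame is still alive, locals intact
second = next(gen)                           # picks up right after the yield
rest = list(gen)                             # drain whatever is left
state_after = inspect.getgeneratorstate(gen)
```

While the generator is suspended its frame object is reachable as gen.gi_frame, which is a fairly direct view of "we simply decline to decref the frame".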
A "resume" then consists of *re*starting the frame at its next bytecode instruction, with the retained frame's locals and eval stack just as they were.

Some generator properties:

+ In implementation terms a trivial variation on what Python currently does.

+ They're asymmetric: "suspend" is something only a generator can do, and "resume" something only its caller can do (this does not preclude a generator from being "the caller" wrt some other generator, though, and indeed that's very useful in practice).

+ A generator always returns control directly to its caller, at the point the caller invoked the generator. And upon resumption, a generator always picks up where it left off.

+ Because a generator remembers where it is and what its locals are, its state and "what to do next" don't have to be encoded in global data structures then decoded from scratch upon entry. That is, whenever you build a little (or large!) state machine to figure out "what to do next" from a collection of persistent flags and state vrbls, chances are good there's a simple algorithm dying to break free of that clutter.

COROUTINES

Coroutines add only one new abstract operation, "transfer". They're fully symmetric so can get away with only one. "transfer" names a coroutine to transfer to, and gives a value to deliver to it (there are variations, but this one is common & most useful). When A transfers to B, it acts like a generator "suspend" wrt A and like a generator "resume" wrt B. So A remembers where it is, and what its locals etc are, and B gets restarted from the point *it* last transferred to someone else.

Coroutines grew up in simulation languages because they're an achingly natural way to model independent objects that interact with feedback.
There each object (which may itself be a complex system of other stuff) is written as an infinite loop, transferring control to other objects when it has something to tell them, and transferred to by other objects when they have something to tell it.

A Unix pipeline "A | B | C | D" doesn't exploit the full power but is suggestive. A may be written as

    while 1:
        x = compute my next output
        B.transfer(x)     # resume B with my output

B as

    while 1:
        x = A.transfer()  # resume A to get my input
        y = compute something from x and my own history
        C.transfer(y)     # resume C with my output

C as

    while 1:
        x = B.transfer()  # resume B to get my input
        y = compute something from x and my own history
        D.transfer(y)     # resume D with my output

and D as

    while 1:
        x = C.transfer()  # resume C to get my input
        y = compute something from x and my own history
        print y

If e.g. C collapses pairs of values from B, it can be written instead as

    while 1:
        # get a pair of B's
        x = B.transfer()
        y = B.transfer()
        z = f(x, y, whatever)
        D.transfer(z)     # resume D with my output

It's a local modification to C: B doesn't know and shouldn't need to know. This keeps complex algorithms manageable as things evolve.

Initialization and shutdown can be delicate, but once the pipe is set up it doesn't even matter which of {A, B, C, D} first gets control! You can view A as pushing results through the pipe, or D as pulling them, or whatever. In reality they're all equal partners.

Why these are so much harder to implement than generators: "transfer" *names* who next gets control, while generators always return to their (unnamed) caller. So a generator simply "pops the stack" when it suspends, while coroutine flow need not be (and typically isn't) stack-like.

In Python this is currently a coroutine-killer, because the C stack gets intertwined. So if coroutine A merely calls (in the regular sense) function F, and F tries to transfer to coroutine B, the info needed to resume A includes the chunk of the C stack between A and F.
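The "A | B | C | D" pipeline above can be approximated with the generators later Python provides, with the caveat that generators are only the asymmetric half of the story: in this sketch D pulls and everything upstream is resumed on demand, rather than the stages transferring to each other as equals. The stage names and the running-sum "history" are invented for the illustration.

```python
def stage_a(values):
    # A: produce outputs one at a time.
    for x in values:
        yield x


def stage_b(source):
    # B: compute something from each input and its own history (a running sum).
    total = 0
    for x in source:
        total += x
        yield total


def stage_c(source):
    # C: collapse pairs of B's outputs; B neither knows nor cares.
    for x in source:
        try:
            y = next(source)
        except StopIteration:
            return            # odd value left over: drop it and shut down
        yield (x, y)


def stage_d(source):
    # D: consume the end of the pipe (collecting instead of printing).
    return list(source)


results = stage_d(stage_c(stage_b(stage_a([1, 2, 3, 4, 5]))))
```

Swapping stage_c for a one-value-at-a-time version would not touch stage_b at all, which is the "local modification" point made above.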
And that's why the Python coroutine implementation I referenced earlier uses threads under the covers (where capturing pieces of the C stack isn't a problem).

Early versions of coroutines didn't allow for this, though! At first coroutines could only transfer *directly* to other coroutines, and as soon as a coroutine made "a regular call" transfers were prohibited until the call returned (unless the called function kicked off a brand new collection of coroutines, which could then transfer among themselves -- making the distinction leads to convoluted rules, so modern practice is to generalize from the start).

Then the current state of each coroutine was contained in a single frame, and it's really no harder to implement than generators. Knuth seems to have this restricted flavor of coroutine in mind when he describes generator behavior as "semi-coroutine".

CONTINUATIONS

Given the pedagogical structure so far, you're primed to view continuations as an enhancement of coroutines. And that's exactly what will get you nowhere. Continuations aren't more elaborate than coroutines, they're simpler. Indeed, they're simpler than generators, and even simpler than "a regular call"! That's what makes them so confusing at first: they're a different *basis* for *all* call-like behavior. Generators and coroutines are variations on what you already know; continuations challenge your fundamental view of the universe.

Legend has it they were discovered when theorists were trying to find a solid reason for why goto statements suck: the growth of "denotational semantics" (DS) boomed at the same time "structured programming" took off. The former is a solid & fruitful approach to formally specifying the semantics of programming languages, built on the lambda calculus (and so dear to the Lisp/Scheme community -- this all ties together, of course).
The early hope was that goto statements would prove to present intractable problems for formal specification, and then "that's why they suck: we can't even sort them out on paper, let alone in practice".

But in one of God's cleverer tricks on the programming world, the semantics of goto turned out to be trivial: at a branch point, you can go one of two ways. Represent one of those ways by a function f that computes what happens if you branch one way, and the other way by a function g. Then an if+goto simply picks one of f or g as "the continuation" of the program, depending on whether the "if" condition is true or false. And a plain goto simply replaces the current continuation with a different one (representing what happens at the branch target) unconditionally. So goto turned out to be simpler (from the DS view) than even an assignment stmt!

I've often suspected theorists were *surprised* (and maybe appalled <0.7 wink>) when the language folks went on to *implement* the continuation idea. Don't really know, but suppose it doesn't matter anyway. The fact is we're stuck with them now.

In theory a continuation is a function that computes "the rest of the program", or "its future". And it really is like a supercharged goto! It's the formal DS basis for all control flow, from goto stmts to exception handling, subsuming vanilla call flow, recursion, generators, coroutines, backtracking, and even loops along the way.

To a certain frame of mind (like Sam's, and Christian is temporarily under his evil influence), this relentless uniformity & consistency of approach is very appealing. Guido tends to like his implementations to mirror his surface semantics, though, and if he has ten constructs they're likely to be implemented ten ways. View that as a preview of future battles that have barely been hinted at so far <0.3 wink>.
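The f-or-g description of goto can be made concrete with a toy in ordinary Python. This is only a sketch of the idea, assuming a dict as "the store" and plain functions as continuations; the label names (start, loop, body, done) and the trampoline in run are invented for the example and say nothing about how any real implementation works.

```python
def start(state):
    state["total"] = 0
    state["i"] = 1
    return loop            # plain goto: unconditionally pick the next continuation


def loop(state):
    # if+goto: pick one of two continuations depending on the test
    return body if state["i"] <= state["n"] else done


def body(state):
    state["total"] += state["i"]
    state["i"] += 1
    return loop            # goto back to the test


def done(state):
    return None            # no continuation left: the program is over


def run(n):
    state = {"n": n}
    k = start
    while k is not None:   # keep invoking "the rest of the program"
        k = k(state)
    return state["total"]
```

Here run(4) sums 1 through 4 without a single Python loop inside the "program" itself: the loop exists only as a cycle in which continuation each label hands back.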
Anyway, in implementation terms a continuation "is like" what a coroutine would be if you could capture its resumption state at any point (even without the coroutine's knowledge!) and assign that state to a vrbl. So we could say it adds an abstract operation "capture", which essentially captures the program counter, call stack, and local (in Python terms) "block stack" at its point of invocation, and packages all that into a first-class "continuation object". IOW, a building block on top of which a generator's suspend, and the suspend half of a coroutine transfer, can be built. In a pure vision, there's no difference at all between a regular return and the "resume" half of a coroutine transfer: both amount to no more than picking some continuation to evaluate next.

A continuation can be captured anywhere (even in the middle of an expression), and any continuation can be invoked at will from anywhere else. Note that "invoking a continuation" is *not* like "a call", though: it's abandoning the current continuation, *replacing* it with another one. In formal DS this isn't formally true (it's still "a call" -- a function application), but in practice it's a call that never returns to its caller so the implementation takes a shortcut.

Like a goto, this is as low-level as it gets, and even hard-core continuation fans don't use them directly except as a means to implement better-behaved abstractions.

As to whether continuations have "volatile state", I'm not sure what that was asking. If a given continuation is invoked more than once (which is something that's deliberately done when e.g. implementing backtracking searches), then changes made to the locals by the first invocation are visible to the second (& so on), so maybe the answer is "yes". It's more accurate to think of a continuation as being immutable, though: it holds a reference to the structure that implements name bindings, but does not copy (save or restore) the bindings.
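The reference-versus-copy distinction can be imitated with ordinary closures. Python has no continuations, so this is strictly an analogy: a dict stands in for the structure that implements name bindings, and the two inner functions (names invented here) contrast the sharing behavior described above with the save-and-restore behavior it rules out.

```python
def make_resume_points():
    bindings = {"i": 0}                 # stands in for a frame's name bindings

    def resume_by_reference():
        # What the text describes: the bindings are shared, not saved/restored,
        # so each invocation sees the changes made by the previous one.
        bindings["i"] = bindings["i"] + 1
        return bindings["i"]

    snapshot = dict(bindings)           # the alternative design: copy at capture time

    def resume_from_copy():
        local = dict(snapshot)          # restore the saved bindings every time
        local["i"] = local["i"] + 1
        return local["i"]

    return resume_by_reference, resume_from_copy


by_ref, from_copy = make_resume_points()
ref_results = [by_ref(), by_ref(), by_ref()]
copy_results = [from_copy(), from_copy(), from_copy()]
```

The by-reference version counts 1, 2, 3 across invocations, while the copying version would report 1 every time.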
Quick example, given:

    (define continuation 0)

    (define (test)
      (let ((i 0))
        (call/cc
          (lambda (k)
            (set! continuation k)))
        (set! i (+ i 1))
        i))

That's like the Python:

    def test():
        i = 0
        global continuation
        continuation = magic to resume at the start of the next line
        i = i + 1
        return i

Then (this is interactive output from a Scheme shell):

    > (test)  ; Python "test()"
    1
    > (continuation)  ; Python "continuation()"
    2
    > (continuation)
    3
    > (define thisguy continuation)  ; Python "thisguy = continuation"
    > (test)
    1
    > (continuation)
    2
    > (thisguy)
    4
    >

too-simple-to-be-obvious?-ly y'rs - tim

From bwarsaw at cnri.reston.va.us Mon Jul 5 18:55:01 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Mon, 5 Jul 1999 12:55:01 -0400 (EDT)
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
References: <1281013244-55137551@hypernet.com> <000101bec6b3$4e752be0$349e2299@tim>
Message-ID: <14208.58213.449486.917974@anthem.cnri.reston.va.us>

Wow. That was by far the clearest tutorial on the subject I think I've read. I guess we need (for Tim to have) more 3 day holiday weekends.

i-vote-we-pitch-in-and-pay-tim-to-take-/every/-monday-off-so-he-can-write-
more-great-stuff-like-this-ly y'rs,
-Barry

From skip at mojam.com Mon Jul 5 19:54:45 1999
From: skip at mojam.com (Skip Montanaro)
Date: Mon, 5 Jul 1999 13:54:45 -0400 (EDT)
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
In-Reply-To: <14208.58213.449486.917974@anthem.cnri.reston.va.us>
References: <1281013244-55137551@hypernet.com> <000101bec6b3$4e752be0$349e2299@tim> <14208.58213.449486.917974@anthem.cnri.reston.va.us>
Message-ID: <14208.61767.893387.713711@cm-24-29-94-19.nycap.rr.com>

    Barry> Wow. That was by far the clearest tutorial on the subject I
    Barry> think I've read. I guess we need (for Tim to have) more 3 day
    Barry> holiday weekends.

What he said.
Skip

From MHammond at skippinet.com.au Tue Jul 6 03:16:45 1999
From: MHammond at skippinet.com.au (Mark Hammond)
Date: Tue, 6 Jul 1999 11:16:45 +1000
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
In-Reply-To: <000101bec6b3$4e752be0$349e2299@tim>
Message-ID: <000401bec74d$37c8d370$0801a8c0@bobcat>

> NOW

No problems, fine sailing...

> GENERATORS

Cruising along - nice day to be out!

> COROUTINES

Such a pleasant day!

> CONTINUATIONS

Are they clouds I see?

> Given the pedagogical structure so far, you're primed to view
> continuations
> as an enhancement of coroutines. And that's exactly what will get you
> nowhere. Continuations aren't more elaborate than coroutines,
> they're simpler. Indeed, they're simpler than generators,

A storm warning...

> Legend has it they were discovered when theorists were trying
> to find a
> solid reason for why goto statements suck: the growth of
> "denotational
> semantics" (DS) boomed at the same time "structured
> programming" took off.
> The former is a solid & fruitful approach to formally specifying the

She's taking on water!

> In theory a continuation is a function that computes "the rest of the
> program", or "its future".

OK - before I abandon ship, I might need my hand-held. Before I start, let me echo Skip and Barry - an excellent precis of a topic I knew nothing about (as I made you painfully aware!) And I will avoid asking you to explain the above paragraph again for now :-)

I'm a little confused by how these work in practice. I can see how continuations provide the framework to do all these control things.

It is clear to me how you can capture the "state" of a running program. Indeed, this is exactly what it seems generators and coroutines do.

With continuations, how is the state captured or created? Eg, in the case of implementing a goto or a function call, there doesn't seem to me to be a state available.
Does the language supporting continuations allow you to explicitly create one from any arbitrary position? I think you sort-of answered that below:

> Anyway, in implementation terms a continuation "is like" what
> a coroutine
> would be if you could capture its resumption state at any point (even
> without the coroutine's knowledge!) and assign that state to
> a vrbl. So we

This makes sense, although it implies a "running state" is necessary for this to work. In the case of transferring control to somewhere you have never been before (eg, a goto or a new function call) how does this work?

Your example:
> def test():
>     i = 0
>     global continuation
>     continuation = magic to resume at the start of the next line
>     i = i + 1
>     return i

My main problem is that this looks closer to your description of a kind-of one-sided coroutine - ie, instead of only being capable of transferring control, you can assign the state. I can understand that fine. But in the example, the function _is_ aware its state is being captured - indeed, it is explicitly capturing it.

My only other slight conceptual problem was how you implement functions, as I don't understand how the concept of return values fits in at all. But I'm sure that would become clearer when the rest of the mud is wiped from my eyes.

And one final question: In the context of your tutorial, what do Chris' latest patches arm us with? Given my new-found expertise in this matter I would guess that the guts is there to have at least co-routines, as capturing the state of a running Python program, and restarting it later is possible. I'm still unclear about continuations WRT "without the co-routine's knowledge", so really unsure what is needed here...

The truly final question:-) Assuming Chris' patches were 100% bug free and reliable (I'm sure they are very close :-) what would the next steps be to take advantage of it in a "clean" way? ie, assuming Guido blesses them, what exactly could I do in Python?
(All I really know is that the C stack has gone - that's it!)

Thanks for the considerable time it must be taking to enlighten us!

Mark.

From jcw at equi4.com Tue Jul 6 11:27:13 1999
From: jcw at equi4.com (Jean-Claude Wippler)
Date: Tue, 06 Jul 1999 11:27:13 +0200
Subject: [Python-Dev] Re: Welcome Jean-Claude Wippler
Message-ID: <3781CBF1.B360D466@equi4.com>

Thank you Guido, for admitting this newbie to Python-dev :)

[Guido: ... you are mostly interested in lurking ... digest mode ...]

Fear of being flooded by email, a little shy (who, me?), and yes, a bit of curiosity. Gosh, I got to watch my steps, you figured it all out :)

Thanks again. I went through the last month or so of discussion, and am fascinated by the topics and issues you guys are dealing with. And now, seeing Tim's generator/coroutine/continuations description is fantastic. Makes it obvious that I'm already wasting way too much bandwidth.

When others come to mind, I'll let them know about this list. But so far, everyone I can come up with already is a member, it seems.

-- Jean-Claude

From guido at CNRI.Reston.VA.US Tue Jul 6 17:08:37 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 06 Jul 1999 11:08:37 -0400
Subject: [Python-Dev] Welcome to Chris Petrilli
Message-ID: <199907061508.LAA12663@eric.cnri.reston.va.us>

Chris, would you mind posting a few bits about yourself? Most of the people on this list have met each other at one point or another (with the big exception of the elusive Tim Peters :-); it's nice to know more than a name...
--Guido van Rossum (home page: http://www.python.org/~guido/)

From petrilli at amber.org Tue Jul 6 17:16:10 1999
From: petrilli at amber.org (Christopher Petrilli)
Date: Tue, 6 Jul 1999 11:16:10 -0400
Subject: [Python-Dev] Welcome to Chris Petrilli
In-Reply-To: <199907061508.LAA12663@eric.cnri.reston.va.us>; from Guido van Rossum on Tue, Jul 06, 1999 at 11:08:37AM -0400
References: <199907061508.LAA12663@eric.cnri.reston.va.us>
Message-ID: <19990706111610.A4585@amber.org>

On Tue, Jul 06, 1999 at 11:08:37AM -0400, Guido van Rossum wrote:
> Chris, would you mind posting a few bits about yourself? Most of the
> people on this list have met each other at one point or another (with
> the big exception of the elusive Tim Peters :-); it's nice to know
> more than a name...

As we are all aware, Tim is simply the graduate project of an AI student, running on a network of Symbolics machines :-)

Honestly though, about me? Um, well, I'm now (along with Brian Lloyd) the Product Management side of Digital Creations, and Zope, so I have a very vested interest in seeing Python succeed---besides my general belief that the better language SHOULD win. My background is actually in architecture, but I've spent the past 5 years working in the cryptography world, mostly in smart cards and PKI. My computer background is bizarre, twisted and quite nefarious... having grown up on a PDP-8/e, rather than PCs. And if the fact that I own 4 Lisp machines means anything, I'm afraid to ask what!

For now, I'm just going to watch the masters at work.
:-)

Chris
--
| Christopher Petrilli        ``Television is bubble-gum for
| petrilli at amber.org         the mind.''-Frank Lloyd Wright

From tim_one at email.msn.com Wed Jul 7 03:52:15 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 6 Jul 1999 21:52:15 -0400
Subject: [Python-Dev] Fancy control flow
Message-ID: <000301bec81b$56e87660$c99e2299@tim>

Responding to a msg of Guido's that shows up in the archives but didn't come across the mail link (the proper authorities have been notified, and I'm sure appropriate heads will roll at the appropriate pace ...).

> From guido at CNRI.Reston.VA.US Mon, 05 Jul 1999 08:06:03 -0400
> Date: Mon, 05 Jul 1999 08:06:03 -0400
> From: Guido van Rossum guido at CNRI.Reston.VA.US
> Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)

[generators, coroutines, continuations]

> I still don't understand all of this (I have not much of an idea of
> what Christian's search for hidden registers is about and what kind of
> analysis he needs) but I think of continuations as requiring
> (theoretically) copying the current stack (to and from), while
> generators and coroutines just need their own piece of stack set aside.

A generator needs one frame, period (its own!). "Modern" coroutines can get more involved, mixing regular calls with coroutine transfers in arbitrary ways.

About Christian's mysterious quest, we've been pursuing it offline. By "hidden registers" I think he means stuff on the eval stack that should *not* be saved/restored as part of a continuation's state. It's not clear to me that this isn't the empty set. The most debatable case I've seen is the Python "for" loop, which hides an anonymous loop counter on the stack.

    for i in seq:
        if func1(i):
            func2(i)

This is more elaborate than necessary, but since it's the one we've discussed offline I'll just stick with it.

Suppose func1 saves a continuation on the first iteration, and func2 invokes that continuation on the fifth.
Does the resumed continuation "see" the loop as being on its first iteration or as being on its fifth?

In favor of the latter is that the loop above "should be" equivalent to this:

    hidden = 0
    while 1:
        try:
            temp = seq[hidden]
        except IndexError:
            break
        hidden = hidden + 1
        i = temp
        if func1(i):
            func2(i)

since that's what "for" *does* in Python. With the latter spelling, it's clear that the continuation should see the loop as being on its fifth iteration (continuations see changes in bindings, and making the loop counter a named local exposes it to that rule). But if the entire eval stack is (conceptually) saved/restored, the loop counter is part of it, so the continuation will see the loop counter at its old value.

I think it's arguable either way, and argued in favor of "fifth" initially. Now I'm uncertain, but leaning toward "first".

> The difference between any of these and threads (fake or real) is that
> they pass control explicitly, while threads (typically) presume
> pre-emptive scheduling, i.e. they make independent parallel progress
> without explicit synchronization.

Yes.

> (Hmm, how do you do this with fake threads? Or are these only required
> to switch whenever you touch a mutex?)

I'd say they're only *required* to switch when one tries to acquire a mutex that's already locked. It would be nicer to switch them as ceval already switches "real threads", that is give another one a shot every N bytecodes.

> I'm not sure if there's much of a difference between generators and
> coroutines -- it seems just the termination convention.

A generator is a semi-coroutine, but is the easier half.

> (Hmm... would/should a generator be able to raise an exception in its
> caller?

Definitely. This is all perfectly clear for a generator -- it has a unique & guaranteed still-active place to return *to*. Years ago I tried to rename them "resumable functions" to get across what a trivial variation of plain functions they really are ...

> A coroutine?)

This one is muddier.
A (at line A1) transfers to B (at line B1), which transfers at line B2 to A (at line A2), which at line A3 transfers to B (at line B3), and B raises an exception at line B4. The obvious thing to do is to pass it on to line A3+1, but what if that doesn't catch it either? We got to A3 from A2 from B2 from B1, but B1 is long gone.

That's a real difference with generators: resuming a generator is stack-like, while a co-transfer is just moving control around a flat graph, like pushing a pawn around a chessboard.

The coroutine implementation I posted 5 years ago almost punted on this one: if any coroutine suffered an unhandled exception, all coroutines were killed and an EarlyExit exception was raised in "the main coroutine" (the name given to the thread of your code that created the coroutine objects to begin with).

Deserves more thought than that, though.

or-maybe-it-doesn't-ly y'rs - tim

From tim_one at email.msn.com Wed Jul 7 06:18:13 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Wed, 7 Jul 1999 00:18:13 -0400
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
In-Reply-To: <000401bec74d$37c8d370$0801a8c0@bobcat>
Message-ID: <000001bec82f$bc171500$089e2299@tim>

[Mark Hammond]
> ...
> Thanks for the considerable time it must be taking to enlighten us!

You're welcome, but the holiday weekend is up and so is my time. Thank (all of) *you* for the considerable time it must take to endure all this!

Let's hit the highlights (or lowlights, depending on your view):

> ...
> I'm a little confused by how these [continuations] work in practice.

Very delicately. Earlier I posted a continuation-based implementation of generators in Scheme, and Sam did the same for a hypothetical Python with a call/cc equivalent. Those are typical enough. Coroutines are actually easier to implement (using continuations), thanks to their symmetry.

Again, though, you never want to muck with continuations directly! They're too wild.
You get an expert to use them in the bowels of an implementation of something else.

> ...
> It is clear to me how you can capture the "state" of a running program.
> Indeed, this is exactly what it seems generators and coroutines do.

Except that generators need only worry about their own frame. Another way to view it is to think of the current computation being run by a (real) thread -- then capturing a continuation is very much like making a frozen clone of that thread, stuffing it away somewhere for later thawing.

> With continuations, how is the state captured or created?

There are, of course, many ways to implement these things. Christian is building them on top of the explicit frame objects Python already creates, and that's a fine way for Python. Guido views it as cloning the call stack, and that's accurate too.

>> Anyway, in implementation terms a continuation "is like" what
>> a coroutine would be if you could capture its resumption state at
>> any point (even without the coroutine's knowledge!) and assign that
>> state to a vrbl.

> This makes sense, although it implies a "running state" is necessary for
> this to work.

In implementations (like Chris's) that do it all dynamically at runtime, you bet: you not only need a "running state", you can only capture a continuation at the exact point (the specific bytecode) you run the code to capture it. In fact, there *is* "a continuation" at *every* dynamic instance of every bytecode, and the question is then simply which of those you want to save.

> In the case of transferring control to somewhere you have never been
> before (eg, a goto or a new function call) how does this work?

Good eye: it doesn't in this scheme. The "goto" business is a theoretical transformation, in a framework where *every* operation is modeled as a function application, and an entire program's life is modeled as a single function call. Some things are very easy to do in theory.
> Your example: >> def test(): >> i = 0 >> global continuation >> continuation = magic to resume at the start of the next line >> i = i + 1 >> return i > My main problem is that this looks closer to your description of a kind-of > one-sided coroutine - ie, instead of only being capable of transfering > control, you can assign the state. I can understand that fine. Good! > But in the example, the function _is_ aware its state is being > captured - indeed, it is explicitely capturing it. In real life, "magic to resume at the start of the next line" may be spelled concretely as e.g. xyz(7) or even a.b That is, anywhere in "test" any sort of (explicit or implicit) call is made *may* be part of a saved continuation, because the callee can capture one -- with or without test's knowledge. > My only other slight conceptual problem was how you implement functions, > as I dont understand how the concept of return values fits in at all. Ya, I didn't mention that. In Scheme, the act of capturing a continuation returns a value. Like so: (define c #f) ; senseless, but Scheme requires definition before reference (define (test) (print (+ 1 (call/cc (lambda (k) (set! c k) 42)))) (newline)) The function called by call/cc there does two things: 1) Stores call/cc's continuation into the global "c". 2) Returns the int 42. > (test) 43 > Is that clear? The call/cc expression returns 42. Then (+ 1 42) is 43; then (print 43) prints the string "43"; then (newline) displays a newline; then (test) returns to *its* caller, which is the Scheme shell's read/eval/print loop. Now that whole sequence of operations-- what happens to the 42 and beyond --*is* "call/cc's continuation", which we stored into the global c. A continuation is itself "a function", that returns its argument to the context where the continuation was captured. So now e.g. 
> (c 12) 13 > c's argument (12) is used in place of the original call/cc expression; then (+ 1 12) is 13; then (print 13) prints the string "13"; then (newline) displays a newline; then (test) returns to *its* caller, which is *not* (c 12), but just as originally is still the Scheme shell's read/eval/print loop. That last point is subtle but vital, and maybe this may make it clearer: > (begin (c 12) (display "Tim lied!")) 13 > The continuation of (c 12) includes printing "Tim lied!", but invoking a continuation *abandons* the current continuation in favor of the invoked one. Printing "Tim lied!" wasn't part of c's future, so that nasty slur about Tim never gets executed. But: > (define liar #f) > (begin (call/cc (lambda (k) (set! liar k) (c 12))) (display "Tim lied!") (newline)) 13 > (liar 666) Tim lied! > This is why I stick to trivial examples . > And one final question: In the context of your tutorial, what do Chris' > latest patches arm us with? Given my new-found expertise in this matter > I would guess that the guts is there to have at least co-routines, > as capturing the state of a running Python program, and restarting it > later is possible. Im still unclear about continuations WRT "without the > co-routines knowledge", so really unsure what is needed here... Christian is taking his work very seriously here, and as a result is flailing a bit trying to see whether it's possible to do "the 100% right thing". I think he's a lot closer than he thinks he is <0.7 wink>, but in any case he's at worst very close to having full-blown continuations working. Coroutines already work. > The truly final question:-) Assuming Chris' patches were 100% bug free and > reliable (Im sure they are very close :-) what would the next steps be to > take advantage of it in a "clean" way? ie, assuming Guido blesses them, > what exactly could I do in Python? Nothing. What would you like to do? 
Sam & I tossed out a number of intriguing possibilities, but all of those build *on* what Christian is doing. You won't get anything useful out of the box unless somebody does the work to implement it. I personally have wanted generators in Python since '91, because they're extremely useful in the useless things that I do . There's a thread-based generator interface (Generator.py) in the source distribution that I occasionally use, but that's so slow I usually recode in Icon (no, I'm not a Scheme fan -- I *admire* it, but I rarely use it). I expect rebuilding that on Christian's work will yield a factor of 10-100 speedup for me (beyond losing the thread/mutex overhead, as Chris just pointed out on c.l.py resumption should be much faster than a Python call, since the frame is already set up and raring to go). Would be nice if the language grew some syntax to make generators pleasant as well as fast, but the (lack of) speed is what's really killing it for me now. BTW, I've never tried to "sell" coroutines -- let alone continuations. Just generators. I expect Sam will do a masterful job of selling those. send-today-don't-delay-couldn't-give-or-receive-a-finer-gift-ly y'rs - tim From tismer at appliedbiometrics.com Wed Jul 7 15:11:44 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Wed, 07 Jul 1999 15:11:44 +0200 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) References: <000001bec82f$bc171500$089e2299@tim> Message-ID: <37835210.F22A7EC9@appliedbiometrics.com> Tim Peters wrote: > > [Mark Hammond] > > ... > > Thanks for the considerable time it must be taking to enlightening us! > > You're welcome, but the holiday weekend is up and so is my time. Thank (all > of) *you* for the considerable time it must take to endure all this ! Just to let you know that I'm still there, thinking, not coding, still hesitating, but maybe I can conclude now and send it off. This discussion, and especially Tim's input was extremely helpful. 
He has spent considerable time reading my twisted examples, writing his own, hitting my chin, kicking my -censored-, and proving to me that the truth I was searching for doesn't exist. ... > Again, though, you never want to muck with continuations directly! They're > too wild. You get an expert to use them in the bowels of an implementation > of something else. Maybe with one exception: With careful coding, you can use a continuation at the head of a very deep recursion and use it as an early break if the algorithm fails. The effect is the same as bailing out with an exception, despite the fact that no "finally" clauses would be obeyed. It is just an incredibly fast jump out of something if you know what you are doing. > > With continuations, how is the state captured or created? > > There are, of course, many ways to implement these things. Christian is > building them on top of the explicit frame objects Python already creates, > and that's a fine way for Python. Guido views it as cloning the call stack, > and that's accurate too. Actually, it is both! What I use (and it works fine) are so-called "push-back frames". My continuations are always appearing in some call. In order to make the caller able to be resumed, I create a push-back frame *from* it. That means, my caller frame is duplicated behind its "f_back" pointer. The original frame stays in place but now becomes a continuation frame with the current stack state preserved. All other locals and stuff are moved to the clone in the f_back which is now the real one. This always works fine, since references to the original caller frame are all intact, just the frame's meaning is modified a little. Well, I will have to write a good paper... ... > I personally have wanted generators in Python since '91, because they're > extremely useful in the useless things that I do.
There's a > thread-based generator interface (Generator.py) in the source distribution > that I occasionally use, but that's so slow I usually recode in Icon (no, > I'm not a Scheme fan -- I *admire* it, but I rarely use it). I expect > rebuilding that on Christian's work will yield a factor of 10-100 speedup > for me (beyond losing the thread/mutex overhead, as Chris just pointed out > on c.l.py resumption should be much faster than a Python call, since the > frame is already set up and raring to go). I believe so. Well, I admit that the continuation approach is slightly too much for the coroutine/generator case, since those are exactly the cases that don't have the problem continuations suffer from a little: Switching between frames which cannot be reached more than once at a time doesn't need the stack copying/pushback at all. I'm still staying on the safe side for now. But since I have all refcounting accurate already, we can use it to figure out if a frame needs to be copied at all. > Would be nice if the language grew some syntax to make generators pleasant > as well as fast, but the (lack of) speed is what's really killing it for me > now. How about "amb"? :-) (see "Teach Yourself Scheme in Fixnum Days", chapter 14, at http://www.cs.rice.edu/~dorai/t-y-scheme/t-y-scheme-Z-H-15.html#%_chap_14) About my last problems: The hard decision is: - Either I just stop and I'm ready already, and loops are funny. - Or I do the hidden register search, which makes things more complicated and also partially voids the pushback trick, since then I would manage all stack stuff in one frame. - Or, and that's what I will do finally: For now, I will really just correct the loops. Well, that *is* a change to Python again, but no semantic change. The internal loop counter will no longer be an integer object, but a mutable integer box. I will just create a one-element integer array and count with its zeroth element.
This is correct, since the stack value isn't popped off, so all alive stack copies share this one element. As a side effect, I save the Object/Integer conversion, so I guess it will be faster. *and* this solution does not involve any other change, since the stack layout is identical to before. -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From klm at digicool.com Wed Jul 7 17:40:15 1999 From: klm at digicool.com (Ken Manheimer) Date: Wed, 7 Jul 1999 11:40:15 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) References: <000001bec82f$bc171500$089e2299@tim> Message-ID: <00e001bec88f$02f45c80$5a57a4d8@erols.com> Hokay. I *think* i have this, and i have a question to followup. First, i think the crucial distinction i needed to make was the fact that the stuff inside the body of the call/cc is evaluated only when the call/cc is initially evaluated. What constitutes the "future" of the continuation is the context immediately following the call/cc expression. Your final example is where that's most apparent for me: Tim presented: > [...] > The continuation of (c 12) includes printing "Tim lied!", but invoking a > continuation *abandons* the current continuation in favor of the invoked > one. Printing "Tim lied!" wasn't part of c's future, so that nasty slur > about Tim never gets executed. But: > > > (define liar #f) > > (begin > (call/cc (lambda (k) > (set! liar k) > (c 12))) > (display "Tim lied!") > (newline)) > 13 > > (liar 666) > Tim lied! > > > > This is why I stick to trivial examples . Though not quite as simple, i think this nailed the distinction for me. 
(Too bad that i'm probably mistaken:-) In any case, one big unknown for me is the expense of continuations. Just how expensive is squirreling away the future, anyway? (:-) If we're deep in a call stack, seems like there can be a lot of lexical-bindings baggage, plus whatever i-don't-know-how-big-it-is control logic there is/was/will be pending. Does the size of these things (continuations) vary extremely, and is the variation anticipatable? I'm used to some surprises about the depth to which some call or other may go, i don't expect as much uncertainty about my objects - and it seems like continuations directly transform the call depth/complexity into data size/complexity... ?? unfamiliar-territory,how-far-can-i-fall?-ly, Ken klm at digicool.com From tismer at appliedbiometrics.com Wed Jul 7 18:12:22 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Wed, 07 Jul 1999 18:12:22 +0200 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) References: <000001bec82f$bc171500$089e2299@tim> <00e001bec88f$02f45c80$5a57a4d8@erols.com> Message-ID: <37837C66.1E5C33B9@appliedbiometrics.com> Ken Manheimer wrote: > > Hokay. I *think* i have this, and i have a question to followup. ... > In any case, one big unknown for me is the expense of continuations. Just > how expensive is squirreling away the future, anyway? (:-) The future costs at most to create *one* extra frame with a copy of the original frame's local stack. By keeping the references to all the other frames which were intact, the real cost is of course bigger, since we keep the whole frame path from this one up to the topmost frame alive. As soon as we drop the handle, everything winds up and vanishes. I also changed the frame refcounting to guarantee exactly that behavior. (before, unwinding was explicitly done). 
> If we're deep in a call stack, seems like there can be a lot of > lexical-bindings baggage, plus whatever i-don't-know-how-big-it-is control > logic there is/was/will be pending. Does the size of these things > (continuations) vary extremely, and is the variation anticipatable? I'm > used to some surprises about the depth to which some call or other may go, i > don't expect as much uncertainty about my objects - and it seems like > continuations directly transform the call depth/complexity into data > size/complexity... ?? Really, no concern necessary. The state is not saved at all (despite one frame), it is just not dropped. :-) Example: You have some application running, in a nesting level of, say, four function calls. This makes four frames. The bottom function now decides to spawn 10 coroutines in a loop and puts them into an array. Your array now holds 10 continuations, where each one is just one frame, which points back to your frame. Now assume, you are running one of the coroutines/generators/whatever, and this one calls another function "bottom", just to have some scenario. Looking from "bottom", there is just a usual frame chain, now 4+1 frames long. To shorten this: The whole story is nothing more than a tree, where exactly one leaf is active at any time, and its view of the call chain is always linear. Continuation jumps are possible to every other frame in the tree. It now only depends of keeping references to the leaf which you just left or not. If the jump removes the left reference to your current frame, then the according chain will ripple away up to the next branch point. If you held a reference, as you will do with a coroutine to resume it, this chain stays as a possible jump target. for-me-it's-a-little-like-Tarzan-ly y'rs - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! 
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From klm at digicool.com Wed Jul 7 20:00:56 1999 From: klm at digicool.com (Ken Manheimer) Date: Wed, 7 Jul 1999 14:00:56 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) Message-ID: <613145F79272D211914B0020AFF640191D1BF2@gandalf.digicool.com> Christian wrote: > Really, no concern necessary. The state is not saved at all > (despite one frame), it is just not dropped. :-) I have to say, that's not completely reassuring.-) While little or nothing additional is created, stuff that normally would be quite transient remains around. > To shorten this: The whole story is nothing more than a tree, > where exactly one leaf is active at any time, and its view > of the call chain is always linear. That's wonderful - i particularly like that multiple continuations from the same frame only amount to a single retention of the stack for that frame. My concern is not alleviated, however. My concern is the potential, but often-realized hairiness of computation trees. Eg, looped calls to a function amount to nodes with myriad branches - one for each iteration - and each branch can be an arbitrary computation. If there were a continuation retained each time around the loop, worse, somewhere down the call stack within the loop, you could quickly amass a lot of stuff that would otherwise be reaped immediately. So it seems like use of continuations *can* be surprisingly expensive, with the expense commensurate with, and as hard (or easy) to predict as the call dynamics of the call tree. (Boy, i can see how continuations would be useful for backtracking-style chess algorithms and such. 
Of course, discretion about what parts of the computation are retained at each branch would probably be an important economy for large computations, while stashing the continuation retains everything...) (It's quite possible that i'm missing something - i hope i'm not being thick headed.) Note that i do not raise this to argue against continuations. In fact, they seem to me to be at least the right conceptual foundation for these advanced control structures (i happen to "like" stream abstractions, which i gather is what generators are). It just seems like it may be a concern, something about which people experienced with continuations (eg, the scheme community) would have some lore - accumulated wisdom... ken klm at digicool.com From da at ski.org Thu Jul 8 00:37:09 1999 From: da at ski.org (David Ascher) Date: Wed, 7 Jul 1999 15:37:09 -0700 (Pacific Daylight Time) Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <1281421591-30373695@hypernet.com> Message-ID: [Tim] > Threads can be very useful purely as a means for algorithm > structuring, due to independent control flows. FWIW, I've been following the coroutine/continuation/generator bit with 'academic' interest -- the CS part of my brain likes to read about them. Prompted by Tim's latest mention of Demo/threads/Generator.py, I looked at it (again?) and *immediately* grokked it and realized how it'd fit into a tool I'm writing. Nothing to do with concurrency, I/O, etc -- just compartmentalization of stateful iterative processes (details too baroque to go over). More relevantly, that tool would be useful on thread-less Pythons (well, when it reaches usefulness on threaded Pythons =). Consider me pro-generator, and still agnostic on the co* things.
--david

From guido at CNRI.Reston.VA.US Thu Jul 8 07:08:44 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 08 Jul 1999 01:08:44 -0400
Subject: [Python-Dev] Generator details
In-Reply-To: Your message of "Tue, 06 Jul 1999 21:52:15 EDT." <000301bec81b$56e87660$c99e2299@tim>
References: <000301bec81b$56e87660$c99e2299@tim>
Message-ID: <199907080508.BAA00623@eric.cnri.reston.va.us>

I have a few questions/suggestions about generators.

Tim writes that a suspended generator has exactly one stack frame. I'm not sure I like that. The Demo/threads/Generator.py version has no such restriction; anything that has a reference to the generator can put() the next value. Is the restriction really necessary? I can see a good use for a recursive generator, e.g. one that generates a tree traversal:

    def inorder(node):
        if node.left: inorder(node.left)
        suspend node
        if node.right: inorder(node.right)

If I understand Tim, this could not work because there's more than one stack frame involved. On the other hand, he seems to suggest that something like this *is* allowed when using "modern" coroutines. Am I missing something? I thought that tree traversal was one of Tim's first examples of generators; would I really have to use an explicit stack to create the traversal?

Next, I want more clarity about the initialization and termination conditions. The Demo/threads/Generator.py version is very explicit about initialization: you instantiate the Generator class, passing it a function that takes a Generator instance as an argument; the function executes in a new thread. (I guess I would've used a different interface now -- perhaps inheriting from the Generator class overriding a run() method.) For termination, the normal way to stop seems to be for the generator function to return (rather than calling g.put()), the consumer then gets an EOFError exception the next time it calls g.get(). There's also a way for either side to call g.kill() to stop the generator prematurely.
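For readers who have not looked at that demo, here is a minimal sketch of a thread-based generator with the put()/get()/EOFError behaviour described above, written in present-day Python. It is not the actual Demo/threads/Generator.py source: the class name ThreadGenerator, the func(g, *args) calling convention and the one-slot queue are assumptions made for illustration, and kill() is left out.

```python
import queue
import threading


class ThreadGenerator:
    """Hypothetical stand-in for the thread-based Generator class."""

    _DONE = object()  # private sentinel: the producer function has returned

    def __init__(self, func, *args):
        # One-slot queue: the producer may run at most one value ahead.
        self._values = queue.Queue(maxsize=1)
        self._thread = threading.Thread(
            target=self._run, args=(func, args), daemon=True
        )
        self._thread.start()

    def _run(self, func, args):
        try:
            func(self, *args)  # the producer receives the generator object
        finally:
            self._values.put(self._DONE)

    def put(self, value):
        """Called by the producer to hand over the next value."""
        self._values.put(value)

    def get(self):
        """Called by the consumer; raises EOFError once the producer is done."""
        value = self._values.get()
        if value is self._DONE:
            self._values.put(self._DONE)  # keep signalling the end
            raise EOFError("generator finished")
        return value


def reverse(g, seq):
    # Producer: returning from this function ends the generator.
    for item in reversed(seq):
        g.put(item)


g = ThreadGenerator(reverse, "abc")
result = []
while True:
    try:
        result.append(g.get())
    except EOFError:
        break
print(result)
```

Every get() here pays for a thread switch and a lock, which is the overhead the thread complains about.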
Let me try to translate that to a threadless implementation. We could declare a simple generator as follows:

    generator reverse(seq):
        i = len(seq)
        while i > 0:
            i = i-1
            suspend seq[i]

This could be translated by the Python translator into the following, assuming a system class generator which provides the machinery for generators:

    class reverse(generator):
        def run(self, seq):
            i = len(seq)
            while i > 0:
                i = i-1
                self.suspend(seq[i])

(Perhaps the identifiers generator, run and suspend would be spelled with __...__, but that's just clutter for now.)

Now where Tim was writing examples like this:

    for c in reverse("Hello world"):
        print c,
    print

I'd like to guess what the underlying machinery would look like. For argument's sake, let's assume the for loop recognizes that it's using a generator (or better, it always needs a generator, and when it's not a generator it silently implies a sequence-iterating generator). So the translator could generate the following:

    g = reverse("Hello world") # instantiate class reverse
    while 1:
        try:
            c = g.resume()
        except EOGError: # End Of Generator
            break
        print c,
    print

(Where g should really be a unique temporary local variable.)

In this model, the g.resume() and g.suspend() calls have all the magic. They should not be accessible to the user. They are written in C so they can play games with frame objects. I guess that the *first* call to g.resume(), for a particular generator instance, should start the generator's run() method; run() is not activated by the instantiation of the generator. Then run() runs until the first suspend() call, which causes the return from the resume() call to happen. Subsequent resume() calls know that there already is a frame (it's stored in the generator instance) and simply continue its execution where it was. If the run() method returns from the frame, the resume() call is made to raise EOGError (blah, bogus name) which signals the end of the loop.
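The resume()/suspend() protocol described above can be approximated in present-day Python. This is only a sketch under stated assumptions: it leans on the generator functions the language later acquired, so a plain `yield` inside run() stands in for the proposed self.suspend(...), and the capitalised class names Generator and Reverse are illustrative. EOGError keeps the placeholder name from the message.

```python
class EOGError(Exception):
    """End Of Generator (placeholder name taken from the message)."""


class Generator:
    """Illustrative base class: resume() drives run() one step at a time."""

    def __init__(self, *args):
        self._args = args
        self._frame = None       # suspended run() state, created lazily
        self._finished = False

    def run(self, *args):
        raise NotImplementedError

    def resume(self):
        if self._finished:
            raise EOGError
        if self._frame is None:
            # First resume(): only now is run() started, not at instantiation.
            self._frame = self.run(*self._args)
        try:
            # Continue run() until its next suspension point.
            return next(self._frame)
        except StopIteration:
            # run() returned from its frame: signal the end of the loop.
            self._finished = True
            raise EOGError from None


class Reverse(Generator):
    def run(self, seq):
        i = len(seq)
        while i > 0:
            i = i - 1
            yield seq[i]         # stands in for self.suspend(seq[i])


g = Reverse("Hello")             # run() has not started yet
out = []
while True:
    try:
        c = g.resume()
    except EOGError:
        break
    out.append(c)
print("".join(out))
```

The explicit while/try loop is the same shape as the code the translator is imagined to emit for the for loop.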
(The user may write this code explicitly if they want to consume the generated elements in a different way than through a for loop.)

Looking at this machinery, I think the recursive generator that I wanted could be made to work, by explicitly declaring a generator subclass (instead of using the generator keyword, which is just syntactic sugar) and making calls to methods of self, e.g.:

    class inorder(generator):
        def run(self, node):
            if node.left: self.run(node.left)
            self.suspend(node)
            if node.right: self.run(node.right)

The generator machinery would (ab)use the fact that Python frames don't necessarily have to be linked in a strict stack order; the generator gets a pointer to the frame to resume from resume(), and there's a "bottom" frame which, when hit, raises the EOGError exception. All currently active frames belonging to the generator stay alive while another resume() is possible.

All this is possible by the introduction of an explicit generator object. I think Tim had an implementation in mind where the standard return pointer in the frame is the only thing necessary; actually, I think the return pointer is stored in the calling frame, not in the called frame (Christian? Is this so in your version?). That shouldn't make a difference, except that it's not clear to me how to reference the frame (in the explicitly coded version, which has to exist at least at the bytecode level).

With classic coroutines, I believe that there's no difference between the first call and subsequent calls to the coroutine. This works in the Knuth world where coroutines and recursion don't go together; but at least for generators I would hope that it's possible for multiple instances of the same generator to be active simultaneously (e.g. I could be reversing over a list of files and then reverse each of the lines in the file; this uses separate instances of the reverse() generator). So we need a way to reference the generator instance separately from the generator constructor.
The machinery I sketched above solves this.

After Tim has refined or rebutted this, I think I'll be able to suggest what to do for coroutines.

(I'm still baffled by continuations. The question whether the saved and restored for loop should find itself in the 1st or 5th iteration surprises me. Doesn't this cleanly map into some Scheme code that tells us what to do? Or is it unclear because Scheme does all loops through recursion? I presume that if you save the continuation of the 1st iteration and restore it in the 5th, you'd find yourself back in the 1st iteration? But this is another thread.)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From tim_one at email.msn.com Thu Jul 8 07:59:24 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 8 Jul 1999 01:59:24 -0400
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
In-Reply-To: <613145F79272D211914B0020AFF640191D1BF2@gandalf.digicool.com>
Message-ID: <002001bec907$07934a80$1d9e2299@tim>

[Christian]
> Really, no concern necessary. The state is not saved at all
> (despite one frame), it is just not dropped. :-)

[Ken]
> I have to say, that's not completely reassuring.-) While little or
> nothing additional is created, stuff that normally would be quite
> transient remains around.

I don't think this is any different from how keeping a reference to a class instance alive keeps all the attributes of that object alive, and all the objects reachable from them too, even though you may never again actually reference any of them. If you save a continuation, the implementation *has* to support your doing anything that's *possible* to do from the saved control-flow state -- and if that's a whole big giant gob o' stuff, that's on you.

> ...
> So it seems like use of continuations *can* be surprisingly expensive,
> with the expense commensurate with, and as hard (or easy) to predict as
> the call dynamics of the call tree.
> > (Boy, i can see how continuations would be useful for backtracking-style > chess algorithms and such. It comes with the territory, though: backtracking searches are *inherently* expensive and notoriously hard to predict, whether you implement them via continuations, or via clever hand-coded assembler using explicit stacks. The number of nodes at a given depth is typically exponential in the depth, and that kills every approach at shallow levels. Christian posted a reference to an implementation of "amb" in Scheme using continuations, and that's a very cute function: given a list of choices, "amb" guarantees to return (if any such exists) that particular list element that allows the rest of the program to "succeed". So if indeed chess is a forced win for white, amb(["P->KR3", "P->KR4", ...]) as the first line of your chess program will return "the" winning move! Works great in theory . > Of course, discretion about what parts of the computation is retained > at each branch would probably be an important economy for large > computations, while stashing the continuation retains everything...) You bet. But if you're not mucking with exponential call trees-- and, believe me, you're usually not --it's not a big deal. > Note that i do not raise this to argue against continuations. In fact, > they seem to me to be at least the right conceptual foundation for these > advanced control structures (i happen to "like" stream abstractions, > which i gather is what generators are). Generators are an "imperative" flavor of stream, yes, potentially useful whenever you have an abstraction that can deliver a sequence of results (from all the lines in a file, to all the digits of pi). A very common occurrence! Heck, without it, Python's "for x in s:" wouldn't be any fun at all . 
how-do-i-love-thee?-let-me-generate-the-ways-ly y'rs - tim From tim_one at email.msn.com Thu Jul 8 07:59:15 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 8 Jul 1999 01:59:15 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <00e001bec88f$02f45c80$5a57a4d8@erols.com> Message-ID: <001d01bec907$024e6a00$1d9e2299@tim> [Ken Manheimer] > First, i think the crucial distinction i needed to make was the fact that > the stuff inside the body of the call/cc is evaluated only when > the call/cc is initially evaluated. What constitutes the "future" of the > continuation is the context immediately following the call/cc expression. Right! call/cc is short for call-with-current-continuation, and "current" refers to the continuation of call/cc itself. call/cc takes a function as an argument, and passes to it its (call/cc's) *own* continuation. This is maximally clever and maximally confusing at first. Christian has a less clever way of spelling it that's likely to be less confusing too. Note that it has to be a *little* tricky, because the obvious API k = gimme_a_continuation_for_here() doesn't work. The future of "gimme_a_..." includes binding k to the result, so you could never invoke the continuation without stomping on k's binding. k = gimme_a_continuation_for_n_bytecodes_beyond_here(n) could work, but is a bit hard to explain coherently . > ... > In any case, one big unknown for me is the expense of continuations. > Just how expensive is squirreling away the future, anyway? (:-) Christian gave a straight answer, so I'll give you the truth : it doesn't matter provided that you don't pay the price if you don't use it. A more interesting question is how much everyone will pay all the time to support the possibility even if they don't use it. But that question is premature since Chris isn't yet aiming to optimize. Even so, the answer so far appears to be "> 0 but not much". 
in-bang-for-the-buck-continuations-are-cheap-ly y'rs - tim From tim_one at email.msn.com Thu Jul 8 07:59:18 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 8 Jul 1999 01:59:18 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <37835210.F22A7EC9@appliedbiometrics.com> Message-ID: <001e01bec907$03f9a900$1d9e2299@tim> >> Again, though, you never want to muck with continuations directly! >> They're too wild. You get an expert to use them in the bowels of an >> implementation of something else. [Christian] > Maybe with one exception: With careful coding, you can use > a continuation at the head of a very deep recursion and use > it as an early break if the algorithm fails. The effect is > the same as bailing out with an exception, despite the fact > that no "finally" causes would be obeyed. It is just a > incredibly fast jump out of something if you know what > you are doing. You don't need continuations for this, though; e.g., in Icon I've done this often, by making the head of the deep recursion a co-expression, doing the recursion via straight calls, and then doing a coroutine resumption of &main when I want to break out. At that point I set the coexp to &null, and GC reclaims the stack frames (the coexp is no longer reachable from outside) when it feels like it . This is a particularly simple application of coroutines that could be packaged up in a simpler way for its own sake; so, again, while continuations may be used fruitfully under the covers here, there's still no reason to make a poor end user wrestle with them. > ... Well, I admit that the continuation approach is slightly too much > for the coroutine/generator case, It's good that you admit that, because generators alone could have been implemented with a 20-line patch . BTW, I expect that by far the bulk of your changes *still* amount to what's needed for disentangling the C stack, right? 
The continuation implementation has been subtle, but so far I've gotten the impression that it requires little code beyond that required for stacklessness. > ... > How about "amb"? :-) > (see "teach youself schem in fixnum days, chapter 14 at > http://www.cs.rice.edu/~dorai/t-y-scheme/t-y-scheme-Z-H-15.html#%_chap_14) That's the point at which I think continuations get insane: it's an unreasonably convoluted implementation of a straightforward (via other means) backtracking framework. In a similar vein, I've read 100 times that continuations can be used to implement a notion of (fake) threads, but haven't actually seen an implementation that wasn't depressingly subtle & long-winded despite being just a feeble "proof of concept". These have the *feeling* of e.g. implementing generators on top of real threads: ya, you can do it, but nobody in their right mind is fooled by it . > About my last problems: > The hard decision is: > - Either I just stop and I'm ready already, and loops are funny. OK by me -- forgetting implementation, I still can't claim to know what's the best semantic here. > - Or I do the hidden register search, which makes things more > complicated and also voidens the pushback trick partially, > since then I would manage all stack stuff in one frame. Bleech. > - Or, and that's what I will do finally: > For now, I will really just correct the loops. > > Well, that *is* a change to Python again, but no semantic change. > The internal loop counter will no longer be an integer object, > but a mutable integer box. I will just create a one-element > integer array and count with its zero element. > This is correct, since the stack value isn't popped off, > so all alive stack copies share this one element. Ah, very clever! Yes, that will fly -- the continuations will share a reference to the value rather than the value itself. Perfect! > As a side effect, I save the Object/Integer conversion, so > I guess it will be faster. 
*and* this solution does not involve > any other change, since the stack layout is identical to before. Right, no downside at all. Except that Guido will hate it . there's-a-disturbance-in-the-force-ly y'rs - tim From tim_one at email.msn.com Thu Jul 8 08:45:51 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 8 Jul 1999 02:45:51 -0400 Subject: [Python-Dev] Generator details In-Reply-To: <199907080508.BAA00623@eric.cnri.reston.va.us> Message-ID: <002101bec90d$851a3e40$1d9e2299@tim> I'm out of time for tonight so will just address the first one: [Guido van Rossum] > I have a few questions/suggestions about generators. > > Tim writes that a suspended generator has exactly one stack frame. > I'm not sure I like that. The Demo/thread/Generator.py version has no > such restriction; anything that has a reference to the generator can > put() the next value. Is the restriction really necessary? It can simplify the implementation, and (not coincidentally ) the user's mental model of how they work. > I can see a good use for a recursive generator, e.g. one that generates > a tree traversal: Definitely; in fact, recursive generators are particularly useful in both traversals and enumeration of combinatorial objects (permutations, subsets, and so on). > def inorder(node): > if node.left: inorder(node.left) > suspend node > if node.right: inorder(node.right) > > If I understand Tim, this could not work because there's more than one > stack frame involved. That's right. It would be written like this instead: def inorder(node): if node.left: suspend inorder(node.left) suspend node if node.right: suspend inorder(node.right) Now there may be many instances of the "inorder" generator active (as many as the tree is deep), but each one returns directly to its caller, and all but the bottom-most one is "the caller" wrt the generator it invokes. This implies that "suspend expr" treats expr like a generator in much the same way that "for x in expr" does (or may ...). 
I realize there's some muddiness in that. > On the other hand, he seems to suggest that something like this *is* > allowed when using "modern" coroutines. Yes, and then your original version can be made to work, delivering its results directly to the ultimate consumer instead of (in effect) crawling up the stack each time there's a result. > Am I missing something? Only that I've been pushing generators for almost a decade, and have always pushed the simplest possible version that's sufficient for my needs. However, every time I've made a micron's progress in selling this notion, it's been hijacked by someone else pushing continuations. So I keep pushing the simplest possible version of generators ("resumable function"), in the hopes that someday somebody will remember they don't need to turn Python inside out to get just that much . [much worth discussion skipped for now] > ... > (I'm still baffled by continuations. Actually not, I think! > The question whether the for saved and restored loop should find itself > in the 1st or 5th iteration surprises me. Doesn't this cleanly map into > some Scheme code that tells us what to do? Or is it unclear because > Scheme does all loops through recursion? Bingo: Scheme has no loops. I can model Python's "for" in Scheme in such a way that the continuation sees the 1st iteration, or the 5th, but neither way is obviously right -- or wrong (they both reproduce Python's behavior in the *absence* of continuations!). > I presume that if you save the continuation of the 1st iteration and > restore it in the 5th, you'd find yourself in the back 1st iteration? > But this is another thread.) The short course here is just that any way I've tried to model Python's "for" in *Python* shares the property of the "while 1:" way I posted: the continuation sees the 5th iteration. And some hours I think it probably should , since the bindings of all the locals it sees will be consistent with the 5th iteration's values but not the 1st's. 
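The recursive one-frame traversal from earlier in this message can be tried out in present-day Python. This is only a sketch under stated assumptions: `yield` stands in for the proposed `suspend`, `yield from` is assumed to be the closest match for "suspend expr" when expr is itself a generator, and the `Node` class and `values_in_order` helper are hypothetical names invented for the illustration.

```python
class Node:
    """Hypothetical binary-tree node, used only for this illustration."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def inorder(node):
    # One generator instance is created per node visited.  Each instance
    # hands its results to its immediate caller, which passes them on.
    if node.left:
        yield from inorder(node.left)    # "suspend inorder(node.left)"
    yield node                           # "suspend node"
    if node.right:
        yield from inorder(node.right)   # "suspend inorder(node.right)"


def values_in_order(root):
    """Collect the node values produced by the traversal."""
    return [n.value for n in inorder(root)]
```

With as many generators active as the tree is deep, every value still travels back one caller at a time, which is the "crawling up the stack" behaviour described above.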
could-live-with-it-either-way-but-"correct"-is-debatable-ly y'rs - tim From tismer at appliedbiometrics.com Thu Jul 8 16:23:11 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Thu, 08 Jul 1999 16:23:11 +0200 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) References: <001e01bec907$03f9a900$1d9e2299@tim> Message-ID: <3784B44F.C2F76E8A@appliedbiometrics.com> Tim Peters wrote: ... > This is a particularly simple application of coroutines that could be > packaged up in a simpler way for its own sake; so, again, while > continuations may be used fruitfully under the covers here, there's still no > reason to make a poor end user wrestle with them. Well. def longcomputation(prog, *args, **kw): return quickreturn(prog, args, kw) # prog must be something with return function first arg # quickreturn could be done as so: def quickreturn(prog, args, kw): cont = getpcc() # get parent's continuation def jumpback(val=None, cont=cont): putcc(cont, val) # jump to continuation apply(prog, jumpback, args, kw) # and if they want to jump out, they call jumpback with # an optional return value. Can't help it, it still is continuation-ish. > > ... Well, I admit that the continuation approach is slightly too much > > for the coroutine/generator case, > > It's good that you admit that, because generators alone could have been > implemented with a 20-line patch . BTW, I expect that by far the bulk > of your changes *still* amount to what's needed for disentangling the C > stack, right? The continuation implementation has been subtle, but so far > I've gotten the impression that it requires little code beyond that required > for stacklessness. Right. You will see soon. The only bit which cont's need more than coro's is to save more than one stack state for a frame. So, basically, it is just the frame copy operation. 
If I was heading just for coroutines, then I could save that, but then I need to handle special cases like exception, what to do on return, and so on. Easier to do that one stuff once right. Then I will never dump code for an unforeseen coro-effect, since with cont's, I *may* jump in and bail out wherever I want or don't want. The special cases come later and will be optimized, and naturally they will reduce themselves to what's needed. Example: If I just want to switch to a different coro, I just have to swap two frames. This leads to a data structure which can hold a frame and exchange it with another one. The cont-implementation does something like

    fetch my current continuation  # and this does the frame copy stuff
    save into local state variable
    fetch cont from other coro's local state variable
    jump to new cont

Now, if the source and target frames are guaranteed to be different, and if the source frame has no dormant extra cont attached, then it is safe to merge the above steps into one operation, without the need to save local state. In the end, two coro's will jump to each other by doing nothing more than this. Exactly that is what Sam's prototype does right now. What he's missing is treatment of the return case. If a coro returns towards the place where it was forked off, then we want to have a cont which is able to handle it properly. That's why exceptions work fine with my stuff: You can put one exception handler on top of all your coroutines which you create. It works without special knowledge of coroutines. After I realized that, I knew the way to go. > > > ... > > How about "amb"? :-) > > (see "teach youself schem in fixnum days, chapter 14 at > > http://www.cs.rice.edu/~dorai/t-y-scheme/t-y-scheme-Z-H-15.html#%_chap_14) > > That's the point at which I think continuations get insane: it's an > unreasonably convoluted implementation of a straightforward (via other > means) backtracking framework.
In a similar vein, I've read 100 times that > continuations can be used to implement a notion of (fake) threads, but > haven't actually seen an implementation that wasn't depressingly subtle & > long-winded despite being just a feeble "proof of concept". Maybe this is a convoluted implementation. But the principle? Return a value to your caller, but stay able to continue and do this again. Two continuations, and with the optimizations from above, it will be nothing. I will show you the code in a few, and you will realize that we are discussing the empty set. The frames have to be used, and the frames are already continuations. Only if they can be reached twice, they will have to be armed for that. Moving back to my new "more code - less words" principle. [mutable ints as loop counters] > Ah, very clever! Yes, that will fly -- the continuations will share a > reference to the value rather than the value itself. Perfect! Actually I'm copying some code out of Marc's counterobject which is nothing more than a mutable integer and hide it in ceval.c, since that doesn't introduce another module for a thing which isn't needed elsewhere, after Guido's hint. Better than to use the array module which doesn't publish its internals and might not always be linked in. > Right, no downside at all. Except that Guido will hate it . I made sure that this is what he hates the least. off-for-coding-ly y'rs - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break!
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From tim_one at email.msn.com Fri Jul 9 09:47:36 1999 From: tim_one at email.msn.com (Tim Peters) Date: Fri, 9 Jul 1999 03:47:36 -0400 Subject: [Python-Dev] Generator details In-Reply-To: <199907080508.BAA00623@eric.cnri.reston.va.us> Message-ID: <000c01bec9df$4f935c20$c49e2299@tim> Picking up where we left off, I like Guido's vision of generators fine. The "one frame" version I've described is in fact what Icon provides, and what Guido is doing requires using coroutines instead in that language. Guido's is more flexible, and I'm not opposed to that . OTOH, I *have* seen many a person (including me!) confused by the semantics of coroutines in Icon, so I don't know how much of the additional flexibility converts into additional confusion. One thing I am sure of: having debated the fine points of continuations recently, I'm incapable of judging it harshly today <0.5 wink>. > ... > def inorder(node): > if node.left: inorder(node.left) > suspend node > if node.right: inorder(node.right) The first thing that struck me there is that I'm not sure to whom the suspend transfers control. In the one-frame flavor of generator, it's always to the caller of the function that (lexically) contains the "suspend". Is it possible to keep this all straight if the "suspend" above is changed to e.g. pass_it_back(node) where def pass_it_back(x): suspend x ? I'm vaguely picturing some kind of additional frame state, a pointer to the topmost frame that's "expecting" to receive a suspend. (I see you resolve this in a different way later, though.) > ... > I thought that tree traversal was one of Tim's first examples of > generators; would I really have to use an explicit stack to create > the traversal? 
As before, still no , but the one-frame version does require an unbroken *implicit* chain back to the intended receiver, with an explicit "suspend" at every step back to that. Let me rewrite the one-frame version in a way that assumes less semantics from "suspend", instead building on the already-assumed new smarts in "for": def inorder(node): if node: for child in inorder(node.left): suspend child suspend node for child in inorder(node.right): suspend child I hope this makes it clearer that the one-frame version spawns two *new* generators for every non-None node, and in purely stack-like fashion (both "recursing down" and "suspending up"). > Next, I want more clarity about the initialization and termination > conditions. Good idea. > The Demo/thread/Generator.py version is very explicit about > initialization: you instantiate the Generator class, passing it a > function that takes a Generator instance as an argument; the function > executes in a new thread. (I guess I would've used a different > interface now -- perhaps inheriting from the Generator class > overriding a run() method.) I would change my coroutine implementation similarly. > For termination, the normal way to stop seems to be for the generator > function to return (rather than calling g.put()), the consumer then gets > an EOFError exception the next time it calls g.get(). There's also a > way for either side to call g.kill() to stop the generator prematurely. A perfectly serviceable interface, but "feels clumsy" in comparison to normal for loops and e.g. reading lines from a file, where *visible* exceptions aren't raised at the end. I expect most sequences to terminate before I do , so (visible) try/except isn't the best UI here. > Let me try to translate that to a threadless implementation. 
We could > declare a simple generator as follows: > > generator reverse(seq): > i = len(seq) > while i > 0: > i = i-1 > suspend seq[i] > > This could be translated by the Python translator into the following, > assuming a system class generator which provides the machinery for > generators: > > class reverse(generator): > def run(self, seq): > i = len(seq) > while i > 0: > i = i-1 > self.suspend(seq[i]) > > (Perhaps the identifiers generator, run and suspend would be spelled > with __...__, but that's just clutter for now.) > > Now where Tim was writing examples like this: > > for c in reverse("Hello world"): > print c, > print > > I'd like to guess what the underlying machinery would look like. For > argument's sake, let's assume the for loop recognizes that it's using > a generator (or better, it always needs a generator, and when it's not > a generator it silently implies a sequence-iterating generator). In the end I expect these concepts could be unified, e.g. via a new class __iterate__ method. Then for i in 42: could fail simply because ints don't have a value in that slot, while lists and tuples could inherit from SequenceIterator, pushing the generation of the index range into the type instead of explicitly constructed by the eval loop. > So the translator could generate the following: > > g = reverse("Hello world") # instantiate class reverse > while 1: > try: > c = g.resume() > except EOGError: # End Of Generator > break > print c, > print > > (Where g should really be a unique temporary local variable.) > > In this model, the g.resume() and g.suspend() calls have all the magic. > They should not be accessible to the user. This seems at odds with the later: > (The user may write this code explicitly if they want to consume the > generated elements in a different way than through a for loop.) Whether it's at odds or not, I like the latter better. When the machinery is clean & well-designed, expose it! 
Else in 2002 we'll be subjected to a generatorhacks module . > They are written in C so they can play games with frame objects. > > I guess that the *first* call to g.resume(), for a particular > generator instance, should start the generator's run() method; run() > is not activated by the instantiation of the generator. This can work either way. If it's more convenient to begin run() as part of instantiation, the code for run() can start with an equivalent of if self.first_time: self.first_time = 0 return where self.first_time is set true by the constructor. Then "the frame" will exist from the start. The first resume() will skip over that block and launch into the code, while subsequent resume()s will never even see this block: almost free. > Then run() runs until the first suspend() call, which causes the return > from the resume() call to happen. Subsequent resume() calls know that > there's already is a frame (it's stored in the generator instance) and simply > continue its execution where it was. If the run() method returns from > the frame, the resume() call is made to raise EOGError (blah, bogus > name) which signals the end of the loop. (The user may write this > code explicitly if they want to consume the generated elements in a > different way than through a for loop.) Yes, that parenthetical comment bears repeating . > Looking at this machinery, I think the recursive generator that I > wanted could be made to work, by explicitly declaring a generator > subclass (instead of using the generator keyword, which is just > syntactic sugar) and making calls to methods of self, e.g.: > > class inorder(generator): > def run(self, node): > if node.left: self.run(node.left) > self.suspend(node) > if node.right: self.run(node.right) Going way back to the top, this implies the def pass_it_back(x): suspend x indirection couldn't work -- unless pass_it_back were also a method of inorder. Not complaining, just trying to understand. 
Once you generalize, it's hard to know when to stop. > The generator machinery would (ab)use the fact that Python frames > don't necessarily have to be linked in a strict stack order; If you call *this* abuse, what words remain to vilify what Christian is doing ? > the generator gets a pointer to the frame to resume from resume(), Ah! That addresses my first question. Are you implicitly assuming a "stackless" eval loop here? Else resuming the receiving frame would appear to push another C stack frame for each value delivered, ever deeper. The "one frame" version of generators doesn't have this headache (since a suspend *returns* to its immediate caller there -- it doesn't *resume* its caller). > and there's a "bottom" frame which, when hit, raises the EOGError > exception. Although described at the end, this is something set up at the start, right? To trap a plain return from the topmost invocation of the generator. > All currently active frames belonging to the generator stay alive > while another resume() is possible. And those form a linear chain from the most-recent suspend() back to the primal resume(). Which appears to address an earlier issue not brought up in this message: this provides a well-defined & intuitively clear path for exceptions to follow, yes? I'm not sure about coroutines, but there's something wrong with a generator implementation if the guy who kicks it off can't see errors raised by the generator's execution! This doesn't appear to be a problem here. > All this is possible by the introduction of an explicit generator > object. I think Tim had an implementation in mind where the standard > return pointer in the frame is the only thing necessary; actually, I > think the return pointer is stored in the calling frame, not in the > called frame What I've had in mind is what Majewski implemented 5 years ago, but lost interest in because it couldn't be extended to those blasted continuations .
The called frame points back to the calling frame via f->f_back (of course), and I think that's all the return info the one-frame version needs. I expect I'm missing your meaning here. > (Christian? Is this so in your version?). That shouldn't make a > difference, except that it's not clear to me how to reference the frame > (in the explicitly coded version, which has to exist at least at the > bytecode level). "The" frame being which frame specifically, and referenced from where? Regardless, it must be solvable, since if Christian can (& he thinks he can, & I believe him ) expose a call/cc variant, the generator class could be coded entirely in Python. > With classic coroutines, I believe that there's no difference between > the first call and subsequent calls to the coroutine. This works in > the Knuth world where coroutines and recursion don't go together; That's also a world where co-transfers are implemented via funky self-modifying assembler, custom-crafted for the exact number of coroutines you expect to be using -- I don't recommend Knuth as a guide to *implementing* these beasts <0.3 wink>. That said, yes, provided the coroutine objects all exist, there's nothing special about the first call. About "provided that": if your coroutine objects A and B have "run" methods, you dare not invoke A.run() before B has been constructed (else the first instance of B.transfer() in A chokes -- there's no object to transfer *to*). So, in practice, I think instantiation is still divorced from initiation. One possibility is to hide all that in a cobegin(list_of_coroutine_classes_to_instantiate_and_run) function. But then naming the instances is a puzzle. > but at least for generators I would hope that it's possible for multiple > instances of the same generator to be active simultaneously (e.g. I > could be reversing over a list of files and then reverse each of the > lines in the file; this uses separate instances of the reverse() > generator).
Since that's the trick the "one frame" generators *rely* on for recursion, it's surely not a problem in your stronger version. Note that my old coroutine implementation did allow for multiple instances of a coroutine, although the examples posted with it didn't illustrate that. The weakness of coroutines in practice is (in my experience) the requirement that you *name* the target of a transfer. This is brittle; e.g., in the pipeline example I posted, each stage had to know the names of the stages on either side of it. By adopting a target.transfer(optional_value) primitive it's possible to *pass in* the target object as an argument to the coroutine doing the transfer. Then "the names" are all in the setup, and don't pollute the bodies of the coroutines (e.g., each coroutine in the pipeline example could have arguments named "stdin" and "stdout"). I haven't seen a system that *does* this, but it's so obviously the right thing to do it's not worth saying any more about . > So we need a way to reference the generator instance separately from > the generator constructor. The machinery I sketched above solves this. > > After Tim has refined or rebutted this, I think I'll be able to > suggest what to do for coroutines. Please do. Whether or not it's futile, it's fun . hmm-haven't-had-enough-of-that-lately!-ly y'rs - tim From tismer at appliedbiometrics.com Fri Jul 9 14:22:05 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Fri, 09 Jul 1999 14:22:05 +0200 Subject: [Python-Dev] Generator details References: <000301bec81b$56e87660$c99e2299@tim> <199907080508.BAA00623@eric.cnri.reston.va.us> Message-ID: <3785E96D.A1641530@appliedbiometrics.com> Guido van Rossum wrote: [snipped all what's addressed to Tim] > All this is possible by the introduction of an explicit generator > object. 
I think Tim had an implementation in mind where the standard > return pointer in the frame is the only thing necessary; actually, I > think the return pointer is stored in the calling frame, not in the > called frame (Christian? Is this so in your version?). That > shouldn't make a difference, except that it's not clear to me how to > reference the frame (in the explicitly coded version, which has to > exist at least at the bytecode level). No, it isn't. It is still as it was. I didn't change the frame machinery at all. The callee finds its caller in its f_back field. [...] > (I'm still baffled by continuations. The question whether the for > saved and restored loop should find itself in the 1st or 5th iteration > surprises me. Doesn't this cleanly map into some Scheme code that > tells us what to do? Or is it unclear because Scheme does all loops > through recursion? I presume that if you save the continuation of the > 1st iteration and restore it in the 5th, you'd find yourself in the > back 1st iteration? But this is another thread.) In Scheme, Python's for-loop would be a tail-recursive expression, it would especially be its own extra lambda. Doesn't fit. Tim is right when he says that Python isn't Scheme. Yesterday I built your suggested change to for-loops, and it works fine. By turning the loop counter into a mutable object, every reference to it shares the current value, and it behaves like Tim pointed out it should. About Tim's reply to this post: [Gui-do] > The generator machinery would (ab)use the fact that Python frames > don't necessarily have to be linked in a strict stack order; [Tim-bot] If you call *this* abuse, what words remain to vilify what Christian is doing ? As a matter of fact, I have been thinking quite long about this *abuse*. At the moment I do not do this. The frame stack becomes a frame tree, and you can jump like Tarzan from leaf to leaf, but I never change the order.
Perhaps this can make sense too, but this is currently where *my* brain explodes. Right now I'm happy that there is *always* a view of the top level, and an exception always knows where to wind up. From that point of view, I'm even more conservative than Guido (above) and Sam (replacing whole frame chains). In a sense, since I don't change the frame chain but only change the current frame, this is like a functional way to use weak references. The continuation approach is to build new paths in a tree, and lose those which are unreachable. Modifying the tree is not part of my model at the moment. This may be interesting to study after we know everything about this tree and we need even more freedom. ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From guido at CNRI.Reston.VA.US Sat Jul 10 16:28:13 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Sat, 10 Jul 1999 10:28:13 -0400 Subject: [Python-Dev] Generator details In-Reply-To: Your message of "Fri, 09 Jul 1999 14:22:05 +0200." <3785E96D.A1641530@appliedbiometrics.com> References: <000301bec81b$56e87660$c99e2299@tim> <199907080508.BAA00623@eric.cnri.reston.va.us> <3785E96D.A1641530@appliedbiometrics.com> Message-ID: <199907101428.KAA04364@eric.cnri.reston.va.us> [Christian] > The frame stack becomes a frame tree, and you can jump like Tarzan > from leaf to leaf [...]. Christian, I want to kiss you! (OK, just a hug. We're both Europeans. :-) This one remark suddenly made me understand much better what continuations do -- it was the one missing piece of insight I still needed after Tim's explanation and skimming the Scheme tutorial a bit.
I'll have to think more about the consequences but this finally made me understand better how to interpret the mysterious words ``the continuation represents "the rest of the program"''.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at CNRI.Reston.VA.US Sat Jul 10 17:48:43 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Sat, 10 Jul 1999 11:48:43 -0400
Subject: [Python-Dev] Generator details
In-Reply-To: Your message of "Fri, 09 Jul 1999 03:47:36 EDT." <000c01bec9df$4f935c20$c49e2299@tim>
References: <000c01bec9df$4f935c20$c49e2299@tim>
Message-ID: <199907101548.LAA04399@eric.cnri.reston.va.us>

I've been thinking some more about Tim's single-frame generators, and I think I understand better how to implement them now. (And yes, it was a mistake of me to write that the suspend() and resume() methods shouldn't be accessible to the user! Also thanks for the clarification of how to write a recursive generator.)

Let's say we have a generator function like this:

    generator reverse(l):
        i = len(l)
        while i > 0:
            i = i-1
            suspend l[i]

and a for loop like this:

    for i in reverse(range(10)):
        print i

What is the expanded version of the for loop? I think this will work:

    __value, __frame = call_generator(reverse, range(10))
    while __frame:
        i = __value
        # start of original for loop body
        print i
        # end of original for loop body
        __value, __frame = resume_frame(__frame)

(Note that when the original for loop body contains 'continue', this should jump to the resume_frame() call. This is just pseudo code.)

Now we must define two new built-in functions: call_generator() and resume_frame().

- call_generator() is like apply() but it returns a pair (result, frame) where result is the function result and frame is the frame, *if* the function returned via suspend. If it returned via return, call_generator() returns None for the frame.

- resume_frame() does exactly what its name suggests. It has the same return convention as call_generator().
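The (value, frame) convention just described can be imitated in present-day Python without touching the interpreter. What follows is a rough sketch resting on several assumptions: a generator object stands in for the suspended frame, `yield` stands in for `suspend`, and `call_generator`, `resume_frame` and `expanded_for_loop` are ordinary functions written for this illustration, not builtins.

```python
def resume_frame(frame):
    """Continue a suspended "frame" (here: a generator object).

    Returns (value, frame) if it suspended again, or (value, None) if it
    finished via a plain return.
    """
    try:
        return next(frame), frame
    except StopIteration as stop:
        # A plain return ends the generator; its value rides on the exception.
        return stop.value, None


def call_generator(func, *args):
    """Start func and run it up to its first suspend; same return convention."""
    return resume_frame(func(*args))


def reverse(l):
    i = len(l)
    while i > 0:
        i = i - 1
        yield l[i]    # "suspend l[i]" in the proposed syntax


def expanded_for_loop(seq):
    """The expanded for loop from the text, collecting instead of printing."""
    seen = []
    value, frame = call_generator(reverse, seq)
    while frame:
        i = value
        seen.append(i)    # stands in for the original loop body
        value, frame = resume_frame(frame)
    return seen
```

The loop ends as soon as the frame slot comes back as None, and the value delivered alongside it is dropped.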
Note that the for loop throws away the final (non-suspend) return value of the generator -- this just signals the end of the loop.

How to translate the generator itself? I've come up with two versions.

First version: add a new bytecode SUSPEND, which does the same as RETURN but also marks the frame as resumable. call_generator() then calls the function using a primitive which allows it to specify the frame (e.g. a variant of eval_code2 taking a frame argument). When the call returns, it looks at the resumable bit of the frame to decide whether to return (value, frame) or (value, None). resume_frame() simply marks the frame as non-resumable and continues its execution; upon return it does the same thing as call_generator().

Alternative translation version: introduce a new builtin get_frame() which returns the current frame. The statement "suspend x" gets translated to "return x, get_frame()" and the statement "return x" (including the default "return None" at the end of the function) gets translated to "return x, None". So our example turns into:

    def reverse(l):
        i = len(l)
        while i > 0:
            i = i-1
            return l[i], get_frame()
        return None, None

This of course means that call_generator() can be exactly the same as apply(), and in fact we better get rid of it, so the for loop translation becomes:

    __value, __frame = reverse(range(10))
    while __frame:
        ...same as before...

In a real implementation, get_frame() could be a new bytecode; but it doesn't have to be (making for easier experimentation). (get_frame() makes a fine builtin; there's nothing inherently dangerous to it, in fact people get it all the time, currently using horrible hacks!).

I'm not sure which is better; the version without call_generator() allows you to create your own generator without using the 'generator' and 'suspend' keywords, calling get_frame() explicitly.

Loose end: what to do when there's a try/finally around a suspend? E.g.

    generator foo(l):
        try:
            for i in l:
                suspend i+1
        finally:
            print "Done"

The second translation variant would cause "Done" to be printed on each suspend *and* on the final return. This is confusing (and in fact I think resuming the frame would be a problem since the return breaks down the try-finally blocks).

So I guess the SUSPEND bytecode is a better implementation -- it can suspend the frame without going through try-finally clauses.

Then of course we create another loose end: what if the for loop contains a break? Then the frame will never be resumed and its finally clause will never be executed! This sounds bad. Perhaps the destructor of the frame should look at the 'resumable' bit and if set, resume the frame with a system exception, "Killed", indicating an abortion? (This is like the kill() call in Generator.py.) We can increase the likelihood that the frame's destructor is called at the expected time (right when the for loop terminates), by deleting __frame at the end of the loop. If the resumed frame raises another exception, we ignore it. Its return value is ignored. If it suspends itself again, we resume it with the "Killed" exception again until it dies (thoughts of the Black Knight come to mind).

I am beginning to like this idea. (Not that I have time for an implementation... But it could be done without Christian's patches.)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From tim_one at email.msn.com Sat Jul 10 23:09:48 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Sat, 10 Jul 1999 17:09:48 -0400
Subject: [Python-Dev] Generator details
In-Reply-To: <199907101428.KAA04364@eric.cnri.reston.va.us>
Message-ID: <000501becb18$8ae0e240$e69e2299@tim>

[Christian]
> The frame stack becomes a frame tree, and you can jump like Tarzan
> from leaf to leaf [...].

[Guido]
> Christian, I want to kiss you! (OK, just a hug. We're both
> Europeans.
> :-)

Not in America, pal -- the only male hugging allowed here is in the two seconds after your team wins the Superbowl -- and even then only so long as you haven't yet taken off your helmets.

> This one remark suddenly made me understand much better what
> continuations do -- it was the one missing piece of insight I still
> needed after Tim's explanation and skimming the Scheme tutorial a bit.

It's an insight I was missing too -- continuations are often *invoked* in general directed-graph fashion, and before Christian said that I hadn't realized the *implementation* never sees anything worse than a tree.

So next time I see Christian, I'll punch him hard in the stomach, and mumble "good job" loudly enough so that he hears it, but indistinctly enough so I can plausibly deny it in case any other guy overhears us. *That's* the American Way .

first-it's-hugging-then-it's-song-contests-ly y'rs - tim

From MHammond at skippinet.com.au Sun Jul 11 02:52:22 1999
From: MHammond at skippinet.com.au (Mark Hammond)
Date: Sun, 11 Jul 1999 10:52:22 +1000
Subject: [Python-Dev] Win32 Extensions Registered Users
Message-ID: <003c01becb37$a369c3d0$0801a8c0@bobcat>

Hi all,

As you may or may not have noticed, I have recently begun offering a "Registered Users" program where people who use my Windows extensions can pay $50.00 per 2 years, and get a range of benefits. The primary benefits are:

* Early access to binary versions.
* Registered Users only mailing list (very low volume to date)
* Better support from me.

The last benefit really isn't relevant to this list - anyone here will obviously get (and hopefully does get) a pretty good response should they need to mail me.

The early access to binary versions may be of interest. As everyone on this list spends considerable and worthwhile effort helping Python, I would like to offer everyone here a free registration.

If you would like to take advantage, just send me a quick email.
I will email you the "top secret" location of the Registered Users page (where the very slick and very new Pythonwin can be found). Also, feel free to join the registered users mailing list at http://mailman.pythonpros.com/mailman/listinfo/win32-reg-users. This is low volume, and once volume does increase an announce list will be created, so you can join without fear of more swamping of your mailbox. And just FYI, I am very pleased with the registration process to date. In about 3 weeks I have around 20 paid users! If I can keep that rate up I will be very impressed (although that already looks highly unlikely :-) Even still, I consider it going well. Mark. From tim_one at email.msn.com Sun Jul 11 21:49:57 1999 From: tim_one at email.msn.com (Tim Peters) Date: Sun, 11 Jul 1999 15:49:57 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: Message-ID: <000201becbd6$8df90660$569e2299@tim> [David Ascher] > FWIW, I've been following the coroutine/continuation/generator bit with > 'academic' interest -- the CS part of my brain likes to read about them. > Prompted by Tim's latest mention of Demo/threads/Generator.py, I looked at > it (again?) and *immediately* grokked it and realized how it'd fit into a > tool I'm writing. Nothing to do with concurrency, I/O, etc -- just > compartmentalization of stateful iterative processes (details too baroque > to go over). "stateful iterative process" is a helpful characterization of where these guys can be useful! State captured in variables is the obvious one, but simply "where you are" in a mass of nested loops and conditionals is also "state" -- and a kind of state especially clumsy to encode as data state instead (ever rewrite a hairy recursive routine to use iteration with an explicit stack? it's a transformation that can be mechanized, but the result is usually ugly & often hard to understand). 
Once it sinks in that it's *possible* to implement a stateful iterative process in this other way, I think you'll find examples popping up all over the place. > More relevantly, that tool would be useful on thread-less > Python's (well, when it reaches usefulness on threaded Pythons =). As Guido pointed out, the API provided by Generator.py is less restrictive than any that can be built with the "one frame" flavor of generator ("resumable function"). Were you able to make enough sense of the long discussion that ensued to guess whether the particular use you had in mind required Generator.py's full power? If you couldn't tell, post the baroque details & I'll tell you . not-putting-too-fine-a-point-on-possible-vs-natural-ly y'rs - tim From da at ski.org Sun Jul 11 22:14:04 1999 From: da at ski.org (David Ascher) Date: Sun, 11 Jul 1999 13:14:04 -0700 (Pacific Daylight Time) Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <000201becbd6$8df90660$569e2299@tim> Message-ID: On Sun, 11 Jul 1999, Tim Peters wrote: > As Guido pointed out, the API provided by Generator.py is less restrictive > than any that can be built with the "one frame" flavor of generator > ("resumable function"). Were you able to make enough sense of the long > discussion that ensued to guess whether the particular use you had in mind > required Generator.py's full power? If you couldn't tell, post the baroque > details & I'll tell you . I'm pretty sure the use I mentioned would fit in even the simplest version of a generator. As to how much sense I made of the discussion, let's just say I'm glad there's no quiz at the end. I did shudder at the mention of unmentionables (male public displays of affection -- yeaach!), yodel at the mention of Lord Greystoke swinging among stack branches and chuckled at the vision of him being thrown back in a traceback (ouch! ouch! ouch!, "most painful last"...). 
--david

From tim_one at email.msn.com Mon Jul 12 04:26:44 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Sun, 11 Jul 1999 22:26:44 -0400
Subject: [Python-Dev] Generator details
In-Reply-To: <199907101548.LAA04399@eric.cnri.reston.va.us>
Message-ID: <000001becc0d$fcb64f40$229e2299@tim>

[Guido, sketches 112 ways to implement one-frame generators today ]

I'm glad you're having fun too! I won't reply in detail here; it's enough for now to happily agree that adding a one-frame generator isn't much of a stretch for the current implementation of the PVM.

> Loose end: what to do when there's a try/finally around a suspend?
> E.g.
>
>     generator foo(l):
>         try:
>             for i in l:
>                 suspend i+1
>         finally:
>             print "Done"
>
> The second translation variant would cause "Done" to be printed on
> each suspend *and* on the final return. This is confusing (and in
> fact I think resuming the frame would be a problem since the return
> breaks down the try-finally blocks).

There are several things to be said about this:

+ A suspend really can't ever go thru today's normal "return" path, because (among other things) that wipes out the frame's value stack!

    while (!EMPTY()) {
        v = POP();
        Py_XDECREF(v);
    }

A SUSPEND opcode would let it do what it needs to do without mixing that into the current return path. So my answer to:

> I'm not sure which is better; the version without call_generator()
> allows you to create your own generator without using the 'generator'
> and 'suspend' keywords, calling get_frame() explicitly.

is "both" : get_frame() is beautifully clean, but it still needs something like SUSPEND to keep everything straight. Maybe this just amounts to setting "why" to a new WHY_SUSPEND and sorting it all out after the eval loop; OTOH, that code is pretty snaky already.

+ I *expect* the example code to print "Done" len(l)+1 times!
The generator mechanics are the same as the current for/__getitem__ protocol in this respect: if you have N items to enumerate, the enumeration routine will get called N+1 times, and that's life. That is, the fact is that the generator "gets to" execute code N+1 times, and the only reason your original example seems surprising at first is that it doesn't happen to do anything (except exit the "try" block) on the last of those times. Change it to

    generator foo(l):
        try:
            for i in l:
                suspend i+1
            cleanup()   # new line
        finally:
            print "Done"

and then you'd be surprised *not* to see "Done" printed len(l)+1 times. So I think the easiest thing is also the right thing in this case.

OTOH, the notion that the "finally" clause should get triggered at all the first len(l) times is debatable. If I picture it as a "resumable function" then, sure, it should; but if I picture the caller as bouncing control back & forth with the generator, coroutine style, then suspension is just a pause in the generator's execution. The latter is probably the more natural way to picture it, eh?

Which feeds into:

> Then of course we create another loose end: what if the for loop
> contains a break? Then the frame will never be resumed and its
> finally clause will never be executed! This sounds bad. Perhaps the
> destructor of the frame should look at the 'resumable' bit and if set,
> resume the frame with a system exception, "Killed", indicating an
> abortion? (This is like the kill() call in Generator.py.) We can
> increase the likelihood that the frame's destructor is called at the
> expected time (right when the for loop terminates), by deleting
> __frame at the end of the loop. If the resumed frame raises another
> exception, we ignore it. Its return value is ignored. If it suspends
> itself again, we resume it with the "Killed" exception again until it
> dies (thoughts of the Black Knight come to mind).
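
For comparison, a small sketch of how the "break" case plays out in present-day Python, where generator objects have a close() method (added by PEP 342) that raises GeneratorExit at the suspension point, much like the "Killed" exception proposed above. The names foo, first_n and log are illustrative only, and "yield" stands in for "suspend".

```python
def foo(l, log):
    try:
        for i in l:
            yield i + 1          # plays the role of "suspend i+1"
    finally:
        log.append("Done")       # stands in for: print "Done"


def first_n(gen, n):
    """Take up to n values, then break out early, like a 'break' in a for loop."""
    out = []
    for value in gen:
        out.append(value)
        if len(out) >= n:
            break
    return out


log = []
g = foo([10, 20, 30], log)
values = first_n(g, 2)           # the loop breaks; the frame stays suspended
log_before_close = list(log)     # the "finally" clause has not run yet
g.close()                        # raises GeneratorExit inside the suspended frame
log_after_close = list(log)      # now the "finally" clause has run, exactly once
```

Because g is still referenced after the loop, nothing shuts the frame down until close() is called explicitly (or the last reference goes away), which is the situation the "for as master of the universe" question is about.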

This may leave another loose end : what if the for loop doesn't contain a break, but dies because of an exception in some line unrelated to the generator? Or someone has used an explicit get_frame() in any case and that keeps a ref to the frame alive? If the semantic is that the generator must be shut down no matter what, then the invoker needs code more like

    value, frame = generator(args)
    try:
        while frame:
            etc
            value, frame = resume_frame(frame)
    finally:
        if frame:
            shut_frame_down(frame)

OTOH, the possibility that someone *can* do an explicit get_frame suggests that "for" shouldn't assume it's the master of the universe . Perhaps the user's intent was to generate the first 100 values in a for loop, then break out, analyze the results, and decide whether to resume it again by hand (I've done stuff like that ...). So there's also a case to be made for saying that a "finally" clause wrapping a generator body will only be executed if the generator body raises an exception or the generator itself decides it's done; i.e. iff it triggers while the generator is actively running.

Just complicating things there . It actually sounds pretty good to raise a Killed exception in the frame destructor! The destructor has to do *something* to trigger the code that drains the frame's value stack anyway, "finally" blocks or not (frame_dealloc doesn't do that now, since there's currently no way to get out of eval_code2 with a non-empty stack).

> ...
> I am beginning to like this idea. (Not that I have time for an
> implementation... But it could be done without Christian's patches.)

Or with them too . If stuff is implemented via continuations, the same concerns about try/finally blocks pop up everywhere a continuation is invoked: you (probably) leave the current frame, and may or may not ever come back. So if there's a "finally" clause pending and you don't ever come back, it's a surprise there too.
So while you thought you were dealing with dirt-simple one-frame generators, you were *really* thinking about how to make general continuations play nice . solve-one-mystery-and-you-solve-'em-all-ly y'rs - tim From guido at CNRI.Reston.VA.US Mon Jul 12 05:01:04 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Sun, 11 Jul 1999 23:01:04 -0400 Subject: [Python-Dev] Generator details In-Reply-To: Your message of "Sun, 11 Jul 1999 22:26:44 EDT." <000001becc0d$fcb64f40$229e2299@tim> References: <000001becc0d$fcb64f40$229e2299@tim> Message-ID: <199907120301.XAA06001@eric.cnri.reston.va.us> [Tim seems to be explaining why len(l)+1 and not len(l) -- but I was really thinking about len(l)+1 vs. 1.] > OTOH, the notion that the "finally" clause should get triggered at all the > first len(l) times is debatable. If I picture it as a "resumable function" > then, sure, it should; but if I picture the caller as bouncing control back > & forth with the generator, coroutine style, then suspension is a just a > pause in the generator's execution. The latter is probably the more natural > way to picture it, eh? *This* is what I was getting at, and it points in favor of a SUSPEND opcode since I don't know how to do that in the multiple-return. As you point out, there can be various things on the various in-frame stacks (value stack and block stack) that all get discarded by a return, and that no restart_frame() can restore (unless get_frame() returns a *copy* of the frame, which seems to be defeating the purpose). > OTOH, the possibility that someone *can* do an explicit get_frame suggests > that "for" shouldn't assume it's the master of the universe . Perhaps > the user's intent was to generate the first 100 values in a for loop, then > break out, analyze the results, and decide whether to resume it again by > hand (I've done stuff like that ...). 
So there's also a case to be made for > saying that a "finally" clause wrapping a generator body will only be > executed if the generator body raises an exception or the generator itself > decides it's done; i.e. iff it triggers while the generator is actively > running. Hmm... I think that if the generator is started by a for loop, it's okay for the loop to assume it is the master of the universe -- just like there's no force in the world (apart from illegal C code :) that can change the hidden loop counter in present-day for loop. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at CNRI.Reston.VA.US Mon Jul 12 05:36:05 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Sun, 11 Jul 1999 23:36:05 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: Your message of "Sun, 11 Jul 1999 15:49:57 EDT." <000201becbd6$8df90660$569e2299@tim> References: <000201becbd6$8df90660$569e2299@tim> Message-ID: <199907120336.XAA06056@eric.cnri.reston.va.us> [Tim] > "stateful iterative process" is a helpful characterization of where these > guys can be useful! State captured in variables is the obvious one, but > simply "where you are" in a mass of nested loops and conditionals is also > "state" -- and a kind of state especially clumsy to encode as data state > instead (ever rewrite a hairy recursive routine to use iteration with an > explicit stack? it's a transformation that can be mechanized, but the > result is usually ugly & often hard to understand). This is another key description of continuations (maybe not quite worth a hug :). The continuation captures exactly all state that is represented by "position in the program" and no state that is represented by variables. But there are many hairy details. In antiquated assembly, there might not be a call stack, and a continuation could be represented by a single value: the program counter. 
But now we have a call stack, a value stack, a block stack (in Python) and who knows what else.

I'm trying to understand whether we can get away with saving just a pointer to a frame, whether we need to copy the frame, or whether we need to copy the entire frame stack.

(In regular Python, the frame stack also contains local variables. These are explicitly exempted from being saved by a continuation. I don't know how Christian does this, but I presume he uses the dictionary which can be shared between frames.)

Let's see... Say we have this function:

    def f(x):
        try:
            return 1 + (-x)
        finally:
            print "boo"

The bytecode (simplified) looks like:

        SETUP_FINALLY (L1)
        LOAD_CONST (1)
        LOAD_FAST (x)
        UNARY_NEGATIVE
        BINARY_ADD
        RETURN_VALUE

    L1: LOAD_CONST ("boo")
        PRINT_ITEM
        PRINT_NEWLINE
        END_FINALLY

Now suppose that the unary minus operator saves its continuation (e.g. because x was a type with a __neg__ method). At this point there is an entry on the block stack pointing to L1 as the try-finally block, and the value stack has the value 1 pushed on it.

Clearly if that saved continuation is ever invoked (called? used? activated? What do you call what you do to a continuation?) it should substitute whatever value was passed into the continuation for the result of the unary minus, and the program should continue by pushing it on top of the value stack, adding it to 1, and returning the result, executing the block of code at L1 on the way out. So clearly when the continuation is used, 1 should be on the value stack and L1 should be on the block stack.

Assuming that the unary minus function initially returns just fine, the value stack and the block stack of the frame will be popped. So I conclude that saving a continuation must save at least the value and block stack of the frame being saved.

Is it safe not to save the frame and block stacks of frames further down on the call stack?
I don't think so -- these are all destroyed when frames are popped off the call stack (even if the frame is kept alive, its value and block stack are always empty when the function has returned). So I hope that Christian has code that saves the frame and block stacks! (It would be fun to try and optimize this by doing it lazily, so that frames which haven't returned yet aren't copied yet.) How does Scheme do this? I don't know if it has something like the block stack, but surely it has a value stack! Still mystified, --Guido van Rossum (home page: http://www.python.org/~guido/) From tim_one at email.msn.com Mon Jul 12 09:03:59 1999 From: tim_one at email.msn.com (Tim Peters) Date: Mon, 12 Jul 1999 03:03:59 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <199907120336.XAA06056@eric.cnri.reston.va.us> Message-ID: <000201becc34$b79f7900$9b9e2299@tim> [Guido wonders about continuations -- must be a bad night for sleep ] Paul Wilson's book-in-progress has a (large) page of HTML that you can digest quickly and that will clear up many mysteries: ftp://ftp.cs.utexas.edu/pub/garbage/cs345/schintro-v14/schintro_142.html Scheme may be the most-often implemented language on Earth (ask 100 Schemers what they use Scheme for, persist until you get the truth, and 81 will eventually tell you that mostly they putz around writing their own Scheme interpreter <0.51 wink>); so there are a *lot* of approaches out there. Wilson describes a simple approach for a compiler. A key to understanding it is that continuations aren't "special" in Scheme: they're the norm. Even plain old calls are set up by saving the caller's continuation, then handing control to the callee. In Wilson's approach, "the eval stack" is a globally shared stack, but at any given moment contains only the eval temps relevant to the function currently executing. 
In preparation for a call, the caller saves away its state in "a continuation", a record which includes:

    the current program counter
    a pointer to the continuation record it inherited
    a pointer to the structure supporting name resolution (locals & beyond)
    the current eval stack, which gets drained (emptied) at this point

There isn't anything akin to Python's block stack (everything reduces to closures, lambdas and continuations). Note: the continuation is immutable; once constructed, it's never changed.

Then the callee's arguments are pushed on the eval stack, a pointer to the continuation as saved above is stored in "the continuation register", and control is transferred to the callee.

Then a function return is exactly the same operation as "invoking a continuation": whatever is in the continuation register at the time of the return/invoke is dereferenced, and the PC, continuation register, env pointer and eval stack values are copied out of the continuation record. The return value is passed back in another "virtual register", and pushed onto the eval stack first thing after the guts of the continuation are restored.

So this copies the eval stack all the time, at every call and every return/invoke. Kind of. This is partly why "tail calls" are such a big deal in Scheme: a tail call need not (*must* not, in std Scheme) create a new continuation. The target of a tail call simply inherits the continuation pointer inherited by its caller.

Of course many Scheme implementations optimize beyond this.

> I'm trying to understand whether we can get away with saving just a
> pointer to a frame, whether we need to copy the frame, or whether we
> need to copy the entire frame stack.

In the absence of tail calls, the approach above saves the stack on every call and restores it on every return, so there's no "extra" copying needed when capturing, or invoking, a continuation (cold comfort, I agree ).
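
A toy model of the bookkeeping just described may help; treat it as a sketch only. The class and field names (Continuation, Machine, pc, parent, env, saved_stack) are made up for illustration and are not taken from Wilson's text or from any real Scheme implementation, and tail calls are left out.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple


@dataclass(frozen=True)                  # immutable once constructed
class Continuation:
    pc: int                              # where the caller resumes
    parent: Optional["Continuation"]     # continuation the caller inherited
    env: dict                            # name-resolution structure (shared, not copied)
    saved_stack: Tuple[Any, ...]         # caller's eval temps at the time of the call


class Machine:
    def __init__(self):
        self.pc = 0
        self.env = {}
        self.eval_stack = []             # temps of the currently running function only
        self.cont = None                 # the "continuation register"

    def call(self, return_pc, callee_pc, callee_env, args):
        # Save the caller's state, draining the eval stack into the record.
        self.cont = Continuation(return_pc, self.cont, self.env,
                                 tuple(self.eval_stack))
        self.eval_stack = list(args)     # callee's arguments go on the eval stack
        self.pc, self.env = callee_pc, callee_env

    def invoke(self, value):
        # A plain return and "invoking a continuation" are the same operation.
        k = self.cont
        self.pc, self.cont, self.env = k.pc, k.parent, k.env
        self.eval_stack = list(k.saved_stack)
        self.eval_stack.append(value)    # return value pushed after the restore


m = Machine()
m.eval_stack.append(1)                   # caller has a temp pending, e.g. the 1 in 1 + f(x)
m.call(return_pc=7, callee_pc=100, callee_env={"x": 5}, args=[5])
saved = m.cont                           # capturing is just keeping this pointer
m.invoke(-5)                             # ordinary return
first = (m.pc, list(m.eval_stack))
m.cont = saved                           # invoke the same continuation a second time
m.invoke(42)
second = (m.pc, list(m.eval_stack))
```

Since the record is never mutated, invoking it twice restores the same pending temp both times; only the value passed in differs.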
About Christian's code, we'd better let it speak for itself -- I'm not clear on the details of what he's doing today. Generalities: > ... > So I hope that Christian has code that saves the frame and block > stacks! Yes, but nothing gets copied until a continuation gets captured, and at the start of that I believe only one frame gets cloned. > (It would be fun to try and optimize this by doing it lazily, > so that frames which haven't returned yet aren't copied yet.) He's aware of that . > How does Scheme do this? I don't know if it has something like the > block stack, but surely it has a value stack! Stacks and registers and such aren't part of the language spec, but, you bet -- however it may be spelled in a given implementation, "a value stack" is there. BTW, many optimizing Schemes define a weaker form of continuation too (call/ec, for "escaping continuation"). Skipping the mumbo jumbo definition <0.9 wink>, you can only invoke one of those if its target is on the path back from the invoker to the root of the call tree (climb up tree like Cheetah, not leap across branches like Tarzan). This amounts to a setjmp/longjmp in C -- and may be implemented that way! i-say-do-it-right-or-not-at-all-ly y'rs - tim From tismer at appliedbiometrics.com Mon Jul 12 11:44:06 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Mon, 12 Jul 1999 11:44:06 +0200 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) References: <000201becbd6$8df90660$569e2299@tim> <199907120336.XAA06056@eric.cnri.reston.va.us> Message-ID: <3789B8E6.C4CB6840@appliedbiometrics.com> Guido van Rossum wrote: ... > I'm trying to understand whether we can get away with saving just a > pointer to a frame, whether we need to copy the frame, or whether we > need to copy the entire frame stack. You need to preserve the stack and the block stack of a frame, if and only if it can be reached twice. I make this dependent from its refcount. 
Every frame monitors itself before and after every call_function, if a handler field in the frame "f_callguard" has been set. If so, the callguard is called. Its task is to see whether we must preserve the current state of the frame and to carry this out.

The idea is to create a shadow frame "on demand". When I touch a frame with a refcount > 1, I duplicate it at its f_back pointer. By that it is turned into a "continuation frame" which is nothing more than the stack copy, IP, and the block stack.

By that, the frame stays in place where it was, all pointers are still fine. The "real" one is now in the back, and the continuation frame's purpose when called is only to restore the state of the "real one" and run it (after doing a new save if necessary).

I call this technique "push back frames".

>
> (In regular Python, the frame stack also contains local variables.
> These are explicitly exempted from being saved by a continuation. I
> don't know how Christian does this, but I presume he uses the
> dictionary which can be shared between frames.)

I keep the block stack and a stack copy. All the locals exist only once. The frame is also only one frame. Actually always a new one (due to push back), but virtually it is "the frame", with multiple continuation frames pointing at it.

...
> Clearly if that saved continuation is ever invoked (called? used?
> activated? What do you call what you do to a continuation?)

I think of throwing. Mine are thrown. The executive of standard frames is "eval_code2_loop(f, passed_retval)", where the executive of a continuation frame is "throw_continuation(f, passed_retval)".

...
> Is it safe not to save the frame and block stacks of frames further
> down on the call stack? I don't think so -- these are all destroyed
> when frames are popped off the call stack (even if the frame is kept
> alive, its value and block stack are always empty when the function
> has returned).
>
> So I hope that Christian has code that saves the frame and block
> stacks! (It would be fun to try and optimize this by doing it lazily,
> so that frames which haven't returned yet aren't copied yet.)

:-) I have exactly that, and I do it lazily already. Unless somebody saves a continuation, nothing special happens. But if he does, the push back process follows his path like a zip (? Reißverschluß) and ensures that the path can be walked again. Tarzan has now the end of this liane in his hand. He might use it to swing over, or he might drop it, and it ribbles away and vanishes as if it never existed.

Give me some final testing, and you will be able to try it out in a few days.

ciao - chris

-- 
Christian Tismer             :^)
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home

From tismer at appliedbiometrics.com Mon Jul 12 11:56:00 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Mon, 12 Jul 1999 11:56:00 +0200
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
References: <000201becc34$b79f7900$9b9e2299@tim>
Message-ID: <3789BBB0.39F6BD20@appliedbiometrics.com>

Tim Peters wrote:
...
> BTW, many optimizing Schemes define a weaker form of continuation too
> (call/ec, for "escaping continuation"). Skipping the mumbo jumbo definition
> <0.9 wink>, you can only invoke one of those if its target is on the path
> back from the invoker to the root of the call tree (climb up tree like
> Cheetah, not leap across branches like Tarzan). This amounts to a
> setjmp/longjmp in C -- and may be implemented that way!

Right, maybe this would do enough. We will throw away what's not needed, when we know what we actually need...

> i-say-do-it-right-or-not-at-all-ly y'rs - tim

...and at the moment I think it was right to take it all.

just-fixing-continuations-spun-off-in-an-__init__-which-
-is-quite-hard-since-still-recursive,-and-I-will-ship-it-ly y'rs - chris

-- 
Christian Tismer             :^)
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home

From bwarsaw at cnri.reston.va.us Mon Jul 12 17:42:14 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Mon, 12 Jul 1999 11:42:14 -0400 (EDT)
Subject: [Python-Dev] Generator details
References: <199907101548.LAA04399@eric.cnri.reston.va.us> <000001becc0d$fcb64f40$229e2299@tim>
Message-ID: <14218.3286.847367.125679@anthem.cnri.reston.va.us>

| value, frame = generator(args)
| try:
|     while frame:
|         etc
|         value, frame = resume_frame(frame)
| finally:
|     if frame:
|         shut_frame_down(frame)

Minor point, but why not make resume() and shutdown() methods on the frame? Isn't this much cleaner?

    value, frame = generator(args)
    try:
        while frame:
            etc
            value, frame = frame.resume()
    finally:
        if frame:
            frame.shutdown()

-Barry

From tismer at appliedbiometrics.com Mon Jul 12 21:39:40 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Mon, 12 Jul 1999 21:39:40 +0200
Subject: [Python-Dev] continuationmodule.c preview
Message-ID: <378A447C.D4DD24D8@appliedbiometrics.com>

Howdy,

please find attached my latest running version of continuationmodule.c which is really able to do continuations. You need stackless Python 0.3 for it, which I just submitted.

This module is by no means ready. The central functions are getpcc() and putcc(). Call/cc is at the moment to be done like:

    def callcc(fun, *args, **kw):
        cont = getpcc()
        return apply(fun, (cont,)+args, kw)

getpcc(level=1) gets a parent's current continuation.
putcc(cont, val) throws a continuation. At the moment, these are still frames (albeit special ones) which I will change. They should be turned into objects which have a link to the actual frame, which can be unlinked after a shot or by hand. This makes it easier to clean up circular references. I have a rough implementation of this in Python, also a couple of generators and coroutines, but all not pleasing me yet. Due to the fact that my son is ill, my energy has dropped a little for the moment, so I thought I'd better release something now. I will make the module public when things have been settled a little more. ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home -------------- next part -------------- A non-text attachment was scrubbed... Name: continuationmodule.c Type: application/x-unknown-content-type-cfile Size: 19750 bytes Desc: not available URL: From guido at CNRI.Reston.VA.US Mon Jul 12 22:04:21 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 12 Jul 1999 16:04:21 -0400 Subject: [Python-Dev] Python bugs database started Message-ID: <199907122004.QAA09348@eric.cnri.reston.va.us> Barry has installed Jitterbug on python.org and now we can use it to track Python bugs. I already like it much better than the todo wizard, because the response time is much better (the CGI program is written in C). Please try it out -- submit bugs, search for bugs, etc. The URL is http://www.python.org/python-bugs/. Some of you already subscribed to the mailing list (python-bugs-list) -- beware that this list receives a message for each bug reported and each followup. 
The HTML is preliminary -- it is configurable (somewhat) and I would like to make it look nicer, but don't have the time right now. There are certain features (such as moving bugs to different folders) that are only accessible to authorized users. If you have a good reason I might authorize you. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim_one at email.msn.com Tue Jul 13 06:03:25 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 13 Jul 1999 00:03:25 -0400 Subject: [Python-Dev] Python bugs database started In-Reply-To: <199907122004.QAA09348@eric.cnri.reston.va.us> Message-ID: <000701becce4$a973c920$31a02299@tim> > Please try it out -- submit bugs, search for bugs, etc. The URL is > http://www.python.org/python-bugs/. Cool! About those "Jitterbug bugs" (repeated submissions): those popped up for me, DA, and MH. The first and the last are almost certainly using IE5 as their browser, and that DA shows increasing signs of becoming a Windows Mutant too . The first time I submitted a bug, I backed up to the entry page and hit Refresh to get the category counts updated (never saw Jitterbug before, so must play!). IE5 whined about something-or-other being out of date, and would I like to "repost the data"? I said sure. I did that a few other times after posting other bugs, and-- while I don't know for sure --it looks likely that you got a number of resubmissions equal to the number of times I told IE5 "ya, ya, repost whatever you want". Next time I post a bug I'll just close the browser and come back an hour later. If "the repeat bug" goes away then, it's half IE5's fault for being confused about which page it's on, and half mine for assuming IE5 knows what it's doing. 
meta-bugging-ly y'rs - tim From tim_one at email.msn.com Tue Jul 13 06:03:30 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 13 Jul 1999 00:03:30 -0400 Subject: [Python-Dev] Generator details In-Reply-To: <199907120301.XAA06001@eric.cnri.reston.va.us> Message-ID: <000801becce4$aafd7660$31a02299@tim> [Guido] > ... > Hmm... I think that if the generator is started by a for loop, it's > okay for the loop to assume it is the master of the universe -- just > like there's no force in the world (apart from illegal C code :) that > can change the hidden loop counter in present-day for loop. If it comes to a crunch, me too. I think your idea of forcing an exception in the frame's destructor (to get the stacks cleaned up, and any suspended "finally" blocks executed) renders this a non-issue, though (it will "just work", and if people resort to illegal C code, it will *still* work ). hadn't-noticed-you-can't-spell-"illegal-code"-without-"c"-ly y'rs - tim From tim_one at email.msn.com Tue Jul 13 06:03:33 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 13 Jul 1999 00:03:33 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <199907120336.XAA06056@eric.cnri.reston.va.us> Message-ID: <000901becce4$ac88aa40$31a02299@tim> Backtracking a bit: [Guido] > This is another key description of continuations (maybe not quite > worth a hug :). I suppose a kiss is out of the question, then. > The continuation captures exactly all state that is represented by > "position in the program" and no state that is represented by variables. Right! > But there are many hairy details. In antiquated assembly, there might > not be a call stack, and a continuation could be represented by a > single value: the program counter. But now we have a call stack, a > value stack, a block stack (in Python) and who knows what else. 
> > I'm trying to understand whether we can get away with saving just a > pointer to a frame, whether we need to copy the frame, or whether we > need to copy the entire frame stack. As you convinced yourself in following paragraphs, for 1st-class continuations "the entire frame stack" *may* be necessary. > ... > How does Scheme do this? I looked up R. Kent Dybvig's doctoral dissertation, at ftp://ftp.cs.indiana.edu/pub/scheme-repository/doc/pubs/3imp.ps.gz He gives detailed explanations of 3 Scheme implementations there (from whence "3imp", I guess). The first is all heap-based, and looks much like the simple Wilson implementation I summarized yesterday. Dybvig profiled it and discovered it spent half its time in, together, function call overhead and name resolution. So he took a different approach: Scheme is, at heart, just another lexically scoped language, like Algol or Pascal. So how about implementing it with a perfectly conventional shared, contiguous stack? Because that doesn't work: the advanced features (lexical closures with indefinite extent, and user-captured continuations) aren't stack-like. Tough, forget those at the start, and do whatever it takes later to *make* 'em work. So he did. When his stack implementation hit a user's call/cc, it made a physical copy of the entire stack. And everything ran much faster! He points out that "real programs" come in two flavors: 1) Very few, or no, call/cc thingies. Then most calls are no worse than Algol/Pascal/C functions, and the stack implementation runs them at Algol/Pascal/C speed (if we knew of anything faster than a plain stack, the latter would use it). 2) Lots of call/cc thingies. Then "the stack" is likely to be shallow (the program is spending most of its time co-transferring, not recursing deeply), and because the stack is contiguous he can exploit the platform's fastest block-copy operation (no need to chase pointer links, etc). 
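[A toy sketch of the copy-the-whole-stack strategy described above, written in present-day Python. It is not taken from Dybvig's dissertation: the `ToyStack` class and its method names are invented for illustration, and the "frames" are plain strings rather than real activation records. The point is only that ordinary calls and returns stay cheap pushes and pops, while capturing a continuation costs one block copy proportional to the current depth.]

```python
class ToyStack:
    """Hypothetical contiguous control stack; a 'continuation' is a copy of it."""

    def __init__(self):
        self.frames = []              # one contiguous list, like a C stack

    def call(self, frame):
        self.frames.append(frame)     # ordinary call: push

    def ret(self):
        return self.frames.pop()      # ordinary return: pop

    def capture(self):
        # call/cc: snapshot the entire stack with a single block copy
        return tuple(self.frames)

    def reinstate(self, snapshot):
        # throwing to a continuation: overwrite the live stack with the copy
        self.frames[:] = snapshot


stack = ToyStack()
stack.call("main")
stack.call("search")
k = stack.capture()       # cost is proportional to the depth (two frames here)
stack.call("helper")
stack.ret()               # "helper" returns
stack.ret()               # "search" returns; only "main" is left
stack.reinstate(k)        # ...yet control can still be put back inside "search"
```

[A real implementation would also have to decide what happens to mutable frame contents shared between copies; this sketch sidesteps that by using immutable strings.]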
So, in some respects, Dybvig's stack implementation of Scheme was more Pythonic than Python's current implementation . His third implementation was for some propeller-head theoretical "string machine", so I won't even mention it. worrying-about-the-worst-case-can-hurt-the-normal-cases-ly y'rs - tim From da at ski.org Tue Jul 13 06:15:28 1999 From: da at ski.org (David Ascher) Date: Mon, 12 Jul 1999 21:15:28 -0700 (Pacific Daylight Time) Subject: [Python-Dev] Python bugs database started In-Reply-To: <000701becce4$a973c920$31a02299@tim> Message-ID: > About those "Jitterbug bugs" (repeated submissions): those popped up for > me, DA, and MH. The first and the last are almost certainly using IE5 as > their browser, and that DA shows increasing signs of becoming a Windows > Mutant too . > > Next time I post a bug I'll just close the browser and come back an hour > later. If "the repeat bug" goes away then, it's half IE5's fault for being > confused about which page it's on, and half mine for assuming IE5 knows what > it's doing. FYI, I did the same thing but w/ Communicator. (I do use windows, but refuse to use IE =). This one's not specifically MS' fault. From tim_one at email.msn.com Tue Jul 13 06:47:43 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 13 Jul 1999 00:47:43 -0400 Subject: [Python-Dev] Generator details In-Reply-To: <14218.3286.847367.125679@anthem.cnri.reston.va.us> Message-ID: <001501beccea$d83f6740$31a02299@tim> [Barry] > Minor point, but why not make resume() and shutdown() methods on the > frame? Isn't this much cleaner? > > value, frame = generator(args) > try: > while frame: > etc > value, frame = frame.resume() > finally: > if frame: > frame.shutdown() Yes -- and at least it's better than arguing over what to name them . 
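[A minimal sketch of the method-style protocol Barry proposes, emulated with the `yield`-based generators that Python gained only after this thread. The `resume` and `shutdown` names come from the message above; the `Frame` wrapper, the `start` helper and the `countdown` example are assumptions made up for illustration. `shutdown` relies on the generator's `close()` method, which raises an exception inside the suspended frame so that pending `finally` blocks run.]

```python
class Frame:
    """Hypothetical stand-in for a suspended generator frame."""

    def __init__(self, gen):
        self._gen = gen

    def resume(self):
        # Run to the next suspend; report (value, frame), or (None, None)
        # once the generator has done its final return.
        try:
            return next(self._gen), self
        except StopIteration:
            return None, None

    def shutdown(self):
        # Forces an exception in the suspended frame so 'finally' blocks run.
        self._gen.close()


def start(genfunc, *args):
    """Hypothetical helper playing the role of 'generator(args)'."""
    return Frame(genfunc(*args)).resume()


log = []

def countdown(n):
    try:
        while n > 0:
            yield n
            n -= 1
    finally:
        log.append("cleaned up")


seen = []
value, frame = start(countdown, 3)
try:
    while frame:
        seen.append(value)
        if value == 2:
            break                     # abandon the generator early
        value, frame = frame.resume()
finally:
    if frame:
        frame.shutdown()
```

[After the loop, `seen` holds the two values consumed before the early exit, and `log` shows that the suspended `finally` block still ran.]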
btw-tabs-in-email-don't-look-the-way-you-expect-them-to-ly y'rs - tim From tim_one at email.msn.com Tue Jul 13 08:47:43 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 13 Jul 1999 02:47:43 -0400 Subject: [Python-Dev] End of the line In-Reply-To: <378A447C.D4DD24D8@appliedbiometrics.com> Message-ID: <000001beccfb$9beb4fa0$2f9e2299@tim> The latest versions of the Icon language (9.3.1 & beyond) sprouted an interesting change in semantics: if you open a file for reading in "translated" (text) mode now, it normalizes Unix, Mac and Windows line endings to plain \n. Writing in text mode still produces what's natural for the platform. Anyone think that's *not* a good idea? c-will-never-get-fixed-ly y'rs - tim From Vladimir.Marangozov at inrialpes.fr Tue Jul 13 13:54:00 1999 From: Vladimir.Marangozov at inrialpes.fr (Vladimir Marangozov) Date: Tue, 13 Jul 1999 12:54:00 +0100 (NFT) Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <000101bec6b3$4e752be0$349e2299@tim> from "Tim Peters" at "Jul 5, 99 02:55:02 am" Message-ID: <199907131154.MAA22698@pukapuka.inrialpes.fr> After a short vacation, I'm trying to swallow the latest discussion about control flow management & derivatives. Could someone help me please by answering two naive questions that popped up spontaneously in my head: Tim Peters wrote: [a biased short course on generators, continuations, coroutines] > > ... > > GENERATORS > > Generators add two new abstract operations, "suspend" and "resume". When a > generator suspends, it's exactly like a return today except we simply > decline to decref the frame. That's it! The locals, and where we are in > the computation, aren't thrown away. A "resume" then consists of > *re*starting the frame at its next bytecode instruction, with the retained > frame's locals and eval stack just as they were. > > ... > > too-simple-to-be-obvious?-ly y'rs - tim Yes. I'm trying to understand the following: 1. What does a generator generate? 2. 
Clearly, what's the difference between a generator and a thread? -- Vladimir MARANGOZOV | Vladimir.Marangozov at inrialpes.fr http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252 From tismer at appliedbiometrics.com Tue Jul 13 13:41:32 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Tue, 13 Jul 1999 13:41:32 +0200 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) References: <199907131154.MAA22698@pukapuka.inrialpes.fr> Message-ID: <378B25EC.2739BCE3@appliedbiometrics.com> Vladimir Marangozov wrote: ... > > too-simple-to-be-obvious?-ly y'rs - tim > > Yes. I'm trying to understand the following: > > 1. What does a generator generate? Trying my little understanding. A generator generates a series of results if you ask for it. That's done by a resume call (generator, resume your computation), and the generator continues until it either comes to a suspend (return a value, but be prepared to continue from here) or it does a final return. > 2. Clearly, what's the difference between a generator and a thread? Threads can be scheduled automatically, and they don't return values to each other, natively. Generators are asymmetric to their callers, they're much like functions. Coroutines are more symmetric. They "return" values to each other. They are not fixed as caller and callee, but they cooperate on the same level. Therefore, threads and coroutines look more similar, just that coroutines usually aren't scheduled automatically. Add a scheduler, don't pass values, and you have threads, nearly. (of course I dropped the I/O blocking stuff which doesn't apply and isn't the intent of fake threads). ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break!
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From guido at CNRI.Reston.VA.US Tue Jul 13 14:53:52 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 13 Jul 1999 08:53:52 -0400 Subject: [Python-Dev] End of the line In-Reply-To: Your message of "Tue, 13 Jul 1999 02:47:43 EDT." <000001beccfb$9beb4fa0$2f9e2299@tim> References: <000001beccfb$9beb4fa0$2f9e2299@tim> Message-ID: <199907131253.IAA10730@eric.cnri.reston.va.us> > The latest versions of the Icon language (9.3.1 & beyond) sprouted an > interesting change in semantics: if you open a file for reading in > "translated" (text) mode now, it normalizes Unix, Mac and Windows line > endings to plain \n. Writing in text mode still produces what's natural for > the platform. > > Anyone think that's *not* a good idea? I've been thinking about this myself -- exactly what I would do. Not clear how easy it is to implement (given that I'm not so enthused about the idea of rewriting the entire I/O system without using stdio -- see archives). The implementation must be as fast as the current one -- people used to complain bitterly when readlines() or read() where just a tad slower than they *could* be. There's a lookahead of 1 character needed -- ungetc() might be sufficient except that I think it's not guaranteed to work on unbuffered files. Should also do this for the Python parser -- there it would be a lot easier. --Guido van Rossum (home page: http://www.python.org/~guido/) From bwarsaw at cnri.reston.va.us Tue Jul 13 16:41:25 1999 From: bwarsaw at cnri.reston.va.us (Barry A. 
Warsaw) Date: Tue, 13 Jul 1999 10:41:25 -0400 (EDT) Subject: [Python-Dev] Python bugs database started References: <199907122004.QAA09348@eric.cnri.reston.va.us> <000701becce4$a973c920$31a02299@tim> Message-ID: <14219.20501.697542.358579@anthem.cnri.reston.va.us> >>>>> "TP" == Tim Peters writes: TP> The first time I submitted a bug, I backed up to the entry TP> page and hit Refresh to get the category counts updated (never TP> saw Jitterbug before, so must play!). IE5 whined about TP> something-or-other being out of date, and would I like to TP> "repost the data"? I said sure. This makes perfect sense, and explains exactly what's going on. Let's call it "poor design"[1] instead of "user error". A quick scan last night of the Jitterbug site shows no signs of fixes or workarounds. What would Jitterbug have to do to avoid these kinds of problems? Maybe keep a checksum of the current submission and check it against the next one to make sure it's not a re-submit. Maybe a big warning sign reading "Do not repost this form!" Hmm. I think I'll complain on the Jitterbug mailing list. -Barry [1] In the midst of re-reading D. Norman's "The Design of Everyday Things", otherwise I would have said you guys were just incompetent Webweenies :) From da at ski.org Tue Jul 13 18:01:55 1999 From: da at ski.org (David Ascher) Date: Tue, 13 Jul 1999 09:01:55 -0700 (Pacific Daylight Time) Subject: [Python-Dev] Python bugs database started In-Reply-To: <14219.20501.697542.358579@anthem.cnri.reston.va.us> Message-ID: On Tue, 13 Jul 1999, Barry A. Warsaw wrote: > > This makes perfect sense, and explains exactly what's going on. Let's > call it "poor design"[1] instead of "user error". A quick scan last > night of the Jitterbug site shows no signs of fixes or workarounds. > What would Jitterbug have to do to avoid these kinds of problems? > Maybe keep a checksum of the current submission and check it against > the next one to make sure it's not a re-submit. 
That'd be good -- alternatively, insert a 'safe' CGI script after the validation -- "Thanks for submitting the bug. Click here to go back to the home page". From guido at CNRI.Reston.VA.US Tue Jul 13 18:09:48 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 13 Jul 1999 12:09:48 -0400 Subject: [Python-Dev] Python bugs database started In-Reply-To: Your message of "Tue, 13 Jul 1999 09:01:55 PDT." References: Message-ID: <199907131609.MAA11208@eric.cnri.reston.va.us> > That'd be good -- alternatively, insert a 'safe' CGI script after the > validation -- "Thanks for submitting the bug. Click here to go back to > the home page". That makes a lot of sense! I'm now quite sure that I had the same "Repost form data?" experience, and just didn't realize that it mattered, because I was staring at the part of the form that was showing the various folders. The Jitterbug software is nice for tracking bugs, but its user interface *SUCKS*. I wish I had the time to redesign that part -- unfortunately it's probably totally integrated with the rest of the code... --Guido van Rossum (home page: http://www.python.org/~guido/) From bwarsaw at CNRI.Reston.VA.US Tue Jul 13 18:19:26 1999 From: bwarsaw at CNRI.Reston.VA.US (Barry A. Warsaw) Date: Tue, 13 Jul 1999 12:19:26 -0400 (EDT) Subject: [Python-Dev] Python bugs database started References: <199907131609.MAA11208@eric.cnri.reston.va.us> Message-ID: <14219.26382.122095.608613@anthem.cnri.reston.va.us> >>>>> "Guido" == Guido van Rossum writes: Guido> The Jitterbug software is nice for tracking bugs, but its Guido> user interface *SUCKS*. I wish I had the time to redesign Guido> that part -- unfortunately it's probably totally integrated Guido> with the rest of the code... There is an unsupported fork that some guy did that totally revamped the interface: http://lists.samba.org/listproc/jitterbug/0095.html Still not great tho'.
-Barry From MHammond at skippinet.com.au Wed Jul 14 04:25:50 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Wed, 14 Jul 1999 12:25:50 +1000 Subject: [Python-Dev] Interrupting a thread Message-ID: <006d01becda0$318035e0$0801a8c0@bobcat> I've struck this a number of times, and the simple question is "can we make it possible to interrupt a thread without the thread's knowledge" or otherwise stated "how can we asynchronously raise an exception in another thread?" The specific issue is that quite often, I find it necessary to interrupt one thread from another. One example is Pythonwin - rather than use the debugger hooks as IDLE does, I use a secondary thread. But how can I use that thread to interrupt the code executing in the first? (With magic that only works sometimes is how :-) Another example came up on the newsgroup recently - discussion about making Medusa a true Windows NT Service. A trivial solution would be to have a "service thread", that simply runs Medusa's loop in a separate thread. When the "service thread" receives a shut-down request from NT, how can it interrupt Medusa? I probably should not have started with a Medusa example - it may have a solution. Pretend I said "any arbitrary script written to run similarly to a Unix daemon". There are one or two other cases where I have wanted to execute existing code that assumes it runs stand-alone, and can really only be stopped with a KeyboardInterrupt. I can't see a decent way to do this. [I guess this ties into the "signals and threads" limitations - I believe you can't direct signals at threads either?] Is it desirable? Unfortunately, I can see that it might be hard :-( But-sounds-pretty-easy-under-those-fake-threads-ly, Mark.
From tim_one at email.msn.com Wed Jul 14 05:56:20 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 13 Jul 1999 23:56:20 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <199907131154.MAA22698@pukapuka.inrialpes.fr> Message-ID: <000d01becdac$d4dee900$7d9e2299@tim> [Vladimir Marangozov] > Yes. I'm trying to understand the following: > > 1. What does a generator generate? Any sequence of objects: the lines in a file, the digits of pi, a postorder traversal of the nodes of a binary tree, the files in a directory, the machines on a LAN, the critical bugs filed before 3/1/1995, the set of builtin types, all possible ways of matching a regexp to a string, the 5-card poker hands beating a pair of deuces, ... anything! Icon uses the word "generators", and it's derived from that language's ubiquitous use of the beasts to generate paths in a backtracking search space. In OO languages it may be better to name them "iterators", after the closest common OO concept. The CLU language had full-blown (semi-coroutine, like Icon generators) iterators 20 years ago, and the idea was copied & reinvented by many later languages. Sather is probably the best known of those, and also calls them iterators. > 2. Clearly, what's the difference between a generator and a thread? If you can clearly explain what "a thread" is, I can clearly explain the similarities and differences. Well? I'm holding my breath here . Generators/iterators are simpler than threads, whether looked at from a user's viewpoint or an implementor's. Their semantics are synchronous and deterministic. Python's for/__getitem__ protocol *is* an iterator protocol already, but if I ask you which is the 378th 5-card poker hand beating a pair of deuces, and ask you a new question like that every hour, you may start to suspect there may be a better way to *approach* coding enumerations in general . 
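[A small sketch contrasting the two approaches just mentioned, using present-day Python, where `yield` arrived some years after this message. The `Squares` class and the `postorder` function are illustrative names, not anything from the thread. With the for/__getitem__ protocol the object is asked "what is item i?" afresh each time; with a generator the position in the computation is simply retained between results, which is what makes something like a tree traversal easy to write.]

```python
class Squares:
    """Index-driven iteration: 'for' calls __getitem__(0), (1), ...
    until IndexError is raised."""

    def __init__(self, n):
        self.n = n

    def __getitem__(self, i):
        if i >= self.n:
            raise IndexError(i)
        return i * i          # must be computable from the index alone


def postorder(tree):
    """Generator: a tree is None or a (left, value, right) tuple."""
    if tree is None:
        return
    left, value, right = tree
    yield from postorder(left)    # suspended state remembers where we are
    yield from postorder(right)
    yield value


squares = [s for s in Squares(4)]
tree = ((None, 1, None), 2, (None, 3, None))
nodes = list(postorder(tree))
```

[Writing `postorder` against the index-driven protocol would mean answering "which node is the i-th in postorder?" on every call, which is the awkwardness the poker-hand remark points at.]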
then-again-there-may-not-be-ly y'rs - tim From tim_one at email.msn.com Wed Jul 14 05:56:15 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 13 Jul 1999 23:56:15 -0400 Subject: [Python-Dev] End of the line In-Reply-To: <199907131253.IAA10730@eric.cnri.reston.va.us> Message-ID: <000c01becdac$d2ad6300$7d9e2299@tim> [Tim] > ... Icon ... sprouted an interesting change in semantics: if you open > a file for reading in ...text mode ... it normalizes Unix, Mac and > Windows line endings to plain \n. Writing in text mode still produces > what's natural for the platform. [Guido] > I've been thinking about this myself -- exactly what I would do. Me too . > Not clear how easy it is to implement (given that I'm not so enthused > about the idea of rewriting the entire I/O system without using stdio > -- see archives). The Icon implementation is very simple: they *still* open the file in stdio text mode. "What's natural for the platform" on writing then comes for free. On reading, libc usually takes care of what's needed, and what remains is to check for stray '\r' characters that stdio glossed over. That is, in fileobject.c, replacing

    if ((*buf++ = c) == '\n') {
        if (n < 0)
            buf--;
        break;
    }

with a block like (untested!)

    *buf++ = c;
    if (c == '\n' || c == '\r') {
        if (c == '\r') {
            *(buf-1) = '\n';
            /* consume following newline, if any */
            c = getc(fp);
            if (c != '\n')
                ungetc(c, fp);
        }
        if (n < 0)
            buf--;
        break;
    }

Related trickery needed in readlines. Of course the '\r' business should be done only if the file was opened in text mode. > The implementation must be as fast as the current one -- people used > to complain bitterly when readlines() or read() were just a tad > slower than they *could* be. The above does add one compare per character. Haven't timed it. readlines may be worse. BTW, people complain bitterly anyway, but it's in comparison to Perl text mode line-at-a-time reads!
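[Before the timings: a rough Python rendering of the '\r' handling proposed for fileobject.c earlier in this message, offered only as a sketch. The `read_line` function is a made-up name, it reads from a binary stream rather than relying on stdio text mode, and it uses `seek` where the C version uses `ungetc`, so it assumes a seekable stream. That assumption echoes the lookahead worry quoted in this message.]

```python
import io


def read_line(fp):
    """Read one line from a seekable binary stream, mapping both
    '\\r\\n' and a lone '\\r' to a single '\\n'."""
    out = bytearray()
    while True:
        c = fp.read(1)
        if not c:                       # end of file
            break
        if c == b"\r":
            nxt = fp.read(1)            # one byte of lookahead
            if nxt and nxt != b"\n":
                fp.seek(-1, io.SEEK_CUR)    # push it back, like ungetc()
            out += b"\n"                # the '\r' itself ends the line
            break
        out += c
        if c == b"\n":
            break
    return bytes(out)


data = io.BytesIO(b"unix\ndos\r\nmac\rlast")
lines = []
while True:
    line = read_line(data)
    if not line:
        break
    lines.append(line)
```

[The extra work per byte is the one comparison against '\r', which is the cost noted above.]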
D:\Python>wc a.c
1146880 3023873 25281537 a.c
D:\Python>

Reading that via

    def g():
        f = open("a.c")
        while 1:
            line = f.readline()
            if not line:
                break

and using python -O took 51 seconds. Running the similar Perl (although it's not idiomatic Perl to assign each line to an explicit var, or to test that var in the loop, or to use "if !" instead of "unless" -- did all those to make it more like the Python):

    open(DATA, "<a.c");
    while ($line = <DATA>) {last if ! $line;}

took 17 seconds. So when people are complaining about a factor of 3, I'm not inclined to get excited about a few percent . > There's a lookahead of 1 character needed -- ungetc() might be > sufficient except that I think it's not guaranteed to work on > unbuffered files. Don't believe I've bumped into that. *Have* bumped into problems with ungetc not playing nice with fseek/ftell, and that's probably enough to kill it right there (alas). > Should also do this for the Python parser -- there it would be a lot > easier. And probably the biggest bang for the buck. the-problem-with-exposing-libc-is-that-libc-isn't-worth-exposing Message-ID: <007401becdb6$22445c80$0801a8c0@bobcat> I asked Guido to provide comments on one of the chapters in our book: I was discussing appending the mode ("t" or "b") to the open() call > p.10, bottom: text mode is the default -- I've never seen the 't' > option described! (So even if it exists, better be silent about it.) > You need to append 'b' to get binary mode instead. This brings up an interesting issue. MSVC exposes a global variable that contains the default mode - i.e., you can change the default to binary. (_fmode for those with the docs) This has some implications and questions: * Will Guido ever bow to pressure (when it arrives :) to expose this via the "msvcrt" module? I can imagine where it may be useful in a limited context. A reasonable argument would be that, like _setmode and other MS-specific stuff, if it exists it should be exposed.
* But even if not, due to the shared CRTL, in COM and other worlds we really cant predict what the default is. Although Python does not touch it, that does not stop someone else touching it. A web-server built using MSVC on Windows may use it? Thus, it appears that to be 100% sure what mode you are using, you should not rely on the default, but should _always_ use "b" or "t" on the file mode. Any thoughts or comments? The case for abandoning the CRTL's text mode gets stronger and stronger! Mark. From tim_one at email.msn.com Wed Jul 14 08:35:31 1999 From: tim_one at email.msn.com (Tim Peters) Date: Wed, 14 Jul 1999 02:35:31 -0400 Subject: [Python-Dev] Interrupting a thread In-Reply-To: <006d01becda0$318035e0$0801a8c0@bobcat> Message-ID: <000901becdc3$119be9e0$a09e2299@tim> [Mark Hammond] > Ive struck this a number of times, and the simple question is "can we > make it possible to interrupt a thread without the thread's knowledge" > or otherwise stated "how can we asynchronously raise an exception in > another thread?" I don't think there's any portable way to do this. Even restricting the scope to Windows, forget Python for a moment: can you do this reliably with NT threads from C, availing yourself of every trick in the SDK? Not that I know of; not without crafting a new protocol that the targeted threads agree to in advance. > ... > But-sounds-pretty-easy-under-those-fake-threads-ly, Yes, piece o' cake! Fake threads can do anything, because unless we write every stick of their implementation they can't do anything at all . odd-how-solutions-create-more-problems-than-they-solve-ly y'rs - tim From tim_one at email.msn.com Wed Jul 14 08:35:33 1999 From: tim_one at email.msn.com (Tim Peters) Date: Wed, 14 Jul 1999 02:35:33 -0400 Subject: [Python-Dev] RE: Python on Windows chapter. In-Reply-To: <007401becdb6$22445c80$0801a8c0@bobcat> Message-ID: <000a01becdc3$12d94be0$a09e2299@tim> [Mark Hammond] > ... 
> MSVC exposes a global variable that contains the default [fopen] mode - > ie, you can change the default to binary. (_fmode for those with the > docs) > > This has some implications and questions: > * Will Guido ever bow to pressure (when it arrives :) to expose this via > the "msvcrt" module? No. It changes the advertised semantics of Python builtins, and no option ever does that. If it went in at all, it would have to be exposed as a Python-level feature that changed the semantics similarly on all platforms -- and even then Guido wouldn't put it in . > ... > Thus, it appears that to be 100% sure what mode you are using, you should > not rely on the default, but should _always_ use "b" or "t" on the file > mode. And on platforms that have libc options to treat "t" as if it were "b"? There's no limit to how perverse platform options can get! There's no fully safe ground to stand on, so Python stands on the minimal guarantees libc provides. If a user violates those, tough, they can't use Python. Unless, of course, they contribute a lot of money to the PSA . > ... > Any thoughts or comments? The case for abandoning the CRTL's text mode > gets stronger and stronger! C's text mode is, alas, a bad joke. The only thing worse is Microsoft's half-assed implementation of it <0.5 wink>. ctrl-z-=-eof-even-gets-in-the-way-under-windows!-ly y'rs - tim From MHammond at skippinet.com.au Wed Jul 14 08:58:25 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Wed, 14 Jul 1999 16:58:25 +1000 Subject: [Python-Dev] Interrupting a thread In-Reply-To: <000901becdc3$119be9e0$a09e2299@tim> Message-ID: <007e01becdc6$45982490$0801a8c0@bobcat> > I don't think there's any portable way to do this. Even > restricting the > scope to Windows, forget Python for a moment: can you do > this reliably with > NT threads from C, availing yourself of every trick in the > SDK? Not that I Nope - not if I forget Python. 
However, when I restrict myself _to_ Python, I find this nice little ceval.c loop and nice little per-thread structures - even with nice-looking exception place-holders ;-) Something tells me that it wont be quite as easy as filling these in (while you have the lock, of course!), but it certainly seems far more plausible than if we consider it a C problem :-) > odd-how-solutions-create-more-problems-than-they-solve-ly y'rs - tim Only because they often open your eyes to a whole new class of problem . Continuations/generators/co-routines (even threads themselves!) would appear to be a good example - for all their power, I shudder to think at the number of questions they will generate! If I understand correctly, it is a recognised deficiency WRT signals and threads - so its all Guido's fault for adding these damn threads in the first place :-) just-more-proof-there-is-no-such-thing-as-a-free-lunch-ly, Mark. From jack at oratrix.nl Wed Jul 14 10:07:59 1999 From: jack at oratrix.nl (Jack Jansen) Date: Wed, 14 Jul 1999 10:07:59 +0200 Subject: [Python-Dev] Python bugs database started In-Reply-To: Message by Guido van Rossum , Tue, 13 Jul 1999 12:09:48 -0400 , <199907131609.MAA11208@eric.cnri.reston.va.us> Message-ID: <19990714080759.D49B2303120@snelboot.oratrix.nl> > The Jitterbug software is nice for tracking bugs, but its user > interface *SUCKS*. I wish I had the time to redseign that part -- > unfortunately it's probably totally integrated with the rest of the > code... We looked into bug tracking systems recently, and basically they all suck. We went with gnats in the end, but it has pretty similar problems on the GUI side. 
But maybe we could convince some people with too much time on their hands to do a Python bug reporting system:-) -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From jack at oratrix.nl Wed Jul 14 10:21:16 1999 From: jack at oratrix.nl (Jack Jansen) Date: Wed, 14 Jul 1999 10:21:16 +0200 Subject: [Python-Dev] End of the line In-Reply-To: Message by "Tim Peters" , Tue, 13 Jul 1999 23:56:15 -0400 , <000c01becdac$d2ad6300$7d9e2299@tim> Message-ID: <19990714082116.6DE96303120@snelboot.oratrix.nl> > The Icon implementation is very simple: they *still* open the file in stdio > text mode. "What's natural for the platform" on writing then comes for > free. On reading, libc usually takes care of what's needed, and what > remains is to check for stray '\r' characters that stdio glossed over. This'll work for Unix and PC conventions, but not for the Mac. Mac end of line is \r, so reading a line from a mac file on unix will give you the whole file. -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From tismer at appliedbiometrics.com Wed Jul 14 14:13:10 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Wed, 14 Jul 1999 14:13:10 +0200 Subject: [Python-Dev] Interrupting a thread References: <006d01becda0$318035e0$0801a8c0@bobcat> Message-ID: <378C7ED6.F0DB4E6E@appliedbiometrics.com> Mark Hammond wrote: ... > Another example came up on the newsgroup recently - discussion about making > Medusa a true Windows NT Service. A trivial solution would be to have a > "service thread", that simply runs Medusa's loop in a seperate thread. 
Ah, thanks, that was what I'd like to know :-) > When the "service thread" recieves a shut-down request from NT, how can it > interrupt Medusa? Very simple. I do this shutdown stuff already, at a user request. Medusa has its polling loop which is so simple (wait until a timeout, then run again) that I pulled it out of Medusa, and added a polling function. I have even simulated timer objects by this, which do certain tasks from time to time (at the granularity of the loop of course). One of these looks if there is a global object in module __main__ with a special name which is executable. This happens to be the shutdown, which may be injected by another thread as well. I can send you an example. > I probably should not have started with a Medusa example - it may have a > solution. Pretend I said "any arbitary script written to run similarly to > a Unix daemon". There are one or 2 other cases where I have wanted to > execute existing code that assumes it runs stand-alone, and can really only > be stopped with a KeyboardInterrupt. I can't see a decent way to do this. Well, yes, I would want to have this too, and see also no way. > [I guess this ties into the "signals and threads" limitations - I believe > you cant direct signals at threads either?] > > Is it desirable? Unfortunately, I can see that it might be hard :-( > > But-sounds-pretty-easy-under-those-fake-threads-ly, You mean you would catch every signal in the one thread, and redirect it to the right fake thread. Given exactly two real threads, one always sitting waiting in a multiple select, the other running any number of fake threads. Would this be enough to do everything which is done with threads today? maybe-almost-ly y'rs - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! 
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From guido at CNRI.Reston.VA.US Wed Jul 14 14:24:53 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 14 Jul 1999 08:24:53 -0400 Subject: [Python-Dev] RE: Python on Windows chapter. In-Reply-To: Your message of "Wed, 14 Jul 1999 15:10:38 +1000." <007401becdb6$22445c80$0801a8c0@bobcat> References: <007401becdb6$22445c80$0801a8c0@bobcat> Message-ID: <199907141224.IAA12211@eric.cnri.reston.va.us> > I asked Guido to provide comments on one of the chapters in our book: > > I was discussing appending the mode ("t" or "b") to the open() call > > > p.10, bottom: text mode is the default -- I've never seen the 't' > > option described! (So even if it exists, better be silent about it.) > > You need to append 'b' to get binary mode instead. In addition, 't' probably isn't even supported on many Unix systems! > This brings up an interesting issue. > > MSVC exposes a global variable that contains the default mode - ie, you can > change the default to binary. (_fmode for those with the docs) The best thing to do with this variable is to ignore it. In large programs like Python that link together pieces of code that never ever heard about each other, making global changes to the semantics of standard library functions is a bad thing. Code that sets it or requires you to set it is broken. > This has some implications and questions: > * Will Guido ever bow to pressure (when it arrives :) to expose this via > the "msvcrt" module? I can imagine where it may be useful in a limited > context. A reasonable argument would be that, like _setmode and other MS > specific stuff, if it exists it should be exposed. No. 
(And I've never bought that argument before -- I always use "is there sufficient need and no other way.") > * But even if not, due to the shared CRTL, in COM and other worlds we > really cant predict what the default is. Although Python does not touch > it, that does not stop someone else touching it. A web-server built using > MSVC on Windows may use it? But would be stupid for it to do so, and I would argue that the web server was broken. Since they should know better than this, I doubt they do this (this option is more likely to be used in small, self-contained programs). Until you find a concrete example, let's ignore the possibility. > Thus, it appears that to be 100% sure what mode you are using, you should > not rely on the default, but should _always_ use "b" or "t" on the file > mode. Stop losing sleep over it. > Any thoughts or comments? The case for abandoning the CRTL's text mode > gets stronger and stronger! OK, you write the code :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From akuchlin at mems-exchange.org Wed Jul 14 15:03:07 1999 From: akuchlin at mems-exchange.org (Andrew M. Kuchling) Date: Wed, 14 Jul 1999 09:03:07 -0400 (EDT) Subject: [Python-Dev] Python bugs database started In-Reply-To: <19990714080759.D49B2303120@snelboot.oratrix.nl> References: <199907131609.MAA11208@eric.cnri.reston.va.us> <19990714080759.D49B2303120@snelboot.oratrix.nl> Message-ID: <14220.35467.644552.307210@amarok.cnri.reston.va.us> Jack Jansen writes: >But maybe we could convince some people with too much time on their hands to >do a Python bug reporting system:-) Digicool has a relatively simple bug tracking system for Zope which you can try out at http://www.zope.org/Collector/ . -- A.M. Kuchling http://starship.python.net/crew/amk/ I'm going to dance now, I'm afraid. 
-- Ishtar ends it all, in SANDMAN #45: "Brief Lives:5" From gmcm at hypernet.com Wed Jul 14 16:02:22 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Wed, 14 Jul 1999 09:02:22 -0500 Subject: [Python-Dev] RE: Python on Windows chapter. In-Reply-To: <007401becdb6$22445c80$0801a8c0@bobcat> References: <199907121650.MAA06687@eric.cnri.reston.va.us> Message-ID: <1280165369-10624337@hypernet.com> [Mark] > I asked Guido to provide comments on one of the chapters in our > book: > > I was discussing appending the mode ("t" or "b") to the open() call [Guido] > > p.10, bottom: text mode is the default -- I've never seen the 't' > > option described! (So even if it exists, better be silent about it.) > > You need to append 'b' to get binary mode instead. I hadn't either, until I made the mistake of helping Mr took-6-exchanges-before-he-used-the-right-DLL Embedder, who used it in his code. Certainly not mentioned in man fopen on my Linux box. > This brings up an interesting issue. > > MSVC exposes a global variable that contains the default mode - ie, > you can change the default to binary. (_fmode for those with the > docs) Mentally prepend another underscore. This is something for that other p-language. >... The case for abandoning the CRTL's text > mode gets stronger and stronger! If you're tying this in with Tim's Icon worship, note that in these days of LANS, the issue is yet more complex. It would be dandy if I could read text any old text file and have it look sane, but I may be writing it to a different machine without any way of knowing that. When I bother to manipulate these things, I usually choose to use *nix style text files. But I don't deal with Macs, and the only common Windows tool that can't deal with plain \n is Notepad. 
and-stripcr.py-is-everywhere-available-on-my-Linux-box-ly y'rs - Gordon From guido at CNRI.Reston.VA.US Wed Jul 14 17:05:04 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 14 Jul 1999 11:05:04 -0400 Subject: [Python-Dev] Interrupting a thread In-Reply-To: Your message of "Wed, 14 Jul 1999 12:25:50 +1000." <006d01becda0$318035e0$0801a8c0@bobcat> References: <006d01becda0$318035e0$0801a8c0@bobcat> Message-ID: <199907141505.LAA12313@eric.cnri.reston.va.us> > Ive struck this a number of times, and the simple question is "can we make > it possible to interrupt a thread without the thread's knowledge" or > otherwise stated "how can we asynchronously raise an exception in another > thread?" > > The specific issue is that quite often, I find it necessary to interrupt > one thread from another. One example is Pythonwin - rather than use the > debugger hooks as IDLE does, I use a secondary thread. But how can I use > that thread to interrupt the code executing in the first? (With magic that > only works sometimes is how :-) > > Another example came up on the newsgroup recently - discussion about making > Medusa a true Windows NT Service. A trivial solution would be to have a > "service thread", that simply runs Medusa's loop in a seperate thread. > When the "service thread" recieves a shut-down request from NT, how can it > interrupt Medusa? > > I probably should not have started with a Medusa example - it may have a > solution. Pretend I said "any arbitary script written to run similarly to > a Unix daemon". There are one or 2 other cases where I have wanted to > execute existing code that assumes it runs stand-alone, and can really only > be stopped with a KeyboardInterrupt. I can't see a decent way to do this. > > [I guess this ties into the "signals and threads" limitations - I believe > you cant direct signals at threads either?] > > Is it desirable? 
Unfortunately, I can see that it might be hard :-( > > But-sounds-pretty-easy-under-those-fake-threads-ly, Hmm... Forget about signals -- they're twisted Unixisms (even if they are nominally supported on NT). The interesting thing is that you can interrupt the "main" thread easily (from C) using Py_AddPendingCall() -- this registers a function that will be invoked by the main thread the next time it gets to the top of the VM loop. But the mechanism here was designed with a specific purpose in mind, and it doesn't allow you to aim at a specific thread -- it only works for the main thread. It might be possible to add an API that allows you to specify a thread id though... Of course if the thread to be interrupted is blocked waiting for I/O, this is not going to interrupt the I/O. (On Unix, that's what signals do; is there an equivalent on NT? I don't think so.) Why do you say that your magic only works sometimes? You mailed me your code once and the Python side of it looks okay to me: it calls PyErr_SetInterrupt(), which calls Py_AddPendingCall(), which is threadsafe. Of course it only works if the thread you try to interrupt is recognized by Python as the main thread -- perhaps this is not always under your control, e.g. when COM interferes? Where is this going? Is the answer "provide a C-level API like Py_AddPendingCall() that takes a thread ID" good enough? Note that for IDLE, I have another problem -- how to catch the ^C event when Tk is processing events? --Guido van Rossum (home page: http://www.python.org/~guido/) From tim_one at email.msn.com Wed Jul 14 17:42:14 1999 From: tim_one at email.msn.com (Tim Peters) Date: Wed, 14 Jul 1999 11:42:14 -0400 Subject: [Python-Dev] End of the line In-Reply-To: <19990714082116.6DE96303120@snelboot.oratrix.nl> Message-ID: <000101bece0f$72095c80$f7a02299@tim> [Tim] > On reading, libc usually takes care of what's needed, and what > remains is to check for stray '\r' characters that stdio glossed over. 
[Jack Jansen] > This'll work for Unix and PC conventions, but not for the Mac. > Mac end of line is \r, so reading a line from a mac file on unix will > give you the whole file. I don't see how. Did you look at the code I posted? It treats '\r' the same as '\n', except that when it sees an '\r' it eats a following '\n' (if any) too, and replaces the '\r' with '\n' regardless. Maybe you're missing that Python reads lines one character at a time? So e.g. the behavior of the platform libc fgets is irrelevant. From tim_one at email.msn.com Wed Jul 14 17:53:46 1999 From: tim_one at email.msn.com (Tim Peters) Date: Wed, 14 Jul 1999 11:53:46 -0400 Subject: [Python-Dev] Interrupting a thread In-Reply-To: <007e01becdc6$45982490$0801a8c0@bobcat> Message-ID: <000301bece11$0f0f9d40$f7a02299@tim> [Tim sez there's no portable way to violate another thread "even in C"] [Mark Hammond] > Nope - not if I forget Python. However, when I restrict myself _to_ > Python, I find this nice little ceval.c loop and nice little per-thread > structures - even with nice-looking exception place-holders ;-) Good point! Python does have its own notion of threads. > Something tells me that it wont be quite as easy as filling these > in (while you have the lock, of course!), but it certainly seems far > more plausible than if we consider it a C problem :-) Adding a scheme that builds on the global lock and Python-controlled thread switches may not be prudent if your life's goal is to make Python free-threaded . But if "if you can't beat 'em, join 'em" rules the day, making Py_AddPendingCall thread safe, adding a target thread argument, and fleshing out the XXX Darn! With the advent of thread state, we should have an array of pending calls per thread in the thread state! Later... comment before it, could go a long way toward facilitating groping in the back seat of dad's car . 
cheaper-than-renting-a-motel-room-for-sure-ly y'rs - tim From jack at oratrix.nl Wed Jul 14 17:53:36 1999 From: jack at oratrix.nl (Jack Jansen) Date: Wed, 14 Jul 1999 17:53:36 +0200 Subject: [Python-Dev] End of the line In-Reply-To: Message by "Tim Peters" , Wed, 14 Jul 1999 11:42:14 -0400 , <000101bece0f$72095c80$f7a02299@tim> Message-ID: <19990714155336.94DA8303120@snelboot.oratrix.nl> > [Jack Jansen] > > This'll work for Unix and PC conventions, but not for the Mac. > > Mac end of line is \r, so reading a line from a mac file on unix will > > give you the whole file. > [...] > > Maybe you're missing that Python reads lines one character at a time? So > e.g. the behavior of the platform libc fgets is irrelevant. You're absolutely right... -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From guido at CNRI.Reston.VA.US Wed Jul 14 18:15:12 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 14 Jul 1999 12:15:12 -0400 Subject: [Python-Dev] Python bugs database started In-Reply-To: Your message of "Wed, 14 Jul 1999 09:03:07 EDT." <14220.35467.644552.307210@amarok.cnri.reston.va.us> References: <199907131609.MAA11208@eric.cnri.reston.va.us> <19990714080759.D49B2303120@snelboot.oratrix.nl> <14220.35467.644552.307210@amarok.cnri.reston.va.us> Message-ID: <199907141615.MAA12513@eric.cnri.reston.va.us> > Digicool has a relatively simple bug tracking system for Zope which > you can try out at http://www.zope.org/Collector/ . I asked, and Collector is dead -- but the new offering (Tracker) isn't ready for prime time yet. I'll suffer through Jitterbug until Tracker is out of beta (the first outsider who submitted a bug also did the Reload thing :-). 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From da at ski.org Wed Jul 14 18:14:47 1999
From: da at ski.org (David Ascher)
Date: Wed, 14 Jul 1999 09:14:47 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] Interrupting a thread
In-Reply-To: <1280166671-10546045@hypernet.com>
Message-ID:

On Wed, 14 Jul 1999, Gordon McMillan wrote:

a reply to the python-dev thread on python-list.

You didn't really intend to do that, did you Gordon? =)

--david

From tim_one at email.msn.com Thu Jul 15 06:21:10 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 15 Jul 1999 00:21:10 -0400
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
In-Reply-To: <000901becce4$ac88aa40$31a02299@tim>
Message-ID: <000001bece79$77edf240$51a22299@tim>

Just so Guido doesn't feel like the question is being ignored :

> ...
> How does Scheme do this? [continuations]

One more reference here. Previously sketched Wilson's simple heap
implementation and Dybvig's simple stack one. They're easy to understand,
but are (heap) slow all the time, or (stack) fast most of the time but
horribly slow in some cases.

For the other extreme end of things, check out:

    Representing Control in the Presence of First-Class Continuations
    Robert Hieb, R. Kent Dybvig, and Carl Bruggeman
    PLDI, June 1990
    http://www.cs.indiana.edu/~dyb/papers/stack.ps

In part:

    In this paper we show how stacks can be used to implement activation
    records in a way that is compatible with continuation operations,
    multiple control threads, and deep recursion. Our approach allows
    a small upper bound to be placed on the cost of continuation operations
    and stack overflow and underflow recovery. ... ordinary procedure calls
    and returns are not adversely affected. ... One important feature of
    our method is that the stack is not copied when a continuation is
    captured. Consequently, capturing a continuation is very efficient,
    and objects that are known to have dynamic extent can be stack-allocated
    and modified since they remain in the locations in which they were
    originally allocated. By copying only a small portion of the stack when
    a continuation is reinstated, reinstatement costs are bounded by a
    small constant.

The basic gimmick is a segmented stack, where large segments are
heap-allocated and each contains multiple contiguous frames (across their
code base, only 1% of frames exceeded 30 machine words).

But this is a complicated approach, best suited for industrial-strength
native-code compilers (speed at any cost -- the authors go thru hell to save
an add here, a pointer store there, etc). At least at the time the paper was
written, it was the approach implemented by Dybvig's Chez Scheme (a
commercial native-code Scheme compiler noted for high speed).

Given that Python allocates frames from the heap, I doubt there's a much
faster approach than the one Christian has crafted out of his own sweat and
blood! It's worth a paper of its own.

or-at-least-two-hugs-ly y'rs - tim

From tim_one at email.msn.com Thu Jul 15 09:00:14 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 15 Jul 1999 03:00:14 -0400
Subject: [Python-Dev] RE: Python on Windows chapter.
In-Reply-To: <199907141224.IAA12211@eric.cnri.reston.va.us>
Message-ID: <000301bece8f$b0dd7060$51a22299@tim>

>> I was discussing appending the mode ("t" or "b") to the open() call

> In addition, 't' probably isn't even supported on many Unix systems!

't' is not ANSI C, so there's no guarantee that it's portable. Hate to say
it, but Python should really strip t out before passing a mode string to
fopen!

From tim_one at email.msn.com Thu Jul 15 09:00:18 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 15 Jul 1999 03:00:18 -0400
Subject: [Python-Dev] RE: Python on Windows chapter.
In-Reply-To: <1280165369-10624337@hypernet.com>
Message-ID: <000401bece8f$b2810e40$51a22299@tim>

[Mark]
>> ... The case for abandoning the CRTL's text mode gets stronger
>> and stronger!
[Gordon] > If you're tying this in with Tim's Icon worship, Icon inherits stdio behavior-- for the most part --too. It does define its own mode string characters, though (like "t" for translated and "u" for untranslated); Icon has been ported to platforms that can't even spell libc, let alone support it. > note that in these days of LANS, the issue is yet more complex. It would > be dandy if I could read text any old text file and have it look sane, but > I may be writing it to a different machine without any way of knowing that. So where's the problem? No matter *what* machine you end up on, Python could read the thing fine. Or are you assuming some fantasy world in which people sometimes run software other than Python ? Caveat: give the C std a close reading. It guarantees much less about text mode than anyone who hasn't studied it would believe; e.g., text mode doesn't guarantee to preserve chars with the high bit set, or most control chars either (MS's treatment of CTRL-Z as EOF under text mode conforms to the std!). Also doesn't guarantee to preserve a line-- even if composed of nothing but printable chars --if it's longer than 509(!) characters. That's what I mean when I say stdio's text mode is a bad joke. > When I bother to manipulate these things, I usually choose to use > *nix style text files. But I don't deal with Macs, and the only > common Windows tool that can't deal with plain \n is Notepad. I generally create text files in binary mode, faking the \n convention by hand. Of course, I didn't do this before I became a Windows Guy <0.5 wink>. > and-stripcr.py-is-everywhere-available-on-my-Linux-box-ly y'rs A plug for my linefix.py (Python FTP contrib, under System), which converts among Unix/Windows/Mac in any direction (by default, from any to Unix). 
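The any-to-Unix direction is small enough to sketch. What follows is an illustration only and is not the code in linefix.py; the function name `to_unix_newlines` is an assumption made for this example. It works on bytes read in binary mode, so stdio's text-mode translation never gets a say.

```python
def to_unix_newlines(data: bytes) -> bytes:
    """Hypothetical helper: map Windows (\\r\\n) and Mac (\\r) line ends to \\n."""
    # Handle the two-byte Windows pair first; otherwise the lone-\r pass
    # would turn every \r\n into two newlines.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


# A buffer that mixes all three conventions.
mixed = b"unix\nwindows\r\nmac\rend"
converted = to_unix_newlines(mixed)
```

Writing the result back in binary mode keeps the bare \n on every platform, which is the "faking the \n convention by hand" approach described above.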
who-needs-linux-when-there's-a-python-in-the-window-ly y'rs - tim From MHammond at skippinet.com.au Thu Jul 15 09:16:32 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Thu, 15 Jul 1999 17:16:32 +1000 Subject: [Python-Dev] RE: Python on Windows chapter. In-Reply-To: <000301bece8f$b0dd7060$51a22299@tim> Message-ID: <000801bece91$f80576c0$0801a8c0@bobcat> > 't' is not ANSI C, so there's no guarantee that it's > portable. Hate to say > it, but Python should really strip t out before passing a > mode string to > fopen! OK - thanks all - it is clear that this MS aberration is not, and never will be supported by Python. Not being a standards sort of guy I must admit I assumed both the "t" and "b" were standards. Thanks for the clarifications! Mark. From gstein at lyra.org Thu Jul 15 09:15:20 1999 From: gstein at lyra.org (Greg Stein) Date: Thu, 15 Jul 1999 00:15:20 -0700 Subject: [Python-Dev] RE: Python on Windows chapter. References: <000301bece8f$b0dd7060$51a22299@tim> Message-ID: <378D8A88.583A4DBF@lyra.org> Tim Peters wrote: > > >> I was discussing appending the mode ("t" or "b") to the open() call > > > In addition, 't' probably isn't even supported on many Unix systems! > > 't' is not ANSI C, so there's no guarantee that it's portable. Hate to say > it, but Python should really strip t out before passing a mode string to > fopen! Should we also filter the socket type when creating sockets? Or the address family? What if I pass "bamboozle" as the fopen mode? Should that become "bab" after filtering? Oh, but what about those two "b" characters? Maybe just reduce it to one? We also can't forget to filter chmod() arguments... can't have unknown bits set. etc etc In other words, I think the idea of "stripping out the t" is bunk. Python is not fatherly. It gives you the rope and lets you figure it out for yourself. 
You should know that :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From tim_one at email.msn.com Thu Jul 15 10:59:56 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 15 Jul 1999 04:59:56 -0400 Subject: [Python-Dev] RE: Python on Windows chapter. In-Reply-To: <378D8A88.583A4DBF@lyra.org> Message-ID: <000001becea0$69df8ca0$aea22299@tim> [Tim] > 't' is not ANSI C, so there's no guarantee that it's portable. > Hate to say it, but Python should really strip t out before passing > a mode string to fopen! [Greg Stein] > Should we also filter the socket type when creating sockets? Or the > address family? Filtering 't' is a matter of increasing portability by throwing out an option that doesn't do anything on the platforms that accept it, yet can cause a program to die on platforms that don't -- despite that it says nothing. So it's helpful to toss it, not restrictive. > What if I pass "bamboozle" as the fopen mode? Should that become "bab" > after filtering? Oh, but what about those two "b" characters? Those go far beyond what I suggested, Greg. Even so , it would indeed help a great many non-C programmers if Python defined the mode strings it accepts & barfed on others by default. The builtin open is impossible for a non-C weenie to understand from the docs (as a frustrated sister delights in reminding me). It should be made friendlier. Experts can use a new os.fopen if they need to pass "bamboozle"; fine by me; I do think the builtins should hide as much ill-defined libc crap as possible (btw, "open" is unique in this respect). > Maybe just reduce it to one? We also can't forget to filter chmod() > arguments... can't have unknown bits set. I at least agree that chmod has a miserable UI . > etc etc > > In other words, I think the idea of "stripping out the t" is bunk. > Python is not fatherly. It gives you the rope and lets you figure it out > for yourself. 
You should know that :-) So should Mark -- but we have his testimony that, like most other people, he has no idea what's "std C" and what isn't. In this case he should have noticed that Python's "open" docs don't admit to "t"'s existence either, but even so I see no reason to take comfort in the expectation that he'll eventually be hanged for this sin. ypu-i'd-rather-"open"-died-when-passed-"t"-ly y'rs - tim From guido at cnri.reston.va.us Fri Jul 16 00:29:54 1999 From: guido at cnri.reston.va.us (Guido van Rossum) Date: 15 Jul 1999 18:29:54 -0400 Subject: [Python-Dev] ISPs and Python Message-ID: <5lu2r5czrx.fsf@eric.cnri.reston.va.us> Remember the days when the big problem was to find an ISP who would install Python? Apparently that problem has gone away... The problem is now to get one that installs a decent set of Python extensions :-) See attached c.l.py post. This is similar to the evolution of Python's name recognition -- used to be, managers would say "what's Python?"; then they said "nobody else uses Python"; now presumably they will have to make up some kind ad-hoc no-Python company policy :-) --Guido van Rossum (home page: http://www.python.org/~guido/) ------- Start of forwarded message ------- From: Sim & Golda Zacks Newsgroups: comp.lang.python Subject: Re: htmllib, cgi, HTMLfmt, genCGI, HTMLgen, html, Zope, ... Date: Wed, 14 Jul 1999 00:00:25 -0400 Organization: ExecPC Internet - Milwaukee, WI Message-ID: <7mh1qu$c6m at newsops.execpc.com> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit I am in the exact same situation as you are. I am a web programmer and I'm trying to implement the CGI and database stuff with Python. I am using the HTMLFMT module from the INTERNET PROGRAMMING book and the cgi module from the standard library. 
What the HTMLFMT library does for you is just that you don't have to type in all the tags, basically it's nothing magical, if I didn't have it I would have to make something up and it probably wouldn't be half as good. the standard cgi unit gives you all the fields from the form, and I haven't looked at the cgi modules from the book yet to see if they give me any added benefit. The big problem I came across was my web host, and all of the other ones I talked to, refused to install the mysql interface to Python, and it has to be included in the build (or something like that) So I just installed gadfly, which seems to be working great for me right now. I'm still playing with it not in production yet. I have no idea what ZOPE does, but everyone who talks about it seems to love it. Hope this helps Sim Zacks [...] ------- End of forwarded message ------- From mhammond at skippinet.com.au Fri Jul 16 01:21:40 1999 From: mhammond at skippinet.com.au (Mark Hammond) Date: Fri, 16 Jul 1999 09:21:40 +1000 Subject: [Python-Dev] ISPs and Python In-Reply-To: <5lu2r5czrx.fsf@eric.cnri.reston.va.us> Message-ID: <001001becf18$cb850610$0801a8c0@bobcat> > Remember the days when the big problem was to find an ISP who would > install Python? Apparently that problem has gone away... The problem > is now to get one that installs a decent set of Python extensions :-) he he. Yes, hence I believe the general agreement exists that we should begin to focus on these more external issues than the language itself. Pity we all agree, but are still such hackers :-) > looked at the cgi modules from the book yet to see if they > give me any added > benefit. The big problem I came across was my web host, and > all of the other From mal at lemburg.com Fri Jul 16 09:44:20 1999 From: mal at lemburg.com (M.-A. 
Lemburg) Date: Fri, 16 Jul 1999 09:44:20 +0200 Subject: [Python-Dev] ISPs and Python References: <001001becf18$cb850610$0801a8c0@bobcat> Message-ID: <378EE2D4.A67F5BD@lemburg.com> Mark Hammond wrote: > > > Remember the days when the big problem was to find an ISP who would > > install Python? Apparently that problem has gone away... The problem > > is now to get one that installs a decent set of Python extensions :-) > > he he. Yes, hence I believe the general agreement exists that we should > begin to focus on these more external issues than the language itself. > Pity we all agree, but are still such hackers :-) > > > looked at the cgi modules from the book yet to see if they > > give me any added > > benefit. The big problem I came across was my web host, and > > all of the other > > >From the ISP's POV, this is reasonable. I wouldnt be surprised to find > they started with the same policy for Perl. The issue is less likely to be > anything to do with Python, but to do with stability. If every client was > allowed to install their own extension, then that could wreak havoc. Some > ISPs will allow a private Python build, but some only allow you to use > their shared version, which they obviously want kept pretty stable. > > The answer would seem to be to embrace MALs efforts. Not only should we be > looking at pre-compiled (as I believe his effort is) but also towards > "batteries included, plus spare batteries, wall charger, car charger and > solar panels". ISP targetted installations with _many_ extensions > installed could be very useful - who cares if it is 20MB - if they dont > want that, let then do it manually with the standard installation like > everyone else. mxCGIPython is a project aimed at exactly this situation. The only current caveat with it is that the binaries are not capable of loading shared extensions (maybe some linker guru could help here). 
In summary the cgipython binaries are complete Python interpreters with a frozen Python standard lib included. This means that you only need to install a single file on your ISP account and you're set for CGI/Python. More infos + the binaries are available here: http://starship.skyport.net/~lemburg/mxCGIPython.html The package could also be tweaked to include a set of common extensions, I suppose, since it uses freeze.py to do most of the job. > There could almost be commercial scope here for a support company. > Offering ISP/Corporate specific CDs and support. Installations targetted > at machines shared among a huge number of users, with almost every common > Python extension any of these users would need. Corporates and ISPs may > pay far more handsomly than individuals for this kind of stuff. > > I know I am ranting still, but I repeat my starting point that addressing > issues like this are IMO the single best thing we could do for Python. We > could leave the language along for 2 years, and come back to it when this > shite is better under control :-) Naa, that would spoil all the fun ;-) But anyways, going commercial with Python is not that far-fetched anymore nowadays... something like what the Linux distributors are doing for Linux could probably also be done with Python. Which brings us back to the package name topic or better the import mechanism... 
-- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 168 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From skip at mojam.com Fri Jul 16 20:04:58 1999 From: skip at mojam.com (Skip Montanaro) Date: Fri, 16 Jul 1999 14:04:58 -0400 (EDT) Subject: [Python-Dev] Python bugs database started In-Reply-To: <14219.20501.697542.358579@anthem.cnri.reston.va.us> References: <199907122004.QAA09348@eric.cnri.reston.va.us> <000701becce4$a973c920$31a02299@tim> <14219.20501.697542.358579@anthem.cnri.reston.va.us> Message-ID: <14223.29664.66832.630010@94.chicago-33-34rs.il.dial-access.att.net> TP> The first time I submitted a bug, I backed up to the entry page and TP> hit Refresh to get the category counts updated (never saw Jitterbug TP> before, so must play!). IE5 whined about something-or-other being TP> out of date, and would I like to "repost the data"? I said sure. Barry> This makes perfect sense, and explains exactly what's going on. Barry> Let's call it "poor design"[1] instead of "user error". A quick Barry> scan last night of the Jitterbug site shows no signs of fixes or Barry> workarounds. What would Jitterbug have to do to avoid these Barry> kinds of problems? If the submission form uses METHOD=GET instead of METHOD=POST, the backup problem should go away. Skip (finally hobbling through my email after the move to Illinois...) From tim_one at email.msn.com Sun Jul 18 09:06:16 1999 From: tim_one at email.msn.com (Tim Peters) Date: Sun, 18 Jul 1999 03:06:16 -0400 Subject: [Python-Dev] End of the line In-Reply-To: <199907131253.IAA10730@eric.cnri.reston.va.us> Message-ID: <000b01bed0ec$075c47a0$36a02299@tim> > The latest versions of the Icon language [convert \r\n, \r and \n to > plain \n in text mode upon read, and convert \n to the platform convention > on write] It's a trend : the latest version of the REBOL language also does this. 
The Java compiler does it for Java source files, but I don't know how
runtime file read/write work in Java.

Anyone know offhand if there's a reliable way to determine whether an open
file descriptor (a C FILE*) is seekable?

if-i'm-doomed-to-get-obsessed-by-this-may-as-well-make-it-faster-
    too-ly y'rs - tim

From mal at lemburg.com Sun Jul 18 22:29:43 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sun, 18 Jul 1999 22:29:43 +0200
Subject: [Python-Dev] End of the line
References: <000b01bed0ec$075c47a0$36a02299@tim>
Message-ID: <37923937.4E73E8D8@lemburg.com>

Tim Peters wrote:
>
> Anyone know offhand if there's a reliable way to determine whether an open
> file descriptor (a C FILE*) is seekable?

I'd simply use trial&error:

    if (fseek(stream, 0, SEEK_CUR) < 0) {
        if (errno != EBADF) {
            /* Not seekable */
            errno = 0;
        }
        else
            /* Error */ ;
    }
    else
        /* Seekable */ ;

How to get this thread safe is left as exercise to the interested
reader ;)

Cheers,
--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 166 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From da at ski.org Thu Jul 22 01:41:28 1999
From: da at ski.org (David Ascher)
Date: Wed, 21 Jul 1999 16:41:28 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] Perl 5.6 'feature list'
Message-ID:

Not all that exciting, but good to know what they're doing:

http://www.perl.com/cgi-bin/pace/pub/1999/06/perl5-6.html

From tim_one at email.msn.com Thu Jul 22 04:52:26 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Wed, 21 Jul 1999 22:52:26 -0400
Subject: [Python-Dev] Perl 5.6 'feature list'
In-Reply-To:
Message-ID: <000f01bed3ed$3b509800$642d2399@tim>

[David Ascher]
> Not all that exciting, but good to know what they're doing:
>
> http://www.perl.com/cgi-bin/pace/pub/1999/06/perl5-6.html

It is good to know, and I didn't, so thanks for passing that on! I see
they're finally stealing Python's version numbering scheme .

In other news, I just noticed that REBOL threw 1st-class continuations *out* of the language, leaving just the "escape up the current call chain" exception-handling (throw/catch) kind. This isn't an open project, so it's hard to second-guess why. Or easy, depending on how you look at it.

i-suggest-looking-at-it-the-right-way-ly y'rs  - tim

From jim at digicool.com Thu Jul 22 14:15:08 1999
From: jim at digicool.com (Jim Fulton)
Date: Thu, 22 Jul 1999 08:15:08 -0400
Subject: [Python-Dev] I'd like list.pop to accept an optional second argument giving a default value
Message-ID: <37970B4C.8E8C741E@digicool.com>

I like the list pop method because it provides a way to use lists as thread safe queues and stacks (since append and pop are protected by the global interpreter lock).

With pop, you can essentially test whether the list is empty and get a value if it isn't in one atomic operation:

  try:
      foo=queue.pop(0)
  except IndexError:
      ... empty queue case
  else:
      ... non-empty case, do something with foo

Unfortunately, this incurs exception overhead. I'd rather do something like:

  foo=queue.pop(0,marker)
  if foo is marker:
      ... empty queue case
  else:
      ... non-empty case, do something with foo

I'd be happy to provide a patch.

Jim

--
Jim Fulton           mailto:jim at digicool.com   Python Powered!
Technical Director   (888) 344-4332            http://www.python.org
Digital Creations    http://www.digicool.com   http://www.zope.org

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats.
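
A minimal sketch of the calling convention Jim is proposing, written as a free function rather than as a change to the built-in. The names `pop_default` and `_MISSING` are hypothetical and appear nowhere in the thread; `list.pop` itself accepts only an index. Because the helper still catches `IndexError` internally, it illustrates the semantics only, not any overhead savings a C-level implementation might bring.

```python
# Hypothetical helper illustrating the proposed list.pop(index, default).
_MISSING = object()  # private sentinel meaning "no default supplied"


def pop_default(lst, index=-1, default=_MISSING):
    """Pop lst[index]; on a bad index return `default` if one was given."""
    try:
        return lst.pop(index)
    except IndexError:
        if default is _MISSING:
            raise  # no default supplied: IndexError propagates as before
        return default


# Draining a list used as a queue, in the style of Jim's second snippet.
queue = ["a", "b"]
marker = object()
drained = []
while True:
    foo = pop_default(queue, 0, marker)
    if foo is marker:
        break  # empty queue case
    drained.append(foo)  # non-empty case, do something with foo
```

Using a fresh `object()` as the marker means the test `foo is marker` cannot be fooled by any value legitimately stored in the queue.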
From fredrik at pythonware.com Thu Jul 22 15:14:50 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu, 22 Jul 1999 15:14:50 +0200 Subject: [Python-Dev] Perl 5.6 'feature list' References: Message-ID: <001501bed444$2f5dbe90$f29b12c2@secret.pythonware.com> David Ascher wrote: > Not all that exciting, but good to know what they're doing: > > http://www.perl.com/cgi-bin/pace/pub/1999/06/perl5-6.html well, "unicode all the way down" and "language level event loop" sounds pretty exciting to me... (but christian's work beats it all, of course...) From skip at mojam.com Thu Jul 22 16:24:53 1999 From: skip at mojam.com (Skip Montanaro) Date: Thu, 22 Jul 1999 09:24:53 -0500 (CDT) Subject: [Python-Dev] I'd like list.pop to accept an optional second argument giving a default value In-Reply-To: <37970B4C.8E8C741E@digicool.com> References: <37970B4C.8E8C741E@digicool.com> Message-ID: <14231.10515.423401.512972@153.chicago-41-42rs.il.dial-access.att.net> Jim> I like the list pop method because it provides a way to use lists Jim> as thread safe queues and stacks (since append and pop are Jim> protected by the global interpreter lock). The global interpreter lock is a property of the current implementation of Python, not of the language itself. At one point in the past Greg Stein created a set of patches that eliminated the lock. While it's perhaps convenient to use now, it may not always exist. I'm not so sure that it should be used as a motivator for changes to libraries in the standard distribution. 
Skip From jim at digicool.com Thu Jul 22 16:47:13 1999 From: jim at digicool.com (Jim Fulton) Date: Thu, 22 Jul 1999 10:47:13 -0400 Subject: [Python-Dev] I'd like list.pop to accept an optional second argument giving a defaultvalue References: <37970B4C.8E8C741E@digicool.com> <14231.10515.423401.512972@153.chicago-41-42rs.il.dial-access.att.net> Message-ID: <37972EF1.372C2CB1@digicool.com> Skip Montanaro wrote: > > Jim> I like the list pop method because it provides a way to use lists > Jim> as thread safe queues and stacks (since append and pop are > Jim> protected by the global interpreter lock). > > The global interpreter lock is a property of the current implementation of > Python, not of the language itself. At one point in the past Greg Stein > created a set of patches that eliminated the lock. While it's perhaps > convenient to use now, it may not always exist. I'm not so sure that it > should be used as a motivator for changes to libraries in the standard > distribution. If the global interpreter lock goes away, then some other locking mechanism will be used to make built-in object operations atomic. For example, in Greg's changes, each list was protected by a list lock. The key is that pop combines checking for an empty list and removing an element into a single operation. As long as the operations append and pop are atomic, then lists can be used as thread-safe stacks and queues. The benefit of the proposal does not really depend on the global interpreter lock. It only depends on list operations being atomic. Jim -- Jim Fulton mailto:jim at digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. 
Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats.

From gmcm at hypernet.com Thu Jul 22 18:07:31 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Thu, 22 Jul 1999 11:07:31 -0500
Subject: [Python-Dev] I'd like list.pop to accept an optional second
In-Reply-To: <37970B4C.8E8C741E@digicool.com>
Message-ID: <1279466648-20991135@hypernet.com>

Jim Fulton writes:

> With pop, you can essentially test whether the list is
> empty and get a value if it isn't in one atomic operation:
>
>   try:
>       foo=queue.pop(0)
>   except IndexError:
>       ... empty queue case
>   else:
>       ... non-empty case, do something with foo
>
> Unfortunately, this incurs exception overhead. I'd rather do
> something like:
>
>   foo=queue.pop(0,marker)
>   if foo is marker:
>       ... empty queue case
>   else:
>       ... non-empty case, do something with foo

I'm assuming you're asking for the equivalent of:

  def pop(self, default=None):

much like dict.get?

Then how do I get the old behavior? (I've been known to do odd things - like change behavior based on the number of args - in extension modules, but this ain't an extension).

- Gordon

From fredrik at pythonware.com Thu Jul 22 17:23:00 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu, 22 Jul 1999 17:23:00 +0200
Subject: [Python-Dev] End of the line
References: <000001beccfb$9beb4fa0$2f9e2299@tim>
Message-ID: <009901bed456$161a4950$f29b12c2@secret.pythonware.com>

Tim Peters wrote:
> The latest versions of the Icon language (9.3.1 & beyond) sprouted an
> interesting change in semantics: if you open a file for reading in
> "translated" (text) mode now, it normalizes Unix, Mac and Windows line
> endings to plain \n. Writing in text mode still produces what's natural for
> the platform.
>
> Anyone think that's *not* a good idea?

if we were to change this, how would you tell Python to open a file in text mode?
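
The read-side translation described in the quoted text (Unix, Mac and Windows line endings all becoming plain \n) can be sketched as a small function over raw bytes. This only illustrates the translation rule; `normalize_newlines` is a made-up name, and nothing here is meant to reflect how Icon, REBOL or Python actually implement text mode.

```python
def normalize_newlines(data: bytes) -> bytes:
    """Map CRLF (Windows), lone CR (Mac) and LF (Unix) endings to LF."""
    # Replace the two-byte CRLF pair first; otherwise the CR and the LF
    # would each become a newline and every Windows line would double up.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


# One buffer mixing all three conventions.
mixed = b"unix\nwindows\r\nmac\rlast"
lines = normalize_newlines(mixed).split(b"\n")
```

The order of the two replacements is the only subtle point: swapping them would turn each CRLF into two line breaks.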
From jim at digicool.com Thu Jul 22 17:30:22 1999 From: jim at digicool.com (Jim Fulton) Date: Thu, 22 Jul 1999 11:30:22 -0400 Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <1279466648-20991135@hypernet.com> Message-ID: <3797390E.50972562@digicool.com> Gordon McMillan wrote: > > Then how do I get the old behavior? Just pass 0 or 1 argument. >(I've been known to do odd > things - like change behavior based on the number of args - in > extension modules, but this ain't an extension). It *is* a built-in method. It will be handled just like dictionaries handle the second argument to get. Jim -- Jim Fulton mailto:jim at digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From gmcm at hypernet.com Thu Jul 22 18:33:06 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Thu, 22 Jul 1999 11:33:06 -0500 Subject: [Python-Dev] I'd like list.pop to accept an optional second In-Reply-To: <3797390E.50972562@digicool.com> Message-ID: <1279465114-21083404@hypernet.com> Jim Fulton wrote: > Gordon McMillan wrote: > > > > Then how do I get the old behavior? > > Just pass 0 or 1 argument. > > >(I've been known to do odd > > things - like change behavior based on the number of args - in > > extension modules, but this ain't an extension). > > It *is* a built-in method. It will be handled just like > dictionaries handle the second argument to get. d.get(nonexistantkey) does not throw an exception, it returns None. If list.pop() does not throw an exception when list is empty, it's new behavior. 
Which are you asking for: breaking code that expects IndexError Violating Pythonic expectations by, in effect, creating 2 methods list.pop(void) list.pop(default_return) - Gordon From jim at digicool.com Thu Jul 22 17:44:22 1999 From: jim at digicool.com (Jim Fulton) Date: Thu, 22 Jul 1999 11:44:22 -0400 Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <1279465114-21083404@hypernet.com> Message-ID: <37973C56.5ACFFDEC@digicool.com> Gordon McMillan wrote: > > Jim Fulton wrote: > > > Gordon McMillan wrote: > > > > > > Then how do I get the old behavior? > > > > Just pass 0 or 1 argument. > > > > >(I've been known to do odd > > > things - like change behavior based on the number of args - in > > > extension modules, but this ain't an extension). > > > > It *is* a built-in method. It will be handled just like > > dictionaries handle the second argument to get. > > d.get(nonexistantkey) does not throw an exception, it returns None. Oops, I'd forgotten that. > If list.pop() does not throw an exception when list is empty, it's > new behavior. > > Which are you asking for: > breaking code that expects IndexError No. > Violating Pythonic expectations by, in effect, creating 2 methods > list.pop(void) > list.pop(default_return) Yes, except that I disagree that this is non-pythonic. Jim -- Jim Fulton mailto:jim at digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From mal at lemburg.com Thu Jul 22 19:27:53 1999 From: mal at lemburg.com (M.-A. 
Lemburg) Date: Thu, 22 Jul 1999 19:27:53 +0200 Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <1279465114-21083404@hypernet.com> <37973C56.5ACFFDEC@digicool.com> Message-ID: <37975499.FB61E4E3@lemburg.com> Jim Fulton wrote: > > > Violating Pythonic expectations by, in effect, creating 2 methods > > list.pop(void) > > list.pop(default_return) > > Yes, except that I disagree that this is non-pythonic. Wouldn't a generic builtin for these kinds of things be better, e.g. a function returning a default value in case an exception occurs... something like: tryexcept(list.pop(), IndexError, default) which returns default in case an IndexError occurs. Don't think this would be much faster that the explicit try:...except: though... -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 162 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From gmcm at hypernet.com Thu Jul 22 18:54:58 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Thu, 22 Jul 1999 11:54:58 -0500 Subject: [Python-Dev] I'd like list.pop to accept an optional second In-Reply-To: <37973C56.5ACFFDEC@digicool.com> Message-ID: <1279463517-21179480@hypernet.com> Jim Fulton wrote: > > Gordon McMillan wrote: ... > > Violating Pythonic expectations by, in effect, creating 2 methods > > list.pop(void) > > list.pop(default_return) > > Yes, except that I disagree that this is non-pythonic. > I'll leave the final determination to Mr. Python, but I disagree. Offhand I can't think of a built-in that can't be expressed in normal Python notation, where "optional" args are really defaulted args. Which would lead us to either a new list method, or redefining pop: def pop(usedefault=0, default=None) and making you use 2 args. But maybe I've missed a precedent because I'm so used to it. 
(Hmm, I guess string.split is a sort-of precedent, because the first default arg behaves differently than anything you could pass in). - Gordon From bwarsaw at cnri.reston.va.us Thu Jul 22 20:33:57 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Thu, 22 Jul 1999 14:33:57 -0400 (EDT) Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <1279465114-21083404@hypernet.com> <37973C56.5ACFFDEC@digicool.com> <37975499.FB61E4E3@lemburg.com> Message-ID: <14231.25621.888844.205034@anthem.cnri.reston.va.us> >>>>> "M" == M writes: M> Wouldn't a generic builtin for these kinds of things be M> better, e.g. a function returning a default value in case M> an exception occurs... something like: M> tryexcept(list.pop(), IndexError, default) M> which returns default in case an IndexError occurs. Don't think M> this would be much faster that the explicit try:...except: M> though... Don't know if this would be better (or useful, etc.), but it could possibly be faster than explicit try/except, because with try/except you have to instantiate the exception object. Presumably tryexcept() -- however it was spelled -- would catch the exception in C, thus avoiding the overhead of exception object instantiation. -Barry From bwarsaw at cnri.reston.va.us Thu Jul 22 20:36:09 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Thu, 22 Jul 1999 14:36:09 -0400 (EDT) Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <3797390E.50972562@digicool.com> <1279465114-21083404@hypernet.com> Message-ID: <14231.25753.710299.405579@anthem.cnri.reston.va.us> >>>>> "Gordo" == Gordon McMillan writes: Gordo> Which are you asking for: breaking code that expects Gordo> IndexError Violating Pythonic expectations by, in effect, Gordo> creating 2 methods Gordo> list.pop(void) Gordo> list.pop(default_return) The docs /do/ say that list.pop() is experimental, so that probably gives Guido all the out he'd need to change the semantics :). 
I myself have yet to use list.pop() so I don't know how disastrous the change in semantics would be to existing code.

-Barry

From jim at digicool.com Thu Jul 22 18:49:33 1999
From: jim at digicool.com (Jim Fulton)
Date: Thu, 22 Jul 1999 12:49:33 -0400
Subject: [Python-Dev] I'd like list.pop to accept an optional second
References: <1279463517-21179480@hypernet.com>
Message-ID: <37974B9D.59C2D45E@digicool.com>

Gordon McMillan wrote:
>
> Offhand I can't think of a built-in that can't be expressed in normal
> Python notation, where "optional" args are really defaulted args.

I can define the pop I want in Python as follows:

  _marker=[]

  class list:
     ...
     def pop(self, index=-1, default=_marker):
         try: v=self[index]
         except IndexError:
             if default is not _marker:
                 return default
             if self: m='pop index out of range'
             else:    m='pop from empty list'
             raise IndexError, m
         del self[index]
         return v

Although I'm not sure why the "pythonicity" of an interface should depend on its implementation.

Jim

--
Jim Fulton           mailto:jim at digicool.com   Python Powered!
Technical Director   (888) 344-4332            http://www.python.org
Digital Creations    http://www.digicool.com   http://www.zope.org

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats.

From jim at digicool.com Thu Jul 22 18:53:26 1999
From: jim at digicool.com (Jim Fulton)
Date: Thu, 22 Jul 1999 12:53:26 -0400
Subject: [Python-Dev] I'd like list.pop to accept an optional second
References: <1279463517-21179480@hypernet.com>
Message-ID: <37974C86.2EC53BE7@digicool.com>

BTW, a good precedent for what I want is getattr.

  getattr(None,'spam')

raises an error, but:

  getattr(None,'spam',1)

returns 1

Jim

--
Jim Fulton           mailto:jim at digicool.com   Python Powered!
Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From bwarsaw at cnri.reston.va.us Thu Jul 22 21:02:21 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Thu, 22 Jul 1999 15:02:21 -0400 (EDT) Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <1279463517-21179480@hypernet.com> <37974C86.2EC53BE7@digicool.com> Message-ID: <14231.27325.387718.435420@anthem.cnri.reston.va.us> >>>>> "JF" == Jim Fulton writes: JF> BTW, a good precedent for what I want JF> is getattr. JF> getattr(None,'spam') JF> raises an error, but: JF> getattr(None,'spam',1) JF> returns 1 Okay, how did this one sneak in, huh? I didn't even realize this had been added to getattr()! CVS reveals it was added b/w 1.5.1 and 1.5.2a1, so maybe I just missed the checkin message. Fred, the built-in-funcs doc needs updating: http://www.python.org/doc/current/lib/built-in-funcs.html FWIW, the CVS log message says this feature is experimental too. :) -Barry From jim at digicool.com Thu Jul 22 21:20:46 1999 From: jim at digicool.com (Jim Fulton) Date: Thu, 22 Jul 1999 15:20:46 -0400 Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <1279463517-21179480@hypernet.com> <37974C86.2EC53BE7@digicool.com> <14231.27325.387718.435420@anthem.cnri.reston.va.us> Message-ID: <37976F0E.DFB4067B@digicool.com> "Barry A. Warsaw" wrote: > > >>>>> "JF" == Jim Fulton writes: > > JF> BTW, a good precedent for what I want > JF> is getattr. > > JF> getattr(None,'spam') > > JF> raises an error, but: > > JF> getattr(None,'spam',1) > > JF> returns 1 > > Okay, how did this one sneak in, huh? I don't know. 
Someone told me about it. I find it wildly useful. > I didn't even realize this had > been added to getattr()! CVS reveals it was added b/w 1.5.1 and > 1.5.2a1, so maybe I just missed the checkin message. > > Fred, the built-in-funcs doc needs updating: > > http://www.python.org/doc/current/lib/built-in-funcs.html > > FWIW, the CVS log message says this feature is experimental too. :) Eek! I want it to stay! I also really like list.pop. :) Jim -- Jim Fulton mailto:jim at digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From fdrake at cnri.reston.va.us Thu Jul 22 21:26:32 1999 From: fdrake at cnri.reston.va.us (Fred L. Drake) Date: Thu, 22 Jul 1999 15:26:32 -0400 (EDT) Subject: [Python-Dev] I'd like list.pop to accept an optional second In-Reply-To: <14231.27325.387718.435420@anthem.cnri.reston.va.us> References: <1279463517-21179480@hypernet.com> <37974C86.2EC53BE7@digicool.com> <14231.27325.387718.435420@anthem.cnri.reston.va.us> Message-ID: <14231.28776.160422.442859@weyr.cnri.reston.va.us> >>>>> "JF" == Jim Fulton writes: JF> BTW, a good precedent for what I want JF> is getattr. JF> getattr(None,'spam') JF> raises an error, but: JF> getattr(None,'spam',1) JF> returns 1 Barry A. Warsaw writes: > Fred, the built-in-funcs doc needs updating: This is done in the CVS repository; thanks for pointing out the oversight! Do people realize that pop() already has an optional parameter? That *is* in the docs: http://www.python.org/docs/current/lib/typesseq-mutable.html See note 4 below the table. -Fred -- Fred L. Drake, Jr. 
Corporation for National Research Initiatives From bwarsaw at cnri.reston.va.us Thu Jul 22 21:37:20 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Thu, 22 Jul 1999 15:37:20 -0400 (EDT) Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <1279463517-21179480@hypernet.com> <37974C86.2EC53BE7@digicool.com> <14231.27325.387718.435420@anthem.cnri.reston.va.us> <37976F0E.DFB4067B@digicool.com> Message-ID: <14231.29424.569863.149366@anthem.cnri.reston.va.us> >>>>> "JF" == Jim Fulton writes: JF> I don't know. Someone told me about it. I find it JF> wildly useful. No kidding! :) From mal at lemburg.com Thu Jul 22 22:32:23 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Thu, 22 Jul 1999 22:32:23 +0200 Subject: [Python-Dev] Importing extension modules Message-ID: <37977FD7.BD7A9826@lemburg.com> I'm currently testing a pure Python version of mxDateTime (my date/time package), which uses a contributed Python version of the C extension. Now, to avoid problems with pickled DateTime objects (they include the complete module name), I would like to name *both* the Python and the C extension version mxDateTime. With the current lookup scheme (shared mods are searched before Python modules) this is no problem since the shared mod is found before the Python version and used instead, so getting this working is rather simple. The question is: will this setup remain a feature in future versions of Python ? (Does it work this way on all platforms ?) Cheers, -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 162 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From mal at lemburg.com Thu Jul 22 22:45:24 1999 From: mal at lemburg.com (M.-A. 
Lemburg) Date: Thu, 22 Jul 1999 22:45:24 +0200 Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <1279463517-21179480@hypernet.com> <37974C86.2EC53BE7@digicool.com> <14231.27325.387718.435420@anthem.cnri.reston.va.us> <37976F0E.DFB4067B@digicool.com> Message-ID: <379782E4.7DC79460@lemburg.com> Jim Fulton wrote: > > [getattr(obj,name[,default])] > > Okay, how did this one sneak in, huh? > > I don't know. Someone told me about it. I find it > wildly useful. Me too... ;-) > > I didn't even realize this had > > been added to getattr()! CVS reveals it was added b/w 1.5.1 and > > 1.5.2a1, so maybe I just missed the checkin message. http://www.deja.com/getdoc.xp?AN=366635977 -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 162 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From tismer at appliedbiometrics.com Thu Jul 22 22:50:42 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Thu, 22 Jul 1999 22:50:42 +0200 Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <1279463517-21179480@hypernet.com> <37974C86.2EC53BE7@digicool.com> <14231.27325.387718.435420@anthem.cnri.reston.va.us> <37976F0E.DFB4067B@digicool.com> Message-ID: <37978422.F36BB130@appliedbiometrics.com> > > Fred, the built-in-funcs doc needs updating: > > > > http://www.python.org/doc/current/lib/built-in-funcs.html > > > > FWIW, the CVS log message says this feature is experimental too. :) > > Eek! I want it to stay! > > I also really like list.pop. :) Seconded! Also, things which appeared between some alphas and made it upto the final, are just there. It would be fair to update the CVS tree and say the features made it into the dist, even if it just was a mistake not to remove them in time. It was time enough. ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! 
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From bwarsaw at cnri.reston.va.us Thu Jul 22 22:50:36 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Thu, 22 Jul 1999 16:50:36 -0400 (EDT) Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <1279463517-21179480@hypernet.com> <37974C86.2EC53BE7@digicool.com> <14231.27325.387718.435420@anthem.cnri.reston.va.us> <37976F0E.DFB4067B@digicool.com> <379782E4.7DC79460@lemburg.com> Message-ID: <14231.33820.422195.45250@anthem.cnri.reston.va.us> >>>>> "M" == M writes: M> http://www.deja.com/getdoc.xp?AN=366635977 Ah, thanks! Your rationale was exactly the reason why I added dict.get(). I'm still not 100% sure about list.pop() though, since it's not exactly equivalent -- list.pop() modifies the list as a side-effect :) Makes me think you might want an alternative spelling for list[s], call it list.get() and put the optional default on that method. Then again, maybe list.pop() with an optional default is good enough. -Barry From jim at digicool.com Thu Jul 22 22:55:05 1999 From: jim at digicool.com (Jim Fulton) Date: Thu, 22 Jul 1999 16:55:05 -0400 Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <1279463517-21179480@hypernet.com> <37974C86.2EC53BE7@digicool.com> <14231.27325.387718.435420@anthem.cnri.reston.va.us> <37976F0E.DFB4067B@digicool.com> <379782E4.7DC79460@lemburg.com> <14231.33820.422195.45250@anthem.cnri.reston.va.us> Message-ID: <37978529.B1AC5273@digicool.com> "Barry A. Warsaw" wrote: > > >>>>> "M" == M writes: > > M> http://www.deja.com/getdoc.xp?AN=366635977 > > Ah, thanks! Your rationale was exactly the reason why I added > dict.get(). 
I'm still not 100% sure about list.pop() though, since > it's not exactly equivalent -- list.pop() modifies the list as a > side-effect :) Makes me think you might want an alternative spelling > for list[s], call it list.get() and put the optional default on that > method. Then again, maybe list.pop() with an optional default is good > enough. list.get and list.pop are different, since get wouldn't modify the list and pop would. Jim -- Jim Fulton mailto:jim at digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From bwarsaw at cnri.reston.va.us Thu Jul 22 23:13:49 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Thu, 22 Jul 1999 17:13:49 -0400 (EDT) Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <1279463517-21179480@hypernet.com> <37974C86.2EC53BE7@digicool.com> <14231.27325.387718.435420@anthem.cnri.reston.va.us> <37976F0E.DFB4067B@digicool.com> <379782E4.7DC79460@lemburg.com> <14231.33820.422195.45250@anthem.cnri.reston.va.us> <37978529.B1AC5273@digicool.com> Message-ID: <14231.35214.1590.898304@anthem.cnri.reston.va.us> >>>>> "JF" == Jim Fulton writes: JF> list.get and list.pop are different, since get wouldn't modify JF> the list and pop would. Right. Would we need them both? 
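
To make the distinction between the two spellings concrete: a `get` with a default only reads, while a `pop` with a default also removes the element. Neither `list.get` nor a defaulted `list.pop` exists; `seq_get` and `seq_pop` below are hypothetical stand-ins written as plain functions. For brevity they always swallow `IndexError`, which is simpler than the proposal in this thread, where omitting the default keeps the exception.

```python
def seq_get(seq, index, default=None):
    """Return seq[index], or `default` on a bad index; seq is unchanged."""
    try:
        return seq[index]
    except IndexError:
        return default


def seq_pop(seq, index=-1, default=None):
    """Remove and return seq[index], or return `default` on a bad index."""
    try:
        return seq.pop(index)
    except IndexError:
        return default


stack = [10, 20]
peeked = seq_get(stack, -1)         # reads the last item only
size_after_get = len(stack)         # the list is untouched
popped = seq_pop(stack)             # removes the last item
size_after_pop = len(stack)         # the list has shrunk by one
missing = seq_get(stack, 5, "n/a")  # bad index falls back to the default
```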
From jim at digicool.com Thu Jul 22 23:36:03 1999 From: jim at digicool.com (Jim Fulton) Date: Thu, 22 Jul 1999 17:36:03 -0400 Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <1279463517-21179480@hypernet.com> <37974C86.2EC53BE7@digicool.com> <14231.27325.387718.435420@anthem.cnri.reston.va.us> <37976F0E.DFB4067B@digicool.com> <379782E4.7DC79460@lemburg.com> <14231.33820.422195.45250@anthem.cnri.reston.va.us> <37978529.B1AC5273@digicool.com> <14231.35214.1590.898304@anthem.cnri.reston.va.us> Message-ID: <37978EC3.CAAF2632@digicool.com> "Barry A. Warsaw" wrote: > > >>>>> "JF" == Jim Fulton writes: > > JF> list.get and list.pop are different, since get wouldn't modify > JF> the list and pop would. > > Right. Would we need them both? Sure. Since a sequence is sort of a special kind of mapping, get makes sense. I definately, want pop. Jim -- Jim Fulton mailto:jim at digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From tim_one at email.msn.com Fri Jul 23 05:08:05 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 22 Jul 1999 23:08:05 -0400 Subject: [Python-Dev] I'd like list.pop to accept an optional second In-Reply-To: <37976F0E.DFB4067B@digicool.com> Message-ID: <000201bed4b8$951f9ae0$2c2d2399@tim> [Barry] > FWIW, the CVS log message says this feature [3-arg getattr] is > experimental too. :) [Jim] > Eek! I want it to stay! > > I also really like list.pop. :) Don't panic: Guido has never removed a feature explicitly called "experimental"; he's only removed non-experimental ones. 
that's-why-we-call-stackless-python-"an-experiment"-ly y'rs - tim From tim_one at email.msn.com Fri Jul 23 05:08:07 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 22 Jul 1999 23:08:07 -0400 Subject: [Python-Dev] End of the line In-Reply-To: <009901bed456$161a4950$f29b12c2@secret.pythonware.com> Message-ID: <000301bed4b8$964492e0$2c2d2399@tim> [Tim] > The latest versions of the Icon language ... normalizes Unix, Mac > and Windows line endings to plain \n. Writing in text mode still > produces what's natural for the platform. [/F] > if we were to change this, how would you > tell Python to open a file in text mode? Meaning whatever it is the platform libc does? In Icon or REBOL, you don't. Icon is more interesting because they changed the semantics of their "t" (for "translated") mode without providing any way to go back to the old behavior (REBOL did this too, but didn't have Icon's 15 years of history to wrestle with). Curiously (I doubt Griswold *cared* about this!), the resulting behavior still conforms to ANSI C, because that std promises little about text mode semantics in the presence of non-printable characters. Nothing of mine would miss C's raw text mode (lack of) semantics, so I don't care. I *would* like Python to define portable semantics for the mode strings it accepts in the builtin open regardless, and push platform-specific silliness (including raw C text mode, if someone really wants that; or MS's "c" mode, etc) into a new os.fopen function. Push random C crap into expert modules, where it won't baffle my sister <0.7 wink>. I expect Python should still open non-binary files in the platform's text mode, though, to minimize surprises for C extensions mucking with the underlying stream object (Icon/REBOL don't have this problem, although Icon opens the file in native libc text mode anyway). 

next-step:-define-tabs-to-mean-8-characters-and-drop-unicode-in-
    favor-of-7-bit-ascii-ly y'rs  - tim

From tim_one at email.msn.com Fri Jul 23 05:08:02 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 22 Jul 1999 23:08:02 -0400
Subject: [Python-Dev] I'd like list.pop to accept an optional second
In-Reply-To: <37975499.FB61E4E3@lemburg.com>
Message-ID: <000101bed4b8$9395eda0$2c2d2399@tim>

[M.-A. Lemburg]
> Wouldn't a generic builtin for these kinds of things be
> better, e.g. a function returning a default value in case
> an exception occurs... something like:
>
> tryexcept(list.pop(), IndexError, default)
>
> which returns default in case an IndexError occurs. Don't
> think this would be much faster than the explicit try:...except:
> though...

As a function (builtin or not), tryexcept will never get called if list.pop() raises an exception. tryexcept would need to be a new statement type, and the compiler would have to generate code akin to

    try:
        whatever = list.pop()
    except IndexError:
        whatever = default

If you want to do it in a C function instead to avoid the Python-level exception overhead, the compiler would have to wrap list.pop() in a lambda in order to delay evaluation until the C code got control; and then you've got worse overhead.

generalization-is-the-devil's-playground-ly y'rs  - tim

From tim_one at email.msn.com Fri Jul 23 09:23:27 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Fri, 23 Jul 1999 03:23:27 -0400
Subject: [Python-Dev] I'd like list.pop to accept an optional second argument giving a defaultvalue
In-Reply-To: <37970B4C.8E8C741E@digicool.com>
Message-ID: <000201bed4dc$41c9f240$392d2399@tim>

In a moment of insanity, Guido gave me carte blanche to suggest new list methods, and list.pop & list.extend were the result. I considered spec'ing list.pop to take an optional "default on bad index" argument too, but after playing with it didn't like it (always appeared just as easy & clearer to use "if list:" / "while list:" etc).
Jim has a novel use I hadn't considered:

> With pop, you can essentially test whether the list is
> empty and get a value if it isn't in one atomic operation:
>
>     try:
>         foo=queue.pop(0)
>     except IndexError:
>         ... empty queue case
>     else:
>         ... non-empty case, do something with foo
>
> Unfortunately, this incurs exception overhead. I'd rather do
> something like:
>
>     foo=queue.pop(0,marker)
>     if foo is marker:
>         ... empty queue case
>     else:
>         ... non-empty case, do something with foo

It's both clever and pretty. OTOH, the original try/except isn't
expensive unless the "except" triggers frequently, in which case (the queue
is often empty) a thread is likely better off with a yielding Queue.get()
call.

So this strikes me as useful only for thread micro-optimization, and a kind
of optimization most users should be steered away from anyway. Does anyone
have a real use for this outside of threads? If not, I'd rather it not go
in.

For threads that need an optimized non-blocking probe, I'd write it:

    gotone = 0
    if queue:
        try:
            foo = queue.pop(0)
            gotone = 1
        except IndexError:
            pass
    if gotone:
        # use foo
    else:
        # twiddle thumbs

For the IndexError to trigger there, a thread has to lose its bytecode
slice between a successful "if queue" and the queue.pop, and not get
another chance to run until other threads have emptied the queue.

From mal at lemburg.com Fri Jul 23 10:27:47 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 23 Jul 1999 10:27:47 +0200
Subject: [Python-Dev] I'd like list.pop to accept an optional second
References: <000101bed4b8$9395eda0$2c2d2399@tim>
Message-ID: <37982783.E60E9941@lemburg.com>

Tim Peters wrote:
>
> [M.-A. Lemburg]
> > Wouldn't a generic builtin for these kinds of things be
> > better, e.g. a function returning a default value in case
> > an exception occurs... something like:
> >
> >     tryexcept(list.pop(), IndexError, default)
> >
> > which returns default in case an IndexError occurs.
> > Don't think this would be much faster than the explicit
> > try:...except: though...
>
> As a function (builtin or not), tryexcept will never get called if
> list.pop() raises an exception.

Dang. You're right...

> tryexcept would need to be a new statement
> type, and the compiler would have to generate code akin to
>
>     try:
>         whatever = list.pop()
>     except IndexError:
>         whatever = default
>
> If you want to do it in a C function instead to avoid the Python-level
> exception overhead, the compiler would have to wrap list.pop() in a lambda
> in order to delay evaluation until the C code got control; and then you've
> got worse overhead.

Oh well, forget the whole idea then. list.pop() is really not
needed that often anyways to warrant the default arg thing, IMHO.
dict.get() and getattr() have the default arg as performance
enhancement and I believe that you wouldn't get all that much
better performance on average by adding a second optional
argument to list.pop().

BTW, there is a generic get() function in mxTools (you know where...)
in case someone should be looking for such a beast. It works
with all sequences and mappings.

Also, has anybody considered writing list.pop(..,default) this way:

    if list:
        obj = list.pop()
    else:
        obj = default

No exceptions, no changes, fast as hell :-)

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 162 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/

From tismer at appliedbiometrics.com Fri Jul 23 12:39:27 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Fri, 23 Jul 1999 12:39:27 +0200
Subject: [Python-Dev] I'd like list.pop to accept an optional second
References: <000101bed4b8$9395eda0$2c2d2399@tim> <37982783.E60E9941@lemburg.com>
Message-ID: <3798465F.33A253D4@appliedbiometrics.com>

"M.-A. Lemburg" wrote:
...
> Also, has anybody considered writing list.pop(..,default) this way:
>
>     if list:
>         obj = list.pop()
>     else:
>         obj = default
>
> No exceptions, no changes, fast as hell :-)

Yes, that's the best way to go, I think.
But wasn't the primary question directed on
an atomic function which is thread-safe?
I'm not sure, this thread has grown too fast :-)

--
Christian Tismer             :^)
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home

From mal at lemburg.com Fri Jul 23 13:07:22 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 23 Jul 1999 13:07:22 +0200
Subject: [Python-Dev] I'd like list.pop to accept an optional second
References: <000101bed4b8$9395eda0$2c2d2399@tim> <37982783.E60E9941@lemburg.com> <3798465F.33A253D4@appliedbiometrics.com>
Message-ID: <37984CEA.1DF062F6@lemburg.com>

Christian Tismer wrote:
>
> "M.-A. Lemburg" wrote:
> ...
> > Also, has anybody considered writing list.pop(..,default) this way:
> >
> >     if list:
> >         obj = list.pop()
> >     else:
> >         obj = default
> >
> > No exceptions, no changes, fast as hell :-)
>
> Yes, that's the best way to go, I think.
> But wasn't the primary question directed on
> an atomic function which is thread-safe?
> I'm not sure, this thread has grown too fast :-)

I think that was what Jim had in mind in the first place.
Hmm, so maybe we're not after lists after all: maybe what
we need is access to the global interpreter lock in Python,
so that we can write:

    sys.lock.acquire()
    if list:
        obj = list.pop()
    else:
        obj = default
    sys.lock.release()

Or maybe we need some general lock in the thread module for these
purposes... don't know. It's been some time since I used
threads.
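For comparison, the "general lock" variant can be sketched with an
ordinary application-level lock instead of the interpreter lock. This is
only an illustration written against modern Python's threading module;
`_queue_lock` and `pop_or_default` are hypothetical names, and the scheme
is only safe if every thread that touches the list takes the same lock:

```python
import threading

# Hypothetical module-level lock that all users of the list agree on.
_queue_lock = threading.Lock()


def pop_or_default(items, default=None, index=-1):
    """Pop items[index] while holding the lock, or return default if empty."""
    with _queue_lock:
        # The emptiness test and the pop happen under one lock, so no
        # other cooperating thread can drain the list in between.
        if items:
            return items.pop(index)
        return default


queue = ["a", "b"]
print(pop_or_default(queue, "empty", 0))
print(pop_or_default(queue, "empty", 0))
print(pop_or_default(queue, "empty", 0))
```

The third call finds the list empty and returns the default without
raising, at the price of a lock acquisition on every call.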
--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 162 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/

From jim at digicool.com Fri Jul 23 13:58:23 1999
From: jim at digicool.com (Jim Fulton)
Date: Fri, 23 Jul 1999 07:58:23 -0400
Subject: [Python-Dev] I'd like list.pop to accept an optional second
References: <000101bed4b8$9395eda0$2c2d2399@tim> <37982783.E60E9941@lemburg.com> <3798465F.33A253D4@appliedbiometrics.com>
Message-ID: <379858DF.D317A40F@digicool.com>

Christian Tismer wrote:
>
> "M.-A. Lemburg" wrote:
> ...
> > Also, has anybody considered writing list.pop(..,default) this way:
> >
> >     if list:
> >         obj = list.pop()
> >     else:
> >         obj = default
> >
> > No exceptions, no changes, fast as hell :-)
>
> Yes, that's the best way to go, I think.
> But wasn't the primary question directed on
> an atomic function which is thread-safe?

Right. And the above code doesn't solve this problem.
Tim's code *does* solve the problem. It's the code we were using.
It is a bit verbose though.

> I'm not sure, this thread has grown too fast :-)

Don't they all?

Jim

--
Jim Fulton           mailto:jim at digicool.com   Python Powered!
Technical Director   (888) 344-4332            http://www.python.org
Digital Creations    http://www.digicool.com   http://www.zope.org

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email
address may not be added to any commercial mail list with out my
permission. Violation of my privacy with advertising or SPAM will
result in a suit for a MINIMUM of $500 damages/incident, $1500 for
repeats.

From fdrake at cnri.reston.va.us Fri Jul 23 17:07:37 1999
From: fdrake at cnri.reston.va.us (Fred L. Drake)
Date: Fri, 23 Jul 1999 11:07:37 -0400 (EDT)
Subject: [Python-Dev] I'd like list.pop to accept an optional second
In-Reply-To: <37982783.E60E9941@lemburg.com>
References: <000101bed4b8$9395eda0$2c2d2399@tim> <37982783.E60E9941@lemburg.com>
Message-ID: <14232.34105.421424.838212@weyr.cnri.reston.va.us>

Tim Peters wrote:
 > As a function (builtin or not), tryexcept will never get called if
 > list.pop() raises an exception.

M.-A. Lemburg writes:
 > Oh well, forget the whole idea then. list.pop() is really not

Giving up already? Wouldn't you just love this as an expression
operator (which could work)?
How about:

    top = list.pop() excepting IndexError, default

Hehehe... ;-)

  -Fred

--
Fred L. Drake, Jr.
Corporation for National Research Initiatives

From skip at mojam.com Fri Jul 23 18:23:31 1999
From: skip at mojam.com (Skip Montanaro)
Date: Fri, 23 Jul 1999 11:23:31 -0500 (CDT)
Subject: [Python-Dev] I'd like list.pop to accept an optional second
In-Reply-To: <14232.34105.421424.838212@weyr.cnri.reston.va.us>
References: <000101bed4b8$9395eda0$2c2d2399@tim> <37982783.E60E9941@lemburg.com> <14232.34105.421424.838212@weyr.cnri.reston.va.us>
Message-ID: <14232.38398.818044.283682@52.chicago-35-40rs.il.dial-access.att.net>

    Fred> Giving up already? Wouldn't you just love this as an expression
    Fred> operator (which could work)?

    Fred> How about:

    Fred>     top = list.pop() excepting IndexError, default

Why not go all the way to Perl with

    top = list.pop() unless IndexError

???

;-)

Skip

From fdrake at cnri.reston.va.us Fri Jul 23 18:30:17 1999
From: fdrake at cnri.reston.va.us (Fred L. Drake)
Date: Fri, 23 Jul 1999 12:30:17 -0400 (EDT)
Subject: [Python-Dev] I'd like list.pop to accept an optional second
In-Reply-To: <14232.38398.818044.283682@52.chicago-35-40rs.il.dial-access.att.net>
References: <000101bed4b8$9395eda0$2c2d2399@tim> <37982783.E60E9941@lemburg.com> <14232.34105.421424.838212@weyr.cnri.reston.va.us> <14232.38398.818044.283682@52.chicago-35-40rs.il.dial-access.att.net>
Message-ID: <14232.39065.687719.135590@weyr.cnri.reston.va.us>

Skip Montanaro writes:
 > Why not go all the way to Perl with
 >
 >     top = list.pop() unless IndexError

Trying to kill me, Skip? ;-)
Actually, the semantics are different. If we interpret that using
the Perl semantics for "unless", don't we have the same thing as:

    if not IndexError:
        top = list.pop()

Since IndexError will normally be a non-empty string or a class,
this is pretty much:

    if 0:
        top = list.pop()

which certainly isn't quite as interesting. ;-)

  -Fred

--
Fred L. Drake, Jr.
Corporation for National Research Initiatives

From skip at mojam.com Fri Jul 23 22:23:12 1999
From: skip at mojam.com (Skip Montanaro)
Date: Fri, 23 Jul 1999 15:23:12 -0500 (CDT)
Subject: [Python-Dev] I'd like list.pop to accept an optional second
In-Reply-To: <14232.39065.687719.135590@weyr.cnri.reston.va.us>
References: <000101bed4b8$9395eda0$2c2d2399@tim> <37982783.E60E9941@lemburg.com> <14232.34105.421424.838212@weyr.cnri.reston.va.us> <14232.38398.818044.283682@52.chicago-35-40rs.il.dial-access.att.net> <14232.39065.687719.135590@weyr.cnri.reston.va.us>
Message-ID: <14232.52576.746910.229435@227.chicago-26-27rs.il.dial-access.att.net>

    Fred> Skip Montanaro writes:
    >> Why not go all the way to Perl with
    >>
    >>     top = list.pop() unless IndexError

    Fred> Trying to kill me, Skip? ;-)

Nope, just a flesh wound. I'll wait for the resulting infection to really
do you in. ;-)

    Fred> Actually, the semantics are different.
    Fred> If we interpret that using the Perl semantics for "unless",
    Fred> don't we have the same thing as:

Yes, but the flavor is the same. Reading Perl code that uses the unless
keyword always seemed counterintuitive to me. Something like

    x = y unless foo;

always reads to me like, "Assign y to x. No, wait a minute. I forgot
something. Only do that if foo isn't true." What was so bad about

    if (!foo) { x = y; }

That was my initial reaction to the use of the trailing except.

We argue a lot in the Python community about whether or not a proposed
language feature increases the expressive power of the language or not
(which is a good idea in my opinion). The Perl community has apparently
never been afflicted with that disease.

smiles all 'round...

Skip

From tismer at appliedbiometrics.com Sat Jul 24 01:36:33 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Sat, 24 Jul 1999 01:36:33 +0200
Subject: [Python-Dev] continuations for the curious
Message-ID: <3798FC81.A57E9CFE@appliedbiometrics.com>

Howdy,

my modules are nearly ready. I will be out of my office
for two weeks, but had no time to finalize and publish yet.
Stackless Python has reached what I wanted it to reach:
A continuation can be saved at every opcode.

The continuationmodule has been shrunk heavily.
Some extension is still needed, continuations
are still frames, but they can be picked like
Sam wanted it originally.

Sam, I'm pretty sure this is more than enough for
coroutines. Just have a look at getpcc(), this is
now very easy. All involved frames are armed so that
they *can* save themselves, but will do so just if
necessary. The cheapest solution I could think of,
no other optimization is necessary. If your coroutine
functions like to swap two frames, and if they manage to
do so that the refcount of the target stays at one,
no extra frame will be generated.
That's it, really.

If someone wants to play, get the stackless module, replace
ceval.c, and build continuationmodule.c as a dll or whatever.
testct.py contains a lot of crap. The first implementation
of class coroutine is working right.
The second one is wrong by concept.

later - chris

ftp://ftp.pns.cc/pub/veryfar.zip

--
Christian Tismer             :^)
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home

From rushing at nightmare.com Sat Jul 24 03:52:00 1999
From: rushing at nightmare.com (Sam Rushing)
Date: Fri, 23 Jul 1999 18:52:00 -0700 (PDT)
Subject: [Python-Dev] continuations for the curious
In-Reply-To: <3798FC81.A57E9CFE@appliedbiometrics.com>
References: <3798FC81.A57E9CFE@appliedbiometrics.com>
Message-ID: <14233.7163.919863.981628@seattle.nightmare.com>

Hey Chris, I think you're missing some include files
from 'veryfar.zip'?

ceval.c: In function `PyEval_EvalCode':
ceval.c:355: warning: return makes pointer from integer without a cast
ceval.c: In function `PyEval_EvalCode_nr':
ceval.c:375: `Py_UnwindToken' undeclared (first use this function)
ceval.c:375: (Each undeclared identifier is reported only once
ceval.c:375: for each function it appears in.)
ceval.c: In function `eval_code2_setup':
ceval.c:490: structure has no member named `f_execute'
ceval.c:639: structure has no member named `f_first_instr'
ceval.c:640: structure has no member named `f_next_instr'

-Sam

From tim_one at email.msn.com Sat Jul 24 04:16:16 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Fri, 23 Jul 1999 22:16:16 -0400
Subject: [Python-Dev] I'd like list.pop to accept an optional second
In-Reply-To: <37984CEA.1DF062F6@lemburg.com>
Message-ID: <000c01bed57a$82a79620$832d2399@tim>

> ...
> Hmm, so maybe we're not after lists after all: maybe what
> we need is access to the global interpreter lock in Python,
> so that we can write:
>
>     sys.lock.acquire()
>     if list:
>         obj = list.pop()
>     else:
>         obj = default
>     sys.lock.release()

The thread attempting the sys.lock.acquire() necessarily already owns the
global lock, so the attempt to acquire it is a guaranteed deadlock --
arguably not helpful.

> Or maybe we need some general lock in the thread module for these
> purposes... don't know. It's been some time since I used
> threads.

Jim could easily allocate a list lock for this purpose if that's what he
wanted; and wrap it in a class with a nice interface too. He'd eventually
end up with the std Queue.py module, though.

But if he doesn't want the overhead of an exception when the queue is
empty, he sure doesn't want the comparatively huge overhead of a (any
flavor of) lock either (which may drag the OS into the picture).

There's nothing wrong with wanting a fast thread-safe queue! I just don't
like the idea of adding an otherwise-ugly new gimmick to core lists for it;
also have to wonder about Jim's larger picture if he's writing stuff in
Python that's *so* time-critical that the overhead of an ordinary exception
from time to time is a genuine problem. The verbosity of the alternative
can be hidden in a lock-free class or function, if it's the clumsiness
instead of the time that's grating.

From mal at lemburg.com Sat Jul 24 10:38:59 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sat, 24 Jul 1999 10:38:59 +0200
Subject: [Python-Dev] I'd like list.pop to accept an optional second
References: <000c01bed57a$82a79620$832d2399@tim>
Message-ID: <37997BA3.B5AB23B4@lemburg.com>

Tim Peters wrote:
>
> > ...
> > Hmm, so maybe we're not after lists after all: maybe what
> > we need is access to the global interpreter lock in Python,
> > so that we can write:
> >
> >     sys.lock.acquire()
> >     if list:
> >         obj = list.pop()
> >     else:
> >         obj = default
> >     sys.lock.release()
>
> The thread attempting the sys.lock.acquire() necessarily already owns the
> global lock, so the attempt to acquire it is a guaranteed deadlock --
> arguably not helpful.

True, sys.lock.acquire() would have to set a flag *not* to release
the lock until the next call to sys.lock.release(), which then
clears this flag again. Sort of a lock for the unlocking the lock ;-)

Could this work, or am I having a mind twister somewhere in
there again ?

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 160 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/

From gmcm at hypernet.com Sat Jul 24 14:41:39 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Sat, 24 Jul 1999 07:41:39 -0500
Subject: [Python-Dev] I'd like list.pop to accept an optional second
In-Reply-To: <37997BA3.B5AB23B4@lemburg.com>
Message-ID: <1279306201-30642004@hypernet.com>

M.-A. Lemburg writes:

> True, sys.lock.acquire() would have to set a flag *not* to release
> the lock until the next call to sys.lock.release(), which then
> clears this flag again. Sort of a lock for the unlocking the lock
> ;-)
>
> Could this work, or am I having a mind twister somewhere in
> there again ?

Sounds like a critical section to me. On Windows, those are
lightweight and very handy. You can build one with Python thread
primitives, but unfortunately, they come out on the heavy side.

Locks come in 4 types, categorized by whether they can be released
only by the owning thread, and whether they can be acquired
recursively.
The interpreter lock is in the opposite quadrant from a
critical section, so "sys.lock.freeze()" and "sys.lock.thaw()" have
little chance of having an efficient implementation on any platform.

A shame. That would be pretty cool.

- Gordon

From tim_one at email.msn.com Sun Jul 25 20:57:50 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Sun, 25 Jul 1999 14:57:50 -0400
Subject: [Python-Dev] End of the line
In-Reply-To: <000c01becdac$d2ad6300$7d9e2299@tim>
Message-ID: <000001bed6cf$984cd8e0$b02d2399@tim>

[Tim, notes that Perl line-at-a-time text mode input runs 3x faster than
 Python's on his platform]

And much to my surprise, it turns out Perl reads lines a character at a
time too! And they do not reimplement stdio. But they do cheat.

Perl's internals are written on top of an abstract IO API, with
"PerlIO *" instead of "FILE *", "PerlIO_tell(PerlIO *)" instead of
"ftell(FILE*)", and so on. Nothing surprising in the details, except maybe
that stdin is modeled as a function "PerlIO *PerlIO_stdin(void)" instead of
as global data (& ditto for stdout/stderr).

The usual *implementation* of these guys is as straight macro substitution
to the corresponding C stdio call. It's possible to implement them some
other way, but I don't see anything in the source that suggests anyone has
done so, except possibly to build it all on AT&T's SFIO lib.

So where's the cheating? In these API functions:

    int   PerlIO_has_base(PerlIO *);
    int   PerlIO_has_cntptr(PerlIO *);
    int   PerlIO_canset_cnt(PerlIO *);

    char *PerlIO_get_ptr(PerlIO *);
    int   PerlIO_get_cnt(PerlIO *);
    void  PerlIO_set_cnt(PerlIO *,int);
    void  PerlIO_set_ptrcnt(PerlIO *,char *,int);
    char *PerlIO_get_base(PerlIO *);
    int   PerlIO_get_bufsiz(PerlIO *);

In almost all platform stdio implementations, the C FILE struct has members
that may vary in name but serve the same purpose: an internal buffer, and
some way (pointer or offset) to get at "the next" buffer character.
The guys above are usually just (after layers & layers of config stuff sets
it up) macros that expand into the platform's internal way of spelling
these things. For example, the count member is spelled under Windows as
fp->_cnt under VC, or as fp->level under Borland.

The payoff is in Perl's sv_gets function, in file sv.c. This is long and
very complicated, but at its core has a fast inner loop that copies
characters (provided the PerlIO_has/canXXX functions say it's possible)
directly from the stdio buffer into a Perl string variable -- in the way a
platform fgets function *would* do it if it bothered to optimize fgets. In
my experience, platforms usually settle for the same kind of
fgetc/EOF?/newline? loop Python uses, as if fgets were a stdio client
rather than a stdio primitive. Perl's keeps everything in registers inside
the loop, updates the FILE struct members only at the boundaries, and
doesn't check for EOF except at the boundaries (so long as the buffer has
unread stuff in it, you can't be at EOF).

If the stdio buffer is exhausted before the input terminator is seen (Perl
has "input record separator" and "paragraph mode" gimmicks, so it's hairier
than just looking for \n), it calls PerlIO_getc once to force the platform
to refill the buffer, and goes back to the screaming loop.

Major hackery, but major payoff (on most platforms) too. The abstract I/O
layer is a fine idea regardless. The sad thing is that the real reason
Perl is so fast here is that platform fgets is so needlessly slow.

perl-input-is-faster-than-c-input-ly y'rs - tim

From tim_one at email.msn.com Mon Jul 26 06:58:31 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Mon, 26 Jul 1999 00:58:31 -0400
Subject: [Python-Dev] I'd like list.pop to accept an optional second
In-Reply-To: <37982783.E60E9941@lemburg.com>
Message-ID: <000601bed723$81bb1020$492d2399@tim>

[M.-A. Lemburg]
> ...
> Oh well, forget the whole idea then.
> list.pop() is really not needed that often anyways to warrant the
> default arg thing, IMHO. dict.get() and getattr() have the default
> arg as performance enhancement

I like their succinctness too;

    count = dict.get(key, 0)

is helpfully "slimmer" than either of

    try:
        count = dict[key]
    except KeyError:
        count = 0

or

    count = 0
    if dict.has_key(key):
        count = dict[key]

> and I believe that you wouldn't get all that much better performance
> on average by adding a second optional argument to list.pop().

I think you wouldn't at *all*, except in Jim's novel case. That is, when a
list is empty, it's usually the signal to get out of a loop, and you can
either test

    if list:
        item = list.pop()
    else:
        break

today or

    item = list.pop(-1, marker)
    if item is marker:
        break

tomorrow. The second way doesn't buy anything to my eye, and the first way
is very often the pretty

    while list:
        item = list.pop()

if-it-weren't-for-jim's-use-i'd-see-no-use-at-all-ly y'rs - tim

From mal at lemburg.com Mon Jul 26 10:31:01 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 26 Jul 1999 10:31:01 +0200
Subject: [Python-Dev] Thread locked sections
References: <1279306201-30642004@hypernet.com>
Message-ID: <379C1CC5.51A89688@lemburg.com>

Gordon McMillan wrote:
>
> M.-A. Lemburg writes:
>
> > True, sys.lock.acquire() would have to set a flag *not* to release
> > the lock until the next call to sys.lock.release(), which then
> > clears this flag again. Sort of a lock for the unlocking the lock
> > ;-)
> >
> > Could this work, or am I having a mind twister somewhere in
> > there again ?
>
> Sounds like a critical section to me. On Windows, those are
> lightweight and very handy. You can build one with Python thread
> primitives, but unfortunately, they come out on the heavy side.
>
> Locks come in 4 types, categorized by whether they can be released
> only by the owning thread, and whether they can be acquired
> recursively.
> The interpreter lock is in the opposite quadrant from a
> critical section, so "sys.lock.freeze()" and "sys.lock.thaw()" have
> little chance of having an efficient implementation on any platform.

Actually, I think all that's needed is another global like
the interpreter_lock in ceval.c. Since this lock is only
accessed via abstract functions, I presume the unlock flag could
easily be added.

The locking section would only focus on Python, though: other
threads could still be running provided they don't execute
Python code, e.g. write data to a spooler. So it's not really
the equivalent of a critical section as the one you can define in C.

PS: I changed the subject line... hope this doesn't kill the
thread ;)

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 158 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/

From Brian at digicool.com Mon Jul 26 15:46:00 1999
From: Brian at digicool.com (Brian Lloyd)
Date: Mon, 26 Jul 1999 09:46:00 -0400
Subject: [Python-Dev] End of the line
Message-ID: <613145F79272D211914B0020AFF6401914DC02@gandalf.digicool.com>

> [Tim, notes that Perl line-at-a-time text mode input runs 3x
> faster than Python's on his platform]
>
> And much to my surprise, it turns out Perl reads lines a
> character at a time too! And they do not reimplement stdio.
> But they do cheat.
>
> [some notes on the cheating and PerlIO api snipped]
>
> The usual *implementation* of these guys is as straight macro
> substitution to the corresponding C stdio call. It's possible to
> implement them some other way, but I don't see anything in the
> source that suggests anyone has done so, except possibly to build
> it all on AT&T's SFIO lib.
Hmm - speed bonuses notwithstanding, an implementation of
such a beast in the Python sources would've helped a lot to
reduce the ugly hairy gymnastics required to get Python going
on Win CE, where (until very recently) there was no concept
of most of the things you expect to find in stdio...

Brian Lloyd        brian at digicool.com
Software Engineer  540.371.6909
Digital Creations  http://www.digicool.com

From mhammond at skippinet.com.au Tue Jul 27 00:49:56 1999
From: mhammond at skippinet.com.au (Mark Hammond)
Date: Tue, 27 Jul 1999 08:49:56 +1000
Subject: [Python-Dev] Thread locked sections
In-Reply-To: <379C1CC5.51A89688@lemburg.com>
Message-ID: <002801bed7b9$2fa8b620$0801a8c0@bobcat>

> Actually, I think all that's needed is another global like
> the interpreter_lock in ceval.c. Since this lock is only
> accessed via abstract functions, I presume the unlock flag could
> easily be added.

Well, my personal opinion is that this is really quite wrong. The most
obvious thing to me is that we are exposing an implementation detail we all
would dearly like to see removed one day - the global interpreter lock.

But even if we ignore that, it seems to me that you are describing an
application abstraction, not a language abstraction. This thread started
with Jim wanting a thread-safe, atomic list operation. This is not an
unusual requirement (ie, a thread-safe, atomic operation), so languages
give you access to primitives that let you build this.

To my mind, you are asking for the equivalent of a C function that says
"suspend all threads except me, cos I'm doing something _really_
important". C does not provide that, and I have never thought it should.
As Gordon said, Win32 has critical sections, but these are really just
lightweight locks. I really don't see how Python is different - it gives
you all the tools you need to build these abstractions.

I really don't see what you are after that can not be done with a lock.
If the performance is a problem, then to paraphrase the Timbot, it may be
questionable if you are using Python appropriately in this case.

Mark.

From tim_one at email.msn.com Tue Jul 27 03:41:17 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Mon, 26 Jul 1999 21:41:17 -0400
Subject: [Python-Dev] End of the line
In-Reply-To: <613145F79272D211914B0020AFF6401914DC02@gandalf.digicool.com>
Message-ID: <000b01bed7d1$1eac5620$eea22299@tim>

[Tim, on the cheating PerlIO API]

[Brian Lloyd]
> Hmm - speed bonuses not withstanding, an implementation of
> such a beast in the Python sources would've helped a lot to
> reduce the ugly hairy gymnastics required to get Python going
> on Win CE, where (until very recently) there was no concept
> of most of the things you expect to find in stdio...

I don't think it would have helped you there. If e.g. ftell is missing,
it's no easier to implement it yourself under the name "PerlIO_ftell" than
under the name "ftell" ...

Back before Larry Wall got it into his head that Perl is a grand metaphor
for freedom and creativity (or whatever), he justifiably claimed that
Perl's great achievement was in taming Unix. Which it did! Perl
essentially defined yet a 537th variation of libc/shell/tool semantics, but
in a way that worked the same across its 536 Unix hosts. The PerlIO API is
a great help with *that*: if a platform is a little off kilter in its
implementation of one of these functions, Perl can use a corresponding
PerlIO wrapper to hide the shortcoming in a platform-specific file, and the
rest of Perl blissfully assumes everything works the same everywhere.

That's a good, cool idea. Ironically, Perl does more to hide gratuitous
platform differences here than Python does! But it's just a pile of names
if you've got no stdio to build on.
let's-model-PythonIO-on-the-win32-api-ly y'rs - tim

From mhammond at skippinet.com.au Tue Jul 27 04:13:09 1999
From: mhammond at skippinet.com.au (Mark Hammond)
Date: Tue, 27 Jul 1999 12:13:09 +1000
Subject: [Python-Dev] End of the line
In-Reply-To: <000b01bed7d1$1eac5620$eea22299@tim>
Message-ID: <002a01bed7d5$93a4a780$0801a8c0@bobcat>

> let's-model-PythonIO-on-the-win32-api-ly y'rs - tim

Interestingly, this raises a point worth mentioning sans-wink :-)

Win32 has quite a nice concept that file handles (nearly all handles
really) are "waitable".

Indeed, in the Win32 world, this feature usually prevents me from using the
"threading" module - I need to wait on objects other than threads or locks
(usually files, but sometimes child processes). I also usually need a
"wait for the first one of these objects", which threading doesn't provide,
but that is a digression...

What I'm getting at is that a Python IO model should maybe go a little
further than "traditional" IO - asynchronous IO and synchronisation
capabilities should also be specified. Of course, these would be optional,
but it would be excellent if a platform could easily slot into pre-defined
Python semantics if possible.

Is this reasonable, or really simply too hard to abstract in the manner I
am talking!?

Mark.

From mal at lemburg.com Tue Jul 27 10:31:27 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Tue, 27 Jul 1999 10:31:27 +0200
Subject: [Python-Dev] Thread locked sections
References: <002801bed7b9$2fa8b620$0801a8c0@bobcat>
Message-ID: <379D6E5F.B29251EF@lemburg.com>

Mark Hammond wrote:
>
> > Actually, I think all that's needed is another global like
> > the interpreter_lock in ceval.c. Since this lock is only
> > access

From mal at lemburg.com Tue Jul 27 11:23:05 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Tue, 27 Jul 1999 11:23:05 +0200
Subject: [Python-Dev] Thread locked sections
References: <002801bed7b9$2fa8b620$0801a8c0@bobcat>
Message-ID: <379D7A79.DB97B2C@lemburg.com>

[The previous mail got truncated due to insufficient disk space;
 here is a summary...]

Mark Hammond wrote:
>
> > Actually, I think all that's needed is another global like
> > the interpreter_lock in ceval.c. Since this lock is only
> > accessed via abstract functions, I presume the unlock flag could
> > easily be added.
>
> Well, my personal opinion is that this is really quite wrong. The most
> obvious thing to me is that we are exposing an implementation detail we all
> would dearly like to see removed one day - the global interpreter lock.
>
> But even if we ignore that, it seems to me that you are describing an
> application abstraction, not a language abstraction. This thread started
> with Jim wanting a thread-safe, atomic list operation. This is not an
> unusual requirement (ie, a thread-safe, atomic operation), so languages
> give you access to primitives that let you build this.
>
> To my mind, you are asking for the equivalent of a C function that says
> "suspend all threads except me, cos I'm doing something _really_ important".
> C does not provide that, and I have never thought it should. As Gordon
> said, Win32 has critical sections, but these are really just lightweight
> locks. I really don't see how Python is different - it gives you all the
> tools you need to build these abstractions.
>
> I really don't see what you are after that can not be done with a lock. If
> the performance is a problem, then to paraphrase the Timbot, it may be
> questionable if you are using Python appropriately in this case.

The locked section may not be leading in the right direction,
but it surely helps in situations where you cannot otherwise
enforce usage of an object specific lock, e.g. for builtin
file objects (some APIs insist on getting the real thing, not
a thread safe wrapper).
Here is a hack that lets you do much the same with an unpatched Python
interpreter:

    sys.setcheckinterval(sys.maxint) # *)
    # >=10 Python OPs to flush the ticker counter and have the new
    # check interval setting take effect:
    0==0; 0==0; 0==0; 0==0
    try:
        ...lock section...
    finally:
        sys.setcheckinterval(10)

*) sys.setcheckinterval should really return the previous value so that
we can reset the value to the original one afterwards.

Note that the lock section may not call code which uses the
Py_*_ALLOW_THREADS macros.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 157 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From rushing at nightmare.com Tue Jul 27 12:33:03 1999
From: rushing at nightmare.com (Sam Rushing)
Date: Tue, 27 Jul 1999 03:33:03 -0700 (PDT)
Subject: [Python-Dev] continuations for the curious
In-Reply-To: <3798FC81.A57E9CFE@appliedbiometrics.com>
References: <3798FC81.A57E9CFE@appliedbiometrics.com>
Message-ID: <14237.33980.82091.445607@seattle.nightmare.com>

I've been playing for a bit, trying to write my own coroutine class
(obeying the law of "you won't understand it until you write it
yourself") based on one I've worked up for 'lunacy'.

I think I have it, let me know what you think:

    >>> from coroutine import *
    >>> cc = coroutine (counter, 100, 10)
    >>> cc.resume()
    100
    >>> cc.resume()
    110
    >>>

Differences:

1) callcc wraps the 'escape frame' with a lambda, so that it can be
   invoked like any other function. This actually simplifies the
   bootstrapping, because starting the function is identical to
   resuming it.

2) the coroutine object keeps track of who resumed it, so that it can
   resume the caller without having to know who it is.

3) the coroutine class keeps track of which is the currently 'active'
   coroutine. It's currently a class variable, but I think this can
   lead to leaks, so it might have to be made a global.
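A rough modern analogue of the session above, for readers without a Stackless build. This is only a sketch: it assumes Python 3 generators, which are semi-coroutines, and it is not Sam's continuation-based coroutine.py. The `counter` function and the `Coroutine` class are hypothetical stand-ins. It does mirror point 1, in that starting and resuming go through the same call.

```python
def counter(start, step):
    """Hypothetical stand-in for Sam's counter: start, start+step, ..."""
    value = start
    while True:
        yield value          # hand a value back to whoever resumed us
        value += step


class Coroutine:
    """Wrap a generator function behind a resume() method (illustrative only)."""

    def __init__(self, func, *args):
        self._gen = func(*args)

    def resume(self, value=None):
        # Starting and resuming are the same operation: a fresh generator
        # accepts send(None), which simply runs it to its first yield.
        return self._gen.send(value)


cc = Coroutine(counter, 100, 10)
print(cc.resume())  # 100
print(cc.resume())  # 110
```

A generator always returns control to its caller, so point 2 comes for free here; a full coroutine that can transfer to an arbitrary peer needs more machinery than this.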
+-----------------------------------------------------------------
| For those folks (like me) that were confused about where to get
| all the necessary files for building the latest Stackless Python,
| here's the procedure:
|
| 1) unwrap a fresh copy of 1.5.2
| 2) unzip
|    http://www.pns.cc/anonftp/pub/stackless_990713.zip
|    on top of it
| 3) then, unzip
|    ftp://ftp.pns.cc/pub/veryfar.zip
|    on top of that
| 4) add "continuation continuationmodule.c" to Modules/Setup

-Sam

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: coroutine.py
URL:

From jack at oratrix.nl Tue Jul 27 14:04:39 1999
From: jack at oratrix.nl (Jack Jansen)
Date: Tue, 27 Jul 1999 14:04:39 +0200
Subject: [Python-Dev] End of the line
In-Reply-To: Message by "Mark Hammond" , Tue, 27 Jul 1999 12:13:09 +1000 , <002a01bed7d5$93a4a780$0801a8c0@bobcat>
Message-ID: <19990727120440.13D5F303120@snelboot.oratrix.nl>

> What Im getting at is that a Python IO model should maybe go a little
> further than "tradtional" IO - asynchronous IO and synchronisation
> capabilities should also be specified. Of course, these would be optional,
> but it would be excellent if a platform could easily slot into pre-defined
> Python semantics if possible.

What Python could do with reasonable ease is a sort of "promise" model,
where an I/O operation returns an object that waits for the I/O to
complete upon access or destruction.

Something like

    def foo():
        obj = stdin.delayed_read()
        obj2 = stdout.delayed_write("data")
        do_lengthy_computation()
        data = obj.get()   # Here we wait for the read to complete
        del obj2           # Here we wait for the write to complete.

This gives a fairly nice programming model.
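A minimal sketch of that promise model, assuming Python 3's `concurrent.futures` as the machinery (nothing like it existed at the time). `delayed_read`, `delayed_write` and `do_lengthy_computation` are hypothetical helpers named after the example above. One deliberate difference: a `Future` does not block when it is garbage-collected, so the write is waited on explicitly with `result()` instead of `del`.

```python
import io
from concurrent.futures import ThreadPoolExecutor

# A small shared pool runs the blocking I/O calls in the background.
_pool = ThreadPoolExecutor(max_workers=2)


def delayed_read(stream):
    """Start stream.read() in the background and return a Future for it."""
    return _pool.submit(stream.read)


def delayed_write(stream, data):
    """Start stream.write(data) in the background and return a Future."""
    return _pool.submit(stream.write, data)


def do_lengthy_computation():
    return sum(range(100_000))


def foo(instream, outstream):
    pending_read = delayed_read(instream)
    pending_write = delayed_write(outstream, "data")
    do_lengthy_computation()
    data = pending_read.result()   # here we wait for the read to complete
    pending_write.result()         # here we wait for the write to complete
    return data


out = io.StringIO()
print(foo(io.StringIO("hello"), out))  # hello
```

In-memory streams keep the sketch self-contained; real overlap only pays off when the underlying read or write actually blocks.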
-- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From mhammond at skippinet.com.au Tue Jul 27 14:10:56 1999 From: mhammond at skippinet.com.au (Mark Hammond) Date: Tue, 27 Jul 1999 22:10:56 +1000 Subject: [Python-Dev] Thread locked sections In-Reply-To: <379D7A79.DB97B2C@lemburg.com> Message-ID: <004201bed829$16211c40$0801a8c0@bobcat> [Marc writes] > The locked section may not be leading in the right direction, > but it surely helps in situations where you cannot otherwise > enforce useage of an object specific lock, e.g. for builtin > file objects (some APIs insist on getting the real thing, not > a thread safe wrapper). Really, all this boils down to is that you want a Python-ish critical section - ie, a light-weight lock. This presumably would be desirable if it could be shown Python locks are indeed "heavy" - I know that from the C POV they may be considered as such, but I havent seen many complaints about lock speed from Python. So in an attempt to get _some_ evidence, I wrote a test program that used the Queue module to append 10000 integers then remove them all. I then hacked the queue module to remove all locking, and ran the same test. The results were 2.4 seconds for the non-locking version, vs 3.8 for the standard version. Without time (or really inclination ) to take this further, it _does_ appear a native Python "critical section" could indeed save a few milli-seconds for a few real-world apps. So if we ignore the implementation details Marc started spelling, does the idea of a Python "critical section" appeal? Could simply be a built-in way of saying "no other _Python_ threads should run" (and of-course the "allow them again"). The semantics could be simply to ensure the Python program integrity - it need say nothing about the Python internal "state" as such. Mark. 
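The measurement Mark describes could be sketched roughly as follows. His test program was not posted, so this is an assumption-laden reconstruction: it uses Python 3's `queue.Queue` for the locked case and `collections.deque` as a stand-in for his hand-hacked, lock-free Queue module. Absolute timings will not match the 2.4 and 3.8 seconds he reports.

```python
import queue
import time
from collections import deque


def fill_and_drain_locked(n=10000):
    """Append n integers to a queue.Queue, then remove them all."""
    q = queue.Queue()
    for i in range(n):
        q.put(i)
    return [q.get() for _ in range(n)]


def fill_and_drain_unlocked(n=10000):
    """Same workload on a plain deque, with no explicit locking."""
    q = deque()
    for i in range(n):
        q.append(i)
    return [q.popleft() for _ in range(n)]


def timed(func):
    """Return the wall-clock seconds taken by one call of func()."""
    start = time.perf_counter()
    func()
    return time.perf_counter() - start


print("locked:   %.4f s" % timed(fill_and_drain_locked))
print("unlocked: %.4f s" % timed(fill_and_drain_unlocked))
```

The gap between the two figures gives a rough upper bound on what a cheaper "critical section" could save for this workload.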
From mal at lemburg.com Tue Jul 27 14:27:55 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Tue, 27 Jul 1999 14:27:55 +0200 Subject: [Python-Dev] continuations for the curious References: <3798FC81.A57E9CFE@appliedbiometrics.com> <14237.33980.82091.445607@seattle.nightmare.com> Message-ID: <379DA5CB.B3619365@lemburg.com> Sam Rushing wrote: > > +----------------------------------------------------------------- > | For those folks (like me) that were confused about where to get > | all the necessary files for building the latest Stackless Python, > | here's the procedure: Thanks... this guide made me actually try it ;-) > | > | 1) unwrap a fresh copy of 1.5.2 > | 2) unzip > | http://www.pns.cc/anonftp/pub/stackless_990713.zip > | on top of it > | 3) then, unzip > | ftp://ftp.pns.cc/pub/veryfar.zip > | on top of that It seems that Christian forgot the directory information in this ZIP file. You have to move the continuationmodule.c file to Modules/ by hand. > | 4) add "continuation continuationmodule.c" to Modules/Setup -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 157 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From mhammond at skippinet.com.au Tue Jul 27 16:45:12 1999 From: mhammond at skippinet.com.au (Mark Hammond) Date: Wed, 28 Jul 1999 00:45:12 +1000 Subject: [Python-Dev] End of the line In-Reply-To: <19990727120440.13D5F303120@snelboot.oratrix.nl> Message-ID: <004401bed83e$a9252b70$0801a8c0@bobcat> [Jack seems to like an asynch IO model] > def foo(): > obj = stdin.delayed_read() > obj2 = stdout.delayed_write("data") > do_lengthy_computation() > data = obj.get() # Here we wait for the read to complete > del obj2 # Here we wait for the write to > complete. > > This gives a fairly nice programming model. Indeed. 
Taking this a little further, I come up with something like:

    inlock = threading.Lock()
    buffer = stdin.delayed_read(inlock)

    outlock = threading.Lock()
    stdout.delayed_write(outlock, "The data")

    fired = threading.Wait(inlock, outlock) # new fn :-)

    if fired is inlock: # etc.

The idea is we can make everything wait on a single lock abstraction.
threading.Wait() could accept lock objects, thread objects, Sockets, etc.

Obviously a bit to work out, but it does make an appealing model. OTOH, I
wonder how it fits with continuations etc. Not too badly from my weak
understanding. May be an interesting convergence!

Mark.

From jack at oratrix.nl Tue Jul 27 17:31:13 1999
From: jack at oratrix.nl (Jack Jansen)
Date: Tue, 27 Jul 1999 17:31:13 +0200
Subject: [Python-Dev] End of the line
In-Reply-To: Message by "Mark Hammond" , Wed, 28 Jul 1999 00:45:12 +1000 , <004401bed83e$a9252b70$0801a8c0@bobcat>
Message-ID: <19990727153113.4A2F1303120@snelboot.oratrix.nl>

> [Jack seems to like an asynch IO model]
>
> > def foo():
> >     obj = stdin.delayed_read()
> >     obj2 = stdout.delayed_write("data")
> >     do_lengthy_computation()
> >     data = obj.get() # Here we wait for the read to complete
> >     del obj2 # Here we wait for the write to
> > complete.
> >
> > This gives a fairly nice programming model.
>
> Indeed. Taking this a little further, I come up with something like:
>
> inlock = threading.Lock()
> buffer = stdin.delayed_read(inlock)
>
> outlock = threading.Lock()
> stdout.delayed_write(outlock, "The data")
>
> fired = threading.Wait(inlock, outlock) # new fn :-)
>
> if fired is inlock: # etc.

I think this is exactly what I _didn't_ want:-)

I'd like the delayed read to return an object that will automatically wait
when I try to get the data from it, and the delayed write object to
automatically wait when I garbage-collect it.

Of course, there's no reason why you couldn't also wait on these objects
(or, on unix, pass them to select(), or whatever).
On second thought the method of the delayed read should be called read() in stead of get(), of course. -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From mhammond at skippinet.com.au Wed Jul 28 00:21:19 1999 From: mhammond at skippinet.com.au (Mark Hammond) Date: Wed, 28 Jul 1999 08:21:19 +1000 Subject: [Python-Dev] End of the line In-Reply-To: <19990727153113.4A2F1303120@snelboot.oratrix.nl> Message-ID: <000c01bed87e$5af42060$0801a8c0@bobcat> [I missed Jack's point] > I think this is exactly what I _didn't_ want:-) > > I'd like the delayed read to return an object that will > automatically wait > when I try to get the data from it, and the delayed write object to > automatically wait when I garbage-collect it. OK - that is fine. My driving requirement was that I be able to wait on _multiple_ files at the same time - ie, I dont know which one will complete first. There is no reason then why your initial suggestion can not satisfy my requirement, as long as the "buffer type object" returned from read is itself waitable. I agree there is no driving need for a seperate buffer type object and seperate waitable object necessarily. [OTOH, your scheme could be simply built on top of my scheme as a framework] Unfortunately, this doesnt seem to have grabbed anyone elses interest.. Mark. From da at ski.org Wed Jul 28 23:46:21 1999 From: da at ski.org (David Ascher) Date: Wed, 28 Jul 1999 14:46:21 -0700 (Pacific Daylight Time) Subject: [Python-Dev] Tcl news Message-ID: 8.2b1 is released: Some surprising news: they now use cygwin tools to do the windows build. 
Not surprising news: they still haven't incorporated some bug fixes I submitted eons ago =) http://www.scriptics.com/software/relnotes/tcl8.2b1 --david From tim_one at email.msn.com Thu Jul 29 05:10:40 1999 From: tim_one at email.msn.com (Tim Peters) Date: Wed, 28 Jul 1999 23:10:40 -0400 Subject: [Python-Dev] RE: delayed I/O; multiple waits In-Reply-To: <000c01bed87e$5af42060$0801a8c0@bobcat> Message-ID: <001201bed96f$f06990c0$71a22299@tim> [Mark Hammond] > ... > Unfortunately, this doesnt seem to have grabbed anyone elses interest.. You lost me when you said it should be optional -- that's fine for an extension module, but it sounded like you wanted this to somehow be part of the language core. If WaitForMultipleObjects (which is what you *really* want ) is thought to be a cool enough idea to be in the core, we should think about how to implement it on non-Win32 platforms too. needs-more-words-ly y'rs - tim From mhammond at skippinet.com.au Thu Jul 29 05:52:47 1999 From: mhammond at skippinet.com.au (Mark Hammond) Date: Thu, 29 Jul 1999 13:52:47 +1000 Subject: [Python-Dev] RE: delayed I/O; multiple waits In-Reply-To: <001201bed96f$f06990c0$71a22299@tim> Message-ID: <002e01bed975$d392d910$0801a8c0@bobcat> > You lost me when you said it should be optional -- that's fine for an > extension module, but it sounded like you wanted this to Cool - I admit I knew it was too vague, but left it in anyway. > the language core. If WaitForMultipleObjects (which is what > you *really* Sort-of. IMO, the threading module does need a WaitForMultipleObjects (whatever the spelling) but I also recall the discussion that this is not trivial. But what I _really_ want is an enhanced concept of "waitable" - threading can only wait on locks and threads. If we have this, the WaitForMultiple would become even more pressing, but they are not directly related. So, I see 2 issues, both of which usually prevent me personally from using the threading module in the real world. 
By "optional", I meant a way for a platform to slot into existing
"waitable" semantics. Win32 file operations are waitable. I don't really
want native win32 file operations to be in the core, but I would like some
standard way that, if possible, I could map the waitable semantics to
Python waitable semantics.

Thus, although the threading module knows nothing about win32 file objects
or handles, it would be nice if it could still wait on them.

> needs-more-words-ly y'rs - tim

Unfortunately, if I knew exactly what I wanted I would be asking for
implementation advice rather than grasping at straws :-)

Attempting to move from totally raw to half-baked, I suppose this is what I
had in mind:

* Platform optionally defines what a "waitable" object is, in the same way
  it now defines what a lock is. Locks are currently _required_ only with
  threading - waitables would never be required.
* Python defines a "waitable" protocol - eg, a new "tp_wait"/"__wait__"
  slot. If this slot is filled/function exists, it is expected to provide a
  "waitable" object or NULL/None.
* Threading support for platforms that support it define a tp_wait slot
  that maps the Thread ID to the "waitable object"
* Ditto lock support for the platform.
* Extensions such as win32 handles also provide this.
* Dream up extensions to file objects a-la Jack's idea. When a file is
  opened asynch, tp_wait returns non-NULL (via platform specific hooks), or
  NULL when opened sync (making it not waitable). Non-asynch platforms need
  zero work here - the asynch open fails, tp_wait slot never filled in.

Thus, for platforms that provide no extra asynch support, threading can
still only wait on threads and locks. The threading module could take
advantage of the new protocol thereby supporting any waitable object.

Like I said, only half-baked, but I think it expresses a potentially
workable idea. Does this get closer to either a) explaining what I meant,
or b) confirming I am dribbling?
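To make the half-baked idea a little more concrete, here is a pure-Python sketch of what such a protocol might look like. Everything in it is hypothetical: no `__wait__` slot exists in Python, and `Waitable`, `AsyncFile` and `wait_any` are invented names. It assumes a single shared `threading.Condition` stands in for whatever primitive the platform would really supply.

```python
import threading

# One shared condition guards every waitable in this sketch (an assumption;
# a real platform would supply its own primitive, e.g. a Win32 handle).
_cond = threading.Condition()


class Waitable:
    """Hypothetical waitable: starts unsignalled, signal() fires it."""

    def __init__(self):
        self._fired = False

    def signal(self):
        with _cond:
            self._fired = True
            _cond.notify_all()

    def is_signalled(self):
        return self._fired


class AsyncFile:
    """Hypothetical object that takes part in the protocol via __wait__."""

    def __init__(self, name):
        self.name = name
        self._done = Waitable()

    def __wait__(self):
        return self._done      # the "waitable object" for this file

    def complete(self):
        self._done.signal()    # pretend the asynchronous I/O just finished


def wait_any(objects, timeout=None):
    """Return the first object whose waitable has fired, or None on timeout."""
    pairs = []
    for obj in objects:
        hook = getattr(obj, "__wait__", None)
        waitable = hook() if hook is not None else None
        if waitable is None:
            raise TypeError("%r is not waitable" % (obj,))
        pairs.append((obj, waitable))

    def first_fired():
        for obj, waitable in pairs:
            if waitable.is_signalled():
                return obj
        return None

    with _cond:
        _cond.wait_for(lambda: first_fired() is not None, timeout)
        return first_fired()


a, b = AsyncFile("a"), AsyncFile("b")
threading.Timer(0.05, b.complete).start()
print(wait_any([a, b], timeout=5).name)  # b
```

Objects without the hook are rejected, which matches the "never required" point in the list above: a platform with no asynchronous support simply never provides one.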
Biggest problem I see is that the only platform that may take advantage is Windows, thereby making a platform specific solution (such as win32event I use now) perfectly reasonable. Maybe my focus should simply be on allowing win32event.WaitFor* to accept threading instances and standard Python lock objects!! Mark. From Brian at digicool.com Fri Jul 30 16:23:49 1999 From: Brian at digicool.com (Brian Lloyd) Date: Fri, 30 Jul 1999 10:23:49 -0400 Subject: [Python-Dev] RE: NT select.select? Message-ID: <613145F79272D211914B0020AFF6401914DC19@gandalf.digicool.com> > Is there some low limit on maximum number of sockets you can > have in the > Python-NT's select call? A program that happens to work > perfectly on Linux > seems to die on NT around 64(?) sockets to the 'too many file > descriptors > in call' error. > > Any portable ways to bypass it? > > -Markus Hi Markus, It turns out that NT has a default 64 fd limit on arguments to select(). The good news is that you can actually bump the limit up to whatever number you want by specifying a define when compiling python15.dll. If you have the ability to rebuild your python15.dll, you can add the define: FD_SETSIZE=1024 to the preprocessor options for the python15 project to raise the limit to 1024 fds. The default 64 fd limit is too low for anyone trying to run an async server that handles even a modest load, so I've submitted a bug report to python.org asking that the define above find its way into the next python release... Brian Lloyd brian at digicool.com Software Engineer 540.371.6909 Digital Creations http://www.digicool.com From guido at CNRI.Reston.VA.US Fri Jul 30 17:04:58 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 30 Jul 1999 11:04:58 -0400 Subject: [Python-Dev] RE: NT select.select? In-Reply-To: Your message of "Fri, 30 Jul 1999 10:23:49 EDT." 
<613145F79272D211914B0020AFF6401914DC19@gandalf.digicool.com> References: <613145F79272D211914B0020AFF6401914DC19@gandalf.digicool.com> Message-ID: <199907301504.LAA13183@eric.cnri.reston.va.us> > It turns out that NT has a default 64 fd limit on arguments to > select(). The good news is that you can actually bump the limit up > to whatever number you want by specifying a define when compiling > python15.dll. > > If you have the ability to rebuild your python15.dll, you can add > the define: > > FD_SETSIZE=1024 > > to the preprocessor options for the python15 project to raise the > limit to 1024 fds. > > The default 64 fd limit is too low for anyone trying to run > an async server that handles even a modest load, so I've > submitted a bug report to python.org asking that the define > above find its way into the next python release... Brian, (Also in response to your bug report.) I'm a little worried that upping the limit to 1024 would cause some performance problems if you're making a lot of select() calls. The select allocates three arrays of length FD_SETSIZE+3; each array item is 12 bytes. This is a total allocation of more than 36K for a meager select() call! And all that memory also has to be cleared by the FD_ZERO() call. If you actually have that many sockets, that's worth paying for (the socket objects themselves use up just as much memory, and your Python data structures for the sockets, no matter how small, are probably several times bigger), but for a more typical program, I see this as a lot of overhead. Is there a way that this can be done more dynamically, e.g. by making the set size as big as needed on windows but no bigger? (Before you suggest allocating that memory statically, remember it's possible to call select from multiple threads. Allocating 36K of thread-local space for each thread also doesn't sound too pleasant.) 
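The arithmetic behind the "more than 36K" figure can be checked directly. The sketch below takes the numbers from the message above (three arrays, FD_SETSIZE+3 items each, 12 bytes per item); the function name is made up for illustration.

```python
ITEM_SIZE = 12   # bytes per array item, as stated in the message
NUM_ARRAYS = 3   # three arrays are allocated per select() call


def select_scratch_bytes(fd_setsize):
    """Bytes allocated per select() call for a given FD_SETSIZE."""
    return NUM_ARRAYS * (fd_setsize + 3) * ITEM_SIZE


print(select_scratch_bytes(64))    # 2412
print(select_scratch_bytes(1024))  # 36972
```

So raising the limit from 64 to 1024 grows the per-call allocation from about 2.4K to about 36K, whether or not the program ever uses that many sockets.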
--Guido van Rossum (home page: http://www.python.org/~guido/) From Brian at digicool.com Fri Jul 30 20:25:01 1999 From: Brian at digicool.com (Brian Lloyd) Date: Fri, 30 Jul 1999 14:25:01 -0400 Subject: [Python-Dev] RE: NT select.select? Message-ID: <613145F79272D211914B0020AFF6401914DC1E@gandalf.digicool.com> Guido wrote: > > Brian, > > (Also in response to your bug report.) I'm a little worried that > upping the limit to 1024 would cause some performance problems if > you're making a lot of select() calls. The select allocates three > arrays of length FD_SETSIZE+3; each array item is 12 bytes. This is a > total allocation of more than 36K for a meager select() call! > And all > that memory also has to be cleared by the FD_ZERO() call. > > If you actually have that many sockets, that's worth paying for (the > socket objects themselves use up just as much memory, and your Python > data structures for the sockets, no matter how small, are probably > several times bigger), but for a more typical program, I see > this as a > lot of overhead. > > Is there a way that this can be done more dynamically, e.g. by making > the set size as big as needed on windows but no bigger? > > (Before you suggest allocating that memory statically, remember it's > possible to call select from multiple threads. Allocating 36K of > thread-local space for each thread also doesn't sound too pleasant.) > > --Guido van Rossum (home page: http://www.python.org/~guido/) Hmm - after going through all of the Win32 sdks, it doesn't appear to be possible to do it any other way than as a -D option at compile time, so optimizing for the common case (folks who _don't_ need large numbers of fds) is reasonable. Since we distribute a python15.dll with Zope on windows, this isn't that big a deal for us - we can just compile in a higher limit in our distributed dll. 
I was mostly thinking of the win32 users who don't have the ability to rebuild their dll, but maybe this isn't that much of a problem; I suspect that the people who write significant socket apps that would run into this problem probably have access to a compiler if they need it. Brian Lloyd brian at digicool.com Software Engineer 540.371.6909 Digital Creations http://www.digicool.com From da at ski.org Fri Jul 30 20:59:37 1999 From: da at ski.org (David Ascher) Date: Fri, 30 Jul 1999 11:59:37 -0700 (Pacific Daylight Time) Subject: [Python-Dev] RE: NT select.select? In-Reply-To: <613145F79272D211914B0020AFF6401914DC1E@gandalf.digicool.com> Message-ID: On Fri, 30 Jul 1999, Brian Lloyd wrote: > Since we distribute a python15.dll with Zope on windows, this > isn't that big a deal for us - we can just compile in a higher > limit in our distributed dll. I was mostly thinking of the win32 > users who don't have the ability to rebuild their dll, but > maybe this isn't that much of a problem; I suspect that the > people who write significant socket apps that would run into > this problem probably have access to a compiler if they need it. It's a worthy piece of knowledge to document somehow -- I'm not sure where that should be... From fdrake at cnri.reston.va.us Fri Jul 30 21:05:37 1999 From: fdrake at cnri.reston.va.us (Fred L. Drake) Date: Fri, 30 Jul 1999 15:05:37 -0400 (EDT) Subject: [Python-Dev] RE: NT select.select? In-Reply-To: References: <613145F79272D211914B0020AFF6401914DC1E@gandalf.digicool.com> Message-ID: <14241.63361.737047.998159@weyr.cnri.reston.va.us> David Ascher writes: > It's a worthy piece of knowledge to document somehow -- I'm not sure where > that should be... Perhaps a paragraph in the library reference? If someone can send along a clear bit of text (unformatted is fine), I'll be glad to add it. -Fred -- Fred L. Drake, Jr. 
Corporation for National Research Initiatives From tim_one at email.msn.com Thu Jul 1 06:30:30 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 1 Jul 1999 00:30:30 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <199906291201.IAA02535@eric.cnri.reston.va.us> Message-ID: <000101bec37a$7465af00$309e2299@tim> [Guido] > I guess it's all in the perspective. 99.99% of all thread apps I've > ever written use threads primarily to overlap I/O -- if there wasn't > I/O to overlap I wouldn't use a thread. I think I share this > perspective with most of the thread community (after all, threads > originate in the OS world where they were invented as a replacement > for I/O completion routines). Different perspective indeed! Where I've been, you never used something as delicate as a thread to overlap I/O, you instead used the kernel-supported asynch Fortran I/O extensions <0.7 wink>. Those days are long gone, and I've adjusted to that. Time for you to leave the past too : by sheer numbers, most of the "thread community" *today* is to be found typing at a Windows box, where cheap & reliable threads are a core part of the programming culture. They have better ways to overlap I/O there too. Throwing explicit threads at this is like writing a recursive Fibonacci number generator in Scheme, but building the recursion yourself by hand out of explicit continuations . > ... > As far as I can tell, all the examples you give are easily done using > coroutines. Can we call whatever you're asking for coroutines instead > of fake threads? I have multiple agendas, of course. What I personally want for my own work is no more than Icon's generators, formally "semi coroutines", and easily implemented in the interpreter (although not the language) as it exists today. Coroutines, fake threads and continuations are much stronger than generators, and I expect you can fake any of the first three given either of the others. 
Generators fall out of any of them too (*you* implemented generators once using Python threads, and I implemented general coroutines -- "fake threads" are good enough for either of those). So, yes, for that agenda any means of suspending/resuming control flow can be made to work. I seized on fake threads because Python already has a notion of threads. A second agenda is that Python could be a lovely language for *learning* thread programming; the threading module helps, but fake threads could likely help more by e.g. detecting deadlocks (and pointing them out) instead of leaving a thread newbie staring at a hung system without a clue. A third agenda is related to Mark & Greg's, making Python's threads "real threads" under Windows. The fake thread agenda doesn't tie into that, except to confuse things even more if you take either agenda seriously <0.5 frown>. > I think that when you mention threads, green or otherwise colored, > most people who are at all familiar with the concept will assume they > provide I/O overlapping, except perhaps when they grew up in the > parallel machine world. They didn't suggest I/O to me at all, but I grew up in the disqualified world ; doubt they would to a Windows programmer either (e.g., my employer ships heavily threaded Windows apps of various kinds, and overlapped I/O isn't a factor in any of them; it's mostly a matter of algorithm factoring to keep the real-time incestuous subsystems from growing impossibly complex, and in some of the very expensive apps also a need to exploit multiple processors). BTW, I called them "fake" threads to get away from whatever historical baggage comes attached to "green". > Certainly all examples I give in my never-completed thread tutorial > (still available at > http://www.python.org/doc/essays/threads.html) use I/O as the primary > motivator -- The preceding "99.99% of all thread apps I've ever written use threads primarily to overlap I/O" may explain this . 
BTW, there is only one example there, which rather dilutes the strength of the rhetorical "all" ... > this kind of example appeals to simples souls (e.g. downloading more than > one file in parallel, which they probably have already seen in action in > their web browser), as opposed to generators or pipelines or coroutines > (for which you need to have some programming theory background to > appreciate the powerful abstraction possibillities they give). I don't at all object to using I/O as a motivator, but the latter point is off base. There is *nothing* in Comp Sci harder to master than thread programming! It's the pinnacle of perplexity, the depth of despair, the king of confusion (stop before I exaggerate ). Generators in particular get re-invented often as a much simpler approach to suspending a subroutine's control flow; indeed, Icon's primary audience is still among the humanities, and even dumb linguists don't seem to have notable problems picking it up. Threads have all the complexities of the other guys, plus races, deadlocks, starvation, load imbalance, non-determinism and non-reproducibility. Threads simply aren't simple-soul material, no matter how pedestrian a motivating *example* may be. I suspect that's why your tutorial remains unfinished: you had no trouble describing the problem to be solved, but got bogged down in mushrooming complications describing how to use threads to solve it. Even so, the simple example at the end is already flawed ("print" isn't atomic in Python, so the print len(text), url may print the len(text) from one thread followed by the url from another). 
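One common way around that interleaving, sketched in Python 3 under the assumption that every thread reports through a shared stream: build the whole line first, then write it while holding a lock. The `report` and `run` helpers and the example URLs are hypothetical and are not taken from the tutorial.

```python
import io
import sys
import threading

_print_lock = threading.Lock()


def report(text, url, stream=sys.stdout):
    """Emit 'length url' as one unit so threads cannot interleave fields."""
    line = "%d %s\n" % (len(text), url)   # format outside the lock
    with _print_lock:                     # write the finished line atomically
        stream.write(line)


def run(pages, stream=sys.stdout):
    """Report on each (url, text) pair from its own thread."""
    threads = [
        threading.Thread(target=report, args=(text, url, stream))
        for url, text in pages.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


buf = io.StringIO()
run({"http://a.example/": "abc", "http://b.example/": "hello"}, buf)
print(sorted(buf.getvalue().splitlines()))
```

The order of the lines still depends on thread scheduling; only the pairing of each length with its URL is protected.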
It's not hard to find simple-soul examples for generators either
(coroutines & continuations *are* hard to motivate!), especially since
Python's for/__getitem__ protocol is already a weak form of generator, and
xrange *is* a full-blown generator; e.g., a common question on c.l.py is
how to iterate over a sequence backwards:

    for x in backwards(sequence):
        print x

    def backwards(s):
        for i in xrange(len(s)-1, -1, -1):
            suspend s[i]

Nobody needs a comp sci background to understand what that *does*, or why
it's handy. Try iterating over a tree structure instead & then the *power*
becomes apparent; this isn't comp-sci-ish either, unless we adopt a "if
they've heard of trees, they're impractical dreamers" stance. BTW,
iterating over a tree is what os.path.walk does, and a frequent source of
newbie confusion (they understand directory trees, they don't grasp the
callback-based interface; generating (dirname, names) pairs instead would
match their mental model at once). *This* is the stuff for simple souls!

> Another good use of threads (suggested by Sam) is for GUI programming.
> An old GUI system, News by David Rosenthal at Sun, used threads
> programmed in PostScript -- very elegant (and it failed for other
> reasons -- if only he had used Python instead :-).
>
> On the other hand, having written lots of GUI code using Tkinter, the
> event-driven version doesn't feel so bad to me. Threads would be nice
> when doing things like rubberbanding, but I generally agree with
> Ousterhout's premise that event-based GUI programming is more reliable
> than thread-based. Every time your Netscape freezes you can bet
> there's a threading bug somewhere in the code.

I don't use Netscape, but I can assure you the same is true of Internet
Explorer -- except there the threading bug is now somewhere in the OS <0.5
wink>.

Anyway,

1) There are lots of good uses for threads, and especially in the Windows
and (maybe) multiprocessor NumPy worlds.
Those would really be happier with "free-threaded threads", though. 2) Apart from pedagogical purposes, there probably isn't a use for my "fake threads" that couldn't be done easier & better via a more direct (coroutine, continuation) approach; and if I had fake threads, the first thing I'd do for me is rewrite the generator and coroutine packages to use them. So, yes: you win . 3) Python's current threads are good for overlapping I/O. Sometimes. And better addressed by Sam's non-threaded "select" approach when you're dead serious about overlapping lots of I/O. They're also beaten into service under Windows, but not without cries of anguish from Greg and Mark. I don't know, Guido -- if all you wanted threads for was to speed up a little I/O in as convoluted a way as possible, you may have been witness to the invention of the wheel but missed that ox carts weren't the last application . nevertheless-ox-carts-may-be-the-best-ly y'rs - tim From tim_one at email.msn.com Thu Jul 1 09:45:54 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 1 Jul 1999 03:45:54 -0400 Subject: [Python-Dev] ob_refcnt access In-Reply-To: <003301bec11e$d0cfc6d0$0801a8c0@bobcat> Message-ID: <000901bec395$c042cfa0$309e2299@tim> [Mark Hammond] > Im a little unhappy as this [stackless Python] will break the Active > Debugging stuff ... > ... > So the solution MS came up with was, surprise surprise, the machine stack! > :-) The assumption is that all languages will make _some_ use of the > stack, so they ask a language to report its "stack base address" > and "stack size". Using this information, the debugger sorts into the > correct call sequence. Mark, you can't *really* believe Chris is incapable of hacking around this, right? 
It's not even clear there's something to be hacked around, since Python is only Python and there's nothing Christian can do to stop other languages that call into Python from using the machine stack, or to call other languages from Python without using the machine stack. So Python "shows up on the stack" no matter what, cross-language. > ... > Bit I also understand completely the silence on this issue. When the > thread started, there was much discussion about exactly what the > hell these continuation/coroutine thingies even were. The Fuchs paper Sam referenced explained it in simple C terms: a continuation is exactly what C setjmp/longjmp would do if setjmp saved (& longjmp restored) the C stack *in addition* to the program counter and machine registers (which they already save/restore). That's all there is to it, at heart: objects capture data state, continuations capture control flow state. Whenever the OS services an interrupt and drops into kernel mode, it captures a continuation for user mode -- they don't *call* it that, but that's what they're doing, and it's as practical as a pencil (well, *more* practical, actually ). > However, there were precious few real-world examples where they could > be used. Nobody asked for any before now <0.5 wink> -- and I see Sam provided some marvelous ones in response to this. > A few acedemic, theoretical places, I think you undervalue those: people working on the underpinnings of languages strive very hard to come up with the simplest possible examples that don't throw away the core of the problem to be solved. That doesn't mean the theoreticians are too air-headed to understand "real world problems"; it's much more that, e.g., "if you can't compare the fringes of two trees cleanly, you can't possibly do anything harder than that cleanly either -- but if you can do this little bit cleanly, we have strong reason to believe there's a large class of difficult real problems you can also do cleanly". 
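The fringe comparison is short enough to sketch. This version assumes Python 3 generators (not available when this was written) and trees represented as nested lists; `fringe` and `same_fringe` are illustrative names, not code from the thread.

```python
from itertools import zip_longest

_MISSING = object()   # marks "this tree ran out of leaves first"


def fringe(tree):
    """Yield the leaves of a nested-list tree, left to right, lazily."""
    for node in tree:
        if isinstance(node, list):
            yield from fringe(node)   # descend, suspending at each leaf
        else:
            yield node


def same_fringe(a, b):
    """True if both trees have the same leaves in the same order."""
    pairs = zip_longest(fringe(a), fringe(b), fillvalue=_MISSING)
    return all(
        x is not _MISSING and y is not _MISSING and x == y
        for x, y in pairs
    )


print(same_fringe([1, [2, 3]], [[1, 2], 3]))  # True
print(same_fringe([1, 2], [1, [2, 3]]))       # False
```

Because both walks are suspended and resumed one leaf at a time, the comparison stops at the first mismatch without flattening either tree, which is the "cleanly" part of the argument.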
If you need a "practical" example of that, picture e.g. a structure-based diff engine for HTML source. Which are really trees defined by tags, and where text-based comparison can be useless (you don't care if "<LI>" moved from column 12 of line 16 to column 1 of line 17, but you care a lot if the *number* of <LI> tags changed -- so you have to compare two trees *as* trees). But that's a problem easy enough for generators to solve cleanly.

Knuth writes a large (for his books) elevator-simulation program to illustrate coroutines (which are more powerful than generators), and complains that he can't come up with a simpler example that illustrates any point worth making. And he's right! The "literature standard" text-manipulation example at the end of my coroutine module illustrates what Sam was talking about wrt writing straightforward "pull" algorithms for a "push" process, but even that one can be solved with simpler pipeline control flow. At least for *that*, nobody who ever used Unix would doubt the real-world utility of the pipeline model for a nanosecond <1e-9 wink>.

If you want a coroutine example, go to a restaurant and order a meal. When you leave, glance back *real quick*. If everyone in the restaurant is dead, they were a meal-generating subroutine; but if they're still serving other customers, your meal-eating coroutine and their meal-generating coroutine worked to mutual benefit.

> but the only real contender I have seen brought up was Medusa. There were
> certainly no clear examples of "as soon as we have this, I could change
> abc to take advantage, and this would give us the very cool xyz"
>
> So, if anyone else if feeling at all like me about this issue, they are
> feeling all warm and fuzzy knowing that a few smart people are giving us
> the facility to do something we hope we never, ever have to do. :-)

Actually, you'll want to do it a lot. Christian & I have bantered about this a few times a year in pvt, usually motivated by some horrendous kludge posted to c.l.py to solve a problem that any Assistant Professor of Medieval English could solve without effort in Icon. The *uses* aren't esoteric at all.
or-at-least-not-more-than-you-make-'em-ly y'rs - tim

From MHammond at skippinet.com.au Thu Jul 1 10:18:25 1999
From: MHammond at skippinet.com.au (Mark Hammond)
Date: Thu, 1 Jul 1999 18:18:25 +1000
Subject: [Python-Dev] ob_refcnt access
In-Reply-To: <000901bec395$c042cfa0$309e2299@tim>
Message-ID: <00b901bec39a$4baf6b80$0801a8c0@bobcat>

[Tim tells me it will all be obvious if I just think a little harder]

Your points about "academic examples" are well taken. The reality is that, even given these simple examples (which I dared deride as academic), Im not seeing "the point".

I seriously dont doubt at all what you say. However, as Sam and Chris have said many times, it is just a matter of changing the way you think. Interestingly:

Chris said it recently, and continues to say it.
Sam said it to me _years_ ago, and said it repeatedly, but hasnt said it recently.
Tim hasnt really said it yet :-)

This is almost certainly because when your brain does switch, it is a revelation, and really not too hard at all. But after a while, you forget the switch ever took place.

Closest analogy I can think of is OO programming. In my experience trying to _learn_ OO programming from a few misc examples and texts was pointless and very hard. You need a language to play with it in. And when you have one, your brain makes the switch, you see the light, and you can't see what was ever mysterious about it. And you tell everyone its easy; "just change the way you think about data" :-)

But to all us here, OO programming is just so obvious it goes without saying. Occasionally a newbie will have trouble with OO concepts in Python, and I personally have trouble seeing what could _possibly_ be difficult about understanding these very simple concepts.
So Im just as guilty, just not in this particular case :-) So, short of all us here going and discovering the light using a different language (perish the thought :), my original point stands that until Chris' efforts give us something we can easily play with, some of use _still_ wont see what all the fuss is about. (Although I admit it has nothing to do with either the examples or the applicability of the technology to all sorts of things) Which leaves you poor guys in a catch 22 - without noise of some sort from the rest of us, its hard to keep the momentum going, but without basically a fully working Python with continuations, we wont be making much noise. But-I-will-thank-you-all-personally-and-profusely-when-I-do-see-the-light, ly Mark. From jack at oratrix.nl Thu Jul 1 18:05:50 1999 From: jack at oratrix.nl (Jack Jansen) Date: Thu, 01 Jul 1999 18:05:50 +0200 Subject: [Python-Dev] Paul Prescod: add Expat to 1.6 In-Reply-To: Message by Skip Montanaro , Mon, 28 Jun 1999 16:24:46 -0400 (EDT) , <14199.55737.544299.718558@cm-24-29-94-19.nycap.rr.com> Message-ID: <19990701160555.5D44512B0F@oratrix.oratrix.nl> Recently, Skip Montanaro said: > > Andrew> My personal leaning is that we can get more bang for the buck by > Andrew> working on the Distutils effort, so that installing a package > Andrew> like PyExpat becomes much easier, rather than piling more things > Andrew> into the core distribution. > > Amen to that. See Guido's note and my response regarding soundex in the > Doc-SIG. Perhaps you could get away with a very small core distribution > that only contained the stuff necessary to pull everything else from the net > via http or ftp... I don't know whether this subject belongs on the python-dev list (is there a separate distutils list?), but let's please be very careful with this. 
The Perl people apparently think that their auto-install stuff is so easy to use that if you find a tool on the net that needs Perl they'll just give you a few incantations you need to build the "correct" perl to run the tool, but I've never managed to do so. My last try was when I spent 2 days to try and get the perl-based Palm software for unix up and running. With various incompatilble versions of perl installed in /usr/local by the systems staff and knowing nothing about perl I had to give up at some point, because it was costing far more time (and diskspace:-) than the whole thing was worth. Something like mailman is (afaik) easy to install for non-pythoneers because it only depends on a single, well-defined Python distribution. -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From skip at mojam.com Thu Jul 1 21:54:14 1999 From: skip at mojam.com (Skip Montanaro) Date: Thu, 1 Jul 1999 15:54:14 -0400 (EDT) Subject: [Python-Dev] Paul Prescod: add Expat to 1.6 In-Reply-To: <19990701160555.5D44512B0F@oratrix.oratrix.nl> References: <14199.55737.544299.718558@cm-24-29-94-19.nycap.rr.com> <19990701160555.5D44512B0F@oratrix.oratrix.nl> Message-ID: <14203.50921.870411.353490@cm-24-29-94-19.nycap.rr.com> Skip> Amen to that. See Guido's note and my response regarding soundex Skip> in the Doc-SIG. Perhaps you could get away with a very small core Skip> distribution that only contained the stuff necessary to pull Skip> everything else from the net via http or ftp... Jack> I don't know whether this subject belongs on the python-dev list Jack> (is there a separate distutils list?), but let's please be very Jack> careful with this. The Perl people apparently think that their Jack> auto-install stuff is so easy to use ... I suppose I should have added a <0.5 wink> to my note. 
Still, knowing what Guido does and doesn't feel comfortable with in the core distribution would be a good start at seeing where we might like the core to wind up. Skip Montanaro | http://www.mojam.com/ skip at mojam.com | http://www.musi-cal.com/~skip/ 518-372-5583 From tim_one at email.msn.com Fri Jul 2 04:33:23 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 1 Jul 1999 22:33:23 -0400 Subject: [Python-Dev] Paul Prescod: add Expat to 1.6 In-Reply-To: <19990701160555.5D44512B0F@oratrix.oratrix.nl> Message-ID: <000a01bec433$41a410c0$6da02299@tim> [large vs small distributions] [Jack Jansen] > I don't know whether this subject belongs on the python-dev list (is > there a separate distutils list?), but let's please be very careful > with this. [and recounts his problems with Perl] I must say the idea of a minimal distribution sounds very appealing. But then I consider that Guido never got me to even try Tk until he put it into the std Windows distribution, and I've never given anyone any code that won't work with a fresh-from-the-box distribution either. FrankS's snappy "batteries included" wouldn't carry quite the same punch if it got reduced to "coupons for batteries hidden in the docs" . OTOH, I've got about as much use for XML as MarkH has for continuations , and here-- as in many other places --we've been saved so far by Guido's good judgment about what goes in & what stays out. So it's a good thing he can't ever resign this responsibility . if-20%-of-users-need-something-i'd-include-it-else-not-ly y'rs - tim From guido at CNRI.Reston.VA.US Sun Jul 4 03:56:31 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Sat, 03 Jul 1999 21:56:31 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: Your message of "Thu, 01 Jul 1999 00:30:30 EDT." 
<000101bec37a$7465af00$309e2299@tim> References: <000101bec37a$7465af00$309e2299@tim> Message-ID: <199907040156.VAA10874@eric.cnri.reston.va.us> > [Guido] > > I guess it's all in the perspective. 99.99% of all thread apps I've > > ever written use threads primarily to overlap I/O -- if there wasn't > > I/O to overlap I wouldn't use a thread. I think I share this > > perspective with most of the thread community (after all, threads > > originate in the OS world where they were invented as a replacement > > for I/O completion routines). [Tim] > Different perspective indeed! Where I've been, you never used something as > delicate as a thread to overlap I/O, you instead used the kernel-supported > asynch Fortran I/O extensions <0.7 wink>. > > Those days are long gone, and I've adjusted to that. Time for you to leave > the past too : by sheer numbers, most of the "thread community" > *today* is to be found typing at a Windows box, where cheap & reliable > threads are a core part of the programming culture. No quibble so far... > They have better ways to overlap I/O there too. Really? What are they? The non-threaded primitives for overlapping I/O look like Medusa to me: very high performance, but a pain to use -- because of the event-driven programming model! (Or worse, callback functions.) But maybe there are programming techniques that I'm not even aware of? (Maybe I should define what I mean by overlapping I/O -- basically every situation where disk or network I/O or GUI event handling goes on in parallel with computation or with each other. For example, in my book copying a set of files while at the same time displaying some silly animation of sheets of paper flying through the air *and* watching a Cancel button counts as overlapping I/O, and if I had to code this it would probably be a lot simpler to do using threads. 
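The file-copy-with-Cancel scenario described above can be sketched with one worker thread and a shared flag. This is a hedged illustration in current Python, not anything from the thread: `copy_stream` and `run_copy` are hypothetical helpers, the in-memory streams stand in for real files, and the GUI part (animation, Cancel button) is left as a comment.

```python
import io
import threading


def copy_stream(src, dst, cancel, chunk_size=4096):
    """Copy src to dst chunk by chunk; stop early once `cancel` is set.

    Returns the number of bytes copied.
    """
    copied = 0
    while not cancel.is_set():
        chunk = src.read(chunk_size)
        if not chunk:          # end of input
            break
        dst.write(chunk)
        copied += len(chunk)
    return copied


def run_copy(src, dst, cancel):
    """Start the copy in a worker thread; return the thread and a result dict."""
    result = {}

    def worker():
        result["copied"] = copy_stream(src, dst, cancel)

    thread = threading.Thread(target=worker)
    thread.start()
    return thread, result


cancel = threading.Event()
src, dst = io.BytesIO(b"x" * 10000), io.BytesIO()
thread, result = run_copy(src, dst, cancel)
# The main thread is free here to animate, or to call cancel.set()
# when a (hypothetical) Cancel button is pressed.
thread.join()
print(result["copied"])  # 10000, since nothing cancelled the copy
```

The copy loop stays a plain sequential loop; the only concession to concurrency is the check of the `cancel` event between chunks.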
> Throwing explicit threads at this is like writing a recursive > Fibonacci number generator in Scheme, but building the recursion > yourself by hand out of explicit continuations . Aren't you contradicting yourself? You say that threads are ubiquitous and easy on Windows (and I agree), yet you claim that threads are overkill for doing two kinds of I/O or one kind of I/O and some computation in parallel...? I'm also thinking of Java threads. Yes, the GC thread is one of those computational threads you are talking about, but I think the examples I've seen otherwise are mostly about having one GUI component (e.g. an applet) independent from other GUI components (e.g. the browser). To me that's overlapping I/O, since I count GUI events as I/O. > > ... > > As far as I can tell, all the examples you give are easily done using > > coroutines. Can we call whatever you're asking for coroutines instead > > of fake threads? > > I have multiple agendas, of course. What I personally want for my own work > is no more than Icon's generators, formally "semi coroutines", and easily > implemented in the interpreter (although not the language) as it exists > today. > > Coroutines, fake threads and continuations are much stronger than > generators, and I expect you can fake any of the first three given either of > the others. Coroutines, fake threads and continuations? Can you really fake continuations given generators? > Generators fall out of any of them too (*you* implemented > generators once using Python threads, and I implemented general > coroutines -- "fake threads" are good enough for either of those). Hm. Maybe I'm missing something. Why didn't you simply say "you can fake each of the others given any of these"? > So, yes, for that agenda any means of suspending/resuming control flow can > be made to work. I seized on fake threads because Python already has a > notion of threads. 
> > A second agenda is that Python could be a lovely language for *learning* > thread programming; the threading module helps, but fake threads could > likely help more by e.g. detecting deadlocks (and pointing them out) instead > of leaving a thread newbie staring at a hung system without a clue. Yes. > A third agenda is related to Mark & Greg's, making Python's threads "real > threads" under Windows. The fake thread agenda doesn't tie into that, > except to confuse things even more if you take either agenda seriously <0.5 > frown>. What makes them unreal except for the interpreter lock? Python threads are always OS threads, and that makes them real enough for most purposes... (I'm not sure if there are situations on uniprocessors where the interpreter lock screws things up that aren't the fault of the extension writer -- typically, problems arise when an extension does some blocking I/O but doesn't place Py_{BEGIN,END}_ALLOW_THREADS macros around the call.) > > I think that when you mention threads, green or otherwise colored, > > most people who are at all familiar with the concept will assume they > > provide I/O overlapping, except perhaps when they grew up in the > > parallel machine world. > > They didn't suggest I/O to me at all, but I grew up in the disqualified > world ; doubt they would to a Windows programmer either (e.g., my > employer ships heavily threaded Windows apps of various kinds, and > overlapped I/O isn't a factor in any of them; it's mostly a matter of > algorithm factoring to keep the real-time incestuous subsystems from growing > impossibly complex, and in some of the very expensive apps also a need to > exploit multiple processors). Hm, you admit that they sometimes want to use multiple CPUs, which was explcitly excluded from our discussion (since fake threads don't help there), and I bet that they are also watching some kind of I/O (e.g. whether the user says some more stuff). 
> BTW, I called them "fake" threads to get away > from whatever historical baggage comes attached to "green". Agreed -- I don't understand where green comes from at all. Does it predate Java? > > Certainly all examples I give in my never-completed thread tutorial > > (still available at > > http://www.python.org/doc/essays/threads.html) use I/O as the primary > > motivator -- > > The preceding "99.99% of all thread apps I've ever written use threads > primarily to overlap I/O" may explain this . BTW, there is only one > example there, which rather dilutes the strength of the rhetorical "all" ... OK, ok. I was planning on more along the same lines. I may have borrowed this idea from a Java book I read. > > this kind of example appeals to simples souls (e.g. downloading more than > > one file in parallel, which they probably have already seen in action in > > their web browser), as opposed to generators or pipelines or coroutines > > (for which you need to have some programming theory background to > > appreciate the powerful abstraction possibillities they give). > > I don't at all object to using I/O as a motivator, but the latter point is > off base. There is *nothing* in Comp Sci harder to master than thread > programming! It's the pinnacle of perplexity, the depth of despair, the > king of confusion (stop before I exaggerate ). I dunno, but we're probably both pretty poor predictors for what beginning programmers find hard. Randy Pausch (of www.alice.org) visited us this week; he points out that we experienced programmers are very bad at gauging what newbies find hard, because we've been trained "too much". He makes the point very eloquently. He also points out that in Alice, users have no problem at all with parallel activities (e.g. the bunny's head rotating while it is also hopping around, etc.). 
> Generators in particular get re-invented often as a much simpler approach to
> suspending a subroutine's control flow; indeed, Icon's primary audience is
> still among the humanities, and even dumb linguists don't seem to
> have notable problems picking it up. Threads have all the complexities of
> the other guys, plus races, deadlocks, starvation, load imbalance,
> non-determinism and non-reproducibility.

Strange. Maybe dumb linguists are better at simply copying examples without thinking too much about them; personally I had a hard time understanding what Icon was doing when I read about it, probably because I tried to understand how it was done. For threads, I have a simple mental model. For coroutines, my head explodes each time.

> Threads simply aren't simple-soul material, no matter how pedestrian a
> motivating *example* may be. I suspect that's why your tutorial remains
> unfinished: you had no trouble describing the problem to be solved, but got
> bogged down in mushrooming complications describing how to use threads to
> solve it.

No, I simply realized that I had to finish the threading module and release the thread-safe version of urllib.py before I could release the tutorial; and then I was distracted and never got back to it.

> Even so, the simple example at the end is already flawed ("print"
> isn't atomic in Python, so the
>
>     print len(text), url
>
> may print the len(text) from one thread followed by the url from another).

Fine -- that's a great excuse to introduce locks in the next section. (Most threading tutorials I've seen start by showing flawed examples to create an appreciation for the need of locks.)
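The lock-based fix alluded to here might look like the following in current Python. It is a sketch under assumptions, not the tutorial's code: `report` and `print_lock` are hypothetical names, the URLs are placeholders, and no download takes place.

```python
import threading

print_lock = threading.Lock()   # serializes access to stdout
lines = []                      # record of what was printed, for inspection


def report(length, url):
    """Print one 'length url' line without interleaving with other threads."""
    with print_lock:
        line = f"{length} {url}"
        lines.append(line)
        print(line)


threads = [
    threading.Thread(target=report, args=(n, f"http://example.invalid/{n}"))
    for n in range(5)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The order of the five lines still depends on scheduling; the lock only guarantees that each line comes out whole.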
> It's not hard to find simple-soul examples for generators either (coroutines
> & continuations *are* hard to motivate!), especially since Python's
> for/__getitem__ protocol is already a weak form of generator, and xrange
> *is* a full-blown generator; e.g., a common question on c.l.py is how to
> iterate over a sequence backwards:
>
>     for x in backwards(sequence):
>         print x
>
>     def backwards(s):
>         for i in xrange(len(s)-1, -1, -1):
>             suspend s[i]

But backwards() also returns, when it's done. What happens with the return value?

> Nobody needs a comp sci background to understand what that *does*, or why
> it's handy. Try iterating over a tree structure instead & then the *power*
> becomes apparent; this isn't comp-sci-ish either, unless we adopt a "if
> they've heard of trees, they're impractical dreamers" stance. BTW,
> iterating over a tree is what os.path.walk does, and a frequent source of
> newbie confusion (they understand directory trees, they don't grasp the
> callback-based interface; generating (dirname, names) pairs instead would
> match their mental model at once). *This* is the stuff for simple souls!

Probably right, although I think that os.path.walk just has a bad API (since it gives you a whole directory at a time instead of giving you each file).

> > Another good use of threads (suggested by Sam) is for GUI programming.
> > An old GUI system, News by David Rosenthal at Sun, used threads
> > programmed in PostScript -- very elegant (and it failed for other
> > reasons -- if only he had used Python instead :-).
> >
> > On the other hand, having written lots of GUI code using Tkinter, the
> > event-driven version doesn't feel so bad to me. Threads would be nice
> > when doing things like rubberbanding, but I generally agree with
> > Ousterhout's premise that event-based GUI programming is more reliable
> > than thread-based. Every time your Netscape freezes you can bet
> > there's a threading bug somewhere in the code.
> > I don't use Netscape, but I can assure you the same is true of Internet > Explorer -- except there the threading bug is now somewhere in the OS <0.5 > wink>. > > Anyway, > > 1) There are lots of goods uses for threads, and especially in the Windows > and (maybe) multiprocessor NumPy worlds. Those would really be happier with > "free-threaded threads", though. > > 2) Apart from pedagogical purposes, there probably isn't a use for my "fake > threads" that couldn't be done easier & better via a more direct (coroutine, > continuation) approach; and if I had fake threads, the first thing I'd do > for me is rewrite the generator and coroutine packages to use them. So, > yes: you win . > > 3) Python's current threads are good for overlapping I/O. Sometimes. And > better addressed by Sam's non-threaded "select" approach when you're dead > serious about overlapping lots of I/O. This is independent of Python, and is (I think) fairly common knowledge -- if you have 10 threads this works fine, but with 100s of them the threads themselves become expensive resources. But then you end up with contorted code which is why high-performance systems require experts to write them. > They're also beaten into service > under Windows, but not without cries of anguish from Greg and Mark. Not sure what you mean here. > I don't know, Guido -- if all you wanted threads for was to speed up a > little I/O in as convoluted a way as possible, you may have been witness to > the invention of the wheel but missed that ox carts weren't the last > application . What were those applications of threads again you were talking about that could be serviced by fake threads that weren't coroutines/generators? 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From gmcm at hypernet.com Sun Jul 4 05:41:32 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Sat, 3 Jul 1999 22:41:32 -0500
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
In-Reply-To: <199907040156.VAA10874@eric.cnri.reston.va.us>
References: Your message of "Thu, 01 Jul 1999 00:30:30 EDT." <000101bec37a$7465af00$309e2299@tim>
Message-ID: <1281066233-51948648@hypernet.com>

Hmmm. I jumped back into this one, but never saw my post show up...

Threads (real or fake) are useful when more than one thing is "driving" your processing. It's just that in the real world (a place Tim visited, once, but didn't like - or was it vice versa?) those "drivers" are normally I/O.

Guido complained that to do it right would require gathering up all the fds and doing a select. I don't think that's true (at least, for a decent fake thread). You just have to select on the one (to see if the I/O will work) and swap or do it accordingly. Also makes it a bit easier for portability (I thought I heard that Mac's select is limited to sockets).

I see 2 questions. First, is there enough of an audience (Mac, mostly, I think) without native threads to make them worthwhile? Second, do we want to introduce yet more possibilities for brain-explosions by enabling coroutines / continuations / generators or some such? There is practical value there (as Sam has pointed out, and I now concur, watching my C state machine grow out of control with each new client request).

I think the answer to both is probably "yes", and though they have a lot in common technically, they have totally different rationales.
- Gordon From tim_one at email.msn.com Sun Jul 4 10:46:09 1999 From: tim_one at email.msn.com (Tim Peters) Date: Sun, 4 Jul 1999 04:46:09 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <1281066233-51948648@hypernet.com> Message-ID: <000d01bec5f9$a95fa9a0$ea9e2299@tim> [Gordon McMillan] > Hmmm. I jumped back into this one, but never saw my post show up... Me neither! An exclamation point because I see there's a recent post of yours in the Python-Dev archives, but I didn't get it in the mail either. > Threads (real or fake) are useful when more than one thing is > "driving" your processing. It's just that in the real world (a place > Tim visited, once, but didn't like - or was it vice versa?) those > "drivers" are normally I/O. Yes, but that's the consensus view of "real", and so suffers from "ten billion flies can't be wrong" syndrome . If you pitch a parallel system to the NSA, they give you a long list of problems and ask you to sketch the best way to solve them on your platform; as I recall, none had anything to do with I/O even under Guido's definition; instead tons of computation with difficult twists, and enough tight coupling to make threads the natural approach in most cases. If I said any more they'd terminate me with extreme prejudice, and the world doesn't get any realer than that . > Guido complained that to do it right would require gathering up all > the fds and doing a select. I don't think that's true (at least, for > a decent fake thread). You just have to select on the one (to see if > the I/O will work) and swap or do it accordingly. Also makes it a bit > easier for portability (I thought I heard that Mac's select is > limited to sockets). Can you flesh out the "swap" part more? That is, we're in the middle of some C code, so the C stack is involved in the state that's being swapped, and under fake threads we don't have a real thread to magically capture that. > I see 2 questions. 
First, is there enough of an audience (Mac, > mostly, I think) without native threads to make them worthwhile? > Second, do we want to introduce yet more possibilities for > brain-explosions by enabling coroutines / continuations / generators > or some such? There is practical value there (as Sam has pointed out, > and I now concur, watching my C state machine grow out of control > with each new client request). > > I think the answer to both is probably "yes", and though they have a > lot in common technically, they have totally different rationales. a) Generators aren't enough for Sam's designs. b) Fake threads are roughly comparable to coroutines and continuations wrt power (depending on implementation details, continuations may be strictly most powerful, and coroutines least). c) Christian's stackless Python can, I believe, already do full coroutines, and is close to doing full continuations. So soon we can kick the tires instead of each other . or-what-the-heck-we-can-akk-kick-chris-ly y'rs - tim From tim_one at email.msn.com Sun Jul 4 10:45:58 1999 From: tim_one at email.msn.com (Tim Peters) Date: Sun, 4 Jul 1999 04:45:58 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <199907040156.VAA10874@eric.cnri.reston.va.us> Message-ID: <000c01bec5f9$a3e86e80$ea9e2299@tim> [Guido and Tim, Guido and Tim] Ouch! This is getting contentious. Let's unwind the "you said, I said, you said" business a bit. Among the three {coroutines, fake threads, continuations}, I expect any could be serviceably simulated via either of the others. There: just saved a full page of sentence diagramming . All offer a strict superset of generator semantics. It follows that, *given* either coroutines or continuations, I indeed see no semantic hole that would be plugged by fake threads. 
But Python doesn't have any of the three now, and there are two respects in which fake threads may have an advantage over the other two: 1) Pedagogical, a friendlier sandbox for learning "real threads". 2) Python already has *a* notion of threads. So fake threads could be seen as building on that (variation of an existing concept, as opposed to something unprecedented). I'm the only one who seems to see merit in #2, so I won't mention it again: fake threads may be an aid to education, but other than that they're useless crap, and probably cause stains if not outright disk failure . About newbies, I've seen far too many try to learn threads to entertain the notion that they're easier than I think. Threads != parallel programming, though! Approaches like Gelertner's Linda, or Klappholz's "refined languages", *are* easy for newbies to pick up, because they provide clear abstractions that prevent the worst parallelism bugs by offering primitives that *can't* e.g. deadlock. threading.py is a step in the right direction (over the "thread" module) too. And while I don't know what Alice presents as a parallelism API, I'd bet 37 cents unseen that the Alice user doesn't build "parallel activities" out of thread.start_new_thread and raw mutii . About the rest, I think you have a more expansive notion of I/O than I do, although I can squint and see what you mean; e.g., I *could* view almost all of what Dragon's products do as I/O, although it's a real stretch for the thread that just polls the other threads making sure they're still alive . Back to quoting: >> Throwing explicit threads at this is like writing a recursive >> Fibonacci number generator in Scheme, but building the recursion >> yourself by hand out of explicit continuations . > Aren't you contradicting yourself? You say that threads are > ubiquitous and easy on Windows (and I agree), yet you claim that > threads are overkill for doing two kinds of I/O or one kind of I/O and > some computation in parallel...? 
They're a general approach (like continuations) but, yes, given an asynch I/O interface most times I'd much rather use the latter (like I'd rather use recursion directly when it's available). BTW, I didn't say threads were "easy" under Windows: cheap, reliable & ubiquitous, yes. They're easier than under many other OSes thanks to a rich collection of system-supplied thread gimmicks that actually work, but no way are they "easy". Like you did wrt hiding "thread" under "threading", even under Windows real projects have to create usable app-specific thread-based abstractions (c.f. your on-target remark about Netscape & thread bugs). > I'm also thinking of Java threads. Yes, the GC thread is one of those > computational threads you are talking about, but I think the examples > I've seen otherwise are mostly about having one GUI component (e.g. an > applet) independent from other GUI components (e.g. the browser). To > me that's overlapping I/O, since I count GUI events as I/O. Whereas I don't. So let's just agree to argue about this one with ever-increasing intensity . > ... > What makes them unreal except for the interpreter lock? Python > threads are always OS threads, and that makes them real enough for > most purposes... We should move this part to the Thread-SIG; Mark & Greg are doubtless chomping at the bit to rehash the headaches the global lock causes under Windows ; I'm not so keen either to brush off the potential benefits of multiprocessor parallelism, particularly not with the price of CPUs falling into spare-change range. > (I'm not sure if there are situations on uniprocessors where the > interpreter lock screws things up that aren't the fault of the > extension writer -- typically, problems arise when an extension does > some blocking I/O but doesn't place Py_{BEGIN,END}_ALLOW_THREADS > macros around the call.) Hmm! What kinds of problems happen then? 
Just a lack of hoped-for overlap, or actual deadlock (the I/O thread needing another thread to proceed for itself to make progress)? If the latter, the extension writer's view of who's at fault may differ from ours . >> (e.g., my employer ships heavily threaded Windows apps of various >> kinds, and overlapped I/O isn't a factor in any of them; it's mostly >> a matter of algorithm factoring to keep the real-time incestuous >> subsystems from growing impossibly complex, and in some of the very >> expensive apps also a need to exploit multiple processors). > Hm, you admit that they sometimes want to use multiple CPUs, which was > explcitly excluded from our discussion (since fake threads don't help > there), I've been ranting about both fake threads and real threads, and don't recall excluding anything; I do think I *should* have, though . > and I bet that they are also watching some kind of I/O (e.g. whether the > user says some more stuff). Sure, and whether the phone rings, and whether text-to-speech is in progress, and tracking the mouse position, and all sorts of other arguably I/O-like stuff too. Some of the subsytems are thread-unaware legacy or 3rd-party code, and need to run in threads dedicated to them because they believe they own the entire machine (working via callbacks). The coupling is too tight to afford IPC mechanisms, though (i.e., running these in a separate process is not an option). Mostly it's algorithm-factoring, though: text-to-speech and speech-to-text both require mondo complex processing, and the "I/O part" of each is a small link at an end of a massive chain. Example: you say something, and you expect to see "the result" the instant you stop speaking. But the CPU cycles required to recognize 10 seconds of speech consumes, alas, about 10 seconds. 
So we *have* to overlap the speech collection with the signal processing, the acoustic feature extraction, the acoustic scoring, the comparison with canned acoustics for many tens of thousands of words, the language modeling ("sounded most like 'Guido', but considering the context they probably said 'ghee dough'"), and so on. You simply can't write all that as a monolithic algorithm and have a hope of it working; it's most naturally a pipeline, severely complicated in that what pops out of the end of the first stage can have a profound effect on what "should have come out" at the start of the last stage.

Anyway, thread-based pseudo-concurrency is a real help in structuring all that. It's *necessary* to overlap speech collection (input) with computation and result-so-far display (output), but it doesn't stop there.

> ...
> Agreed -- I don't understand where green comes from at all. Does it
> predate Java?

Don't know, but I never heard of it before Java or outside of Solaris.

[about generators & dumb linguists]
> Strange. Maybe dumb linguists are better at simply copying examples
> without thinking too much about them; personally I had a hard time
> understanding what Icon was doing when I read about it, probably
> because I tried to understand how it was done. For threads, I have a
> simple mental model. For coroutines, my head explodes each time.

Yes, I expect the trick for "dumb linguists" is that they don't try to understand. They just use it, and it works or it doesn't. BTW, coroutines are harder to understand because of (paradoxically!) the symmetry; generators are slaves, so you don't have to bifurcate your brain to follow what they're doing .

>>     print len(text), url
>>
>> may print the len(text) from one thread followed by the url
>> from another).

> Fine -- that's a great excuse to introduce locks in the next section.
> (Most threading tutorials I've seen start by showing flawed examples
> to create an appreciation for the need of locks.)
Even better, they start with an endless sequence of flawed examples that makes the reader wonder if there's *any* way to get this stuff to work .

>>     for x in backwards(sequence):
>>         print x
>>
>>     def backwards(s):
>>         for i in xrange(len(s)-1, -1, -1):
>>             suspend s[i]

> But backwards() also returns, when it's done. What happens with the
> return value?

I don't think a newbie would think to ask that: it would "just work" . Seriously, in Icon people quickly pick up that generators have a "natural lifetime", and when they return their life is over. It hangs together nicely enough that people don't have to think about it.

Anyway, "return" and "suspend" both return a value; the only difference is that "return" kills the generator (it can't be resumed again after a return). The pseudo-Python above assumed that a generator signals the end of its life by returning None. Icon uses a different mechanism.

> ...
> Probably right, although I think that os.path.walk just has a bad API
> (since it gives you a whole directory at a time instead of giving you
> each file).

Well, in Ping's absence I've generally fielded the c.l.py questions about tokenize.py too, and there's a pattern: non-GUI people simply seem to find callbacks confusing! os.path.walk has some other UI glitches (like "arg" is the 3rd argument to walk but the 1st arg to the callback, & people don't know what its purpose is anyway), but I think the callback is the core of it (& "arg" is an artifact of the callback interface). I can't help but opine that part of what people find so confusing about call/cc in Scheme is that it calls a function taking a callback argument too.

Generators aren't strong enough to replace call/cc, but they're exactly what's needed to make tokenize's interface match the obvious mental model ("the program is a stream of tokens, and I want to iterate over that"); c.f. Sam's comments too about layers of callbacks vs "normal control flow".
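[Editorial sketch, not part of the original post.] For anyone who wants to run the backwards() pseudo-code quoted above: a minimal version, assuming that the "yield" statement of later Python releases may stand in for the hypothetical "suspend" (no such statement existed in the interpreter under discussion). It also shows the "natural lifetime" point: once the generator finishes it cannot be resumed, and later Python signals that with StopIteration rather than by returning None.

```python
def backwards(s):
    """Yield the items of s from last to first."""
    for i in range(len(s) - 1, -1, -1):
        yield s[i]  # plays the role of the hypothetical "suspend s[i]"


# The caller just iterates; no callback is involved.
for x in backwards("abc"):
    print(x)

# A finished generator cannot be resumed: asking again raises StopIteration.
gen = backwards([1, 2])
first, second = next(gen), next(gen)
try:
    next(gen)
    finished = False
except StopIteration:
    finished = True
```

The loop body stays in the caller's normal control flow, which is the contrast with a callback interface drawn above.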
>> 3) Python's current threads are good for overlapping I/O. >> Sometimes. And better addressed by Sam's non-threaded "select" >> approach when you're dead serious about overlapping lots of I/O. > This is independent of Python, and is (I think) fairly common > knowledge -- if you have 10 threads this works fine, but with 100s of > them the threads themselves become expensive resources. I think people with a Unix background understand that, but not sure about Windows natives. Windows threads really are cheap, which easily slides into abuse; e.g., the recently-fixed electron-width hole in cleaning up thread states required extreme rates of thread death to provoke, and has been reported by multiple Windows users. An SGI guy was kind enough to confirm the test case died for him too, but did any non-Windows person ever report this bug? > But then you end up with contorted code which is why high-performance > systems require experts to write them. Which feeds back into Sam's agenda: the "advanced" control-flow gimmicks can be used by an expert to implement a high-performance system that doesn't require expertise to use. Fake threads would be good enough for that purpose too (while real threads aren't), although he's got his heart set on one of the others. >> I don't know, Guido -- if all you wanted threads for was to speed up a >> little I/O in as convoluted a way as possible, you may have been witness >> to the invention of the wheel but missed that ox carts weren't the last >> application . > What were those applications of threads again you were talking about > that could be serviced by fake threads that weren't coroutines/generators? First, let me apologize for the rhetorical excess there -- it went too far. Forgive me, or endure more of the same . Second, the answer is (of course) "none", but that was a rant about real threads, not fake ones. 
so-close-you-can-barely-tell-'em-apart-ly y'rs - tim From gmcm at hypernet.com Sun Jul 4 15:23:31 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Sun, 4 Jul 1999 08:23:31 -0500 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <000d01bec5f9$a95fa9a0$ea9e2299@tim> References: <1281066233-51948648@hypernet.com> Message-ID: <1281031342-54048300@hypernet.com> [I jump back into a needlessly contentious thread]: [Gordon McMillan - me] > > Threads (real or fake) are useful when more than one thing is > > "driving" your processing. It's just that in the real world (a place > > Tim visited, once, but didn't like - or was it vice versa?) those > > "drivers" are normally I/O. [Tim] > Yes, but that's the consensus view of "real", and so suffers from > "ten billion flies can't be wrong" syndrome . If you pitch a > parallel system to the NSA, I can assure you that gov't work isn't "real", even when the problem domain appears to be, which in this case is assuredly not true . But the point really is that (1) Guido's definition of "I/O" is very broad and (2) given that definition, it probably does account for 99% of the cases. Which is immaterial, if the fix for one fixes the others. > > Guido complained that to do it right would require gathering up all > > the fds and doing a select. I don't think that's true (at least, for > > a decent fake thread). You just have to select on the one (to see if > > the I/O will work) and swap or do it accordingly. Also makes it a bit > > easier for portability (I thought I heard that Mac's select is > > limited to sockets). > > Can you flesh out the "swap" part more? That is, we're in the > middle of some C code, so the C stack is involved in the state > that's being swapped, and under fake threads we don't have a real > thread to magically capture that. Sure - it's spelled "T I S M E R". 
IFRC, this whole thread started with Guido dumping cold water on the comment that perhaps Chris's work could yield green (er, "fake") threads. > > I see 2 questions. First, is there enough of an audience (Mac, > > mostly, I think) without native threads to make them worthwhile? > > Second, do we want to introduce yet more possibilities for > > brain-explosions by enabling coroutines / continuations / generators > > or some such? There is practical value there (as Sam has pointed out, > > and I now concur, watching my C state machine grow out of control > > with each new client request). > > > > I think the answer to both is probably "yes", and though they have a > > lot in common technically, they have totally different rationales. > > a) Generators aren't enough for Sam's designs. OK, but they're still (minorly) mind expanding for someone from the orthodox C / Python world... > b) Fake threads are roughly comparable to coroutines and > continuations wrt power (depending on implementation details, > continuations may be strictly most powerful, and coroutines least). > > c) Christian's stackless Python can, I believe, already do full > coroutines, and is close to doing full continuations. So soon we > can kick the tires instead of each other . So then we're down to Tim faking the others from whatever Chris comes up with? Sounds dandy to me! (Yah, bitch and moan Tim; you'd do it anyway...). (And yes, we're on the "dev" list; this is all experimental; so Guido can just live with being a bit uncomfortable with it ). The rambling arguments have had to do with "reasons" for doing this stuff. I was just trying to point out that there are a couple valid but very different reasons: 1) Macs. 2) Sam. 
almost-a-palindrome-ly y'rs

- Gordon

From tismer at appliedbiometrics.com  Sun Jul  4 16:06:01 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Sun, 04 Jul 1999 16:06:01 +0200
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
References: <000c01bec5f9$a3e86e80$ea9e2299@tim>
Message-ID: <377F6A49.B48E8000@appliedbiometrics.com>

Just a few clarifications.

I have no time, but need to share what I learned.

Tim Peters wrote:
>
> [Guido and Tim, Guido and Tim]

...

> Among the three {coroutines, fake threads, continuations}, I expect any
> could be serviceably simulated via either of the others. There: just saved
> a full page of sentence diagramming . All offer a strict superset of
> generator semantics.

I have just proven that this is not true. Full continuations cannot be expressed by coroutines. All the rest is true.

Coroutines and fake threads just need the absence of the C stack. To be more exact: It needs that the current state of the C stack is independent from executing bound Python code (which frames are).

Now the big surprise: This *can* be done without removing the C stack. It can give more speed to let the stack wind up to some degree and wind down later. Even some Scheme implementations are doing this. But the complexity to make this work correctly is even higher than to be stackless whenever possible. So this is the foundation, but improvements are possible and likely to appear.

Anyway, with this, you can build fake threads, coroutines and generators. They all do need a little extra treatment. Switching of context, how to stop a coroutine, how to catch exceptions and so on. You can do all that with some C code. I just believe that even that can be done with Python. Here the unsayable continuation word appears. You must have them if you want to try the above *in* Python.

Reason why continuations are the hardest of the above to implement and cannot be expressed by them: A continuation is the future of some computation.
It allows to change the order of execution of a frame in a radical way. A frame can have as many as one dormant continuation per every function call which appears lexically, and it cannot predict which of these is actually a continuation. From klm at digicool.com Sun Jul 4 16:30:00 1999 From: klm at digicool.com (Ken Manheimer) Date: Sun, 4 Jul 1999 10:30:00 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) References: <000c01bec5f9$a3e86e80$ea9e2299@tim> <377F6A49.B48E8000@appliedbiometrics.com> Message-ID: <002601bec629$b38eedc0$5a57a4d8@erols.com> I have to say thank you, christian! I think your intent - provide the basis for designers of python's advanced control mechanisms to truly explore, and choose the direction in a well informed way - is ideal, and it's a rare and wonderful opportunity to be able to pursue something like an ideal course. Thanks to your hard work. Whatever comes of this, i think we all have at least refined our understandings of the issues - i know i have. (Thanks also to the ensuing discussion's clarity and incisiveness - i need to thank everyone involved for that...) I may not be able to contribute particularly to the implementation, but i'm glad to be able to grasp the implications as whatever proceeds, proceeds. And i actually expect that the outcome will be much better informed than it would have been without your following through on your own effort to understand. Yay! Ken klm at digicool.com From gmcm at hypernet.com Sun Jul 4 20:25:20 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Sun, 4 Jul 1999 13:25:20 -0500 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <377F6A49.B48E8000@appliedbiometrics.com> Message-ID: <1281013244-55137551@hypernet.com> I'll second Ken's congratulations to Christian! [Christian] > ... Full continuations > cannot be expressed by coroutines. All the rest is true. I beg enlightenment from someone more familiar with these high-falutin' concepts. 
Would the following characterization be accurate?

All these beasts (continuations, coroutines, generators) involve the idea of "resumable", but:

 A generator's state is wholly self-contained
 A coroutine's state is not necessarily self-contained but it is stable
 Continuations may have volatile state.

Is this right, wrong, necessary, sufficient...??

goto-beginning-to-look-attractive-ly y'rs

- Gordon

From bwarsaw at cnri.reston.va.us  Mon Jul  5 00:14:36 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Sun, 4 Jul 1999 18:14:36 -0400 (EDT)
Subject: [Python-Dev] Mail getting lost? (was RE: Fake threads)
References: <1281066233-51948648@hypernet.com> <000d01bec5f9$a95fa9a0$ea9e2299@tim>
Message-ID: <14207.56524.360202.939414@anthem.cnri.reston.va.us>

>>>>> "TP" == Tim Peters writes:

    TP> Me neither! An exclamation point because I see there's a
    TP> recent post of yours in the Python-Dev archives, but I didn't
    TP> get it in the mail either.

A bad performance problem in Mailman was causing cpu starvation and (I'm surmising) lost messages. I believe I've fixed this in the version currently running on python.org.

If you think messages are showing up in the archives but you are still not seeing them delivered to you, please let me know via webmaster at python.org!

-Barry

From guido at CNRI.Reston.VA.US  Mon Jul  5 14:12:41 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Mon, 05 Jul 1999 08:12:41 -0400
Subject: [Python-Dev] Welcome Jean-Claude Wippler
Message-ID: <199907051212.IAA11729@eric.cnri.reston.va.us>

We have a new python-dev member. Welcome, Jean-Claude! (It seems you are mostly interested in lurking, since you turned on digest mode :-)

Remember, the list's archives and member list are public; both are accessible via

    http://www.python.org/mailman/listinfo/python-dev

I would welcome more members -- please suggest names and addresses to me!
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at CNRI.Reston.VA.US  Mon Jul  5 14:06:03 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Mon, 05 Jul 1999 08:06:03 -0400
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
In-Reply-To: Your message of "Sun, 04 Jul 1999 13:25:20 CDT." <1281013244-55137551@hypernet.com>
References: <1281013244-55137551@hypernet.com>
Message-ID: <199907051206.IAA11699@eric.cnri.reston.va.us>

> [Christian]
> > ... Full continuations
> > cannot be expressed by coroutines. All the rest is true.

[Gordon]
> I beg enlightenment from someone more familiar with these
> high-falutin' concepts. Would the following characterization be
> accurate?
>
> All these beasts (continuations, coroutines, generators) involve the
> idea of "resumable", but:
>
>  A generator's state is wholly self-contained
>  A coroutine's state is not necessarily self-contained but it is stable
>  Continuations may have volatile state.
>
> Is this right, wrong, necessary, sufficient...??

I still don't understand all of this (I have not much of an idea of what Christian's search for hidden registers is about and what kind of analysis he needs) but I think of continuations as requiring (theoretically) copying the current stack (to and from), while generators and coroutines just need their own piece of stack set aside.

The difference between any of these and threads (fake or real) is that they pass control explicitly, while threads (typically) presume pre-emptive scheduling, i.e. they make independent parallel progress without explicit synchronization. (Hmm, how do you do this with fake threads? Or are these only required to switch whenever you touch a mutex?)

I'm not sure if there's much of a difference between generators and coroutines -- it seems just the termination convention. (Hmm... would/should a generator be able to raise an exception in its caller? A coroutine?)
--Guido van Rossum (home page: http://www.python.org/~guido/)

From tim_one at email.msn.com  Mon Jul  5 08:55:02 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Mon, 5 Jul 1999 02:55:02 -0400
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
In-Reply-To: <1281013244-55137551@hypernet.com>
Message-ID: <000101bec6b3$4e752be0$349e2299@tim>

[Gordon]
> I beg enlightenment from someone more familiar with these
> high-falutin' concepts. Would the following characterization be
> accurate?
>
> All these beasts (continuations, coroutines, generators) involve the
> idea of "resumable", but:
>
>  A generator's state is wholly self-contained
>  A coroutine's state is not necessarily self-contained but it is
>  stable
>  Continuations may have volatile state.
>
> Is this right, wrong, necessary, sufficient...??
>
> goto-beginning-to-look-attractive-ly y'rs

"goto" is deliciously ironic, for a reason to become clear . Here's my biased short course.

NOW

First, I have the feeling most people would panic if we simply described Python's current subroutine mechanism under a new name <0.9 wink>. I'll risk that. When Python makes a call, it allocates a frame object. Attached to the frame is the info everyone takes for granted so thinks is "simple & obvious" . Chiefly,

    "the locals" (name -> object bindings)

    a little evaluation stack for holding temps and dynamic
    block-nesting info

    the offset to the current bytecode instruction, relative to the
    start of the code object's fixed (immutable) bytecode vector

When a subroutine returns, it decrefs the frame and then the frame typically goes away; if it returns because of an exception, though, traceback objects may keep the frame alive.

GENERATORS

Generators add two new abstract operations, "suspend" and "resume". When a generator suspends, it's exactly like a return today except we simply decline to decref the frame. That's it! The locals, and where we are in the computation, aren't thrown away. A "resume" then consists of *re*starting the frame at its next bytecode instruction, with the retained frame's locals and eval stack just as they were.

Some generator properties:

+ In implementation terms a trivial variation on what Python currently does.

+ They're asymmetric: "suspend" is something only a generator can do, and "resume" something only its caller can do (this does not preclude a generator from being "the caller" wrt some other generator, though, and indeed that's very useful in practice).

+ A generator always returns control directly to its caller, at the point the caller invoked the generator. And upon resumption, a generator always picks up where it left off.

+ Because a generator remembers where it is and what its locals are, its state and "what to do next" don't have to be encoded in global data structures then decoded from scratch upon entry. That is, whenever you build a little (or large!) state machine to figure out "what to do next" from a collection of persistent flags and state vrbls, chances are good there's a simple algorithm dying to break free of that clutter .

COROUTINES

Coroutines add only one new abstract operation, "transfer". They're fully symmetric so can get away with only one. "transfer" names a coroutine to transfer to, and gives a value to deliver to it (there are variations, but this one is common & most useful). When A transfers to B, it acts like a generator "suspend" wrt A and like a generator "resume" wrt B. So A remembers where it is, and what its locals etc are, and B gets restarted from the point *it* last transferred to someone else.

Coroutines grew up in simulation languages because they're an achingly natural way to model independent objects that interact with feedback.
There each object (which may itself be a complex system of other stuff) is written as an infinite loop, transferring control to other objects when it has something to tell them, and transferred to by other objects when they have something to tell it.

A Unix pipeline "A | B | C | D" doesn't exploit the full power but is suggestive. A may be written as

    while 1:
        x = compute my next output
        B.transfer(x)     # resume B with my output

B as

    while 1:
        x = A.transfer()  # resume A to get my input
        y = compute something from x and my own history
        C.transfer(y)     # resume C with my output

C as

    while 1:
        x = B.transfer()  # resume B to get my input
        y = compute something from x and my own history
        D.transfer(y)     # resume D with my output

and D as

    while 1:
        x = C.transfer()  # resume C to get my input
        y = compute something from x and my own history
        print y

If e.g. C collapses pairs of values from B, it can be written instead as

    while 1:
        # get a pair of B's
        x = B.transfer()
        y = B.transfer()
        z = f(x, y, whatever)
        D.transfer(z)     # resume D with my output

It's a local modification to C: B doesn't know and shouldn't need to know. This keeps complex algorithms manageable as things evolve.

Initialization and shutdown can be delicate, but once the pipe is set up it doesn't even matter which of {A, B, C, D} first gets control! You can view A as pushing results through the pipe, or D as pulling them, or whatever. In reality they're all equal partners.

Why these are so much harder to implement than generators: "transfer" *names* who next gets control, while generators always return to their (unnamed) caller. So a generator simply "pops the stack" when it suspends, while coroutine flow need not be (and typically isn't) stack-like.

In Python this is currently a coroutine-killer, because the C stack gets intertwined. So if coroutine A merely calls (in the regular sense) function F, and F tries to transfer to coroutine B, the info needed to resume A includes the chunk of the C stack between A and F.
And that's why the Python coroutine implementation I referenced earlier uses threads under the covers (where capturing pieces of the C stack isn't a problem). Early versions of coroutines didn't allow for this, though! At first coroutines could only transfer *directly* to other coroutines, and as soon as a coroutine made "a regular call" transfers were prohibited until the call returned (unless the called function kicked off a brand new collection of coroutines, which could then transfer among themselves -- making the distinction leads to convoluted rules, so modern practice is to generalize from the start). Then the current state of each coroutine was contained in a single frame, and it's really no harder to implement than generators. Knuth seems to have this restricted flavor of coroutine in mind when he describes generator behavior as "semi-coroutine". CONTINUATIONS Given the pedagogical structure so far, you're primed to view continuations as an enhancement of coroutines. And that's exactly what will get you nowhere . Continuations aren't more elaborate than coroutines, they're simpler. Indeed, they're simpler than generators, and even simpler than "a regular call"! That's what makes them so confusing at first: they're a different *basis* for *all* call-like behavior. Generators and coroutines are variations on what you already know; continuations challenge your fundamental view of the universe. Legend has it they were discovered when theorists were trying to find a solid reason for why goto statements suck: the growth of "denotational semantics" (DS) boomed at the same time "structured programming" took off. The former is a solid & fruitful approach to formally specifying the semantics of programming languages, built on the lambda calculus (and so dear to the Lisp/Scheme community -- this all ties together, of course ). 
The early hope was that goto statements would prove to present intractable problems for formal specification, and then "that's why they suck: we can't even sort them out on paper, let alone in practice". But in one of God's cleverer tricks on the programming world , the semantics of goto turned out to be trivial: at a branch point, you can go one of two ways. Represent one of those ways by a function f that computes what happens if you branch one way, and the other way by a function g. Then an if+goto simply picks one of f or g as "the continuation" of the program, depending on whether the "if" condition is true or false. And a plain goto simply replaces the current continuation with a different one (representing what happens at the branch target) unconditionally. So goto turned out to be simpler (from the DS view) than even an assignment stmt! I've often suspected theorists were *surprised* (and maybe appalled <0.7 wink>) when the language folks went on to *implement* the continuation idea. Don't really know, but suppose it doesn't matter anyway. The fact is we're stuck with them now . In theory a continuation is a function that computes "the rest of the program", or "its future". And it really is like a supercharged goto! It's the formal DS basis for all control flow, from goto stmts to exception handling, subsuming vanilla call flow, recursion, generators, coroutines, backtracking, and even loops along the way. To a certain frame of mind (like Sam's, and Christian is temporarily under his evil influence ), this relentless uniformity & consistency of approach is very appealing. Guido tends to like his implementations to mirror his surface semantics, though, and if he has ten constructs they're likely to be implemented ten ways. View that as a preview of future battles that have barely been hinted at so far <0.3 wink>. 
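[Editorial sketch, not part of the original post.] The "if+goto picks one of f or g" reading above can be mimicked in a few lines of Python, with ordinary functions standing in for continuations ("the rest of the program"). Every name here (if_goto, goto, report_even, report_odd, classify) is hypothetical and chosen only for illustration; this is a toy model of the denotational view, not how any real continuation implementation works.

```python
# Sketch only: plain functions stand in for continuations.

def if_goto(condition, f, g):
    """An if+goto: choose f or g as "the continuation" of the program."""
    return f if condition else g


def goto(target):
    """A plain goto: unconditionally replace the current continuation."""
    return target


def report_even(n):
    return "%d is even" % n


def report_odd(n):
    return "%d is odd" % n


def classify(n):
    # Everything after the branch point is represented by one of two functions.
    rest_of_program = if_goto(n % 2 == 0, report_even, report_odd)
    return rest_of_program(n)


print(classify(4))
print(classify(7))
```

Note that a real continuation, once invoked, never returns to the place that invoked it; the ordinary function calls used here do return, which is exactly the shortcut this toy model takes.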
Anyway, in implementation terms a continuation "is like" what a coroutine would be if you could capture its resumption state at any point (even without the coroutine's knowledge!) and assign that state to a vrbl. So we could say it adds an abstract operation "capture", which essentially captures the program counter, call stack, and local (in Python terms) "block stack" at its point of invocation, and packages all that into a first-class "continuation object". IOW, a building block on top of which a generator's suspend, and the suspend half of a coroutine transfer, can be built. In a pure vision, there's no difference at all between a regular return and the "resume" half of a coroutine transfer: both amount to no more than picking some continuation to evaluate next. A continuation can be captured anywhere (even in the middle of an expression), and any continuation can be invoked at will from anywhere else. Note that "invoking a continuation" is *not* like "a call", though: it's abandoning the current continuation, *replacing* it with another one. In formal DS this isn't formally true (it's still "a call" -- a function application), but in practice it's a call that never returns to its caller so the implementation takes a shortcut. Like a goto, this is as low-level as it gets, and even hard-core continuation fans don't use them directly except as a means to implement better-behaved abstractions. As to whether continuations have "volatile state", I'm not sure what that was asking. If a given continuation is invoked more than once (which is something that's deliberately done when e.g. implementing backtracking searches), then changes made to the locals by the first invocation are visible to the second (& so on), so maybe the answer is "yes". It's more accurate to think of a continuation as being immutable, though: it holds a reference to the structure that implements name bindings, but does not copy (save or restore) the bindings. 
Quick example, given:

(define continuation 0)

(define (test)
  (let ((i 0))
    (call/cc (lambda (k)
               (set! continuation k)))
    (set! i (+ i 1))
    i))

That's like the Python:

def test():
    i = 0
    global continuation
    continuation = magic to resume at the start of the next line
    i = i + 1
    return i

Then (this is interactive output from a Scheme shell):

> (test) ; Python "test()"
1
> (continuation) ; Python "continuation()"
2
> (continuation)
3
> (define thisguy continuation) ; Python "thisguy = continuation"
> (test)
1
> (continuation)
2
> (thisguy)
4
>

too-simple-to-be-obvious?-ly y'rs - tim

From bwarsaw at cnri.reston.va.us  Mon Jul  5 18:55:01 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Mon, 5 Jul 1999 12:55:01 -0400 (EDT)
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
References: <1281013244-55137551@hypernet.com> <000101bec6b3$4e752be0$349e2299@tim>
Message-ID: <14208.58213.449486.917974@anthem.cnri.reston.va.us>

Wow. That was by far the clearest tutorial on the subject I think I've read. I guess we need (for Tim to have) more 3 day holiday weekends.

i-vote-we-pitch-in-and-pay-tim-to-take-/every/-monday-off-so-he-can-write-
more-great-stuff-like-this-ly y'rs,
-Barry

From skip at mojam.com  Mon Jul  5 19:54:45 1999
From: skip at mojam.com (Skip Montanaro)
Date: Mon, 5 Jul 1999 13:54:45 -0400 (EDT)
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
In-Reply-To: <14208.58213.449486.917974@anthem.cnri.reston.va.us>
References: <1281013244-55137551@hypernet.com> <000101bec6b3$4e752be0$349e2299@tim> <14208.58213.449486.917974@anthem.cnri.reston.va.us>
Message-ID: <14208.61767.893387.713711@cm-24-29-94-19.nycap.rr.com>

    Barry> Wow. That was by far the clearest tutorial on the subject I
    Barry> think I've read. I guess we need (for Tim to have) more 3 day
    Barry> holiday weekends.

What he said.
Skip

From MHammond at skippinet.com.au  Tue Jul  6 03:16:45 1999
From: MHammond at skippinet.com.au (Mark Hammond)
Date: Tue, 6 Jul 1999 11:16:45 +1000
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
In-Reply-To: <000101bec6b3$4e752be0$349e2299@tim>
Message-ID: <000401bec74d$37c8d370$0801a8c0@bobcat>

> NOW

No problems, fine sailing...

> GENERATORS

Cruising along - nice day to be out!

> COROUTINES

Such a pleasant day!

> CONTINUATIONS

Are they clouds I see?

> Given the pedagogical structure so far, you're primed to view
> continuations
> as an enhancement of coroutines. And that's exactly what will get you
> nowhere . Continuations aren't more elaborate than coroutines,
> they're simpler. Indeed, they're simpler than generators,

A storm warning...

> Legend has it they were discovered when theorists were trying
> to find a
> solid reason for why goto statements suck: the growth of
> "denotational
> semantics" (DS) boomed at the same time "structured
> programming" took off.
> The former is a solid & fruitful approach to formally specifying the

She's taking on water!

> In theory a continuation is a function that computes "the rest of the
> program", or "its future".

OK - before I abandon ship, I might need my hand-held. Before I start, let me echo Skip and Barry - an excellent precis of a topic I knew nothing about (as I made you painfully aware!) And I will avoid asking you to explain the above paragraph again for now :-)

I'm a little confused by how these work in practice. I can see how continuations provide the framework to do all these control things.

It is clear to me how you can capture the "state" of a running program. Indeed, this is exactly what it seems generators and coroutines do.

With continuations, how is the state captured or created? Eg, in the case of implementing a goto or a function call, there doesn't seem to be a state available.
Does the language supporting continuations allow you to explicitly create one from any arbitrary position? I think you sort-of answered that below:

> Anyway, in implementation terms a continuation "is like" what a coroutine
> would be if you could capture its resumption state at any point (even
> without the coroutine's knowledge!) and assign that state to a vrbl. So we

This makes sense, although it implies a "running state" is necessary for this to work. In the case of transferring control to somewhere you have never been before (eg, a goto or a new function call) how does this work?

Your example:

> def test():
>     i = 0
>     global continuation
>     continuation = magic to resume at the start of the next line
>     i = i + 1
>     return i

My main problem is that this looks closer to your description of a kind-of one-sided coroutine - ie, instead of only being capable of transferring control, you can assign the state. I can understand that fine. But in the example, the function _is_ aware its state is being captured - indeed, it is explicitly capturing it.

My only other slight conceptual problem was how you implement functions, as I don't understand how the concept of return values fits in at all. But I'm sure that would become clearer when the rest of the mud is wiped from my eyes.

And one final question: In the context of your tutorial, what do Chris' latest patches arm us with? Given my new-found expertise in this matter I would guess that the guts is there to have at least co-routines, as capturing the state of a running Python program, and restarting it later is possible. I'm still unclear about continuations WRT "without the co-routine's knowledge", so really unsure what is needed here...

The truly final question:-) Assuming Chris' patches were 100% bug free and reliable (I'm sure they are very close :-) what would the next steps be to take advantage of it in a "clean" way? ie, assuming Guido blesses them, what exactly could I do in Python?
(All I really know is that the C stack has gone - that's it!)

Thanks for the considerable time it must be taking to enlighten us!

Mark.

From jcw at equi4.com Tue Jul 6 11:27:13 1999
From: jcw at equi4.com (Jean-Claude Wippler)
Date: Tue, 06 Jul 1999 11:27:13 +0200
Subject: [Python-Dev] Re: Welcome Jean-Claude Wippler
Message-ID: <3781CBF1.B360D466@equi4.com>

Thank you Guido, for admitting this newbie to Python-dev :)

[Guido: ... you are mostly interested in lurking ... digest mode ...]

Fear of being flooded by email, a little shy (who, me?), and yes, a bit of curiosity. Gosh, I got to watch my steps, you figured it all out :)

Thanks again. I went through the last month or so of discussion, and am fascinated by the topics and issues you guys are dealing with. And now, seeing Tim's generator/coroutine/continuations description is fantastic. Makes it obvious that I'm already wasting way too much bandwidth.

When others come to mind, I'll let them know about this list. But so far, everyone I can come up with already is a member, it seems.

-- Jean-Claude

From guido at CNRI.Reston.VA.US Tue Jul 6 17:08:37 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 06 Jul 1999 11:08:37 -0400
Subject: [Python-Dev] Welcome to Chris Petrilli
Message-ID: <199907061508.LAA12663@eric.cnri.reston.va.us>

Chris, would you mind posting a few bits about yourself? Most of the people on this list have met each other at one point or another (with the big exception of the elusive Tim Peters :-); it's nice to know more than a name...
--Guido van Rossum (home page: http://www.python.org/~guido/)

From petrilli at amber.org Tue Jul 6 17:16:10 1999
From: petrilli at amber.org (Christopher Petrilli)
Date: Tue, 6 Jul 1999 11:16:10 -0400
Subject: [Python-Dev] Welcome to Chris Petrilli
In-Reply-To: <199907061508.LAA12663@eric.cnri.reston.va.us>; from Guido van Rossum on Tue, Jul 06, 1999 at 11:08:37AM -0400
References: <199907061508.LAA12663@eric.cnri.reston.va.us>
Message-ID: <19990706111610.A4585@amber.org>

On Tue, Jul 06, 1999 at 11:08:37AM -0400, Guido van Rossum wrote:
> Chris, would you mind posting a few bits about yourself? Most of the
> people on this list have met each other at one point or another (with
> the big exception of the elusive Tim Peters :-); it's nice to know
> more than a name...

As we are all aware, Tim is simply the graduate project of an AI student, running on a network of Symbolics machines :-)

Honestly though, about me? Um, well, I'm now (along with Brian Lloyd) the Product Management side of Digital Creations, and Zope, so I have a very vested interest in seeing Python succeed---besides my general belief that the better language SHOULD win. My background is actually in architecture, but I've spent the past 5 years working in the cryptography world, mostly in smart cards and PKI. My computer background is bizarre, twisted and quite nefarious... having grown up on a PDP-8/e, rather than PCs. And if the fact that I own 4 Lisp machines means anything, I'm afraid to ask what!

For now, I'm just going to watch the masters at work.
:-)

Chris
--
| Christopher Petrilli ``Television is bubble-gum for
| petrilli at amber.org the mind.''-Frank Lloyd Wright

From tim_one at email.msn.com Wed Jul 7 03:52:15 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 6 Jul 1999 21:52:15 -0400
Subject: [Python-Dev] Fancy control flow
Message-ID: <000301bec81b$56e87660$c99e2299@tim>

Responding to a msg of Guido's that shows up in the archives but didn't come across the mail link (the proper authorities have been notified, and I'm sure appropriate heads will roll at the appropriate pace ...).

> From guido at CNRI.Reston.VA.US Mon, 05 Jul 1999 08:06:03 -0400
> Date: Mon, 05 Jul 1999 08:06:03 -0400
> From: Guido van Rossum guido at CNRI.Reston.VA.US
> Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)

[generators, coroutines, continuations]
> I still don't understand all of this (I have not much of an idea of
> what Christian's search for hidden registers is about and what kind of
> analysis he needs) but I think of continuations as requiring
> (theoretically) copying the current stack (to and from), while
> generators and coroutines just need their own piece of stack set aside.

A generator needs one frame, period (its own!). "Modern" coroutines can get more involved, mixing regular calls with coroutine transfers in arbitrary ways.

About Christian's mysterious quest, we've been pursuing it offline. By "hidden registers" I think he means stuff on the eval stack that should *not* be saved/restored as part of a continuation's state. It's not clear to me that this isn't the empty set. The most debatable case I've seen is the Python "for" loop, which hides an anonymous loop counter on the stack.

for i in seq:
    if func1(i):
        func2(i)

This is more elaborate than necessary, but since it's the one we've discussed offline I'll just stick with it.

Suppose func1 saves a continuation on the first iteration, and func2 invokes that continuation on the fifth.
Does the resumed continuation "see" the loop as being on its first iteration or as being on its fifth?

In favor of the latter is that the loop above "should be" equivalent to this:

hidden = 0
while 1:
    try:
        temp = seq[hidden]
    except IndexError:
        break
    hidden = hidden + 1
    i = temp
    if func1(i):
        func2(i)

since that's what "for" *does* in Python. With the latter spelling, it's clear that the continuation should see the loop as being on its fifth iteration (continuations see changes in bindings, and making the loop counter a named local exposes it to that rule). But if the entire eval stack is (conceptually) saved/restored, the loop counter is part of it, so the continuation will see the loop counter at its old value.

I think it's arguable either way, and argued in favor of "fifth" initially. Now I'm uncertain, but leaning toward "first".

> The difference between any of these and threads (fake or real) is that
> they pass control explicitly, while threads (typically) presume
> pre-emptive scheduling, i.e. they make independent parallel progress
> without explicit synchronization.

Yes.

> (Hmm, how do you do this with fake threads? Or are these only required
> to switch whenever you touch a mutex?)

I'd say they're only *required* to switch when one tries to acquire a mutex that's already locked. It would be nicer to switch them as ceval already switches "real threads", that is give another one a shot every N bytecodes.

> I'm not sure if there's much of a difference between generators and
> coroutines -- it seems just the termination convention.

A generator is a semi-coroutine, but is the easier half.

> (Hmm... would/should a generator be able to raise an exception in its
> caller?

Definitely. This is all perfectly clear for a generator -- it has a unique & guaranteed still-active place to return *to*. Years ago I tried to rename them "resumable functions" to get across what a trivial variation of plain functions they really are ...

> A coroutine?)

This one is muddier.
A (at line A1) transfers to B (at line B1), which transfers at line B2 to A (at line A2), which at line A3 transfers to B (at line B3), and B raises an exception at line B4. The obvious thing to do is to pass it on to line A3+1, but what if that doesn't catch it either? We got to A3 from A2 from B2 from B1, but B1 is long gone.

That's a real difference with generators: resuming a generator is stack-like, while a co-transfer is just moving control around a flat graph, like pushing a pawn around a chessboard.

The coroutine implementation I posted 5 years ago almost punted on this one: if any coroutine suffered an unhandled exception, all coroutines were killed and an EarlyExit exception was raised in "the main coroutine" (the name given to the thread of your code that created the coroutine objects to begin with).

Deserves more thought than that, though.

or-maybe-it-doesn't-ly y'rs - tim

From tim_one at email.msn.com Wed Jul 7 06:18:13 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Wed, 7 Jul 1999 00:18:13 -0400
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
In-Reply-To: <000401bec74d$37c8d370$0801a8c0@bobcat>
Message-ID: <000001bec82f$bc171500$089e2299@tim>

[Mark Hammond]
> ...
> Thanks for the considerable time it must be taking to enlighten us!

You're welcome, but the holiday weekend is up and so is my time. Thank (all of) *you* for the considerable time it must take to endure all this! Let's hit the highlights (or lowlights, depending on your view):

> ...
> I'm a little confused by how these [continuations] work in practice.

Very delicately. Earlier I posted a continuation-based implementation of generators in Scheme, and Sam did the same for a hypothetical Python with a call/cc equivalent. Those are typical enough. Coroutines are actually easier to implement (using continuations), thanks to their symmetry.

Again, though, you never want to muck with continuations directly! They're too wild.
You get an expert to use them in the bowels of an implementation of something else. > ... > It is clear to me how you can capture the "state" of a running program. > Indeed, this is exactly what it seems generators and coroutines do. Except that generators need only worry about their own frame. Another way to view it is to think of the current computation being run by a (real ) thread -- then capturing a continuation is very much like making a frozen clone of that thread, stuffing it away somewhere for later thawing. > With continuations, how is the state captured or created? There are, of course, many ways to implement these things. Christian is building them on top of the explicit frame objects Python already creates, and that's a fine way for Python. Guido views it as cloning the call stack, and that's accurate too. >> Anyway, in implementation terms a continuation "is like" what >> a coroutine would be if you could capture its resumption state at >> any point (even without the coroutine's knowledge!) and assign that >> state to a vrbl. > This makes sense, although it implies a "running state" is necessary for > this to work. In implementations (like Chris's) that do it all dynamically at runtime, you bet: you not only need a "running state", you can only capture a continuation at the exact point (the specific bytecode) you run the code to capture it. In fact, there *is* "a continuation" at *every* dynamic instance of every bytecode, and the question is then simply which of those you want to save . > In the case of transfering control to somewhere you have never been > before (eg, a goto or a new function call) how does this work? Good eye: it doesn't in this scheme. The "goto" business is a theoretical transformation, in a framework where *every* operation is modeled as a function application, and an entire program's life is modeled as a single function call. Some things are very easy to do in theory . 
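
A rough way to see the "function application" view in ordinary Python: write out by hand, as a closure, everything test() does after the capture point in the test() example from earlier in the thread. This is only a hand-made imitation under stated assumptions, not a captured continuation. The names `rest` and `cell` are made up for this sketch, and nothing here lets a callee capture its caller's future. It does reproduce the 1, 2, 3, 1, 2, 4 sequence from the Scheme transcript.

```python
continuation = None  # global slot, playing the role of the Scheme global


def test():
    global continuation
    cell = [0]  # hypothetical one-element box standing in for the local i

    def rest():
        # Everything test() does after the capture point:
        # i = i + 1; return i
        cell[0] = cell[0] + 1
        return cell[0]

    continuation = rest  # stands in for "magic to resume at the next line"
    return rest()


results = [test(), continuation(), continuation()]
thisguy = continuation          # keep a handle on the first capture
results += [test(), continuation(), thisguy()]
print(results)
```

Each call to test() makes a fresh `cell`, which is why `thisguy` keeps counting from the first activation while `continuation` restarts with the second.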
> Your example:
>> def test():
>>     i = 0
>>     global continuation
>>     continuation = magic to resume at the start of the next line
>>     i = i + 1
>>     return i
> My main problem is that this looks closer to your description of a kind-of
> one-sided coroutine - ie, instead of only being capable of transferring
> control, you can assign the state. I can understand that fine.

Good!

> But in the example, the function _is_ aware its state is being
> captured - indeed, it is explicitly capturing it.

In real life, "magic to resume at the start of the next line" may be spelled concretely as e.g.

    xyz(7)

or even

    a.b

That is, anywhere in "test" any sort of (explicit or implicit) call is made *may* be part of a saved continuation, because the callee can capture one -- with or without test's knowledge.

> My only other slight conceptual problem was how you implement functions,
> as I don't understand how the concept of return values fits in at all.

Ya, I didn't mention that. In Scheme, the act of capturing a continuation returns a value. Like so:

(define c #f) ; senseless, but Scheme requires definition before reference

(define (test)
  (print
    (+ 1
       (call/cc
         (lambda (k)
           (set! c k)
           42))))
  (newline))

The function called by call/cc there does two things:

1) Stores call/cc's continuation into the global "c".
2) Returns the int 42.

> (test)
43
>

Is that clear? The call/cc expression returns 42. Then (+ 1 42) is 43; then (print 43) prints the string "43"; then (newline) displays a newline; then (test) returns to *its* caller, which is the Scheme shell's read/eval/print loop.

Now that whole sequence of operations-- what happens to the 42 and beyond --*is* "call/cc's continuation", which we stored into the global c. A continuation is itself "a function", that returns its argument to the context where the continuation was captured.

So now e.g.
> (c 12)
13
>

c's argument (12) is used in place of the original call/cc expression; then (+ 1 12) is 13; then (print 13) prints the string "13"; then (newline) displays a newline; then (test) returns to *its* caller, which is *not* (c 12), but just as originally is still the Scheme shell's read/eval/print loop.

That last point is subtle but vital, and maybe this will make it clearer:

> (begin
    (c 12)
    (display "Tim lied!"))
13
>

The continuation of (c 12) includes printing "Tim lied!", but invoking a continuation *abandons* the current continuation in favor of the invoked one. Printing "Tim lied!" wasn't part of c's future, so that nasty slur about Tim never gets executed. But:

> (define liar #f)
> (begin
    (call/cc (lambda (k)
               (set! liar k)
               (c 12)))
    (display "Tim lied!")
    (newline))
13
> (liar 666)
Tim lied!
>

This is why I stick to trivial examples.

> And one final question: In the context of your tutorial, what do Chris'
> latest patches arm us with? Given my new-found expertise in this matter
> I would guess that the guts is there to have at least co-routines,
> as capturing the state of a running Python program, and restarting it
> later is possible. I'm still unclear about continuations WRT "without the
> co-routine's knowledge", so really unsure what is needed here...

Christian is taking his work very seriously here, and as a result is flailing a bit trying to see whether it's possible to do "the 100% right thing". I think he's a lot closer than he thinks he is <0.7 wink>, but in any case he's at worst very close to having full-blown continuations working. Coroutines already work.

> The truly final question:-) Assuming Chris' patches were 100% bug free and
> reliable (I'm sure they are very close :-) what would the next steps be to
> take advantage of it in a "clean" way? ie, assuming Guido blesses them,
> what exactly could I do in Python?

Nothing. What would you like to do?
Sam & I tossed out a number of intriguing possibilities, but all of those build *on* what Christian is doing. You won't get anything useful out of the box unless somebody does the work to implement it. I personally have wanted generators in Python since '91, because they're extremely useful in the useless things that I do . There's a thread-based generator interface (Generator.py) in the source distribution that I occasionally use, but that's so slow I usually recode in Icon (no, I'm not a Scheme fan -- I *admire* it, but I rarely use it). I expect rebuilding that on Christian's work will yield a factor of 10-100 speedup for me (beyond losing the thread/mutex overhead, as Chris just pointed out on c.l.py resumption should be much faster than a Python call, since the frame is already set up and raring to go). Would be nice if the language grew some syntax to make generators pleasant as well as fast, but the (lack of) speed is what's really killing it for me now. BTW, I've never tried to "sell" coroutines -- let alone continuations. Just generators. I expect Sam will do a masterful job of selling those. send-today-don't-delay-couldn't-give-or-receive-a-finer-gift-ly y'rs - tim From tismer at appliedbiometrics.com Wed Jul 7 15:11:44 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Wed, 07 Jul 1999 15:11:44 +0200 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) References: <000001bec82f$bc171500$089e2299@tim> Message-ID: <37835210.F22A7EC9@appliedbiometrics.com> Tim Peters wrote: > > [Mark Hammond] > > ... > > Thanks for the considerable time it must be taking to enlightening us! > > You're welcome, but the holiday weekend is up and so is my time. Thank (all > of) *you* for the considerable time it must take to endure all this ! Just to let you know that I'm still there, thinking, not coding, still hesitating, but maybe I can conclude now and send it off. This discussion, and especially Tim's input was extremely helpful. 
He has spent considerable time reading my twisted examples, writing his own, hitting my chin, kicking my -censored-, and proving to me that the truth I was searching for doesn't exist.

...
> Again, though, you never want to muck with continuations directly! They're
> too wild. You get an expert to use them in the bowels of an implementation
> of something else.

Maybe with one exception: With careful coding, you can use a continuation at the head of a very deep recursion and use it as an early break if the algorithm fails. The effect is the same as bailing out with an exception, despite the fact that no "finally" clauses would be obeyed. It is just an incredibly fast jump out of something if you know what you are doing.

> > With continuations, how is the state captured or created?
>
> There are, of course, many ways to implement these things. Christian is
> building them on top of the explicit frame objects Python already creates,
> and that's a fine way for Python. Guido views it as cloning the call stack,
> and that's accurate too.

Actually, it is both! What I use (and it works fine) are so-called "push-back frames". My continuations are always appearing in some call. In order to make the caller able to be resumed, I create a push-back frame *from* it. That means, my caller frame is duplicated behind his "f_back" pointer. The original frame stays in place but now becomes a continuation frame with the current stack state preserved. All other locals and stuff are moved to the clone in the f_back which is now the real one. This always works fine, since references to the original caller frame are all intact, just the frame's meaning is modified a little.

Well, I will have to write a good paper...

...
> I personally have wanted generators in Python since '91, because they're
> extremely useful in the useless things that I do.
There's a
> thread-based generator interface (Generator.py) in the source distribution
> that I occasionally use, but that's so slow I usually recode in Icon (no,
> I'm not a Scheme fan -- I *admire* it, but I rarely use it). I expect
> rebuilding that on Christian's work will yield a factor of 10-100 speedup
> for me (beyond losing the thread/mutex overhead, as Chris just pointed out
> on c.l.py resumption should be much faster than a Python call, since the
> frame is already set up and raring to go).

I believe so. Well, I admit that the continuation approach is slightly too much for the coroutine/generator case, since they exactly don't have the problem where continuations are suffering a little: Switching between frames which cannot be reached more than once at a time doesn't need the stack copying/pushback at all. I'm still staying on the secure side for now. But since I have all refcounting accurate already, we can use it to figure out if a frame needs to be copied at all.

> Would be nice if the language grew some syntax to make generators pleasant
> as well as fast, but the (lack of) speed is what's really killing it for me
> now.

How about "amb"? :-) (see "Teach Yourself Scheme in Fixnum Days", chapter 14, at http://www.cs.rice.edu/~dorai/t-y-scheme/t-y-scheme-Z-H-15.html#%_chap_14)

About my last problems: The hard decision is:
- Either I just stop and I'm ready already, and loops are funny.
- Or I do the hidden register search, which makes things more complicated and also partially voids the pushback trick, since then I would manage all stack stuff in one frame.
- Or, and that's what I will do finally: For now, I will really just correct the loops.

Well, that *is* a change to Python again, but no semantic change. The internal loop counter will no longer be an integer object, but a mutable integer box. I will just create a one-element integer array and count with its zero element.
This is correct, since the stack value isn't popped off, so all alive stack copies share this one element. As a side effect, I save the Object/Integer conversion, so I guess it will be faster. *and* this solution does not involve any other change, since the stack layout is identical to before. -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From klm at digicool.com Wed Jul 7 17:40:15 1999 From: klm at digicool.com (Ken Manheimer) Date: Wed, 7 Jul 1999 11:40:15 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) References: <000001bec82f$bc171500$089e2299@tim> Message-ID: <00e001bec88f$02f45c80$5a57a4d8@erols.com> Hokay. I *think* i have this, and i have a question to followup. First, i think the crucial distinction i needed to make was the fact that the stuff inside the body of the call/cc is evaluated only when the call/cc is initially evaluated. What constitutes the "future" of the continuation is the context immediately following the call/cc expression. Your final example is where that's most apparent for me: Tim presented: > [...] > The continuation of (c 12) includes printing "Tim lied!", but invoking a > continuation *abandons* the current continuation in favor of the invoked > one. Printing "Tim lied!" wasn't part of c's future, so that nasty slur > about Tim never gets executed. But: > > > (define liar #f) > > (begin > (call/cc (lambda (k) > (set! liar k) > (c 12))) > (display "Tim lied!") > (newline)) > 13 > > (liar 666) > Tim lied! > > > > This is why I stick to trivial examples . Though not quite as simple, i think this nailed the distinction for me. 
(Too bad that i'm probably mistaken:-) In any case, one big unknown for me is the expense of continuations. Just how expensive is squirreling away the future, anyway? (:-) If we're deep in a call stack, seems like there can be a lot of lexical-bindings baggage, plus whatever i-don't-know-how-big-it-is control logic there is/was/will be pending. Does the size of these things (continuations) vary extremely, and is the variation anticipatable? I'm used to some surprises about the depth to which some call or other may go, i don't expect as much uncertainty about my objects - and it seems like continuations directly transform the call depth/complexity into data size/complexity... ?? unfamiliar-territory,how-far-can-i-fall?-ly, Ken klm at digicool.com From tismer at appliedbiometrics.com Wed Jul 7 18:12:22 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Wed, 07 Jul 1999 18:12:22 +0200 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) References: <000001bec82f$bc171500$089e2299@tim> <00e001bec88f$02f45c80$5a57a4d8@erols.com> Message-ID: <37837C66.1E5C33B9@appliedbiometrics.com> Ken Manheimer wrote: > > Hokay. I *think* i have this, and i have a question to followup. ... > In any case, one big unknown for me is the expense of continuations. Just > how expensive is squirreling away the future, anyway? (:-) The future costs at most to create *one* extra frame with a copy of the original frame's local stack. By keeping the references to all the other frames which were intact, the real cost is of course bigger, since we keep the whole frame path from this one up to the topmost frame alive. As soon as we drop the handle, everything winds up and vanishes. I also changed the frame refcounting to guarantee exactly that behavior. (before, unwinding was explicitly done). 
> If we're deep in a call stack, seems like there can be a lot of > lexical-bindings baggage, plus whatever i-don't-know-how-big-it-is control > logic there is/was/will be pending. Does the size of these things > (continuations) vary extremely, and is the variation anticipatable? I'm > used to some surprises about the depth to which some call or other may go, i > don't expect as much uncertainty about my objects - and it seems like > continuations directly transform the call depth/complexity into data > size/complexity... ?? Really, no concern necessary. The state is not saved at all (despite one frame), it is just not dropped. :-) Example: You have some application running, in a nesting level of, say, four function calls. This makes four frames. The bottom function now decides to spawn 10 coroutines in a loop and puts them into an array. Your array now holds 10 continuations, where each one is just one frame, which points back to your frame. Now assume, you are running one of the coroutines/generators/whatever, and this one calls another function "bottom", just to have some scenario. Looking from "bottom", there is just a usual frame chain, now 4+1 frames long. To shorten this: The whole story is nothing more than a tree, where exactly one leaf is active at any time, and its view of the call chain is always linear. Continuation jumps are possible to every other frame in the tree. It now only depends of keeping references to the leaf which you just left or not. If the jump removes the left reference to your current frame, then the according chain will ripple away up to the next branch point. If you held a reference, as you will do with a coroutine to resume it, this chain stays as a possible jump target. for-me-it's-a-little-like-Tarzan-ly y'rs - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! 
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From klm at digicool.com Wed Jul 7 20:00:56 1999 From: klm at digicool.com (Ken Manheimer) Date: Wed, 7 Jul 1999 14:00:56 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) Message-ID: <613145F79272D211914B0020AFF640191D1BF2@gandalf.digicool.com> Christian wrote: > Really, no concern necessary. The state is not saved at all > (despite one frame), it is just not dropped. :-) I have to say, that's not completely reassuring.-) While little or nothing additional is created, stuff that normally would be quite transient remains around. > To shorten this: The whole story is nothing more than a tree, > where exactly one leaf is active at any time, and its view > of the call chain is always linear. That's wonderful - i particularly like that multiple continuations from the same frame only amount to a single retention of the stack for that frame. My concern is not alleviated, however. My concern is the potential, but often-realized hairiness of computation trees. Eg, looped calls to a function amount to nodes with myriad branches - one for each iteration - and each branch can be an arbitrary computation. If there were a continuation retained each time around the loop, worse, somewhere down the call stack within the loop, you could quickly amass a lot of stuff that would otherwise be reaped immediately. So it seems like use of continuations *can* be surprisingly expensive, with the expense commensurate with, and as hard (or easy) to predict as the call dynamics of the call tree. (Boy, i can see how continuations would be useful for backtracking-style chess algorithms and such. 
Of course, discretion about what parts of the computation are retained at each branch would probably be an important economy for large computations, while stashing the continuation retains everything...)

(It's quite possible that i'm missing something - i hope i'm not being thick headed.)

Note that i do not raise this to argue against continuations. In fact, they seem to me to be at least the right conceptual foundation for these advanced control structures (i happen to "like" stream abstractions, which i gather is what generators are). It just seems like it may be a concern, something about which people experienced with continuations (eg, the scheme community) would have some lore - accumulated wisdom...

ken
klm at digicool.com

From da at ski.org Thu Jul 8 00:37:09 1999
From: da at ski.org (David Ascher)
Date: Wed, 7 Jul 1999 15:37:09 -0700 (Pacific Daylight Time)
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
In-Reply-To: <1281421591-30373695@hypernet.com>
Message-ID:

[Tim]
> Threads can be very useful purely as a means for algorithm
> structuring, due to independent control flows.

FWIW, I've been following the coroutine/continuation/generator bit with 'academic' interest -- the CS part of my brain likes to read about them. Prompted by Tim's latest mention of Demo/threads/Generator.py, I looked at it (again?) and *immediately* grokked it and realized how it'd fit into a tool I'm writing. Nothing to do with concurrency, I/O, etc -- just compartmentalization of stateful iterative processes (details too baroque to go over). More relevantly, that tool would be useful on thread-less Pythons (well, when it reaches usefulness on threaded Pythons =).

Consider me pro-generator, and still agnostic on the co* things.
--david

From guido at CNRI.Reston.VA.US Thu Jul 8 07:08:44 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 08 Jul 1999 01:08:44 -0400
Subject: [Python-Dev] Generator details
In-Reply-To: Your message of "Tue, 06 Jul 1999 21:52:15 EDT." <000301bec81b$56e87660$c99e2299@tim>
References: <000301bec81b$56e87660$c99e2299@tim>
Message-ID: <199907080508.BAA00623@eric.cnri.reston.va.us>

I have a few questions/suggestions about generators.

Tim writes that a suspended generator has exactly one stack frame. I'm not sure I like that. The Demo/thread/Generator.py version has no such restriction; anything that has a reference to the generator can put() the next value. Is the restriction really necessary? I can see a good use for a recursive generator, e.g. one that generates a tree traversal:

    def inorder(node):
        if node.left: inorder(node.left)
        suspend node
        if node.right: inorder(node.right)

If I understand Tim, this could not work because there's more than one stack frame involved. On the other hand, he seems to suggest that something like this *is* allowed when using "modern" coroutines. Am I missing something? I thought that tree traversal was one of Tim's first examples of generators; would I really have to use an explicit stack to create the traversal?

Next, I want more clarity about the initialization and termination conditions.

The Demo/thread/Generator.py version is very explicit about initialization: you instantiate the Generator class, passing it a function that takes a Generator instance as an argument; the function executes in a new thread. (I guess I would've used a different interface now -- perhaps inheriting from the Generator class overriding a run() method.) For termination, the normal way to stop seems to be for the generator function to return (rather than calling g.put()), the consumer then gets an EOFError exception the next time it calls g.get(). There's also a way for either side to call g.kill() to stop the generator prematurely.
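
For readers without the demo at hand, here is a minimal sketch of the thread-based scheme just described, in present-day Python. It is not the actual Demo/thread/Generator.py: the one-slot queue, the `_DONE` sentinel, and the tuple-shaped tree are assumptions made for this illustration, and kill() is left out. It does show the point about recursion: because the producer runs on its own thread with its own stack, any frame holding a reference to the generator can put() a value.

```python
import queue
import threading

_DONE = object()  # sentinel (made up here): the producer function returned


class Generator:
    def __init__(self, func, *args):
        self._q = queue.Queue(maxsize=1)  # producer runs at most one item ahead
        self._finished = False
        worker = threading.Thread(target=self._run, args=(func, args), daemon=True)
        worker.start()

    def _run(self, func, args):
        try:
            func(self, *args)      # the function receives the Generator instance
        finally:
            self._q.put(_DONE)     # a plain return ends the stream

    def put(self, value):          # called by the producer
        self._q.put(value)

    def get(self):                 # called by the consumer
        if self._finished:
            raise EOFError
        item = self._q.get()
        if item is _DONE:
            self._finished = True
            raise EOFError
        return item


def inorder(g, node):
    # node is a (left, value, right) tuple or None -- a made-up tree shape
    if node is None:
        return
    left, value, right = node
    inorder(g, left)               # ordinary recursion, several frames deep
    g.put(value)
    inorder(g, right)


tree = ((None, 1, None), 2, (None, 3, None))
g = Generator(inorder, tree)
values = []
while True:
    try:
        values.append(g.get())
    except EOFError:
        break
print(values)
```

The price is the one noted elsewhere in the thread: every value handed over costs a thread switch plus locking.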
Let me try to translate that to a threadless implementation. We could declare a simple generator as follows:

    generator reverse(seq):
        i = len(seq)
        while i > 0:
            i = i-1
            suspend seq[i]

This could be translated by the Python translator into the following, assuming a system class generator which provides the machinery for generators:

    class reverse(generator):
        def run(self, seq):
            i = len(seq)
            while i > 0:
                i = i-1
                self.suspend(seq[i])

(Perhaps the identifiers generator, run and suspend would be spelled with __...__, but that's just clutter for now.)

Now where Tim was writing examples like this:

    for c in reverse("Hello world"):
        print c,
    print

I'd like to guess what the underlying machinery would look like. For argument's sake, let's assume the for loop recognizes that it's using a generator (or better, it always needs a generator, and when it's not a generator it silently implies a sequence-iterating generator). So the translator could generate the following:

    g = reverse("Hello world") # instantiate class reverse
    while 1:
        try:
            c = g.resume()
        except EOGError: # End Of Generator
            break
        print c,
    print

(Where g should really be a unique temporary local variable.)

In this model, the g.resume() and g.suspend() calls have all the magic. They should not be accessible to the user. They are written in C so they can play games with frame objects.

I guess that the *first* call to g.resume(), for a particular generator instance, should start the generator's run() method; run() is not activated by the instantiation of the generator. Then run() runs until the first suspend() call, which causes the return from the resume() call to happen. Subsequent resume() calls know that there already is a frame (it's stored in the generator instance) and simply continue its execution where it was. If the run() method returns from the frame, the resume() call is made to raise EOGError (blah, bogus name) which signals the end of the loop.
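For comparison, a sketch of the same reverse() example in present-day Python 3, where the built-in `yield` statement plays the role of the proposed `suspend`, `next(g)` plays the role of `g.resume()`, and `StopIteration` plays the role of the placeholder EOGError. The `consume_explicitly` helper is a hypothetical name used only to spell out the loop by hand; this illustrates the proposed machinery and is not a claim about how it was to be implemented.

```python
def reverse(seq):
    """Generate the items of seq from last to first."""
    i = len(seq)
    while i > 0:
        i = i - 1
        yield seq[i]              # suspend here; resume on the next next() call


def consume_explicitly(g):
    """Hand-written equivalent of 'for c in g', collecting the items."""
    items = []
    while True:
        try:
            c = next(g)           # the body starts running on the first call
        except StopIteration:     # raised when the generator body returns
            break
        items.append(c)
    return items


print("".join(consume_explicitly(reverse("Hello world"))))   # dlrow olleH
```

As in the description above, creating the generator object does not run any of its body; execution begins with the first resumption.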
(The user may write this code explicitly if they want to consume the generated elements in a different way than through a for loop.)

Looking at this machinery, I think the recursive generator that I wanted could be made to work, by explicitly declaring a generator subclass (instead of using the generator keyword, which is just syntactic sugar) and making calls to methods of self, e.g.:

    class inorder(generator):
        def run(self, node):
            if node.left: self.run(node.left)
            self.suspend(node)
            if node.right: self.run(node.right)

The generator machinery would (ab)use the fact that Python frames don't necessarily have to be linked in a strict stack order; the generator gets a pointer to the frame to resume from resume(), and there's a "bottom" frame which, when hit, raises the EOGError exception. All currently active frames belonging to the generator stay alive while another resume() is possible.

All this is possible by the introduction of an explicit generator object. I think Tim had an implementation in mind where the standard return pointer in the frame is the only thing necessary; actually, I think the return pointer is stored in the calling frame, not in the called frame (Christian? Is this so in your version?). That shouldn't make a difference, except that it's not clear to me how to reference the frame (in the explicitly coded version, which has to exist at least at the bytecode level).

With classic coroutines, I believe that there's no difference between the first call and subsequent calls to the coroutine. This works in the Knuth world where coroutines and recursion don't go together; but at least for generators I would hope that it's possible for multiple instances of the same generator to be active simultaneously (e.g. I could be reversing over a list of files and then reverse each of the lines in the file; this uses separate instances of the reverse() generator). So we need a way to reference the generator instance separately from the generator constructor.
The machinery I sketched above solves this. After Tim has refined or rebutted this, I think I'll be able to suggest what to do for coroutines. (I'm still baffled by continuations. The question whether the for saved and restored loop should find itself in the 1st or 5th iteration surprises me. Doesn't this cleanly map into some Scheme code that tells us what to do? Or is it unclear because Scheme does all loops through recursion? I presume that if you save the continuation of the 1st iteration and restore it in the 5th, you'd find yourself in the back 1st iteration? But this is another thread.) --Guido van Rossum (home page: http://www.python.org/~guido/) From tim_one at email.msn.com Thu Jul 8 07:59:24 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 8 Jul 1999 01:59:24 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <613145F79272D211914B0020AFF640191D1BF2@gandalf.digicool.com> Message-ID: <002001bec907$07934a80$1d9e2299@tim> [Christian] > Really, no concern necessary. The state is not saved at all > (despite one frame), it is just not dropped. :-) [Ken] > I have to say, that's not completely reassuring.-) While little or > nothing additional is created, stuff that normally would be quite > transient remains around. I don't think this is any different than that keeping a reference to a class instance alive keeps all the attributes of that object alive, and all the objects reachable from them too, despite that you may never again actually reference any of them. If you save a continuation, the implementation *has* to support your doing anything that's *possible* to do from the saved control-flow state -- and if that's a whole big giant gob o' stuff, that's on you. > ... > So it seems like use of continuations *can* be surprisingly expensive, > with the expense commensurate with, and as hard (or easy) to predict as > the call dynamics of the call tree. 
> > (Boy, i can see how continuations would be useful for backtracking-style > chess algorithms and such. It comes with the territory, though: backtracking searches are *inherently* expensive and notoriously hard to predict, whether you implement them via continuations, or via clever hand-coded assembler using explicit stacks. The number of nodes at a given depth is typically exponential in the depth, and that kills every approach at shallow levels. Christian posted a reference to an implementation of "amb" in Scheme using continuations, and that's a very cute function: given a list of choices, "amb" guarantees to return (if any such exists) that particular list element that allows the rest of the program to "succeed". So if indeed chess is a forced win for white, amb(["P->KR3", "P->KR4", ...]) as the first line of your chess program will return "the" winning move! Works great in theory . > Of course, discretion about what parts of the computation is retained > at each branch would probably be an important economy for large > computations, while stashing the continuation retains everything...) You bet. But if you're not mucking with exponential call trees-- and, believe me, you're usually not --it's not a big deal. > Note that i do not raise this to argue against continuations. In fact, > they seem to me to be at least the right conceptual foundation for these > advanced control structures (i happen to "like" stream abstractions, > which i gather is what generators are). Generators are an "imperative" flavor of stream, yes, potentially useful whenever you have an abstraction that can deliver a sequence of results (from all the lines in a file, to all the digits of pi). A very common occurrence! Heck, without it, Python's "for x in s:" wouldn't be any fun at all . 
how-do-i-love-thee?-let-me-generate-the-ways-ly y'rs - tim From tim_one at email.msn.com Thu Jul 8 07:59:15 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 8 Jul 1999 01:59:15 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <00e001bec88f$02f45c80$5a57a4d8@erols.com> Message-ID: <001d01bec907$024e6a00$1d9e2299@tim> [Ken Manheimer] > First, i think the crucial distinction i needed to make was the fact that > the stuff inside the body of the call/cc is evaluated only when > the call/cc is initially evaluated. What constitutes the "future" of the > continuation is the context immediately following the call/cc expression. Right! call/cc is short for call-with-current-continuation, and "current" refers to the continuation of call/cc itself. call/cc takes a function as an argument, and passes to it its (call/cc's) *own* continuation. This is maximally clever and maximally confusing at first. Christian has a less clever way of spelling it that's likely to be less confusing too. Note that it has to be a *little* tricky, because the obvious API k = gimme_a_continuation_for_here() doesn't work. The future of "gimme_a_..." includes binding k to the result, so you could never invoke the continuation without stomping on k's binding. k = gimme_a_continuation_for_n_bytecodes_beyond_here(n) could work, but is a bit hard to explain coherently . > ... > In any case, one big unknown for me is the expense of continuations. > Just how expensive is squirreling away the future, anyway? (:-) Christian gave a straight answer, so I'll give you the truth : it doesn't matter provided that you don't pay the price if you don't use it. A more interesting question is how much everyone will pay all the time to support the possibility even if they don't use it. But that question is premature since Chris isn't yet aiming to optimize. Even so, the answer so far appears to be "> 0 but not much". 
in-bang-for-the-buck-continuations-are-cheap-ly y'rs - tim From tim_one at email.msn.com Thu Jul 8 07:59:18 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 8 Jul 1999 01:59:18 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <37835210.F22A7EC9@appliedbiometrics.com> Message-ID: <001e01bec907$03f9a900$1d9e2299@tim> >> Again, though, you never want to muck with continuations directly! >> They're too wild. You get an expert to use them in the bowels of an >> implementation of something else. [Christian] > Maybe with one exception: With careful coding, you can use > a continuation at the head of a very deep recursion and use > it as an early break if the algorithm fails. The effect is > the same as bailing out with an exception, despite the fact > that no "finally" causes would be obeyed. It is just a > incredibly fast jump out of something if you know what > you are doing. You don't need continuations for this, though; e.g., in Icon I've done this often, by making the head of the deep recursion a co-expression, doing the recursion via straight calls, and then doing a coroutine resumption of &main when I want to break out. At that point I set the coexp to &null, and GC reclaims the stack frames (the coexp is no longer reachable from outside) when it feels like it . This is a particularly simple application of coroutines that could be packaged up in a simpler way for its own sake; so, again, while continuations may be used fruitfully under the covers here, there's still no reason to make a poor end user wrestle with them. > ... Well, I admit that the continuation approach is slightly too much > for the coroutine/generator case, It's good that you admit that, because generators alone could have been implemented with a 20-line patch . BTW, I expect that by far the bulk of your changes *still* amount to what's needed for disentangling the C stack, right? 
The continuation implementation has been subtle, but so far I've gotten the impression that it requires little code beyond that required for stacklessness. > ... > How about "amb"? :-) > (see "teach youself schem in fixnum days, chapter 14 at > http://www.cs.rice.edu/~dorai/t-y-scheme/t-y-scheme-Z-H-15.html#%_chap_14) That's the point at which I think continuations get insane: it's an unreasonably convoluted implementation of a straightforward (via other means) backtracking framework. In a similar vein, I've read 100 times that continuations can be used to implement a notion of (fake) threads, but haven't actually seen an implementation that wasn't depressingly subtle & long-winded despite being just a feeble "proof of concept". These have the *feeling* of e.g. implementing generators on top of real threads: ya, you can do it, but nobody in their right mind is fooled by it . > About my last problems: > The hard decision is: > - Either I just stop and I'm ready already, and loops are funny. OK by me -- forgetting implementation, I still can't claim to know what's the best semantic here. > - Or I do the hidden register search, which makes things more > complicated and also voidens the pushback trick partially, > since then I would manage all stack stuff in one frame. Bleech. > - Or, and that's what I will do finally: > For now, I will really just correct the loops. > > Well, that *is* a change to Python again, but no semantic change. > The internal loop counter will no longer be an integer object, > but a mutable integer box. I will just create a one-element > integer array and count with its zero element. > This is correct, since the stack value isn't popped off, > so all alive stack copies share this one element. Ah, very clever! Yes, that will fly -- the continuations will share a reference to the value rather than the value itself. Perfect! > As a side effect, I save the Object/Integer conversion, so > I guess it will be faster. 
*and* this solution does not involve
> any other change, since the stack layout is identical to before.

Right, no downside at all. Except that Guido will hate it .

there's-a-disturbance-in-the-force-ly y'rs - tim

From tim_one at email.msn.com Thu Jul 8 08:45:51 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 8 Jul 1999 02:45:51 -0400
Subject: [Python-Dev] Generator details
In-Reply-To: <199907080508.BAA00623@eric.cnri.reston.va.us>
Message-ID: <002101bec90d$851a3e40$1d9e2299@tim>

I'm out of time for tonight so will just address the first one:

[Guido van Rossum]
> I have a few questions/suggestions about generators.
>
> Tim writes that a suspended generator has exactly one stack frame.
> I'm not sure I like that. The Demo/threads/Generator.py version has no
> such restriction; anything that has a reference to the generator can
> put() the next value. Is the restriction really necessary?

It can simplify the implementation, and (not coincidentally ) the user's mental model of how they work.

> I can see a good use for a recursive generator, e.g. one that generates
> a tree traversal:

Definitely; in fact, recursive generators are particularly useful in both traversals and enumeration of combinatorial objects (permutations, subsets, and so on).

> def inorder(node):
>     if node.left: inorder(node.left)
>     suspend node
>     if node.right: inorder(node.right)
>
> If I understand Tim, this could not work because there's more than one
> stack frame involved.

That's right. It would be written like this instead:

    def inorder(node):
        if node.left: suspend inorder(node.left)
        suspend node
        if node.right: suspend inorder(node.right)

Now there may be many instances of the "inorder" generator active (as many as the tree is deep), but each one returns directly to its caller, and all but the bottom-most one is "the caller" wrt the generator it invokes. This implies that "suspend expr" treats expr like a generator in much the same way that "for x in expr" does (or may ...).
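A sketch of the one-frame traversal in present-day Python 3, assuming that `yield from` is a fair stand-in for "suspend expr" applied to a generator, and plain `yield` for "suspend node". The `Node` class is a hypothetical helper added only to make the example self-contained.

```python
class Node:
    """Hypothetical binary-tree node used only for this illustration."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def inorder(node):
    """Yield the nodes of the tree rooted at node in left-to-right order."""
    if node.left:
        yield from inorder(node.left)    # relay every result of the sub-generator
    yield node
    if node.right:
        yield from inorder(node.right)


tree = Node(2, Node(1), Node(4, Node(3), Node(5)))
print([n.value for n in inorder(tree)])
```

Each active inorder() call owns exactly one frame and hands results to its immediate caller, so a value from a deep node passes through every generator on the path to the root.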
I realize there's some muddiness in that. > On the other hand, he seems to suggest that something like this *is* > allowed when using "modern" coroutines. Yes, and then your original version can be made to work, delivering its results directly to the ultimate consumer instead of (in effect) crawling up the stack each time there's a result. > Am I missing something? Only that I've been pushing generators for almost a decade, and have always pushed the simplest possible version that's sufficient for my needs. However, every time I've made a micron's progress in selling this notion, it's been hijacked by someone else pushing continuations. So I keep pushing the simplest possible version of generators ("resumable function"), in the hopes that someday somebody will remember they don't need to turn Python inside out to get just that much . [much worth discussion skipped for now] > ... > (I'm still baffled by continuations. Actually not, I think! > The question whether the for saved and restored loop should find itself > in the 1st or 5th iteration surprises me. Doesn't this cleanly map into > some Scheme code that tells us what to do? Or is it unclear because > Scheme does all loops through recursion? Bingo: Scheme has no loops. I can model Python's "for" in Scheme in such a way that the continuation sees the 1st iteration, or the 5th, but neither way is obviously right -- or wrong (they both reproduce Python's behavior in the *absence* of continuations!). > I presume that if you save the continuation of the 1st iteration and > restore it in the 5th, you'd find yourself in the back 1st iteration? > But this is another thread.) The short course here is just that any way I've tried to model Python's "for" in *Python* shares the property of the "while 1:" way I posted: the continuation sees the 5th iteration. And some hours I think it probably should , since the bindings of all the locals it sees will be consistent with the 5th iteration's values but not the 1st's. 
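The loop-counter question can be pictured with a toy model. The sketch below assumes, purely for illustration, that saving a continuation amounts to a shallow copy of a frame's value stack; `snapshot` and `run_loop` are hypothetical names, and this is not the interpreter's actual mechanism. It shows why a counter kept as a plain integer is frozen in the saved copy, while a counter kept in a shared mutable box is seen at its latest value by every copy.

```python
def snapshot(stack):
    """Shallow-copy a value stack, standing in for saving a frame's state."""
    return list(stack)


def run_loop(stack, iterations, boxed):
    """Advance the loop counter held in slot 0 of stack."""
    for _ in range(iterations):
        if boxed:
            stack[0][0] += 1          # mutate the shared one-element box
        else:
            stack[0] = stack[0] + 1   # rebind the slot to a new int object


plain_stack = [0]                     # counter stored by value
plain_saved = snapshot(plain_stack)
run_loop(plain_stack, 4, boxed=False)

boxed_stack = [[0]]                   # counter stored in a mutable box
boxed_saved = snapshot(boxed_stack)
run_loop(boxed_stack, 4, boxed=True)

print(plain_saved[0], boxed_saved[0][0])   # 0 4
```

In this model the boxed variant corresponds to a saved state that finds itself in the later iteration, and the plain variant to one that finds itself back in the first.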
could-live-with-it-either-way-but-"correct"-is-debatable-ly y'rs - tim

From tismer at appliedbiometrics.com Thu Jul 8 16:23:11 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Thu, 08 Jul 1999 16:23:11 +0200
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
References: <001e01bec907$03f9a900$1d9e2299@tim>
Message-ID: <3784B44F.C2F76E8A@appliedbiometrics.com>

Tim Peters wrote:
...
> This is a particularly simple application of coroutines that could be
> packaged up in a simpler way for its own sake; so, again, while
> continuations may be used fruitfully under the covers here, there's still no
> reason to make a poor end user wrestle with them.

Well.

    def longcomputation(prog, *args, **kw):
        return quickreturn(prog, args, kw)

    # prog must be something with return function first arg
    # quickreturn could be done as so:

    def quickreturn(prog, args, kw):
        cont = getpcc() # get parent's continuation
        def jumpback(val=None, cont=cont):
            putcc(cont, val) # jump to continuation
        # jumpback is passed as the first positional argument of prog
        return apply(prog, (jumpback,) + args, kw)

    # and if they want to jump out, they call jumpback with
    # an optional return value.

Can't help it, it still is continuation-ish.

> > ... Well, I admit that the continuation approach is slightly too much
> > for the coroutine/generator case,
>
> It's good that you admit that, because generators alone could have been
> implemented with a 20-line patch . BTW, I expect that by far the bulk
> of your changes *still* amount to what's needed for disentangling the C
> stack, right? The continuation implementation has been subtle, but so far
> I've gotten the impression that it requires little code beyond that required
> for stacklessness.

Right. You will see soon. The only bit which cont's need more than coro's is to save more than one stack state for a frame. So, basically, it is just the frame copy operation.
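The quickreturn idea earlier in this message relies on the hypothetical getpcc() and putcc() primitives. As a rough stand-in, the sketch below gets the same early-exit effect in present-day Python 3 by swapping in a private exception; that is a different technique from a continuation jump, and unlike such a jump it does run any "finally" clauses on the way out. The names `_Escape` and `find` are assumptions made for illustration.

```python
class _Escape(Exception):
    """Private exception carrying the early-return value (hypothetical helper)."""

    def __init__(self, token, value):
        super().__init__()
        self.token = token
        self.value = value


def quickreturn(prog, args=(), kw=None):
    """Call prog(jumpback, *args, **kw); jumpback(val) makes this call return val."""
    token = object()                  # distinguishes nested quickreturn calls

    def jumpback(val=None):
        raise _Escape(token, val)

    try:
        return prog(jumpback, *args, **(kw or {}))
    except _Escape as escape:
        if escape.token is not token:
            raise                     # belongs to an enclosing quickreturn
        return escape.value


def longcomputation(prog, *args, **kw):
    return quickreturn(prog, args, kw)


def find(jumpback, tree, wanted):
    """Deep recursion over nested lists; bails out as soon as wanted is seen."""
    for item in tree:
        if isinstance(item, list):
            find(jumpback, item, wanted)
        elif item == wanted:
            jumpback(item)
    return None


print(longcomputation(find, [1, [2, [3, 4]], 5], 3))
```

The recursive find() never has to thread a "found it" flag back up through its callers; it calls jumpback once and the whole computation returns.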
If I was heading just for coroutines, then I could save that, but then I need to handle special cases like exceptions, what to do on return, and so on. Easier to do that one stuff once right. Then I will never dump code for an unforeseen coro-effect, since with cont's, I *may* jump in and bail out wherever I want or don't want. The special cases come later and will be optimized, and naturally they will reduce themselves to what's needed.

Example: If I just want to switch to a different coro, I just have to swap two frames. This leads to a data structure which can hold a frame and exchange it with another one. The cont-implementation does something like

    fetch my current continuation  # and this does the frame copy stuff
    save into local state variable
    fetch cont from other coro's local state variable
    jump to new cont

Now, if the source and target frames are guaranteed to be different, and if the source frame has no dormant extra cont attached, then it is safe to merge the above steps into one operation, without the need to save local state. In the end, two coro's will jump to each other by doing nothing more than this. Exactly that is what Sam's prototype does right now. What he's missing is treatment of the return case. If a coro returns towards the place where it was forked off, then we want to have a cont which is able to handle it properly. That's why exceptions work fine with my stuff: You can put one exception handler on top of all your coroutines which you create. It works without special knowledge of coroutines. After I realized that, I knew the way to go.

> > ...
> > How about "amb"? :-)
> > (see "teach yourself scheme in fixnum days", chapter 14 at
> > http://www.cs.rice.edu/~dorai/t-y-scheme/t-y-scheme-Z-H-15.html#%_chap_14)
>
> That's the point at which I think continuations get insane: it's an
> unreasonably convoluted implementation of a straightforward (via other
> means) backtracking framework.
In a similar vein, I've read 100 times that
> continuations can be used to implement a notion of (fake) threads, but
> haven't actually seen an implementation that wasn't depressingly subtle &
> long-winded despite being just a feeble "proof of concept".

Maybe this is a convoluted implementation. But the principle? Return a value to your caller, but stay able to continue and do this again. Two continuations, and with the optimizations from above, it will be nothing.

I will show you the code in a few, and you will realize that we are discussing the empty set. The frames have to be used, and the frames are already continuations. Only if they can be reached twice, they will have to be armed for that.

Moving back to my new "more code - less words" principle.

[mutable ints as loop counters]
> Ah, very clever! Yes, that will fly -- the continuations will share a
> reference to the value rather than the value itself. Perfect!

Actually I'm copying some code out of Marc's counterobject which is nothing more than a mutable integer and hide it in ceval.c, since that doesn't introduce another module for a thing which isn't needed elsewhere, after Guido's hint.

Better than to use the array module which doesn't publish its internals and might not always be linked in.

> Right, no downside at all. Except that Guido will hate it .

I made sure that this is what he hates the least.

off-for-coding-ly y'rs - chris

--
Christian Tismer :^)
Applied Biometrics GmbH : Have a break!
Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home

From tim_one at email.msn.com Fri Jul 9 09:47:36 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Fri, 9 Jul 1999 03:47:36 -0400
Subject: [Python-Dev] Generator details
In-Reply-To: <199907080508.BAA00623@eric.cnri.reston.va.us>
Message-ID: <000c01bec9df$4f935c20$c49e2299@tim>

Picking up where we left off, I like Guido's vision of generators fine. The "one frame" version I've described is in fact what Icon provides, and what Guido is doing requires using coroutines instead in that language. Guido's is more flexible, and I'm not opposed to that . OTOH, I *have* seen many a person (including me!) confused by the semantics of coroutines in Icon, so I don't know how much of the additional flexibility converts into additional confusion. One thing I am sure of: having debated the fine points of continuations recently, I'm incapable of judging it harshly today <0.5 wink>.

> ...
> def inorder(node):
>     if node.left: inorder(node.left)
>     suspend node
>     if node.right: inorder(node.right)

The first thing that struck me there is that I'm not sure to whom the suspend transfers control. In the one-frame flavor of generator, it's always to the caller of the function that (lexically) contains the "suspend". Is it possible to keep this all straight if the "suspend" above is changed to e.g.

    pass_it_back(node)

where

    def pass_it_back(x):
        suspend x

? I'm vaguely picturing some kind of additional frame state, a pointer to the topmost frame that's "expecting" to receive a suspend. (I see you resolve this in a different way later, though.)

> ...
> I thought that tree traversal was one of Tim's first examples of
> generators; would I really have to use an explicit stack to create
> the traversal?
As before, still no , but the one-frame version does require an unbroken *implicit* chain back to the intended receiver, with an explicit "suspend" at every step back to that. Let me rewrite the one-frame version in a way that assumes less semantics from "suspend", instead building on the already-assumed new smarts in "for":

    def inorder(node):
        if node:
            for child in inorder(node.left):
                suspend child
            suspend node
            for child in inorder(node.right):
                suspend child

I hope this makes it clearer that the one-frame version spawns two *new* generators for every non-None node, and in purely stack-like fashion (both "recursing down" and "suspending up").

> Next, I want more clarity about the initialization and termination
> conditions.

Good idea.

> The Demo/threads/Generator.py version is very explicit about
> initialization: you instantiate the Generator class, passing it a
> function that takes a Generator instance as an argument; the function
> executes in a new thread. (I guess I would've used a different
> interface now -- perhaps inheriting from the Generator class
> overriding a run() method.)

I would change my coroutine implementation similarly.

> For termination, the normal way to stop seems to be for the generator
> function to return (rather than calling g.put()), the consumer then gets
> an EOFError exception the next time it calls g.get(). There's also a
> way for either side to call g.kill() to stop the generator prematurely.

A perfectly serviceable interface, but "feels clumsy" in comparison to normal for loops and e.g. reading lines from a file, where *visible* exceptions aren't raised at the end. I expect most sequences to terminate before I do , so (visible) try/except isn't the best UI here.

> Let me try to translate that to a threadless implementation.
We could
> declare a simple generator as follows:
>
>     generator reverse(seq):
>         i = len(seq)
>         while i > 0:
>             i = i-1
>             suspend seq[i]
>
> This could be translated by the Python translator into the following,
> assuming a system class generator which provides the machinery for
> generators:
>
>     class reverse(generator):
>         def run(self, seq):
>             i = len(seq)
>             while i > 0:
>                 i = i-1
>                 self.suspend(seq[i])
>
> (Perhaps the identifiers generator, run and suspend would be spelled
> with __...__, but that's just clutter for now.)
>
> Now where Tim was writing examples like this:
>
>     for c in reverse("Hello world"):
>         print c,
>     print
>
> I'd like to guess what the underlying machinery would look like. For
> argument's sake, let's assume the for loop recognizes that it's using
> a generator (or better, it always needs a generator, and when it's not
> a generator it silently implies a sequence-iterating generator).

In the end I expect these concepts could be unified, e.g. via a new class __iterate__ method. Then

    for i in 42:

could fail simply because ints don't have a value in that slot, while lists and tuples could inherit from SequenceIterator, pushing the generation of the index range into the type instead of explicitly constructed by the eval loop.

> So the translator could generate the following:
>
>     g = reverse("Hello world") # instantiate class reverse
>     while 1:
>         try:
>             c = g.resume()
>         except EOGError: # End Of Generator
>             break
>         print c,
>     print
>
> (Where g should really be a unique temporary local variable.)
>
> In this model, the g.resume() and g.suspend() calls have all the magic.
> They should not be accessible to the user.

This seems at odds with the later:

> (The user may write this code explicitly if they want to consume the
> generated elements in a different way than through a for loop.)

Whether it's at odds or not, I like the latter better. When the machinery is clean & well-designed, expose it!
Else in 2002 we'll be subjected to a generatorhacks module .

> They are written in C so they can play games with frame objects.
>
> I guess that the *first* call to g.resume(), for a particular
> generator instance, should start the generator's run() method; run()
> is not activated by the instantiation of the generator.

This can work either way. If it's more convenient to begin run() as part of instantiation, the code for run() can start with an equivalent of

    if self.first_time:
        self.first_time = 0
        return

where self.first_time is set true by the constructor. Then "the frame" will exist from the start. The first resume() will skip over that block and launch into the code, while subsequent resume()s will never even see this block: almost free.

> Then run() runs until the first suspend() call, which causes the return
> from the resume() call to happen. Subsequent resume() calls know that
> there already is a frame (it's stored in the generator instance) and simply
> continue its execution where it was. If the run() method returns from
> the frame, the resume() call is made to raise EOGError (blah, bogus
> name) which signals the end of the loop. (The user may write this
> code explicitly if they want to consume the generated elements in a
> different way than through a for loop.)

Yes, that parenthetical comment bears repeating .

> Looking at this machinery, I think the recursive generator that I
> wanted could be made to work, by explicitly declaring a generator
> subclass (instead of using the generator keyword, which is just
> syntactic sugar) and making calls to methods of self, e.g.:
>
>     class inorder(generator):
>         def run(self, node):
>             if node.left: self.run(node.left)
>             self.suspend(node)
>             if node.right: self.run(node.right)

Going way back to the top, this implies the

    def pass_it_back(x):
        suspend x

indirection couldn't work -- unless pass_it_back were also a method of inorder. Not complaining, just trying to understand.
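The pass_it_back question can be tried out with present-day Python generators, which behave like the one-frame flavor: a `yield` hands its value to whoever resumed the function that lexically contains it. The sketch below is an illustration under that assumption; `lost` and `relayed` are hypothetical names.

```python
def pass_it_back(x):
    yield x                       # delivers x to whoever iterates pass_it_back(x)


def lost(items):
    """Calls the helper without iterating it, so nothing reaches the consumer."""
    for x in items:
        pass_it_back(x)           # builds a generator object and discards it
    yield from ()                 # keeps lost() itself a generator; yields nothing


def relayed(items):
    """Relays the helper's results explicitly, one frame at a time."""
    for x in items:
        yield from pass_it_back(x)


print(list(lost([1, 2, 3])), list(relayed([1, 2, 3])))
```

In the one-frame model the indirection only works when every intermediate frame passes the value along explicitly, which is the "unbroken chain" requirement discussed earlier in this message.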
Once you generalize, it's hard to know when to stop.

> The generator machinery would (ab)use the fact that Python frames
> don't necessarily have to be linked in a strict stack order;

If you call *this* abuse, what words remain to vilify what Christian is doing ?

> the generator gets a pointer to the frame to resume from resume(),

Ah! That addresses my first question. Are you implicitly assuming a "stackless" eval loop here? Else resuming the receiving frame would appear to push another C stack frame for each value delivered, ever deeper. The "one frame" version of generators doesn't have this headache (since a suspend *returns* to its immediate caller there -- it doesn't *resume* its caller).

> and there's a "bottom" frame which, when hit, raises the EOGError
> exception.

Although described at the end, this is something set up at the start, right? To trap a plain return from the topmost invocation of the generator.

> All currently active frames belonging to the generator stay alive
> while another resume() is possible.

And those form a linear chain from the most-recent suspend() back to the primal resume(). Which appears to address an earlier issue not brought up in this message: this provides a well-defined & intuitively clear path for exceptions to follow, yes? I'm not sure about coroutines, but there's something wrong with a generator implementation if the guy who kicks it off can't see errors raised by the generator's execution! This doesn't appear to be a problem here.

> All this is possible by the introduction of an explicit generator
> object. I think Tim had an implementation in mind where the standard
> return pointer in the frame is the only thing necessary; actually, I
> think the return pointer is stored in the calling frame, not in the
> called frame

What I've had in mind is what Majewski implemented 5 years ago, but lost interest in because it couldn't be extended to those blasted continuations .
The called frame points back to the calling frame via f->f_back (of course), and I think that's all the return info the one-frame version needs. I expect I'm missing your meaning here.

> (Christian? Is this so in your version?). That shouldn't make a
> difference, except that it's not clear to me how to reference the frame
> (in the explicitly coded version, which has to exist at least at the
> bytecode level).

"The" frame being which frame specifically, and referenced from where? Regardless, it must be solvable, since if Christian can (& he thinks he can, & I believe him ) expose a call/cc variant, the generator class could be coded entirely in Python.

> With classic coroutines, I believe that there's no difference between
> the first call and subsequent calls to the coroutine. This works in
> the Knuth world where coroutines and recursion don't go together;

That's also a world where co-transfers are implemented via funky self-modifying assembler, custom-crafted for the exact number of coroutines you expect to be using -- I don't recommend Knuth as a guide to *implementing* these beasts <0.3 wink>.

That said, yes, provided the coroutine objects all exist, there's nothing special about the first call. About "provided that": if your coroutine objects A and B have "run" methods, you dare not invoke A.run() before B has been constructed (else the first instance of B.transfer() in A chokes -- there's no object to transfer *to*). So, in practice, I think instantiation is still divorced from initiation. One possibility is to hide all that in a cobegin(list_of_coroutine_classes_to_instantiate_and_run) function. But then naming the instances is a puzzle.

> but at least for generators I would hope that it's possible for multiple
> instances of the same generator to be active simultaneously (e.g. I
> could be reversing over a list of files and then reverse each of the
> lines in the file; this uses separate instances of the reverse()
> generator).
Since that's the trick the "one frame" generators *rely* on for recursion, it's surely not a problem in your stronger version. Note that my old coroutine implementation did allow for multiple instances of a coroutine, although the examples posted with it didn't illustrate that. The weakness of coroutines in practice is (in my experience) the requirement that you *name* the target of a transfer. This is brittle; e.g., in the pipeline example I posted, each stage had to know the names of the stages on either side of it. By adopting a target.transfer(optional_value) primitive it's possible to *pass in* the target object as an argument to the coroutine doing the transfer. Then "the names" are all in the setup, and don't pollute the bodies of the coroutines (e.g., each coroutine in the pipeline example could have arguments named "stdin" and "stdout"). I haven't seen a system that *does* this, but it's so obviously the right thing to do it's not worth saying any more about . > So we need a way to reference the generator instance separately from > the generator constructor. The machinery I sketched above solves this. > > After Tim has refined or rebutted this, I think I'll be able to > suggest what to do for coroutines. Please do. Whether or not it's futile, it's fun . hmm-haven't-had-enough-of-that-lately!-ly y'rs - tim From tismer at appliedbiometrics.com Fri Jul 9 14:22:05 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Fri, 09 Jul 1999 14:22:05 +0200 Subject: [Python-Dev] Generator details References: <000301bec81b$56e87660$c99e2299@tim> <199907080508.BAA00623@eric.cnri.reston.va.us> Message-ID: <3785E96D.A1641530@appliedbiometrics.com> Guido van Rossum wrote: [snipped all what's addressed to Tim] > All this is possible by the introduction of an explicit generator > object. 
I think Tim had an implementation in mind where the standard > return pointer in the frame is the only thing necessary; actually, I > think the return pointer is stored in the calling frame, not in the > called frame (Christian? Is this so in your version?). That > shouldn't make a difference, except that it's not clear to me how to > reference the frame (in the explicitly coded version, which has to > exist at least at the bytecode level). No, it isn't. It is still as it was. I didn't change the frame machinery at all. The callee finds his caller in its f_back field. [...] > (I'm still baffled by continuations. The question whether the for > saved and restored loop should find itself in the 1st or 5th iteration > surprises me. Doesn't this cleanly map into some Scheme code that > tells us what to do? Or is it unclear because Scheme does all loops > through recursion? I presume that if you save the continuation of the > 1st iteration and restore it in the 5th, you'd find yourself in the > back 1st iteration? But this is another thread.) In Scheme, Python's for-loop would be a tail-recursive expression, it would especially be its own extra lambda. Doesn't fit. Tim is right when he says that Python isn't Scheme. Yesterday I built your suggested change to for-loops, and it works fine. By turning the loop counter into a mutable object, every reference to it shares the current value, and it behaves like Tim pointed out it should. About Tims reply to this post: [Gui-do] > The generator machinery would (ab)use the fact that Python frames > don't necessarily have to be linked in a strict stack order; [Tim-bot] If you call *this* abuse, what words remain to vilify what Christian is doing ? As a matter of fact, I have been thinking quite long about this *abuse*. At the moment I do not do this. The frame stack becomes a frame tree, and you can jump like Tarzan from leaf to leaf, but I never change the order. 
Perhaps this can make sense too, but this is currently where *my* brain explodes. Right now I'm happy that there is *always* a view of the top level, and an exception always knows where to wind up. From that point of view, I'm even more conservative than Guido (above) and Sam (replacing whole frame chains). In a sense, since I don't change the frame chain but only change the current frame, this is like a functional way to use weak references.

The continuation approach is to build new paths in a tree, and lose those which are unreachable. Modifying the tree is not part of my model at the moment. This may be interesting to study after we know everything about this tree and we need even more freedom.

ciao - chris

--
Christian Tismer             :^)
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home

From guido at CNRI.Reston.VA.US Sat Jul 10 16:28:13 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Sat, 10 Jul 1999 10:28:13 -0400
Subject: [Python-Dev] Generator details
In-Reply-To: Your message of "Fri, 09 Jul 1999 14:22:05 +0200." <3785E96D.A1641530@appliedbiometrics.com>
References: <000301bec81b$56e87660$c99e2299@tim> <199907080508.BAA00623@eric.cnri.reston.va.us> <3785E96D.A1641530@appliedbiometrics.com>
Message-ID: <199907101428.KAA04364@eric.cnri.reston.va.us>

[Christian]
> The frame stack becomes a frame tree, and you can jump like Tarzan
> from leaf to leaf [...].

Christian, I want to kiss you! (OK, just a hug. We're both Europeans. :-)

This one remark suddenly made me understand much better what continuations do -- it was the one missing piece of insight I still needed after Tim's explanation and skimming the Scheme tutorial a bit.
I'll have to think more about the consequences but this finally made me understand better how to interpret the mysterious words ``the continuation represents "the rest of the program"''.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at CNRI.Reston.VA.US Sat Jul 10 17:48:43 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Sat, 10 Jul 1999 11:48:43 -0400
Subject: [Python-Dev] Generator details
In-Reply-To: Your message of "Fri, 09 Jul 1999 03:47:36 EDT." <000c01bec9df$4f935c20$c49e2299@tim>
References: <000c01bec9df$4f935c20$c49e2299@tim>
Message-ID: <199907101548.LAA04399@eric.cnri.reston.va.us>

I've been thinking some more about Tim's single-frame generators, and I think I understand better how to implement them now.

(And yes, it was a mistake of me to write that the suspend() and resume() methods shouldn't be accessible to the user! Also thanks for the clarification of how to write a recursive generator.)

Let's say we have a generator function like this:

    generator reverse(l):
        i = len(l)
        while i > 0:
            i = i-1
            suspend l[i]

and a for loop like this:

    for i in reverse(range(10)):
        print i

What is the expanded version of the for loop? I think this will work:

    __value, __frame = call_generator(reverse, range(10))
    while __frame:
        i = __value
        # start of original for loop body
        print i
        # end of original for loop body
        __value, __frame = resume_frame(__frame)

(Note that when the original for loop body contains 'continue', this should jump to the resume_frame() call. This is just pseudo code.)

Now we must define two new built-in functions: call_generator() and resume_frame().

- call_generator() is like apply() but it returns a pair (result, frame) where result is the function result and frame is the frame, *if* the function returned via suspend. If it returned via return, call_generator() returns None for the frame.

- resume_frame() does exactly what its name suggests. It has the same return convention as call_generator().
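The expansion above can be tried out in present-day Python, where `yield` plays the role of the proposed `suspend` and a generator object stands in for the suspended frame. This is only a sketch under those assumptions: call_generator() and resume_frame() are the names used in the message, but the bodies below are a hypothetical emulation, not the proposed builtins, and expanded_for_loop() is a made-up helper that collects values where the original loop body would print them.

```python
def resume_frame(frame):
    """Resume a suspended generator.

    Returns (value, frame) if it suspended again, or (result, None)
    if it finished via a plain return.
    """
    try:
        return next(frame), frame
    except StopIteration as stop:
        return stop.value, None


def call_generator(func, *args):
    """Start func(*args) and run it up to its first suspension."""
    return resume_frame(func(*args))


def reverse(l):
    # 'yield' stands in for the proposed 'suspend' statement.
    i = len(l)
    while i > 0:
        i = i - 1
        yield l[i]


def expanded_for_loop(func, *args):
    """Hand-expanded form of: for i in func(*args): <body>."""
    collected = []
    value, frame = call_generator(func, *args)
    while frame is not None:
        i = value
        collected.append(i)  # stands in for the loop body ("print i")
        value, frame = resume_frame(frame)
    return collected
```

Calling expanded_for_loop(reverse, list(range(10))) walks the list back to front, and the final (non-suspend) return value is dropped, as in the message.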
Note that the for loop throws away the final (non-suspend) return value of the generator -- this just signals the end of the loop.

How to translate the generator itself? I've come up with two versions.

First version: add a new bytecode SUSPEND, which does the same as RETURN but also marks the frame as resumable. call_generator() then calls the function using a primitive which allows it to specify the frame (e.g. a variant of eval_code2 taking a frame argument). When the call returns, it looks at the resumable bit of the frame to decide whether to return (value, frame) or (value, None). resume_frame() simply marks the frame as non-resumable and continues its execution; upon return it does the same thing as call_generator().

Alternative translation version: introduce a new builtin get_frame() which returns the current frame. The statement "suspend x" gets translated to "return x, get_frame()" and the statement "return x" (including the default "return None" at the end of the function) gets translated to "return x, None". So our example turns into:

    def reverse(l):
        i = len(l)
        while i > 0:
            i = i-1
            return l[i], get_frame()
        return None, None

This of course means that call_generator() can be exactly the same as apply(), and in fact we'd better get rid of it, so the for loop translation becomes:

    __value, __frame = reverse(range(10))
    while __frame:
        ...same as before...

In a real implementation, get_frame() could be a new bytecode; but it doesn't have to be (making for easier experimentation). (get_frame() makes a fine builtin; there's nothing inherently dangerous to it, in fact people get it all the time, currently using horrible hacks!).

I'm not sure which is better; the version without call_generator() allows you to create your own generator without using the 'generator' and 'suspend' keywords, calling get_frame() explicitly.

Loose end: what to do when there's a try/finally around a suspend? E.g.

    generator foo(l):
        try:
            for i in l:
                suspend i+1
        finally:
            print "Done"

The second translation variant would cause "Done" to be printed on each suspend *and* on the final return. This is confusing (and in fact I think resuming the frame would be a problem since the return breaks down the try-finally blocks).

So I guess the SUSPEND bytecode is a better implementation -- it can suspend the frame without going through try-finally clauses.

Then of course we create another loose end: what if the for loop contains a break? Then the frame will never be resumed and its finally clause will never be executed! This sounds bad. Perhaps the destructor of the frame should look at the 'resumable' bit and if set, resume the frame with a system exception, "Killed", indicating an abortion? (This is like the kill() call in Generator.py.) We can increase the likelihood that the frame's destructor is called at the expected time (right when the for loop terminates), by deleting __frame at the end of the loop. If the resumed frame raises another exception, we ignore it. Its return value is ignored. If it suspends itself again, we resume it with the "Killed" exception again until it dies (thoughts of the Black Knight come to mind).

I am beginning to like this idea. (Not that I have time for an implementation... But it could be done without Christian's patches.)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From tim_one at email.msn.com Sat Jul 10 23:09:48 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Sat, 10 Jul 1999 17:09:48 -0400
Subject: [Python-Dev] Generator details
In-Reply-To: <199907101428.KAA04364@eric.cnri.reston.va.us>
Message-ID: <000501becb18$8ae0e240$e69e2299@tim>

[Christian]
> The frame stack becomes a frame tree, and you can jump like Tarzan
> from leaf to leaf [...].

[Guido]
> Christian, I want to kiss you! (OK, just a hug. We're both
> Europeans.
:-)

Not in America, pal -- the only male hugging allowed here is in the two seconds after your team wins the Superbowl -- and even then only so long as you haven't yet taken off your helmets.

> This one remark suddenly made me understand much better what
> continuations do -- it was the one missing piece of insight I still
> needed after Tim's explanation and skimming the Scheme tutorial a bit.

It's an insight I was missing too -- continuations are often *invoked* in general directed-graph fashion, and before Christian said that I hadn't realized the *implementation* never sees anything worse than a tree.

So next time I see Christian, I'll punch him hard in the stomach, and mumble "good job" loudly enough so that he hears it, but indistinctly enough so I can plausibly deny it in case any other guy overhears us. *That's* the American Way .

first-it's-hugging-then-it's-song-contests-ly y'rs - tim

From MHammond at skippinet.com.au Sun Jul 11 02:52:22 1999
From: MHammond at skippinet.com.au (Mark Hammond)
Date: Sun, 11 Jul 1999 10:52:22 +1000
Subject: [Python-Dev] Win32 Extensions Registered Users
Message-ID: <003c01becb37$a369c3d0$0801a8c0@bobcat>

Hi all,

As you may or may not have noticed, I have recently begun offering a "Registered Users" program where people who use my Windows extensions can pay $50.00 per 2 years, and get a range of benefits. The primary benefits are:

* Early access to binary versions.
* Registered Users only mailing list (very low volume to date)
* Better support from me.

The last benefit really isn't relevant to this list - anyone here will obviously get (and hopefully does get) a pretty good response should they need to mail me.

The early access to binary versions may be of interest. As everyone on this list spends considerable and worthwhile effort helping Python, I would like to offer everyone here a free registration. If you would like to take advantage, just send me a quick email.
I will email you the "top secret" location of the Registered Users page (where the very slick and very new Pythonwin can be found). Also, feel free to join the registered users mailing list at http://mailman.pythonpros.com/mailman/listinfo/win32-reg-users. This is low volume, and once volume does increase an announce list will be created, so you can join without fear of more swamping of your mailbox. And just FYI, I am very pleased with the registration process to date. In about 3 weeks I have around 20 paid users! If I can keep that rate up I will be very impressed (although that already looks highly unlikely :-) Even still, I consider it going well. Mark. From tim_one at email.msn.com Sun Jul 11 21:49:57 1999 From: tim_one at email.msn.com (Tim Peters) Date: Sun, 11 Jul 1999 15:49:57 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: Message-ID: <000201becbd6$8df90660$569e2299@tim> [David Ascher] > FWIW, I've been following the coroutine/continuation/generator bit with > 'academic' interest -- the CS part of my brain likes to read about them. > Prompted by Tim's latest mention of Demo/threads/Generator.py, I looked at > it (again?) and *immediately* grokked it and realized how it'd fit into a > tool I'm writing. Nothing to do with concurrency, I/O, etc -- just > compartmentalization of stateful iterative processes (details too baroque > to go over). "stateful iterative process" is a helpful characterization of where these guys can be useful! State captured in variables is the obvious one, but simply "where you are" in a mass of nested loops and conditionals is also "state" -- and a kind of state especially clumsy to encode as data state instead (ever rewrite a hairy recursive routine to use iteration with an explicit stack? it's a transformation that can be mechanized, but the result is usually ugly & often hard to understand). 
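As a small illustration of "where you are" as state, here is an inorder tree walk written twice in present-day Python: once as a recursive generator, where the position in the code carries the state, and once as an iterator that encodes the same state as data in an explicit stack. Node, inorder_generator and InorderIterator are names made up for this sketch, and `yield` is today's spelling, not anything proposed in this thread.

```python
class Node:
    """A minimal binary tree node (hypothetical, for illustration)."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def inorder_generator(node):
    """State lives in 'where we are' inside nested recursive calls."""
    if node is None:
        return
    yield from inorder_generator(node.left)
    yield node.value
    yield from inorder_generator(node.right)


class InorderIterator:
    """The same traversal with the control state encoded as data."""

    def __init__(self, root):
        self._stack = []
        self._push_left_spine(root)

    def _push_left_spine(self, node):
        # Remember every node whose left subtree we are descending into.
        while node is not None:
            self._stack.append(node)
            node = node.left

    def __iter__(self):
        return self

    def __next__(self):
        if not self._stack:
            raise StopIteration
        node = self._stack.pop()
        self._push_left_spine(node.right)
        return node.value
```

Both produce the same sequence for a given tree; the second has to spell out by hand the bookkeeping that the first gets from suspended frames.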
Once it sinks in that it's *possible* to implement a stateful iterative process in this other way, I think you'll find examples popping up all over the place. > More relevantly, that tool would be useful on thread-less > Python's (well, when it reaches usefulness on threaded Pythons =). As Guido pointed out, the API provided by Generator.py is less restrictive than any that can be built with the "one frame" flavor of generator ("resumable function"). Were you able to make enough sense of the long discussion that ensued to guess whether the particular use you had in mind required Generator.py's full power? If you couldn't tell, post the baroque details & I'll tell you . not-putting-too-fine-a-point-on-possible-vs-natural-ly y'rs - tim From da at ski.org Sun Jul 11 22:14:04 1999 From: da at ski.org (David Ascher) Date: Sun, 11 Jul 1999 13:14:04 -0700 (Pacific Daylight Time) Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <000201becbd6$8df90660$569e2299@tim> Message-ID: On Sun, 11 Jul 1999, Tim Peters wrote: > As Guido pointed out, the API provided by Generator.py is less restrictive > than any that can be built with the "one frame" flavor of generator > ("resumable function"). Were you able to make enough sense of the long > discussion that ensued to guess whether the particular use you had in mind > required Generator.py's full power? If you couldn't tell, post the baroque > details & I'll tell you . I'm pretty sure the use I mentioned would fit in even the simplest version of a generator. As to how much sense I made of the discussion, let's just say I'm glad there's no quiz at the end. I did shudder at the mention of unmentionables (male public displays of affection -- yeaach!), yodel at the mention of Lord Greystoke swinging among stack branches and chuckled at the vision of him being thrown back in a traceback (ouch! ouch! ouch!, "most painful last"...). 
--david

From tim_one at email.msn.com Mon Jul 12 04:26:44 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Sun, 11 Jul 1999 22:26:44 -0400
Subject: [Python-Dev] Generator details
In-Reply-To: <199907101548.LAA04399@eric.cnri.reston.va.us>
Message-ID: <000001becc0d$fcb64f40$229e2299@tim>

[Guido, sketches 112 ways to implement one-frame generators today ]

I'm glad you're having fun too! I won't reply in detail here; it's enough for now to happily agree that adding a one-frame generator isn't much of a stretch for the current implementation of the PVM.

> Loose end: what to do when there's a try/finally around a suspend?
> E.g.
>
> generator foo(l):
>     try:
>         for i in l:
>             suspend i+1
>     finally:
>         print "Done"
>
> The second translation variant would cause "Done" to be printed on
> each suspend *and* on the final return. This is confusing (and in
> fact I think resuming the frame would be a problem since the return
> breaks down the try-finally blocks).

There are several things to be said about this:

+ A suspend really can't ever go thru today's normal "return" path, because (among other things) that wipes out the frame's value stack!

    while (!EMPTY()) {
        v = POP();
        Py_XDECREF(v);
    }

A SUSPEND opcode would let it do what it needs to do without mixing that into the current return path. So my answer to:

> I'm not sure which is better; the version without call_generator()
> allows you to create your own generator without using the 'generator'
> and 'suspend' keywords, calling get_frame() explicitly.

is "both" : get_frame() is beautifully clean, but it still needs something like SUSPEND to keep everything straight. Maybe this just amounts to setting "why" to a new WHY_SUSPEND and sorting it all out after the eval loop; OTOH, that code is pretty snaky already.

+ I *expect* the example code to print "Done" len(l)+1 times!
The generator mechanics are the same as the current for/__getitem__ protocol in this respect: if you have N items to enumerate, the enumeration routine will get called N+1 times, and that's life. That is, the fact is that the generator "gets to" execute code N+1 times, and the only reason your original example seems surprising at first is that it doesn't happen to do anything (except exit the "try" block) on the last of those times. Change it to

    generator foo(l):
        try:
            for i in l:
                suspend i+1
            cleanup()   # new line
        finally:
            print "Done"

and then you'd be surprised *not* to see "Done" printed len(l)+1 times. So I think the easiest thing is also the right thing in this case.

OTOH, the notion that the "finally" clause should get triggered at all the first len(l) times is debatable. If I picture it as a "resumable function" then, sure, it should; but if I picture the caller as bouncing control back & forth with the generator, coroutine style, then suspension is just a pause in the generator's execution. The latter is probably the more natural way to picture it, eh?

Which feeds into:

> Then of course we create another loose end: what if the for loop
> contains a break? Then the frame will never be resumed and its
> finally clause will never be executed! This sounds bad. Perhaps the
> destructor of the frame should look at the 'resumable' bit and if set,
> resume the frame with a system exception, "Killed", indicating an
> abortion? (This is like the kill() call in Generator.py.) We can
> increase the likelihood that the frame's destructor is called at the
> expected time (right when the for loop terminates), by deleting
> __frame at the end of the loop. If the resumed frame raises another
> exception, we ignore it. Its return value is ignored. If it suspends
> itself again, we resume it with the "Killed" exception again until it
> dies (thoughts of the Black Knight come to mind).
This may leave another loose end : what if the for loop doesn't contain a break, but dies because of an exception in some line unrelated to the generator? Or someone has used an explicit get_frame() in any case and that keeps a ref to the frame alive? If the semantic is that the generator must be shut down no matter what, then the invoker needs code more like

    value, frame = generator(args)
    try:
        while frame:
            etc
            value, frame = resume_frame(frame)
    finally:
        if frame:
            shut_frame_down(frame)

OTOH, the possibility that someone *can* do an explicit get_frame suggests that "for" shouldn't assume it's the master of the universe . Perhaps the user's intent was to generate the first 100 values in a for loop, then break out, analyze the results, and decide whether to resume it again by hand (I've done stuff like that ...). So there's also a case to be made for saying that a "finally" clause wrapping a generator body will only be executed if the generator body raises an exception or the generator itself decides it's done; i.e. iff it triggers while the generator is actively running.

Just complicating things there . It actually sounds pretty good to raise a Killed exception in the frame destructor! The destructor has to do *something* to trigger the code that drains the frame's value stack anyway, "finally" blocks or not (frame_dealloc doesn't do that now, since there's currently no way to get out of eval_code2 with a non-empty stack).

> ...
> I am beginning to like this idea. (Not that I have time for an
> implementation... But it could be done without Christian's patches.)

Or with them too . If stuff is implemented via continuations, the same concerns about try/finally blocks pop up everywhere a continuation is invoked: you (probably) leave the current frame, and may or may not ever come back. So if there's a "finally" clause pending and you don't ever come back, it's a surprise there too.
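For comparison, present-day Python generators behave much like the "Killed" idea: close() raises GeneratorExit at the point of suspension, so a pending "finally" clause runs once, when the generator finishes or is shut down, and not on each suspension. A minimal sketch, with `yield` standing in for `suspend`; foo mirrors the example in the thread (recording to a list where the original prints), and take_first is a made-up helper playing the part of the invoker code above.

```python
def foo(items, log):
    """The thread's example, with 'yield' in place of 'suspend'."""
    try:
        for i in items:
            yield i + 1
    finally:
        log.append("Done")  # stands in for: print "Done"


def take_first(items, log):
    """Break out of the loop early, then shut the generator down."""
    frame = foo(items, log)
    try:
        for value in frame:
            log.append(value)
            break  # the generator is left suspended inside its try block
    finally:
        # Raises GeneratorExit inside the suspended generator, which
        # runs its pending 'finally' clause before the frame goes away.
        frame.close()
    return log
```

With this spelling, the explicit try/finally in the invoker covers both the break case and an unrelated exception in the loop body.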
So while you thought you were dealing with dirt-simple one-frame generators, you were *really* thinking about how to make general continuations play nice . solve-one-mystery-and-you-solve-'em-all-ly y'rs - tim From guido at CNRI.Reston.VA.US Mon Jul 12 05:01:04 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Sun, 11 Jul 1999 23:01:04 -0400 Subject: [Python-Dev] Generator details In-Reply-To: Your message of "Sun, 11 Jul 1999 22:26:44 EDT." <000001becc0d$fcb64f40$229e2299@tim> References: <000001becc0d$fcb64f40$229e2299@tim> Message-ID: <199907120301.XAA06001@eric.cnri.reston.va.us> [Tim seems to be explaining why len(l)+1 and not len(l) -- but I was really thinking about len(l)+1 vs. 1.] > OTOH, the notion that the "finally" clause should get triggered at all the > first len(l) times is debatable. If I picture it as a "resumable function" > then, sure, it should; but if I picture the caller as bouncing control back > & forth with the generator, coroutine style, then suspension is a just a > pause in the generator's execution. The latter is probably the more natural > way to picture it, eh? *This* is what I was getting at, and it points in favor of a SUSPEND opcode since I don't know how to do that in the multiple-return. As you point out, there can be various things on the various in-frame stacks (value stack and block stack) that all get discarded by a return, and that no restart_frame() can restore (unless get_frame() returns a *copy* of the frame, which seems to be defeating the purpose). > OTOH, the possibility that someone *can* do an explicit get_frame suggests > that "for" shouldn't assume it's the master of the universe . Perhaps > the user's intent was to generate the first 100 values in a for loop, then > break out, analyze the results, and decide whether to resume it again by > hand (I've done stuff like that ...). 
So there's also a case to be made for > saying that a "finally" clause wrapping a generator body will only be > executed if the generator body raises an exception or the generator itself > decides it's done; i.e. iff it triggers while the generator is actively > running. Hmm... I think that if the generator is started by a for loop, it's okay for the loop to assume it is the master of the universe -- just like there's no force in the world (apart from illegal C code :) that can change the hidden loop counter in present-day for loop. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at CNRI.Reston.VA.US Mon Jul 12 05:36:05 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Sun, 11 Jul 1999 23:36:05 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: Your message of "Sun, 11 Jul 1999 15:49:57 EDT." <000201becbd6$8df90660$569e2299@tim> References: <000201becbd6$8df90660$569e2299@tim> Message-ID: <199907120336.XAA06056@eric.cnri.reston.va.us> [Tim] > "stateful iterative process" is a helpful characterization of where these > guys can be useful! State captured in variables is the obvious one, but > simply "where you are" in a mass of nested loops and conditionals is also > "state" -- and a kind of state especially clumsy to encode as data state > instead (ever rewrite a hairy recursive routine to use iteration with an > explicit stack? it's a transformation that can be mechanized, but the > result is usually ugly & often hard to understand). This is another key description of continuations (maybe not quite worth a hug :). The continuation captures exactly all state that is represented by "position in the program" and no state that is represented by variables. But there are many hairy details. In antiquated assembly, there might not be a call stack, and a continuation could be represented by a single value: the program counter. 
But now we have a call stack, a value stack, a block stack (in Python) and who knows what else.

I'm trying to understand whether we can get away with saving just a pointer to a frame, whether we need to copy the frame, or whether we need to copy the entire frame stack.

(In regular Python, the frame stack also contains local variables. These are explicitly exempted from being saved by a continuation. I don't know how Christian does this, but I presume he uses the dictionary which can be shared between frames.)

Let's see... Say we have this function:

    def f(x):
        try:
            return 1 + (-x)
        finally:
            print "boo"

The bytecode (simplified) looks like:

        SETUP_FINALLY (L1)
        LOAD_CONST (1)
        LOAD_FAST (x)
        UNARY_NEGATIVE
        BINARY_ADD
        RETURN_VALUE

    L1: LOAD_CONST ("boo")
        PRINT_ITEM
        PRINT_NEWLINE
        END_FINALLY

Now suppose that the unary minus operator saves its continuation (e.g. because x was a type with a __neg__ method). At this point there is an entry on the block stack pointing to L1 as the try-finally block, and the value stack has the value 1 pushed on it.

Clearly if that saved continuation is ever invoked (called? used? activated? What do you call what you do to a continuation?) it should substitute whatever value was passed into the continuation for the result of the unary minus, and the program should continue by pushing it on top of the value stack, adding it to 1, and returning the result, executing the block of code at L1 on the way out. So clearly when the continuation is used, 1 should be on the value stack and L1 should be on the block stack.

Assuming that the unary minus function initially returns just fine, the value stack and the block stack of the frame will be popped. So I conclude that saving a continuation must save at least the value and block stack of the frame being saved.

Is it safe not to save the frame and block stacks of frames further down on the call stack?
I don't think so -- these are all destroyed when frames are popped off the call stack (even if the frame is kept alive, its value and block stack are always empty when the function has returned). So I hope that Christian has code that saves the frame and block stacks! (It would be fun to try and optimize this by doing it lazily, so that frames which haven't returned yet aren't copied yet.) How does Scheme do this? I don't know if it has something like the block stack, but surely it has a value stack! Still mystified, --Guido van Rossum (home page: http://www.python.org/~guido/) From tim_one at email.msn.com Mon Jul 12 09:03:59 1999 From: tim_one at email.msn.com (Tim Peters) Date: Mon, 12 Jul 1999 03:03:59 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <199907120336.XAA06056@eric.cnri.reston.va.us> Message-ID: <000201becc34$b79f7900$9b9e2299@tim> [Guido wonders about continuations -- must be a bad night for sleep ] Paul Wilson's book-in-progress has a (large) page of HTML that you can digest quickly and that will clear up many mysteries: ftp://ftp.cs.utexas.edu/pub/garbage/cs345/schintro-v14/schintro_142.html Scheme may be the most-often implemented language on Earth (ask 100 Schemers what they use Scheme for, persist until you get the truth, and 81 will eventually tell you that mostly they putz around writing their own Scheme interpreter <0.51 wink>); so there are a *lot* of approaches out there. Wilson describes a simple approach for a compiler. A key to understanding it is that continuations aren't "special" in Scheme: they're the norm. Even plain old calls are set up by saving the caller's continuation, then handing control to the callee. In Wilson's approach, "the eval stack" is a globally shared stack, but at any given moment contains only the eval temps relevant to the function currently executing. 
In preparation for a call, the caller saves away its state in "a continuation", a record which includes:

    the current program counter
    a pointer to the continuation record it inherited
    a pointer to the structure supporting name resolution (locals & beyond)
    the current eval stack, which gets drained (emptied) at this point

There isn't anything akin to Python's block stack (everything reduces to closures, lambdas and continuations). Note: the continuation is immutable; once constructed, it's never changed.

Then the callee's arguments are pushed on the eval stack, a pointer to the continuation as saved above is stored in "the continuation register", and control is transferred to the callee.

Then a function return is exactly the same operation as "invoking a continuation": whatever is in the continuation register at the time of the return/invoke is dereferenced, and the PC, continuation register, env pointer and eval stack values are copied out of the continuation record. The return value is passed back in another "virtual register", and pushed onto the eval stack first thing after the guts of the continuation are restored.

So this copies the eval stack all the time, at every call and every return/invoke. Kind of. This is partly why "tail calls" are such a big deal in Scheme: a tail call need not (*must* not, in std Scheme) create a new continuation. The target of a tail call simply inherits the continuation pointer inherited by its caller.

Of course many Scheme implementations optimize beyond this.

> I'm trying to understand whether we can get away with saving just a
> pointer to a frame, whether we need to copy the frame, or whether we
> need to copy the entire frame stack.

In the absence of tail calls, the approach above saves the stack on every call and restores it on every return, so there's no "extra" copying needed when capturing, or invoking, a continuation (cold comfort, I agree ).
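
The record-and-register scheme above can be sketched in a few lines. This is only an illustration in later-day Python, not Wilson's code: the names Continuation, Machine, call, ret and invoke are made up for the example, and the "program counter" is just an integer.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple


@dataclass(frozen=True)  # immutable: once constructed, never changed
class Continuation:
    pc: int                            # saved program counter
    parent: Optional["Continuation"]   # continuation record the caller inherited
    env: dict                          # name-resolution structure (shared, not copied)
    eval_stack: Tuple[Any, ...]        # snapshot of the caller's eval temps


class Machine:
    def __init__(self):
        self.pc = 0
        self.env = {}
        self.eval_stack = []
        self.cont = None               # "the continuation register"

    def call(self, target_pc, args, callee_env):
        # Save the caller's state, draining the shared eval stack.
        self.cont = Continuation(self.pc, self.cont, self.env,
                                 tuple(self.eval_stack))
        self.eval_stack = list(args)   # callee's arguments
        self.env = callee_env
        self.pc = target_pc

    def invoke(self, k, value):
        # Copy PC, continuation register, env and eval stack out of the
        # record, then push the passed-in value.
        self.pc, self.cont, self.env = k.pc, k.parent, k.env
        self.eval_stack = list(k.eval_stack)
        self.eval_stack.append(value)

    def ret(self, value):
        # A return is the same operation as invoking the current continuation.
        self.invoke(self.cont, value)


# Mirrors the "1 + (-x)" case: 1 is on the eval stack when the call is made.
m = Machine()
m.pc, m.env, m.eval_stack = 10, {"x": 5}, [1]
m.call(100, [5], {})
k = m.cont            # capturing costs nothing extra: the record already exists
m.ret(-5)             # ordinary return
first = list(m.eval_stack)
m.invoke(k, 42)       # the same record can be used again later
second = list(m.eval_stack)
```

Because the record is a snapshot that is never mutated, using it a second time finds the 1 still in place, which is the property the try/finally example needs.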
About Christian's code, we'd better let it speak for itself -- I'm not clear on the details of what he's doing today. Generalities:

> ...
> So I hope that Christian has code that saves the frame and block
> stacks!

Yes, but nothing gets copied until a continuation gets captured, and at the start of that I believe only one frame gets cloned.

> (It would be fun to try and optimize this by doing it lazily,
> so that frames which haven't returned yet aren't copied yet.)

He's aware of that .

> How does Scheme do this? I don't know if it has something like the
> block stack, but surely it has a value stack!

Stacks and registers and such aren't part of the language spec, but, you bet -- however it may be spelled in a given implementation, "a value stack" is there.

BTW, many optimizing Schemes define a weaker form of continuation too (call/ec, for "escaping continuation"). Skipping the mumbo jumbo definition <0.9 wink>, you can only invoke one of those if its target is on the path back from the invoker to the root of the call tree (climb up tree like Cheetah, not leap across branches like Tarzan). This amounts to a setjmp/longjmp in C -- and may be implemented that way!

i-say-do-it-right-or-not-at-all-ly y'rs - tim

From tismer at appliedbiometrics.com Mon Jul 12 11:44:06 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Mon, 12 Jul 1999 11:44:06 +0200
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
References: <000201becbd6$8df90660$569e2299@tim> <199907120336.XAA06056@eric.cnri.reston.va.us>
Message-ID: <3789B8E6.C4CB6840@appliedbiometrics.com>

Guido van Rossum wrote:
...
> I'm trying to understand whether we can get away with saving just a
> pointer to a frame, whether we need to copy the frame, or whether we
> need to copy the entire frame stack.

You need to preserve the stack and the block stack of a frame, if and only if it can be reached twice. I make this dependent on its refcount.
Every frame monitors itself before and after every call_function, if a handler field in the frame "f_callguard" has been set. If so, the callguard is called. Its task is to see whether we must preserve the current state of the frame and to carry this out.

The idea is to create a shadow frame "on demand". When I touch a frame with a refcount > 1, I duplicate it at its f_back pointer. By that it is turned into a "continuation frame" which is nothing more than the stack copy, IP, and the block stack.

By that, the frame stays in place where it was, all pointers are still fine. The "real" one is now in the back, and the continuation frame's purpose when called is only to restore the state of the "real one" and run it (after doing a new save if necessary).

I call this technique "push back frames".

>
> (In regular Python, the frame stack also contains local variables.
> These are explicitly exempted from being saved by a continuation. I
> don't know how Christian does this, but I presume he uses the
> dictionary which can be shared between frames.)

I keep the block stack and a stack copy. All the locals exist only once. The frame is also only one frame. Actually always a new one (due to push back), but virtually it is "the frame", with multiple continuation frames pointing at it.

...
> Clearly if that saved continuation is ever invoked (called? used?
> activated? What do you call what you do to a continuation?)

I think of throwing. Mine are thrown. The executive of standard frames is "eval_code2_loop(f, passed_retval)", where the executive of a continuation frame is "throw_continuation(f, passed_retval)".

...
> Is it safe not to save the frame and block stacks of frames further
> down on the call stack? I don't think so -- these are all destroyed
> when frames are popped off the call stack (even if the frame is kept
> alive, its value and block stack are always empty when the function
> has returned).
>
> So I hope that Christian has code that saves the frame and block
> stacks! (It would be fun to try and optimize this by doing it lazily,
> so that frames which haven't returned yet aren't copied yet.)

:-) I have exactly that, and I do it lazily already. Unless somebody saves a continuation, nothing special happens. But if he does, the push back process follows his path like a zip (? Reißverschluß) and ensures that the path can be walked again. Tarzan now has the end of this liana in his hand. He might use it to swing over, or he might drop it, and it ribbles away and vanishes as if it never existed.

Give me some final testing, and you will be able to try it out in a few days.

ciao - chris

--
Christian Tismer             :^)
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home

From tismer at appliedbiometrics.com Mon Jul 12 11:56:00 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Mon, 12 Jul 1999 11:56:00 +0200
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
References: <000201becc34$b79f7900$9b9e2299@tim>
Message-ID: <3789BBB0.39F6BD20@appliedbiometrics.com>

Tim Peters wrote:
...
> BTW, many optimizing Schemes define a weaker form of continuation too
> (call/ec, for "escaping continuation"). Skipping the mumbo jumbo definition
> <0.9 wink>, you can only invoke one of those if its target is on the path
> back from the invoker to the root of the call tree (climb up tree like
> Cheetah, not leap across branches like Tarzan). This amounts to a
> setjmp/longjmp in C -- and may be implemented that way!

Right, maybe this would do enough. We will throw away what's not needed, when we know what we actually need...
> i-say-do-it-right-or-not-at-all-ly y'rs - tim

...and at the moment I think it was right to take it all.

just-fixing-continuations-spun-off-in-an-__init__-which-
-is-quite-hard-since-still-recursive,-and-I-will-ship-it-ly y'rs - chris

--
Christian Tismer             :^)
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home

From bwarsaw at cnri.reston.va.us Mon Jul 12 17:42:14 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Mon, 12 Jul 1999 11:42:14 -0400 (EDT)
Subject: [Python-Dev] Generator details
References: <199907101548.LAA04399@eric.cnri.reston.va.us> <000001becc0d$fcb64f40$229e2299@tim>
Message-ID: <14218.3286.847367.125679@anthem.cnri.reston.va.us>

    | value, frame = generator(args)
    | try:
    |     while frame:
    |         etc
    |         value, frame = resume_frame(frame)
    | finally:
    |     if frame:
    |         shut_frame_down(frame)

Minor point, but why not make resume() and shutdown() methods on the frame? Isn't this much cleaner?

    value, frame = generator(args)
    try:
        while frame:
            etc
            value, frame = frame.resume()
    finally:
        if frame:
            frame.shutdown()

-Barry

From tismer at appliedbiometrics.com Mon Jul 12 21:39:40 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Mon, 12 Jul 1999 21:39:40 +0200
Subject: [Python-Dev] continuationmodule.c preview
Message-ID: <378A447C.D4DD24D8@appliedbiometrics.com>

Howdy,

please find attached my latest running version of continuationmodule.c which is really able to do continuations. You need stackless Python 0.3 for it, which I just submitted.

This module is by no means ready. The central functions are getpcc() and putcc. Call/cc is at the moment to be done like:

    def callcc(fun, *args, **kw):
        cont = getpcc()
        return apply(fun, (cont,)+args, kw)

getpcc(level=1) gets a parent's current continuation.
putcc(cont, val) throws a continuation.

At the moment, these are still frames (albeit special ones) which I will change. They should be turned into objects which have a link to the actual frame, which can be unlinked after a shot or by hand. This makes it easier to clean up circular references.

I have a rough implementation of this in Python, also a couple of generators and coroutines, but all not pleasing me yet.

Due to the fact that my son is ill, my energy has dropped a little for the moment, so I thought I'd better release something now. I will make the module public when things have been settled a little more.

ciao - chris

--
Christian Tismer             :^)
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home

-------------- next part --------------
A non-text attachment was scrubbed...
Name: continuationmodule.c
Type: application/x-unknown-content-type-cfile
Size: 19750 bytes
Desc: not available
URL:

From guido at CNRI.Reston.VA.US Mon Jul 12 22:04:21 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Mon, 12 Jul 1999 16:04:21 -0400
Subject: [Python-Dev] Python bugs database started
Message-ID: <199907122004.QAA09348@eric.cnri.reston.va.us>

Barry has installed Jitterbug on python.org and now we can use it to track Python bugs. I already like it much better than the todo wizard, because the response time is much better (the CGI program is written in C).

Please try it out -- submit bugs, search for bugs, etc. The URL is http://www.python.org/python-bugs/.

Some of you already subscribed to the mailing list (python-bugs-list) -- beware that this list receives a message for each bug reported and each followup.
The HTML is preliminary -- it is configurable (somewhat) and I would like to make it look nicer, but don't have the time right now.

There are certain features (such as moving bugs to different folders) that are only accessible to authorized users. If you have a good reason I might authorize you.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From tim_one at email.msn.com Tue Jul 13 06:03:25 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 13 Jul 1999 00:03:25 -0400
Subject: [Python-Dev] Python bugs database started
In-Reply-To: <199907122004.QAA09348@eric.cnri.reston.va.us>
Message-ID: <000701becce4$a973c920$31a02299@tim>

> Please try it out -- submit bugs, search for bugs, etc. The URL is
> http://www.python.org/python-bugs/.

Cool!

About those "Jitterbug bugs" (repeated submissions): those popped up for me, DA, and MH. The first and the last are almost certainly using IE5 as their browser, and that DA shows increasing signs of becoming a Windows Mutant too .

The first time I submitted a bug, I backed up to the entry page and hit Refresh to get the category counts updated (never saw Jitterbug before, so must play!). IE5 whined about something-or-other being out of date, and would I like to "repost the data"? I said sure.

I did that a few other times after posting other bugs, and-- while I don't know for sure --it looks likely that you got a number of resubmissions equal to the number of times I told IE5 "ya, ya, repost whatever you want".

Next time I post a bug I'll just close the browser and come back an hour later. If "the repeat bug" goes away then, it's half IE5's fault for being confused about which page it's on, and half mine for assuming IE5 knows what it's doing.

meta-bugging-ly y'rs - tim

From tim_one at email.msn.com Tue Jul 13 06:03:30 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 13 Jul 1999 00:03:30 -0400
Subject: [Python-Dev] Generator details
In-Reply-To: <199907120301.XAA06001@eric.cnri.reston.va.us>
Message-ID: <000801becce4$aafd7660$31a02299@tim>

[Guido]
> ...
> Hmm... I think that if the generator is started by a for loop, it's
> okay for the loop to assume it is the master of the universe -- just
> like there's no force in the world (apart from illegal C code :) that
> can change the hidden loop counter in present-day for loop.

If it comes to a crunch, me too. I think your idea of forcing an exception in the frame's destructor (to get the stacks cleaned up, and any suspended "finally" blocks executed) renders this a non-issue, though (it will "just work", and if people resort to illegal C code, it will *still* work ).

hadn't-noticed-you-can't-spell-"illegal-code"-without-"c"-ly y'rs - tim

From tim_one at email.msn.com Tue Jul 13 06:03:33 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 13 Jul 1999 00:03:33 -0400
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
In-Reply-To: <199907120336.XAA06056@eric.cnri.reston.va.us>
Message-ID: <000901becce4$ac88aa40$31a02299@tim>

Backtracking a bit:

[Guido]
> This is another key description of continuations (maybe not quite
> worth a hug :).

I suppose a kiss is out of the question, then.

> The continuation captures exactly all state that is represented by
> "position in the program" and no state that is represented by variables.

Right!

> But there are many hairy details. In antiquated assembly, there might
> not be a call stack, and a continuation could be represented by a
> single value: the program counter. But now we have a call stack, a
> value stack, a block stack (in Python) and who knows what else.
>
> I'm trying to understand whether we can get away with saving just a
> pointer to a frame, whether we need to copy the frame, or whether we
> need to copy the entire frame stack.

As you convinced yourself in following paragraphs, for 1st-class continuations "the entire frame stack" *may* be necessary.

> ...
> How does Scheme do this?

I looked up R. Kent Dybvig's doctoral dissertation, at

    ftp://ftp.cs.indiana.edu/pub/scheme-repository/doc/pubs/3imp.ps.gz

He gives detailed explanations of 3 Scheme implementations there (from whence "3imp", I guess). The first is all heap-based, and looks much like the simple Wilson implementation I summarized yesterday. Dybvig profiled it and discovered it spent half its time in, together, function call overhead and name resolution.

So he took a different approach: Scheme is, at heart, just another lexically scoped language, like Algol or Pascal. So how about implementing it with a perfectly conventional shared, contiguous stack? Because that doesn't work: the advanced features (lexical closures with indefinite extent, and user-captured continuations) aren't stack-like. Tough, forget those at the start, and do whatever it takes later to *make* 'em work.

So he did. When his stack implementation hit a user's call/cc, it made a physical copy of the entire stack. And everything ran much faster! He points out that "real programs" come in two flavors:

1) Very few, or no, call/cc thingies. Then most calls are no worse than Algol/Pascal/C functions, and the stack implementation runs them at Algol/Pascal/C speed (if we knew of anything faster than a plain stack, the latter would use it).

2) Lots of call/cc thingies. Then "the stack" is likely to be shallow (the program is spending most of its time co-transferring, not recursing deeply), and because the stack is contiguous he can exploit the platform's fastest block-copy operation (no need to chase pointer links, etc).
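
The copy-on-call/cc idea can be modelled in a few lines. This is a toy sketch only, not Dybvig's implementation: ContiguousStack and its method names are invented for the illustration, and a Python list stands in for the contiguous stack.

```python
class ContiguousStack:
    """Toy model: ordinary calls just push and pop; only call/cc pays for a copy."""

    def __init__(self):
        self.cells = []

    def push(self, value):
        self.cells.append(value)

    def pop(self):
        return self.cells.pop()

    def capture(self):
        # call/cc: one block copy of the whole stack, cost proportional to depth.
        return tuple(self.cells)

    def reinstate(self, snapshot, value):
        # Invoking the continuation: overwrite the live stack with the copy,
        # then deliver the passed-in value on top.
        self.cells[:] = snapshot
        self.cells.append(value)


s = ContiguousStack()
s.push("return-addr")
s.push(1)
k = s.capture()          # the only place any copying happens
s.pop()
s.pop()                  # the function returns normally; the stack unwinds
s.reinstate(k, 42)       # later, the continuation is invoked
```

A program that never calls capture() pays nothing beyond plain push and pop, which is flavor 1; a program that captures often tends to have a shallow stack, so each copy is short, which is flavor 2.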

So, in some respects, Dybvig's stack implementation of Scheme was more Pythonic than Python's current implementation .

His third implementation was for some propeller-head theoretical "string machine", so I won't even mention it.

worrying-about-the-worst-case-can-hurt-the-normal-cases-ly y'rs - tim

From da at ski.org Tue Jul 13 06:15:28 1999
From: da at ski.org (David Ascher)
Date: Mon, 12 Jul 1999 21:15:28 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] Python bugs database started
In-Reply-To: <000701becce4$a973c920$31a02299@tim>
Message-ID:

> About those "Jitterbug bugs" (repeated submissions): those popped up for
> me, DA, and MH. The first and the last are almost certainly using IE5 as
> their browser, and that DA shows increasing signs of becoming a Windows
> Mutant too .
>
> Next time I post a bug I'll just close the browser and come back an hour
> later. If "the repeat bug" goes away then, it's half IE5's fault for being
> confused about which page it's on, and half mine for assuming IE5 knows what
> it's doing.

FYI, I did the same thing but w/ Communicator. (I do use windows, but refuse to use IE =). This one's not specifically MS' fault.

From tim_one at email.msn.com Tue Jul 13 06:47:43 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 13 Jul 1999 00:47:43 -0400
Subject: [Python-Dev] Generator details
In-Reply-To: <14218.3286.847367.125679@anthem.cnri.reston.va.us>
Message-ID: <001501beccea$d83f6740$31a02299@tim>

[Barry]
> Minor point, but why not make resume() and shutdown() methods on the
> frame? Isn't this much cleaner?
>
> value, frame = generator(args)
> try:
>     while frame:
>         etc
>         value, frame = frame.resume()
> finally:
>     if frame:
>         frame.shutdown()

Yes -- and at least it's better than arguing over what to name them .
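
The resume()/shutdown() protocol in that loop can be tried out with a small stand-in. This is a sketch under loud assumptions: SuspendedFrame and start() are hypothetical names, and the suspended computation is a generator function written with the yield statement of a later Python, which did not exist when this thread was written.

```python
class SuspendedFrame:
    """Hypothetical stand-in for a suspended generator frame."""

    def __init__(self, gen):
        self._gen = gen

    def resume(self):
        # Run to the next suspend point. Returns (value, frame), with
        # frame set to None once the computation has finished.
        try:
            return next(self._gen), self
        except StopIteration:
            return None, None

    def shutdown(self):
        # Abandon the computation; pending "finally" blocks in it get run.
        self._gen.close()


def start(func, *args):
    """Plays the role of generator(args) in the loop above."""
    return SuspendedFrame(func(*args)).resume()


log = []

def countdown(n):
    try:
        while n > 0:
            yield n          # the "suspend"
            n -= 1
    finally:
        log.append("cleanup")


seen = []
value, frame = start(countdown, 3)
try:
    while frame:
        seen.append(value)
        if value == 2:
            break            # consumer loses interest early
        value, frame = frame.resume()
finally:
    if frame:
        frame.shutdown()
```

Leaving the loop early is the case that makes shutdown() matter: without it the suspended finally block in countdown would not run until the frame was destroyed.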

btw-tabs-in-email-don't-look-the-way-you-expect-them-to-ly y'rs - tim

From tim_one at email.msn.com Tue Jul 13 08:47:43 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 13 Jul 1999 02:47:43 -0400
Subject: [Python-Dev] End of the line
In-Reply-To: <378A447C.D4DD24D8@appliedbiometrics.com>
Message-ID: <000001beccfb$9beb4fa0$2f9e2299@tim>

The latest versions of the Icon language (9.3.1 & beyond) sprouted an interesting change in semantics: if you open a file for reading in "translated" (text) mode now, it normalizes Unix, Mac and Windows line endings to plain \n. Writing in text mode still produces what's natural for the platform.

Anyone think that's *not* a good idea?

c-will-never-get-fixed-ly y'rs - tim

From Vladimir.Marangozov at inrialpes.fr Tue Jul 13 13:54:00 1999
From: Vladimir.Marangozov at inrialpes.fr (Vladimir Marangozov)
Date: Tue, 13 Jul 1999 12:54:00 +0100 (NFT)
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
In-Reply-To: <000101bec6b3$4e752be0$349e2299@tim> from "Tim Peters" at "Jul 5, 99 02:55:02 am"
Message-ID: <199907131154.MAA22698@pukapuka.inrialpes.fr>

After a short vacation, I'm trying to swallow the latest discussion about control flow management & derivatives. Could someone help me please by answering two naive questions that popped up spontaneously in my head:

Tim Peters wrote:

[a biased short course on generators, continuations, coroutines]
>
> ...
>
> GENERATORS
>
> Generators add two new abstract operations, "suspend" and "resume". When a
> generator suspends, it's exactly like a return today except we simply
> decline to decref the frame. That's it! The locals, and where we are in
> the computation, aren't thrown away. A "resume" then consists of
> *re*starting the frame at its next bytecode instruction, with the retained
> frame's locals and eval stack just as they were.
>
> ...
>
> too-simple-to-be-obvious?-ly y'rs - tim

Yes. I'm trying to understand the following:

1. What does a generator generate?

2. Clearly, what's the difference between a generator and a thread?

--
Vladimir MARANGOZOV | Vladimir.Marangozov at inrialpes.fr
http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252

From tismer at appliedbiometrics.com Tue Jul 13 13:41:32 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Tue, 13 Jul 1999 13:41:32 +0200
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
References: <199907131154.MAA22698@pukapuka.inrialpes.fr>
Message-ID: <378B25EC.2739BCE3@appliedbiometrics.com>

Vladimir Marangozov wrote:
...
> > too-simple-to-be-obvious?-ly y'rs - tim
>
> Yes. I'm trying to understand the following:
>
> 1. What does a generator generate?

Trying my little understanding.

A generator generates a series of results if you ask for it. That's done by a resume call (generator, resume your computation), and the generator continues until it either comes to a suspend (return a value, but be prepared to continue from here) or does a final return.

> 2. Clearly, what's the difference between a generator and a thread?

Threads can be scheduled automatically, and they don't return values to each other, natively. Generators are asymmetric to their callers, they're much like functions. Coroutines are more symmetric. They "return" values to each other. They are not determined as caller and callee, but they cooperate on the same level. Therefore, threads and coroutines look more similar, just that coroutines usually aren't scheduled automatically. Add a scheduler, don't pass values, and you have threads, nearly. (of course I dropped the I/O blocking stuff which doesn't apply and isn't the intent of fake threads).

ciao - chris

--
Christian Tismer             :^)
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home

From guido at CNRI.Reston.VA.US Tue Jul 13 14:53:52 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 13 Jul 1999 08:53:52 -0400
Subject: [Python-Dev] End of the line
In-Reply-To: Your message of "Tue, 13 Jul 1999 02:47:43 EDT." <000001beccfb$9beb4fa0$2f9e2299@tim>
References: <000001beccfb$9beb4fa0$2f9e2299@tim>
Message-ID: <199907131253.IAA10730@eric.cnri.reston.va.us>

> The latest versions of the Icon language (9.3.1 & beyond) sprouted an
> interesting change in semantics: if you open a file for reading in
> "translated" (text) mode now, it normalizes Unix, Mac and Windows line
> endings to plain \n. Writing in text mode still produces what's natural for
> the platform.
>
> Anyone think that's *not* a good idea?

I've been thinking about this myself -- exactly what I would do.

Not clear how easy it is to implement (given that I'm not so enthused about the idea of rewriting the entire I/O system without using stdio -- see archives).

The implementation must be as fast as the current one -- people used to complain bitterly when readlines() or read() were just a tad slower than they *could* be.

There's a lookahead of 1 character needed -- ungetc() might be sufficient except that I think it's not guaranteed to work on unbuffered files.

Should also do this for the Python parser -- there it would be a lot easier.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From bwarsaw at cnri.reston.va.us Tue Jul 13 16:41:25 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Tue, 13 Jul 1999 10:41:25 -0400 (EDT)
Subject: [Python-Dev] Python bugs database started
References: <199907122004.QAA09348@eric.cnri.reston.va.us> <000701becce4$a973c920$31a02299@tim>
Message-ID: <14219.20501.697542.358579@anthem.cnri.reston.va.us>

>>>>> "TP" == Tim Peters writes:

    TP> The first time I submitted a bug, I backed up to the entry
    TP> page and hit Refresh to get the category counts updated (never
    TP> saw Jitterbug before, so must play!). IE5 whined about
    TP> something-or-other being out of date, and would I like to
    TP> "repost the data"? I said sure.

This makes perfect sense, and explains exactly what's going on. Let's call it "poor design"[1] instead of "user error". A quick scan last night of the Jitterbug site shows no signs of fixes or workarounds. What would Jitterbug have to do to avoid these kinds of problems? Maybe keep a checksum of the current submission and check it against the next one to make sure it's not a re-submit. Maybe a big warning sign reading "Do not repost this form!"

Hmm. I think I'll complain on the Jitterbug mailing list.

-Barry

[1] In the midst of re-reading D. Norman's "The Design of Everyday Things", otherwise I would have said you guys were just incompetent Webweenies :)

From da at ski.org Tue Jul 13 18:01:55 1999
From: da at ski.org (David Ascher)
Date: Tue, 13 Jul 1999 09:01:55 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] Python bugs database started
In-Reply-To: <14219.20501.697542.358579@anthem.cnri.reston.va.us>
Message-ID:

On Tue, 13 Jul 1999, Barry A. Warsaw wrote:
>
> This makes perfect sense, and explains exactly what's going on. Let's
> call it "poor design"[1] instead of "user error". A quick scan last
> night of the Jitterbug site shows no signs of fixes or workarounds.
> What would Jitterbug have to do to avoid these kinds of problems?
> Maybe keep a checksum of the current submission and check it against
> the next one to make sure it's not a re-submit.

That'd be good -- alternatively, insert a 'safe' CGI script after the validation -- "Thanks for submitting the bug. Click here to go back to the home page".

From guido at CNRI.Reston.VA.US Tue Jul 13 18:09:48 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 13 Jul 1999 12:09:48 -0400
Subject: [Python-Dev] Python bugs database started
In-Reply-To: Your message of "Tue, 13 Jul 1999 09:01:55 PDT."
References:
Message-ID: <199907131609.MAA11208@eric.cnri.reston.va.us>

> That'd be good -- alternatively, insert a 'safe' CGI script after the
> validation -- "Thanks for submitting the bug. Click here to go back to
> the home page".

That makes a lot of sense! I'm now quite sure that I had the same "Repost form data?" experience, and just didn't realize that mattered, because I was staring at the part of the form that was showing the various folders.

The Jitterbug software is nice for tracking bugs, but its user interface *SUCKS*. I wish I had the time to redesign that part -- unfortunately it's probably totally integrated with the rest of the code...

--Guido van Rossum (home page: http://www.python.org/~guido/)

From bwarsaw at CNRI.Reston.VA.US Tue Jul 13 18:19:26 1999
From: bwarsaw at CNRI.Reston.VA.US (Barry A. Warsaw)
Date: Tue, 13 Jul 1999 12:19:26 -0400 (EDT)
Subject: [Python-Dev] Python bugs database started
References: <199907131609.MAA11208@eric.cnri.reston.va.us>
Message-ID: <14219.26382.122095.608613@anthem.cnri.reston.va.us>

>>>>> "Guido" == Guido van Rossum writes:

    Guido> The Jitterbug software is nice for tracking bugs, but its
    Guido> user interface *SUCKS*. I wish I had the time to redesign
    Guido> that part -- unfortunately it's probably totally integrated
    Guido> with the rest of the code...

There is an unsupported fork that some guy did that totally revamped the interface:

    http://lists.samba.org/listproc/jitterbug/0095.html

Still not great tho'.

-Barry

From MHammond at skippinet.com.au Wed Jul 14 04:25:50 1999
From: MHammond at skippinet.com.au (Mark Hammond)
Date: Wed, 14 Jul 1999 12:25:50 +1000
Subject: [Python-Dev] Interrupting a thread
Message-ID: <006d01becda0$318035e0$0801a8c0@bobcat>

I've struck this a number of times, and the simple question is "can we make it possible to interrupt a thread without the thread's knowledge" or otherwise stated "how can we asynchronously raise an exception in another thread?"

The specific issue is that quite often, I find it necessary to interrupt one thread from another. One example is Pythonwin - rather than use the debugger hooks as IDLE does, I use a secondary thread. But how can I use that thread to interrupt the code executing in the first? (With magic that only works sometimes is how :-)

Another example came up on the newsgroup recently - discussion about making Medusa a true Windows NT Service. A trivial solution would be to have a "service thread", that simply runs Medusa's loop in a separate thread. When the "service thread" receives a shut-down request from NT, how can it interrupt Medusa?

I probably should not have started with a Medusa example - it may have a solution. Pretend I said "any arbitrary script written to run similarly to a Unix daemon". There are one or 2 other cases where I have wanted to execute existing code that assumes it runs stand-alone, and can really only be stopped with a KeyboardInterrupt. I can't see a decent way to do this.

[I guess this ties into the "signals and threads" limitations - I believe you can't direct signals at threads either?]

Is it desirable? Unfortunately, I can see that it might be hard :-(

But-sounds-pretty-easy-under-those-fake-threads-ly,

Mark.

From tim_one at email.msn.com Wed Jul 14 05:56:20 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 13 Jul 1999 23:56:20 -0400
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
In-Reply-To: <199907131154.MAA22698@pukapuka.inrialpes.fr>
Message-ID: <000d01becdac$d4dee900$7d9e2299@tim>

[Vladimir Marangozov]
> Yes. I'm trying to understand the following:
>
> 1. What does a generator generate?

Any sequence of objects: the lines in a file, the digits of pi, a postorder traversal of the nodes of a binary tree, the files in a directory, the machines on a LAN, the critical bugs filed before 3/1/1995, the set of builtin types, all possible ways of matching a regexp to a string, the 5-card poker hands beating a pair of deuces, ... anything!

Icon uses the word "generators", and it's derived from that language's ubiquitous use of the beasts to generate paths in a backtracking search space.

In OO languages it may be better to name them "iterators", after the closest common OO concept. The CLU language had full-blown (semi-coroutine, like Icon generators) iterators 20 years ago, and the idea was copied & reinvented by many later languages. Sather is probably the best known of those, and also calls them iterators.

> 2. Clearly, what's the difference between a generator and a thread?

If you can clearly explain what "a thread" is, I can clearly explain the similarities and differences. Well? I'm holding my breath here .

Generators/iterators are simpler than threads, whether looked at from a user's viewpoint or an implementor's. Their semantics are synchronous and deterministic.

Python's for/__getitem__ protocol *is* an iterator protocol already, but if I ask you which is the 378th 5-card poker hand beating a pair of deuces, and ask you a new question like that every hour, you may start to suspect there may be a better way to *approach* coding enumerations in general .
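
One item from that list, the postorder traversal of a binary tree, makes a compact illustration. This is a sketch in a later Python than anything available to this thread (it assumes the yield and yield from statements), and the (left, value, right) tuple layout for tree nodes is an arbitrary choice for the example.

```python
def postorder(node):
    """Yield the values of a binary tree, children before their parent."""
    if node is None:
        return
    left, value, right = node
    yield from postorder(left)     # each yield is a "suspend"
    yield from postorder(right)
    yield value


# Nodes are (left, value, right) tuples; None marks a missing child.
tree = ((None, 1, None), 3, (None, 2, None))

it = postorder(tree)
first = next(it)        # a "resume": runs until the first value is produced
rest = list(it)         # picks up exactly where the traversal left off
```

The traversal is written as ordinary recursive code and keeps its own position between requests; an index-based __getitem__ version would have to recompute or cache the walk to answer "what is item i?".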

then-again-there-may-not-be-ly y'rs - tim

From tim_one at email.msn.com Wed Jul 14 05:56:15 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 13 Jul 1999 23:56:15 -0400
Subject: [Python-Dev] End of the line
In-Reply-To: <199907131253.IAA10730@eric.cnri.reston.va.us>
Message-ID: <000c01becdac$d2ad6300$7d9e2299@tim>

[Tim]
> ... Icon ... sprouted an interesting change in semantics: if you open
> a file for reading in ...text mode ... it normalizes Unix, Mac and
> Windows line endings to plain \n. Writing in text mode still produces
> what's natural for the platform.

[Guido]
> I've been thinking about this myself -- exactly what I would do.

Me too .

> Not clear how easy it is to implement (given that I'm not so enthused
> about the idea of rewriting the entire I/O system without using stdio
> -- see archives).

The Icon implementation is very simple: they *still* open the file in stdio text mode. "What's natural for the platform" on writing then comes for free. On reading, libc usually takes care of what's needed, and what remains is to check for stray '\r' characters that stdio glossed over. That is, in fileobject.c, replacing

    if ((*buf++ = c) == '\n') {
        if (n < 0)
            buf--;
        break;
    }

with a block like (untested!)

    *buf++ = c;
    if (c == '\n' || c == '\r') {
        if (c == '\r') {
            *(buf-1) = '\n';
            /* consume following newline, if any */
            c = getc(fp);
            if (c != '\n')
                ungetc(c, fp);
        }
        if (n < 0)
            buf--;
        break;
    }

Related trickery needed in readlines. Of course the '\r' business should be done only if the file was opened in text mode.

> The implementation must be as fast as the current one -- people used
> to complain bitterly when readlines() or read() were just a tad
> slower than they *could* be.

The above does add one compare per character. Haven't timed it. readlines may be worse.

BTW, people complain bitterly anyway, but it's in comparison to Perl text mode line-at-a-time reads!
D:\Python>wc a.c
1146880 3023873 25281537 a.c
D:\Python>

Reading that via

    def g():
        f = open("a.c")
        while 1:
            line = f.readline()
            if not line:
                break

and using python -O took 51 seconds. Running the similar Perl (although
it's not idiomatic Perl to assign each line to an explicit var, or to test
that var in the loop, or to use "if !" instead of "unless" -- did all
those to make it more like the Python):

    open(DATA, "<a.c");
    while ($line = <DATA>) {last if ! $line;}

took 17 seconds. So when people are complaining about a factor of 3, I'm
not inclined to get excited about a few percent.

> There's a lookahead of 1 character needed -- ungetc() might be
> sufficient except that I think it's not guaranteed to work on
> unbuffered files.

Don't believe I've bumped into that. *Have* bumped into problems with
ungetc not playing nice with fseek/ftell, and that's probably enough to
kill it right there (alas).

> Should also do this for the Python parser -- there it would be a lot
> easier.

And probably the biggest bang for the buck.

the-problem-with-exposing-libc-is-that-libc-isn't-worth-exposing

Message-ID: <007401becdb6$22445c80$0801a8c0@bobcat>

I asked Guido to provide comments on one of the chapters in our book:

I was discussing appending the mode ("t" or "b") to the open() call

> p.10, bottom: text mode is the default -- I've never seen the 't'
> option described! (So even if it exists, better be silent about it.)
> You need to append 'b' to get binary mode instead.

This brings up an interesting issue.

MSVC exposes a global variable that contains the default mode - ie, you
can change the default to binary. (_fmode for those with the docs)

This has some implications and questions:
* Will Guido ever bow to pressure (when it arrives :) to expose this via
the "msvcrt" module? I can imagine where it may be useful in a limited
context. A reasonable argument would be that, like _setmode and other MS
specific stuff, if it exists it should be exposed.
* But even if not, due to the shared CRTL, in COM and other worlds we really cant predict what the default is. Although Python does not touch it, that does not stop someone else touching it. A web-server built using MSVC on Windows may use it? Thus, it appears that to be 100% sure what mode you are using, you should not rely on the default, but should _always_ use "b" or "t" on the file mode. Any thoughts or comments? The case for abandoning the CRTL's text mode gets stronger and stronger! Mark. From tim_one at email.msn.com Wed Jul 14 08:35:31 1999 From: tim_one at email.msn.com (Tim Peters) Date: Wed, 14 Jul 1999 02:35:31 -0400 Subject: [Python-Dev] Interrupting a thread In-Reply-To: <006d01becda0$318035e0$0801a8c0@bobcat> Message-ID: <000901becdc3$119be9e0$a09e2299@tim> [Mark Hammond] > Ive struck this a number of times, and the simple question is "can we > make it possible to interrupt a thread without the thread's knowledge" > or otherwise stated "how can we asynchronously raise an exception in > another thread?" I don't think there's any portable way to do this. Even restricting the scope to Windows, forget Python for a moment: can you do this reliably with NT threads from C, availing yourself of every trick in the SDK? Not that I know of; not without crafting a new protocol that the targeted threads agree to in advance. > ... > But-sounds-pretty-easy-under-those-fake-threads-ly, Yes, piece o' cake! Fake threads can do anything, because unless we write every stick of their implementation they can't do anything at all . odd-how-solutions-create-more-problems-than-they-solve-ly y'rs - tim From tim_one at email.msn.com Wed Jul 14 08:35:33 1999 From: tim_one at email.msn.com (Tim Peters) Date: Wed, 14 Jul 1999 02:35:33 -0400 Subject: [Python-Dev] RE: Python on Windows chapter. In-Reply-To: <007401becdb6$22445c80$0801a8c0@bobcat> Message-ID: <000a01becdc3$12d94be0$a09e2299@tim> [Mark Hammond] > ... 
> MSVC exposes a global variable that contains the default [fopen] mode - > ie, you can change the default to binary. (_fmode for those with the > docs) > > This has some implications and questions: > * Will Guido ever bow to pressure (when it arrives :) to expose this via > the "msvcrt" module? No. It changes the advertised semantics of Python builtins, and no option ever does that. If it went in at all, it would have to be exposed as a Python-level feature that changed the semantics similarly on all platforms -- and even then Guido wouldn't put it in . > ... > Thus, it appears that to be 100% sure what mode you are using, you should > not rely on the default, but should _always_ use "b" or "t" on the file > mode. And on platforms that have libc options to treat "t" as if it were "b"? There's no limit to how perverse platform options can get! There's no fully safe ground to stand on, so Python stands on the minimal guarantees libc provides. If a user violates those, tough, they can't use Python. Unless, of course, they contribute a lot of money to the PSA . > ... > Any thoughts or comments? The case for abandoning the CRTL's text mode > gets stronger and stronger! C's text mode is, alas, a bad joke. The only thing worse is Microsoft's half-assed implementation of it <0.5 wink>. ctrl-z-=-eof-even-gets-in-the-way-under-windows!-ly y'rs - tim From MHammond at skippinet.com.au Wed Jul 14 08:58:25 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Wed, 14 Jul 1999 16:58:25 +1000 Subject: [Python-Dev] Interrupting a thread In-Reply-To: <000901becdc3$119be9e0$a09e2299@tim> Message-ID: <007e01becdc6$45982490$0801a8c0@bobcat> > I don't think there's any portable way to do this. Even > restricting the > scope to Windows, forget Python for a moment: can you do > this reliably with > NT threads from C, availing yourself of every trick in the > SDK? Not that I Nope - not if I forget Python. 
However, when I restrict myself _to_ Python, I find this nice little ceval.c loop and nice little per-thread structures - even with nice-looking exception place-holders ;-) Something tells me that it wont be quite as easy as filling these in (while you have the lock, of course!), but it certainly seems far more plausible than if we consider it a C problem :-) > odd-how-solutions-create-more-problems-than-they-solve-ly y'rs - tim Only because they often open your eyes to a whole new class of problem . Continuations/generators/co-routines (even threads themselves!) would appear to be a good example - for all their power, I shudder to think at the number of questions they will generate! If I understand correctly, it is a recognised deficiency WRT signals and threads - so its all Guido's fault for adding these damn threads in the first place :-) just-more-proof-there-is-no-such-thing-as-a-free-lunch-ly, Mark. From jack at oratrix.nl Wed Jul 14 10:07:59 1999 From: jack at oratrix.nl (Jack Jansen) Date: Wed, 14 Jul 1999 10:07:59 +0200 Subject: [Python-Dev] Python bugs database started In-Reply-To: Message by Guido van Rossum , Tue, 13 Jul 1999 12:09:48 -0400 , <199907131609.MAA11208@eric.cnri.reston.va.us> Message-ID: <19990714080759.D49B2303120@snelboot.oratrix.nl> > The Jitterbug software is nice for tracking bugs, but its user > interface *SUCKS*. I wish I had the time to redseign that part -- > unfortunately it's probably totally integrated with the rest of the > code... We looked into bug tracking systems recently, and basically they all suck. We went with gnats in the end, but it has pretty similar problems on the GUI side. 
But maybe we could convince some people with too much time on their hands to do a Python bug reporting system:-) -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From jack at oratrix.nl Wed Jul 14 10:21:16 1999 From: jack at oratrix.nl (Jack Jansen) Date: Wed, 14 Jul 1999 10:21:16 +0200 Subject: [Python-Dev] End of the line In-Reply-To: Message by "Tim Peters" , Tue, 13 Jul 1999 23:56:15 -0400 , <000c01becdac$d2ad6300$7d9e2299@tim> Message-ID: <19990714082116.6DE96303120@snelboot.oratrix.nl> > The Icon implementation is very simple: they *still* open the file in stdio > text mode. "What's natural for the platform" on writing then comes for > free. On reading, libc usually takes care of what's needed, and what > remains is to check for stray '\r' characters that stdio glossed over. This'll work for Unix and PC conventions, but not for the Mac. Mac end of line is \r, so reading a line from a mac file on unix will give you the whole file. -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From tismer at appliedbiometrics.com Wed Jul 14 14:13:10 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Wed, 14 Jul 1999 14:13:10 +0200 Subject: [Python-Dev] Interrupting a thread References: <006d01becda0$318035e0$0801a8c0@bobcat> Message-ID: <378C7ED6.F0DB4E6E@appliedbiometrics.com> Mark Hammond wrote: ... > Another example came up on the newsgroup recently - discussion about making > Medusa a true Windows NT Service. A trivial solution would be to have a > "service thread", that simply runs Medusa's loop in a seperate thread. 
Ah, thanks, that was what I'd like to know :-) > When the "service thread" recieves a shut-down request from NT, how can it > interrupt Medusa? Very simple. I do this shutdown stuff already, at a user request. Medusa has its polling loop which is so simple (wait until a timeout, then run again) that I pulled it out of Medusa, and added a polling function. I have even simulated timer objects by this, which do certain tasks from time to time (at the granularity of the loop of course). One of these looks if there is a global object in module __main__ with a special name which is executable. This happens to be the shutdown, which may be injected by another thread as well. I can send you an example. > I probably should not have started with a Medusa example - it may have a > solution. Pretend I said "any arbitary script written to run similarly to > a Unix daemon". There are one or 2 other cases where I have wanted to > execute existing code that assumes it runs stand-alone, and can really only > be stopped with a KeyboardInterrupt. I can't see a decent way to do this. Well, yes, I would want to have this too, and see also no way. > [I guess this ties into the "signals and threads" limitations - I believe > you cant direct signals at threads either?] > > Is it desirable? Unfortunately, I can see that it might be hard :-( > > But-sounds-pretty-easy-under-those-fake-threads-ly, You mean you would catch every signal in the one thread, and redirect it to the right fake thread. Given exactly two real threads, one always sitting waiting in a multiple select, the other running any number of fake threads. Would this be enough to do everything which is done with threads today? maybe-almost-ly y'rs - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! 
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From guido at CNRI.Reston.VA.US Wed Jul 14 14:24:53 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 14 Jul 1999 08:24:53 -0400 Subject: [Python-Dev] RE: Python on Windows chapter. In-Reply-To: Your message of "Wed, 14 Jul 1999 15:10:38 +1000." <007401becdb6$22445c80$0801a8c0@bobcat> References: <007401becdb6$22445c80$0801a8c0@bobcat> Message-ID: <199907141224.IAA12211@eric.cnri.reston.va.us> > I asked Guido to provide comments on one of the chapters in our book: > > I was discussing appending the mode ("t" or "b") to the open() call > > > p.10, bottom: text mode is the default -- I've never seen the 't' > > option described! (So even if it exists, better be silent about it.) > > You need to append 'b' to get binary mode instead. In addition, 't' probably isn't even supported on many Unix systems! > This brings up an interesting issue. > > MSVC exposes a global variable that contains the default mode - ie, you can > change the default to binary. (_fmode for those with the docs) The best thing to do with this variable is to ignore it. In large programs like Python that link together pieces of code that never ever heard about each other, making global changes to the semantics of standard library functions is a bad thing. Code that sets it or requires you to set it is broken. > This has some implications and questions: > * Will Guido ever bow to pressure (when it arrives :) to expose this via > the "msvcrt" module? I can imagine where it may be useful in a limited > context. A reasonable argument would be that, like _setmode and other MS > specific stuff, if it exists it should be exposed. No. 
(And I've never bought that argument before -- I always use "is there sufficient need and no other way.") > * But even if not, due to the shared CRTL, in COM and other worlds we > really cant predict what the default is. Although Python does not touch > it, that does not stop someone else touching it. A web-server built using > MSVC on Windows may use it? But would be stupid for it to do so, and I would argue that the web server was broken. Since they should know better than this, I doubt they do this (this option is more likely to be used in small, self-contained programs). Until you find a concrete example, let's ignore the possibility. > Thus, it appears that to be 100% sure what mode you are using, you should > not rely on the default, but should _always_ use "b" or "t" on the file > mode. Stop losing sleep over it. > Any thoughts or comments? The case for abandoning the CRTL's text mode > gets stronger and stronger! OK, you write the code :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From akuchlin at mems-exchange.org Wed Jul 14 15:03:07 1999 From: akuchlin at mems-exchange.org (Andrew M. Kuchling) Date: Wed, 14 Jul 1999 09:03:07 -0400 (EDT) Subject: [Python-Dev] Python bugs database started In-Reply-To: <19990714080759.D49B2303120@snelboot.oratrix.nl> References: <199907131609.MAA11208@eric.cnri.reston.va.us> <19990714080759.D49B2303120@snelboot.oratrix.nl> Message-ID: <14220.35467.644552.307210@amarok.cnri.reston.va.us> Jack Jansen writes: >But maybe we could convince some people with too much time on their hands to >do a Python bug reporting system:-) Digicool has a relatively simple bug tracking system for Zope which you can try out at http://www.zope.org/Collector/ . -- A.M. Kuchling http://starship.python.net/crew/amk/ I'm going to dance now, I'm afraid. 
-- Ishtar ends it all, in SANDMAN #45: "Brief Lives:5" From gmcm at hypernet.com Wed Jul 14 16:02:22 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Wed, 14 Jul 1999 09:02:22 -0500 Subject: [Python-Dev] RE: Python on Windows chapter. In-Reply-To: <007401becdb6$22445c80$0801a8c0@bobcat> References: <199907121650.MAA06687@eric.cnri.reston.va.us> Message-ID: <1280165369-10624337@hypernet.com> [Mark] > I asked Guido to provide comments on one of the chapters in our > book: > > I was discussing appending the mode ("t" or "b") to the open() call [Guido] > > p.10, bottom: text mode is the default -- I've never seen the 't' > > option described! (So even if it exists, better be silent about it.) > > You need to append 'b' to get binary mode instead. I hadn't either, until I made the mistake of helping Mr took-6-exchanges-before-he-used-the-right-DLL Embedder, who used it in his code. Certainly not mentioned in man fopen on my Linux box. > This brings up an interesting issue. > > MSVC exposes a global variable that contains the default mode - ie, > you can change the default to binary. (_fmode for those with the > docs) Mentally prepend another underscore. This is something for that other p-language. >... The case for abandoning the CRTL's text > mode gets stronger and stronger! If you're tying this in with Tim's Icon worship, note that in these days of LANS, the issue is yet more complex. It would be dandy if I could read text any old text file and have it look sane, but I may be writing it to a different machine without any way of knowing that. When I bother to manipulate these things, I usually choose to use *nix style text files. But I don't deal with Macs, and the only common Windows tool that can't deal with plain \n is Notepad. 
and-stripcr.py-is-everywhere-available-on-my-Linux-box-ly y'rs - Gordon From guido at CNRI.Reston.VA.US Wed Jul 14 17:05:04 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 14 Jul 1999 11:05:04 -0400 Subject: [Python-Dev] Interrupting a thread In-Reply-To: Your message of "Wed, 14 Jul 1999 12:25:50 +1000." <006d01becda0$318035e0$0801a8c0@bobcat> References: <006d01becda0$318035e0$0801a8c0@bobcat> Message-ID: <199907141505.LAA12313@eric.cnri.reston.va.us> > Ive struck this a number of times, and the simple question is "can we make > it possible to interrupt a thread without the thread's knowledge" or > otherwise stated "how can we asynchronously raise an exception in another > thread?" > > The specific issue is that quite often, I find it necessary to interrupt > one thread from another. One example is Pythonwin - rather than use the > debugger hooks as IDLE does, I use a secondary thread. But how can I use > that thread to interrupt the code executing in the first? (With magic that > only works sometimes is how :-) > > Another example came up on the newsgroup recently - discussion about making > Medusa a true Windows NT Service. A trivial solution would be to have a > "service thread", that simply runs Medusa's loop in a seperate thread. > When the "service thread" recieves a shut-down request from NT, how can it > interrupt Medusa? > > I probably should not have started with a Medusa example - it may have a > solution. Pretend I said "any arbitary script written to run similarly to > a Unix daemon". There are one or 2 other cases where I have wanted to > execute existing code that assumes it runs stand-alone, and can really only > be stopped with a KeyboardInterrupt. I can't see a decent way to do this. > > [I guess this ties into the "signals and threads" limitations - I believe > you cant direct signals at threads either?] > > Is it desirable? 
Unfortunately, I can see that it might be hard :-( > > But-sounds-pretty-easy-under-those-fake-threads-ly, Hmm... Forget about signals -- they're twisted Unixisms (even if they are nominally supported on NT). The interesting thing is that you can interrupt the "main" thread easily (from C) using Py_AddPendingCall() -- this registers a function that will be invoked by the main thread the next time it gets to the top of the VM loop. But the mechanism here was designed with a specific purpose in mind, and it doesn't allow you to aim at a specific thread -- it only works for the main thread. It might be possible to add an API that allows you to specify a thread id though... Of course if the thread to be interrupted is blocked waiting for I/O, this is not going to interrupt the I/O. (On Unix, that's what signals do; is there an equivalent on NT? I don't think so.) Why do you say that your magic only works sometimes? You mailed me your code once and the Python side of it looks okay to me: it calls PyErr_SetInterrupt(), which calls Py_AddPendingCall(), which is threadsafe. Of course it only works if the thread you try to interrupt is recognized by Python as the main thread -- perhaps this is not always under your control, e.g. when COM interferes? Where is this going? Is the answer "provide a C-level API like Py_AddPendingCall() that takes a thread ID" good enough? Note that for IDLE, I have another problem -- how to catch the ^C event when Tk is processing events? --Guido van Rossum (home page: http://www.python.org/~guido/) From tim_one at email.msn.com Wed Jul 14 17:42:14 1999 From: tim_one at email.msn.com (Tim Peters) Date: Wed, 14 Jul 1999 11:42:14 -0400 Subject: [Python-Dev] End of the line In-Reply-To: <19990714082116.6DE96303120@snelboot.oratrix.nl> Message-ID: <000101bece0f$72095c80$f7a02299@tim> [Tim] > On reading, libc usually takes care of what's needed, and what > remains is to check for stray '\r' characters that stdio glossed over. 

[Jack Jansen]
> This'll work for Unix and PC conventions, but not for the Mac.
> Mac end of line is \r, so reading a line from a mac file on unix will
> give you the whole file.

I don't see how. Did you look at the code I posted? It treats '\r' the
same as '\n', except that when it sees an '\r' it eats a following '\n'
(if any) too, and replaces the '\r' with '\n' regardless.

Maybe you're missing that Python reads lines one character at a time? So
e.g. the behavior of the platform libc fgets is irrelevant.

From tim_one at email.msn.com Wed Jul 14 17:53:46 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Wed, 14 Jul 1999 11:53:46 -0400
Subject: [Python-Dev] Interrupting a thread
In-Reply-To: <007e01becdc6$45982490$0801a8c0@bobcat>
Message-ID: <000301bece11$0f0f9d40$f7a02299@tim>

[Tim sez there's no portable way to violate another thread "even in C"]

[Mark Hammond]
> Nope - not if I forget Python. However, when I restrict myself _to_
> Python, I find this nice little ceval.c loop and nice little per-thread
> structures - even with nice-looking exception place-holders ;-)

Good point! Python does have its own notion of threads.

> Something tells me that it wont be quite as easy as filling these
> in (while you have the lock, of course!), but it certainly seems far
> more plausible than if we consider it a C problem :-)

Adding a scheme that builds on the global lock and Python-controlled
thread switches may not be prudent if your life's goal is to make Python
free-threaded. But if "if you can't beat 'em, join 'em" rules the day,
making Py_AddPendingCall thread safe, adding a target thread argument, and
fleshing out the

    XXX Darn! With the advent of thread state, we should have an array
    of pending calls per thread in the thread state! Later...

comment before it, could go a long way toward facilitating groping in the
back seat of dad's car.
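One way to picture the per-thread pending-call idea is the toy model below. It is a Python sketch under loud assumptions, not CPython's implementation and not code proposed in this thread: PendingCalls, add, run_pending and Interrupted are hypothetical names. The target thread has to cooperate by calling run_pending at a safe point, much as the main thread picks up pending calls at the top of the VM loop, so a thread blocked in I/O would still not be interrupted.

```python
import queue
import threading
import time


class Interrupted(Exception):
    """Hypothetical exception delivered to a cooperating thread."""


class PendingCalls:
    """Toy model: one queue of pending calls per thread id."""

    def __init__(self):
        self._queues = {}
        self._lock = threading.Lock()

    def _queue_for(self, thread_id):
        with self._lock:
            return self._queues.setdefault(thread_id, queue.SimpleQueue())

    def add(self, thread_id, func):
        """Ask the thread with this id to call func at its next safe point."""
        self._queue_for(thread_id).put(func)

    def run_pending(self):
        """Run the calls queued for the *calling* thread, if any."""
        q = self._queue_for(threading.get_ident())
        while True:
            try:
                func = q.get_nowait()
            except queue.Empty:
                return
            func()  # may raise, which is how an "interrupt" arrives


def raise_interrupt():
    raise Interrupted()


def worker(pending, result):
    try:
        while True:
            pending.run_pending()  # the agreed-upon safe point
            time.sleep(0.001)      # stand-in for real work
    except Interrupted:
        result.append("interrupted")


pending = PendingCalls()
result = []
t = threading.Thread(target=worker, args=(pending, result))
t.start()
pending.add(t.ident, raise_interrupt)  # aim at one specific thread
t.join(timeout=5)
print(result)
```

The exception is raised inside the worker itself, at a point the worker chose, which is the "protocol the targeted threads agree to in advance" rather than a true asynchronous interrupt.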
cheaper-than-renting-a-motel-room-for-sure-ly y'rs - tim From jack at oratrix.nl Wed Jul 14 17:53:36 1999 From: jack at oratrix.nl (Jack Jansen) Date: Wed, 14 Jul 1999 17:53:36 +0200 Subject: [Python-Dev] End of the line In-Reply-To: Message by "Tim Peters" , Wed, 14 Jul 1999 11:42:14 -0400 , <000101bece0f$72095c80$f7a02299@tim> Message-ID: <19990714155336.94DA8303120@snelboot.oratrix.nl> > [Jack Jansen] > > This'll work for Unix and PC conventions, but not for the Mac. > > Mac end of line is \r, so reading a line from a mac file on unix will > > give you the whole file. > [...] > > Maybe you're missing that Python reads lines one character at a time? So > e.g. the behavior of the platform libc fgets is irrelevant. You're absolutely right... -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From guido at CNRI.Reston.VA.US Wed Jul 14 18:15:12 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 14 Jul 1999 12:15:12 -0400 Subject: [Python-Dev] Python bugs database started In-Reply-To: Your message of "Wed, 14 Jul 1999 09:03:07 EDT." <14220.35467.644552.307210@amarok.cnri.reston.va.us> References: <199907131609.MAA11208@eric.cnri.reston.va.us> <19990714080759.D49B2303120@snelboot.oratrix.nl> <14220.35467.644552.307210@amarok.cnri.reston.va.us> Message-ID: <199907141615.MAA12513@eric.cnri.reston.va.us> > Digicool has a relatively simple bug tracking system for Zope which > you can try out at http://www.zope.org/Collector/ . I asked, and Collector is dead -- but the new offering (Tracker) isn't ready for prime time yet. I'll suffer through Jitterbug until Tracker is out of beta (the first outsider who submitted a bug also did the Reload thing :-). 

--Guido van Rossum (home page: http://www.python.org/~guido/)

From da at ski.org Wed Jul 14 18:14:47 1999
From: da at ski.org (David Ascher)
Date: Wed, 14 Jul 1999 09:14:47 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] Interrupting a thread
In-Reply-To: <1280166671-10546045@hypernet.com>
Message-ID:

On Wed, 14 Jul 1999, Gordon McMillan wrote:

a reply to the python-dev thread on python-list.

You didn't really intend to do that, did you Gordon? =)

--david

From tim_one at email.msn.com Thu Jul 15 06:21:10 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 15 Jul 1999 00:21:10 -0400
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
In-Reply-To: <000901becce4$ac88aa40$31a02299@tim>
Message-ID: <000001bece79$77edf240$51a22299@tim>

Just so Guido doesn't feel like the question is being ignored:

> ...
> How does Scheme do this? [continuations]

One more reference here. Previously sketched Wilson's simple heap
implementation and Dybvig's simple stack one. They're easy to understand,
but are (heap) slow all the time, or (stack) fast most of the time but
horribly slow in some cases.

For the other extreme end of things, check out:

    Representing Control in the Presence of First-Class Continuations
    Robert Hieb, R. Kent Dybvig, and Carl Bruggeman
    PLDI, June 1990
    http://www.cs.indiana.edu/~dyb/papers/stack.ps

In part:

    In this paper we show how stacks can be used to implement activation
    records in a way that is compatible with continuation operations,
    multiple control threads, and deep recursion. Our approach allows
    a small upper bound to be placed on the cost of continuation
    operations and stack overflow and underflow recovery. ... ordinary
    procedure calls and returns are not adversely affected. ... One
    important feature of our method is that the stack is not copied when
    a continuation is captured. Consequently, capturing a continuation is
    very efficient, and objects that are known to have dynamic extent can
    be stack-allocated and modified since they remain in the locations in
    which they were originally allocated. By copying only a small portion
    of the stack when a continuation is reinstated, reinstatement costs
    are bounded by a small constant.

The basic gimmick is a segmented stack, where large segments are
heap-allocated and each contains multiple contiguous frames (across their
code base, only 1% of frames exceeded 30 machine words). But this is a
complicated approach, best suited for industrial-strength native-code
compilers (speed at any cost -- the authors go thru hell to save an add
here, a pointer store there, etc).

At least at the time the paper was written, it was the approach
implemented by Dybvig's Chez Scheme (a commercial native-code Scheme
compiler noted for high speed).

Given that Python allocates frames from the heap, I doubt there's a much
faster approach than the one Christian has crafted out of his own sweat
and blood! It's worth a paper of its own.

or-at-least-two-hugs-ly y'rs - tim

From tim_one at email.msn.com Thu Jul 15 09:00:14 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 15 Jul 1999 03:00:14 -0400
Subject: [Python-Dev] RE: Python on Windows chapter.
In-Reply-To: <199907141224.IAA12211@eric.cnri.reston.va.us>
Message-ID: <000301bece8f$b0dd7060$51a22299@tim>

>> I was discussing appending the mode ("t" or "b") to the open() call

> In addition, 't' probably isn't even supported on many Unix systems!

't' is not ANSI C, so there's no guarantee that it's portable. Hate to
say it, but Python should really strip t out before passing a mode string
to fopen!

From tim_one at email.msn.com Thu Jul 15 09:00:18 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 15 Jul 1999 03:00:18 -0400
Subject: [Python-Dev] RE: Python on Windows chapter.
In-Reply-To: <1280165369-10624337@hypernet.com>
Message-ID: <000401bece8f$b2810e40$51a22299@tim>

[Mark]
>> ... The case for abandoning the CRTL's text mode gets stronger
>> and stronger!
[Gordon] > If you're tying this in with Tim's Icon worship, Icon inherits stdio behavior-- for the most part --too. It does define its own mode string characters, though (like "t" for translated and "u" for untranslated); Icon has been ported to platforms that can't even spell libc, let alone support it. > note that in these days of LANS, the issue is yet more complex. It would > be dandy if I could read text any old text file and have it look sane, but > I may be writing it to a different machine without any way of knowing that. So where's the problem? No matter *what* machine you end up on, Python could read the thing fine. Or are you assuming some fantasy world in which people sometimes run software other than Python ? Caveat: give the C std a close reading. It guarantees much less about text mode than anyone who hasn't studied it would believe; e.g., text mode doesn't guarantee to preserve chars with the high bit set, or most control chars either (MS's treatment of CTRL-Z as EOF under text mode conforms to the std!). Also doesn't guarantee to preserve a line-- even if composed of nothing but printable chars --if it's longer than 509(!) characters. That's what I mean when I say stdio's text mode is a bad joke. > When I bother to manipulate these things, I usually choose to use > *nix style text files. But I don't deal with Macs, and the only > common Windows tool that can't deal with plain \n is Notepad. I generally create text files in binary mode, faking the \n convention by hand. Of course, I didn't do this before I became a Windows Guy <0.5 wink>. > and-stripcr.py-is-everywhere-available-on-my-Linux-box-ly y'rs A plug for my linefix.py (Python FTP contrib, under System), which converts among Unix/Windows/Mac in any direction (by default, from any to Unix). 
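The kind of conversion described here can be sketched in a few lines. This is not linefix.py itself, whose source is not shown in this thread; convert_eol and the EOL table are hypothetical names, and the sketch works on bytes read in binary mode so that no platform text mode gets in the way.

```python
# Sketch only: a stand-in for the idea, not the real linefix.py.
EOL = {"unix": b"\n", "windows": b"\r\n", "mac": b"\r"}


def convert_eol(data, target="unix"):
    """Convert Unix, Windows or (classic) Mac line endings in `data`
    to the convention named by `target` (default: Unix)."""
    # Collapse \r\n first so that the \r in a Windows line ending is not
    # mistaken for a Mac one; then map any remaining bare \r to \n.
    unix = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return unix.replace(b"\n", EOL[target])


mixed = b"unix\nwindows\r\nmac\rlast"
print(convert_eol(mixed))
print(convert_eol(mixed, "windows"))
```

Going through the Unix form as an intermediate step keeps the function symmetric: any of the three conventions, or a mixture, can be the input.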
who-needs-linux-when-there's-a-python-in-the-window-ly y'rs - tim From MHammond at skippinet.com.au Thu Jul 15 09:16:32 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Thu, 15 Jul 1999 17:16:32 +1000 Subject: [Python-Dev] RE: Python on Windows chapter. In-Reply-To: <000301bece8f$b0dd7060$51a22299@tim> Message-ID: <000801bece91$f80576c0$0801a8c0@bobcat> > 't' is not ANSI C, so there's no guarantee that it's > portable. Hate to say > it, but Python should really strip t out before passing a > mode string to > fopen! OK - thanks all - it is clear that this MS aberration is not, and never will be supported by Python. Not being a standards sort of guy I must admit I assumed both the "t" and "b" were standards. Thanks for the clarifications! Mark. From gstein at lyra.org Thu Jul 15 09:15:20 1999 From: gstein at lyra.org (Greg Stein) Date: Thu, 15 Jul 1999 00:15:20 -0700 Subject: [Python-Dev] RE: Python on Windows chapter. References: <000301bece8f$b0dd7060$51a22299@tim> Message-ID: <378D8A88.583A4DBF@lyra.org> Tim Peters wrote: > > >> I was discussing appending the mode ("t" or "b") to the open() call > > > In addition, 't' probably isn't even supported on many Unix systems! > > 't' is not ANSI C, so there's no guarantee that it's portable. Hate to say > it, but Python should really strip t out before passing a mode string to > fopen! Should we also filter the socket type when creating sockets? Or the address family? What if I pass "bamboozle" as the fopen mode? Should that become "bab" after filtering? Oh, but what about those two "b" characters? Maybe just reduce it to one? We also can't forget to filter chmod() arguments... can't have unknown bits set. etc etc In other words, I think the idea of "stripping out the t" is bunk. Python is not fatherly. It gives you the rope and lets you figure it out for yourself. 
You should know that :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From tim_one at email.msn.com Thu Jul 15 10:59:56 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 15 Jul 1999 04:59:56 -0400 Subject: [Python-Dev] RE: Python on Windows chapter. In-Reply-To: <378D8A88.583A4DBF@lyra.org> Message-ID: <000001becea0$69df8ca0$aea22299@tim> [Tim] > 't' is not ANSI C, so there's no guarantee that it's portable. > Hate to say it, but Python should really strip t out before passing > a mode string to fopen! [Greg Stein] > Should we also filter the socket type when creating sockets? Or the > address family? Filtering 't' is a matter of increasing portability by throwing out an option that doesn't do anything on the platforms that accept it, yet can cause a program to die on platforms that don't -- despite that it says nothing. So it's helpful to toss it, not restrictive. > What if I pass "bamboozle" as the fopen mode? Should that become "bab" > after filtering? Oh, but what about those two "b" characters? Those go far beyond what I suggested, Greg. Even so , it would indeed help a great many non-C programmers if Python defined the mode strings it accepts & barfed on others by default. The builtin open is impossible for a non-C weenie to understand from the docs (as a frustrated sister delights in reminding me). It should be made friendlier. Experts can use a new os.fopen if they need to pass "bamboozle"; fine by me; I do think the builtins should hide as much ill-defined libc crap as possible (btw, "open" is unique in this respect). > Maybe just reduce it to one? We also can't forget to filter chmod() > arguments... can't have unknown bits set. I at least agree that chmod has a miserable UI . > etc etc > > In other words, I think the idea of "stripping out the t" is bunk. > Python is not fatherly. It gives you the rope and lets you figure it out > for yourself. 
You should know that :-) So should Mark -- but we have his testimony that, like most other people, he has no idea what's "std C" and what isn't. In this case he should have noticed that Python's "open" docs don't admit to "t"'s existence either, but even so I see no reason to take comfort in the expectation that he'll eventually be hanged for this sin. ypu-i'd-rather-"open"-died-when-passed-"t"-ly y'rs - tim From guido at cnri.reston.va.us Fri Jul 16 00:29:54 1999 From: guido at cnri.reston.va.us (Guido van Rossum) Date: 15 Jul 1999 18:29:54 -0400 Subject: [Python-Dev] ISPs and Python Message-ID: <5lu2r5czrx.fsf@eric.cnri.reston.va.us> Remember the days when the big problem was to find an ISP who would install Python? Apparently that problem has gone away... The problem is now to get one that installs a decent set of Python extensions :-) See attached c.l.py post. This is similar to the evolution of Python's name recognition -- used to be, managers would say "what's Python?"; then they said "nobody else uses Python"; now presumably they will have to make up some kind ad-hoc no-Python company policy :-) --Guido van Rossum (home page: http://www.python.org/~guido/) ------- Start of forwarded message ------- From: Sim & Golda Zacks Newsgroups: comp.lang.python Subject: Re: htmllib, cgi, HTMLfmt, genCGI, HTMLgen, html, Zope, ... Date: Wed, 14 Jul 1999 00:00:25 -0400 Organization: ExecPC Internet - Milwaukee, WI Message-ID: <7mh1qu$c6m at newsops.execpc.com> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit I am in the exact same situation as you are. I am a web programmer and I'm trying to implement the CGI and database stuff with Python. I am using the HTMLFMT module from the INTERNET PROGRAMMING book and the cgi module from the standard library. 
What the HTMLFMT library does for you is just that you don't have to type in all the tags, basically it's nothing magical, if I didn't have it I would have to make something up and it probably wouldn't be half as good. the standard cgi unit gives you all the fields from the form, and I haven't looked at the cgi modules from the book yet to see if they give me any added benefit. The big problem I came across was my web host, and all of the other ones I talked to, refused to install the mysql interface to Python, and it has to be included in the build (or something like that) So I just installed gadfly, which seems to be working great for me right now. I'm still playing with it not in production yet. I have no idea what ZOPE does, but everyone who talks about it seems to love it. Hope this helps Sim Zacks [...] ------- End of forwarded message ------- From mhammond at skippinet.com.au Fri Jul 16 01:21:40 1999 From: mhammond at skippinet.com.au (Mark Hammond) Date: Fri, 16 Jul 1999 09:21:40 +1000 Subject: [Python-Dev] ISPs and Python In-Reply-To: <5lu2r5czrx.fsf@eric.cnri.reston.va.us> Message-ID: <001001becf18$cb850610$0801a8c0@bobcat> > Remember the days when the big problem was to find an ISP who would > install Python? Apparently that problem has gone away... The problem > is now to get one that installs a decent set of Python extensions :-) he he. Yes, hence I believe the general agreement exists that we should begin to focus on these more external issues than the language itself. Pity we all agree, but are still such hackers :-) > looked at the cgi modules from the book yet to see if they > give me any added > benefit. The big problem I came across was my web host, and > all of the other From mal at lemburg.com Fri Jul 16 09:44:20 1999 From: mal at lemburg.com (M.-A. 
Lemburg) Date: Fri, 16 Jul 1999 09:44:20 +0200 Subject: [Python-Dev] ISPs and Python References: <001001becf18$cb850610$0801a8c0@bobcat> Message-ID: <378EE2D4.A67F5BD@lemburg.com> Mark Hammond wrote: > > > Remember the days when the big problem was to find an ISP who would > > install Python? Apparently that problem has gone away... The problem > > is now to get one that installs a decent set of Python extensions :-) > > he he. Yes, hence I believe the general agreement exists that we should > begin to focus on these more external issues than the language itself. > Pity we all agree, but are still such hackers :-) > > > looked at the cgi modules from the book yet to see if they > > give me any added > > benefit. The big problem I came across was my web host, and > > all of the other > > From the ISP's POV, this is reasonable. I wouldn't be surprised to find > they started with the same policy for Perl. The issue is less likely to be > anything to do with Python, but to do with stability. If every client was > allowed to install their own extension, then that could wreak havoc. Some > ISPs will allow a private Python build, but some only allow you to use > their shared version, which they obviously want kept pretty stable. > > The answer would seem to be to embrace MAL's efforts. Not only should we be > looking at pre-compiled (as I believe his effort is) but also towards > "batteries included, plus spare batteries, wall charger, car charger and > solar panels". ISP targeted installations with _many_ extensions > installed could be very useful - who cares if it is 20MB - if they don't > want that, let them do it manually with the standard installation like > everyone else. mxCGIPython is a project aimed at exactly this situation. The only current caveat with it is that the binaries are not capable of loading shared extensions (maybe some linker guru could help here).
In summary, the cgipython binaries are complete Python interpreters with a frozen Python standard lib included. This means that you only need to install a single file on your ISP account and you're set for CGI/Python. More info + the binaries are available here: http://starship.skyport.net/~lemburg/mxCGIPython.html The package could also be tweaked to include a set of common extensions, I suppose, since it uses freeze.py to do most of the job. > There could almost be commercial scope here for a support company. > Offering ISP/Corporate specific CDs and support. Installations targeted > at machines shared among a huge number of users, with almost every common > Python extension any of these users would need. Corporates and ISPs may > pay far more handsomely than individuals for this kind of stuff. > > I know I am ranting still, but I repeat my starting point that addressing > issues like this is IMO the single best thing we could do for Python. We > could leave the language alone for 2 years, and come back to it when this > shite is better under control :-) Naa, that would spoil all the fun ;-) But anyways, going commercial with Python is not that far-fetched anymore nowadays... something like what the Linux distributors are doing for Linux could probably also be done with Python. Which brings us back to the package name topic or, better, the import mechanism...
-- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 168 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From skip at mojam.com Fri Jul 16 20:04:58 1999 From: skip at mojam.com (Skip Montanaro) Date: Fri, 16 Jul 1999 14:04:58 -0400 (EDT) Subject: [Python-Dev] Python bugs database started In-Reply-To: <14219.20501.697542.358579@anthem.cnri.reston.va.us> References: <199907122004.QAA09348@eric.cnri.reston.va.us> <000701becce4$a973c920$31a02299@tim> <14219.20501.697542.358579@anthem.cnri.reston.va.us> Message-ID: <14223.29664.66832.630010@94.chicago-33-34rs.il.dial-access.att.net> TP> The first time I submitted a bug, I backed up to the entry page and TP> hit Refresh to get the category counts updated (never saw Jitterbug TP> before, so must play!). IE5 whined about something-or-other being TP> out of date, and would I like to "repost the data"? I said sure. Barry> This makes perfect sense, and explains exactly what's going on. Barry> Let's call it "poor design"[1] instead of "user error". A quick Barry> scan last night of the Jitterbug site shows no signs of fixes or Barry> workarounds. What would Jitterbug have to do to avoid these Barry> kinds of problems? If the submission form uses METHOD=GET instead of METHOD=POST, the backup problem should go away. Skip (finally hobbling through my email after the move to Illinois...) From tim_one at email.msn.com Sun Jul 18 09:06:16 1999 From: tim_one at email.msn.com (Tim Peters) Date: Sun, 18 Jul 1999 03:06:16 -0400 Subject: [Python-Dev] End of the line In-Reply-To: <199907131253.IAA10730@eric.cnri.reston.va.us> Message-ID: <000b01bed0ec$075c47a0$36a02299@tim> > The latest versions of the Icon language [convert \r\n, \r and \n to > plain \n in text mode upon read, and convert \n to the platform convention > on write] It's a trend : the latest version of the REBOL language also does this. 
The Java compiler does it for Java source files, but I don't know how runtime file read/write work in Java. Anyone know offhand if there's a reliable way to determine whether an open file descriptor (a C FILE*) is seekable? if-i'm-doomed-to-get-obsessed-by-this-may-as-well-make-it-faster-too-ly y'rs - tim From mal at lemburg.com Sun Jul 18 22:29:43 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Sun, 18 Jul 1999 22:29:43 +0200 Subject: [Python-Dev] End of the line References: <000b01bed0ec$075c47a0$36a02299@tim> Message-ID: <37923937.4E73E8D8@lemburg.com> Tim Peters wrote: > > Anyone know offhand if there's a reliable way to determine whether an open > file descriptor (a C FILE*) is seekable? I'd simply use trial&error:

    if (fseek(stream, 0, SEEK_CUR) < 0) {
        if (errno != EBADF) {
            /* Not seekable */
            errno = 0;
        }
        else
            /* Error */ ;
    }
    else
        /* Seekable */ ;

How to get this thread safe is left as an exercise to the interested reader ;) Cheers, -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 166 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From da at ski.org Thu Jul 22 01:41:28 1999 From: da at ski.org (David Ascher) Date: Wed, 21 Jul 1999 16:41:28 -0700 (Pacific Daylight Time) Subject: [Python-Dev] Perl 5.6 'feature list' Message-ID: Not all that exciting, but good to know what they're doing: http://www.perl.com/cgi-bin/pace/pub/1999/06/perl5-6.html From tim_one at email.msn.com Thu Jul 22 04:52:26 1999 From: tim_one at email.msn.com (Tim Peters) Date: Wed, 21 Jul 1999 22:52:26 -0400 Subject: [Python-Dev] Perl 5.6 'feature list' In-Reply-To: Message-ID: <000f01bed3ed$3b509800$642d2399@tim> [David Ascher] > Not all that exciting, but good to know what they're doing: > > http://www.perl.com/cgi-bin/pace/pub/1999/06/perl5-6.html It is good to know, and I didn't, so thanks for passing that on! I see they're finally stealing Python's version numbering scheme .
In other news, I just noticed that REBOL threw 1st-class continuations *out* of the language, leaving just the "escape up the current call chain" exception-handling (throw/catch) kind. This isn't an open project, so it's hard to second-guess why. Or easy, depending on how you look at it . i-suggest-looking-at-it-the-right-way-ly y'rs - tim From jim at digicool.com Thu Jul 22 14:15:08 1999 From: jim at digicool.com (Jim Fulton) Date: Thu, 22 Jul 1999 08:15:08 -0400 Subject: [Python-Dev] I'd like list.pop to accept an optional second argument giving a default value Message-ID: <37970B4C.8E8C741E@digicool.com> I like the list pop method because it provides a way to use lists as thread safe queues and stacks (since append and pop are protected by the global interpreter lock). With pop, you can essentially test whether the list is empty and get a value if it isn't in one atomic operation: try: foo=queue.pop(0) except IndexError: ... empty queue case else: ... non-empty case, do something with foo Unfortunately, this incurs exception overhead. I'd rather do something like: foo=queue.pop(0,marker) if foo is marker: ... empty queue case else: ... non-empty case, do something with foo I'd be happy to provide a patch. Jim -- Jim Fulton mailto:jim at digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. 
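The behaviour Jim asks for can be approximated without changing list itself. What follows is only a sketch in modern Python 3, not the patch he offers: the helper name pop_default and the _MISSING sentinel are assumptions made up for illustration. The check-and-remove still happens inside the single list.pop call, so the helper adds no second step between testing for emptiness and removing the item.

```python
# Hypothetical helper; neither the name nor the sentinel comes from the thread.
_MISSING = object()  # unique sentinel, so None remains a legal default value


def pop_default(items, index=-1, default=_MISSING):
    """Pop items[index]; if a default was supplied, return it instead of raising."""
    try:
        # One list.pop call both checks for the element and removes it.
        return items.pop(index)
    except IndexError:
        if default is _MISSING:
            raise  # no default supplied: keep the usual IndexError
        return default


queue = [1, 2]
marker = object()
first = pop_default(queue, 0, marker)   # oldest item
second = pop_default(queue, 0, marker)  # next item
third = pop_default(queue, 0, marker)   # queue is empty, so marker comes back
```

Called without a default, pop_default raises IndexError exactly as list.pop does, which keeps code that relies on the exception working.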
From fredrik at pythonware.com Thu Jul 22 15:14:50 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu, 22 Jul 1999 15:14:50 +0200 Subject: [Python-Dev] Perl 5.6 'feature list' References: Message-ID: <001501bed444$2f5dbe90$f29b12c2@secret.pythonware.com> David Ascher wrote: > Not all that exciting, but good to know what they're doing: > > http://www.perl.com/cgi-bin/pace/pub/1999/06/perl5-6.html well, "unicode all the way down" and "language level event loop" sounds pretty exciting to me... (but christian's work beats it all, of course...) From skip at mojam.com Thu Jul 22 16:24:53 1999 From: skip at mojam.com (Skip Montanaro) Date: Thu, 22 Jul 1999 09:24:53 -0500 (CDT) Subject: [Python-Dev] I'd like list.pop to accept an optional second argument giving a default value In-Reply-To: <37970B4C.8E8C741E@digicool.com> References: <37970B4C.8E8C741E@digicool.com> Message-ID: <14231.10515.423401.512972@153.chicago-41-42rs.il.dial-access.att.net> Jim> I like the list pop method because it provides a way to use lists Jim> as thread safe queues and stacks (since append and pop are Jim> protected by the global interpreter lock). The global interpreter lock is a property of the current implementation of Python, not of the language itself. At one point in the past Greg Stein created a set of patches that eliminated the lock. While it's perhaps convenient to use now, it may not always exist. I'm not so sure that it should be used as a motivator for changes to libraries in the standard distribution. 
Skip From jim at digicool.com Thu Jul 22 16:47:13 1999 From: jim at digicool.com (Jim Fulton) Date: Thu, 22 Jul 1999 10:47:13 -0400 Subject: [Python-Dev] I'd like list.pop to accept an optional second argument giving a defaultvalue References: <37970B4C.8E8C741E@digicool.com> <14231.10515.423401.512972@153.chicago-41-42rs.il.dial-access.att.net> Message-ID: <37972EF1.372C2CB1@digicool.com> Skip Montanaro wrote: > > Jim> I like the list pop method because it provides a way to use lists > Jim> as thread safe queues and stacks (since append and pop are > Jim> protected by the global interpreter lock). > > The global interpreter lock is a property of the current implementation of > Python, not of the language itself. At one point in the past Greg Stein > created a set of patches that eliminated the lock. While it's perhaps > convenient to use now, it may not always exist. I'm not so sure that it > should be used as a motivator for changes to libraries in the standard > distribution. If the global interpreter lock goes away, then some other locking mechanism will be used to make built-in object operations atomic. For example, in Greg's changes, each list was protected by a list lock. The key is that pop combines checking for an empty list and removing an element into a single operation. As long as the operations append and pop are atomic, then lists can be used as thread-safe stacks and queues. The benefit of the proposal does not really depend on the global interpreter lock. It only depends on list operations being atomic. Jim -- Jim Fulton mailto:jim at digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. 
Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From gmcm at hypernet.com Thu Jul 22 18:07:31 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Thu, 22 Jul 1999 11:07:31 -0500 Subject: [Python-Dev] I'd like list.pop to accept an optional second In-Reply-To: <37970B4C.8E8C741E@digicool.com> Message-ID: <1279466648-20991135@hypernet.com> Jim Fulton writes: > With pop, you can essentially test whether the list is > empty and get a value if it isn't in one atomic operation: > > try: > foo=queue.pop(0) > except IndexError: > ... empty queue case > else: > ... non-empty case, do something with foo > > Unfortunately, this incurs exception overhead. I'd rather do > something like: > > foo=queue.pop(0,marker) > if foo is marker: > ... empty queue case > else: > ... non-empty case, do something with foo I'm assuming you're asking for the equivalent of: def pop(self, default=None): much like dict.get? Then how do I get the old behavior? (I've been known to do odd things - like change behavior based on the number of args - in extension modules, but this ain't an extension). - Gordon From fredrik at pythonware.com Thu Jul 22 17:23:00 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu, 22 Jul 1999 17:23:00 +0200 Subject: [Python-Dev] End of the line References: <000001beccfb$9beb4fa0$2f9e2299@tim> Message-ID: <009901bed456$161a4950$f29b12c2@secret.pythonware.com> Tim Peters wrote: > The latest versions of the Icon language (9.3.1 & beyond) sprouted an > interesting change in semantics: if you open a file for reading in > "translated" (text) mode now, it normalizes Unix, Mac and Windows line > endings to plain \n. Writing in text mode still produces what's natural for > the platform. > > Anyone think that's *not* a good idea? if we were to change this, how would you tell Python to open a file in text mode? 
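The read-side translation Tim describes for Icon (Unix, Mac and Windows line endings all become plain \n) can be sketched in a few lines of modern Python 3. The function name normalize_newlines is an assumption for illustration, and the sketch covers only the normalization itself, not how a file object would apply it or how a mode string would select it.

```python
def normalize_newlines(text):
    """Map Windows (CR LF) and old Mac (CR) line endings to a plain LF."""
    # Handle CR LF first; otherwise each pair would turn into two newlines.
    return text.replace("\r\n", "\n").replace("\r", "\n")


sample = "unix\nwindows\r\nmac\rend"
normalized = normalize_newlines(sample)
```

Doing the two replacements in this order is the only subtle point: a lone \r is rewritten only after every \r\n pair has already been collapsed.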
From jim at digicool.com Thu Jul 22 17:30:22 1999 From: jim at digicool.com (Jim Fulton) Date: Thu, 22 Jul 1999 11:30:22 -0400 Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <1279466648-20991135@hypernet.com> Message-ID: <3797390E.50972562@digicool.com> Gordon McMillan wrote: > > Then how do I get the old behavior? Just pass 0 or 1 argument. >(I've been known to do odd > things - like change behavior based on the number of args - in > extension modules, but this ain't an extension). It *is* a built-in method. It will be handled just like dictionaries handle the second argument to get. Jim -- Jim Fulton mailto:jim at digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From gmcm at hypernet.com Thu Jul 22 18:33:06 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Thu, 22 Jul 1999 11:33:06 -0500 Subject: [Python-Dev] I'd like list.pop to accept an optional second In-Reply-To: <3797390E.50972562@digicool.com> Message-ID: <1279465114-21083404@hypernet.com> Jim Fulton wrote: > Gordon McMillan wrote: > > > > Then how do I get the old behavior? > > Just pass 0 or 1 argument. > > >(I've been known to do odd > > things - like change behavior based on the number of args - in > > extension modules, but this ain't an extension). > > It *is* a built-in method. It will be handled just like > dictionaries handle the second argument to get. d.get(nonexistantkey) does not throw an exception, it returns None. If list.pop() does not throw an exception when list is empty, it's new behavior. 
Which are you asking for: breaking code that expects IndexError Violating Pythonic expectations by, in effect, creating 2 methods list.pop(void) list.pop(default_return) - Gordon From jim at digicool.com Thu Jul 22 17:44:22 1999 From: jim at digicool.com (Jim Fulton) Date: Thu, 22 Jul 1999 11:44:22 -0400 Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <1279465114-21083404@hypernet.com> Message-ID: <37973C56.5ACFFDEC@digicool.com> Gordon McMillan wrote: > > Jim Fulton wrote: > > > Gordon McMillan wrote: > > > > > > Then how do I get the old behavior? > > > > Just pass 0 or 1 argument. > > > > >(I've been known to do odd > > > things - like change behavior based on the number of args - in > > > extension modules, but this ain't an extension). > > > > It *is* a built-in method. It will be handled just like > > dictionaries handle the second argument to get. > > d.get(nonexistantkey) does not throw an exception, it returns None. Oops, I'd forgotten that. > If list.pop() does not throw an exception when list is empty, it's > new behavior. > > Which are you asking for: > breaking code that expects IndexError No. > Violating Pythonic expectations by, in effect, creating 2 methods > list.pop(void) > list.pop(default_return) Yes, except that I disagree that this is non-pythonic. Jim -- Jim Fulton mailto:jim at digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From mal at lemburg.com Thu Jul 22 19:27:53 1999 From: mal at lemburg.com (M.-A. 
Lemburg) Date: Thu, 22 Jul 1999 19:27:53 +0200 Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <1279465114-21083404@hypernet.com> <37973C56.5ACFFDEC@digicool.com> Message-ID: <37975499.FB61E4E3@lemburg.com> Jim Fulton wrote: > > > Violating Pythonic expectations by, in effect, creating 2 methods > > list.pop(void) > > list.pop(default_return) > > Yes, except that I disagree that this is non-pythonic. Wouldn't a generic builtin for these kinds of things be better, e.g. a function returning a default value in case an exception occurs... something like: tryexcept(list.pop(), IndexError, default) which returns default in case an IndexError occurs. Don't think this would be much faster that the explicit try:...except: though... -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 162 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From gmcm at hypernet.com Thu Jul 22 18:54:58 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Thu, 22 Jul 1999 11:54:58 -0500 Subject: [Python-Dev] I'd like list.pop to accept an optional second In-Reply-To: <37973C56.5ACFFDEC@digicool.com> Message-ID: <1279463517-21179480@hypernet.com> Jim Fulton wrote: > > Gordon McMillan wrote: ... > > Violating Pythonic expectations by, in effect, creating 2 methods > > list.pop(void) > > list.pop(default_return) > > Yes, except that I disagree that this is non-pythonic. > I'll leave the final determination to Mr. Python, but I disagree. Offhand I can't think of a built-in that can't be expressed in normal Python notation, where "optional" args are really defaulted args. Which would lead us to either a new list method, or redefining pop: def pop(usedefault=0, default=None) and making you use 2 args. But maybe I've missed a precedent because I'm so used to it. 
(Hmm, I guess string.split is a sort-of precedent, because the first default arg behaves differently than anything you could pass in). - Gordon From bwarsaw at cnri.reston.va.us Thu Jul 22 20:33:57 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Thu, 22 Jul 1999 14:33:57 -0400 (EDT) Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <1279465114-21083404@hypernet.com> <37973C56.5ACFFDEC@digicool.com> <37975499.FB61E4E3@lemburg.com> Message-ID: <14231.25621.888844.205034@anthem.cnri.reston.va.us> >>>>> "M" == M writes: M> Wouldn't a generic builtin for these kinds of things be M> better, e.g. a function returning a default value in case M> an exception occurs... something like: M> tryexcept(list.pop(), IndexError, default) M> which returns default in case an IndexError occurs. Don't think M> this would be much faster that the explicit try:...except: M> though... Don't know if this would be better (or useful, etc.), but it could possibly be faster than explicit try/except, because with try/except you have to instantiate the exception object. Presumably tryexcept() -- however it was spelled -- would catch the exception in C, thus avoiding the overhead of exception object instantiation. -Barry From bwarsaw at cnri.reston.va.us Thu Jul 22 20:36:09 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Thu, 22 Jul 1999 14:36:09 -0400 (EDT) Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <3797390E.50972562@digicool.com> <1279465114-21083404@hypernet.com> Message-ID: <14231.25753.710299.405579@anthem.cnri.reston.va.us> >>>>> "Gordo" == Gordon McMillan writes: Gordo> Which are you asking for: breaking code that expects Gordo> IndexError Violating Pythonic expectations by, in effect, Gordo> creating 2 methods Gordo> list.pop(void) Gordo> list.pop(default_return) The docs /do/ say that list.pop() is experimental, so that probably gives Guido all the out he'd need to change the semantics :). 
I myself have yet to use list.pop() so I don't know how disastrous the change in semantics would be to existing code. -Barry From jim at digicool.com Thu Jul 22 18:49:33 1999 From: jim at digicool.com (Jim Fulton) Date: Thu, 22 Jul 1999 12:49:33 -0400 Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <1279463517-21179480@hypernet.com> Message-ID: <37974B9D.59C2D45E@digicool.com> Gordon McMillan wrote: > > Offhand I can't think of a built-in that can't be expressed in normal > Python notation, where "optional" args are really defaulted args. I can define the pop I want in Python as follows:

    _marker=[]

    class list:
        ...
        def pop(self, index=-1, default=_marker):
            try: v=self[index]
            except IndexError:
                if default is not _marker:
                    return default
                if self: m='pop index out of range'
                else: m='pop from empty list'
                raise IndexError, m
            del self[index]
            return v

Although I'm not sure why the "pythonicity" of an interface should depend on its implementation. Jim -- Jim Fulton mailto:jim at digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From jim at digicool.com Thu Jul 22 18:53:26 1999 From: jim at digicool.com (Jim Fulton) Date: Thu, 22 Jul 1999 12:53:26 -0400 Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <1279463517-21179480@hypernet.com> Message-ID: <37974C86.2EC53BE7@digicool.com> BTW, a good precedent for what I want is getattr. getattr(None,'spam') raises an error, but: getattr(None,'spam',1) returns 1 Jim -- Jim Fulton mailto:jim at digicool.com Python Powered!
Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From bwarsaw at cnri.reston.va.us Thu Jul 22 21:02:21 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Thu, 22 Jul 1999 15:02:21 -0400 (EDT) Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <1279463517-21179480@hypernet.com> <37974C86.2EC53BE7@digicool.com> Message-ID: <14231.27325.387718.435420@anthem.cnri.reston.va.us> >>>>> "JF" == Jim Fulton writes: JF> BTW, a good precedent for what I want JF> is getattr. JF> getattr(None,'spam') JF> raises an error, but: JF> getattr(None,'spam',1) JF> returns 1 Okay, how did this one sneak in, huh? I didn't even realize this had been added to getattr()! CVS reveals it was added b/w 1.5.1 and 1.5.2a1, so maybe I just missed the checkin message. Fred, the built-in-funcs doc needs updating: http://www.python.org/doc/current/lib/built-in-funcs.html FWIW, the CVS log message says this feature is experimental too. :) -Barry From jim at digicool.com Thu Jul 22 21:20:46 1999 From: jim at digicool.com (Jim Fulton) Date: Thu, 22 Jul 1999 15:20:46 -0400 Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <1279463517-21179480@hypernet.com> <37974C86.2EC53BE7@digicool.com> <14231.27325.387718.435420@anthem.cnri.reston.va.us> Message-ID: <37976F0E.DFB4067B@digicool.com> "Barry A. Warsaw" wrote: > > >>>>> "JF" == Jim Fulton writes: > > JF> BTW, a good precedent for what I want > JF> is getattr. > > JF> getattr(None,'spam') > > JF> raises an error, but: > > JF> getattr(None,'spam',1) > > JF> returns 1 > > Okay, how did this one sneak in, huh? I don't know. 
Someone told me about it. I find it wildly useful. > I didn't even realize this had > been added to getattr()! CVS reveals it was added b/w 1.5.1 and > 1.5.2a1, so maybe I just missed the checkin message. > > Fred, the built-in-funcs doc needs updating: > > http://www.python.org/doc/current/lib/built-in-funcs.html > > FWIW, the CVS log message says this feature is experimental too. :) Eek! I want it to stay! I also really like list.pop. :) Jim -- Jim Fulton mailto:jim at digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From fdrake at cnri.reston.va.us Thu Jul 22 21:26:32 1999 From: fdrake at cnri.reston.va.us (Fred L. Drake) Date: Thu, 22 Jul 1999 15:26:32 -0400 (EDT) Subject: [Python-Dev] I'd like list.pop to accept an optional second In-Reply-To: <14231.27325.387718.435420@anthem.cnri.reston.va.us> References: <1279463517-21179480@hypernet.com> <37974C86.2EC53BE7@digicool.com> <14231.27325.387718.435420@anthem.cnri.reston.va.us> Message-ID: <14231.28776.160422.442859@weyr.cnri.reston.va.us> >>>>> "JF" == Jim Fulton writes: JF> BTW, a good precedent for what I want JF> is getattr. JF> getattr(None,'spam') JF> raises an error, but: JF> getattr(None,'spam',1) JF> returns 1 Barry A. Warsaw writes: > Fred, the built-in-funcs doc needs updating: This is done in the CVS repository; thanks for pointing out the oversight! Do people realize that pop() already has an optional parameter? That *is* in the docs: http://www.python.org/docs/current/lib/typesseq-mutable.html See note 4 below the table. -Fred -- Fred L. Drake, Jr. 
Corporation for National Research Initiatives From bwarsaw at cnri.reston.va.us Thu Jul 22 21:37:20 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Thu, 22 Jul 1999 15:37:20 -0400 (EDT) Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <1279463517-21179480@hypernet.com> <37974C86.2EC53BE7@digicool.com> <14231.27325.387718.435420@anthem.cnri.reston.va.us> <37976F0E.DFB4067B@digicool.com> Message-ID: <14231.29424.569863.149366@anthem.cnri.reston.va.us> >>>>> "JF" == Jim Fulton writes: JF> I don't know. Someone told me about it. I find it JF> wildly useful. No kidding! :) From mal at lemburg.com Thu Jul 22 22:32:23 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Thu, 22 Jul 1999 22:32:23 +0200 Subject: [Python-Dev] Importing extension modules Message-ID: <37977FD7.BD7A9826@lemburg.com> I'm currently testing a pure Python version of mxDateTime (my date/time package), which uses a contributed Python version of the C extension. Now, to avoid problems with pickled DateTime objects (they include the complete module name), I would like to name *both* the Python and the C extension version mxDateTime. With the current lookup scheme (shared mods are searched before Python modules) this is no problem since the shared mod is found before the Python version and used instead, so getting this working is rather simple. The question is: will this setup remain a feature in future versions of Python ? (Does it work this way on all platforms ?) Cheers, -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 162 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From mal at lemburg.com Thu Jul 22 22:45:24 1999 From: mal at lemburg.com (M.-A. 
Lemburg) Date: Thu, 22 Jul 1999 22:45:24 +0200 Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <1279463517-21179480@hypernet.com> <37974C86.2EC53BE7@digicool.com> <14231.27325.387718.435420@anthem.cnri.reston.va.us> <37976F0E.DFB4067B@digicool.com> Message-ID: <379782E4.7DC79460@lemburg.com> Jim Fulton wrote: > > [getattr(obj,name[,default])] > > Okay, how did this one sneak in, huh? > > I don't know. Someone told me about it. I find it > wildly useful. Me too... ;-) > > I didn't even realize this had > > been added to getattr()! CVS reveals it was added b/w 1.5.1 and > > 1.5.2a1, so maybe I just missed the checkin message. http://www.deja.com/getdoc.xp?AN=366635977 -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 162 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From tismer at appliedbiometrics.com Thu Jul 22 22:50:42 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Thu, 22 Jul 1999 22:50:42 +0200 Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <1279463517-21179480@hypernet.com> <37974C86.2EC53BE7@digicool.com> <14231.27325.387718.435420@anthem.cnri.reston.va.us> <37976F0E.DFB4067B@digicool.com> Message-ID: <37978422.F36BB130@appliedbiometrics.com> > > Fred, the built-in-funcs doc needs updating: > > > > http://www.python.org/doc/current/lib/built-in-funcs.html > > > > FWIW, the CVS log message says this feature is experimental too. :) > > Eek! I want it to stay! > > I also really like list.pop. :) Seconded! Also, things which appeared between some alphas and made it up to the final are just there. It would be fair to update the CVS tree and say the features made it into the dist, even if it just was a mistake not to remove them in time. It was time enough. ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break!
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From bwarsaw at cnri.reston.va.us Thu Jul 22 22:50:36 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Thu, 22 Jul 1999 16:50:36 -0400 (EDT) Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <1279463517-21179480@hypernet.com> <37974C86.2EC53BE7@digicool.com> <14231.27325.387718.435420@anthem.cnri.reston.va.us> <37976F0E.DFB4067B@digicool.com> <379782E4.7DC79460@lemburg.com> Message-ID: <14231.33820.422195.45250@anthem.cnri.reston.va.us> >>>>> "M" == M writes: M> http://www.deja.com/getdoc.xp?AN=366635977 Ah, thanks! Your rationale was exactly the reason why I added dict.get(). I'm still not 100% sure about list.pop() though, since it's not exactly equivalent -- list.pop() modifies the list as a side-effect :) Makes me think you might want an alternative spelling for list[s], call it list.get() and put the optional default on that method. Then again, maybe list.pop() with an optional default is good enough. -Barry From jim at digicool.com Thu Jul 22 22:55:05 1999 From: jim at digicool.com (Jim Fulton) Date: Thu, 22 Jul 1999 16:55:05 -0400 Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <1279463517-21179480@hypernet.com> <37974C86.2EC53BE7@digicool.com> <14231.27325.387718.435420@anthem.cnri.reston.va.us> <37976F0E.DFB4067B@digicool.com> <379782E4.7DC79460@lemburg.com> <14231.33820.422195.45250@anthem.cnri.reston.va.us> Message-ID: <37978529.B1AC5273@digicool.com> "Barry A. Warsaw" wrote: > > >>>>> "M" == M writes: > > M> http://www.deja.com/getdoc.xp?AN=366635977 > > Ah, thanks! Your rationale was exactly the reason why I added > dict.get(). 
I'm still not 100% sure about list.pop() though, since > it's not exactly equivalent -- list.pop() modifies the list as a > side-effect :) Makes me think you might want an alternative spelling > for list[s], call it list.get() and put the optional default on that > method. Then again, maybe list.pop() with an optional default is good > enough. list.get and list.pop are different, since get wouldn't modify the list and pop would. Jim -- Jim Fulton mailto:jim at digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From bwarsaw at cnri.reston.va.us Thu Jul 22 23:13:49 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Thu, 22 Jul 1999 17:13:49 -0400 (EDT) Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <1279463517-21179480@hypernet.com> <37974C86.2EC53BE7@digicool.com> <14231.27325.387718.435420@anthem.cnri.reston.va.us> <37976F0E.DFB4067B@digicool.com> <379782E4.7DC79460@lemburg.com> <14231.33820.422195.45250@anthem.cnri.reston.va.us> <37978529.B1AC5273@digicool.com> Message-ID: <14231.35214.1590.898304@anthem.cnri.reston.va.us> >>>>> "JF" == Jim Fulton writes: JF> list.get and list.pop are different, since get wouldn't modify JF> the list and pop would. Right. Would we need them both? 
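To make the get/pop distinction above concrete, here is a minimal sketch in present-day Python. Neither list.get() nor a defaulted list.pop() exists in the language; seq_get and pop_default are hypothetical helper names, introduced here only to show the two behaviours being compared, not anyone's actual proposal.

```python
def seq_get(seq, index, default=None):
    # Hypothetical stand-in for the proposed list.get(): look, don't touch.
    try:
        return seq[index]
    except IndexError:
        return default


def pop_default(lst, index=-1, default=None):
    # Hypothetical stand-in for list.pop() with a default:
    # removes the item on success, leaves the list alone on a bad index.
    try:
        return lst.pop(index)
    except IndexError:
        return default


items = ["a", "b"]
print(seq_get(items, 5, "nothing"))      # bad index: default, list unchanged
print(pop_default(items, 5, "nothing"))  # bad index: default, list unchanged
print(pop_default(items))                # good index: last item is removed
print(items)
```

Both helpers fall back to the default on a bad index; only pop_default shortens the list when the index is good, which is the side effect Barry points out.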
From jim at digicool.com Thu Jul 22 23:36:03 1999 From: jim at digicool.com (Jim Fulton) Date: Thu, 22 Jul 1999 17:36:03 -0400 Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <1279463517-21179480@hypernet.com> <37974C86.2EC53BE7@digicool.com> <14231.27325.387718.435420@anthem.cnri.reston.va.us> <37976F0E.DFB4067B@digicool.com> <379782E4.7DC79460@lemburg.com> <14231.33820.422195.45250@anthem.cnri.reston.va.us> <37978529.B1AC5273@digicool.com> <14231.35214.1590.898304@anthem.cnri.reston.va.us> Message-ID: <37978EC3.CAAF2632@digicool.com> "Barry A. Warsaw" wrote: > > >>>>> "JF" == Jim Fulton writes: > > JF> list.get and list.pop are different, since get wouldn't modify > JF> the list and pop would. > > Right. Would we need them both? Sure. Since a sequence is sort of a special kind of mapping, get makes sense. I definitely want pop. Jim -- Jim Fulton mailto:jim at digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From tim_one at email.msn.com Fri Jul 23 05:08:05 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 22 Jul 1999 23:08:05 -0400 Subject: [Python-Dev] I'd like list.pop to accept an optional second In-Reply-To: <37976F0E.DFB4067B@digicool.com> Message-ID: <000201bed4b8$951f9ae0$2c2d2399@tim> [Barry] > FWIW, the CVS log message says this feature [3-arg getattr] is > experimental too. :) [Jim] > Eek! I want it to stay! > > I also really like list.pop. :) Don't panic: Guido has never removed a feature explicitly called "experimental"; he's only removed non-experimental ones.
that's-why-we-call-stackless-python-"an-experiment"-ly y'rs - tim From tim_one at email.msn.com Fri Jul 23 05:08:07 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 22 Jul 1999 23:08:07 -0400 Subject: [Python-Dev] End of the line In-Reply-To: <009901bed456$161a4950$f29b12c2@secret.pythonware.com> Message-ID: <000301bed4b8$964492e0$2c2d2399@tim> [Tim] > The latest versions of the Icon language ... normalizes Unix, Mac > and Windows line endings to plain \n. Writing in text mode still > produces what's natural for the platform. [/F] > if we were to change this, how would you > tell Python to open a file in text mode? Meaning whatever it is the platform libc does? In Icon or REBOL, you don't. Icon is more interesting because they changed the semantics of their "t" (for "translated") mode without providing any way to go back to the old behavior (REBOL did this too, but didn't have Icon's 15 years of history to wrestle with). Curiously (I doubt Griswold *cared* about this!), the resulting behavior still conforms to ANSI C, because that std promises little about text mode semantics in the presence of non-printable characters. Nothing of mine would miss C's raw text mode (lack of) semantics, so I don't care. I *would* like Python to define portable semantics for the mode strings it accepts in the builtin open regardless, and push platform-specific silliness (including raw C text mode, if someone really wants that; or MS's "c" mode, etc) into a new os.fopen function. Push random C crap into expert modules, where it won't baffle my sister <0.7 wink>. I expect Python should still open non-binary files in the platform's text mode, though, to minimize surprises for C extensions mucking with the underlying stream object (Icon/REBOL don't have this problem, although Icon opens the file in native libc text mode anyway). 
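The normalization described at the top of this message (Unix, Mac and Windows line endings all read as plain \n) can be sketched as follows. This is only an illustration of the idea, not Icon's or REBOL's implementation; normalize_newlines is a hypothetical name, and the sketch assumes the whole text is already in memory as a string.

```python
def normalize_newlines(text):
    # Replace the two-character Windows ending first, so that its "\r"
    # is not mistaken for an old Mac ending and turned into an extra "\n".
    return text.replace("\r\n", "\n").replace("\r", "\n")


sample = "unix\nwindows\r\nmac\rlast"
print(normalize_newlines(sample).split("\n"))
```

The order of the two replacements matters: handling a bare "\r" first would turn every Windows ending into two newlines.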
next-step:-define-tabs-to-mean-8-characters-and-drop-unicode-in-
    favor-of-7-bit-ascii-ly y'rs - tim

From tim_one at email.msn.com Fri Jul 23 05:08:02 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 22 Jul 1999 23:08:02 -0400
Subject: [Python-Dev] I'd like list.pop to accept an optional second
In-Reply-To: <37975499.FB61E4E3@lemburg.com>
Message-ID: <000101bed4b8$9395eda0$2c2d2399@tim>

[M.-A. Lemburg]
> Wouldn't a generic builtin for these kinds of things be
> better, e.g. a function returning a default value in case
> an exception occurs... something like:
>
>     tryexcept(list.pop(), IndexError, default)
>
> which returns default in case an IndexError occurs. Don't
> think this would be much faster than the explicit try:...except:
> though...

As a function (builtin or not), tryexcept will never get called if
list.pop() raises an exception. tryexcept would need to be a new statement
type, and the compiler would have to generate code akin to

    try:
        whatever = list.pop()
    except IndexError:
        whatever = default

If you want to do it in a C function instead to avoid the Python-level
exception overhead, the compiler would have to wrap list.pop() in a lambda
in order to delay evaluation until the C code got control; and then you've
got worse overhead.

generalization-is-the-devil's-playground-ly y'rs - tim

From tim_one at email.msn.com Fri Jul 23 09:23:27 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Fri, 23 Jul 1999 03:23:27 -0400
Subject: [Python-Dev] I'd like list.pop to accept an optional second argument giving a defaultvalue
In-Reply-To: <37970B4C.8E8C741E@digicool.com>
Message-ID: <000201bed4dc$41c9f240$392d2399@tim>

In a moment of insanity, Guido gave me carte blanche to suggest new list
methods, and list.pop & list.extend were the result. I considered spec'ing
list.pop to take an optional "default on bad index" argument too, but after
playing with it didn't like it (always appeared just as easy & clearer to
use "if list:" / "while list:" etc).

Jim has a novel use I hadn't considered:

> With pop, you can essentially test whether the list is
> empty and get a value if it isn't in one atomic operation:
>
>     try:
>         foo=queue.pop(0)
>     except IndexError:
>         ... empty queue case
>     else:
>         ... non-empty case, do something with foo
>
> Unfortunately, this incurs exception overhead. I'd rather do
> something like:
>
>     foo=queue.pop(0,marker)
>     if foo is marker:
>         ... empty queue case
>     else:
>         ... non-empty case, do something with foo

It's both clever and pretty. OTOH, the original try/except isn't expensive
unless the "except" triggers frequently, in which case (the queue is often
empty) a thread is likely better off with a yielding Queue.get() call.

So this strikes me as useful only for thread micro-optimization, and a kind
of optimization most users should be steered away from anyway. Does anyone
have a real use for this outside of threads? If not, I'd rather it not go
in.

For threads that need an optimized non-blocking probe, I'd write it:

    gotone = 0
    if queue:
        try:
            foo = queue.pop(0)
            gotone = 1
        except IndexError:
            pass
    if gotone:
        # use foo
    else:
        # twiddle thumbs

For the IndexError to trigger there, a thread has to lose its bytecode slice
between a successful "if queue" and the queue.pop, and not get another
chance to run until other threads have emptied the queue.

From mal at lemburg.com Fri Jul 23 10:27:47 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 23 Jul 1999 10:27:47 +0200
Subject: [Python-Dev] I'd like list.pop to accept an optional second
References: <000101bed4b8$9395eda0$2c2d2399@tim>
Message-ID: <37982783.E60E9941@lemburg.com>

Tim Peters wrote:
>
> [M.-A. Lemburg]
> > Wouldn't a generic builtin for these kinds of things be
> > better, e.g. a function returning a default value in case
> > an exception occurs... something like:
> >
> >     tryexcept(list.pop(), IndexError, default)
> >
> > which returns default in case an IndexError occurs. Don't
> > think this would be much faster than the explicit try:...except:
> > though...
>
> As a function (builtin or not), tryexcept will never get called if
> list.pop() raises an exception.

Dang. You're right...

> tryexcept would need to be a new statement
> type, and the compiler would have to generate code akin to
>
>     try:
>         whatever = list.pop()
>     except IndexError:
>         whatever = default
>
> If you want to do it in a C function instead to avoid the Python-level
> exception overhead, the compiler would have to wrap list.pop() in a lambda
> in order to delay evaluation until the C code got control; and then you've
> got worse overhead.

Oh well, forget the whole idea then. list.pop() is really not
needed that often anyways to warrant the default arg thing, IMHO.
dict.get() and getattr() have the default arg as performance
enhancement and I believe that you wouldn't get all that much better
performance on average by adding a second optional argument to list.pop().

BTW, there is a generic get() function in mxTools (you know where...)
in case someone should be looking for such a beast. It works
with all sequences and mappings.

Also, has anybody considered writing list.pop(..,default) this way:

    if list:
        obj = list.pop()
    else:
        obj = default

No exceptions, no changes, fast as hell :-)

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 162 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From tismer at appliedbiometrics.com Fri Jul 23 12:39:27 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Fri, 23 Jul 1999 12:39:27 +0200
Subject: [Python-Dev] I'd like list.pop to accept an optional second
References: <000101bed4b8$9395eda0$2c2d2399@tim> <37982783.E60E9941@lemburg.com>
Message-ID: <3798465F.33A253D4@appliedbiometrics.com>

"M.-A. Lemburg" wrote:
...
> Also, has anybody considered writing list.pop(..,default) this way:
>
>     if list:
>         obj = list.pop()
>     else:
>         obj = default
>
> No exceptions, no changes, fast as hell :-)

Yes, that's the best way to go, I think.
But wasn't the primary question directed on
an atomic function which is thread-safe?
I'm not sure, this thread has grown too fast :-)

--
Christian Tismer             :^)
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home

From mal at lemburg.com Fri Jul 23 13:07:22 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 23 Jul 1999 13:07:22 +0200
Subject: [Python-Dev] I'd like list.pop to accept an optional second
References: <000101bed4b8$9395eda0$2c2d2399@tim> <37982783.E60E9941@lemburg.com> <3798465F.33A253D4@appliedbiometrics.com>
Message-ID: <37984CEA.1DF062F6@lemburg.com>

Christian Tismer wrote:
>
> "M.-A. Lemburg" wrote:
> ...
> > Also, has anybody considered writing list.pop(..,default) this way:
> >
> >     if list:
> >         obj = list.pop()
> >     else:
> >         obj = default
> >
> > No exceptions, no changes, fast as hell :-)
>
> Yes, that's the best way to go, I think.
> But wasn't the primary question directed on
> an atomic function which is thread-safe?
> I'm not sure, this thread has grown too fast :-)

I think that was what Jim had in mind in the first place.
Hmm, so maybe we're not after lists after all: maybe what
we need is access to the global interpreter lock in Python,
so that we can write:

    sys.lock.acquire()
    if list:
        obj = list.pop()
    else:
        obj = default
    sys.lock.release()

Or maybe we need some general lock in the thread module for these
purposes... don't know. It's been some time since I used
threads.
-- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 162 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From jim at digicool.com Fri Jul 23 13:58:23 1999 From: jim at digicool.com (Jim Fulton) Date: Fri, 23 Jul 1999 07:58:23 -0400 Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <000101bed4b8$9395eda0$2c2d2399@tim> <37982783.E60E9941@lemburg.com> <3798465F.33A253D4@appliedbiometrics.com> Message-ID: <379858DF.D317A40F@digicool.com> Christian Tismer wrote: > > "M.-A. Lemburg" wrote: > ... > > Also, has anybody considered writing list.pop(..,default) this way: > > > > if list: > > obj = list.pop() > > else: > > obj = default > > > > No exceptions, no changes, fast as hell :-) > > Yes, that's the best way to go, I think. > But wasn't the primary question directed on > an atomic function which is thread-safe? Right. And the above code doesn't solve this problem. Tim's code *does* solve the problem. It's the code we were using. It is a bit verbose though. > I'm not sure, this thread has grown too fast :-) Don't they all? Jim -- Jim Fulton mailto:jim at digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From fdrake at cnri.reston.va.us Fri Jul 23 17:07:37 1999 From: fdrake at cnri.reston.va.us (Fred L. 
Drake) Date: Fri, 23 Jul 1999 11:07:37 -0400 (EDT) Subject: [Python-Dev] I'd like list.pop to accept an optional second In-Reply-To: <37982783.E60E9941@lemburg.com> References: <000101bed4b8$9395eda0$2c2d2399@tim> <37982783.E60E9941@lemburg.com> Message-ID: <14232.34105.421424.838212@weyr.cnri.reston.va.us> Tim Peters wrote: > As a function (builtin or not), tryexcept will never get called if > list.pop() raises an exception. M.-A. Lemburg writes: > Oh well, forget the whole idea then. list.pop() is really not Giving up already? Wouldn't you just love this as an expression operator (which could work)? How about: top = list.pop() excepting IndexError, default Hehehe... ;-) -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From skip at mojam.com Fri Jul 23 18:23:31 1999 From: skip at mojam.com (Skip Montanaro) Date: Fri, 23 Jul 1999 11:23:31 -0500 (CDT) Subject: [Python-Dev] I'd like list.pop to accept an optional second In-Reply-To: <14232.34105.421424.838212@weyr.cnri.reston.va.us> References: <000101bed4b8$9395eda0$2c2d2399@tim> <37982783.E60E9941@lemburg.com> <14232.34105.421424.838212@weyr.cnri.reston.va.us> Message-ID: <14232.38398.818044.283682@52.chicago-35-40rs.il.dial-access.att.net> Fred> Giving up already? Wouldn't you just love this as an expression Fred> operator (which could work)? Fred> How about: Fred> top = list.pop() excepting IndexError, default Why not go all the way to Perl with top = list.pop() unless IndexError ??? ;-) Skip From fdrake at cnri.reston.va.us Fri Jul 23 18:30:17 1999 From: fdrake at cnri.reston.va.us (Fred L. 
Drake) Date: Fri, 23 Jul 1999 12:30:17 -0400 (EDT) Subject: [Python-Dev] I'd like list.pop to accept an optional second In-Reply-To: <14232.38398.818044.283682@52.chicago-35-40rs.il.dial-access.att.net> References: <000101bed4b8$9395eda0$2c2d2399@tim> <37982783.E60E9941@lemburg.com> <14232.34105.421424.838212@weyr.cnri.reston.va.us> <14232.38398.818044.283682@52.chicago-35-40rs.il.dial-access.att.net> Message-ID: <14232.39065.687719.135590@weyr.cnri.reston.va.us> Skip Montanaro writes: > Why not go all the way to Perl with > > top = list.pop() unless IndexError Trying to kill me, Skip? ;-) Actually, the semantics are different. If we interpret that using the Perl semantics for "unless", don't we have the same thing as: if not IndexError: top = list.pop() Since IndexError will normally be a non-empty string or a class, this is pretty much: if 0: top = list.pop() which certainly isn't quite as interesting. ;-) -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From skip at mojam.com Fri Jul 23 22:23:12 1999 From: skip at mojam.com (Skip Montanaro) Date: Fri, 23 Jul 1999 15:23:12 -0500 (CDT) Subject: [Python-Dev] I'd like list.pop to accept an optional second In-Reply-To: <14232.39065.687719.135590@weyr.cnri.reston.va.us> References: <000101bed4b8$9395eda0$2c2d2399@tim> <37982783.E60E9941@lemburg.com> <14232.34105.421424.838212@weyr.cnri.reston.va.us> <14232.38398.818044.283682@52.chicago-35-40rs.il.dial-access.att.net> <14232.39065.687719.135590@weyr.cnri.reston.va.us> Message-ID: <14232.52576.746910.229435@227.chicago-26-27rs.il.dial-access.att.net> Fred> Skip Montanaro writes: >> Why not go all the way to Perl with >> >> top = list.pop() unless IndexError Fred> Trying to kill me, Skip? ;-) Nope, just a flesh wound. I'll wait for the resulting infection to really do you in. ;-) Fred> Actually, the semantics are different. 
If we interpret that using Fred> the Perl semantics for "unless", don't we have the same thing as: Yes, but the flavor is the same. Reading Perl code that uses the unless keyword always seemed counterintuitive to me. Something like x = y unless foo; always reads to me like, "Assign y to x. No, wait a minute. I forgot something. Only do that if foo isn't true." What was so bad about if (!foo) { x = y; } That was my initial reaction to the use of the trailing except. We argue a lot in the Python community about whether or not a proposed language feature increases the expressive power of the language or not (which is a good idea in my opinion). The Perl community has apparently never been afflicted with that disease. smiles all 'round... Skip From tismer at appliedbiometrics.com Sat Jul 24 01:36:33 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Sat, 24 Jul 1999 01:36:33 +0200 Subject: [Python-Dev] continuations for the curious Message-ID: <3798FC81.A57E9CFE@appliedbiometrics.com> Howdy, my modules are nearly ready. I will be out of my office for two weeks, but had no time to finalize and publish yet. Stackless Python has reached what I wanted it to reach: A continuation can be saved at every opcode. The continuationmodule has been shrunk heavily. Some extension is still needed, continuations are still frames, but they can be picked like Sam wanted it originally. Sam, I'm pretty sure this is more than enough for coroutines. Just have a look at getpcc(), this is now very easy. All involved frames are armed so that they *can* save themselves, but will do so just if necessary. The cheapest solution I could think of, no other optimization is necessary. If your coroutine functions like to swap two frames, and if they manage to do so that the refcount of the target stays at one, no extra frame will be generated. That's it, really. If someone wants to play, get the stackless module, replace ceval.c, and build continuationmodule.c as a dll or whatever. 
testct.py contains a lot of crap. The first implementation of class coroutine is working right. The second one is wrong by concept. later - chris ftp://ftp.pns.cc/pub/veryfar.zip -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From rushing at nightmare.com Sat Jul 24 03:52:00 1999 From: rushing at nightmare.com (Sam Rushing) Date: Fri, 23 Jul 1999 18:52:00 -0700 (PDT) Subject: [Python-Dev] continuations for the curious In-Reply-To: <3798FC81.A57E9CFE@appliedbiometrics.com> References: <3798FC81.A57E9CFE@appliedbiometrics.com> Message-ID: <14233.7163.919863.981628@seattle.nightmare.com> Hey Chris, I think you're missing some include files from 'veryfar.zip'? ceval.c: In function `PyEval_EvalCode': ceval.c:355: warning: return makes pointer from integer without a cast ceval.c: In function `PyEval_EvalCode_nr': ceval.c:375: `Py_UnwindToken' undeclared (first use this function) ceval.c:375: (Each undeclared identifier is reported only once ceval.c:375: for each function it appears in.) ceval.c: In function `eval_code2_setup': ceval.c:490: structure has no member named `f_execute' ceval.c:639: structure has no member named `f_first_instr' ceval.c:640: structure has no member named `f_next_instr' -Sam From tim_one at email.msn.com Sat Jul 24 04:16:16 1999 From: tim_one at email.msn.com (Tim Peters) Date: Fri, 23 Jul 1999 22:16:16 -0400 Subject: [Python-Dev] I'd like list.pop to accept an optional second In-Reply-To: <37984CEA.1DF062F6@lemburg.com> Message-ID: <000c01bed57a$82a79620$832d2399@tim> > ... 
> Hmm, so maybe we're not after lists after all: maybe what > we need is access to the global interpreter lock in Python, > so that we can write: > > sys.lock.acquire() > if list: > obj = list.pop() > else: > obj = default > sys.lock.release() The thread attempting the sys.lock.acquire() necessarily already owns the global lock, so the attempt to acquire it is a guaranteed deadlock -- arguably not helpful . > Or maybe we need some general lock in the thread module for these > purposes... don't know. It's been some time since I used > threads. Jim could easily allocate a list lock for this purpose if that's what he wanted; and wrap it in a class with a nice interface too. He'd eventually end up with the std Queue.py module, though. But if he doesn't want the overhead of an exception when the queue is empty, he sure doesn't want the comparatively huge overhead of a (any flavor of) lock either (which may drag the OS into the picture). There's nothing wrong with wanting a fast thread-safe queue! I just don't like the idea of adding an otherwise-ugly new gimmick to core lists for it; also have to wonder about Jim's larger picture if he's writing stuff in Python that's *so* time-critical that the overhead of an ordinary exception from time to time is a genuine problem. The verbosity of the alternative can be hidden in a lock-free class or function, if it's the clumsiness instead of the time that's grating. From mal at lemburg.com Sat Jul 24 10:38:59 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Sat, 24 Jul 1999 10:38:59 +0200 Subject: [Python-Dev] I'd like list.pop to accept an optional second References: <000c01bed57a$82a79620$832d2399@tim> Message-ID: <37997BA3.B5AB23B4@lemburg.com> Tim Peters wrote: > > > ... 
> > Hmm, so maybe we're not after lists after all: maybe what > > we need is access to the global interpreter lock in Python, > > so that we can write: > > > > sys.lock.acquire() > > if list: > > obj = list.pop() > > else: > > obj = default > > sys.lock.release() > > The thread attempting the sys.lock.acquire() necessarily already owns the > global lock, so the attempt to acquire it is a guaranteed deadlock -- > arguably not helpful . True, sys.lock.acquire() would have to set a flag *not* to release the lock until the next call to sys.lock.release(), which then clears this flag again. Sort of a lock for the unlocking the lock ;-) Could this work, or am I having a mind twister somewhere in there again ? -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 160 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From gmcm at hypernet.com Sat Jul 24 14:41:39 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Sat, 24 Jul 1999 07:41:39 -0500 Subject: [Python-Dev] I'd like list.pop to accept an optional second In-Reply-To: <37997BA3.B5AB23B4@lemburg.com> Message-ID: <1279306201-30642004@hypernet.com> M.-A. Lemburg writes: > True, sys.lock.acquire() would have to set a flag *not* to release > the lock until the next call to sys.lock.release(), which then > clears this flag again. Sort of a lock for the unlocking the lock > ;-) > > Could this work, or am I having a mind twister somewhere in > there again ? Sounds like a critical section to me. On Windows, those are lightweight and very handy. You can build one with Python thread primitives, but unfortunately, they come out on the heavy side. Locks come in 4 types, categorized by whether they can be released only by the owning thread, and whether they can be acquired recursively. 
The interpreter lock is in the opposite quadrant from a
critical section, so "sys.lock.freeze()" and "sys.lock.thaw()" have
little chance of having an efficient implementation on any platform.

A shame. That would be pretty cool.

- Gordon

From tim_one at email.msn.com Sun Jul 25 20:57:50 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Sun, 25 Jul 1999 14:57:50 -0400
Subject: [Python-Dev] End of the line
In-Reply-To: <000c01becdac$d2ad6300$7d9e2299@tim>
Message-ID: <000001bed6cf$984cd8e0$b02d2399@tim>

[Tim, notes that Perl line-at-a-time text mode input runs 3x faster than
 Python's on his platform]

And much to my surprise, it turns out Perl reads lines a character at a time
too! And they do not reimplement stdio. But they do cheat.

Perl's internals are written on top of an abstract IO API, with "PerlIO *"
instead of "FILE *", "PerlIO_tell(PerlIO *)" instead of "ftell(FILE*)", and
so on. Nothing surprising in the details, except maybe that stdin is
modeled as a function "PerlIO *PerlIO_stdin(void)" instead of as global data
(& ditto for stdout/stderr).

The usual *implementation* of these guys is as straight macro substitution
to the corresponding C stdio call. It's possible to implement them some
other way, but I don't see anything in the source that suggests anyone has
done so, except possibly to build it all on AT&T's SFIO lib.

So where's the cheating? In these API functions:

    int   PerlIO_has_base(PerlIO *);
    int   PerlIO_has_cntptr(PerlIO *);
    int   PerlIO_canset_cnt(PerlIO *);

    char *PerlIO_get_ptr(PerlIO *);
    int   PerlIO_get_cnt(PerlIO *);
    void  PerlIO_set_cnt(PerlIO *,int);
    void  PerlIO_set_ptrcnt(PerlIO *,char *,int);
    char *PerlIO_get_base(PerlIO *);
    int   PerlIO_get_bufsiz(PerlIO *);

In almost all platform stdio implementations, the C FILE struct has members
that may vary in name but serve the same purpose: an internal buffer, and
some way (pointer or offset) to get at "the next" buffer character.
The guys above are usually just (after layers & layers of config stuff sets it up) macros that expand into the platform's internal way of spelling these things. For example, the count member is spelled under Windows as fp->_cnt under VC, or as fp->level under Borland. The payoff is in Perl's sv_gets function, in file sv.c. This is long and very complicated, but at its core has a fast inner loop that copies characters (provided the PerlIO_has/canXXX functions say it's possible) directly from the stdio buffer into a Perl string variable -- in the way a platform fgets function *would* do it if it bothered to optimize fgets. In my experience, platforms usually settle for the same kind of fgetc/EOF?/newline? loop Python uses, as if fgets were a stdio client rather than a stdio primitive. Perl's keeps everything in registers inside the loop, updates the FILE struct members only at the boundaries, and doesn't check for EOF except at the boundaries (so long as the buffer has unread stuff in it, you can't be at EOF). If the stdio buffer is exhausted before the input terminator is seen (Perl has "input record separator" and "paragraph mode" gimmicks, so it's hairier than just looking for \n), it calls PerlIO_getc once to force the platform to refill the buffer, and goes back to the screaming loop. Major hackery, but major payoff (on most platforms) too. The abstract I/O layer is a fine idea regardless. The sad thing is that the real reason Perl is so fast here is that platform fgets is so needlessly slow. perl-input-is-faster-than-c-input-ly y'rs - tim From tim_one at email.msn.com Mon Jul 26 06:58:31 1999 From: tim_one at email.msn.com (Tim Peters) Date: Mon, 26 Jul 1999 00:58:31 -0400 Subject: [Python-Dev] I'd like list.pop to accept an optional second In-Reply-To: <37982783.E60E9941@lemburg.com> Message-ID: <000601bed723$81bb1020$492d2399@tim> [M.-A. Lemburg] > ... > Oh well, forget the whole idea then. 
list.pop() is really not
> needed that often anyways to warrant the default arg thing, IMHO.
> dict.get() and getattr() have the default arg as performance
> enhancement

I like their succinctness too;

    count = dict.get(key, 0)

is helpfully "slimmer" than either of

    try:
        count = dict[key]
    except KeyError:
        count = 0

or

    count = 0
    if dict.has_key(key):
        count = dict[key]

> and I believe that you wouldn't get all that much better performance
> on average by adding a second optional argument to list.pop().

I think you wouldn't at *all*, except in Jim's novel case. That is, when a
list is empty, it's usually the signal to get out of a loop, and you can
either test

    if list:
        item = list.pop()
    else:
        break

today or

    item = list.pop(-1, marker)
    if item is marker:
        break

tomorrow. The second way doesn't buy anything to my eye, and the first way
is very often the pretty

    while list:
        item = list.pop()

if-it-weren't-for-jim's-use-i'd-see-no-use-at-all-ly y'rs - tim

From mal at lemburg.com Mon Jul 26 10:31:01 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 26 Jul 1999 10:31:01 +0200
Subject: [Python-Dev] Thread locked sections
References: <1279306201-30642004@hypernet.com>
Message-ID: <379C1CC5.51A89688@lemburg.com>

Gordon McMillan wrote:
>
> M.-A. Lemburg writes:
>
> > True, sys.lock.acquire() would have to set a flag *not* to release
> > the lock until the next call to sys.lock.release(), which then
> > clears this flag again. Sort of a lock for the unlocking the lock
> > ;-)
> >
> > Could this work, or am I having a mind twister somewhere in
> > there again ?
>
> Sounds like a critical section to me. On Windows, those are
> lightweight and very handy. You can build one with Python thread
> primitives, but unfortunately, they come out on the heavy side.
>
> Locks come in 4 types, categorized by whether they can be released
> only by the owning thread, and whether they can be acquired
> recursively.
The interpreter lock is in the opposite quadrant from a > critical section, so "sys.lock.freeze()" and "sys.lock.thaw()" have > little chance of having an efficient implementation on any platform. Actually, I think all that's needed is another global like the interpreter_lock in ceval.c. Since this lock is only accessed via abstract functions, I presume the unlock flag could easily be added. The locking section would only focus on Python, though: other threads could still be running provided they don't execute Python code, e.g. write data to a spooler. So it's not really the equivalent of a critical section as the one you can define in C. PS: I changed the subject line... hope this doesn't kill the thread ;) -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 158 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From Brian at digicool.com Mon Jul 26 15:46:00 1999 From: Brian at digicool.com (Brian Lloyd) Date: Mon, 26 Jul 1999 09:46:00 -0400 Subject: [Python-Dev] End of the line Message-ID: <613145F79272D211914B0020AFF6401914DC02@gandalf.digicool.com> > [Tim, notes that Perl line-at-a-time text mode input runs 3x > faster than > Python's on his platform] > > And much to my surprise, it turns out Perl reads lines a > character at a time > too! And they do not reimplement stdio. But they do cheat. > > [some notes on the cheating and PerlIO api snipped] > > The usual *implementation* of these guys is as straight macro > substitution > to the corresponding C stdio call. It's possible to > implement them some > other way, but I don't see anything in the source that > suggests anyone has > done so, except possibly to build it all on AT&T's SFIO lib. 
Hmm - speed bonuses not withstanding, an implementation of such a beast in the Python sources would've helped a lot to reduce the ugly hairy gymnastics required to get Python going on Win CE, where (until very recently) there was no concept of most of the things you expect to find in stdio... Brian Lloyd brian at digicool.com Software Engineer 540.371.6909 Digital Creations http://www.digicool.com From mhammond at skippinet.com.au Tue Jul 27 00:49:56 1999 From: mhammond at skippinet.com.au (Mark Hammond) Date: Tue, 27 Jul 1999 08:49:56 +1000 Subject: [Python-Dev] Thread locked sections In-Reply-To: <379C1CC5.51A89688@lemburg.com> Message-ID: <002801bed7b9$2fa8b620$0801a8c0@bobcat> > Actually, I think all that's needed is another global like > the interpreter_lock in ceval.c. Since this lock is only > accessed via abstract functions, I presume the unlock flag could > easily be added. Well, my personal opinion is that this is really quite wrong. The most obvious thing to me is that we are exposing an implementation detail we all would dearly like to see removed one day - the global interpreter lock. But even if we ignore that, it seems to me that you are describing an application abstraction, not a language abstraction. This thread started with Jim wanting a thread-safe, atomic list operation. This is not an unusual requirement (ie, a thread-safe, atomic operation), so languages give you access to primitives that let you build this. To my mind, you are asking for the equivilent of a C function that says "suspend all threads except me, cos Im doing something _really_ important". C does not provide that, and I have never thought it should. As Gordon said, Win32 has critical sections, but these are really just lightweight locks. I really dont see how Python is different - it gives you all the tools you need to build these abstractions. I really dont see what you are after that can not be done with a lock. 
If the performance is a problem, then to paraphrase the Timbot, it may be questionable if you are using Python appropriately in this case. Mark. From tim_one at email.msn.com Tue Jul 27 03:41:17 1999 From: tim_one at email.msn.com (Tim Peters) Date: Mon, 26 Jul 1999 21:41:17 -0400 Subject: [Python-Dev] End of the line In-Reply-To: <613145F79272D211914B0020AFF6401914DC02@gandalf.digicool.com> Message-ID: <000b01bed7d1$1eac5620$eea22299@tim> [Tim, on the cheating PerlIO API] [Brian Lloyd] > Hmm - speed bonuses not withstanding, an implementation of > such a beast in the Python sources would've helped a lot to > reduce the ugly hairy gymnastics required to get Python going > on Win CE, where (until very recently) there was no concept > of most of the things you expect to find in stdio... I don't think it would have helped you there. If e.g. ftell is missing, it's no easier to implement it yourself under the name "PerlIO_ftell" than under the name "ftell" ... Back before Larry Wall got it into in his head that Perl is a grand metaphor for freedom and creativity (or whatever), he justifiably claimed that Perl's great achievement was in taming Unix. Which it did! Perl essentially defined yet a 537th variation of libc/shell/tool semantics, but in a way that worked the same across its 536 Unix hosts. The PerlIO API is a great help with *that*: if a platform is a little off kilter in its implementation of one of these functions, Perl can use a corresponding PerlIO wrapper to hide the shortcoming in a platform-specific file, and the rest of Perl blissfully assumes everything works the same everywhere. That's a good, cool idea. Ironically, Perl does more to hide gratuitous platform differences here than Python does! But it's just a pile of names if you've got no stdio to build on. 
let's-model-PythonIO-on-the-win32-api-ly y'rs - tim From mhammond at skippinet.com.au Tue Jul 27 04:13:09 1999 From: mhammond at skippinet.com.au (Mark Hammond) Date: Tue, 27 Jul 1999 12:13:09 +1000 Subject: [Python-Dev] End of the line In-Reply-To: <000b01bed7d1$1eac5620$eea22299@tim> Message-ID: <002a01bed7d5$93a4a780$0801a8c0@bobcat> > let's-model-PythonIO-on-the-win32-api-ly y'rs - tim Interestingly, this raises a point worth mentioning sans-wink :-) Win32 has quite a nice concept that file handles (nearly all handles really) are "waitable". Indeed, in the Win32 world, this feature usually prevents me from using the "threading" module - I need to wait on objects other than threads or locks (usually files, but sometimes child processes). I also usually need a "wait for the first one of these objects", which threading doesnt provide, but that is a digression... What Im getting at is that a Python IO model should maybe go a little further than "tradtional" IO - asynchronous IO and synchronisation capabilities should also be specified. Of course, these would be optional, but it would be excellent if a platform could easily slot into pre-defined Python semantics if possible. Is this reasonable, or really simply too hard to abstract in the manner I an talking!? Mark. From mal at lemburg.com Tue Jul 27 10:31:27 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Tue, 27 Jul 1999 10:31:27 +0200 Subject: [Python-Dev] Thread locked sections References: <002801bed7b9$2fa8b620$0801a8c0@bobcat> Message-ID: <379D6E5F.B29251EF@lemburg.com> Mark Hammond wrote: > > > Actually, I think all that's needed is another global like > > the interpreter_lock in ceval.c. Since this lock is only > > access From mal at lemburg.com Tue Jul 27 11:23:05 1999 From: mal at lemburg.com (M.-A. 
Lemburg) Date: Tue, 27 Jul 1999 11:23:05 +0200 Subject: [Python-Dev] Thread locked sections References: <002801bed7b9$2fa8b620$0801a8c0@bobcat> Message-ID: <379D7A79.DB97B2C@lemburg.com> [The previous mail got truncated due to insufficient disk space; here is a summary...] Mark Hammond wrote: > > > Actually, I think all that's needed is another global like > > the interpreter_lock in ceval.c. Since this lock is only > > accessed via abstract functions, I presume the unlock flag could > > easily be added. > > Well, my personal opinion is that this is really quite wrong. The most > obvious thing to me is that we are exposing an implementation detail we all > would dearly like to see removed one day - the global interpreter lock. > > But even if we ignore that, it seems to me that you are describing an > application abstraction, not a language abstraction. This thread started > with Jim wanting a thread-safe, atomic list operation. This is not an > unusual requirement (ie, a thread-safe, atomic operation), so languages > give you access to primitives that let you build this. > > To my mind, you are asking for the equivilent of a C function that says > "suspend all threads except me, cos Im doing something _really_ important". > C does not provide that, and I have never thought it should. As Gordon > said, Win32 has critical sections, but these are really just lightweight > locks. I really dont see how Python is different - it gives you all the > tools you need to build these abstractions. > > I really dont see what you are after that can not be done with a lock. If > the performance is a problem, then to paraphrase the Timbot, it may be > questionable if you are using Python appropriately in this case. The locked section may not be leading in the right direction, but it surely helps in situations where you cannot otherwise enforce useage of an object specific lock, e.g. for builtin file objects (some APIs insist on getting the real thing, not a thread safe wrapper). 
Here is a hack that lets you do much the same with an unpatched
Python interpreter:

    sys.setcheckinterval(sys.maxint)  # *)
    # >=10 Python OPs to flush the ticker counter and have the new
    # check interval setting take effect:
    0==0; 0==0; 0==0; 0==0
    try:
        ...lock section...
    finally:
        sys.setcheckinterval(10)

*) sys.setcheckinterval should really return the previous value
   so that we can reset the value to the original one afterwards.

Note that the lock section may not call code which uses the
Py_*_ALLOW_THREADS macros.

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 157 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From rushing at nightmare.com Tue Jul 27 12:33:03 1999
From: rushing at nightmare.com (Sam Rushing)
Date: Tue, 27 Jul 1999 03:33:03 -0700 (PDT)
Subject: [Python-Dev] continuations for the curious
In-Reply-To: <3798FC81.A57E9CFE@appliedbiometrics.com>
References: <3798FC81.A57E9CFE@appliedbiometrics.com>
Message-ID: <14237.33980.82091.445607@seattle.nightmare.com>

I've been playing for a bit, trying to write my own coroutine class
(obeying the law of "you won't understand it until you write it
yourself") based on one I've worked up for 'lunacy'.

I think I have it, let me know what you think:

    >>> from coroutine import *
    >>> cc = coroutine (counter, 100, 10)
    >>> cc.resume()
    100
    >>> cc.resume()
    110
    >>>

Differences:

1) callcc wraps the 'escape frame' with a lambda, so that it can be
   invoked like any other function. this actually simplifies the
   bootstrapping, because starting the function is identical to
   resuming it.

2) the coroutine object keeps track of who resumed it, so that it can
   resume the caller without having to know who it is.

3) the coroutine class keeps track of which is the currently 'active'
   coroutine. It's currently a class variable, but I think this can
   lead to leaks, so it might have to be made a global.
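
For readers without the scrubbed coroutine.py attachment, here is a rough sketch that reproduces only the interactive session above. It is not Sam's continuation-based class: it assumes present-day Python generators (which are weaker, "semi-coroutines"), it guesses at what counter does from the printed output, and it implements none of the three differences listed above.

```python
class coroutine:
    """Hypothetical generator-backed stand-in for the class in the session above."""

    def __init__(self, func, *args):
        # Calling a generator function only creates the generator; nothing runs yet.
        self._gen = func(*args)

    def resume(self):
        # Run until the next yield and hand back the yielded value.
        return next(self._gen)


def counter(start, step):
    """Assumed behaviour of 'counter': yield start, start+step, ... forever."""
    value = start
    while True:
        yield value
        value += step


cc = coroutine(counter, 100, 10)
first = cc.resume()
second = cc.resume()
print(first, second)
```

With these assumptions, the two resume() calls give 100 and 110, matching the session.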
+-----------------------------------------------------------------
| For those folks (like me) that were confused about where to get
| all the necessary files for building the latest Stackless Python,
| here's the procedure:
|
| 1) unwrap a fresh copy of 1.5.2
| 2) unzip
|    http://www.pns.cc/anonftp/pub/stackless_990713.zip
|    on top of it
| 3) then, unzip
|    ftp://ftp.pns.cc/pub/veryfar.zip
|    on top of that
| 4) add "continuation continuationmodule.c" to Modules/Setup

-Sam

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: coroutine.py
URL:

From jack at oratrix.nl Tue Jul 27 14:04:39 1999
From: jack at oratrix.nl (Jack Jansen)
Date: Tue, 27 Jul 1999 14:04:39 +0200
Subject: [Python-Dev] End of the line
In-Reply-To: Message by "Mark Hammond" , Tue, 27 Jul 1999 12:13:09 +1000 , <002a01bed7d5$93a4a780$0801a8c0@bobcat>
Message-ID: <19990727120440.13D5F303120@snelboot.oratrix.nl>

> What Im getting at is that a Python IO model should maybe go a little
> further than "tradtional" IO - asynchronous IO and synchronisation
> capabilities should also be specified. Of course, these would be optional,
> but it would be excellent if a platform could easily slot into pre-defined
> Python semantics if possible.

What Python could do with reasonable ease is a sort of "promise" model, where
an I/O operation returns an object that waits for the I/O to complete upon
access or destruction.

Something like

    def foo():
        obj = stdin.delayed_read()
        obj2 = stdout.delayed_write("data")
        do_lengthy_computation()
        data = obj.get()    # Here we wait for the read to complete
        del obj2            # Here we wait for the write to complete.

This gives a fairly nice programming model.
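
A minimal sketch of the "promise" idea above in present-day Python, assuming concurrent.futures stands in for the proposed delayed_read(). The names DelayedRead and slow_read are hypothetical and exist only for illustration. The sketch covers only the wait-on-access half; waiting on destruction is left out.

```python
import time
from concurrent.futures import ThreadPoolExecutor

_pool = ThreadPoolExecutor(max_workers=2)


class DelayedRead:
    """Hypothetical promise object: starts the read at once, blocks on access."""

    def __init__(self, read_func, *args):
        # submit() returns immediately; the read runs in a worker thread.
        self._future = _pool.submit(read_func, *args)

    def get(self):
        # Waits here until the read has completed, then returns its result.
        return self._future.result()


def slow_read(text, delay=0.05):
    """Stand-in for a blocking read (assumption: real code would do I/O)."""
    time.sleep(delay)
    return text


def foo():
    obj = DelayedRead(slow_read, "data from stdin")
    total = sum(range(1000))   # the "lengthy computation" overlaps the read
    data = obj.get()           # here we wait for the read to complete
    return total, data


print(foo())
```

The caller never touches a lock or a thread directly; all synchronisation hides inside get().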
-- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From mhammond at skippinet.com.au Tue Jul 27 14:10:56 1999 From: mhammond at skippinet.com.au (Mark Hammond) Date: Tue, 27 Jul 1999 22:10:56 +1000 Subject: [Python-Dev] Thread locked sections In-Reply-To: <379D7A79.DB97B2C@lemburg.com> Message-ID: <004201bed829$16211c40$0801a8c0@bobcat> [Marc writes] > The locked section may not be leading in the right direction, > but it surely helps in situations where you cannot otherwise > enforce useage of an object specific lock, e.g. for builtin > file objects (some APIs insist on getting the real thing, not > a thread safe wrapper). Really, all this boils down to is that you want a Python-ish critical section - ie, a light-weight lock. This presumably would be desirable if it could be shown Python locks are indeed "heavy" - I know that from the C POV they may be considered as such, but I havent seen many complaints about lock speed from Python. So in an attempt to get _some_ evidence, I wrote a test program that used the Queue module to append 10000 integers then remove them all. I then hacked the queue module to remove all locking, and ran the same test. The results were 2.4 seconds for the non-locking version, vs 3.8 for the standard version. Without time (or really inclination ) to take this further, it _does_ appear a native Python "critical section" could indeed save a few milli-seconds for a few real-world apps. So if we ignore the implementation details Marc started spelling, does the idea of a Python "critical section" appeal? Could simply be a built-in way of saying "no other _Python_ threads should run" (and of-course the "allow them again"). The semantics could be simply to ensure the Python program integrity - it need say nothing about the Python internal "state" as such. Mark. 
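
The test program behind those timings was not posted. The following is a rough stand-in, not Mark's code: it assumes a plain deque in place of the hacked Queue module and present-day Python, and simply times 10000 appends and removals with and without a threading.Lock around each operation. Absolute numbers will differ widely from the 1999 figures.

```python
import threading
import time
from collections import deque


def run(n, lock=None):
    """Append n integers to a deque, then remove them all.

    Returns (elapsed seconds, number of items left over).
    """
    items = deque()
    start = time.perf_counter()
    for i in range(n):
        if lock is not None:
            with lock:
                items.append(i)
        else:
            items.append(i)
    while items:
        if lock is not None:
            with lock:
                items.popleft()
        else:
            items.popleft()
    return time.perf_counter() - start, len(items)


locked_time, locked_left = run(10000, threading.Lock())
plain_time, plain_left = run(10000)
print("with lock: %.4fs  without: %.4fs" % (locked_time, plain_time))
```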
From mal at lemburg.com Tue Jul 27 14:27:55 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Tue, 27 Jul 1999 14:27:55 +0200
Subject: [Python-Dev] continuations for the curious
References: <3798FC81.A57E9CFE@appliedbiometrics.com> <14237.33980.82091.445607@seattle.nightmare.com>
Message-ID: <379DA5CB.B3619365@lemburg.com>

Sam Rushing wrote:
>
> +-----------------------------------------------------------------
> | For those folks (like me) that were confused about where to get
> | all the necessary files for building the latest Stackless Python,
> | here's the procedure:

Thanks... this guide made me actually try it ;-)

> |
> | 1) unwrap a fresh copy of 1.5.2
> | 2) unzip
> |    http://www.pns.cc/anonftp/pub/stackless_990713.zip
> |    on top of it
> | 3) then, unzip
> |    ftp://ftp.pns.cc/pub/veryfar.zip
> |    on top of that

It seems that Christian forgot the directory information in this ZIP
file. You have to move the continuationmodule.c file to Modules/ by
hand.

> | 4) add "continuation continuationmodule.c" to Modules/Setup

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 157 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From mhammond at skippinet.com.au Tue Jul 27 16:45:12 1999
From: mhammond at skippinet.com.au (Mark Hammond)
Date: Wed, 28 Jul 1999 00:45:12 +1000
Subject: [Python-Dev] End of the line
In-Reply-To: <19990727120440.13D5F303120@snelboot.oratrix.nl>
Message-ID: <004401bed83e$a9252b70$0801a8c0@bobcat>

[Jack seems to like an asynch IO model]

> def foo():
>     obj = stdin.delayed_read()
>     obj2 = stdout.delayed_write("data")
>     do_lengthy_computation()
>     data = obj.get()    # Here we wait for the read to complete
>     del obj2            # Here we wait for the write to
> complete.
>
> This gives a fairly nice programming model.

Indeed.
Taking this a little further, I come up with something like:

    inlock = threading.Lock()
    buffer = stdin.delayed_read(inlock)

    outlock = threading.Lock()
    stdout.delayed_write(outlock, "The data")

    fired = threading.Wait(inlock, outlock) # new fn :-)

    if fired is inlock: # etc.

The idea is we can make everything wait on a single lock abstraction.
threading.Wait() could accept lock objects, thread objects, Sockets, etc.

Obviously a bit to work out, but it does make an appealing model. OTOH, I
wonder how it fits with continuations etc. Not too badly from my weak
understanding. May be an interesting convergence!

Mark.

From jack at oratrix.nl Tue Jul 27 17:31:13 1999
From: jack at oratrix.nl (Jack Jansen)
Date: Tue, 27 Jul 1999 17:31:13 +0200
Subject: [Python-Dev] End of the line
In-Reply-To: Message by "Mark Hammond" , Wed, 28 Jul 1999 00:45:12 +1000 , <004401bed83e$a9252b70$0801a8c0@bobcat>
Message-ID: <19990727153113.4A2F1303120@snelboot.oratrix.nl>

> [Jack seems to like an asynch IO model]
>
> > def foo():
> >     obj = stdin.delayed_read()
> >     obj2 = stdout.delayed_write("data")
> >     do_lengthy_computation()
> >     data = obj.get()    # Here we wait for the read to complete
> >     del obj2            # Here we wait for the write to
> > complete.
> >
> > This gives a fairly nice programming model.
>
> Indeed. Taking this a little further, I come up with something like:
>
>     inlock = threading.Lock()
>     buffer = stdin.delayed_read(inlock)
>
>     outlock = threading.Lock()
>     stdout.delayed_write(outlock, "The data")
>
>     fired = threading.Wait(inlock, outlock) # new fn :-)
>
>     if fired is inlock: # etc.

I think this is exactly what I _didn't_ want:-)

I'd like the delayed read to return an object that will automatically wait
when I try to get the data from it, and the delayed write object to
automatically wait when I garbage-collect it.

Of course, there's no reason why you couldn't also wait on these objects (or,
on unix, pass them to select(), or whatever).
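
The two views need not conflict. As a hedged illustration in present-day Python (not the threading.Wait() proposed above, which does not exist), concurrent.futures lets the same object serve both as a promise and as something to hand to a "wait for the first of these" call. The slow helper and its delays are made up for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED


def slow(label, delay):
    """Stand-in for an I/O operation that takes `delay` seconds."""
    time.sleep(delay)
    return label


with ThreadPoolExecutor(max_workers=2) as pool:
    reader = pool.submit(slow, "read finished", 0.05)
    writer = pool.submit(slow, "write finished", 0.5)
    # Block until at least one of the two operations has completed.
    done, pending = wait([reader, writer], return_when=FIRST_COMPLETED)
    fired = "reader" if reader in done else "writer"
    first_result = reader.result() if reader in done else writer.result()

print(fired, first_result)
```

Calling result() on either future afterwards still behaves like the promise-style get(): it blocks only if that operation has not finished yet.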
On second thought the method of the delayed read should be called read() in stead of get(), of course. -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From mhammond at skippinet.com.au Wed Jul 28 00:21:19 1999 From: mhammond at skippinet.com.au (Mark Hammond) Date: Wed, 28 Jul 1999 08:21:19 +1000 Subject: [Python-Dev] End of the line In-Reply-To: <19990727153113.4A2F1303120@snelboot.oratrix.nl> Message-ID: <000c01bed87e$5af42060$0801a8c0@bobcat> [I missed Jack's point] > I think this is exactly what I _didn't_ want:-) > > I'd like the delayed read to return an object that will > automatically wait > when I try to get the data from it, and the delayed write object to > automatically wait when I garbage-collect it. OK - that is fine. My driving requirement was that I be able to wait on _multiple_ files at the same time - ie, I dont know which one will complete first. There is no reason then why your initial suggestion can not satisfy my requirement, as long as the "buffer type object" returned from read is itself waitable. I agree there is no driving need for a seperate buffer type object and seperate waitable object necessarily. [OTOH, your scheme could be simply built on top of my scheme as a framework] Unfortunately, this doesnt seem to have grabbed anyone elses interest.. Mark. From da at ski.org Wed Jul 28 23:46:21 1999 From: da at ski.org (David Ascher) Date: Wed, 28 Jul 1999 14:46:21 -0700 (Pacific Daylight Time) Subject: [Python-Dev] Tcl news Message-ID: 8.2b1 is released: Some surprising news: they now use cygwin tools to do the windows build. 
Not surprising news: they still haven't incorporated some bug fixes I submitted eons ago =) http://www.scriptics.com/software/relnotes/tcl8.2b1 --david From tim_one at email.msn.com Thu Jul 29 05:10:40 1999 From: tim_one at email.msn.com (Tim Peters) Date: Wed, 28 Jul 1999 23:10:40 -0400 Subject: [Python-Dev] RE: delayed I/O; multiple waits In-Reply-To: <000c01bed87e$5af42060$0801a8c0@bobcat> Message-ID: <001201bed96f$f06990c0$71a22299@tim> [Mark Hammond] > ... > Unfortunately, this doesnt seem to have grabbed anyone elses interest.. You lost me when you said it should be optional -- that's fine for an extension module, but it sounded like you wanted this to somehow be part of the language core. If WaitForMultipleObjects (which is what you *really* want ) is thought to be a cool enough idea to be in the core, we should think about how to implement it on non-Win32 platforms too. needs-more-words-ly y'rs - tim From mhammond at skippinet.com.au Thu Jul 29 05:52:47 1999 From: mhammond at skippinet.com.au (Mark Hammond) Date: Thu, 29 Jul 1999 13:52:47 +1000 Subject: [Python-Dev] RE: delayed I/O; multiple waits In-Reply-To: <001201bed96f$f06990c0$71a22299@tim> Message-ID: <002e01bed975$d392d910$0801a8c0@bobcat> > You lost me when you said it should be optional -- that's fine for an > extension module, but it sounded like you wanted this to Cool - I admit I knew it was too vague, but left it in anyway. > the language core. If WaitForMultipleObjects (which is what > you *really* Sort-of. IMO, the threading module does need a WaitForMultipleObjects (whatever the spelling) but I also recall the discussion that this is not trivial. But what I _really_ want is an enhanced concept of "waitable" - threading can only wait on locks and threads. If we have this, the WaitForMultiple would become even more pressing, but they are not directly related. So, I see 2 issues, both of which usually prevent me personally from using the threading module in the real world. 
By "optional", I meant a way for a platform to slot into existing
"waitable" semantics. Win32 file operations are waitable. I dont really
want native win32 file operations to be in the core, but I would like some
standard way that, if possible, I could map the waitable semantics to
Python waitable semantics.

Thus, although the threading module knows nothing about win32 file objects
or handles, it would be nice if it could still wait on them.

> needs-more-words-ly y'rs - tim

Unfortunately, if I knew exactly what I wanted I would be asking for
implementation advice rather than grasping at straws :-)

Attempting to move from totally raw to half-baked, I suppose this is what I
had in mind:

* Platform optionally defines what a "waitable" object is, in the same way
  it now defines what a lock is. Locks are currently _required_ only with
  threading - waitables would never be required.
* Python defines a "waitable" protocol - eg, a new "tp_wait"/"__wait__"
  slot. If this slot is filled/function exists, it is expected to provide a
  "waitable" object or NULL/None.
* Threading support for platforms that support it define a tp_wait slot
  that maps the Thread ID to the "waitable object"
* Ditto lock support for the platform.
* Extensions such as win32 handles also provide this.
* Dream up extensions to file objects a-la Jack's idea. When a file is
  opened asynch, tp_wait returns non-NULL (via platform specific hooks), or
  NULL when opened sync (making it not waitable). Non-asynch platforms need
  zero work here - the asynch open fails, tp_wait slot never filled in.

Thus, for platforms that provide no extra asynch support, threading can
still only wait on threads and locks. The threading module could take
advantage of the new protocol thereby supporting any waitable object.

Like I said, only half-baked, but I think expresses a potentially workable
idea. Does this get closer to either a) explaining what I meant, or b)
confirming I am dribbling?
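
To make the bullet list above concrete, here is a toy sketch of how such a protocol might look from pure Python. Everything in it is hypothetical: no __wait__ slot exists in Python, the Waitable class and wait_for_any() are invented for illustration, and a threading.Event stands in for whatever the platform would call a "waitable". The polling loop is a simplification, not how a real implementation would block.

```python
import threading


class Waitable:
    """Hypothetical object exposing the proposed __wait__ hook."""

    def __init__(self, name):
        self.name = name
        self._event = threading.Event()

    def __wait__(self):
        # Return the underlying "waitable" (here a threading.Event), or None.
        return self._event

    def signal(self):
        self._event.set()


def wait_for_any(objects, poll=0.01):
    """Return the first object whose waitable fires (simple polling sketch)."""
    waitables = [(obj, obj.__wait__()) for obj in objects]
    if any(w is None for _, w in waitables):
        raise TypeError("object is not waitable")
    while True:
        for obj, event in waitables:
            if event.wait(poll):
                return obj


a, b = Waitable("file"), Waitable("child process")
threading.Timer(0.05, b.signal).start()   # b becomes ready a little later
fired = wait_for_any([a, b])
print(fired.name)
```

The point of the design is that wait_for_any() needs to know nothing about files, processes, or handles; it only asks each object for its waitable.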
Biggest problem I see is that the only platform that may take advantage is Windows, thereby making a platform specific solution (such as win32event I use now) perfectly reasonable. Maybe my focus should simply be on allowing win32event.WaitFor* to accept threading instances and standard Python lock objects!! Mark. From Brian at digicool.com Fri Jul 30 16:23:49 1999 From: Brian at digicool.com (Brian Lloyd) Date: Fri, 30 Jul 1999 10:23:49 -0400 Subject: [Python-Dev] RE: NT select.select? Message-ID: <613145F79272D211914B0020AFF6401914DC19@gandalf.digicool.com> > Is there some low limit on maximum number of sockets you can > have in the > Python-NT's select call? A program that happens to work > perfectly on Linux > seems to die on NT around 64(?) sockets to the 'too many file > descriptors > in call' error. > > Any portable ways to bypass it? > > -Markus Hi Markus, It turns out that NT has a default 64 fd limit on arguments to select(). The good news is that you can actually bump the limit up to whatever number you want by specifying a define when compiling python15.dll. If you have the ability to rebuild your python15.dll, you can add the define: FD_SETSIZE=1024 to the preprocessor options for the python15 project to raise the limit to 1024 fds. The default 64 fd limit is too low for anyone trying to run an async server that handles even a modest load, so I've submitted a bug report to python.org asking that the define above find its way into the next python release... Brian Lloyd brian at digicool.com Software Engineer 540.371.6909 Digital Creations http://www.digicool.com From guido at CNRI.Reston.VA.US Fri Jul 30 17:04:58 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 30 Jul 1999 11:04:58 -0400 Subject: [Python-Dev] RE: NT select.select? In-Reply-To: Your message of "Fri, 30 Jul 1999 10:23:49 EDT." 
<613145F79272D211914B0020AFF6401914DC19@gandalf.digicool.com> References: <613145F79272D211914B0020AFF6401914DC19@gandalf.digicool.com> Message-ID: <199907301504.LAA13183@eric.cnri.reston.va.us> > It turns out that NT has a default 64 fd limit on arguments to > select(). The good news is that you can actually bump the limit up > to whatever number you want by specifying a define when compiling > python15.dll. > > If you have the ability to rebuild your python15.dll, you can add > the define: > > FD_SETSIZE=1024 > > to the preprocessor options for the python15 project to raise the > limit to 1024 fds. > > The default 64 fd limit is too low for anyone trying to run > an async server that handles even a modest load, so I've > submitted a bug report to python.org asking that the define > above find its way into the next python release... Brian, (Also in response to your bug report.) I'm a little worried that upping the limit to 1024 would cause some performance problems if you're making a lot of select() calls. The select allocates three arrays of length FD_SETSIZE+3; each array item is 12 bytes. This is a total allocation of more than 36K for a meager select() call! And all that memory also has to be cleared by the FD_ZERO() call. If you actually have that many sockets, that's worth paying for (the socket objects themselves use up just as much memory, and your Python data structures for the sockets, no matter how small, are probably several times bigger), but for a more typical program, I see this as a lot of overhead. Is there a way that this can be done more dynamically, e.g. by making the set size as big as needed on windows but no bigger? (Before you suggest allocating that memory statically, remember it's possible to call select from multiple threads. Allocating 36K of thread-local space for each thread also doesn't sound too pleasant.) 
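
The "more than 36K" figure follows directly from the numbers given earlier in this message (three arrays, FD_SETSIZE+3 items each, 12 bytes per item). A small worked calculation, with a helper name made up for the purpose:

```python
def select_alloc_bytes(fd_setsize, arrays=3, extra=3, item_size=12):
    """Bytes allocated per select() call, using the figures quoted above."""
    return arrays * (fd_setsize + extra) * item_size


print(select_alloc_bytes(64))    # default limit: 3 * 67 * 12 = 2412
print(select_alloc_bytes(1024))  # raised limit: 3 * 1027 * 12 = 36972
```

So raising the limit from 64 to 1024 takes the per-call allocation from roughly 2.4K to roughly 36K, about a fifteenfold increase.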
--Guido van Rossum (home page: http://www.python.org/~guido/) From Brian at digicool.com Fri Jul 30 20:25:01 1999 From: Brian at digicool.com (Brian Lloyd) Date: Fri, 30 Jul 1999 14:25:01 -0400 Subject: [Python-Dev] RE: NT select.select? Message-ID: <613145F79272D211914B0020AFF6401914DC1E@gandalf.digicool.com> Guido wrote: > > Brian, > > (Also in response to your bug report.) I'm a little worried that > upping the limit to 1024 would cause some performance problems if > you're making a lot of select() calls. The select allocates three > arrays of length FD_SETSIZE+3; each array item is 12 bytes. This is a > total allocation of more than 36K for a meager select() call! > And all > that memory also has to be cleared by the FD_ZERO() call. > > If you actually have that many sockets, that's worth paying for (the > socket objects themselves use up just as much memory, and your Python > data structures for the sockets, no matter how small, are probably > several times bigger), but for a more typical program, I see > this as a > lot of overhead. > > Is there a way that this can be done more dynamically, e.g. by making > the set size as big as needed on windows but no bigger? > > (Before you suggest allocating that memory statically, remember it's > possible to call select from multiple threads. Allocating 36K of > thread-local space for each thread also doesn't sound too pleasant.) > > --Guido van Rossum (home page: http://www.python.org/~guido/) Hmm - after going through all of the Win32 sdks, it doesn't appear to be possible to do it any other way than as a -D option at compile time, so optimizing for the common case (folks who _don't_ need large numbers of fds) is reasonable. Since we distribute a python15.dll with Zope on windows, this isn't that big a deal for us - we can just compile in a higher limit in our distributed dll. 
I was mostly thinking of the win32 users who don't have the ability to rebuild their dll, but maybe this isn't that much of a problem; I suspect that the people who write significant socket apps that would run into this problem probably have access to a compiler if they need it. Brian Lloyd brian at digicool.com Software Engineer 540.371.6909 Digital Creations http://www.digicool.com From da at ski.org Fri Jul 30 20:59:37 1999 From: da at ski.org (David Ascher) Date: Fri, 30 Jul 1999 11:59:37 -0700 (Pacific Daylight Time) Subject: [Python-Dev] RE: NT select.select? In-Reply-To: <613145F79272D211914B0020AFF6401914DC1E@gandalf.digicool.com> Message-ID: On Fri, 30 Jul 1999, Brian Lloyd wrote: > Since we distribute a python15.dll with Zope on windows, this > isn't that big a deal for us - we can just compile in a higher > limit in our distributed dll. I was mostly thinking of the win32 > users who don't have the ability to rebuild their dll, but > maybe this isn't that much of a problem; I suspect that the > people who write significant socket apps that would run into > this problem probably have access to a compiler if they need it. It's a worthy piece of knowledge to document somehow -- I'm not sure where that should be... From fdrake at cnri.reston.va.us Fri Jul 30 21:05:37 1999 From: fdrake at cnri.reston.va.us (Fred L. Drake) Date: Fri, 30 Jul 1999 15:05:37 -0400 (EDT) Subject: [Python-Dev] RE: NT select.select? In-Reply-To: References: <613145F79272D211914B0020AFF6401914DC1E@gandalf.digicool.com> Message-ID: <14241.63361.737047.998159@weyr.cnri.reston.va.us> David Ascher writes: > It's a worthy piece of knowledge to document somehow -- I'm not sure where > that should be... Perhaps a paragraph in the library reference? If someone can send along a clear bit of text (unformatted is fine), I'll be glad to add it. -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives