From tim_one@email.msn.com Sat May 1 09:32:30 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sat, 1 May 1999 04:32:30 -0400 Subject: [Python-Dev] Speed (was RE: [Python-Dev] More flexible namespaces.) In-Reply-To: <14121.55659.754846.708467@amarok.cnri.reston.va.us> Message-ID: <000801be93ad$27772ea0$7a9e2299@tim> [Andrew M. Kuchling] > ... > A performance improvement project would definitely be a good idea > for 1.6, and a good sub-topic for python-dev. To the extent that optimization requires uglification, optimization got pushed beyond Guido's comfort zone back around 1.4 -- little has made it in since then. Not griping; I'm just trying to avoid enduring the same discussions for the third to twelfth times. Anywho, on the theory that a sweeping speedup patch has no chance of making it in regardless, how about focusing on one subsystem? In my experience, the speed issue Python gets beat up the most for is the relative slowness of function calls. It would be very good if eval_code2 somehow or other could manage to invoke a Python function without all the hair of a recursive C call, and I believe Guido intends to move in that direction for Python2 anyway. This would be a good time to start exploring that seriously. inspirationally y'rs - tim From da@ski.org Sat May 1 23:15:32 1999 From: da@ski.org (David Ascher) Date: Sat, 1 May 1999 15:15:32 -0700 (Pacific Daylight Time) Subject: [Python-Dev] More flexible namespaces. In-Reply-To: <37296856.5875AAAF@lemburg.com> Message-ID: > Since you put out the objectives, I'd like to propose a little > different approach... > > 1. Have eval/exec accept any mapping object as input > > 2. Make those two copy the content of the mapping object into real > dictionaries > > 3. Provide a hook into the dictionary implementation that can be > used to redirect KeyErrors and use that redirection to forward > the request to the original mapping objects Interesting counterproposal.
I'm not sure whether any of the proposals on the table really do what's needed for e.g. case-insensitive namespace handling. I can see how all of the proposals so far allow case-insensitive reference name handling in the global namespace, but don't we also need to hook into the local-namespace creation process to allow case-insensitivity to work throughout? --david From da@ski.org Sun May 2 16:15:57 1999 From: da@ski.org (David Ascher) Date: Sun, 2 May 1999 08:15:57 -0700 (Pacific Daylight Time) Subject: [Python-Dev] More flexible namespaces. In-Reply-To: <00bc01be942a$47d94070$0801a8c0@bobcat> Message-ID: On Sun, 2 May 1999, Mark Hammond wrote: > > I'm not sure whether any of the > > proposals on > > the table really do what's needed for e.g. case-insensitive namespace > > handling. I can see how all of the proposals so far allow > > case-insensitive reference name handling in the global namespace, but > > don't we also need to hook into the local-namespace creation > > process to > > allow case-insensitivity to work throughout? > > Why not? I pictured case insensitive namespaces working so that they > retain the case of the first assignment, but all lookups would be > case-insensitive. > > Ohh - right! Python itself would need changing to support this. I suppose > that faced with code such as: > > def func(): > if spam: > Spam=1 > > Python would generate code that refers to "spam" as a local, and "Spam" as > a global. > > Is this why you feel it wont work? I hadn't thought of that, to be truthful, but I think it's more generic. [FWIW, I never much cared for the tag-variables-at-compile-time optimization in CPython, and wouldn't miss it if were lost.] The point is that if I eval or exec code which calls a function specifying some strange mapping as the namespaces (global and current-local) I presumably want to also specify how local namespaces work for the function calls within that code snippet. 
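[The hook David is asking about can at least be sketched against today's Python, where exec() accepts an arbitrary mapping object as its locals argument. This is only an illustration of the idea, not any of the proposals on the table: the class name and the fold-everything-to-lowercase policy are assumptions made for the sketch.]

```python
# A minimal sketch of a case-insensitive local namespace, assuming
# modern CPython semantics: exec() takes any mapping as *locals*, and
# for non-exact dicts the LOAD_NAME/STORE_NAME opcodes go through the
# overridden __getitem__/__setitem__ methods.
class CaseInsensitiveNS(dict):
    """Folds every name to lower case, so 'Spam' and 'spam' share a slot."""

    def __setitem__(self, key, value):
        super().__setitem__(key.lower(), value)

    def __getitem__(self, key):
        return super().__getitem__(key.lower())

ns = CaseInsensitiveNS()
exec("Spam = 1\nspam = spam + 1", {}, ns)
# both spellings resolve to the single key 'spam'
```

Mark's picture of retaining the case of the first assignment would need a slightly smarter __setitem__, but the lookup side is the same.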
That means that somehow Python has to know what kind of namespace to use for local environments, and not use the standard dictionary. Maybe we can simply have it use a '.clear()'ed .__copy__ of the specified environment. exec 'foo()' in globals(), mylocals would then call foo and within foo, the local env't would be mylocals.__copy__.clear(). Anyway, something for those-with-the-patches to keep in mind. --david From tismer@appliedbiometrics.com Sun May 2 14:00:37 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Sun, 02 May 1999 15:00:37 +0200 Subject: [Python-Dev] More flexible namespaces. References: Message-ID: <372C4C75.5B7CCAC8@appliedbiometrics.com> David Ascher wrote: [Marc:> > > Since you put out the objectives, I'd like to propose a little > > different approach... > > > > 1. Have eval/exec accept any mapping object as input > > > > 2. Make those two copy the content of the mapping object into real > > dictionaries > > > > 3. Provide a hook into the dictionary implementation that can be > > used to redirect KeyErrors and use that redirection to forward > > the request to the original mapping objects I don't think that this proposal would give so much new value. Since a mapping can also be implemented in arbitrary ways, say by functions, a mapping is not necessarily finite and might not be changeable into a dict. [David:> > Interesting counterproposal. I'm not sure whether any of the proposals on > the table really do what's needed for e.g. case-insensitive namespace > handling. I can see how all of the proposals so far allow > case-insensitive reference name handling in the global namespace, but > don't we also need to hook into the local-namespace creation process to > allow case-insensitivity to work throughout? Case-independent namespaces seem to be a minor point, nice to have for interfacing to other products, but then, in a function, I see no benefit in changing the semantics of function locals?
The lookup of foreign symbols would always be through a mapping object. If you take COM for instance, your access to a COM wrapper for an arbitrary object would be through properties of this object. After assignment to a local function variable, why should we support case-insensitivity at all? I would think mapping objects would be a great simplification of lazy imports in COM, where we would like to avoid importing really huge namespaces in one big slurp. Also the wrapper code could be made quite a lot easier and faster without so much getattr/setattr trapping. Does btw. anybody really want to see case-insensitivity in Python programs? I'm quite happy with it as it is, and I would even force the user to always use the same case style after he has touched an external property once. Example for Excel: You may write "xl.workbooks" in lowercase, but then you have to stay with it. This would keep Python source clean for, say, PyLint. my 0.02 Euro - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From MHammond@skippinet.com.au Sun May 2 00:28:11 1999 From: MHammond@skippinet.com.au (Mark Hammond) Date: Sun, 2 May 1999 09:28:11 +1000 Subject: [Python-Dev] More flexible namespaces. In-Reply-To: Message-ID: <00bc01be942a$47d94070$0801a8c0@bobcat> > I'm not sure whether any of the > proposals on > the table really do what's needed for e.g. case-insensitive namespace > handling. I can see how all of the proposals so far allow > case-insensitive reference name handling in the global namespace, but > don't we also need to hook into the local-namespace creation > process to > allow case-insensitivity to work throughout? Why not?
I pictured case insensitive namespaces working so that they retain the case of the first assignment, but all lookups would be case-insensitive. Ohh - right! Python itself would need changing to support this. I suppose that faced with code such as: def func(): if spam: Spam=1 Python would generate code that refers to "spam" as a local, and "Spam" as a global. Is this why you feel it wont work? Mark. From mal@lemburg.com Sun May 2 20:24:54 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Sun, 02 May 1999 21:24:54 +0200 Subject: [Python-Dev] More flexible namespaces. References: <372C4C75.5B7CCAC8@appliedbiometrics.com> Message-ID: <372CA686.215D71DF@lemburg.com> Christian Tismer wrote: > > David Ascher wrote: > [Marc:> > > > Since you put out the objectives, I'd like to propose a little > > > different approach... > > > > > > 1. Have eval/exec accept any mapping object as input > > > > > > 2. Make those two copy the content of the mapping object into real > > > dictionaries > > > > > > 3. Provide a hook into the dictionary implementation that can be > > > used to redirect KeyErrors and use that redirection to forward > > > the request to the original mapping objects > > I don't think that this proposal would give so much new > value. Since a mapping can also be implemented in arbitrary > ways, say by functions, a mapping is not necessarily finite > and might not be changeable into a dict. [Disclaimer: I'm not really keen on having the possibility of letting code execute in arbitrary namespace objects... it would make code optimizations even less manageable.] You can easily support infinite mappings by wrapping the function into an object which returns an empty list for .items() and then use the hook mentioned in 3 to redirect the lookup to that function. The proposal allows one to use such a proxy to simulate any kind of mapping -- it works much like the __getattr__ hook provided for instances. > [David:> > > Interesting counterproposal. 
I'm not sure whether any of the proposals on > > the table really do what's needed for e.g. case-insensitive namespace > > handling. I can see how all of the proposals so far allow > > case-insensitive reference name handling in the global namespace, but > > don't we also need to hook into the local-namespace creation process to > > allow case-insensitivity to work throughout? > > Case-independant namespaces seem to be a minor point, > nice to have for interfacing to other products, but then, > in a function, I see no benefit in changing the semantics > of function locals? The lookup of foreign symbols would > always be through a mapping object. If you take COM for > instance, your access to a COM wrapper for an arbitrary > object would be through properties of this object. After > assignment to a local function variable, why should we > support case-insensitivity at all? > > I would think mapping objects would be a great > simplification of lazy imports in COM, where > we would like to avoid to import really huge > namespaces in one big slurp. Also the wrapper code > could be made quite a lot easier and faster without > so much getattr/setattr trapping. What do lazy imports have to do with case [in]sensitive namespaces ? Anyway, how about a simple lazy import mechanism in the standard distribution, i.e. why not make all imports lazy ? Since modules are first class objects this should be easy to implement... > Does btw. anybody really want to see case-insensitivity > in Python programs? I'm quite happy with it as it is, > and I would even force the use to always use the same > case style after he has touched an external property > once. Example for Excel: You may write "xl.workbooks" > in lowercase, but then you have to stay with it. > This would keep Python source clean for, say, PyLint. 
"No" and "me too" ;-) -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 243 days left Business: http://www.lemburg.com/ Python Pages: http://starship.python.net/crew/lemburg/ From MHammond@skippinet.com.au Mon May 3 01:52:41 1999 From: MHammond@skippinet.com.au (Mark Hammond) Date: Mon, 3 May 1999 10:52:41 +1000 Subject: [Python-Dev] More flexible namespaces. In-Reply-To: <372CA686.215D71DF@lemburg.com> Message-ID: <000e01be94ff$4047ef20$0801a8c0@bobcat> [Marc] > [Disclaimer: I'm not really keen on having the possibility of > letting code execute in arbitrary namespace objects... it would > make code optimizations even less manageable.] Good point - although surely that would simply mean (certain) optimisations can't be performed for code executing in that environment? How to detect this at "optimization time" may be a little difficult :-) However, this is the primary purpose of this thread - to work out _if_ it is a good idea, as much as working out _how_ to do it :-) > The proposal allows one to use such a proxy to simulate any > kind of mapping -- it works much like the __getattr__ hook > provided for instances. My only problem with Marc's proposal is that there already _is_ an established mapping protocol, and this doesn't use it; instead it invents a new one with the benefit being potentially less code breakage. And without attempting to sound flippant, I wonder how many extension modules will be affected? Module init code certainly assumes the module __dict__ is a dictionary, but none of my code assumes anything about other namespaces. Marc's extensions may be a special case, as AFAIK they inject objects into other dictionaries (ie, new builtins?). Again, not trying to downplay this too much, but if it is only a problem for Marc's more esoteric extensions, I don't feel that should hold up an otherwise solid proposal. [Chris, I think?]
> > Case-independent namespaces seem to be a minor point, > > nice to have for interfacing to other products, but then, > > in a function, I see no benefit in changing the semantics > > of function locals? The lookup of foreign symbols would I disagree here. Consider Alice, and similar projects, where a (arguably misplaced, but nonetheless) requirement is that the embedded language be case-insensitive. Period. The Alice people are somewhat special in that they had the resources to change the interpreter's guts. Most people won't, and will look for a different language to embed. Of course, I agree with you for the specific cases you are talking - COM, Active Scripting etc. Indeed, everything I would use this for would prefer to keep the local function semantics identical. > > Does btw. anybody really want to see case-insensitivity > > in Python programs? I'm quite happy with it as it is, > > and I would even force the user to always use the same > > case style after he has touched an external property > > once. Example for Excel: You may write "xl.workbooks" > > in lowercase, but then you have to stay with it. > > This would keep Python source clean for, say, PyLint. > > "No" and "me too" ;-) I think we are missing the point a little. If we focus on COM, we may come up with a different answer. Indeed, if we are to focus on COM integration with Python, there are other areas I would prefer to start with :-) IMO, we should attempt to come up with a more flexible namespace mechanism that is in the style of Python, and will not noticeably slow down Python. Then COM etc can take advantage of it - much in the same way that Python's existing namespace model existed pre-COM, and COM had to take advantage of what it could! Of course, a key indicator of the likely success is how well COM _can_ take advantage of it, and how much Alice could have taken advantage of it - I can't think of any other yardsticks? Mark.
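[Marc's three numbered steps map fairly directly onto a dict subclass in today's Python. The following is a hedged sketch, not part of the proposal itself: RedirectingDict is an illustrative name, and the __missing__ hook stands in for the KeyError-redirection hook of step 3.]

```python
# Sketch of the three-step proposal, assuming modern Python: copy the
# mapping's contents into a real dict (step 2), and let the subclass's
# __missing__ hook -- called by dict.__getitem__ on a failed lookup --
# forward the request back to the original mapping object (step 3).
class RedirectingDict(dict):
    def __init__(self, original):
        super().__init__(original)   # step 2: snapshot into a real dict
        self._original = original    # keep the source mapping around

    def __missing__(self, key):      # step 3: redirect the KeyError
        return self._original[key]

source = {"a": 1}
ns = RedirectingDict(source)
source["b"] = 2          # added to the original after the copy was taken
assert ns["a"] == 1      # served from the dict copy
assert ns["b"] == 2      # forwarded to the original mapping
```

Because ordinary lookups hit the real dict first, the common path stays as fast as a plain dictionary; only misses pay for the redirection.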
From mal@lemburg.com Mon May 3 08:56:53 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 03 May 1999 09:56:53 +0200 Subject: [Python-Dev] More flexible namespaces. References: <000e01be94ff$4047ef20$0801a8c0@bobcat> Message-ID: <372D56C5.4738DE3D@lemburg.com> Mark Hammond wrote: > > [Marc] > > [Disclaimer: I'm not really keen on having the possibility of > > letting code execute in arbitrary namespace objects... it would > > make code optimizations even less manageable.] > > Good point - although surely that would simply mean (certain) optimisations > can't be performed for code executing in that environment? How to detect > this at "optimization time" may be a little difficult :-) > > However, this is the primary purpose of this thread - to workout _if_ it is > a good idea, as much as working out _how_ to do it :-) > > > The proposal allows one to use such a proxy to simulate any > > kind of mapping -- it works much like the __getattr__ hook > > provided for instances. > > My only problem with Marc's proposal is that there already _is_ an > established mapping protocol, and this doesnt use it; instead it invents a > new one with the benefit being potentially less code breakage. ...and that's the key point: you get the intended features and the core code will not have to be changed in significant ways. Basically, I think these kind of core extensions should be done in generic ways, e.g. by letting the eval/exec machinery accept subclasses of dictionaries, rather than trying to raise the abstraction level used and slowing things down in general just to be able to use the feature on very few occasions. > And without attempting to sound flippant, I wonder how many extension > modules will be affected? Module init code certainly assumes the module > __dict__ is a dictionary, but none of my code assumes anything about other > namespaces. Marc's extensions may be a special case, as AFAIK they inject > objects into other dictionaries (ie, new builtins?). 
Again, not trying to > downplay this too much, but if it is only a problem for Marc's more > esoteric extensions, I dont feel that should hold up an otherwise solid > proposal. My mxTools extension does the assignment in Python, so it wouldn't be affected. The others only do the usual modinit() stuff. Before going any further on this thread we may have to ponder a little more on the objectives that we have. If it's only case-insensitive lookups then I guess a simple compile time switch exchanging the implementations of string hash and compare functions would do the trick. If we're after doing wild things like lookups accross networks, then a more specific approach is needed. So what is it that we want in 1.6 ? > [Chris, I think?] > > > Case-independant namespaces seem to be a minor point, > > > nice to have for interfacing to other products, but then, > > > in a function, I see no benefit in changing the semantics > > > of function locals? The lookup of foreign symbols would > > I disagree here. Consider Alice, and similar projects, where a (arguably > misplaced, but nonetheless) requirement is that the embedded language be > case-insensitive. Period. The Alice people are somewhat special in that > they had the resources to change the interpreters guts. Most people wont, > and will look for a different language to embedd. > > Of course, I agree with you for the specific cases you are talking - COM, > Active Scripting etc. Indeed, everything I would use this for would prefer > to keep the local function semantics identical. As I understand the needs in COM and AS you are talking about object attributes, right ? Making these case-insensitive is a job for a proxy or a __getattr__ hack. > > > Does btw. anybody really want to see case-insensitivity > > > in Python programs? I'm quite happy with it as it is, > > > and I would even force the use to always use the same > > > case style after he has touched an external property > > > once. 
Example for Excel: You may write "xl.workbooks" > > > in lowercase, but then you have to stay with it. > > > This would keep Python source clean for, say, PyLint. > > > > "No" and "me too" ;-) > > I think we are missing the point a little. If we focus on COM, we may come > up with a different answer. Indeed, if we are to focus on COM integration > with Python, there are other areas I would prefer to start with :-) > > IMO, we should attempt to come up with a more flexible namespace mechanism > that is in the style of Python, and will not noticeably slowdown Python. > Then COM etc can take advantage of it - much in the same way that Python's > existing namespace model existed pre-COM, and COM had to take advantage of > what it could! > > Of course, a key indicator of the likely success is how well COM _can_ take > advantage of it, and how much Alice could have taken advantage of it - I > cant think of any other yardsticks? -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: Y2000: 242 days left Business: http://www.lemburg.com/ Python Pages: http://starship.python.net/crew/lemburg/ From fredrik@pythonware.com Mon May 3 15:01:10 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Mon, 3 May 1999 16:01:10 +0200 Subject: [Python-Dev] Why Foo is better than Baz References: <000e01be94ff$4047ef20$0801a8c0@bobcat> Message-ID: <005b01be956d$66d48450$f29b12c2@pythonware.com> scriptics is positioning tcl as a perl killer: http://www.scriptics.com/scripting/perl.html afaict, unicode and event handling are the two main thingies missing from python 1.5. -- unicode: is on its way. -- event handling: asynclib/asynchat provides an awesome framework for event-driven socket pro- gramming. however, Python still lacks good cross- platform support for event-driven access to files and pipes. are threads good enough, or would it be cool to have something similar to Tcl's fileevent stuff in Python? 
-- regexps: has anyone compared the new unicode-aware regexp package in Tcl with pcre? comments? btw, the rebol folks have reached 2.0: http://www.rebol.com/ maybe 1.6 should be renamed to Python 6.0? From akuchlin@cnri.reston.va.us Mon May 3 16:14:15 1999 From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling) Date: Mon, 3 May 1999 11:14:15 -0400 (EDT) Subject: [Python-Dev] Why Foo is better than Baz In-Reply-To: <005b01be956d$66d48450$f29b12c2@pythonware.com> References: <000e01be94ff$4047ef20$0801a8c0@bobcat> <005b01be956d$66d48450$f29b12c2@pythonware.com> Message-ID: <14125.47524.196878.583460@amarok.cnri.reston.va.us> Fredrik Lundh writes: >-- regexps: has anyone compared the new unicode-aware >regexp package in Tcl with pcre? I looked at it a bit when Tcl 8.1 was in beta; it derives from Henry Spencer's 1998-vintage code, which seems to try to do a lot of optimization and analysis. It may even compile DFAs instead of NFAs when possible, though it's hard for me to be sure. This might give it a substantial speed advantage over engines that do less analysis, but I haven't benchmarked it. The code is easy to read, but difficult to understand because the theory underlying the analysis isn't explained in the comments; one feels there should be an accompanying paper to explain how everything works, and that's why I'm not sure if it really is producing DFAs for some expressions. Tcl seems to represent everything as UTF-8 internally, so there's only one regex engine.
The code is scattered over more files:

amarok generic>ls re*.[ch]
regc_color.c   regc_locale.c  regcustom.h  regerrs.h   regfree.c
regc_cvec.c    regc_nfa.c     rege_dfa.c   regex.h     regfronts.c
regc_lex.c     regcomp.c      regerror.c   regexec.c   regguts.h
amarok generic>wc -l re*.[ch]
     742 regc_color.c
     170 regc_cvec.c
    1010 regc_lex.c
     781 regc_locale.c
    1528 regc_nfa.c
    2124 regcomp.c
      85 regcustom.h
     627 rege_dfa.c
      82 regerror.c
      18 regerrs.h
     308 regex.h
     952 regexec.c
      25 regfree.c
      56 regfronts.c
     388 regguts.h
    8896 total
amarok generic>

This would be an issue for using it with Python, since all these files would wind up scattered around the Modules directory. For comparison, pypcre.c is around 4700 lines of code. -- A.M. Kuchling http://starship.python.net/crew/amk/ Things need not have happened to be true. Tales and dreams are the shadow-truths that will endure when mere facts are dust and ashes, and forgot. -- Neil Gaiman, _Sandman_ #19: _A Midsummer Night's Dream_ From guido@CNRI.Reston.VA.US Mon May 3 16:32:09 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 03 May 1999 11:32:09 -0400 Subject: [Python-Dev] Why Foo is better than Baz In-Reply-To: Your message of "Mon, 03 May 1999 11:14:15 EDT." <14125.47524.196878.583460@amarok.cnri.reston.va.us> References: <000e01be94ff$4047ef20$0801a8c0@bobcat> <005b01be956d$66d48450$f29b12c2@pythonware.com> <14125.47524.196878.583460@amarok.cnri.reston.va.us> Message-ID: <199905031532.LAA05617@eric.cnri.reston.va.us> > I looked at it a bit when Tcl 8.1 was in beta; it derives from > Henry Spencer's 1998-vintage code, which seems to try to do a lot of > optimization and analysis. It may even compile DFAs instead of NFAs > when possible, though it's hard for me to be sure. This might give it > a substantial speed advantage over engines that do less analysis, but > I haven't benchmarked it.
The code is easy to read, but difficult to > understand because the theory underlying the analysis isn't explained > in the comments; one feels there should be an accompanying paper to > explain how everything works, and it's why I'm not sure if it really > is producing DFAs for some expressions. > > Tcl seems to represent everything as UTF-8 internally, so > there's only one regex engine; there's . Hmm... I looked when Tcl 8.1 was in alpha, and I *think* that at that point the regex engine was compiled twice, once for 8-bit chars and once for 16-bit chars. But this may have changed. I've noticed that Perl is taking the same position (everything is UTF-8 internally). On the other hand, Java distinguishes 16-bit chars from 8-bit bytes. Python is currently in the Java camp. This might be a good time to make sure that we're still convinced that this is the right thing to do! > The code is scattered over > more files: > > amarok generic>ls re*.[ch] > regc_color.c regc_locale.c regcustom.h regerrs.h regfree.c > regc_cvec.c regc_nfa.c rege_dfa.c regex.h regfronts.c > regc_lex.c regcomp.c regerror.c regexec.c regguts.h > amarok generic>wc -l re*.[ch] > 742 regc_color.c > 170 regc_cvec.c > 1010 regc_lex.c > 781 regc_locale.c > 1528 regc_nfa.c > 2124 regcomp.c > 85 regcustom.h > 627 rege_dfa.c > 82 regerror.c > 18 regerrs.h > 308 regex.h > 952 regexec.c > 25 regfree.c > 56 regfronts.c > 388 regguts.h > 8896 total > amarok generic> > > This would be an issue for using it with Python, since all > these files would wind up scattered around the Modules directory. For > comparison, pypcre.c is around 4700 lines of code. I'm sure that if it's good code, we'll find a way. Perhaps a more interesting question is whether it is Perl5 compatible. I contacted Henry Spencer at the time and he was willing to let us use his code. --Guido van Rossum (home page: http://www.python.org/~guido/) From akuchlin@cnri.reston.va.us Mon May 3 16:56:46 1999 From: akuchlin@cnri.reston.va.us (Andrew M. 
Kuchling) Date: Mon, 3 May 1999 11:56:46 -0400 (EDT) Subject: [Python-Dev] Why Foo is better than Baz In-Reply-To: <199905031532.LAA05617@eric.cnri.reston.va.us> References: <000e01be94ff$4047ef20$0801a8c0@bobcat> <005b01be956d$66d48450$f29b12c2@pythonware.com> <14125.47524.196878.583460@amarok.cnri.reston.va.us> <199905031532.LAA05617@eric.cnri.reston.va.us> Message-ID: <14125.49911.982236.754340@amarok.cnri.reston.va.us> Guido van Rossum writes: >Hmm... I looked when Tcl 8.1 was in alpha, and I *think* that at that >point the regex engine was compiled twice, once for 8-bit chars and >once for 16-bit chars. But this may have changed. It doesn't seem to currently; the code in tclRegexp.c looks like this:

    /* Remember the UTF-8 string so Tcl_RegExpRange() can convert the
     * matches from character to byte offsets. */
    regexpPtr->string = string;
    Tcl_DStringInit(&stringBuffer);
    uniString = Tcl_UtfToUniCharDString(string, -1, &stringBuffer);
    numChars = Tcl_DStringLength(&stringBuffer) / sizeof(Tcl_UniChar);

    /* Perform the regexp match. */
    result = TclRegExpExecUniChar(interp, re, uniString, numChars, -1,
            ((string > start) ? REG_NOTBOL : 0));

ISTR the Spencer engine does, however, define a small and large representation for NFAs and have two versions of the engine, one for each representation. Perhaps that's what you're thinking of. >I've noticed that Perl is taking the same position (everything is >UTF-8 internally). On the other hand, Java distinguishes 16-bit chars >from 8-bit bytes. Python is currently in the Java camp. This might >be a good time to make sure that we're still convinced that this is >the right thing to do! I don't know. There's certainly the fundamental dichotomy that strings are sometimes used to represent characters, where changing encodings on input and output is reasonable, and sometimes used to hold chunks of binary data, where any changes are incorrect.
Perhaps Paul Prescod is right, and we should try to get some other data type (array.array()) for holding binary data, as distinct from strings. >I'm sure that if it's good code, we'll find a way. Perhaps a more >interesting question is whether it is Perl5 compatible. I contacted >Henry Spencer at the time and he was willing to let us use his code. Mostly Perl-compatible, though it doesn't look like the 5.005 features are there, and I haven't checked for every single 5.004 feature. Adding missing features might be problematic, because I don't really understand what the code is doing at a high level. Also, is there a user community for this code? Do any other projects use it? Philip Hazel has been quite helpful with PCRE, an important thing when making modifications to the code. Should I make a point of looking at what using the Spencer engine would entail? It might not be too difficult (an evening or two, maybe?) to write a re.py that sat on top of the Spencer code; that would at least let us do some benchmarking. -- A.M. Kuchling http://starship.python.net/crew/amk/ In Einstein's theory of relativity the observer is a man who sets out in quest of truth armed with a measuring-rod. In quantum theory he sets out with a sieve. -- Sir Arthur Eddington From guido@CNRI.Reston.VA.US Mon May 3 17:02:22 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 03 May 1999 12:02:22 -0400 Subject: [Python-Dev] Why Foo is better than Baz In-Reply-To: Your message of "Mon, 03 May 1999 11:56:46 EDT." <14125.49911.982236.754340@amarok.cnri.reston.va.us> References: <000e01be94ff$4047ef20$0801a8c0@bobcat> <005b01be956d$66d48450$f29b12c2@pythonware.com> <14125.47524.196878.583460@amarok.cnri.reston.va.us> <199905031532.LAA05617@eric.cnri.reston.va.us> <14125.49911.982236.754340@amarok.cnri.reston.va.us> Message-ID: <199905031602.MAA05829@eric.cnri.reston.va.us> > Should I make a point of looking at what using the Spencer > engine would entail? 
It might not be too difficult (an evening or > two, maybe?) to write a re.py that sat on top of the Spencer code; > that would at least let us do some benchmarking. Surely this would be more helpful than weeks of speculative emails -- go for it! --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik@pythonware.com Mon May 3 18:10:55 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Mon, 3 May 1999 19:10:55 +0200 Subject: [Python-Dev] Why Foo is better than Baz References: <000e01be94ff$4047ef20$0801a8c0@bobcat><005b01be956d$66d48450$f29b12c2@pythonware.com><14125.47524.196878.583460@amarok.cnri.reston.va.us><199905031532.LAA05617@eric.cnri.reston.va.us> <14125.49911.982236.754340@amarok.cnri.reston.va.us> Message-ID: <005801be9588$7ad0fcc0$f29b12c2@pythonware.com> > Also, is there a user community for this code? how about comp.lang.tcl ;-) From fredrik@pythonware.com Mon May 3 18:15:00 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Mon, 3 May 1999 19:15:00 +0200 Subject: [Python-Dev] Why Foo is better than Baz References: <000e01be94ff$4047ef20$0801a8c0@bobcat> <005b01be956d$66d48450$f29b12c2@pythonware.com> <14125.47524.196878.583460@amarok.cnri.reston.va.us> <199905031532.LAA05617@eric.cnri.reston.va.us> <14125.49911.982236.754340@amarok.cnri.reston.va.us> <199905031602.MAA05829@eric.cnri.reston.va.us> Message-ID: <005901be9588$7af59bc0$f29b12c2@pythonware.com> talking about regexps, here's another thing that would be quite nice to have in 1.6 (available from the Python level, that is). or is it already in there somewhere? ...
http://www.dejanews.com/[ST_rn=qs]/getdoc.xp?AN=464362873 Tcl 8.1b3 Request: Generated by Scriptics' bug entry form at Submitted by: Frederic BONNET OperatingSystem: Windows 98 CustomShell: Applied patch to the regexp engine (the exec part) Synopsis: regexp improvements DesiredBehavior: As previously requested by Don Libes: > I see no way for Tcl_RegExpExec to indicate "could match" meaning > "could match if more characters arrive that were suitable for a > match". This is required for a class of applications involving > matching on a stream required by Expect's interact command. Henry > assured me that this facility would be in the engine (I'm not the only > one that needs it). Note that it is not sufficient to add one more > return value to Tcl_RegExpExec (i.e., 2) because one needs to know > both if something matches now and can match later. I recommend > another argument (canMatch *int) be added to Tcl_RegExpExec. /patch info follows/ ... From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Mon May 3 23:28:23 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw) Date: Mon, 3 May 1999 18:28:23 -0400 (EDT) Subject: [Python-Dev] New mailing list: python-bugs-list Message-ID: <14126.8967.793734.892670@anthem.cnri.reston.va.us> I've been using Jitterbug for a couple of weeks now as my bug database for Mailman and JPython. So it was easy enough for me to set up a database for Python bug reports. Guido is in the process of tailoring the Jitterbug web interface to his liking and will announce it to the appropriate forums when he's ready. In the meantime, I've created YAML that you might be interested in. All bug reports entered into Jitterbug will be forwarded to python-bugs-list@python.org. 
You are invited to subscribe to the list by visiting http://www.python.org/mailman/listinfo/python-bugs-list Enjoy, -Barry From jeremy@cnri.reston.va.us Mon May 3 23:30:10 1999 From: jeremy@cnri.reston.va.us (Jeremy Hylton) Date: Mon, 3 May 1999 18:30:10 -0400 (EDT) Subject: [Python-Dev] New mailing list: python-bugs-list In-Reply-To: <14126.8967.793734.892670@anthem.cnri.reston.va.us> References: <14126.8967.793734.892670@anthem.cnri.reston.va.us> Message-ID: <14126.9061.558631.437892@bitdiddle.cnri.reston.va.us> Pretty low volume list, eh? From MHammond@skippinet.com.au Tue May 4 00:28:39 1999 From: MHammond@skippinet.com.au (Mark Hammond) Date: Tue, 4 May 1999 09:28:39 +1000 Subject: [Python-Dev] New mailing list: python-bugs-list In-Reply-To: <14126.9061.558631.437892@bitdiddle.cnri.reston.va.us> Message-ID: <000701be95bc$ad0b45e0$0801a8c0@bobcat> ha - we wish. More likely to be full of detailed bug reports about how 1/2 != 0.5, or that "def foo(baz=[])" is buggy, etc :-) Mark. > Pretty low volume list, eh? From tim_one@email.msn.com Tue May 4 06:16:17 1999 From: tim_one@email.msn.com (Tim Peters) Date: Tue, 4 May 1999 01:16:17 -0400 Subject: [Python-Dev] Why Foo is better than Baz In-Reply-To: <199905031532.LAA05617@eric.cnri.reston.va.us> Message-ID: <000701be95ed$3d594180$dca22299@tim> [Guido & Andrew on Tcl's new regexp code] > I'm sure that if it's good code, we'll find a way. Perhaps a more > interesting question is whether it is Perl5 compatible. I contacted > Henry Spencer at the time and he was willing to let us use his code. 
Haven't looked at the code, but did read the manpage just now: http://www.scriptics.com/man/tcl8.1/TclCmd/regexp.htm WRT Perl5 compatibility, it sez: Incompatibilities of note include `\b', `\B', the lack of special treatment for a trailing newline, the addition of complemented bracket expressions to the things affected by newline-sensitive matching, the restrictions on parentheses and back references in lookahead constraints, and the longest/shortest-match (rather than first-match) matching semantics. So some gratuitous differences, and maybe a killer: Guido hasn't had many kind things to say about "longest" (aka POSIX) matching semantics. An example from the page: (week|wee)(night|knights) matches all ten characters of `weeknights' which means it matched 'wee' and 'knights'; Python/Perl match 'week' and 'night'. It's the *natural* semantics if Andrew's suspicion that it's compiling a DFA is correct; indeed, it's a pain to get that behavior any other way! otoh-it's-potentially-very-much-faster-ly y'rs - tim From tim_one@email.msn.com Tue May 4 06:51:01 1999 From: tim_one@email.msn.com (Tim Peters) Date: Tue, 4 May 1999 01:51:01 -0400 Subject: [Python-Dev] Why Foo is better than Baz In-Reply-To: <000701be95ed$3d594180$dca22299@tim> Message-ID: <000901be95f2$195556c0$dca22299@tim> [Tim] > ... > It's the *natural* semantics if Andrew's suspicion that it's > compiling a DFA is correct ... More from the man page: AREs report the longest/shortest match for the RE, rather than the first found in a specified search order. This may affect some RREs which were written in the expectation that the first match would be reported. (The careful crafting of RREs to optimize the search order for fast matching is obsolete (AREs examine all possible matches in parallel, and their performance is largely insensitive to their complexity) but cases where the search order was exploited to deliberately find a match which was not the longest/shortest will need rewriting.) Nails it, yes?
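For concreteness, the weeknights example is easy to replay against Python's existing re module, which uses the Perl-style first-match rule (a quick sketch; a POSIX longest-match engine would report the full ten-character match instead):

```python
import re

# Perl/Python semantics: alternatives are tried left to right,
# so 'week' wins over 'wee', and then 'night' matches.
m = re.match(r'(week|wee)(night|knights)', 'weeknights')
print(m.group(0))   # -> weeknight (nine characters, not ten)
print(m.groups())   # -> ('week', 'night')
```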
Now, in 10 seconds, try to remember a regexp where this really matters. Note in passing that IDLE's colorizer regexp *needs* to search for triple-quoted strings before single-quoted ones, else the P/P semantics would consider """ to be an empty single-quoted string followed by a double quote. This isn't a case where it matters in a bad way, though! The "longest" rule picks the correct alternative regardless of the order in which they're written. at-least-in-that-specific-regex<0.1-wink>-ly y'rs - tim From guido@CNRI.Reston.VA.US Tue May 4 13:26:04 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 04 May 1999 08:26:04 -0400 Subject: [Python-Dev] Why Foo is better than Baz In-Reply-To: Your message of "Tue, 04 May 1999 01:16:17 EDT." <000701be95ed$3d594180$dca22299@tim> References: <000701be95ed$3d594180$dca22299@tim> Message-ID: <199905041226.IAA07627@eric.cnri.reston.va.us> [Tim] > So some gratuitous differences, and maybe a killer: Guido hasn't had many > kind things to say about "longest" (aka POSIX) matching semantics. > > An example from the page: > > (week|wee)(night|knights) > matches all ten characters of `weeknights' > > which means it matched 'wee' and 'knights'; Python/Perl match 'week' and > 'night'. > > It's the *natural* semantics if Andrew's suspicion that it's compiling a DFA > is correct; indeed, it's a pain to get that behavior any other way! Possibly contradicting what I once said about DFAs (I have no idea what I said any more :-): I think we shouldn't be hung up about the subtleties of DFA vs. NFA; for most people, the Perl-compatibility simply means that they can use the same metacharacters. My guess is that people don't so much translate long Perl regexp's to Python but simply transport their (always incomplete -- Larry Wall *wants* it that way :-) knowledge of Perl regexps to Python. My meta-guess is that this is also Henry Spencer's and John Ousterhout's guess.
As for Larry Wall, I guess he really doesn't care :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From akuchlin@cnri.reston.va.us Tue May 4 17:14:41 1999 From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling) Date: Tue, 4 May 1999 12:14:41 -0400 (EDT) Subject: [Python-Dev] Why Foo is better than Baz In-Reply-To: <199905041226.IAA07627@eric.cnri.reston.va.us> References: <000701be95ed$3d594180$dca22299@tim> <199905041226.IAA07627@eric.cnri.reston.va.us> Message-ID: <14127.6410.646122.342115@amarok.cnri.reston.va.us> Guido van Rossum writes: >Possibly contradicting what I once said about DFAs (I have no idea >what I said any more :-): I think we shouldn't be hung up about the >subtleties of DFA vs. NFA; for most people, the Perl-compatibility >simply means that they can use the same metacharacters. My guess is I don't like slipping in such a change to the semantics with no visible change to the module name or interface. On the other hand, if it's not NFA-based, then it can provide POSIX semantics without danger of taking exponential time to determine the longest match. BTW, there's an interesting reference, I assume to this code, in _Mastering Regular Expressions_; Spencer is quoted on page 121 as saying it's "at worst quadratic in text size.". Anyway, we can let it slide until a Python interface gets written. -- A.M. Kuchling http://starship.python.net/crew/amk/ In the black shadow of the Baba Yaga babies screamed and mothers miscarried; milk soured and men went mad. -- In SANDMAN #38: "The Hunt" From guido@CNRI.Reston.VA.US Tue May 4 17:19:06 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 04 May 1999 12:19:06 -0400 Subject: [Python-Dev] Why Foo is better than Baz In-Reply-To: Your message of "Tue, 04 May 1999 12:14:41 EDT." 
<14127.6410.646122.342115@amarok.cnri.reston.va.us> References: <000701be95ed$3d594180$dca22299@tim> <199905041226.IAA07627@eric.cnri.reston.va.us> <14127.6410.646122.342115@amarok.cnri.reston.va.us> Message-ID: <199905041619.MAA08408@eric.cnri.reston.va.us> > BTW, there's an interesting reference, I assume to this code, in > _Mastering Regular Expressions_; Spencer is quoted on page 121 as > saying it's "at worst quadratic in text size.". Not sure if that was the same code -- this is *new* code, not Spencer's old code. I think Friedl's book is older than the current code. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim_one@email.msn.com Wed May 5 06:37:02 1999 From: tim_one@email.msn.com (Tim Peters) Date: Wed, 5 May 1999 01:37:02 -0400 Subject: [Python-Dev] Tcl 8.1's regexp code (was RE: [Python-Dev] Why Foo is better than Baz) In-Reply-To: <199905041226.IAA07627@eric.cnri.reston.va.us> Message-ID: <000701be96b9$4e434460$799e2299@tim> I've consistently found that the best way to kill a thread is to rename it accurately. Agree w/ Guido that few people really care about the differing semantics. Agree w/ Andrew that it's bad to pull a semantic switcheroo at this stage anyway: code will definitely break. Like \b(?: (?P<keyword>and|if|else|...) | (?P<ident>[a-zA-Z_]\w*) )\b The (special)|(general) idiom relies on left-to-right match-and-out searching of alternatives to do its job correctly. Not to mention that \b is not a word-boundary assertion in the new pkg (talk about pointlessly irritating differences! at least this one could be easily hidden via brainless preprocessing). Over the long run, moving to a DFA locks Python out of the directions Perl is *moving*, namely embedding all sorts of runtime gimmicks in regexps that exploit knowing the "state of the match so far". DFAs don't work that way. I don't mind losing those possibilities, because I think the regexp sublanguage is strained beyond its limits already.
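A small sketch of that (special)|(general) idiom with today's re module (the group names keyword/ident here are illustrative, not necessarily the ones in the original regexp):

```python
import re

# Under first-match semantics the special alternative must come first,
# or the general identifier pattern would swallow the keywords too.
pat = re.compile(r'\b(?:(?P<keyword>and|if|else)|(?P<ident>[a-zA-Z_]\w*))\b')

print(pat.match('if').lastgroup)      # -> keyword
print(pat.match('ifdef').lastgroup)   # -> ident
```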
But that's a decision with Big Consequences, so deserves some thought. I'd definitely like the (sometimes dramatically) increased speed a DFA can offer (btw, this code appears to use a lazily-generated DFA, to avoid the exponential *compile*-time a straightforward DFA implementation can suffer -- the code is very complex and lacks any high-level internal docs, so we better hope Henry stays in love with it <0.5 wink>). > ... > My guess is that people don't so much translate long Perl regexp's > to Python but simply transport their (always incomplete -- Larry Wall > *wants* it that way :-) knowledge of Perl regexps to Python. This is directly proportional to the number of feeble CGI programmers Python attracts. The good news is that they wouldn't know an NFA from a DFA if Larry bit Henry on the ass ... > My meta-guess is that this is also Henry Spencer's and John > Ousterhout's guess. I think Spencer strongly favors DFA semantics regardless of fashion, and Ousterhout is a pragmatist. So I trust JO's judgment more <0.9 wink>. > As for Larry Wall, I guess he really doesn't care :-) I expect he cares a lot! Because a DFA would prevent Perl from going even more insane in its present direction. About the age of the code, postings to comp.lang.tcl have Henry saying he was working on the alpha version intensely as recently as December ('98). A few complaints about the alpha release trickled in, about regexp compile speed and regexp matching speed in specific cases. Perhaps paradoxically, the latter were about especially simple regexps with long fixed substrings (where this mountain of sophisticated machinery is likely to get beat cold by an NFA with some fixed-substring lookahead smarts -- which latter Henry intended to graft into this pkg too). [Andrew] > BTW, there's an interesting reference, I assume to this code, in > _Mastering Regular Expressions_; Spencer is quoted on page 121 as > saying it's "at worst quadratic in text size.".
[Guido] > Not sure if that was the same code -- this is *new* code, not > Spencer's old code. I think Friedl's book is older than the current > code. I expect this is an invariant, though: it's not natural for a DFA to know where subexpression matches begin and end, and there's a pile of xxx_dissect functions in regexec.c that use what strongly appear to be worst-case quadratic-time algorithms for figuring that out after it's known that the overall expression has *a* match. Expect too, but don't know, that only pathological cases are actually expensive. Question: has this package been released in any other context, or is it unique to Tcl? I searched in vain for an announcement (let alone code) from Henry, or any discussion of this code outside the Tcl world. whatever-happens-i-vote-we-let-them-debug-it-ly y'rs - tim From gstein@lyra.org Wed May 5 07:22:20 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 4 May 1999 23:22:20 -0700 (PDT) Subject: [Python-Dev] Tcl 8.1's regexp code In-Reply-To: <000701be96b9$4e434460$799e2299@tim> Message-ID: On Wed, 5 May 1999, Tim Peters wrote: >... > Question: has this package been released in any other context, or is it > unique to Tcl? I searched in vain for an announcement (let alone code) from > Henry, or any discussion of this code outside the Tcl world. Apache uses it. However, the Apache guys have considered the possibility of updating the thing.
Cheers, -g -- Greg Stein, http://www.lyra.org/ From Ivan.Porres@abo.fi Wed May 5 09:29:21 1999 From: Ivan.Porres@abo.fi (Ivan Porres Paltor) Date: Wed, 05 May 1999 11:29:21 +0300 Subject: [Python-Dev] Python for Small Systems patch Message-ID: <37300161.8DFD1D7F@abo.fi> Python for Small Systems is a minimal version of the python interpreter, intended to run on small embedded systems with a limited amount of memory. Since there is some interest in the newsgroup, we have decided to release an alpha version of the patch. You can download the patch from the following page: http://www.abo.fi/~iporres/python There is no documentation about the changes, but I guess that it is not so difficult to figure out what Raul has been doing. There are some simple examples in the Demo/hitachi directory. The configure scripts are broken. We plan to modify the configure scripts for cross-compilation. We are still testing, cleaning and trying to reduce the memory requirements of the patched interpreter. We also plan to write some documentation. Please send comments to Raul (rparra@abo.fi) or to me (iporres@abo.fi), Regards, Ivan -- Ivan Porres Paltor Turku Centre for Computer Science Åbo Akademi, Department of Computer Science Phone: +358-2-2154033 Lemminkäinengatan 14A FIN-20520 Turku - Finland http://www.abo.fi/~iporres From tismer@appliedbiometrics.com Wed May 5 12:52:24 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Wed, 05 May 1999 13:52:24 +0200 Subject: [Python-Dev] Python for Small Systems patch References: <37300161.8DFD1D7F@abo.fi> Message-ID: <373030F8.21B73451@appliedbiometrics.com> Ivan Porres Paltor wrote: > > Python for Small Systems is a minimal version of the python interpreter, > intended to run on small embedded systems with a limited amount of > memory. > > Since there is some interest in the newsgroup, we have decided to release > an alpha version of the patch.
You can download the patch from the > following page: > > http://www.abo.fi/~iporres/python > > There is no documentation about the changes, but I guess that it is not > so difficult to figure out what Raul has been doing. Ivan, small Python is a very interesting thing, thanks for the preview. But, aren't 12600 lines of diff a little too much to call it "not difficult to figure out"? :-) The very last line was indeed helpful: +++ Pss/miniconfigure Tue Mar 16 16:59:42 1999 @@ -0,0 +1 @@ +./configure --prefix="/home/rparra/python/Python-1.5.1" --without-complex --without-float --without-long --without-file --without-libm --without-libc --without-fpectl --without-threads --without-dec-threads --with-libs= But I'd be interested in a brief list of which other features are out, and even more which structures were changed. Would that be possible? thanks - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From Ivan.Porres@abo.fi Wed May 5 14:17:17 1999 From: Ivan.Porres@abo.fi (Ivan Porres Paltor) Date: Wed, 05 May 1999 16:17:17 +0300 Subject: [Python-Dev] Python for Small Systems patch References: <37300161.8DFD1D7F@abo.fi> <373030F8.21B73451@appliedbiometrics.com> Message-ID: <373044DD.FE4499E@abo.fi> Christian Tismer wrote: > Ivan, > small Python is a very interesting thing, > thanks for the preview. > > But, aren't 12600 lines of diff a little too much > to call it "not difficult to figure out"? :-) Raul Parra (rpb), the author of the patch, got the "source scissors" (#ifndef WITHOUT... #endif) and cut the interpreter until it fitted in an embedded system with some RAM, no keyboard, no screen and no OS.
An example application can be a printer where the print jobs are python bytecompiled scripts (instead of postscript). We plan to write some documentation about the patch. Meanwhile, here are some of the changes: WITHOUT_PARSER, WITHOUT_COMPILER Defining WITHOUT_PARSER removes the parser. This has a lot of implications (no eval() !) but saves a lot of memory. The interpreter can only execute byte-compiled scripts, that is PyCodeObjects. Most embedded processors have poor floating point capabilities. (They cannot compete with DSP's): WITHOUT-COMPLEX Removes support for complex numbers WITHOUT-LONG Removes long numbers WITHOUT-FLOAT Removes floating point numbers Dependencies with the OS: WITHOUT-FILE Removes file objects. No file, no print, no input, no interactive prompt. This is not too bad in a device without hard disk, keyboard or screen... WITHOUT-GETPATH Removes dependencies with os path. (Probably this change should be integrated with WITHOUT-FILE) These changes render most of the standard modules unusable. There are no fundamental changes to the interpreter, just cut and cut.... Ivan -- Ivan Porres Paltor Turku Centre for Computer Science Åbo Akademi, Department of Computer Science Phone: +358-2-2154033 Lemminkäinengatan 14A FIN-20520 Turku - Finland http://www.abo.fi/~iporres From tismer@appliedbiometrics.com Wed May 5 14:31:05 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Wed, 05 May 1999 15:31:05 +0200 Subject: [Python-Dev] Python for Small Systems patch References: <37300161.8DFD1D7F@abo.fi> <373030F8.21B73451@appliedbiometrics.com> <373044DD.FE4499E@abo.fi> Message-ID: <37304819.AD636B67@appliedbiometrics.com> Ivan Porres Paltor wrote: > > Christian Tismer wrote: > > Ivan, > > small Python is a very interesting thing, > > thanks for the preview. > > > > But, aren't 12600 lines of diff a little too much > > to call it "not difficult to figure out"?
:-) > > Raul Parra (rpb), the author of the patch, got the "source scissors" > (#ifndef WITHOUT... #endif) and cut the interpreter until it fitted in a > embedded system with some RAM, no keyboard, no screen and no OS. An > example application can be a printer where the print jobs are python > bytecompiled scripts (instead of postscript). > > We plan to write some documentation about the patch. Meanwhile, here are > some of the changes: Many thanks, this is really interesting > These changes render most of the standard modules unusable. > There are no fundamental changes on the interpter, just cut and cut.... I see. A last thing which I'm curious about is the executable size. If this can be compared to a Windows dll at all. Did you compile without the changes for your target as well? How is the ratio? The python15.dll file contains everything of core Python and is about 560 KB large. If your engine goes down to, say below 200 KB, this could be a great thing for embedding Python into other apps. ciao & thanks - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Wed May 5 15:55:40 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. 
Warsaw) Date: Wed, 5 May 1999 10:55:40 -0400 (EDT) Subject: [Python-Dev] Tcl 8.1's regexp code (was RE: [Python-Dev] Why Foo is better than Baz) References: <199905041226.IAA07627@eric.cnri.reston.va.us> <000701be96b9$4e434460$799e2299@tim> Message-ID: <14128.23532.499380.835737@anthem.cnri.reston.va.us> >>>>> "TP" == Tim Peters writes: TP> Over the long run, moving to a DFA locks Python out of the TP> directions Perl is *moving*, namely embedding all sorts of TP> runtime gimmicks in regexps that exploit knowing the "state of TP> the match so far". DFAs don't work that way. I don't mind TP> losing those possibilities, because I think the regexp TP> sublanguage is strained beyond its limits already. But that's TP> a decision with Big Consequences, so deserves some thought. I know zip about the internals of the various regexp package. But as far as the Python level interface, would it be feasible to support both as underlying regexp engines underneath re.py? The idea would be that you'd add an extra flag (re.PERL / re.TCL ? re.DFA / re.NFA ? re.POSIX / re.USEFUL ? :-) that would select the engine and compiler. Then all the rest of the magic happens behind the scenes, with appropriate exceptions thrown if there are syntax mismatches in the regexp that can't be worked around by preprocessors, etc. Or would that be more confusing than yet another different regexp module? -Barry From tim_one@email.msn.com Wed May 5 16:55:20 1999 From: tim_one@email.msn.com (Tim Peters) Date: Wed, 5 May 1999 11:55:20 -0400 Subject: [Python-Dev] Tcl 8.1's regexp code In-Reply-To: Message-ID: <000601be970f$adef5740$a59e2299@tim> [Tim] > Question: has this package [Tcl's 8.1 regexp support] been released in > any other context, or is it unique to Tcl? I searched in vain for an > announcement (let alone code) from Henry, or any discussion of this code > outside the Tcl world. [Greg Stein] > Apache uses it. > > However, the Apache guys have considered possibility updating the thing. 
I > gather that they have a pretty old snapshot. Another guy mentioned PCRE > and I pointed out that Python uses it for its regex support. In other > words, if Apache *does* update the code, then it may be that Apache will > drop the HS engine in favor of PCRE. Hmm. I just downloaded the Apache 1.3.4 source to check on this, and it appears to be using a lightly massaged version of Spencer's old (circa '92-'94) just-POSIX regexp package. Henry has been distributing regexp pkgs for a loooong time . The Tcl 8.1 regexp pkg is much hairier. If the Apache folk want to switch in order to get the Perl regexp syntax extensions, this Tcl version is worth looking at too. If they want to switch for some other reason, it would be good to know what that is! The base pkg Apache uses is easily available all over the web; the pkg Tcl 8.1 is using I haven't found anywhere except in the Tcl download (which is why I'm wondering about it -- so far, it doesn't appear to be distributed by Spencer himself, in a non-Tcl-customized form). looks-like-an-entirely-new-pkg-to-me-ly y'rs - tim From beazley@cs.uchicago.edu Wed May 5 17:54:45 1999 From: beazley@cs.uchicago.edu (David Beazley) Date: Wed, 5 May 1999 11:54:45 -0500 (CDT) Subject: [Python-Dev] My (possibly delusional) book project Message-ID: <199905051654.LAA11410@tartarus.cs.uchicago.edu> Although this is a little off-topic for the developer list, I want to fill people in on a new Python book project. A few months ago, I was approached about doing a new Python reference book and I've since decided to proceed with the project (after all, an increased presence at the bookstore is probably a good thing :-). In any event, my "vision" for this book is to take the material in the Python tutorial, language reference, library reference, and extension guide and squeeze it into a compact book no longer than 300 pages (and hopefully without having to use a 4-point font). 
Actually, what I'm really trying to do is write something in a style similar to the K&R C Programming book (very terse, straight to the point, and technically accurate). The book's target audience is experienced/expert programmers. With this said, I would really like to get feedback from the developer community about this project in a few areas. First, I want to make sure the language reference is in sync with the latest version of Python, that it is as accurate as possible, and that it doesn't leave out any important topics or recent developments. Second, I would be interested in knowing how to emphasize certain topics (for instance, should I emphasize class-based exceptions over string-based exceptions even though most books only cover the former case?). The other big area is the library reference. Given the size of the library, I'm going to cut a number of modules out. However, the choice of what to cut is not entirely clear (for now, it's a judgment call on my part). All of the work in progress for this project is online at: http://rustler.cs.uchicago.edu/~beazley/essential/reference.html I would love to get constructive feedback about this from other developers. Of course, I'll keep people posted in any case. Cheers, Dave From tim_one@email.msn.com Thu May 6 06:43:16 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 6 May 1999 01:43:16 -0400 Subject: [Python-Dev] Tcl 8.1's regexp code (was RE: [Python-Dev] Why Foo is better than Baz) In-Reply-To: <14128.23532.499380.835737@anthem.cnri.reston.va.us> Message-ID: <000d01be9783$57543940$2ca22299@tim> [Tim notes that moving to a DFA regexp engine would rule out some future aping of Perl mistakes ] [Barry "The Great Compromiser" Warsaw] > I know zip about the internals of the various regexp package. But as > far as the Python level interface, would it be feasible to support > both as underlying regexp engines underneath re.py? The idea would be > that you'd add an extra flag (re.PERL / re.TCL ? 
re.DFA / re.NFA ? > re.POSIX / re.USEFUL ? :-) that would select the engine and compiler. > Then all the rest of the magic happens behind the scenes, with > appropriate exceptions thrown if there are syntax mismatches in the > regexp that can't be worked around by preprocessors, etc. > > Or would that be more confusing than yet another different regexp > module? It depends some on what percentage of the Python distribution Guido wants to devote to regexp code <0.6 wink>; the Tcl pkg would be the largest block of code in Modules/, where regexp packages already consume more than anything else. It's a lot of delicate, difficult code. Someone would need to step up and champion each alternative package. I haven't asked Andrew lately, but I'd bet half a buck the thrill of supporting pcre has waned. If there were competing packages, your suggested interface is fine. I just doubt the Python developers will support more than one (Andrew may still be young, but he can't possibly still be naive enough to sign up for two of these nightmares). i'm-so-old-i-never-signed-up-for-one-ly y'rs - tim From rushing@nightmare.com Thu May 13 07:34:19 1999 From: rushing@nightmare.com (rushing@nightmare.com) Date: Wed, 12 May 1999 23:34:19 -0700 (PDT) Subject: [Python-Dev] 'stackless' python? In-Reply-To: <199905070507.BAA22545@python.org> References: <199905070507.BAA22545@python.org> Message-ID: <14138.28243.553816.166686@seattle.nightmare.com> [list has been quiet, thought I'd liven things up a bit. 8^)] I'm not sure if this has been brought up before in other forums, but has there been discussion of separating the Python and C invocation stacks, (i.e., removing recursive calls to the interpreter) to facilitate coroutines or first-class continuations? One of the biggest barriers to getting others to use asyncore/medusa is the need to program in continuation-passing-style (callbacks, callbacks to callbacks, state machines, etc...).
Usually there has to be an overriding requirement for speed/scalability before someone will even look into it. And even when you do 'get' it, there are limits to how inside-out your thinking can go. 8^) If Python had coroutines/continuations, it would be possible to hide asyncore-style select()/poll() machinery 'behind the scenes'. I believe that Concurrent ML does exactly this... Other advantages might be restartable exceptions, different threading models, etc... -Sam rushing@nightmare.com rushing@eGroups.net From mal@lemburg.com Thu May 13 09:23:13 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 13 May 1999 10:23:13 +0200 Subject: [Python-Dev] 'stackless' python? References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> Message-ID: <373A8BF1.AE124BF@lemburg.com> rushing@nightmare.com wrote: > > [list has been quiet, thought I'd liven things up a bit. 8^)] Well, there certainly is enough on the todo list... it's probably the usual "ain't got no time" thing. > I'm not sure if this has been brought up before in other forums, but > has there been discussion of separating the Python and C invocation > stacks, (i.e., removing recursive calls to the intepreter) to > facilitate coroutines or first-class continuations? Wouldn't it be possible to move all the C variables passed to eval_code() via the execution frame ? AFAIK, the frame is generated on every call to eval_code() and thus could also be generated *before* calling it. > One of the biggest barriers to getting others to use asyncore/medusa > is the need to program in continuation-passing-style (callbacks, > callbacks to callbacks, state machines, etc...). Usually there has to > be an overriding requirement for speed/scalability before someone will > even look into it. And even when you do 'get' it, there are limits to > how inside-out your thinking can go. 
8^) > > If Python had coroutines/continuations, it would be possible to hide > asyncore-style select()/poll() machinery 'behind the scenes'. I > believe that Concurrent ML does exactly this... > > Other advantages might be restartable exceptions, different threading > models, etc... Don't know if moving the C stack stuff into the frame objects will get you the desired effect: what about other things having state (e.g. connections or files), that are not even touched by this mechanism ? -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: Y2000: 232 days left Business: http://www.lemburg.com/ Python Pages: http://starship.python.net/crew/lemburg/ From rushing@nightmare.com Thu May 13 10:40:19 1999 From: rushing@nightmare.com (rushing@nightmare.com) Date: Thu, 13 May 1999 02:40:19 -0700 (PDT) Subject: [Python-Dev] 'stackless' python? In-Reply-To: <373A8BF1.AE124BF@lemburg.com> References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <373A8BF1.AE124BF@lemburg.com> Message-ID: <14138.38550.89759.752058@seattle.nightmare.com> M.-A. Lemburg writes: > Wouldn't it be possible to move all the C variables passed to > eval_code() via the execution frame ? AFAIK, the frame is > generated on every call to eval_code() and thus could also > be generated *before* calling it. I think this solves half of the problem. The C stack is both a value stack and an execution stack (i.e., it holds variables and return addresses). Getting rid of arguments (and a return value!) gets rid of the need for the 'value stack' aspect. In aiming for an enter-once, exit-once VM, the thorniest part is to somehow allow python->c->python calls. The second invocation could never save a continuation because its execution context includes a C frame. This is a general problem, not specific to Python; I probably should have thought about it a bit before posting... 
> Don't know if moving the C stack stuff into the frame objects > will get you the desired effect: what about other things having > state (e.g. connections or files), that are not even touched > by this mechanism ? I don't think either of those cause 'real' problems (i.e., nothing should crash that assumes an open file or socket), but there may be other stateful things that might. I don't think that refcounts would be a problem - a saved continuation wouldn't be all that different from an exception traceback. -Sam p.s. Here's a tiny VM experiment I wrote a while back, to explain what I mean by 'stackless': http://www.nightmare.com/stuff/machine.h http://www.nightmare.com/stuff/machine.c Note how OP_INVOKE (the PROC_CLOSURE clause) pushes new context onto heap-allocated data structures rather than calling the VM recursively.
From skip@mojam.com (Skip Montanaro) Thu May 13 12:38:39 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Thu, 13 May 1999 07:38:39 -0400 (EDT) Subject: [Python-Dev] 'stackless' python? In-Reply-To: <14138.28243.553816.166686@seattle.nightmare.com> References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> Message-ID: <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> Sam> I'm not sure if this has been brought up before in other forums, Sam> but has there been discussion of separating the Python and C Sam> invocation stacks, (i.e., removing recursive calls to the Sam> interpreter) to facilitate coroutines or first-class continuations? I thought Guido was working on that for the mobile agent stuff he was working on at CNRI. Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/ skip@mojam.com | Musi-Cal: http://www.musi-cal.com/ 518-372-5583
From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Thu May 13 16:10:52 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw) Date: Thu, 13 May 1999 11:10:52 -0400 (EDT) Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> Message-ID: <14138.60284.584739.711112@anthem.cnri.reston.va.us> >>>>> "SM" == Skip Montanaro writes: SM> I thought Guido was working on that for the mobile agent stuff SM> he was working on at CNRI. Nope, we decided that we could accomplish everything we needed without this. We occasionally revisit this but Guido keeps insisting it's a lot of work for not enough benefit :-) -Barry
From guido@CNRI.Reston.VA.US Thu May 13 16:19:10 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 13 May 1999 11:19:10 -0400 Subject: [Python-Dev] 'stackless' python? In-Reply-To: Your message of "Thu, 13 May 1999 07:38:39 EDT." <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> Message-ID: <199905131519.LAA01097@eric.cnri.reston.va.us> Interesting topic! While I'm on the road, a few short notes. > I thought Guido was working on that for the mobile agent stuff he was > working on at CNRI. Indeed. At least I planned on working on it. I ended up abandoning the idea because I expected it would be a lot of work and I never had the time (same old story indeed). Sam also hit it on the nail: the hardest problem is what to do about all the places where C calls back into Python. I've come up with two partial solutions: (1) allow for a way to arrange for a call to be made immediately after you return to the VM from C; this would take care of apply() at least and a few other "tail-recursive" cases; (2) invoke a new VM when C code needs a Python result, requiring it to return. The latter clearly breaks certain uses of coroutines but could probably be made to work most of the time. Typical use of the 80-20 rule.
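[Editorial sketch: Guido's partial solution (1) is essentially a trampoline — C code never re-enters the VM recursively; instead it hands a description of the pending call back to the VM loop, which performs it. A hedged Python sketch of the idea only; all names are invented and this is not CPython code.]

```python
# Trampoline sketch of option (1): a "builtin" returns a description
# of the call it wants made, and the main loop performs the call.

class PendingCall:
    def __init__(self, func, args):
        self.func, self.args = func, args

def builtin_apply(func, args):
    # A C-implemented apply() would normally recurse into the VM;
    # here it just *describes* the call and returns to the loop.
    return PendingCall(func, args)

def mainloop(result):
    # The VM loop keeps performing pending calls until a plain
    # value comes back -- the C stack never gets any deeper.
    while isinstance(result, PendingCall):
        result = result.func(*result.args)
    return result

def double(x):
    return 2 * x

mainloop(builtin_apply(double, (21,)))   # -> 42
```

This covers exactly the "tail-recursive" shape Guido names: it works when the C code needs nothing further after the Python call completes, which is why it takes care of apply() but not the general callback case.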
And I've just come up with a third solution: a variation on (1) where you arrange *two* calls: one to Python and then one to C, with the result of the first. (And a bit saying whether you want the C call to be made even when an exception happened.) In general, I still think it's a cool idea, but I also still think that continuations are too complicated for most programmers. (This comes from the realization that they are too complicated for me!) Corollary: even if we had continuations, I'm not sure if this would take away the resistance against asyncore/asynchat. Of course I could be wrong. Different suggestion: it would be cool to work on completely separating out the VM from the rest of Python, through some kind of C-level API specification. Two things should be possible with this new architecture: (1) small platform ports could cut out the interactive interpreter, the parser and compiler, and certain data types such as long, complex and files; (2) there could be alternative pluggable VMs with certain desirable properties such as platform-specific optimization (Christian, are you listening? :-). I think the most challenging part might be defining an API for passing in the set of supported object types and operations. E.g. the EXEC_STMT opcode needs to be implemented in a way that allows "exec" to be absent from the language. Perhaps an __exec__ function (analogous to __import__) is the way to go. The set of built-in functions should also be passed in, so that e.g. one can easily leave out open(), eval() and compile(), complex(), long(), float(), etc. I think it would be ideal if no #ifdefs were needed to remove features (at least not in the VM code proper). Fortunately, the VM doesn't really know about many object types -- frames, functions, methods, classes, ints, strings, dictionaries, tuples, tracebacks, that may be all it knows. (Lists?)
Gotta run, --Guido van Rossum (home page: http://www.python.org/~guido/)
From fredrik@pythonware.com Thu May 13 20:50:44 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Thu, 13 May 1999 21:50:44 +0200 Subject: [Python-Dev] 'stackless' python? References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <199905131519.LAA01097@eric.cnri.reston.va.us> Message-ID: <01d501be9d79$e4890060$f29b12c2@pythonware.com> > In general, I still think it's a cool idea, but I also still think > that continuations are too complicated for most programmers. (This > comes from the realization that they are too complicated for me!) in an earlier life, I used non-preemptive threads (that is, explicit yields) and co-routines to do some really cool stuff with very little code. looks like a stack-less interpreter would make it trivial to implement that. might just be nostalgia, but I think I would give an arm or two to get that (not necessarily my own, though ;-)
From rushing@nightmare.com Fri May 14 03:00:09 1999 From: rushing@nightmare.com (rushing@nightmare.com) Date: Thu, 13 May 1999 19:00:09 -0700 (PDT) Subject: [Python-Dev] 'stackless' python? References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> Message-ID: <14139.30970.644343.612721@seattle.nightmare.com> Guido van Rossum writes: > I've come up with two partial solutions: (1) allow for a way to > arrange for a call to be made immediately after you return to the > VM from C; this would take care of apply() at least and a few > other "tail-recursive" cases; (2) invoke a new VM when C code > needs a Python result, requiring it to return. The latter clearly > breaks certain uses of coroutines but could probably be made to > work most of the time. Typical use of the 80-20 rule.
I know this is disgusting, but could setjmp/longjmp 'automagically' force a 'recursive call' to jump back into the top-level loop? This would put some serious restraint on what C called from Python could do... I think just about any Scheme implementation has to solve this same problem... I'll dig through my collection of them for ideas. > In general, I still think it's a cool idea, but I also still think > that continuations are too complicated for most programmers. (This > comes from the realization that they are too complicated for me!) > Corollary: even if we had continuations, I'm not sure if this would > take away the resistance against asyncore/asynchat. Of course I could > be wrong. Theoretically, you could have a bit of code that looked just like 'normal' imperative code, that would actually be entering and exiting the context for non-blocking i/o. If it were done right, the same exact code might even run under 'normal' threads. Recently I've written an async server that needed to talk to several other RPC servers, and a mysql server. Pseudo-example, with possibly-async calls in UPPERCASE: auth, archive = db.FETCH_USER_INFO (user) if verify_login(user,auth): rpc_server = self.archive_servers[archive] group_info = rpc_server.FETCH_GROUP_INFO (group) if valid (group_info): return rpc_server.FETCH_MESSAGE (message_number) else: ... else: ... This code in CPS is a horrible, complicated mess, it takes something like 8 callback methods, variables and exceptions have to be passed around in 'continuation' objects. It's hairy because there are three levels of callback state. Ugh. If Python had closures, then it would be a *little* easier, but would still make the average Pythoneer swoon. Closures would let you put the above logic all in one method, but the code would still be 'inside-out'. > Different suggestion: it would be cool to work on completely > separating out the VM from the rest of Python, through some kind of > C-level API specification. 
I think this is a great idea. I've been staring at python bytecodes a bit lately thinking about how to do something like this, for some subset of Python. [...] Ok, we've all seen the 'stick'. I guess I should give an example of the 'carrot': I think that a web server built on such a Python could have the performance/scalability of thttpd, with the ease-of-programming of Roxen. As far as I know, there's nothing like it out there. Medusa would be put out to pasture. 8^) -Sam From guido@CNRI.Reston.VA.US Fri May 14 13:03:31 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 14 May 1999 08:03:31 -0400 Subject: [Python-Dev] 'stackless' python? In-Reply-To: Your message of "Thu, 13 May 1999 19:00:09 PDT." <14139.30970.644343.612721@seattle.nightmare.com> References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com> Message-ID: <199905141203.IAA01808@eric.cnri.reston.va.us> > I know this is disgusting, but could setjmp/longjmp 'automagically' > force a 'recursive call' to jump back into the top-level loop? This > would put some serious restraint on what C called from Python could > do... Forget about it. setjmp/longjmp are invitations to problems. I also assume that they would interfere badly with C++. > I think just about any Scheme implementation has to solve this same > problem... I'll dig through my collection of them for ideas. Anything that assumes knowledge about how the C compiler and/or the CPU and OS lay out the stack is a no-no, because it means that the first thing one has to do for a port to a new architecture is figure out how the stack is laid out. Another thread in this list is porting Python to microplatforms like PalmOS. Typically the scheme Hackers are not afraid to delve deep into the machine, but I refuse to do that -- I think it's too risky. 
> > In general, I still think it's a cool idea, but I also still think > > that continuations are too complicated for most programmers. (This > > comes from the realization that they are too complicated for me!) > > Corollary: even if we had continuations, I'm not sure if this would > > take away the resistance against asyncore/asynchat. Of course I could > > be wrong. > > Theoretically, you could have a bit of code that looked just like > 'normal' imperative code, that would actually be entering and exiting > the context for non-blocking i/o. If it were done right, the same > exact code might even run under 'normal' threads. Yes -- I remember in 92 or 93 I worked out a way to emulate coroutines with regular threads. (I think in cooperation with Steve Majewski.) > Recently I've written an async server that needed to talk to several > other RPC servers, and a mysql server. Pseudo-example, with > possibly-async calls in UPPERCASE: > > auth, archive = db.FETCH_USER_INFO (user) > if verify_login(user,auth): > rpc_server = self.archive_servers[archive] > group_info = rpc_server.FETCH_GROUP_INFO (group) > if valid (group_info): > return rpc_server.FETCH_MESSAGE (message_number) > else: > ... > else: > ... > > This code in CPS is a horrible, complicated mess, it takes something > like 8 callback methods, variables and exceptions have to be passed > around in 'continuation' objects. It's hairy because there are three > levels of callback state. Ugh. Agreed. > If Python had closures, then it would be a *little* easier, but would > still make the average Pythoneer swoon. Closures would let you put > the above logic all in one method, but the code would still be > 'inside-out'. I forget how this worked :-( > > Different suggestion: it would be cool to work on completely > separating out the VM from the rest of Python, through some kind of > C-level API specification. > > I think this is a great idea.
I've been staring at python bytecodes a > bit lately thinking about how to do something like this, for some > subset of Python. > > [...] > > Ok, we've all seen the 'stick'. I guess I should give an example of > the 'carrot': I think that a web server built on such a Python could > have the performance/scalability of thttpd, with the > ease-of-programming of Roxen. As far as I know, there's nothing like > it out there. Medusa would be put out to pasture. 8^) I'm afraid I haven't kept up -- what are Roxen and thttpd? What do they do that Apache doesn't? --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik@pythonware.com Fri May 14 14:16:13 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Fri, 14 May 1999 15:16:13 +0200 Subject: [Python-Dev] 'stackless' python? References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us> Message-ID: <001701be9e0b$f1bc4930$f29b12c2@pythonware.com> > I'm afraid I haven't kept up -- what are Roxen and thttpd? What do > they do that Apache doesn't? http://www.roxen.com/ a lean and mean secure web server written in Pike (http://pike.idonex.se/), from a company here in Linköping. From tismer@appliedbiometrics.com Fri May 14 16:15:20 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Fri, 14 May 1999 17:15:20 +0200 Subject: [Python-Dev] 'stackless' python? 
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us> Message-ID: <373C3E08.FCCB141B@appliedbiometrics.com> Guido van Rossum wrote: [setjmp/longjmp -no-no] > Forget about it. setjmp/longjmp are invitations to problems. I also > assume that they would interfere badly with C++. > > > I think just about any Scheme implementation has to solve this same > > problem... I'll dig through my collection of them for ideas. > > Anything that assumes knowledge about how the C compiler and/or the > CPU and OS lay out the stack is a no-no, because it means that the > first thing one has to do for a port to a new architecture is figure > out how the stack is laid out. Another thread in this list is porting > Python to microplatforms like PalmOS. Typically the scheme Hackers > are not afraid to delve deep into the machine, but I refuse to do that > -- I think it's too risky. ... I agree that this is generally bad. While it's a cakewalk to do a stack swap for the few (X86 based:) platforms I work with. This is much less than a thread change. But on the general issues: Can the Python-calls-C and C-calls-Python problem just be solved by turning the whole VM state into a data structure, including a Python call stack which is independent? Maybe this has been mentioned already. This might give a little slowdown, but opens possibilities like continuation-passing style, and context switches between different interpreter states would be under direct control. Just a little dreaming: Not using threads, but just tiny interpreter incarnations with local state, and a special C call or better a new opcode which activates the next state in some list (of course a Python list).
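[Editorial sketch: Christian's "tiny interpreter incarnations" kept in a plain Python list can be sketched with ordinary objects, the hypothetical activate-the-next-state opcode played by a scheduler loop. All names here are invented for illustration.]

```python
# Cooperative round-robin over interpreter states held in a list:
# each Incarnation is an instruction pointer plus local state, and
# "switching" is just picking the next entry -- no OS threads.

class Incarnation:
    """One interpreter state: a program counter plus its 'bytecode'."""
    def __init__(self, name, steps):
        self.name = name
        self.steps = steps   # list of thunks standing in for bytecode
        self.pc = 0
    def step(self):
        if self.pc < len(self.steps):
            self.steps[self.pc]()
            self.pc += 1
        return self.pc < len(self.steps)   # False when finished

def round_robin(states):
    # The hypothetical "activate next state" opcode, as a loop:
    # run one step of the front state, re-queue it if unfinished.
    log = []
    while states:
        state = states.pop(0)
        log.append(state.name)
        if state.step():
            states.append(state)
    return log

a = Incarnation("A", [lambda: None] * 2)
b = Incarnation("B", [lambda: None] * 1)
round_robin([a, b])   # -> ["A", "B", "A"]
```

The point of the sketch is that once interpreter state is first-class data, "which computation runs next" becomes an ordinary programming decision, which is exactly what makes the iterator and coroutine shapes fall out.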
This would automagically produce ICON iterators (duck) and coroutines (cover). If I guess right, continuation passing could be done by just shifting tiny tuples around. Well, Tim, help me :-) [closures] > > I think this is a great idea. I've been staring at python bytecodes a > > bit lately thinking about how to do something like this, for some > > subset of Python. Lumberjack? How is it going? [to Sam] ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Fri May 14 16:32:51 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw) Date: Fri, 14 May 1999 11:32:51 -0400 (EDT) Subject: [Python-Dev] 'stackless' python? References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us> <001701be9e0b$f1bc4930$f29b12c2@pythonware.com> Message-ID: <14140.16931.987089.887772@anthem.cnri.reston.va.us> >>>>> "FL" == Fredrik Lundh writes: FL> a lean and mean secure web server written in Pike FL> (http://pike.idonex.se/), from a company here in FL> Linköping. Interesting off-topic Pike connection. My co-maintainer for CC-Mode original came on board to add Pike support, which has a syntax similar enough to C to be easily integrated. 
I think I've had as much success convincing him to use Python as he's had convincing me to use Pike :-) -Barry From gstein@lyra.org Fri May 14 22:54:02 1999 From: gstein@lyra.org (Greg Stein) Date: Fri, 14 May 1999 14:54:02 -0700 Subject: [Python-Dev] Roxen (was Re: [Python-Dev] 'stackless' python?) References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us> <001701be9e0b$f1bc4930$f29b12c2@pythonware.com> <14140.16931.987089.887772@anthem.cnri.reston.va.us> Message-ID: <373C9B7A.3676A910@lyra.org> Barry A. Warsaw wrote: > > >>>>> "FL" == Fredrik Lundh writes: > > FL> a lean and mean secure web server written in Pike > FL> (http://pike.idonex.se/), from a company here in > FL> Linköping. > > Interesting off-topic Pike connection. My co-maintainer for CC-Mode > original came on board to add Pike support, which has a syntax similar > enough to C to be easily integrated. I think I've had as much success > convincing him to use Python as he's had convincing me to use Pike :-) Heh. Pike is an outgrowth of the MUD world's LPC programming language. A guy named "Profezzorn" started a project (in '94?) to redevelop an LPC compiler/interpreter ("driver") from scratch to avoid some licensing constraints. The project grew into a generalized network handler, since MUDs' typical designs are excellent for these tasks. From there, you get the Roxen web server. Cheers, -g -- Greg Stein, http://www.lyra.org/ From rushing@nightmare.com Sat May 15 00:36:11 1999 From: rushing@nightmare.com (rushing@nightmare.com) Date: Fri, 14 May 1999 16:36:11 -0700 (PDT) Subject: [Python-Dev] 'stackless' python? 
In-Reply-To: <199905141203.IAA01808@eric.cnri.reston.va.us> References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us> Message-ID: <14140.44469.848840.740112@seattle.nightmare.com> Guido van Rossum writes: > > If Python had closures, then it would be a *little* easier, but would > > still make the average Pythoneer swoon. Closures would let you put > > the above logic all in one method, but the code would still be > > 'inside-out'. > > I forget how this worked :-( [with a faked-up lambda-ish syntax]

def thing (a):
    return do_async_job_1 (a,
        lambda (b):
            if (a>1):
                do_async_job_2a (b,
                    lambda (c):
                        [...]
                )
            else:
                do_async_job_2b (a,b,
                    lambda (d,e,f):
                        [...]
                )
    )

The call to do_async_job_1 passes 'a', and a callback, which is specified 'in-line'. You can follow the logic of something like this more easily than if each lambda is spun off into a different function/method. > > I think that a web server built on such a Python could have the > > performance/scalability of thttpd, with the ease-of-programming > > of Roxen. As far as I know, there's nothing like it out there. > > Medusa would be put out to pasture. 8^) > > I'm afraid I haven't kept up -- what are Roxen and thttpd? What do > they do that Apache doesn't? thttpd (& Zeus, Squid, Xitami) use select()/poll() to gain performance and scalability, but suffer from the same programmability problem as Medusa (only worse, 'cause they're in C). Roxen is written in Pike, a c-like language with gc, threads, etc... Roxen is I think now the official 'GNU Web Server'.
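[Editorial sketch: the select()-based style Sam is contrasting with thread-per-connection servers boils down to one dispatch round like the following. This is a minimal sketch, not Medusa's or thttpd's actual code; the names and buffer sizes are invented.]

```python
# One round of a select()-style event loop: a single process watches
# every socket and fires callbacks as readiness arrives.

import select
import socket

def pump(sockets, listener, handle, timeout=None):
    """Run one round of the event loop; returns how many sockets fired."""
    readable, _, _ = select.select(sockets, [], [], timeout)
    for s in readable:
        if s is listener:
            conn, _ = s.accept()       # new client: multiplex it too
            sockets.append(conn)
        else:
            data = s.recv(4096)
            if data:
                s.send(handle(data))   # the "callback" fires here
            else:
                sockets.remove(s)      # peer closed the connection
                s.close()
    return len(readable)
```

A real server makes a listening socket and loops `pump` forever; the scalability comes from the fact that thousands of idle connections cost nothing but a file descriptor each, and the programmability problem comes from the same fact — your logic is chopped into whatever fits between readiness events.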
Here's an interesting web-server comparison chart: http://www.acme.com/software/thttpd/benchmarks.html -Sam From guido@CNRI.Reston.VA.US Sat May 15 03:23:24 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 14 May 1999 22:23:24 -0400 Subject: [Python-Dev] 'stackless' python? In-Reply-To: Your message of "Fri, 14 May 1999 16:36:11 PDT." <14140.44469.848840.740112@seattle.nightmare.com> References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us> <14140.44469.848840.740112@seattle.nightmare.com> Message-ID: <199905150223.WAA02457@eric.cnri.reston.va.us> > def thing (a): > return do_async_job_1 (a, > lambda (b): > if (a>1): > do_async_job_2a (b, > lambda (c): > [...] > ) > else: > do_async_job_2b (a,b, > lambda (d,e,f): > [...] > ) > ) > > The call to do_async_job_1 passes 'a', and a callback, which is > specified 'in-line'. You can follow the logic of something like this > more easily than if each lambda is spun off into a different > function/method. I agree that it is still ugly. > http://www.acme.com/software/thttpd/benchmarks.html I see. Any pointers to a graph of thttp market share? --Guido van Rossum (home page: http://www.python.org/~guido/) From tim_one@email.msn.com Sat May 15 08:51:00 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sat, 15 May 1999 03:51:00 -0400 Subject: [Python-Dev] 'stackless' python? In-Reply-To: <199905141203.IAA01808@eric.cnri.reston.va.us> Message-ID: <000701be9ea7$acab7f40$159e2299@tim> [GvR] > ... > Anything that assumes knowledge about how the C compiler and/or the > CPU and OS lay out the stack is a no-no, because it means that the > first thing one has to do for a port to a new architecture is figure > out how the stack is laid out. 
out how the stack is laid out. Another thread in this list is porting > Python to microplatforms like PalmOS. Typically the scheme Hackers > are not afraid to delve deep into the machine, but I refuse to do that > -- I think it's too risky. The Icon language needs a bit of platform-specific context-switching assembly code to support its full coroutine features, although its bread-and-butter generators ("semi coroutines") don't need anything special. The result is that Icon ports sometimes limp for a year before they support full coroutines, waiting for someone wizardly enough to write the necessary code. This can, in fact, be quite difficult; e.g., on machines with HW register windows (where "the stack" can be a complicated beast half buried in hidden machine state, sometimes needing kernel privilege to uncover). Not attractive. Generators are, though. threads-too-ly y'rs - tim
From tim_one@email.msn.com Sat May 15 08:51:03 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sat, 15 May 1999 03:51:03 -0400 Subject: [Python-Dev] 'stackless' python? In-Reply-To: <373C3E08.FCCB141B@appliedbiometrics.com> Message-ID: <000801be9ea7$ae45f560$159e2299@tim> [Christian Tismer] > ... > But on the general issues: > Can the Python-calls-C and C-calls-Python problem just be solved > by turning the whole VM state into a data structure, including > a Python call stack which is independent? Maybe this has been > mentioned already. The problem is that when C calls Python, any notion of continuation has to include C's state too, else resuming the continuation won't return into C correctly. The C code that *implements* Python could be reworked to support this, but in the general case you've got some external C extension module calling into Python, and then Python hasn't a clue about its caller's state.
I'm not a fan of continuations myself; coroutines can be implemented faithfully via threads (I posted a rather complete set of Python classes for that in the pre-DejaNews days, a bit more flexible than Icon's coroutines); and: > This would automagically produce ICON iterators (duck) > and coroutines (cover). Icon iterators/generators could be implemented today if anyone bothered (Majewski essentially implemented them back around '93 already, but seemed to lose interest when he realized it couldn't be extended to full continuations, because of C/Python stack intertwingling). > If I guess right, continuation passing could be done > by just shifting tiny tuples around. Well, Tim, help me :-) Python-calling-Python continuations should be easily doable in a "stackless" Python; the key ideas were already covered in this thread, I think. The thing that makes generators so much easier is that they always return directly to their caller, at the point of call; so no C frame can get stuck in the middle even under today's implementation; it just requires not deleting the generator's frame object, and adding an opcode to *resume* the frame's execution the next time the generator is called. Unlike as in Icon, it wouldn't even need to be tied to a funky notion of goal-directed evaluation. don't-try-to-traverse-a-tree-without-it-ly y'rs - tim From gstein@lyra.org Sat May 15 09:17:15 1999 From: gstein@lyra.org (Greg Stein) Date: Sat, 15 May 1999 01:17:15 -0700 Subject: [Python-Dev] 'stackless' python? References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us> <14140.44469.848840.740112@seattle.nightmare.com> <199905150223.WAA02457@eric.cnri.reston.va.us> Message-ID: <373D2D8B.390C523C@lyra.org> Guido van Rossum wrote: > ... 
> > http://www.acme.com/software/thttpd/benchmarks.html > > I see. Any pointers to a graph of thttp market share? thttpd currently has about 70k sites (of 5.4mil found by Netcraft). That puts it at #6. However, it is interesting to note that 60k of those sites are in the .uk domain. I can't figure out who is running it, but I would guess that a large UK-based ISP is hosting a bunch of domains on thttpd. It is somewhat difficult to navigate the various reports (and it never fails that the one you want is not present), but the data is from Netcraft's survey at: http://www.netcraft.com/survey/ Cheers, -g -- Greg Stein, http://www.lyra.org/
From rushing@nightmare.com Sun May 16 12:10:18 1999 From: rushing@nightmare.com (rushing@nightmare.com) Date: Sun, 16 May 1999 04:10:18 -0700 (PDT) Subject: [Python-Dev] 'stackless' python? In-Reply-To: <81365478@toto.iv> Message-ID: <14142.40867.103424.764346@seattle.nightmare.com> Tim Peters writes: > I'm not a fan of continuations myself; coroutines can be > implemented faithfully via threads (I posted a rather complete set > of Python classes for that in the pre-DejaNews days, a bit more > flexible than Icon's coroutines); and: Continuations are more powerful than coroutines, though I admit they're a bit esoteric. I programmed in Scheme for years without seeing the need for them. But when you need 'em, you *really* need 'em. No way around it. For my purposes (massively scalable single-process servers and clients) threads don't cut it... for example I have a mailing-list exploder that juggles up to 2048 simultaneous SMTP connections.
I think it can go higher - I've tested select() on FreeBSD with 16,000 file descriptors. [...] BTW, I have actually made progress borrowing a bit of code from SCM. It uses the stack-copying technique, along with setjmp/longjmp. It's too ugly and unportable to be a real candidate for inclusion in Official Python. [i.e., if it could be made to work it should be considered a stopgap measure for the desperate]. I haven't tested it thoroughly, but I have successfully saved and invoked (and reinvoked) a continuation. Caveat: I have to turn off Py_DECREF in order to keep it from crashing.

| >>> import callcc
| >>> saved = None
| >>> def thing(n):
| ...     if n == 2:
| ...         global saved
| ...         saved = callcc.new()
| ...     print 'n==',n
| ...     if n == 0:
| ...         print 'Done!'
| ...     else:
| ...         thing (n-1)
| ...
| >>> thing (5)
| n== 5
| n== 4
| n== 3
| n== 2
| n== 1
| n== 0
| Done!
| >>> saved
| <Continuation object at 80d30d0>
| >>> saved.throw (0)
| n== 2
| n== 1
| n== 0
| Done!
| >>> saved.throw (0)
| n== 2
| n== 1
| n== 0
| Done!
| >>>

I will probably not be able to work on this for a while (baby due any day now), so anyone is welcome to dive right in. I don't have much experience wading through gdb tracking down reference bugs, I'm hoping a brave soul will pick up where I left off. 8^)

http://www.nightmare.com/stuff/python-callcc.tar.gz
ftp://www.nightmare.com/stuff/python-callcc.tar.gz

-Sam

From tismer@appliedbiometrics.com Sun May 16 16:31:01 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Sun, 16 May 1999 17:31:01 +0200
Subject: [Python-Dev] 'stackless' python?
References: <14142.40867.103424.764346@seattle.nightmare.com>
Message-ID: <373EE4B5.6EE6A678@appliedbiometrics.com>

rushing@nightmare.com wrote:
[...]
> BTW, I have actually made progress borrowing a bit of code from SCM.
> It uses the stack-copying technique, along with setjmp/longjmp. It's
> too ugly and unportable to be a real candidate for inclusion in
> Official Python.
[i.e., if it could be made to work it should be > considered a stopgap measure for the desperate]. I tried it and built it as a Win32 .pyd file, and it seems to work, but... > I haven't tested it thoroughly, but I have successfully saved and > invoked (and reinvoked) a continuation. Caveat: I have to turn off > Py_DECREF in order to keep it from crashing. Indeed, and this seems to be a problem too hard to solve without lots of work. Since you keep a snapshot of the current machine stack, it contains a number of object references which have been valid when the snapshot was taken, but many are most probably invalid when you restart the continuation. I guess, incref-ing all current alive objects on the interpreter stack would be the minimum, maybe more. A tuple of necessary references could be used as an attribute of a Continuation object. I will look how difficult this is. ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From tismer@appliedbiometrics.com Sun May 16 19:31:01 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Sun, 16 May 1999 20:31:01 +0200 Subject: [Python-Dev] 'stackless' python? References: <14142.40867.103424.764346@seattle.nightmare.com> <373EE4B5.6EE6A678@appliedbiometrics.com> Message-ID: <373F0EE5.A8DE00C5@appliedbiometrics.com> Christian Tismer wrote: > > rushing@nightmare.com wrote: [...] > > I haven't tested it thoroughly, but I have successfully saved and > > invoked (and reinvoked) a continuation. Caveat: I have to turn off > > Py_DECREF in order to keep it from crashing. It is possible, but a little hard. 
To take a working snapshot of the current thread's stack, one needs not only the stack snapshot which continue.c provides, but also a restorable copy of all frame objects involved so far. A copy of the current frame chain must be built, with proper reference counting of all involved elements. And this is the crux: The current stack pointer of the VM is not present in the frame objects, but hangs around somewhere on the machine stack. Two solutions: 1) modify PyFrameObject by adding a field which holds the stack pointer, when a function is called. I don't like to change the VM in any way for this. 2) use the lasti field which holds the last VM instruction offset. Then scan the opcodes of the code object and calculate the current stack level. This is possible since Guido's code generator creates code with the stack level lexically bound to the code offset. Now we can incref all the referenced objects in the frame. This must be done for the whole chain, which is copied and relinked during that. This chain is then held as a property of the continuation object. To throw the continuation, the current frame chain must be cleared, and the saved one is inserted, together with the machine stack operation which Sam has already. A little hefty, isn't it? ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From tim_one@email.msn.com Mon May 17 06:42:59 1999 From: tim_one@email.msn.com (Tim Peters) Date: Mon, 17 May 1999 01:42:59 -0400 Subject: [Python-Dev] 'stackless' python? In-Reply-To: <14142.40867.103424.764346@seattle.nightmare.com> Message-ID: <000f01bea028$1f75c360$fb9e2299@tim> [Sam] > Continuations are more powerful than coroutines, though I admit > they're a bit esoteric. 
"More powerful" is a tedious argument you should always avoid . > I programmed in Scheme for years without seeing the need for them. > But when you need 'em, you *really* need 'em. No way around it. > > For my purposes (massively scalable single-process servers and > clients) threads don't cut it... for example I have a mailing-list > exploder that juggles up to 2048 simultaneous SMTP connections. I > think it can go higher - I've tested select() on FreeBSD with 16,000 > file descriptors. The other point being that you want to avoid "inside out" logic, though, right? Earlier you posted a kind of ideal: Recently I've written an async server that needed to talk to several other RPC servers, and a mysql server. Pseudo-example, with possibly-async calls in UPPERCASE: auth, archive = db.FETCH_USER_INFO (user) if verify_login(user,auth): rpc_server = self.archive_servers[archive] group_info = rpc_server.FETCH_GROUP_INFO (group) if valid (group_info): return rpc_server.FETCH_MESSAGE (message_number) else: ... else: ... I assume you want to capture a continuation object in the UPPERCASE methods, store it away somewhere, run off to your select/poll/whatever loop, and have it invoke the stored continuation objects as the data they're waiting for arrives. If so, that's got to be the nicest use for continuations I've seen! All invisible to the end user. I don't know how to fake it pleasantly without threads, either, and understand that threads aren't appropriate for resource reasons. So I don't have a nice alternative. > ... > | >>> import callcc > | >>> saved = None > | >>> def thing(n): > | ... if n == 2: > | ... global saved > | ... saved = callcc.new() > | ... print 'n==',n > | ... if n == 0: > | ... print 'Done!' > | ... else: > | ... thing (n-1) > | ... > | >>> thing (5) > | n== 5 > | n== 4 > | n== 3 > | n== 2 > | n== 1 > | n== 0 > | Done! > | >>> saved > | > | >>> saved.throw (0) > | n== 2 > | n== 1 > | n== 0 > | Done! 
> | >>> saved.throw (0)
> | n== 2
> | n== 1
> | n== 0
> | Done!
> | >>>

Suppose the driver were in a script instead:

    thing(5)           # line 1
    print repr(saved)  # line 2
    saved.throw(0)     # line 3
    saved.throw(0)     # line 4

Then the continuation would (eventually) "return to" the "print repr(saved)" and we'd get an infinite output tail of:

    <Continuation object at 80d30d0>
    n== 2
    n== 1
    n== 0
    Done!
    <Continuation object at 80d30d0>
    n== 2
    n== 1
    n== 0
    Done!
    <Continuation object at 80d30d0>
    n== 2
    n== 1
    n== 0
    Done!
    <Continuation object at 80d30d0>
    n== 2
    n== 1
    n== 0
    Done!
    ...

and never reach line 4. Right? That's the part that Guido hates .

takes-one-to-know-one-ly y'rs - tim

From tismer@appliedbiometrics.com Mon May 17 08:07:22 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Mon, 17 May 1999 09:07:22 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000f01bea028$1f75c360$fb9e2299@tim>
Message-ID: <373FC02A.69F2D912@appliedbiometrics.com>

Tim Peters wrote:
[to Sam]
> The other point being that you want to avoid "inside out" logic, though,
> right? Earlier you posted a kind of ideal:
>
>     Recently I've written an async server that needed to talk to several
>     other RPC servers, and a mysql server. Pseudo-example, with
>     possibly-async calls in UPPERCASE:
>
>     auth, archive = db.FETCH_USER_INFO (user)
>     if verify_login(user,auth):
>         rpc_server = self.archive_servers[archive]
>         group_info = rpc_server.FETCH_GROUP_INFO (group)
>         if valid (group_info):
>             return rpc_server.FETCH_MESSAGE (message_number)
>         else:
>             ...
>     else:
>         ...
>
> I assume you want to capture a continuation object in the UPPERCASE methods,
> store it away somewhere, run off to your select/poll/whatever loop, and have
> it invoke the stored continuation objects as the data they're waiting for
> arrives.
>
> If so, that's got to be the nicest use for continuations I've seen! All
> invisible to the end user.
I don't know how to fake it pleasantly without
> threads, either, and understand that threads aren't appropriate for resource
> reasons. So I don't have a nice alternative.

It can always be done with threads, but also without. Tried it last night, with proper refcounting, and it wasn't too easy since I had to duplicate the Python frame chain.

...

> Suppose the driver were in a script instead:
>
>     thing(5)           # line 1
>     print repr(saved)  # line 2
>     saved.throw(0)     # line 3
>     saved.throw(0)     # line 4
>
> Then the continuation would (eventually) "return to" the "print repr(saved)"
> and we'd get an infinite output tail of:
>
>     <Continuation object at 80d30d0>
>     n== 2
>     n== 1
>     n== 0
>     Done!
>     <Continuation object at 80d30d0>
>     n== 2
>     n== 1
>     n== 0
>     Done!

This is at the moment exactly what happens, with the difference that after some repetitions we GPF due to dangling references to too often decref'ed objects. My incref'ing prepares for just one re-incarnation and should prevent a second call. But this will be solved, soon.

> and never reach line 4. Right? That's the part that Guido hates .

Yup. With a little counting, it was easy to survive:

def main():
    global a
    a=2
    thing (5)
    a=a-1
    if a:
        saved.throw (0)

Weird enough and needs a much better interface. But finally I'm quite happy that it worked so smoothly after just a couple of hours (well, about six :)

ciao - chris

--
Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home

From rushing@nightmare.com Mon May 17 10:46:29 1999
From: rushing@nightmare.com (rushing@nightmare.com)
Date: Mon, 17 May 1999 02:46:29 -0700 (PDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <000f01bea028$1f75c360$fb9e2299@tim>
References: <14142.40867.103424.764346@seattle.nightmare.com> <000f01bea028$1f75c360$fb9e2299@tim>
Message-ID: <14143.56604.21827.891993@seattle.nightmare.com>

Tim Peters writes:
> [Sam]
> > Continuations are more powerful than coroutines, though I admit
> > they're a bit esoteric.
>
> "More powerful" is a tedious argument you should always avoid .

More powerful in the sense that you can use continuations to build lots of different control structures (coroutines, backtracking, exceptions), but not vice versa. Kinda like a better tool for blowing one's own foot off. 8^)

> Suppose the driver were in a script instead:
>
>     thing(5)           # line 1
>     print repr(saved)  # line 2
>     saved.throw(0)     # line 3
>     saved.throw(0)     # line 4
>
> Then the continuation would (eventually) "return to" the "print repr(saved)"
> and we'd get an infinite output tail [...]
>
> and never reach line 4. Right? That's the part that Guido hates .

Yes... the continuation object so far isn't very usable. It needs a driver of some kind around it.

In the Scheme world, there are two common ways of using continuations - let/cc and call/cc. [call/cc is what is in the standard, its official name is call-with-current-continuation]

let/cc stores the continuation in a variable binding, while introducing a new scope. It requires a change to the underlying language:

    (+ 1 (let/cc escape
           (...)
           (escape 34)))  => 35

'escape' is a function that when called will 'resume' with whatever follows the let/cc clause. In this case it would continue with the addition... call/cc is a little trickier, but doesn't require any change to the language... instead of making a new binding directly, you pass in a function that will receive the binding:

    (+ 1 (call/cc
           (lambda (escape)
             (...)
(escape 34)))) => 35

In words, it's much more frightening: "call/cc is a function, that when called with a function as an argument, will pass that function an argument that is a new function, which when called with a value will resume the computation with that value as the result of the entire expression" Phew.

In Python, an example might look like this:

    SAVED = None

    def save_continuation (k):
        global SAVED
        SAVED = k

    def thing():
        [...]
        value = callcc (lambda k: save_continuation(k))

    # or more succinctly:

    def thing():
        [...]
        value = callcc (save_continuation)

In order to do useful work like passing values back and forth between coroutines, we have to have some way of returning a value from the continuation when it is reinvoked.

I should emphasize that most folks will never see call/cc 'in the raw', it will usually have some nice wrapper around to implement whatever construct is needed.

-Sam

From arw@ifu.net Mon May 17 19:06:18 1999
From: arw@ifu.net (Aaron Watters)
Date: Mon, 17 May 1999 14:06:18 -0400
Subject: [Python-Dev] coroutines vs. continuations vs. threads
Message-ID: <37405A99.1DBAF399@ifu.net>

The illustrious Sam Rushing avers:
>Continuations are more powerful than coroutines, though I admit
>they're a bit esoteric. I programmed in Scheme for years without
>seeing the need for them. But when you need 'em, you *really* need
>'em. No way around it.

Frankly, I think I thought I understood this once but now I know I don't. How're continuations more powerful than coroutines? And why can't they be implemented using threads (and semaphores etc)? ...I'm not promising I'll understand the answer...
-- Aaron Watters
=== I taught I taw a putty-cat!

From gmcm@hypernet.com Mon May 17 20:18:43 1999
From: gmcm@hypernet.com (Gordon McMillan)
Date: Mon, 17 May 1999 14:18:43 -0500
Subject: [Python-Dev] coroutines vs. continuations vs.
threads In-Reply-To: <37405A99.1DBAF399@ifu.net> Message-ID: <1285153546-166193857@hypernet.com> The estimable Aaron Watters queries: > The illustrious Sam Rushing avers: > >Continuations are more powerful than coroutines, though I admit > >they're a bit esoteric. I programmed in Scheme for years without > >seeing the need for them. But when you need 'em, you *really* need > >'em. No way around it. > > Frankly, I think I thought I understood this once but now I know I > don't. How're continuations more powerful than coroutines? And why > can't they be implemented using threads (and semaphores etc)? I think Sam's (immediate ) problem is that he can't afford threads - he may have hundreds to thousands of these suckers. As a fuddy-duddy old imperative programmer, I'm inclined to think "state machine". But I'd guess that functional-ophiles probably see that as inelegant. (Safe guess - they see _anything_ that isn't functional as inelegant!). crude-but-not-rude-ly y'rs - Gordon From jeremy@cnri.reston.va.us Mon May 17 19:43:34 1999 From: jeremy@cnri.reston.va.us (Jeremy Hylton) Date: Mon, 17 May 1999 14:43:34 -0400 (EDT) Subject: [Python-Dev] coroutines vs. continuations vs. threads In-Reply-To: <37405A99.1DBAF399@ifu.net> References: <37405A99.1DBAF399@ifu.net> Message-ID: <14144.24242.128959.726878@bitdiddle.cnri.reston.va.us> >>>>> "AW" == Aaron Watters writes: AW> The illustrious Sam Rushing avers: >> Continuations are more powerful than coroutines, though I admit >> they're a bit esoteric. I programmed in Scheme for years without >> seeing the need for them. But when you need 'em, you *really* >> need 'em. No way around it. AW> Frankly, I think I thought I understood this once but now I know AW> I don't. How're continuations more powerful than coroutines? AW> And why can't they be implemented using threads (and semaphores AW> etc)? I think I understood, too. I'm hoping that someone will debug my answer and enlighten us both. 
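As an aside, the exit-only ("upward") use of call/cc in Sam's examples needs no new interpreter machinery at all: Python exceptions already express it. A hedged sketch -- `call_ec` is an invented name, and as with let/cc the escape procedure is only valid while the call is still active:

```python
def call_ec(func):
    """Emulate the exit-only ('upward') subset of Scheme's call/cc:
    func receives a one-shot escape procedure, valid only during the call."""
    class _Escape(Exception):
        pass

    def escape(value):
        err = _Escape()
        err.value = value
        raise err  # unwind back to call_ec, carrying the value

    try:
        return func(escape)   # normal return if func never escapes
    except _Escape as err:
        return err.value      # escaped: use the value passed to escape()

# Scheme: (+ 1 (call/cc (lambda (escape) (escape 34))))  => 35
result = 1 + call_ec(lambda escape: escape(34))
# If func never calls escape, its ordinary return value is used:
normal = 1 + call_ec(lambda escape: 41)
print(result, normal)  # 35 42
```

What this cannot do, of course, is re-enter the computation after it has returned -- the part that makes full call/cc hard.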
A continuation is a mechanism for making control flow explicit. A continuation is a means of naming and manipulating "the rest of the program." In Scheme terms, the continuation is the function that the value of the current expression should be passed to. The call/cc mechanisms lets you capture the current continuation and explicitly call on it. The most typical use of call/cc is non-local exits, but it gives you incredible flexibility for implementing your control flow. I'm fuzzy on coroutines, as I've only seen them in "Structure Programming" (which is as old as I am :-) and never actually used them. The basic idea is that when a coroutine calls another coroutine, control is transfered to the second coroutine at the point at which it last left off (by itself calling another coroutine or by detaching, which returns control to the lexically enclosing scope). It seems to me that coroutines are an example of the kind of control structure that you could build with continuations. It's not clear that the reverse is true. I have to admit that I'm a bit unclear on the motivation for all this. As Gordon said, the state machine approach seems like it would be a good approach. Jeremy From klm@digicool.com Mon May 17 20:08:57 1999 From: klm@digicool.com (Ken Manheimer) Date: Mon, 17 May 1999 15:08:57 -0400 Subject: [Python-Dev] coroutines vs. continuations vs. threads Message-ID: <613145F79272D211914B0020AFF640190BEEDE@gandalf.digicool.com> Jeremy Hylton: > I have to admit that I'm a bit unclear on the motivation for all > this. As Gordon said, the state machine approach seems like it would > be a good approach. If i understand what you mean by state machine programming, it's pretty inherently uncompartmented, all the combinations of state variables need to be accounted for, so the number of states grows factorially on the number of state vars, in general it's awkward. 
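The growth point above in miniature: with independent state variables, an explicit state machine needs one state per combination of values, so the state count multiplies with every variable added. A tiny sketch (the variables and their domains are invented):

```python
from itertools import product

# Hypothetical state variables of a small server, each with its domain.
state_vars = {
    'connected':     (False, True),
    'authenticated': (False, True),
    'has_data':      (False, True),
    'phase':         ('idle', 'reading', 'writing'),
}

# An explicit machine has one state per combination of variable values.
states = list(product(*state_vars.values()))
print(len(states))  # 2 * 2 * 2 * 3 = 24
```

Four variables already give 24 states to account for; each new flag doubles it.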
The advantage of going with what functional folks come up with, like continuations, is that it tends to be well compartmented - functional. (Come to think of it, i suppose that compartmentalization as opposed to state is their mania.) As abstract as i can be (because i hardly know what i'm talking about) (but i have done some specifically finite state machine programming, and did not enjoy it), Ken klm@digicool.com From arw@ifu.net Mon May 17 20:20:13 1999 From: arw@ifu.net (Aaron Watters) Date: Mon, 17 May 1999 15:20:13 -0400 Subject: [Python-Dev] coroutines vs. continuations vs. threads References: <1285153546-166193857@hypernet.com> Message-ID: <37406BED.95AEB896@ifu.net> The ineffible Gordon McMillan retorts: > As a fuddy-duddy old imperative programmer, I'm inclined to think > "state machine". But I'd guess that functional-ophiles probably see > that as inelegant. (Safe guess - they see _anything_ that isn't > functional as inelegant!). As a fellow fuddy-duddy I'd agree except that if you write properlylayered software you have to unrole and rerole all those layers for every transition of the multi-level state machine, and even though with proper discipline it can be implemented without becoming hideous, it still adds significant overhead compared to "stop right here and come back later" which could be implemented using threads/coroutines(?)/continuations. I think this is particularly true in Python with the relatively high function call overhead. Or maybe I'm out in left field doing cartwheels... I guess the question of interest is why are threads insufficient? I guess they have system limitations on the number of threads or other limitations that wouldn't be a problem with continuations? If there aren't a *lot* of situations where coroutines are vital, I'd be hesitant to do major surgery. But I'm a fuddy-duddy. -- Aaron Watters === I did! I did! 
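The coroutine style being argued about can be given a flavor in a few lines of today's Python, with generator functions standing in for the kept-alive, resumable frames. This is the asymmetric, driver-based form (a symmetric pair would transfer control to each other directly, with no supervisor):

```python
def worker(name, steps):
    # Each next() resumes this function exactly where it left off --
    # the frame survives between activations, no thread needed.
    for i in range(steps):
        yield '%s-%d' % (name, i)

def round_robin(*coros):
    """Minimal driver: resume each live coroutine in turn until all finish."""
    order, live = [], list(coros)
    while live:
        for coro in list(live):          # iterate over a copy; we mutate live
            try:
                order.append(next(coro))
            except StopIteration:
                live.remove(coro)
    return order

order = round_robin(worker('A', 2), worker('B', 2))
print(order)  # ['A-0', 'B-0', 'A-1', 'B-1']
```

Note how each worker picks up at its own point of suspension on every turn, which is exactly the "stop right here and come back later" behavior Aaron asks about.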
From tismer@appliedbiometrics.com Mon May 17 21:03:01 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Mon, 17 May 1999 22:03:01 +0200 Subject: [Python-Dev] coroutines vs. continuations vs. threads References: <1285153546-166193857@hypernet.com> <37406BED.95AEB896@ifu.net> Message-ID: <374075F5.F29B4EAB@appliedbiometrics.com> Aaron Watters wrote: > > The ineffible Gordon McMillan retorts: > > > As a fuddy-duddy old imperative programmer, I'm inclined to think > > "state machine". But I'd guess that functional-ophiles probably see > > that as inelegant. (Safe guess - they see _anything_ that isn't > > functional as inelegant!). > > As a fellow fuddy-duddy I'd agree except that if you write properlylayered > software you have to unrole and rerole all those layers for every > transition of the multi-level state machine, and even though with proper > discipline it can be implemented without becoming hideous, it still adds > significant overhead compared to "stop right here and come back later" > which could be implemented using threads/coroutines(?)/continuations. Coroutines are most elegant here, since (fir a simple example) they are a symmetric pair of functions which call each other. There is neither the one-pulls, the other pushes asymmetry, nor the need to maintain state and be controlled by a supervisor function. > I think this is particularly true in Python with the relatively high > function > call overhead. Or maybe I'm out in left field doing cartwheels... > I guess the question of interest is why are threads insufficient? I guess > they have system limitations on the number of threads or other limitations > that wouldn't be a problem with continuations? If there aren't a *lot* of > situations where coroutines are vital, I'd be hesitant to do major > surgery. For me (as always) most interesting is the possible speed of coroutines. They involve no threads overhead, no locking, no nothing. Python supports it better than expected. 
If the stack level of two code objects is the same at a switching point, the whole switch is nothing more than swapping two frame objects, and we're done. This might be even cheaper than general call/cc, like a function call. Sam's prototype works already, with no change to the interpreter (but knowledge of Python frames, and a .dll of course). I think we'll continue a while. continuously - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From gmcm@hypernet.com Mon May 17 23:17:25 1999 From: gmcm@hypernet.com (Gordon McMillan) Date: Mon, 17 May 1999 17:17:25 -0500 Subject: [Python-Dev] coroutines vs. continuations vs. threads In-Reply-To: <374075F5.F29B4EAB@appliedbiometrics.com> Message-ID: <1285142823-166838954@hypernet.com> Co-Christian-routines Tismer continues: > Aaron Watters wrote: > > > > The ineffible Gordon McMillan retorts: > > > > > As a fuddy-duddy old imperative programmer, I'm inclined to think > > > "state machine". But I'd guess that functional-ophiles probably see > > > that as inelegant. (Safe guess - they see _anything_ that isn't > > > functional as inelegant!). > > > > As a fellow fuddy-duddy I'd agree except that if you write properlylayered > > software you have to unrole and rerole all those layers for every > > transition of the multi-level state machine, and even though with proper > > discipline it can be implemented without becoming hideous, it still adds > > significant overhead compared to "stop right here and come back later" > > which could be implemented using threads/coroutines(?)/continuations. > > Coroutines are most elegant here, since (fir a simple example) > they are a symmetric pair of functions which call each other. 
> There is neither the one-pulls, the other pushes asymmetry, nor the
> need to maintain state and be controlled by a supervisor function.

Well, the state maintains you, instead of the other way 'round. (Any other ex-Big-Blue-ers out there that used to play these games with checkpoint and SyncSort?). I won't argue elegance. Just a couple points:

- there's an art to writing state machines which is largely unrecognized (most of them are unnecessarily horrid).

- a multiplexed solution (vs a threaded solution) requires that something be inside out. In one case it's your code, in the other, your understanding of the problem. Neither is trivial.

Not to be discouraging - as long as your solution doesn't involve using regexps on bytecode , I say go for it!

- Gordon

From guido@CNRI.Reston.VA.US Tue May 18 05:03:34 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 18 May 1999 00:03:34 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: Your message of "Mon, 17 May 1999 02:46:29 PDT." <14143.56604.21827.891993@seattle.nightmare.com>
References: <14142.40867.103424.764346@seattle.nightmare.com> <000f01bea028$1f75c360$fb9e2299@tim> <14143.56604.21827.891993@seattle.nightmare.com>
Message-ID: <199905180403.AAA04772@eric.cnri.reston.va.us>

Sam (& others),

I thought I understood what continuations were, but the examples of what you can do with them so far don't clarify the matter at all. Perhaps it would help to explain what a continuation actually does with the run-time environment, instead of giving examples of how to use them and what the result is?

Here's a start of my own understanding (brief because I'm on a 28.8k connection which makes my ordinary typing habits in Emacs very painful).

1. All program state is somehow contained in a single execution stack. This includes globals (which are simply name bindings in the bottom stack frame). It also includes a code pointer for each stack frame indicating where the function corresponding to that stack frame is executing (this is the return address if there is a newer stack frame, or the current instruction for the newest frame).

2. A continuation does something equivalent to making a copy of the entire execution stack. This can probably be done lazily. There are probably lots of details. I also expect that Scheme's semantic model is different than Python here -- e.g. does it matter whether deep or shallow copies are made? I.e. are there mutable *objects* in Scheme? (I know there are mutable and immutable *name bindings* -- I think.)

3. Calling a continuation probably makes the saved copy of the execution stack the current execution state; I presume there's also a way to pass an extra argument.

4. Coroutines (which I *do* understand) are probably done by swapping between two (or more) continuations.

5. Other control constructs can be done by various manipulations of continuations. I presume that in many situations the saved continuation becomes the main control locus permanently, and the (previously) current stack is simply garbage-collected. Of course the lazy copy makes this efficient.

If this all is close enough to the truth, I think that continuations involving C stack frames are definitely out -- as Tim Peters mentioned, you don't know what the stuff on the C stack of extensions refers to. (My guess would be that Scheme implementations assume that any pointers on the C stack point to Scheme objects, so that C stack frames can be copied and conservative GC can be used -- this will never happen in Python.)

Continuations involving only Python stack frames might be supported, if we can agree on the sharing / copying semantics. This is where I don't know enough (see questions at #2 above).
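Points 1-3 above can be caricatured in a few lines of Python. This is a toy model, not an implementation proposal: `Frame` and `pc` are invented stand-ins for real frame objects and the per-frame code pointer, and the deep-vs-shallow question from #2 is visible right in the copy calls:

```python
import copy

class Frame:
    def __init__(self, func, pc):
        self.func = func  # which function this frame is executing
        self.pc = pc      # point 1: a code pointer per stack frame

# Point 1: all program state as a single stack of frames.
stack = [Frame('main', 3), Frame('thing', 7)]

# Point 2: capturing a continuation ~ copying the entire execution stack
# (whether this must be deep or may be shallow/lazy is the open question).
continuation = copy.deepcopy(stack)

# Execution moves on and mutates the live stack...
stack[-1].pc = 99

# Point 3: calling the continuation reinstalls the saved copy as the
# current execution state (reusable, so copy again on each call).
stack = copy.deepcopy(continuation)
print(stack[-1].pc)  # 7 -- back at the captured point
```

A shallow copy instead of `deepcopy` would share the frame objects, so mutations after capture would leak into the "saved" state -- which is exactly the semantics question raised in #2.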
--Guido van Rossum (home page: http://www.python.org/~guido/)

From tim_one@email.msn.com Tue May 18 05:46:12 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Tue, 18 May 1999 00:46:12 -0400
Subject: [Python-Dev] coroutines vs. continuations vs. threads
In-Reply-To: <37406BED.95AEB896@ifu.net>
Message-ID: <000901bea0e9$5aa2dec0$829e2299@tim>

[Aaron Watters]
> ...
> I guess the question of interest is why are threads insufficient? I
> guess they have system limitations on the number of threads or other
> limitations that wouldn't be a problem with continuations?

Sam is mucking with thousands of simultaneous I/O-bound socket connections, and makes a good case that threads simply don't fly here (each one consumes a stack, kernel resources, etc). It's unclear (to me) that thousands of continuations would be *much* better, though, by the time Christian gets done making thousands of copies of the Python stack chain.

> If there aren't a *lot* of situations where coroutines are vital, I'd
> be hesitant to do major surgery. But I'm a fuddy-duddy.

Go to Sam's site (http://www.nightmare.com/), download Medusa, and read the docs. They're very well written and describe the problem space exquisitely. I don't have any problems like that I need to solve, but it's interesting to ponder!

alas-no-time-for-it-now-ly y'rs - tim

From tim_one@email.msn.com Tue May 18 05:45:52 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Tue, 18 May 1999 00:45:52 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <373FC02A.69F2D912@appliedbiometrics.com>
Message-ID: <000301bea0e9$4fd473a0$829e2299@tim>

[Christian Tismer]
> ...
> Yup. With a little counting, it was easy to survive:
>
> def main():
>     global a
>     a=2
>     thing (5)
>     a=a-1
>     if a:
>         saved.throw (0)

Did "a" really need to be global here? I hope you see the same behavior without the "global a"; e.g., this Scheme:

    (define -cont- #f)

    (define thing
      (lambda (n)
        (if (= n 2)
            (call/cc (lambda (k) (set! -cont- k))))
        (display "n == ")
        (display n)
        (newline)
        (if (= n 0)
            (begin (display "Done!") (newline))
            (thing (- n 1)))))

    (define main
      (lambda ()
        (let ((a 2))
          (thing 5)
          (display "a is ")
          (display a)
          (newline)
          (set! a (- a 1))
          (if (> a 0)
              (-cont- #f)))))

    (main)

prints:

    n == 5
    n == 4
    n == 3
    n == 2
    n == 1
    n == 0
    Done!
    a is 2
    n == 2
    n == 1
    n == 0
    Done!
    a is 1

Or does brute-force frame-copying cause the continuation to set "a" back to 2 each time?

> Weird enough

Par for the continuation course! They're nasty when eaten raw.

> and needs a much better interface.

Ya, like screw 'em and use threads .

> But finally I'm quite happy that it worked so smoothly
> after just a couple of hours (well, about six :)

Yup! Playing with Python internals is a treat.

to-be-continued-ly y'rs - tim

From tim_one@email.msn.com Tue May 18 05:45:57 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Tue, 18 May 1999 00:45:57 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <14143.56604.21827.891993@seattle.nightmare.com>
Message-ID: <000401bea0e9$51e467e0$829e2299@tim>

[Sam]
>>> Continuations are more powerful than coroutines, though I admit
>>> they're a bit esoteric.

[Tim]
>> "More powerful" is a tedious argument you should always avoid .

[Sam]
> More powerful in the sense that you can use continuations to build
> lots of different control structures (coroutines, backtracking,
> exceptions), but not vice versa.
It requires a change to the underlying > language: Isn't this often implemented via a macro, though, so that (let/cc name code) "acts like" (call/cc (lambda (name) code)) ? I haven't used a Scheme with native let/cc, but poking around it appears that the real intent is to support exception-style function exits with a mechanism cheaper than 1st-class continuations: twice saw the let/cc object (the thingie bound to "name") defined as being invalid the instant after "code" returns, so it's an "up the call stack" gimmick. That doesn't sound powerful enough for what you're after. > [nice let/cc call/cc tutorialette] > ... > In order to do useful work like passing values back and forth between > coroutines, we have to have some way of returning a value from the > continuation when it is reinvoked. Somehow, I suspect that's the least of our problems <0.5 wink>. If continuations are in Python's future, though, I agree with the need as stated. > I should emphasize that most folks will never see call/cc 'in the > raw', it will usually have some nice wrapper around to implement > whatever construct is needed. Python already has well-developed exception and thread facilities, so it's hard to make a case for continuations as a catch-all implementation mechanism. That may be the rub here: while any number of things *can* be implementated via continuations, I think very few *need* to be implemented that way, and full-blown continuations aren't easy to implement efficiently & portably. The Icon language was particularly concerned with backtracking searches, and came up with generators as another clearer/cheaper implementation technique. When it went on to full-blown coroutines, it's hard to say whether continuations would have been a better approach. But the coroutine implementation it has is sluggish and buggy and hard to port, so I doubt they could have done noticeably worse. Would full-blown coroutines be powerful enough for your needs? 
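[A latter-day aside: the Icon-style generator approach Tim mentions for backtracking searches can be illustrated with the generators that eventually did land in Python. This is only an illustrative sketch in modern Python, not anything Icon or the Python of 1999 shipped; each recursive level suspends a partial solution, and running a generator dry backtracks implicitly.]

```python
def queens(n, cols=()):
    # Backtracking search, Icon-style: `cols` holds one queen column per
    # placed row; yield each consistent full placement, and exhausting a
    # level's choices undoes that level automatically.
    if len(cols) == n:
        yield cols
        return
    for c in range(n):
        if all(c != col and abs(c - col) != len(cols) - i
               for i, col in enumerate(cols)):
            yield from queens(n, cols + (c,))

solutions = list(queens(8))  # the classic 8-queens puzzle
```

The caller drives the search lazily: `next(queens(8))` produces just the first solution, doing only as much backtracking as needed.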
assuming-the-practical-defn-of-"powerful-enough"-ly y'rs - tim From rushing@nightmare.com Tue May 18 06:18:06 1999 From: rushing@nightmare.com (rushing@nightmare.com) Date: Mon, 17 May 1999 22:18:06 -0700 (PDT) Subject: [Python-Dev] 'stackless' python? In-Reply-To: <000401bea0e9$51e467e0$829e2299@tim> References: <14143.56604.21827.891993@seattle.nightmare.com> <000401bea0e9$51e467e0$829e2299@tim> Message-ID: <14144.61765.308962.101884@seattle.nightmare.com> Tim Peters writes: > Isn't this often implemented via a macro, though, so that > > (let/cc name code) > > "acts like" > > (call/cc (lambda (name) code)) Yup, they're equivalent, in the sense that given one you can make a macro to do the other. call/cc is preferred because it doesn't require a new binding construct. > ? I haven't used a Scheme with native let/cc, but poking around it > appears that the real intent is to support exception-style function > exits with a mechanism cheaper than 1st-class continuations: twice > saw the let/cc object (the thingie bound to "name") defined as > being invalid the instant after "code" returns, so it's an "up the > call stack" gimmick. That doesn't sound powerful enough for what > you're after. Except that since the escape procedure is 'first-class' it can be stored away and invoked (and reinvoked) later. [that's all that 'first-class' means: a thing that can be stored in a variable, returned from a function, used as an argument, etc..] I've never seen a let/cc that wasn't full-blown, but it wouldn't surprise me. > The Icon language was particularly concerned with backtracking > searches, and came up with generators as another clearer/cheaper > implementation technique. When it went on to full-blown > coroutines, it's hard to say whether continuations would have been > a better approach. But the coroutine implementation it has is > sluggish and buggy and hard to port, so I doubt they could have > done noticeably worse. 
Many Scheme implementors either skip it, or only support non-escaping call/cc (i.e., exceptions in Python). > Would full-blown coroutines be powerful enough for your needs? Yes, I think they would be. But I think with Python it's going to be just about as hard, either way. -Sam From rushing@nightmare.com Tue May 18 06:48:29 1999 From: rushing@nightmare.com (rushing@nightmare.com) Date: Mon, 17 May 1999 22:48:29 -0700 (PDT) Subject: [Python-Dev] coroutines vs. continuations vs. threads In-Reply-To: <51325225@toto.iv> Message-ID: <14144.63787.502454.111804@seattle.nightmare.com> Aaron Watters writes: > Frankly, I think I thought I understood this once but now I know I > don't. 8^) That's what I said when I backed into the idea via medusa a couple of years ago. > How're continuations more powerful than coroutines? And why can't > they be implemented using threads (and semaphores etc)? My understanding of the original 'coroutine' (from Pascal?) was that it allows two procedures to 'resume' each other. The classic coroutine example is the 'samefringe' problem: given two trees of differing structure, are they equal in the sense that a traversal of the leaves results in the same list? Coroutines let you do this efficiently, comparing leaf-by-leaf without storing the whole tree. continuations can do coroutines, but can also be used to implement backtracking, exceptions, threads... probably other stuff I've never heard of or needed. The reason that Scheme and ML are such big fans of continuations is because they can be used to implement all these other features. Look at how much try/except and threads complicate other language implementations. It's like a super-tool-widget - if you make sure it's in your toolbox, you can use it to build your circular saw and lathe from scratch. Unfortunately there aren't many good sites on the web with good explanatory material. The best reference I have is "Essentials of Programming Languages". 
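[A latter-day aside: Sam's samefringe problem is concrete enough to sketch. Using the generators of much later Pythons, which give exactly the one-leaf-at-a-time suspension that coroutines provide here, it might look like this; the names are invented for illustration.]

```python
from itertools import zip_longest

def fringe(tree):
    # Yield leaves left to right; a "tree" is any nesting of lists/tuples.
    for node in tree:
        if isinstance(node, (list, tuple)):
            yield from fringe(node)
        else:
            yield node

def samefringe(t1, t2):
    # Compare leaf-by-leaf, never materializing either fringe; the
    # sentinel makes fringes of different lengths compare unequal.
    miss = object()
    return all(a == b for a, b in
               zip_longest(fringe(t1), fringe(t2), fillvalue=miss))
```

So `samefringe([1, [2, 3]], [[1, 2], 3])` holds even though the trees' shapes differ: each side is resumed only long enough to produce its next leaf.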
For those that want to play with some of these ideas using little VM's written in Python: http://www.nightmare.com/software.html#EOPL -Sam From rushing@nightmare.com Tue May 18 06:56:37 1999 From: rushing@nightmare.com (rushing@nightmare.com) Date: Mon, 17 May 1999 22:56:37 -0700 (PDT) Subject: [Python-Dev] coroutines vs. continuations vs. threads In-Reply-To: <13631823@toto.iv> Message-ID: <14144.65355.400281.123856@seattle.nightmare.com> Jeremy Hylton writes: > I have to admit that I'm a bit unclear on the motivation for all > this. As Gordon said, the state machine approach seems like it would > be a good approach. For simple problems, state machines are ideal. Medusa uses state machines that are built out of Python methods. But past a certain level of complexity, they get too hairy to understand. A really good example can be found in /usr/src/linux/net/ipv4. 8^) -Sam From rushing@nightmare.com Tue May 18 08:05:20 1999 From: rushing@nightmare.com (rushing@nightmare.com) Date: Tue, 18 May 1999 00:05:20 -0700 (PDT) Subject: [Python-Dev] 'stackless' python? In-Reply-To: <60057226@toto.iv> Message-ID: <14145.927.588572.113256@seattle.nightmare.com> Guido van Rossum writes: > Perhaps it would help to explain what a continuation actually does > with the run-time environment, instead of giving examples of how to > use them and what the result it? This helped me a lot, and is the angle used in "Essentials of Programming Languages": Usually when folks refer to a 'stack', they're refering to an *implementation* of the stack data type: really an optimization that assumes an upper bound on stack size, and that things will only be pushed and popped in order. If you were to implement a language's variable and execution stacks with actual data structures (linked lists), then it's easy to see what's needed: the head of the list represents the current state. As functions exit, they pop things off the list. The reason I brought this up (during a lull!) 
was that Python is already paying all of the cost of heap-allocated frames, and it didn't seem to me too much of a leap from there. > 1. All program state is somehow contained in a single execution stack. Yup. > 2. A continuation does something equivalent to making a copy of the > entire execution stack. Yup. > I.e. are there mutable *objects* in Scheme? > (I know there are mutable and immutable *name bindings* -- I think.) Yes, Scheme is pro-functional... but it has arrays, i/o, and set-cdr!, all the things that make it 'impure'. I think shallow copies are what's expected. In the examples I have, the continuation is kept in a 'register', and call/cc merely packages it up with a little function wrapper. You are allowed to stomp all over lexical variables with "set!". > 3. Calling a continuation probably makes the saved copy of the > execution stack the current execution state; I presume there's also a > way to pass an extra argument. Yup. > 4. Coroutines (which I *do* understand) are probably done by swapping > between two (or more) continuations. Yup. Here's an example in Scheme: http://www.nightmare.com/stuff/samefringe.scm Somewhere I have an example of coroutines being used for parsing, very elegant. Something like one coroutine does lexing, and passes tokens one-by-one to the next level, which passes parsed expressions to a compiler, or whatever. Kinda like pipes. > 5. Other control constructs can be done by various manipulations of > continuations. I presume that in many situations the saved > continuation becomes the main control locus permanently, and the > (previously) current stack is simply garbage-collected. Of course > the lazy copy makes this efficient. Yes... I think backtracking would be an example of this. You're doing a search on a large space (say a chess game). After a certain point you want to try a previous fork, to see if it's promising, but you don't want to throw away your current work. 
Save it, then unwind back to the previous fork, try that option out... if it turns out to be better then toss the original. > If this all is close enough to the truth, I think that > continuations involving C stack frames are definitely out -- as Tim > Peters mentioned, you don't know what the stuff on the C stack of > extensions refers to. (My guess would be that Scheme > implementations assume that any pointers on the C stack point to > Scheme objects, so that C stack frames can be copied and > conservative GC can be used -- this will never happen in Python.) I think you're probably right here - usually there are heavy restrictions on what kind of data can pass through the C interface. But I know of at least one Scheme (mzscheme/PLT) that uses conservative gc and has c/c++ interfaces. [... dig dig ...] From this: http://www.cs.rice.edu/CS/PLT/packages/doc/insidemz/node21.htm#exceptions and looking at the code it looks like they enforce the restriction exactly as you described in an earlier mail: call/cc is safe for c->scheme calls only if they invoke a new top-level machine. > Continuations involving only Python stack frames might be > supported, if we can agree on the the sharing / copying semantics. > This is where I don't know enough see questions at #2 above). Woo Hoo! Where do I send the Shrubbery? -Sam From rushing@nightmare.com Tue May 18 08:17:11 1999 From: rushing@nightmare.com (rushing@nightmare.com) Date: Tue, 18 May 1999 00:17:11 -0700 (PDT) Subject: [Python-Dev] another good motivation Message-ID: <14145.4917.164756.300678@seattle.nightmare.com> "Escaping the event loop: an alternative control structure for multi-threaded GUIs" http://cs.nyu.edu/phd_students/fuchs/ http://cs.nyu.edu/phd_students/fuchs/gui.ps -Sam From tismer@appliedbiometrics.com Tue May 18 14:46:53 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Tue, 18 May 1999 15:46:53 +0200 Subject: [Python-Dev] coroutines vs. continuations vs. 
threads References: <000901bea0e9$5aa2dec0$829e2299@tim> Message-ID: <37416F4D.8E95D71A@appliedbiometrics.com> Tim Peters wrote: > > [Aaron Watters] > > ... > > I guess the question of interest is why are threads insufficient? I > > guess they have system limitations on the number of threads or other > > limitations that wouldn't be a problem with continuations? > > Sam is mucking with thousands of simultaneous I/O-bound socket connections, > and makes a good case that threads simply don't fly here (each one consumes > a stack, kernel resources, etc). It's unclear (to me) that thousands of > continuations would be *much* better, though, by the time Christian gets > done making thousands of copies of the Python stack chain. Well, what he needs here are coroutines and just a single frame object for every minithread (I think this is a "fiber"?). If these fibers later do deep function calls before they switch, there will of course be more frames then. -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From tismer@appliedbiometrics.com Tue May 18 15:35:30 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Tue, 18 May 1999 16:35:30 +0200 Subject: [Python-Dev] 'stackless' python? References: <14142.40867.103424.764346@seattle.nightmare.com> <000f01bea028$1f75c360$fb9e2299@tim> <14143.56604.21827.891993@seattle.nightmare.com> <199905180403.AAA04772@eric.cnri.reston.va.us> Message-ID: <37417AB2.80920595@appliedbiometrics.com> Guido van Rossum wrote: > > Sam (& others), > > I thought I understood what continuations were, but the examples of > what you can do with them so far don't clarify the matter at all. 
> > Perhaps it would help to explain what a continuation actually does > with the run-time environment, instead of giving examples of how to > use them and what the result it? > > Here's a start of my own understanding (brief because I'm on a 28.8k > connection which makes my ordinary typing habits in Emacs very > painful). > > 1. All program state is somehow contained in a single execution stack. > This includes globals (which are simply name bindings in the botton > stack frame). It also includes a code pointer for each stack frame > indicating where the function corresponding to that stack frame is > executing (this is the return address if there is a newer stack frame, > or the current instruction for the newest frame). Right. For now, this information is on the C stack for each called function, although almost completely available in the frame chain. > 2. A continuation does something equivalent to making a copy of the > entire execution stack. This can probably be done lazily. There are > probably lots of details. I also expect that Scheme's semantic model > is different than Python here -- e.g. does it matter whether deep or > shallow copies are made? I.e. are there mutable *objects* in Scheme? > (I know there are mutable and immutable *name bindings* -- I think.) To make it lazy, a gatekeeper must be put on top of the two splitted frames, which catches the event that one of them returns. It appears to me that this it the same callcc.new() object which catches this, splitting frames when hit by a return. > 3. Calling a continuation probably makes the saved copy of the > execution stack the current execution state; I presume there's also a > way to pass an extra argument. > > 4. Coroutines (which I *do* understand) are probably done by swapping > between two (or more) continuations. Right, which is just two or three assignments. > 5. Other control constructs can be done by various manipulations of > continuations. 
I presume that in many situations the saved > continuation becomes the main control locus permanently, and the > (previously) current stack is simply garbage-collected. Of course the > lazy copy makes this efficient. Yes, great. It looks like that switching continuations is not more expensive than a single Python function call. > Continuations involving only Python stack frames might be supported, > if we can agree on the the sharing / copying semantics. This is where > I don't know enough see questions at #2 above). This would mean to avoid creating incompatible continuations. A continutation may not switch to a frame chain which was created by a different VM incarnation since this would later on corrupt the machine stack. One way to assure that would be a thread-safe function in sys, similar to sys.exc_info() which gives an id for the current interpreter. continuations living somewhere in globals would be marked by the interpreter which created them, and reject to be thrown if they don't match. The necessary interpreter support appears to be small: Extend the PyFrame structure by two fields: - interpreter ID (addr of some local variable would do) - stack pointer at current instruction. Change the CALL_FUNCTION opcode to avoid calling eval recursively in the case of a Python function/method, but the current frame, build the new one and start over. RETURN will pop a frame and reload its local variables instead of returning, as long as there is a frame to pop. I'm unclear how exceptions should be handled. Are they currently propagated up across different C calls other than ceval2 recursions? ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! 
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From jeremy@cnri.reston.va.us Tue May 18 16:05:39 1999 From: jeremy@cnri.reston.va.us (Jeremy Hylton) Date: Tue, 18 May 1999 11:05:39 -0400 (EDT) Subject: [Python-Dev] 'stackless' python? In-Reply-To: <14145.927.588572.113256@seattle.nightmare.com> References: <60057226@toto.iv> <14145.927.588572.113256@seattle.nightmare.com> Message-ID: <14145.33150.767551.472591@bitdiddle.cnri.reston.va.us> >>>>> "SR" == rushing writes: SR> Somewhere I have an example of coroutines being used for SR> parsing, very elegant. Something like one coroutine does SR> lexing, and passes tokens one-by-one to the next level, which SR> passes parsed expressions to a compiler, or whatever. Kinda SR> like pipes. This is the first example that's used in Structured Programming (Dahl, Djikstra, and Hoare). I'd be happy to loan a copy to any of the Python-dev people who sit nearby. Jeremy From tismer@appliedbiometrics.com Tue May 18 16:31:11 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Tue, 18 May 1999 17:31:11 +0200 Subject: [Python-Dev] 'stackless' python? References: <000301bea0e9$4fd473a0$829e2299@tim> Message-ID: <374187BF.36CC65E7@appliedbiometrics.com> Tim Peters wrote: > > [Christian Tismer] > > ... > > Yup. With a little counting, it was easy to survive: > > > > def main(): > > global a > > a=2 > > thing (5) > > a=a-1 > > if a: > > saved.throw (0) > > Did "a" really need to be global here? I hope you see the same behavior > without the "global a"; e.g., this Scheme: (Hüstel) Actually, I inserted the "global" later. It worked as well with a local variable, but I didn't understand it. Still don't :-) > Or does brute-force frame-copying cause the continuation to set "a" back to > 2 each time? 
No, it doesn't. Behavior is exactly the same with or without global. I'm not sure whether this is a bug or a feature. I *think* 'a' as a local has a slot in the frame, so it's actually a different 'a' living in both copies. But this would not have worked. Can it be that before a function call, the interpreter turns its locals into a dict, using fast_to_locals? That would explain it. This is not what I think it should be! Locals need to be copied. > > and needs a much better interface. > > Ya, like screw 'em and use threads . Never liked threads. These fibers are so neat since they don't need threads, no locking, and they are available on systems without threads. > > But finally I'm quite happy that it worked so smoothly > > after just a couple of hours (well, about six :) > > Yup! Playing with Python internals is a treat. > > to-be-continued-ly y'rs - tim throw(42) - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From skip@mojam.com (Skip Montanaro) Tue May 18 16:49:42 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Tue, 18 May 1999 11:49:42 -0400 Subject: [Python-Dev] Is there another way to solve the continuation problem? Message-ID: <199905181549.LAA03206@cm-29-94-2.nycap.rr.com> Okay, from my feeble understanding of the problem it appears that coroutines/continuations and threads are going to be problematic at best for Sam's needs. Are there other "solutions"? We know about state machines. They have the problem that the number of states grows exponentially (?) as the number of state variables increases. Can exceptions be coerced into providing the necessary structure without botching up the application too badly?
Seems that at some point where you need to do some I/O, you could raise an exception whose second expression contains the necessary state to get back to where you need to be once the I/O is ready to go. The controller that catches the exceptions would use select or poll to prepare for the I/O then dispatch back to the handlers using the information from exceptions.

class IOSetup:
    pass

class WaveHands:
    """maintains exception raise info and selects one to go to next"""
    def choose_one(r,w,e):
        pass
    def remember(info):
        pass

def controller(...):
    waiters = WaveHands()
    while 1:
        r, w, e = select([...], [...], [...])
        # using r,w,e, select a waiter to call
        func, place = waiters.choose_one(r,w,e)
        try:
            func(place)
        except IOSetup, info:
            waiters.remember(info)

def spam_func(place):
    if place == "spam":
        # whatever I/O we needed to do is ready to go
        bytes = read(some_fd)
        process(bytes)
        # need to read some more from some_fd. args are:
        # function, target, fd category (r, w), selectable object,
        raise IOSetup, (spam_func, "eggs" , "r", some_fd)
    elif place == "eggs":
        # that next chunk is ready - get it and proceed...
    elif yadda, yadda, yadda...

One thread, some craftiness needed to construct things. Seems like it might isolate some of the statefulness to smaller functional units than a pure state machine. Clearly not as clean as continuations would be. Totally bogus? Totally inadequate? Maybe Sam already does things this way?

Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/
skip@mojam.com | Musi-Cal: http://www.musi-cal.com/
518-372-5583

From tismer@appliedbiometrics.com Tue May 18 18:23:08 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Tue, 18 May 1999 19:23:08 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000301bea0e9$4fd473a0$829e2299@tim>
Message-ID: <3741A1FC.E84DC926@appliedbiometrics.com>

Tim Peters wrote:
>
> [Christian Tismer]
> > ...
> > Yup.
With a little counting, it was easy to survive: > > > > def main(): > > global a > > a=2 > > thing (5) > > a=a-1 > > if a: > > saved.throw (0) > > Did "a" really need to be global here? I hope you see the same behavior > without the "global a"; e.g., this Scheme: Actually, the frame-copying was not enough to make this all behave correctly. Since I didn't change the interpreter, the ceval.c incarnations still had copies to the old frames. The only effect which I achieved with frame copying was that the refcounts were increased correctly. I have to remove the hardware stack copying now. Will try to create a non-recursive version of the interpreter. ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From MHammond@skippinet.com.au Wed May 19 00:16:54 1999 From: MHammond@skippinet.com.au (Mark Hammond) Date: Wed, 19 May 1999 09:16:54 +1000 Subject: [Python-Dev] Is there another way to solve the continuation problem? In-Reply-To: <199905181549.LAA03206@cm-29-94-2.nycap.rr.com> Message-ID: <006d01bea184$869f1480$0801a8c0@bobcat> > Sam's needs. Are there other "solutions"? We know about > state machines. > They have the problem that the number of states grows > exponentially (?) as > the number of state variables increases. Well, I can give you my feeble understanding of "IO Completion Ports", the technique Win32 provides to "solve" this problem. My experience is limited to how we used these in a server product designed to maintain thousands of long-term client connections each spooling large chunks of data (MSOffice docs - yes, that large :-). We too could obviously not afford a thread per connection. 
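[A latter-day aside: the completion-port mechanism Mark goes on to describe boils down to a queue of (key, event) pairs feeding a capped pool of worker threads. A toy stand-in in modern Python, with all names invented and the queue playing the role of the port, might look like:]

```python
import queue
import threading

class ToyCompletionPort:
    # Toy model only: the real NT object lives in the kernel and is fed
    # by asynchronous I/O completions, not by explicit post() calls.
    def __init__(self, nthreads, handler):
        self.events = queue.Queue()
        self.handler = handler  # advances one connection's state machine
        for _ in range(nthreads):
            threading.Thread(target=self._worker, daemon=True).start()

    def post(self, key, event):
        # Wake exactly one sleeping worker with this connection's key.
        self.events.put((key, event))

    def _worker(self):
        while True:
            key, event = self.events.get()
            self.handler(key, event)
            self.events.task_done()
```

The key identifies the connection whose state the handler should advance, so a handful of threads can service any number of connections, exactly the inversion Mark describes.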
Searching through NT's documentation, completion ports are the technique they recommend for high-performance IO, and it appears to deliver. NT has the concept of a completion port, which in many ways is like an "inverted semaphore". You create a completion port with a "max number of threads" value. Then, for every IO object you need to use (files, sockets, pipes etc) you "attach" it to the completion port, along with an integer key. This key is (presumably) unique to the file, and usually a pointer to some structure maintaing the state of the file (ie, connection) The general programming model is that you have a small number of threads (possibly 1), and a large number of io objects (eg files). Each of these threads is executing a state machine. When IO is "ready" for a particular file, one of the available threads is woken, and passed the "key" associated with the file. This key identifies the file, and more importantly the state of that file. The thread uses the state to perform the next IO operation, then immediately go back to sleep. When that IO operation completes, some other thread is woken to handle that state change. What makes this work of course is that _all_ IO is asynch - not a single IO call in this whole model can afford to block. NT provides asynch IO natively. This sounds very similar to what Medusa does internally, although the NT model provides a "thread pooling" scheme built-in. Although our server performed very well with a single thread and hundreds of high-volume connections, we chose to run with a default of 5 threads here. For those still interested, our project has the multi-threaded state machine I described above implemented in C. Most of the work is responsible for spooling the client request data (possibly 100s of kbs) before handing that data off to the real server. When the C code transitions the client through the state of "send/get from the real server", we actually set a different completion port. 
This other completion port wakes a thread written in Python. So our architecture consists of a C implemented thread-pool managing client connections, and a different Python implemented thread pool that does the real work for each of these client connections. (The Python side of the world is bound by the server we are talking to, so Python performance doesnt matter as much - C wouldnt buy enough) This means that our state machines are not that complex. Each "thread pool" is managing its own, fairly simple state. NT automatically allows you to associate state with the IO object, and as we have multiple thread pools, each one is simple - the one spooling client data is simple, the one doing the actual server work is simple. If we had to have a single, monolithic state machine managing all aspects of the client spooling, _and_ the server work, it would be horrid. This is all in a shrink-wrapped relatively cheap "Document Management" product being targetted (successfully, it appears) at huge NT/Exchange based sites. Australia's largest Telco are implementing it, and indeed the company has VC from Intel! Lots of support from MS, as it helps compete with Domino. Not bad for a little startup - now they are wondering what to do with this Python-thingy they now have in their product that noone else has ever heard off; but they are planning on keeping it for now :-) [Funnily, when they started, they didnt think they even _needed_ a server, so I said "Ill just knock up a little one in Python", and we havent looked back :-] Mark. From tim_one@email.msn.com Wed May 19 01:48:00 1999 From: tim_one@email.msn.com (Tim Peters) Date: Tue, 18 May 1999 20:48:00 -0400 Subject: [Python-Dev] 'stackless' python? In-Reply-To: <199905180403.AAA04772@eric.cnri.reston.va.us> Message-ID: <000701bea191$3f4d1a20$2e9e2299@tim> [GvR] > ... 
> Perhaps it would help to explain what a continuation actually does > with the run-time environment, instead of giving examples of how to > use them and what the result it? Paul Wilson (the GC guy) has a very nice-- but incomplete --intro to Scheme and its implementation: ftp://ftp.cs.utexas.edu/pub/garbage/cs345/schintro-v14/schintro_toc.html You can pick up a lot from that fast. Is Steven (Majewski) on this list? He doped most of this out years ago. > Here's a start of my own understanding (brief because I'm on a 28.8k > connection which makes my ordinary typing habits in Emacs very > painful). > > 1. All program state is somehow contained in a single execution stack. > This includes globals (which are simply name bindings in the botton > stack frame). Better to think of name resolution following lexical links. Lexical closures with indefinite extent are common in Scheme, so much so that name resolution is (at least conceptually) best viewed as distinct from execution stacks. Here's a key: continuations are entirely about capturing control flow state, and nothing about capturing binding or data state. Indeed, mutating bindings and/or non-local data are the ways distinct invocations of a continuation communicate with each other, and for this reason true functional languages generally don't support continuations of the call/cc flavor. > It also includes a code pointer for each stack frame indicating where > the function corresponding to that stack frame is executing (this is > the return address if there is a newer stack frame, or the current > instruction for the newest frame). Yes, although the return address is one piece of information in the current frame's continuation object -- continuations are used internally for "regular calls" too. When a function returns, it passes control thru its continuation object. 
That process restores-- from the continuation object --what the caller needs to know (in concept: a pointer to *its* continuation object, its PC, its name-resolution chain pointer, and its local eval stack). Another key point: a continuation object is immutable. > 2. A continuation does something equivalent to making a copy of the > entire execution stack. This can probably be done lazily. There are > probably lots of details. The point of the above is to get across that for Scheme-calling-Scheme, creating a continuation object copies just a small, fixed number of pointers (the current continuation pointer, the current name-resolution chain pointer, the PC), plus the local eval stack. This is for a "stackless" interpreter that heap-allocates name-mapping and execution-frame and continuation objects. Half the literature is devoted to optimizing one or more of those away in special cases (e.g., for continuations provably "up-level", using a stack + setjmp/longjmp instead). > I also expect that Scheme's semantic model is different than Python > here -- e.g. does it matter whether deep or shallow copies are made? > I.e. are there mutable *objects* in Scheme? (I know there are mutable > and immutable *name bindings* -- I think.) Same as Python here; Scheme isn't a functional language; has mutable bindings and mutable objects; any copies needed should be shallow, since it's "a feature" that invoking a continuation doesn't restore bindings or object values (see above re communication). > 3. Calling a continuation probably makes the saved copy of the > execution stack the current execution state; I presume there's also a > way to pass an extra argument. Right, except "stack" is the wrong mental model in the presence of continuations; it's a general rooted graph (A calls B, B saves a continuation pointing back to A, B goes on to call A, A saves a continuation pointing back to B, etc). 
If the explicitly saved continuations are never *invoked*, control will eventually pop back to the root of the graph, so in that sense there's *a* stack implicit at any given moment. > 4. Coroutines (which I *do* understand) are probably done by swapping > between two (or more) continuations. > > 5. Other control constructs can be done by various manipulations of > continuations. I presume that in many situations the saved > continuation becomes the main control locus permanently, and the > (previously) current stack is simply garbage-collected. Of course the > lazy copy makes this efficient. There's much less copying going on in Scheme-to-Scheme than you might think; other than that, right on. > If this all is close enough to the truth, I think that continuations > involving C stack frames are definitely out -- as Tim Peters > mentioned, you don't know what the stuff on the C stack of extensions > refers to. (My guess would be that Scheme implementations assume that > any pointers on the C stack point to Scheme objects, so that C stack > frames can be copied and conservative GC can be used -- this will > never happen in Python.) "Scheme" has become a generic term covering dozens of implementations with varying semantics, and a quick tour of the web suggests that cross-language Schemes generally put severe restrictions on continuations across language boundaries. Most popular seems to be to outlaw them by decree. > Continuations involving only Python stack frames might be supported, > if we can agree on the sharing / copying semantics. This is where > I don't know enough (see questions at #2 above). I'd like to go back to examples of what they'd be used for -- but fully fleshed out. In the absence of Scheme's ubiquitous lexical closures and "lambdaness" and syntax-extension facilities, I'm unsure they're going to work out reasonably in Python practice; it's not enough that they can be very useful in Scheme, and Sam is highly motivated to go to extremes here.
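Tim's description earlier -- a function "returns" by passing control thru its continuation object -- is what continuation-passing style makes explicit. A toy Python rendering (purely illustrative; the names are invented and nothing like this appears in ceval.c):

```python
# Continuation-passing style: the "return address" becomes an explicit
# function argument k.  Returning a value means calling k with it.
def add(a, b, k):
    k(a + b)          # "return" a + b thru the continuation

def square(n, k):
    k(n * n)

# compute (3 + 4) ** 2, naming where control flows after each step
result = []
add(3, 4, lambda s: square(s, result.append))
```

Here k plays the role of the continuation object: every call site says explicitly where control goes next, which is just what call/cc reifies as a first-class value.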
give-me-a-womb-and-i-still-won't-give-birth-ly y'rs - tim From tismer@appliedbiometrics.com Wed May 19 02:10:15 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Wed, 19 May 1999 03:10:15 +0200 Subject: [Python-Dev] 'stackless' python? References: <000701bea191$3f4d1a20$2e9e2299@tim> Message-ID: <37420F77.48E9940F@appliedbiometrics.com> Tim Peters wrote: ... > > Continuations involving only Python stack frames might be supported, > > if we can agree on the sharing / copying semantics. This is where > > I don't know enough (see questions at #2 above). > > I'd like to go back to examples of what they'd be used for -- but > fully fleshed out. In the absence of Scheme's ubiquitous lexical closures > and "lambdaness" and syntax-extension facilities, I'm unsure they're going > to work out reasonably in Python practice; it's not enough that they can be > very useful in Scheme, and Sam is highly motivated to go to extremes here. > > give-me-a-womb-and-i-still-won't-give-birth-ly y'rs - tim I've already put quite a few hours into a non-recursive ceval.c. Should I continue? At least this would be a little improvement, even if the continuation thing is never born. - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From rushing@nightmare.com Wed May 19 03:52:04 1999 From: rushing@nightmare.com (rushing@nightmare.com) Date: Tue, 18 May 1999 19:52:04 -0700 (PDT) Subject: [Python-Dev] Is there another way to solve the continuation problem? In-Reply-To: <101382377@toto.iv> Message-ID: <14146.8395.754509.591141@seattle.nightmare.com> Skip Montanaro writes: > Can exceptions be coerced into providing the necessary structure > without botching up the application too badly?
Seems that at some > point where you need to do some I/O, you could raise an exception > whose second expression contains the necessary state to get back to > where you need to be once the I/O is ready to go. The controller > that catches the exceptions would use select or poll to prepare for > the I/O then dispatch back to the handlers using the information > from exceptions. > [... code ...] Well, you just re-invented the 'Reactor' pattern! 8^) http://www.cs.wustl.edu/~schmidt/patterns-ace.html > One thread, some craftiness needed to construct things. Seems like > it might isolate some of the statefulness to smaller functional > units than a pure state machine. Clearly not as clean as > continuations would be. Totally bogus? Totally inadequate? Maybe > Sam already does things this way? What you just described is what Medusa does (well, actually, 'Python' does it now, because the two core libraries that implement this are now in the library - asyncore.py and asynchat.py). asyncore doesn't really use exceptions exactly that way, and asynchat allows you to add another layer of processing (basically, dividing the input into logical 'lines' or 'records' depending on a 'line terminator'). The same technique is at the heart of many well-known network servers, including INND, BIND, X11, Squid, etc.. It's really just a state machine underneath (with python functions or methods implementing the 'states'). As long as things don't get too complex. Python simplifies things enough to allow one to 'push the difficulty envelope' a bit further than one could reasonably tolerate in C. For example, Squid implements async HTTP (server and client, because it's a proxy) - but stops short of trying to implement async FTP. Medusa implements async FTP, but it's the largest file in the Medusa distribution, weighing in at a hefty 32KB. The hard part comes when you want to plug different pieces and protocols together. 
For example, building a simple HTTP or FTP server is relatively easy, but building an HTTP server *that proxied to an FTP server* is much more difficult. I've done these kinds of things, viewing each as a challenge; but past a certain point it boggles. The paper I posted about earlier by Matthew Fuchs has a really good explanation of this, but in the context of GUI event loops... I think it ties in neatly with this discussion because at the heart of any X11 app is a little guy manipulating a file descriptor. -Sam From tim_one@email.msn.com Wed May 19 06:41:39 1999 From: tim_one@email.msn.com (Tim Peters) Date: Wed, 19 May 1999 01:41:39 -0400 Subject: [Python-Dev] 'stackless' python? In-Reply-To: <14144.61765.308962.101884@seattle.nightmare.com> Message-ID: <000b01bea1ba$443a1a00$2e9e2299@tim> [Sam] > ... > Except that since the escape procedure is 'first-class' it can be > stored away and invoked (and reinvoked) later. [that's all that > 'first-class' means: a thing that can be stored in a variable, > returned from a function, used as an argument, etc..] > > I've never seen a let/cc that wasn't full-blown, but it wouldn't > surprise me. The let/cc's in question were specifically defined to create continuations valid only during let/cc's dynamic extent, so that, sure, you could store them away, but trying to invoke one later could be an error. It's in that sense I meant they weren't "first class". Other flavors of Scheme appear to call this concept "weak continuation", and use a different verb to invoke it (like call-with-escaping-continuation, or call/ec). Suspect the let/cc oddballs I found were simply confused implementations (there are a lot of amateur Scheme implementations out there!). >> Would full-blown coroutines be powerful enough for your needs? > Yes, I think they would be. But I think with Python it's going to > be just about as hard, either way. 
Most people on this list are comfortable with coroutines already because they already understand them -- Jeremy can even reach across the hall and hand Guido a helpful book. So pondering coroutines increases the number of brain cells willing to think about the implementation. continuation-examples-leave-people-still-going-"huh?"-after-an-hour-of-explanation-ly y'rs - tim From tim_one@email.msn.com Wed May 19 06:41:45 1999 From: tim_one@email.msn.com (Tim Peters) Date: Wed, 19 May 1999 01:41:45 -0400 Subject: [Python-Dev] 'stackless' python? In-Reply-To: <3741A1FC.E84DC926@appliedbiometrics.com> Message-ID: <000e01bea1ba$47fe7500$2e9e2299@tim> [Christian Tismer] >>> ... >>> Yup. With a little counting, it was easy to survive:
>>>
>>> def main():
>>>     global a
>>>     a=2
>>>     thing (5)
>>>     a=a-1
>>>     if a:
>>>         saved.throw (0)
[Tim] >> Did "a" really need to be global here? I hope you see the same behavior >> without the "global a"; [which he does, but for mysterious reasons] [Christian] > Actually, the frame-copying was not enough to make this > all behave correctly. Since I didn't change the interpreter, > the ceval.c incarnations still had references to the old frames. > The only effect which I achieved with frame copying was > that the refcounts were increased correctly. All right! Now you're closer to the real solution; i.e., copying wasn't really needed here, but keeping stuff alive was. In Scheme terms, when we entered main originally a set of bindings was created for its locals, and it is that very same set of bindings to which the continuation returns. So the continuation *should* reuse them -- making a copy of the locals is semantically hosed. This is clearer in Scheme because its "stack" holds *only* control-flow info (bindings follow a chain of static links, independent of the current "call stack"), so there's no temptation to run off copying bindings too.
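The "reuse the same bindings" point can be seen in ordinary Python by modeling the binding for "a" as a shared cell. This is a hedged stand-in -- plain Python has no continuations, so a re-callable step function plays the part of the resumed body, and all the names are invented:

```python
# Christian's countdown with the binding for `a` modeled as a cell.
# Resuming against the shared cell converges; resuming against a
# *copy* of the bindings never makes progress -- the "semantically
# hosed" case.
def step(cell):
    cell['a'] = cell['a'] - 1     # the body: a = a - 1
    return cell['a']

shared = {'a': 2}
trace = []
while 1:
    a = step(shared)              # every "resumption" sees the same binding
    trace.append(a)
    if not a:
        break
# trace is now [1, 0]: the countdown terminates

copied = {'a': 2}
first = step(dict(copied))        # resuming a snapshot of the bindings...
second = step(dict(copied))       # ...sees a == 2 every time: no progress
```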
elegant-and-baffling-for-the-price-of-one-ly y'rs - tim From tim_one@email.msn.com Wed May 19 06:41:56 1999 From: tim_one@email.msn.com (Tim Peters) Date: Wed, 19 May 1999 01:41:56 -0400 Subject: [Python-Dev] 'stackless' python? In-Reply-To: <37420F77.48E9940F@appliedbiometrics.com> Message-ID: <001301bea1ba$4eb498c0$2e9e2299@tim> [Christian Tismer] > I've already put quite a few hours into a non-recursive ceval.c. Does that mean 6 or 600 ? > Should I continue? At least this would be a little improvement, even > if the continuation thing is never born. Guido wanted to move in the "flat interpreter" direction for Python2 anyway, so my belief is it's worth pursuing. but-then-i-flipped-a-coin-with-two-heads-ly y'rs - tim From arw@ifu.net Wed May 19 14:04:53 1999 From: arw@ifu.net (Aaron Watters) Date: Wed, 19 May 1999 09:04:53 -0400 Subject: [Python-Dev] continuations and C extensions? Message-ID: <3742B6F5.C6CB7313@ifu.net> the immutable GvR intones: > Continuations involving only Python stack frames might be supported, > if we can agree on the sharing / copying semantics. This is where > I don't know enough (see questions at #2 above). What if there are native C calls mixed in (eg, list.sort calls back to myclass.__cmp__ which decides to do a call/cc)? One of the really big advantages of Python in my book is the relative simplicity of embedding and extensions, and this is generally one of the failings of Lisp implementations. I understand lots of Scheme implementations purport to be extendible and embeddable, but in practice you can't do it with *existing* code -- there is always a show stopper involving having to change the way some Oracle library which you don't have the source for does memory management or something... I've known several grad students who have been bitten by this... I think having to unroll the C stack safely might be one problem area.
With, eg, a netscape nsapi embedding you can actually get into netscape code calls my code calls netscape code calls my code... which suspends in a continuation? How would that work? [my ignorance is torment!] Threading and extensions are probably also problematic, but at least they're better understood, I think. Just kvetching. Sorry. -- Aaron Watters ps: Of course there are valid reasons and excellent advantages to having continuations, but it's also interesting to consider the possible cost. There ain't no free lunch. From tismer@appliedbiometrics.com Wed May 19 20:30:18 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Wed, 19 May 1999 21:30:18 +0200 Subject: [Python-Dev] 'stackless' python? References: <000e01bea1ba$47fe7500$2e9e2299@tim> Message-ID: <3743114A.220FFA0B@appliedbiometrics.com> Tim Peters wrote: ... > [Christian] > > Actually, the frame-copying was not enough to make this > > all behave correctly. Since I didn't change the interpreter, > > the ceval.c incarnations still had references to the old frames. > > The only effect which I achieved with frame copying was > > that the refcounts were increased correctly. > > All right! Now you're closer to the real solution; i.e., copying > wasn't really needed here, but keeping stuff alive was. In Scheme terms, > when we entered main originally a set of bindings was created for its > locals, and it is that very same set of bindings to which the continuation > returns. So the continuation *should* reuse them -- making a copy of the > locals is semantically hosed. I tried the simplest thing, and this seemed to be duplicating the current state of the machine. The frame holds the stack, and references to all objects. By chance, the locals are not in a dict, but unpacked into the frame.
(Sometimes I agree with Guido, that optimization is considered harmful :-) > This is clearer in Scheme because its "stack" holds *only* control-flow info > (bindings follow a chain of static links, independent of the current "call > stack"), so there's no temptation to run off copying bindings too. The Python stack, besides its intermingledness with the machine stack, is basically its chain of frames. The value stack pointer still hides in the machine stack, but that's easy to change. So the real Scheme-like part is this chain, methinks, with the current bytecode offset and value stack info. Making a copy of this in a restartable way means increasing the refcount of all objects in a frame. Would it be correct to undo the effect of fast locals before splitting, and redoing it on activation? Or do I need to rethink the whole structure? What would be natural for Python, if at all? ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From jeremy@cnri.reston.va.us Wed May 19 20:46:49 1999 From: jeremy@cnri.reston.va.us (Jeremy Hylton) Date: Wed, 19 May 1999 15:46:49 -0400 (EDT) Subject: [Python-Dev] 'stackless' python? In-Reply-To: <3743114A.220FFA0B@appliedbiometrics.com> References: <000e01bea1ba$47fe7500$2e9e2299@tim> <3743114A.220FFA0B@appliedbiometrics.com> Message-ID: <14147.4976.608139.212336@bitdiddle.cnri.reston.va.us> >>>>> "CT" == Christian Tismer writes: [Tim Peters] >> This is clearer in Scheme because its "stack" holds *only* >> control-flow info (bindings follow a chain of static links, >> independent of the current "call stack"), so there's no >> temptation to run off copying bindings too.
CT> The Python stack, besides its intermingledness with the machine CT> stack, is basically its chain of frames. The value stack pointer CT> still hides in the machine stack, but that's easy to change. So CT> the real Scheme-like part is this chain, methinks, with the CT> current bytecode offset and value stack info. CT> Making a copy of this in a restartable way means to increase the CT> refcount of all objects in a frame. Would it be correct to undo CT> the effect of fast locals before splitting, and redoing it on CT> activation? Wouldn't it be easier to increase the refcount on the frame object? Then you wouldn't need to worry about the refcounts on all the objects in the frame, because they would only be decrefed when the frame is deallocated. It seems like the two other things you would need are some way to get a copy of the current frame and a means to invoke eval_code2 with an already existing stack frame instead of a new one. (This sounds too simple, so it's obviously wrong. I'm just not sure where. Is the problem that you really need a separate stack/graph to hold the frames? If we leave them on the Python stack, it could be hard to dis-entangle value objects from control objects.) Jeremy From tismer@appliedbiometrics.com Wed May 19 21:10:16 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Wed, 19 May 1999 22:10:16 +0200 Subject: [Python-Dev] 'stackless' python? References: <000e01bea1ba$47fe7500$2e9e2299@tim> <3743114A.220FFA0B@appliedbiometrics.com> <14147.4976.608139.212336@bitdiddle.cnri.reston.va.us> Message-ID: <37431AA8.BC77C615@appliedbiometrics.com> Jeremy Hylton wrote: [TP+CT about frame copies et al] > Wouldn't it be easier to increase the refcount on the frame object? > Then you wouldn't need to worry about the refcounts on all the objects > in the frame, because they would only be decrefed when the frame is > deallocated.
Well, the frame is supposed to be run twice, since there are two incarnations of interpreters working on it: The original one, and later, when it is thrown, another one (or the same, but, in principle). The frame could have been in any state, with a couple of objects on the stack. My splitting function can be invoked in some nested context, so I have a current opcode position, and a current stack position. Running this once leaves the stack empty, since all the objects are decrefed. Running this a second time gives a GPF, since the stack is empty. Therefore, I made a copy which means to create a duplicate frame with an extra refcount for all the objects. This makes sure that both can be restarted at any time. > It seems like the two other things you would need are some way to get > a copy of the current frame and a means to invoke eval_code2 with an > already existing stack frame instead of a new one. Well, that's exactly what I'm working on. > (This sounds too simple, so it's obviously wrong. I'm just not sure > where. Is the problem that you really need a separate stack/graph to > hold the frames? If we leave them on the Python stack, it could be > hard to dis-entangle value objects from control objects.) Oh, perhaps I should explain it a bit more clearly? What did you mean by the Python stack? The hardware machine stack? What do we have at the moment: The stack is the linked list of frames. Every frame has a local Python evaluation stack. Calls of Python functions produce a new frame, and the old one is put beneath. This is the control stack. The additional info on the hardware stack happens to be a parallel friend of this chain, and currently holds extra info, but this is an artifact. Adding the current Python stack level to the frame makes the hardware stack totally unnecessary. There is a possible speed loss, anyway. Today, the recursive call of ceval2 is optimized and quite fast.
The non-recursive version will have to copy variables in and out from the frames, instead, so there is of course a little speed penalty to pay. ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From tismer@appliedbiometrics.com Wed May 19 22:38:07 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Wed, 19 May 1999 23:38:07 +0200 Subject: [Python-Dev] 'stackless' python? References: <001301bea1ba$4eb498c0$2e9e2299@tim> Message-ID: <37432F3F.2694DA0E@appliedbiometrics.com> Tim Peters wrote: > > [Christian Tismer] > > I've already put quite a few hours into a non-recursive ceval.c. > > Does that mean 6 or 600 ? 6, or 10, or 20, if I count the time from the first start with Sam's code, maybe. > > > Should I continue? At least this would be a little improvement, even > > if the continuation thing is never born. > > Guido wanted to move in the "flat interpreter" direction for Python2 anyway, > so my belief is it's worth pursuing. > > but-then-i-flipped-a-coin-with-two-heads-ly y'rs - tim Right. Whose faces? :-) On the stackless thing, what should I do? I started to insert minimum patches, but it turns out that I have to change frames a little (extending them). I can make quite small changes to the interpreter to replace the recursive calls, but this involves extra flags in some cases, where the interpreter is called the first time and so on. Which has more chance of being included in a future Python: tweaking the current thing only minimally, to make it as similar as possible to the former? Or doing as much redesign as I think is needed to do it in a clean way.
This would mean splitting eval_code2 into two functions, where one is the interpreter kernel, and one is the frame manager. There are also other places which do quite deep function calls and finally call eval_code2. I think these should return a frame object now. I could convince them to call or return a frame, depending on a flag, but it would be cleaner to rename the functions, let them always deal with frames, and put the original function on top of that. In short, I can make larger changes which clean this all up a bit, or I can make small changes which are trickier to grasp, but give just small diffs. What's the best way to touch untouchable code? :-) -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From jeremy@cnri.reston.va.us Wed May 19 22:49:38 1999 From: jeremy@cnri.reston.va.us (Jeremy Hylton) Date: Wed, 19 May 1999 17:49:38 -0400 (EDT) Subject: [Python-Dev] 'stackless' python? In-Reply-To: <37432F3F.2694DA0E@appliedbiometrics.com> References: <001301bea1ba$4eb498c0$2e9e2299@tim> <37432F3F.2694DA0E@appliedbiometrics.com> Message-ID: <14147.12613.88669.456608@bitdiddle.cnri.reston.va.us> I think it makes sense to avoid being obscure or unclear in order to minimize the size of the patch or the diff. Realistically, it's unlikely that anything like your original patch is going to make it into the CVS tree. Its primary value is as proof of concept and as code that the rest of us can try out. If you make large changes, but they are clearer, you'll help us out a lot. We can worry about minimizing the impact of the changes on the codebase afterwards, once everyone has figured out what's going on and agrees that it's worth doing.
feeling-much-more-confident-because-I-didn't-say-continuation-ly yr's, Jeremy From tismer@appliedbiometrics.com Wed May 19 23:25:20 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Thu, 20 May 1999 00:25:20 +0200 Subject: [Python-Dev] 'stackless' python? References: <001301bea1ba$4eb498c0$2e9e2299@tim> <37432F3F.2694DA0E@appliedbiometrics.com> <14147.12613.88669.456608@bitdiddle.cnri.reston.va.us> Message-ID: <37433A50.31E66CB1@appliedbiometrics.com> Jeremy Hylton wrote: > > I think it makes sense to avoid being obscure or unclear in order to > minimize the size of the patch or the diff. Realistically, it's > unlikely that anything like your original patch is going to make it > into the CVS tree. Its primary value is as proof of concept and as > code that the rest of us can try out. If you make large changes, but > they are clearer, you'll help us out a lot. Many many thanks. This is good advice. I will make absolutely clear what's going on, keep parts as untouched as possible, cut out parts which must change, and I will not look into speed too much. Better to have one function call more and a bit less optimization, but a clear and rock-solid introduction of a concept. > We can worry about minimizing the impact of the changes on the > codebase afterwards, once everyone has figured out what's going on and > agrees that it's worth doing. > > feeling-much-more-confident-because-I-didn't-say-continuation-ly yr's, > Jeremy Hihi - the new little slot with local variables of the interpreter happens to have the name "continuation". Maybe I'd better rename it to "activation record"? Now, there is no longer a recursive call. Instead, a frame object is returned, which is waiting to be activated by a dispatcher. Some more ideas are popping up. Right now, only the recursive calls can vanish. Callbacks from C code which is called by the interpreter which is called by... are still a problem. But they might perhaps vanish completely. We have to see how much the cost is.
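The dispatcher Christian describes is a trampoline: the interpreter never recurses; each "frame" hands back the next frame to fire. A toy model in plain Python (fact_frame and dispatch are invented names; the real patch traffics in genuine frame objects, not thunks):

```python
def fact_frame(n, acc=1):
    # one "frame" of work: either hand the dispatcher the next frame
    # to run (here just a thunk), or produce the final value
    if n > 1:
        return lambda: fact_frame(n - 1, acc * n)
    return acc

def dispatch(frame):
    # the flat loop: fire frames until one yields a value instead of
    # another frame -- the machine stack stays at constant depth
    result = frame()
    while callable(result):
        result = result()
    return result

small = dispatch(lambda: fact_frame(5))      # 120
deep = dispatch(lambda: fact_frame(10000))   # far past the usual
                                             # recursion limit, no overflow
```

The same computation written as ordinary recursion would need 10000 nested interpreter invocations; here the chain lives on the heap and the dispatcher's loop never deepens.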
But if I can manage to let the interpreter duck and cover also on every call to a builtin? The interpreter again returns to the dispatcher which then calls the builtin. Well, if that builtin happens to call to the interpreter again, it will be a dispatcher again. The machine stack grows a little, but since everything is saved in the frames, these stacks are no longer related. This means, the principle works with existing extension modules, since interpreter-world and C-stack world are decoupled. To avoid stack growth, of course a number of builtins would be better changed, but it is no must in the first place. execfile for instance is a candidate which needn't call the interpreter. It could equally parse the file, generate the code object, build a frame and just return it. This is what the dispatcher likes: returned frames are put on the chain and fired. waah, my bus - running - ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From tim_one@email.msn.com Thu May 20 00:56:33 1999 From: tim_one@email.msn.com (Tim Peters) Date: Wed, 19 May 1999 19:56:33 -0400 Subject: [Python-Dev] A "real" continuation example In-Reply-To: <3743114A.220FFA0B@appliedbiometrics.com> Message-ID: <000701bea253$3a182a00$179e2299@tim> I'm home sick today, so tortured myself <0.9 wink>. Sam mentioned using coroutines to compare the fringes of two trees, and I picked a simpler problem: given a nested list structure, generate the leaf elements one at a time, in left-to-right order. A solution to Sam's problem can be built on that, by getting a generator for each tree and comparing the leaves a pair at a time until there's a difference. Attached are solutions in Icon, Python and Scheme. 
I have the least experience with Scheme, but browsing around didn't find a better Scheme approach than this. The Python solution is the least satisfactory, using an explicit stack to simulate recursion by hand; if you didn't know the routine's purpose in advance, you'd have a hard time guessing it. The Icon solution is very short and simple, and I'd guess obvious to an average Icon programmer. It uses the subset of Icon ("generators") that doesn't require any C-stack trickery. However, alone of the three, it doesn't create a function that could be explicitly called from several locations to produce "the next" result; Icon's generators are tied into Icon's unique control structures to work their magic, and breaking that connection requires moving to full-blown Icon coroutines. It doesn't need to be that way, though. The Scheme solution was the hardest to write, but is a largely mechanical transformation of a recursive fringe-lister that constructs the entire fringe in one shot. Continuations are used twice: to enable the recursive routine to resume itself where it left off, and to get each leaf value back to the caller. Getting that to work required rebinding non-local identifiers in delicate ways. I doubt the intent would be clear to an average Scheme programmer. So what would this look like in Continuation Python? Note that each place the Scheme says "lambda" or "letrec", it's creating a new lexical scope, and up-level references are very common. Two functions are defined at top level, but seven more at various levels of nesting; the latter can't be pulled up to the top because they refer to vrbls local to the top-level functions. Another (at least initially) discouraging thing to note is that Scheme schemes for hiding the pain of raw call/cc often use Scheme's macro facilities. 
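Sam's two-tree problem then reduces to pulling one leaf at a time from each structure and comparing pairwise; a hedged sketch in the spirit of the attached Python (FringeWalker and samefringe are invented for illustration):

```python
# Incremental fringe walker: each .next() call produces one leaf,
# using an explicit stack in place of a saved continuation.
_END = object()   # sentinel: fringe exhausted

class FringeWalker:
    def __init__(self, value):
        self.stack = [(value, 0)]

    def next(self):
        while self.stack:
            v, i = self.stack[-1]
            if i >= len(v):
                self.stack.pop()
                continue
            self.stack[-1] = (v, i + 1)
            if isinstance(v[i], list):
                self.stack.append((v[i], 0))
            else:
                return v[i]
        return _END

def samefringe(t1, t2):
    # compare leaves a pair at a time; stop at the first difference,
    # without ever flattening either tree in full
    w1, w2 = FringeWalker(t1), FringeWalker(t2)
    while 1:
        a, b = w1.next(), w2.next()
        if a is not b and a != b:
            return 0
        if a is _END:
            return 1
```

The point of the whole exercise is that neither fringe is materialized: unequal trees are rejected as soon as the first differing pair of leaves appears.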
may-not-be-as-fun-as-it-sounds-ly y'rs - tim

Here's the Icon:

procedure main()
    x := [[1, [[2, 3]]], [4], [], [[[5]], 6]]
    every writes(fringe(x), " ")
    write()
end

procedure fringe(node)
    if type(node) == "list" then
        suspend fringe(!node)
    else
        suspend node
end

Here's the Python:

from types import ListType

class Fringe:
    def __init__(self, value):
        self.stack = [(value, 0)]

    def __getitem__(self, ignored):
        while 1:
            # find topmost pending list with something to do
            while 1:
                if not self.stack:
                    raise IndexError
                v, i = self.stack[-1]
                if i < len(v):
                    break
                self.stack.pop()
            this = v[i]
            self.stack[-1] = (v, i+1)
            if type(this) is ListType:
                self.stack.append((this, 0))
            else:
                break
        return this

testcase = [[1, [[2, 3]]], [4], [], [[[5]], 6]]

for x in Fringe(testcase):
    print x,
print

Here's the Scheme:

(define list->generator
  ; Takes a list as argument.
  ; Returns a generator g such that each call to g returns
  ; the next element in the list's symmetric-order fringe.
  (lambda (x)
    (letrec {(produce-value #f)  ; set to return-to continuation
             (looper
              (lambda (x)
                (cond ((null? x) 'nada)  ; ignore null
                      ((list? x) (looper (car x)) (looper (cdr x)))
                      (else
                       ; want to produce this non-list fringe elt,
                       ; and also resume here
                       (call/cc
                        (lambda (here)
                          (set! getnext (lambda () (here 'keep-going)))
                          (produce-value x)))))))
             (getnext
              (lambda ()
                (looper x)
                ; have to signal end of sequence somehow;
                ; assume false isn't a legitimate fringe elt
                (produce-value #f)))}
      ; return niladic function that returns next value
      (lambda ()
        (call/cc
         (lambda (k)
           (set! produce-value k)
           (getnext)))))))

(define display-fringe
  (lambda (x)
    (letrec ((g (list->generator x))
             (thiselt #f)
             (looper
              (lambda ()
                (set! thiselt (g))
                (if thiselt
                    (begin (display thiselt) (display " ") (looper))))))
      (looper))))

(define test-case '((1 ((2 3))) (4) () (((5)) 6)))

(display-fringe test-case)

From MHammond@skippinet.com.au Thu May 20 01:14:24 1999 From: MHammond@skippinet.com.au (Mark Hammond) Date: Thu, 20 May 1999 10:14:24 +1000 Subject: [Python-Dev] Interactive Debugging of Python Message-ID: <008b01bea255$b80cf790$0801a8c0@bobcat> All this talk about stack frames and manipulating them at runtime has reminded me of one of my biggest gripes about Python. When I say "biggest gripe", I really mean "biggest surprise" or "biggest shame". That is, Python is very interactive and dynamic. However, when I am debugging Python, it seems to lose this. There is no way for me to effectively change a running program. Now with VC6, I can do this with C. Although it is slow and a little dumb, I can change the C side of my Python world while my program is running, but not the Python side of the world. I'm wondering how feasible it would be to change Python code _while_ running under the debugger. Presumably this would require a way of recompiling the current block of code, patching this code back into the code object, and somehow tricking the stack frame into using this new block of code; even if a first-cut had to restart the block or somesuch... Any thoughts on this? Mark. From tim_one@email.msn.com Thu May 20 03:41:03 1999 From: tim_one@email.msn.com (Tim Peters) Date: Wed, 19 May 1999 22:41:03 -0400 Subject: [Python-Dev] 'stackless' python? In-Reply-To: <3743114A.220FFA0B@appliedbiometrics.com> Message-ID: <000901bea26a$34526240$179e2299@tim> [Christian Tismer] > I tried the simplest thing, and this seemed to be duplicating > the current state of the machine. The frame holds the stack, > and references to all objects. > By chance, the locals are not in a dict, but unpacked into > the frame.
> (Sometimes I agree with Guido, that optimization
> is considered harmful :-)

I don't see that the locals are a problem here -- provided you simply leave them alone.

> The Python stack, besides its intermingledness with the machine
> stack, is basically its chain of frames.

Right.

> The value stack pointer still hides in the machine stack, but
> that's easy to change.

I'm not sure what "value stack" means here, or "machine stack". The latter means the C stack? Then I don't know which values you have in mind that are hiding in it (the locals are, as you say, unpacked in the frame, and the evaluation stack too). By "evaluation stack" I mean specifically f->f_valuestack; the current *top* of stack pointer (specifically stack_pointer) lives in the C stack -- is that what we're talking about? Whichever, when we're talking about the code, let's use the names the code uses.

> So the real Scheme-like part is this chain, methinks, with
> the current bytecode offset and value stack info.

Curiously, f->f_lasti is already materialized every time we make a call, in order to support tracing. So if capturing a continuation is done via a function call (hard to see any other way it could be done), a bytecode offset is already getting saved in the frame object.

> Making a copy of this in a restartable way means to increase
> the refcount of all objects in a frame.

You later had a vision of splitting the frame into two objects -- I think. Whichever part the locals live in should not be copied at all, but merely have its (single) refcount increased. The other part hinges on details of your approach I don't know. The nastiest part seems to be f->f_valuestack, which conceptually needs to be (shallow) copied in the current frame and in all other frames reachable from the current frame's continuation (the chain rooted at f->f_back today); that's the sum total (along with the same frames' bytecode offsets) of capturing the control flow state.
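[The shallow-copy scheme just described can be modeled in miniature. The following is a toy sketch in modern Python, not CPython code: the Frame class and capture_continuation function are invented stand-ins, with field names borrowed from the real frame struct (f_back, f_lasti, f_valuestack) only for flavor.]

```python
# Toy model of capturing control-flow state: shallow-copy each frame's
# value stack and bytecode offset down the f_back chain, while the
# locals are shared (only a reference is duplicated), per the text above.

class Frame:
    def __init__(self, back=None):
        self.f_back = back          # chain rooted at the current frame
        self.f_lasti = 0            # bytecode offset, already materialized
        self.f_valuestack = []      # evaluation stack
        self.f_locals = {}          # NOT copied on capture

def capture_continuation(frame):
    """Shallow-copy the control-flow state of the whole frame chain."""
    if frame is None:
        return None
    snap = Frame(capture_continuation(frame.f_back))
    snap.f_lasti = frame.f_lasti
    snap.f_valuestack = list(frame.f_valuestack)  # shallow copy
    snap.f_locals = frame.f_locals                # shared, not copied
    return snap

outer = Frame()
inner = Frame(back=outer)
inner.f_valuestack.append("pending-value")
inner.f_locals["x"] = 5

k = capture_continuation(inner)
inner.f_valuestack.pop()      # later execution changes the live stack
inner.f_locals["x"] = 6       # ...and rebinds a local

print(k.f_valuestack)         # snapshot kept its own value stack
print(k.f_locals["x"])        # but sees the shared locals' new binding
```

The point of the sketch is exactly the split argued over below: the snapshot's value stack is insulated from later execution, while its locals are not.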
> Would it be correct to undo the effect of fast locals before
> splitting, and redoing it on activation?

Unsure what splitting means, but in any case I can't conceive of a reason for doing anything to the locals. Their values aren't *supposed* to get restored upon continuation invocation, so there's no reason to do anything with their values upon continuation creation either. Right? Or are we talking about different things?

almost-as-good-as-pantomime-ly y'rs - tim

From rushing@nightmare.com Thu May 20 05:04:20 1999
From: rushing@nightmare.com (rushing@nightmare.com)
Date: Wed, 19 May 1999 21:04:20 -0700 (PDT)
Subject: [Python-Dev] A "real" continuation example
In-Reply-To: <50692631@toto.iv>
Message-ID: <14147.34175.950743.79464@seattle.nightmare.com>

Tim Peters writes:
> The Scheme solution was the hardest to write, but is a largely
> mechanical transformation of a recursive fringe-lister that
> constructs the entire fringe in one shot. Continuations are used
> twice: to enable the recursive routine to resume itself where it
> left off, and to get each leaf value back to the caller. Getting
> that to work required rebinding non-local identifiers in delicate
> ways. I doubt the intent would be clear to an average Scheme
> programmer.

It's the only way to do it - every example I've seen of using call/cc looks just like it.

I reworked your Scheme a bit. IMHO letrec is for compilers, not for people. The following should be equivalent:

(define (list->generator x)
  (let ((produce-value #f))
    (define (looper x)
      (cond ((null? x) 'nada)
            ((list? x) (looper (car x)) (looper (cdr x)))
            (else
             (call/cc
              (lambda (here)
                (set! getnext (lambda () (here 'keep-going)))
                (produce-value x))))))
    (define (getnext)
      (looper x)
      (produce-value #f))
    (lambda ()
      (call/cc
       (lambda (k)
         (set! produce-value k)
         (getnext))))))

(define (display-fringe x)
  (let ((g (list->generator x)))
    (let loop ((elt (g)))
      (if elt
          (begin (display elt)
                 (display " ")
                 (loop (g)))))))

(define test-case '((1 ((2 3))) (4) () (((5)) 6)))

(display-fringe test-case)

> So what would this look like in Continuation Python?

Here's my first hack at it. Most likely wrong. It is REALLY HARD to do this without having the feature to play with. This presumes a function "call_cc" that behaves like Scheme's. I believe the extra level of indirection is necessary. (i.e., call_cc takes a function as an argument that takes a continuation function)

class list_generator:
    def __init__(self, x):
        self.x = x
        self.k_suspend = None
        self.k_produce = None

    def walk(self, x):
        if type(x) == type([]):
            for item in x:
                self.walk(item)
        else:
            self.item = x
            # call self.suspend() with a continuation
            # that will continue walking the tree
            call_cc(self.suspend)

    def __call__(self):
        # call self.resume() with a continuation
        # that will return the next fringe element
        return call_cc(self.resume)

    def resume(self, k_produce):
        self.k_produce = k_produce
        if self.k_suspend:
            # resume the suspended walk
            self.k_suspend(None)
        else:
            self.walk(self.x)

    def suspend(self, k_suspend):
        self.k_suspend = k_suspend
        # return a value for __call__
        self.k_produce(self.item)

Variables holding continuations have a 'k_' prefix. In real life it might be possible to put the suspend/call/resume machinery in a base class (Generator?), and override 'walk' as you please.

-Sam

From tim_one@email.msn.com Thu May 20 08:21:45 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Thu, 20 May 1999 03:21:45 -0400
Subject: [Python-Dev] A "real" continuation example
In-Reply-To: <14147.34175.950743.79464@seattle.nightmare.com>
Message-ID: <001d01bea291$6b3efbc0$179e2299@tim>

[Sam, takes up the Continuation Python Challenge]

Thanks, Sam! I think this is very helpful.

> ...
> It's the only way to do it - every example I've seen of using call/cc
> looks just like it.
Same here -- alas <0.5 wink>.

> I reworked your Scheme a bit. IMHO letrec is for compilers, not for
> people. The following should be equivalent:

I confess I stopped paying attention to Scheme after R4RS, largely because the std decreed that *so* many forms were optional. Your rework is certainly nicer, but internal defines and named let are two that R4RS refused to require, so I always avoided them. BTW, I *am* a compiler, so that never bothered me.

>> So what would this look like in Continuation Python?

> Here's my first hack at it. Most likely wrong. It is REALLY HARD to
> do this without having the feature to play with.

Fully understood. It's also really hard to implement the feature without knowing how someone who wants it would like it to behave. But I don't think anyone is getting graded on this, so let's have fun.

Ack! I have to sleep. Will study the code in detail later, but first impression was it looked good! Especially nice that it appears possible to package up most of the funky call_cc magic in a base class, so that non-wizards could reuse it by following a simple protocol.

great-fun-to-come-up-with-one-of-these-but-i'd-hate-to-have-to-redo-from-scratch-every-time-ly y'rs - tim

From skip@mojam.com (Skip Montanaro) Thu May 20 14:27:59 1999
From: skip@mojam.com (Skip Montanaro) (Skip Montanaro)
Date: Thu, 20 May 1999 09:27:59 -0400 (EDT)
Subject: [Python-Dev] A "real" continuation example
In-Reply-To: <14147.34175.950743.79464@seattle.nightmare.com>
References: <50692631@toto.iv> <14147.34175.950743.79464@seattle.nightmare.com>
Message-ID: <14148.3389.962368.221063@cm-29-94-2.nycap.rr.com>

Sam> I reworked your Scheme a bit. IMHO letrec is for compilers, not for
Sam> people.

Sam, you are aware of course that the timbot *is* a compiler, right? ;-)

>> So what would this look like in Continuation Python?

Sam> Here's my first hack at it. Most likely wrong. It is REALLY HARD to
Sam> do this without having the feature to play with.
The thought that it's unlikely one could arrive at a reasonable approximation of a correct solution for such a small problem without the ability to "play with" it is sort of scary. Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/ skip@mojam.com | Musi-Cal: http://www.musi-cal.com/ 518-372-5583 From tismer@appliedbiometrics.com Thu May 20 15:10:32 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Thu, 20 May 1999 16:10:32 +0200 Subject: [Python-Dev] Interactive Debugging of Python References: <008b01bea255$b80cf790$0801a8c0@bobcat> Message-ID: <374417D8.8DBCB617@appliedbiometrics.com> Mark Hammond wrote: > > All this talk about stack frames and manipulating them at runtime has > reminded me of one of my biggest gripes about Python. When I say "biggest > gripe", I really mean "biggest surprise" or "biggest shame". > > That is, Python is very interactive and dynamic. However, when I am > debugging Python, it seems to lose this. There is no way for me to > effectively change a running program. Now with VC6, I can do this with C. > Although it is slow and a little dumb, I can change the C side of my Python > world while my program is running, but not the Python side of the world. > > Im wondering how feasable it would be to change Python code _while_ running > under the debugger. Presumably this would require a way of recompiling the > current block of code, patching this code back into the object, and somehow > tricking the stack frame to use this new block of code; even if a first-cut > had to restart the block or somesuch... > > Any thoughts on this? I'm writing a prototype of a stackless Python, which means that you will be able to access the current state of the interpreter completely. The inner interpreter loop will be isolated from the frame dispatcher. It will break whenever the ticker goes zero. 
If you set the ticker to one, you will be able to single step on every opcode, have the value stack, the frame chain, everything. I think, with this you can do very much. But tell me if you want a callback hook somewhere.

ciao - chris

--
Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home

From tismer@appliedbiometrics.com Thu May 20 17:52:21 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Thu, 20 May 1999 18:52:21 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000901bea26a$34526240$179e2299@tim>
Message-ID: <37443DC5.1330EAC6@appliedbiometrics.com>

Cleaning up, clarifying, trying to understand...

Tim Peters wrote:
>
> [Christian Tismer]
> > I tried the most simple thing, and this seemed to be duplicating
> > the current state of the machine. The frame holds the stack,
> > and references to all objects.
> > By chance, the locals are not in a dict, but unpacked into
> > the frame. (Sometimes I agree with Guido, that optimization
> > is considered harmful :-)
>
> I don't see that the locals are a problem here -- provided you simply leave
> them alone.

This depends on whether I have to duplicate frames or not. Below...

> > The Python stack, besides its intermingledness with the machine
> > stack, is basically its chain of frames.
>
> Right.
>
> > The value stack pointer still hides in the machine stack, but
> > that's easy to change.
>
> I'm not sure what "value stack" means here, or "machine stack". The latter
> means the C stack? Then I don't know which values you have in mind that are
> hiding in it (the locals are, as you say, unpacked in the frame, and the
> evaluation stack too).
> By "evaluation stack" I mean specifically
> f->f_valuestack; the current *top* of stack pointer (specifically
> stack_pointer) lives in the C stack -- is that what we're talking about?

Exactly!

> Whichever, when we're talking about the code, let's use the names the code
> uses.

The evaluation stack pointer is a local variable in the C stack and must be written to the frame to become independent from the C stack. Sounds better now?

> > So the real Scheme-like part is this chain, methinks, with
> > the current bytecode offset and value stack info.
>
> Curiously, f->f_lasti is already materialized every time we make a call, in
> order to support tracing. So if capturing a continuation is done via a
> function call (hard to see any other way it could be done), a
> bytecode offset is already getting saved in the frame object.

You got me. I'm just completing what is partially there.

> > Making a copy of this in a restartable way means to increase
> > the refcount of all objects in a frame.
>
> You later had a vision of splitting the frame into two objects -- I think.

My wrong wording. Not splitting, but duplicating. If a frame is the current state, I make it two frames to have two current states. One will be saved, the other will be run. This is what I call "splitting". Actually, splitting must occur whenever a frame can be reached twice, in order to keep elements alive.

> Whichever part the locals live in should not be copied at all, but merely
> have its (single) refcount increased. The other part hinges on details of
> your approach I don't know. The nastiest part seems to be f->f_valuestack,
> which conceptually needs to be (shallow) copied in the current frame and in
> all other frames reachable from the current frame's continuation (the chain
> rooted at f->f_back today); that's the sum total (along with the same
> frames' bytecode offsets) of capturing the control flow state.

Well, I see. You want one locals and one globals, shared by two
Gets me into trouble. > > Would it be correct to undo the effect of fast locals before > > splitting, and redoing it on activation? > > Unsure what splitting means, but in any case I can't conceive of a reason > for doing anything to the locals. Their values aren't *supposed* to get > restored upon continuation invocation, so there's no reason to do anything > with their values upon continuation creation either. Right? Or are we > talking about different things? Let me explain. What Python does right now is: When a function is invoked, all local variables are copied into fast_locals, well of course just references are copied and counts increased. These fast locals give a lot of speed today, we must have them. You are saying I have to share locals between frames. Besides that will be a reasonable slowdown, since an extra structure must be built and accessed indirectly (right now, i's all fast, living in the one frame buffer), I cannot say that I'm convinced that this is what we need. Suppose you have a function def f(x): # do something ... # in some context, wanna have a snapshot global snapshot # initialized to None if not snapshot: snapshot = callcc.new() # continue computation x = x+1 ... What I want to achieve is that I can run this again, from my snapshot. But with shared locals, my parameter x of the snapshot would have changed to x+1, which I don't find useful. I want to fix a state of the current frame and still think it should "own" its locals. Globals are borrowed, anyway. Class instances will anyway do what you want, since the local "self" is a mutable object. How do you want to keep computations independent when locals are shared? For me it's just easier to implement and also to think with the shallow copy. Otherwise, where is my private place? Open for becoming convinced, of course :-) ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! 
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From jeremy@cnri.reston.va.us Thu May 20 20:26:30 1999 From: jeremy@cnri.reston.va.us (Jeremy Hylton) Date: Thu, 20 May 1999 15:26:30 -0400 (EDT) Subject: [Python-Dev] 'stackless' python? In-Reply-To: <37443DC5.1330EAC6@appliedbiometrics.com> References: <000901bea26a$34526240$179e2299@tim> <37443DC5.1330EAC6@appliedbiometrics.com> Message-ID: <14148.21750.738559.424456@bitdiddle.cnri.reston.va.us> >>>>> "CT" == Christian Tismer writes: CT> What I want to achieve is that I can run this again, from my CT> snapshot. But with shared locals, my parameter x of the snapshot CT> would have changed to x+1, which I don't find useful. I want to CT> fix a state of the current frame and still think it should "own" CT> its locals. Globals are borrowed, anyway. Class instances will CT> anyway do what you want, since the local "self" is a mutable CT> object. CT> How do you want to keep computations independent when locals are CT> shared? For me it's just easier to implement and also to think CT> with the shallow copy. Otherwise, where is my private place? CT> Open for becoming convinced, of course :-) I think you're making things a lot more complicated by trying to instantiate new variable bindings for locals every time you create a continuation. Can you give an example of why that would be helpful? (Ok. I'm not sure I can offer a good example of why it would be helpful to share them, but it makes intuitive sense to me.) The call_cc mechanism is going to let you capture the current continuation, save it somewhere, and call on it again as often as you like. Would you get a fresh locals each time you used it? or just the first time? If only the first time, it doesn't seem that you've gained a whole lot. 
Also, all the locals that are references to mutable objects are already effectively shared. So it's only a few oddballs like ints that are an issue. Jeremy From tim_one@email.msn.com Thu May 20 23:04:04 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 20 May 1999 18:04:04 -0400 Subject: [Python-Dev] A "real" continuation example In-Reply-To: <14148.3389.962368.221063@cm-29-94-2.nycap.rr.com> Message-ID: <000601bea30c$ad51b220$9d9e2299@tim> [Tim] > So what would this look like in Continuation Python? [Sam] > Here's my first hack at it. Most likely wrong. It is > REALLY HARD to do this without having the feature to play with. [Skip] > The thought that it's unlikely one could arrive at a reasonable > approximation of a correct solution for such a small problem without the > ability to "play with" it is sort of scary. Yes it is. But while the problem is small, it's not easy, and only the Icon solution wrote itself (not a surprise -- Icon was designed for expressing this kind of algorithm, and the entire language is actually warped towards it). My first stab at the Python stack-fiddling solution had bugs too, but I conveniently didn't post that . After studying Sam's code, I expect it *would* work as written, so it's a decent bet that it's a reasonable approximation to a correct solution as-is. A different Python approach using threads can be built using Demo/threads/Generator.py from the source distribution. To make that a fair comparison, I would have to post the supporting machinery from Generator.py too -- and we can ask Guido whether Generator.py worked right the first time he tried it . The continuation solution is subtle, requiring real expertise; but the threads solution doesn't fare any better on that count (building the support machinery with threads is also a baffler if you don't have thread expertise). If we threw Python metaclasses into the pot too, they'd be a third kind of nightmare for the non-expert. 
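[For contrast with the packagings Tim weighs above -- continuations, threads, metaclasses -- the same fringe problem later became a one-liner once Python grew generators. A sketch in modern Python; the yield syntax did not exist when this thread was written, and `yield from` arrived only in 3.3:]

```python
# Recursive generator for the fringe of a nested list: the recursion
# suspends itself at each leaf, with no explicit stack, threads, or
# continuations.

def fringe(node):
    if isinstance(node, list):
        for item in node:
            yield from fringe(item)   # delegate to the sub-walk
    else:
        yield node

testcase = [[1, [[2, 3]]], [4], [], [[[5]], 6]]
print(" ".join(str(x) for x in fringe(testcase)))   # 1 2 3 4 5 6
```

This is essentially Tim's Icon `suspend` transplanted into Python, which is where the thread eventually led.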
So, if you're faced with this kind of task, there's simply no easy way to get it done. Thread- and (it appears) continuation-based machinery can be crafted once by an expert, then packaged into an easy-to-use protocol for non-experts.

All in all, I view continuations as a feature most people should actively avoid! I think it has that status in Scheme too (e.g., the famed Schemer's SICP textbook doesn't even mention call/cc). Its real value (if any) is as a Big Invisible Hammer for certified wizards. Where call_cc leaks into the user's view of the world I'd try to hide it; e.g., where Sam has

    def walk(self, x):
        if type(x) == type([]):
            for item in x:
                self.walk(item)
        else:
            self.item = x
            # call self.suspend() with a continuation
            # that will continue walking the tree
            call_cc(self.suspend)

I'd do

    def walk(self, x):
        if type(x) == type([]):
            for item in x:
                self.walk(item)
        else:
            self.put(x)

where "put" is inherited from the base class (part of the protocol) and hides the call_cc business. Do enough of this, and we'll rediscover why Scheme demands that tail calls not push a new stack frame <0.9 wink>.

the-tradeoffs-are-murky-ly y'rs - tim

From tim_one@email.msn.com Thu May 20 23:04:09 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Thu, 20 May 1999 18:04:09 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <37443DC5.1330EAC6@appliedbiometrics.com>
Message-ID: <000701bea30c$af7a1060$9d9e2299@tim>

[Christian]
[... clarified stuff ... thanks! ... much clearer ...]

> ...
> If a frame is the current state, I make it two frames to have two
> current states. One will be saved, the other will be run. This is
> what I call "splitting". Actually, splitting must occur whenever
> a frame can be reached twice, in order to keep elements alive.

That part doesn't compute: if a frame can be reached by more than one path, its refcount must be at least equal to the number of its immediate predecessors, and its refcount won't fall to 0 before it becomes unreachable.
So while you may need to split stuff for *some* reasons, I can't see how keeping elements alive could be one of those reasons (unless you're zapping frame contents *before* the frame itself is garbage?).

> ...
> Well, I see. You want one locals and one globals, shared by two
> incarnations. Gets me into trouble.

Just clarifying what Scheme does. Since they've been doing this forever, I don't want to toss their semantics on a whim. It's at least a conceptual thing: why *should* locals follow different rules than globals? If Python2 grows lexical closures, the only thing special about today's "locals" is that they happen to be the first guys found on the search path. Conceptually, that's really all they are today too.

Here's the clearest Scheme example I can dream up:

(define k #f)

(define (printi i)
  (display "i is ") (display i) (newline))

(define (test n)
  (let ((i n))
    (printi i)
    (set! i (- i 1))
    (printi i)
    (display "saving continuation") (newline)
    (call/cc (lambda (here) (set! k here)))
    (set! i (- i 1))
    (printi i)
    (set! i (- i 1))
    (printi i)))

No loops, no recursive calls, just a straight chain of fiddle-a-local ops. Here's some output:

> (test 5)
i is 5
i is 4
saving continuation
i is 3
i is 2
> (k #f)
i is 1
i is 0
> (k #f)
i is -1
i is -2
> (k #f)
i is -3
i is -4
>

So there's no question about what Scheme thinks is proper behavior here.

> ...
> Let me explain. What Python does right now is:
> When a function is invoked, all local variables are copied
> into fast_locals, well of course just references are copied
> and counts increased. These fast locals give a lot of speed
> today, we must have them.

Scheme (most of 'em, anyway) also resolves locals via straight base + offset indexing.

> You are saying I have to share locals between frames.
> Besides
> that will be a reasonable slowdown, since an extra structure
> must be built and accessed indirectly (right now, it's all fast,
> living in the one frame buffer),

GETLOCAL and SETLOCAL simply index off of the fastlocals pointer; it doesn't care where that points *to*.

> I cannot say that I'm convinced that this is what we need.
>
> Suppose you have a function
>
> def f(x):
>     # do something
>     ...
>     # in some context, wanna have a snapshot
>     global snapshot  # initialized to None
>     if not snapshot:
>         snapshot = callcc.new()
>     # continue computation
>     x = x+1
>     ...
>
> What I want to achieve is that I can run this again, from my
> snapshot. But with shared locals, my parameter x of the
> snapshot would have changed to x+1, which I don't find useful.

You need a completely fleshed-out example to score points here: the use of call/cc is subtle, hinging on details, and fragments ignore too much. If you do want the same x,

    commonx = x
    if not snapshot:
        # get the continuation
    # continue computation
    x = commonx
    x = x+1
    ...

That is, it's easy to get it. But if you *do* want to see changes to the locals (which is one way for those distinct continuation invocations to *cooperate* in solving a task -- see below), but the implementation doesn't allow for it, I don't know what you can do to worm around it short of making x global too. But then different *top* level invocations of f will stomp on that shared global, so that's not a solution either. Maybe forget functions entirely and make everything a class method.
> Open for becoming convinced, of course :-)

I imagine it comes up less often in Scheme because it has no loops: communication among "iterations" is via function arguments or up-level lexical variables. So recall your uses of Icon generators instead: like Python, Icon does have loops, and two-level scoping, and I routinely build loopy Icon generators that keep state in locals. Here's a dirt-simple example I emailed to Sam earlier this week:

procedure main()
    every result := fib(0, 1) \ 10 do
        write(result)
end

procedure fib(i, j)
    local temp
    repeat {
        suspend i
        temp := i + j
        i := j
        j := temp
    }
end

which prints

0
1
1
2
3
5
8
13
21
34

If Icon restored the locals (i, j, temp) upon each fib resumption, it would generate a zero followed by an infinite sequence of ones(!).

Think of a continuation as a *paused* computation (which it is) rather than an *independent* one (which it isn't), and I think it gets darned hard to argue.

theory-and-practice-agree-here-in-my-experience-ly y'rs - tim

From MHammond@skippinet.com.au Fri May 21 00:01:22 1999
From: MHammond@skippinet.com.au (Mark Hammond)
Date: Fri, 21 May 1999 09:01:22 +1000
Subject: [Python-Dev] Interactive Debugging of Python
In-Reply-To: <374417D8.8DBCB617@appliedbiometrics.com>
Message-ID: <00c001bea314$aefc5b40$0801a8c0@bobcat>

> I'm writing a prototype of a stackless Python, which means that
> you will be able to access the current state of the interpreter
> completely.
> The inner interpreter loop will be isolated from the frame
> dispatcher. It will break whenever the ticker goes zero.
> If you set the ticker to one, you will be able to single
> step on every opcode, have the value stack, the frame chain,
> everything.

I think the main point is how to change code when a Python frame already references it. I don't think the structure of the frames is as important as
But while we were talking frame-fiddling it seemed a good point to try and hijack it a little :-) Would it be possible to recompile just a block of code (eg, just the current function or method) and patch it back in such a way that the current frame continues execution of the new code? I feel this is somewhat related to the inability to change class implementation for an existing instance. I know there have been hacks around this before but they arent completly reliable and IMO it would be nice if the core Python made it easier to change already running code - whether that code is in an existing stack frame, or just in an already created instance, it is very difficult to do. This has come to try and deflect some conversation away from changing Python as such towards an attempt at enhancing its _environment_. To paraphrase many people before me, even if we completely froze the language now there would still plenty of work ahead of us :-) Mark. From guido@CNRI.Reston.VA.US Fri May 21 01:06:51 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 20 May 1999 20:06:51 -0400 Subject: [Python-Dev] Interactive Debugging of Python In-Reply-To: Your message of "Fri, 21 May 1999 09:01:22 +1000." <00c001bea314$aefc5b40$0801a8c0@bobcat> References: <00c001bea314$aefc5b40$0801a8c0@bobcat> Message-ID: <199905210006.UAA07900@eric.cnri.reston.va.us> > I think the main point is how to change code when a Python frame already > references it. I dont think the structure of the frames is as important as > the general concept. But while we were talking frame-fiddling it seemed a > good point to try and hijack it a little :-) > > Would it be possible to recompile just a block of code (eg, just the > current function or method) and patch it back in such a way that the > current frame continues execution of the new code? 
This topic sounds mostly unrelated to the stackless discussion -- in either case you need to be able to fiddle the contents of the frame and the bytecode pointer to reflect the changed function. Some issues:

- The slots containing local variables may be renumbered after recompilation; fortunately we know the name--number mapping so we can move them to their new location. But it is still tricky.

- Should you be able to edit functions that are present on the call stack below the top? Suppose we have two functions:

def f():
    return 1 + g()

def g():
    return 0

Suppose we set a break in g(), and then edit the source of f(). We can do all sorts of evil to f(): e.g. we could change it to

    return g() + 2

which affects the contents of the value stack when g() returns (originally, the value stack contained the value 1, now it is empty). Or we could even change f() to

    return 3

thereby eliminating the call to g() altogether!

What kind of limitations do other systems that support modifying a "live" program being debugged impose? Only allowing modification of the function at the top of the stack might eliminate some problems, although there are still ways to mess up. The value stack is not always empty even when we only stop at statement boundaries -- e.g. it contains 'for' loop indices, and there's also the 'block' stack, which contains try-except information. E.g. what should happen if we change

def f():
    for i in range(10):
        print 1

stopped at the 'print 1' into

def f():
    print 1

??? (Ditto for removing or adding a try/except block.)

> I feel this is somewhat related to the inability to change class
> implementation for an existing instance. I know there have been hacks
> around this before but they aren't completely reliable and IMO it would be
> nice if the core Python made it easier to change already running code -
> whether that code is in an existing stack frame, or just in an already
> created instance, it is very difficult to do.

I've been thinking a bit about this.
Function objects now have mutable func_code attributes (and also func_defaults), I think we can use this. The hard part is to do the analysis needed to decide which functions to recompile! Ideally, we would simply edit a file and tell the programming environment "recompile this". The programming environment would compare the changed file with the old version that it had saved for this purpose, and notice (for example) that we changed two methods of class C. It would then recompile those methods only and stuff the new code objects in the corresponding function objects.

But what would it do when we changed a global variable? Say a module originally contains a statement "x = 0". Now we change the source code to say "x = 100". Should we change the variable x? Suppose that x is modified by some of the computations in the module, and that, after some computations, the actual value of x was 50. Should the "recompile" reset x to 100 or leave it alone?

One option would be to actually change the semantics of the class and def statements so that they modify an existing class or function rather than using assignment. Effectively, this proposal would change the semantics of

class A:
    ...some code...

class A:
    ...some more code...

to be the same as

class A:
    ...some code...
    ...some more code...

This is somewhat similar to the way the module or package commands in some other dynamic languages work, I think; and I don't think this would break too much existing code.

The proposal would also change

def f():
    ...some code...

def f():
    ...other code...

but here the equivalence is not so easy to express, since I want different semantics (I don't want the second f's code to be tacked onto the end of the first f's code). If we understand that

def f():
    ...

really does the following:

f = NewFunctionObject()
f.func_code = ...code object...

then the construct above (def f():... def f(): ...) would do this:

f = NewFunctionObject()
f.func_code = ...some code...
f.func_code = ...other code...
i.e. there is no assignment of a new function object for the second
def. (Of course, if there is a variable f but it is not a function, it
would have to be assigned a new function object first.)

But in the case of def, this *does* break existing code. E.g.

    # module A
    from B import f
    .
    .
    .
    if ...some test...:
        def f():
            ...some code...

This idiom conditionally redefines a function that was also imported
from some other module. The proposed new semantics would change B.f in
place! So perhaps these new semantics should only be invoked when a
special "reload-compile" is asked for... Or perhaps the programming
environment could do this through source parsing, as I proposed
before...

> This has been an attempt to deflect some conversation away from
> changing Python as such towards enhancing its _environment_. To
> paraphrase many people before me, even if we completely froze the
> language now there would still be plenty of work ahead of us :-)

Please, no more posts about Scheme. Each new post mentioning call/cc
makes it *less* likely that something like that will ever be part of
Python. "What if Guido's brain exploded?" :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From skip@mojam.com (Skip Montanaro) Fri May 21 02:13:28 1999
From: skip@mojam.com (Skip Montanaro)
Date: Thu, 20 May 1999 21:13:28 -0400 (EDT)
Subject: [Python-Dev] Interactive Debugging of Python
In-Reply-To: <199905210006.UAA07900@eric.cnri.reston.va.us>
References: <00c001bea314$aefc5b40$0801a8c0@bobcat>
	<199905210006.UAA07900@eric.cnri.reston.va.us>
Message-ID: <14148.45321.204380.19130@cm-29-94-2.nycap.rr.com>

Guido> What kind of limitations do other systems that support modifying
Guido> a "live" program being debugged impose? Only allowing
Guido> modification of the function at the top of the stack might
Guido> eliminate some problems, although there are still ways to mess
Guido> up.
Frame objects maintain pointers to the active code objects, locals and
globals, so modifying a function object's code or globals shouldn't
have any effect on currently executing frames, right? I assume frame
objects do the usual INCREF/DECREF dance, so the old code object won't
get deleted before the frame object is tossed.

Guido> But what would it do when we changed a global variable? Say a
Guido> module originally contains a statement "x = 0". Now we change
Guido> the source code to say "x = 100". Should we change the variable
Guido> x? Suppose that x is modified by some of the computations in
Guido> the module, and that, after some computations, the actual value
Guido> of x was 50. Should the "recompile" reset x to 100 or leave it
Guido> alone?

I think you should note the change for users and give them some way to
easily pick between the old initial value, the new initial value, or
the current value.

Guido> Please, no more posts about Scheme. Each new post mentioning
Guido> call/cc makes it *less* likely that something like that will
Guido> ever be part of Python. "What if Guido's brain exploded?" :-)

I agree. I see call/cc or set! and my eyes just glaze over...

Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/
skip@mojam.com | Musi-Cal: http://www.musi-cal.com/
518-372-5583

From MHammond@skippinet.com.au Fri May 21 02:42:14 1999
From: MHammond@skippinet.com.au (Mark Hammond)
Date: Fri, 21 May 1999 11:42:14 +1000
Subject: [Python-Dev] Interactive Debugging of Python
In-Reply-To: <199905210006.UAA07900@eric.cnri.reston.va.us>
Message-ID: <00c501bea32b$277ce3d0$0801a8c0@bobcat>

[Guido writes...]
> This topic sounds mostly unrelated to the stackless discussion -- in

Sure is - I just saw that as an excuse to try and hijack it

> Some issues:
>
> - The slots containing local variables may be renumbered after

Generally, I think we could make something very useful even with a
number of limitations.
For example, I would find a first cut completely acceptable and a great
improvement on today if:

* Only the function at the top of the stack can be recompiled and have
the code reflected while executing. This function also must be
restarted after such an edit. If the function uses global variables or
makes calls that restarting will screw up, then either a) make the code
changes _before_ doing this stuff, or b) live with it for now, and help
us remove the limitation :-)

That may make the locals being renumbered easier to deal with, and also
remove some of the problems you discussed about editing functions below
the top.

> What kind of limitations do other systems that support modifying a
> "live" program being debugged impose? Only allowing modification of

I can only speak for VC, and from experience at that - I haven't
attempted to find documentation on it.

It accepts most changes while running. The current line is fine. If
you create or change the definition of globals (and possibly even the
type of locals?), the "incremental compilation" fails, and you are
given the option of continuing with the old code, or stopping the
process and doing a full build. When the debug session terminates,
some link process (and maybe even compilation?) is done to bring the
.exe on disk up to date with the changes.

If you do weird stuff like delete the line being executed, it usually
gives you some warning message before either restarting the function or
trying to pick a line somewhere near the line you deleted. Either way,
it can screw up, moving the "current" line somewhere else - it doesn't
crash the debugger, but may not do exactly what you expected. It is
still a _huge_ win, and a great feature!

Ironically, I turn this feature _off_ for Python extensions. Although
changing the C code is great, in 99% of the cases I also need to change
some .py code, and as existing instances are affected I need to restart
the app anyway - so I may as well do a normal build at that time.
i.e., C now lets me debug incrementally, but a far more dynamic
language prevents this feature being useful ;-)

> the function at the top of the stack might eliminate some problems,
> although there are still ways to mess up. The value stack is not
> always empty even when we only stop at statement boundaries

If we forced a restart, would this be better? Can we reliably reset
the stack to the start of the current function?

> I've been thinking a bit about this. Function objects now have
> mutable func_code attributes (and also func_defaults), I think we can
> use this.
>
> The hard part is to do the analysis needed to decide which functions
> to recompile! Ideally, we would simply edit a file and tell the
> programming environment "recompile this". The programming environment
> would compare the changed file with the old version that it had saved
> for this purpose, and notice (for example) that we changed two methods
> of class C. It would then recompile those methods only and stuff the
> new code objects in the corresponding function objects.

If this would work for the few changed functions/methods, what would
the impact be of doing it for _every_ function (changed or not)? Then
the analysis can drop to the module level, which is much easier. I
don't think a slight performance hit is a problem at all when doing
this stuff.

> One option would be to actually change the semantics of the class and
> def statements so that they modify an existing class or function
> rather than using assignment. Effectively, this proposal would change
> the semantics of
>
>     class A:
>         ...some code...
>
>     class A:
>         ...some more code...
>
> to be the same as
>
>     class A:
>         ...more code...
>         ...some more code...

Or extending this (didn't this come up at the latest IPC?):

    # .\package\__init__.py
    class BigMutha:
        pass

    # .\package\something.py
    class package.BigMutha:
        def some_category_of_methods():
            ...

    # .\package\other.py
    class package.BigMutha:
        def other_category_of_methods():
            ...
[Of course, this won't fly as it stands; it's just a conceptual
possibility.]

> So perhaps these new semantics should only be invoked when a special
> "reload-compile" is asked for... Or perhaps the programming
> environment could do this through source parsing as I proposed
> before...

From your interesting summary, I believe this would be the best
approach to get started with. This way we limit any strange new
semantics to what are clearly debugging-related features. It also
means the debug-specific features could attempt more hacks than the
"real" environment would ever attempt.

Of course, this isn't to suggest these new semantics aren't worth
exploring (even if just for the possibilities of splitting class
definitions as my code attempts to show), but IMO this should be
separate from these debugging features.

> Python. "What if Guido's brain exploded?" :-)

At least on that particular topic I didn't even consider I was the only
one in fear of that! But it is good to know that you specifically are
too :-)

Mark.

From guido@CNRI.Reston.VA.US Fri May 21 04:02:49 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 20 May 1999 23:02:49 -0400
Subject: [Python-Dev] Interactive Debugging of Python
In-Reply-To: Your message of "Fri, 21 May 1999 11:42:14 +1000."
	<00c501bea32b$277ce3d0$0801a8c0@bobcat>
References: <00c501bea32b$277ce3d0$0801a8c0@bobcat>
Message-ID: <199905210302.XAA08129@eric.cnri.reston.va.us>

> Generally, I think we could make something very useful even with a
> number of limitations. For example, I would find a first cut
> completely acceptable and a great improvement on today if:
>
> * Only the function at the top of the stack can be recompiled and
> have the code reflected while executing. This function also must be
> restarted after such an edit.
> If the function uses global variables or makes calls that restarting
> will screw up, then either a) make the code changes _before_ doing
> this stuff, or b) live with it for now, and help us remove the
> limitation :-)

OK, restarting the function seems a reasonable compromise and would
seem relatively easy to implement. Not *real* easy though: it turns
out that eval_code2() is called with a code object as argument, and
it's not entirely trivial to figure out the corresponding function
object from which to grab the new code object. But it could be done --
give it a try. (Don't wait for me, I'm ducking for cover until at
least mid June.)

> Ironically, I turn this feature _off_ for Python extensions. Although
> changing the C code is great, in 99% of the cases I also need to
> change some .py code, and as existing instances are affected I need
> to restart the app anyway - so I may as well do a normal build at
> that time. i.e., C now lets me debug incrementally, but a far more
> dynamic language prevents this feature being useful ;-)

I hear you.

> If we forced a restart, would this be better? Can we reliably reset
> the stack to the start of the current function?

Yes, no problem.

> If this would work for the few changed functions/methods, what would
> the impact be of doing it for _every_ function (changed or not)?
> Then the analysis can drop to the module level, which is much easier.
> I don't think a slight performance hit is a problem at all when doing
> this stuff.

Yes, this would be fine too.

> > "What if Guido's brain exploded?" :-)
>
> At least on that particular topic I didn't even consider I was the
> only one in fear of that! But it is good to know that you
> specifically are too :-)

Have no fear. I've learned to say no.
:-) --Guido van Rossum (home page: http://www.python.org/~guido/) From tim_one@email.msn.com Fri May 21 06:36:44 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 21 May 1999 01:36:44 -0400 Subject: [Python-Dev] Interactive Debugging of Python In-Reply-To: <199905210006.UAA07900@eric.cnri.reston.va.us> Message-ID: <000401bea34b$e93fcda0$d89e2299@tim> [GvR] > ... > What kind of limitations do other systems that support modifying a > "live" program being debugged impose? As an ex-compiler guy, I should have something wise to say about that. Alas, I've never used a system that allowed more than poking new values into vrbls, and the thought of any more than that makes me vaguely ill! Oh, that's right -- I'm vaguely ill anyway today. Still-- oooooh -- the problems. This later got reduced to restarting the topmost function from scratch. That has some attraction, especially on the bang-for-buck-o-meter. > ... > Please, no more posts about Scheme. Each new post mentioning call/cc > makes it *less* likely that something like that will ever be part of > Python. "What if Guido's brain exploded?" :-) What a pussy . Really, overall continuations are much less trouble to understand than threads -- there's only one function in the entire interface! OK. So how do you feel about coroutines? Would sure be nice to have *some* way to get pseudo-parallel semantics regardless of OS. changing-code-on-the-fly-==-mutating-the-current-continuation-ly y'rs - tim From tismer@appliedbiometrics.com Fri May 21 08:12:05 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Fri, 21 May 1999 09:12:05 +0200 Subject: [Python-Dev] Interactive Debugging of Python References: <00c001bea314$aefc5b40$0801a8c0@bobcat> Message-ID: <37450745.21D63A5@appliedbiometrics.com> Mark Hammond wrote: > > > I'm writing a prototype of a stackless Python, which means that > > you will be able to access the current state of the interpreter > > completely. 
> > The inner interpreter loop will be isolated from the frame
> > dispatcher. It will break whenever the ticker goes zero.
> > If you set the ticker to one, you will be able to single
> > step on every opcode, have the value stack, the frame chain,
> > everything.
>
> I think the main point is how to change code when a Python frame
> already references it. I don't think the structure of the frames is
> as important as the general concept. But while we were talking
> frame-fiddling it seemed a good point to try and hijack it a little
> :-)
>
> Would it be possible to recompile just a block of code (eg, just the
> current function or method) and patch it back in such a way that the
> current frame continues execution of the new code?

Sure. Since the frame holds a pointer to the code, and the current IP
and SP, your code can easily change it (with care, or GPF :-). It
could even create a fresh code object and let it run only for the
running instance. By instance, I mean a frame which is running a code
object.

> I feel this is somewhat related to the inability to change class
> implementation for an existing instance. I know there have been hacks
> around this before but they aren't completely reliable and IMO it
> would be nice if the core Python made it easier to change already
> running code - whether that code is in an existing stack frame, or
> just in an already created instance, it is very difficult to do.

I think this has been difficult only because information was hiding in
the inner interpreter loop. Gonna change now.

ciao - chris

--
Christian Tismer :^) Applied Biometrics GmbH : Have a break!
Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home

From tismer@appliedbiometrics.com Fri May 21 08:21:22 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Fri, 21 May 1999 09:21:22 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000901bea26a$34526240$179e2299@tim>
	<37443DC5.1330EAC6@appliedbiometrics.com>
	<14148.21750.738559.424456@bitdiddle.cnri.reston.va.us>
Message-ID: <37450972.D19E160@appliedbiometrics.com>

Jeremy Hylton wrote:
>
> >>>>> "CT" == Christian Tismer writes:
>
> CT> What I want to achieve is that I can run this again, from my
> CT> snapshot. But with shared locals, my parameter x of the snapshot
> CT> would have changed to x+1, which I don't find useful. I want to
> CT> fix a state of the current frame and still think it should "own"
> CT> its locals. Globals are borrowed, anyway. Class instances will
> CT> anyway do what you want, since the local "self" is a mutable
> CT> object.
>
> CT> How do you want to keep computations independent when locals are
> CT> shared? For me it's just easier to implement and also to think
> CT> with the shallow copy. Otherwise, where is my private place?
> CT> Open for becoming convinced, of course :-)
>
> I think you're making things a lot more complicated by trying to
> instantiate new variable bindings for locals every time you create a
> continuation. Can you give an example of why that would be helpful?

I'm not sure whether you all understand me, and vice versa. There is
no copying at all, but for the frame. I copy the frame, which means I
also incref all the objects which it holds. Done. This is the bare
minimum which I must do.

> (Ok. I'm not sure I can offer a good example of why it would be
> helpful to share them, but it makes intuitive sense to me.)
> > The call_cc mechanism is going to let you capture the current > continuation, save it somewhere, and call on it again as often as you > like. Would you get a fresh locals each time you used it? or just > the first time? If only the first time, it doesn't seem that you've > gained a whole lot. call_cc does a copy of the state which is the frame. This is stored away until it is revived. Nothing else happens. As Guido pointed out, virtually the whole frame chain is duplicated, but only on demand. > Also, all the locals that are references to mutable objects are > already effectively shared. So it's only a few oddballs like ints > that are an issue. Simply look at a frame, what it is. What do you need to do to run it again with a given state. You have to preserve the stack variables. And you have to preserve the current locals, since some of them might even have a copy on the stack, and we want to stay consistent. I believe it would become obvious if you tried to implement it. Maybe I should close my ears and get something ready to show? ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From tismer@appliedbiometrics.com Fri May 21 10:00:26 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Fri, 21 May 1999 11:00:26 +0200 Subject: [Python-Dev] 'stackless' python? References: <000701bea30c$af7a1060$9d9e2299@tim> Message-ID: <374520AA.2ADEA687@appliedbiometrics.com> Tim Peters wrote: > > [Christian] > [... clarified stuff ... thanks! ... much clearer ...] But still not clear enough, I fear. > > ... > > If a frame is the current state, I make it two frames to have two > > current states. One will be saved, the other will be run. This is > > what I call "splitting". 
> > Actually, splitting must occur whenever a frame can be reached
> > twice, in order to keep elements alive.

> That part doesn't compute: if a frame can be reached by more than one
> path, its refcount must be at least equal to the number of its
> immediate predecessors, and its refcount won't fall to 0 before it
> becomes unreachable. So while you may need to split stuff for *some*
> reasons, I can't see how keeping elements alive could be one of those
> reasons (unless you're zapping frame contents *before* the frame
> itself is garbage?).

I was speaking under the side condition that I don't want to change
frames as they are now. Maybe that's misconceived, but this is what I
did: if a frame as we have it today shall be resumed twice, then it
has to be copied, since the stack is in it and has some state which
will change after resuming.

That was the whole problem with my first prototype, which was done
hoping that I didn't need to change the interpreter at all. Wrong,
bad, however. What I actually did was more than seems to be needed: I
made a copy of the whole current frame chain. Later on, Guido said
this can be done on demand. He's right.

[Scheme sample - understood]

> GETLOCAL and SETLOCAL simply index off of the fastlocals pointer; it
> doesn't care where that points *to* (the locals could live in another
> frame, and ceval2 wouldn't know the difference). Maybe a frame
> entered due to continuation needs extra setup work? Scheme saves
> itself by putting name-resolution and continuation info into
> different structures; to mimic the semantics, Python would need to
> get the same end effect.

Point taken. The pointer doesn't save time of access, it just saves
allocating another structure. So we can use something else without
speed loss.

[have to cut a little]

> So recall your uses of Icon generators instead: like Python, Icon
> does have loops, and two-level scoping, and I routinely build loopy
> Icon generators that keep state in locals.
> Here's a dirt-simple example I emailed to Sam earlier this week:
>
>     procedure main()
>         every result := fib(0, 1) \ 10 do
>             write(result)
>     end
>
>     procedure fib(i, j)
>         local temp
>         repeat {
>             suspend i
>             temp := i + j
>             i := j
>             j := temp
>         }
>     end

[prints fib series]

> If Icon restored the locals (i, j, temp) upon each fib resumption, it
> would generate a zero followed by an infinite sequence of ones(!).

Now I'm completely missing the point. Why should I want to restore
anything? At a suspend (which, with continuations, is done by
temporarily having two identical states), one is saved and another is
continued. The continued one in your example just returns the current
value and immediately forgets about the locals. The other one is
continued later, and of course with the same locals which were active
when it went to sleep.

> Think of a continuation as a *paused* computation (which it is)
> rather than an *independent* one (which it isn't), and I think it
> gets darned hard to argue.

No, you get me wrong. I understand what you mean. It is just the
decision whether a frame, which will be reactivated later as a
continuation, should use a reference to locals like the reference
which it has for the globals. This causes me a major frame redesign.

Current design:

    A frame is: back chain, state, code, unpacked locals, globals,
    stack. Code and globals are shared. State, unpacked locals and
    stack are private.

Possible new design:

    A frame is: back chain, state, code, variables, globals, stack.
    variables is: unpacked locals.

This makes the variables into an extra structure which is shared.
Probably a list would be the thing, or abusing a tuple as a mutable
object. Hmm. I think I should get something ready, and we should keep
this thread short, or we will lose the rest of Guido's goodwill (if
not already).

ciao - chris

--
Christian Tismer :^) Applied Biometrics GmbH : Have a break!
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From da@ski.org Fri May 21 17:27:42 1999 From: da@ski.org (David Ascher) Date: Fri, 21 May 1999 09:27:42 -0700 (Pacific Daylight Time) Subject: [Python-Dev] Interactive Debugging of Python In-Reply-To: <000401bea34b$e93fcda0$d89e2299@tim> Message-ID: On Fri, 21 May 1999, Tim Peters wrote: > OK. So how do you feel about coroutines? Would sure be nice to have *some* > way to get pseudo-parallel semantics regardless of OS. I read about coroutines years ago on c.l.py, but I admit I forgot it all. Can you explain them briefly in pseudo-python? --david From tim_one@email.msn.com Sat May 22 05:22:50 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sat, 22 May 1999 00:22:50 -0400 Subject: [Python-Dev] Coroutines In-Reply-To: Message-ID: <000401bea40a$c1d2d2c0$659e2299@tim> [Tim] > OK. So how do you feel about coroutines? Would sure be nice > to have *some* way to get pseudo-parallel semantics regardless of OS. [David Ascher] > I read about coroutines years ago on c.l.py, but I admit I forgot it all. > Can you explain them briefly in pseudo-python? How about real Python? http://www.python.org/tim_one/000169.html contains a complete coroutine implementation using threads under the covers (& exactly 5 years old tomorrow ). If I were to do it over again, I'd use a different object interface (making coroutines objects in their own right instead of funneling everything through a "coroutine controller" object), but the ideas are the same in every coroutine language. The post contains several executable examples, from simple to "literature standard". 
I had forgotten all about this: it contains solutions to the same "compare tree fringes" problem Sam mentioned, *and* the generator-based building block I posted three other solutions for in this thread. That last looks like: # fringe visits a nested list in inorder, and detaches for each non-list # element; raises EarlyExit after the list is exhausted def fringe( co, list ): for x in list: if type(x) is type([]): fringe(co, x) else: co.detach(x) def printinorder( list ): co = Coroutine() f = co.create(fringe, co, list) try: while 1: print co.tran(f), except EarlyExit: pass print printinorder([1,2,3]) # 1 2 3 printinorder([[[[1,[2]]],3]]) # ditto x = [0, 1, [2, [3]], [4,5], [[[6]]] ] printinorder(x) # 0 1 2 3 4 5 6 Generators are really "half a coroutine", so this doesn't show the full power (other examples in the post do). co.detach is a special way to deal with this asymmetry. In the general case you use co.tran all the time, where (see the post for more info) v = co.tran(c [, w]) means "resume coroutine c from the place it last did a co.tran, optionally passing it the value w, and when somebody does a co.tran back to *me*, resume me right here, binding v to the value *they* pass to co.tran ). Knuth complains several times that it's very hard to come up with a coroutine example that's both simple and clear <0.5 wink>. In a nutshell, coroutines don't have a "caller/callee" relationship, they have "we're all equal partners" relationship, where any coroutine is free to resume any other one where it left off. It's no coincidence that making coroutines easy to use was pioneered by simulation languages! Just try simulating a marriage where one partner is the master and the other a slave . 
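[For comparison with the coroutine-based fringe above: the generator feature that later entered the language ("yield" in Python 2.2, "yield from" in 3.3) expresses the same "half a coroutine" idea directly. This is only a sketch in today's Python, not the API from Tim's post:]

```python
# Generator version of the fringe walk: each leaf "detaches" via yield
# instead of co.detach, and exhaustion replaces the EarlyExit protocol.
def fringe(nested):
    for x in nested:
        if isinstance(x, list):
            # recurse and re-yield every leaf of the sublist
            yield from fringe(x)
        else:
            yield x

def printinorder(nested):
    print(list(fringe(nested)))

printinorder([1, 2, 3])                          # [1, 2, 3]
printinorder([[[[1, [2]]], 3]])                  # ditto
printinorder([0, 1, [2, [3]], [4, 5], [[[6]]]])  # [0, 1, 2, 3, 4, 5, 6]
```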
i-may-be-a-bachelor-but-i-have-eyes-ly y'rs - tim From tim_one@email.msn.com Sat May 22 05:22:55 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sat, 22 May 1999 00:22:55 -0400 Subject: [Python-Dev] Re: Coroutines In-Reply-To: Message-ID: <000501bea40a$c3d1fe20$659e2299@tim> Thoughts o' the day: + Generators ("semi-coroutines") are wonderful tools and easy to implement without major changes to the PVM. Icon calls 'em generators, Sather calls 'em iterators, and they're exactly what you need to implement "for thing in object:" when object represents a collection that's tricky to materialize. Python needs something like that. OTOH, generators are pretty much limited to that. + Coroutines are more general but much harder to implement, because each coroutine needs its own stack (a generator only has one stack *frame*-- its own --to worry about), and C-calling-Python can get into the act. As Sam said, they're probably no easier to implement than call/cc (but trivial to implement given call/cc). + What may be most *natural* is to forget all that and think about a variation of Python threads implemented directly via the interpreter, without using OS threads. The PVM already knows how to handle thread-state swapping. Given Christian's stackless interpreter, and barring C->Python cases, I suspect Python can fake threads all by itself, in the sense of interleaving their executions within a single "real" (OS) thread. Given the global interpreter lock, Python effectively does only-one-at-a-time anyway. Threads are harder than generators or coroutines to learn, but A) Many more people know how to use them already. B) Generators and coroutines can be implemented using (real or fake) threads. C) Python has offered threads since the beginning. D) Threads offer a powerful mode of control transfer coroutines don't, namely "*anyone* else who can make progress now, feel encouraged to do so at my expense". 
E) For whatever reasons, in my experience people find threads much easier to learn than call/cc -- perhaps because threads are *obviously* useful upon first sight, while it takes a real Zen Experience before call/cc begins to make sense. F) Simulated threads could presumably produce much more informative error msgs (about deadlocks and such) than OS threads, so even people using real threads could find excellent debugging use for them. Sam doesn't want to use "real threads" because they're pigs; fake threads don't have to be. Perhaps x = y.SOME_ASYNC_CALL(r, s, t) could map to e.g. import config if config.USE_REAL_THREADS: import threading else: from simulated_threading import threading from config.shared import msg_queue class Y: def __init__(self, ...): self.ready = threading.Event() ... def SOME_ASYNC_CALL(self, r, s, t): result = [None] # mutable container to hold the result msg_queue.put((server_of_the_day, r, s, t, self.ready, result)) self.ready.wait() self.ready.clear() return result[0] where some other simulated thread polls the msg_queue and does ready.set() when it's done processing the msg enqueued by SOME_ASYNC_CALL. For this to scale nicely, it's probably necessary for the PVM to cooperate with the simulated_threading implementation (e.g., a simulated thread that blocks (like on self.ready.wait()) should be taken out of the collection of simulated threads the PVM may attempt to resume -- else in Sam's case the PVM would repeatedly attempt to wake up thousands of blocked threads, and things would slow to a crawl). Of course, simulated_threading could be built on top of call/cc or coroutines too. The point to making threads the core concept is keeping Guido's brain from exploding. Plus, as above, you can switch to "real threads" by changing an import statement. 
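[The "fake threads in one real thread" idea can be sketched with the generators that later entered the language: a round-robin scheduler resumes each generator-based task up to its next yield, which acts as a voluntary context switch. The names `scheduler` and `worker` are illustrative only, not part of any proposed `simulated_threading` module:]

```python
import collections

def scheduler(tasks):
    # Round-robin over generator-based "threads": each next() runs a
    # task until its next yield, then the scheduler picks another one.
    queue = collections.deque(tasks)
    trace = []
    while queue:
        task = queue.popleft()
        try:
            trace.append(next(task))   # resume the task
            queue.append(task)         # still alive: requeue it
        except StopIteration:
            pass                       # task finished: drop it
    return trace

def worker(name, steps):
    for i in range(steps):
        yield (name, i)                # yield == "let someone else run"

# Two fake threads interleave within a single real thread:
print(scheduler([worker("a", 2), worker("b", 3)]))
# [('a', 0), ('b', 0), ('a', 1), ('b', 1), ('b', 2)]
```

A real version would also need the blocking behavior described above (taking a waiting task out of the run queue), which this sketch omits.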
making-sure-the-global-lock-support-hair-stays-around-even-if-greg-
renders-it-moot-for-real-threads-ly y'rs - tim

From tismer@appliedbiometrics.com Sat May 22 17:20:30 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Sat, 22 May 1999 18:20:30 +0200
Subject: [Python-Dev] Coroutines
References: <000401bea40a$c1d2d2c0$659e2299@tim>
Message-ID: <3746D94E.239D0B8E@appliedbiometrics.com>

Tim Peters wrote:
>
> [Tim]
> > OK. So how do you feel about coroutines? Would sure be nice
> > to have *some* way to get pseudo-parallel semantics regardless of
> > OS.
>
> [David Ascher]
> > I read about coroutines years ago on c.l.py, but I admit I forgot
> > it all. Can you explain them briefly in pseudo-python?
>
> How about real Python? http://www.python.org/tim_one/000169.html
> contains a complete coroutine implementation using threads under the
> covers (& exactly 5 years old tomorrow ). If I were to do it over
> again, I'd use a different object interface (making coroutines
> objects in their own right instead of funneling everything through a
> "coroutine controller" object), but the ideas are the same in every
> coroutine language. The post contains several executable examples,
> from simple to "literature standard".

What an interesting thread! Unfortunately, all the examples are messed
up since some HTML formatter didn't take care of the Python code,
rendering it unreadable. Is there a different version available?

Also, I'd like to read the rest of the threads in
http://www.python.org/tim_one/ but it seems that only your messages are
archived?

Anyway, the citations in http://www.python.org/tim_one/000146.html
show me that you have been through all of this five years ago, with a
five-years-younger Guido who sounded a bit different than he does
today. I would have understood him better if I had known that this is
a re-iteration of a somehow dropped or entombed idea. (If someone has
the original archives from that epoch, I'd be happy to get a copy.
Actually, I'm missing everything up to the end of 1996.) A short snapshot: Stackless Python is meanwhile nearly alive, with recursion avoided in ceval. Of course, some modules are left which still need work, but enough for a prototype. Frames now contain all necessary state and are prepared for execution and thrown back to the evaluator (elevator?). The key idea was to change the deeply nested functions in a way that their last eval_code call happens to be tail-recursive. In ceval.c (and in other not yet changed places), functions do a lot of preparation, build some parameters, call eval_code and release the parameters. This was the crux, which I solved by a new field in the frame object, where such references can be stored. The routine can now return with the ready-packaged frame, instead of calling it. As a minimum facility for future co-anythings, I provided a hook function for resuming frames, which causes no overhead in the usual case but allows overriding what a frame does when someone returns control to it. Implementing this is up to some extension module; whether it be coroutines or your nice nano-threads, it's possible. threadedly yours - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From tismer@appliedbiometrics.com Sat May 22 20:04:43 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Sat, 22 May 1999 21:04:43 +0200 Subject: [Python-Dev] How stackless can Python be? Message-ID: <3746FFCB.CD506BE4@appliedbiometrics.com> Hi, to make the core interpreter stackless is one thing. Turning functions which call the interpreter from some deep nesting level into versions which return a frame object that is to be called instead is possible in many cases.
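Christian's tail-recursion trick -- return the prepared frame and let a flat loop run it, rather than calling back into the interpreter -- can be sketched in pure Python as a trampoline. The names dispatch and countdown are illustrative only, not Stackless code:

```python
def dispatch(frame):
    # The flat "frame dispatcher": keep running whatever comes back
    # until a non-callable value appears, which is the final result.
    while callable(frame):
        frame = frame()
    return frame

def countdown(n, acc=0):
    # Where a recursive call would go, return a thunk (standing in for
    # a prepared frame); the hardware stack never gets any deeper.
    if n == 0:
        return acc
    return lambda: countdown(n - 1, acc + n)
```

dispatch(countdown(100000)) runs a logical recursion 100,000 deep without touching the recursion limit, which is the whole point of handing frames back instead of calling them.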
Internals like apply are rather uncomplicated to convert. CallObjectWithKeywords is done. What I have *no* good solution for is map. Map does an iteration over evaluations and keeps state while it is running. The same applies to reduce, but reduce seems not to be used as much. Map is. I don't see at the moment whether map could be a killer for Tim's nice mini-thread idea. How must map work if, for instance, a map is done with a function which then begins to switch between threads before the map is done? Can one imagine a problem? Maybe it is no issue, but I'd really like to know whether we need a stateless map. (without replacing it by a for loop :-) ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From tim_one@email.msn.com Sat May 22 20:35:58 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sat, 22 May 1999 15:35:58 -0400 Subject: [Python-Dev] Coroutines In-Reply-To: <3746D94E.239D0B8E@appliedbiometrics.com> Message-ID: <000501bea48a$51563980$119e2299@tim> >> http://www.python.org/tim_one/000169.html [Christian] > What an interesting thread! Unfortunately, all the examples are messed > up since some HTML formatter didn't take care of the python code, > rendering it unreadable. Is there a different version available? > > Also, I'd like to read the rest of the threads in > http://www.python.org/tim_one/ but it seems that only your messages > are archived? Yes, that link is the old Pythonic Award Shrine erected in my memory -- it's all me, all the time, no mercy, no escape .
It predates the DejaNews archive, but the context can still be found in http://www.python.org/search/hypermail/python-1994q2/index.html There's a lot in that quarter about continuations & coroutines, most from Steven Majewski, who took a serious shot at implementing all this. Don't have the code in a more usable form; when my then-employer died, most of my files went with it. You can save the file as text, though! The structure of the code is intact; it's simply that your browser squashes out the spaces when displaying it. Nuke the

at the start of each code line and what remains is very close to what was originally posted. > Anyway, the citations in http://www.python.org/tim_one/000146.html > show me that you have been through all of this five years > ago, with a five years younger Guido which sounds a bit > different than today. > I had understood him better if I had known that this > is a re-iteration of a somehow dropped or entombed idea. You *used* to know that ! Thought you even got StevenM's old code from him a year or so ago. He went most of the way, up until hitting the C<->Python stack intertwingling barrier, and then dropped it. Plus Guido wrote generator.py to shut me up, which works, but is about 3x clumsier to use and runs about 50x slower than a generator should . > ... > Stackless Python is meanwhile nearly alive, with recursion > avoided in ceval. Of course, some modules are left which > still need work, but enough for a prototype. Frames contain > now all necessry state and are now prepared for execution > and thrown back to the evaluator (elevator?). > ... Excellent! Running off to a movie & dinner now, but will give a more careful reading tonight. co-dependent-ly y'rs - tim From tismer@appliedbiometrics.com Sun May 23 14:07:44 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Sun, 23 May 1999 15:07:44 +0200 Subject: [Python-Dev] How stackless can Python be? References: <3746FFCB.CD506BE4@appliedbiometrics.com> Message-ID: <3747FDA0.AD3E7095@appliedbiometrics.com> After a good sleep, I can answer this one by myself. I wrote: > to make the core interpreter stackless is one thing. ... > Internals like apply are rather uncomplicated to convert. > CallObjectWithKeywords is done. > > What I have *no* good solution for is map. > Map does an iteration over evaluations and keeps > state while it is running. The same applies to reduce, > but it seems to be not used so much. Map is. ... 
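The thread-based coroutine approach mentioned above (the 000169.html implementation "using threads under the covers", and generator.py in the same spirit) can be sketched like this: the producer runs in its own thread and "yields" by pushing values through a size-1 queue. The class and producer names are illustrative, not the original code:

```python
import queue
import threading

class ThreadGenerator:
    # A generator faked with a real OS thread -- roughly the "threads
    # under the covers" trick, and also a hint at why it runs so much
    # slower than a native generator would.
    _DONE = object()

    def __init__(self, func):
        self._q = queue.Queue(1)   # size 1: producer runs one step ahead
        threading.Thread(target=self._run, args=(func,), daemon=True).start()

    def _run(self, func):
        func(self._q.put)          # producer calls put() where it would yield
        self._q.put(self._DONE)

    def next(self):
        item = self._q.get()
        if item is self._DONE:
            raise StopIteration
        return item

def squares(put):
    for i in range(4):
        put(i * i)
```

Each next() blocks until the producer thread has pushed the next value, so control ping-pongs between the two threads exactly once per item.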
About stackless map -- and this applies to every extension module which *wants* to be stackless. We don't have to force everybody to be stackless, but there are a couple of modules which would benefit from it. The problem with map is that it needs to keep state while repeatedly calling objects which might call the interpreter. Even if we kept local variables in the caller's frame, this would still not be stateless. The info that a map is running is sitting on the hardware stack, and that's wrong. Now a solution. In my last post, I argued that I don't want to replace map by a slower Python function. But that gave me the key idea to solve this: C functions which cannot be unwound tail-recursively to return an executable frame object must instead return themselves as a frame object. That's it! Frames again need to be extended a little. They have to name their interpreter, which normally is the old eval_code loop.

Anatomy of a standard frame invocation: A new frame is created, parameters are inserted, the frame is returned to the frame dispatcher, which runs the inner eval_code loop until it bails out. On return, special cases of control flow are handled, such as exceptions, returns, and now also calls. This is an eval_code frame, since eval_code is its execution handler.

Anatomy of a map frame invocation: Map has several phases. The first phase does argument checking and basic setup. The last phase is iteration over function calls and building the result. This phase must be split off as a second function, eval_map. A new frame is created, with all temporary variables placed there. eval_map is inserted as the execution handler. Now, I think the analogy is obvious. By building proper frames, it should be possible to turn any extension function into a stackless function.

The overall protocol is: A C function which does a simple computation which cannot cause an interpreter invocation may simply evaluate and return a value.
A C function which might cause an interpreter invocation should return a freshly created frame as return value.

- This can be done in a tail-recursive fashion, if the last action of the C function would basically be calling the frame.

- If no tail-recursion is possible, the function must return a new frame for itself, with an executor for its purpose.

A good stackless candidate is Fredrik's xmlop, which calls back into the interpreter. If that worked without the hardware stack, then we could build ultra-fast XML processors with co-routines! As a side note: The frame structure which I sketched so far is still made for eval_code in the first place, but it has all the necessary flexibility for pluggable interpreters. An extension module can now create its own frame, with its own execution handler, and throw it back to the frame dispatcher. In other words: people can create extensions and test their own VMs if they want. This was not my primary intent, but it comes for free as a consequence of having a stackless map. ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From fredrik@pythonware.com Sun May 23 14:53:19 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Sun, 23 May 1999 15:53:19 +0200 Subject: [Python-Dev] Coroutines References: <000401bea40a$c1d2d2c0$659e2299@tim> <3746D94E.239D0B8E@appliedbiometrics.com> Message-ID: <031e01bea524$8db41e70$f29b12c2@pythonware.com> Christian Tismer wrote: > (If someone has the original archives from that epoche, > I'd be happy to get a copy. Actually, I'm missing all upto > end of 1996.) http://www.egroups.com/group/python-list/info.html has it all (almost), starting in 1991.
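Stepping back to Christian's eval_map anatomy above -- setup in one phase, iteration state kept in a frame, eval_map installed as the frame's execution handler -- the control flow might be sketched in Python like this. Frame, eval_map, stackless_map and dispatch are illustrative names, not the Stackless implementation:

```python
class Frame:
    # A frame names its own execution handler, so the dispatcher does
    # not care what kind of computation it is resuming.
    def __init__(self, handler, **state):
        self.handler = handler
        self.state = state

def eval_map(frame):
    # One step of map: apply func to the next item, or finish.  All
    # iteration state lives in the frame, none on the hardware stack.
    s = frame.state
    if s["i"] == len(s["seq"]):
        return s["out"]            # a non-frame value ends the dispatch
    s["out"].append(s["func"](s["seq"][s["i"]]))
    s["i"] += 1
    return frame                   # hand the frame back; it is resumable

def stackless_map(func, seq):
    # Phase 1: argument checking and setup; phase 2 runs in eval_map.
    return Frame(eval_map, func=func, seq=list(seq), i=0, out=[])

def dispatch(frame):
    while isinstance(frame, Frame):
        frame = frame.handler(frame)
    return frame
```

In the real thing each func call could itself hand back a frame to be run before map continues; the sketch only shows where the iteration state goes.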
That means that somehow Python has to know what kind of namespace to use for local environments, and not use the standard dictionary. Maybe we can simply have it use a '.clear()'ed .__copy__ of the specified environment. exec 'foo()' in globals(), mylocals would then call foo and within foo, the local env't would be mylocals.__copy__.clear(). Anyway, something for those-with-the-patches to keep in mind. --david From tismer at appliedbiometrics.com Sun May 2 15:00:37 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Sun, 02 May 1999 15:00:37 +0200 Subject: [Python-Dev] More flexible namespaces. References: Message-ID: <372C4C75.5B7CCAC8@appliedbiometrics.com> David Ascher wrote: [Marc:] > > Since you put out the objectives, I'd like to propose a little > > different approach... > > > > 1. Have eval/exec accept any mapping object as input > > > > 2. Make those two copy the content of the mapping object into real > > dictionaries > > > > 3. Provide a hook into the dictionary implementation that can be > > used to redirect KeyErrors and use that redirection to forward > > the request to the original mapping objects I don't think that this proposal would give so much new value. Since a mapping can also be implemented in arbitrary ways, say by functions, a mapping is not necessarily finite and might not be changeable into a dict. [David:] > Interesting counterproposal. I'm not sure whether any of the proposals on > the table really do what's needed for e.g. case-insensitive namespace > handling. I can see how all of the proposals so far allow > case-insensitive reference name handling in the global namespace, but > don't we also need to hook into the local-namespace creation process to > allow case-insensitivity to work throughout? Case-independent namespaces seem to be a minor point, nice to have for interfacing to other products, but then, in a function, I see no benefit in changing the semantics of function locals?
The lookup of foreign symbols would always be through a mapping object. If you take COM for instance, your access to a COM wrapper for an arbitrary object would be through properties of this object. After assignment to a local function variable, why should we support case-insensitivity at all? I would think mapping objects would be a great simplification of lazy imports in COM, where we would like to avoid importing really huge namespaces in one big slurp. Also the wrapper code could be made quite a lot easier and faster without so much getattr/setattr trapping. Btw, does anybody really want to see case-insensitivity in Python programs? I'm quite happy with it as it is, and I would even force the user to always use the same case style after he has touched an external property once. Example for Excel: You may write "xl.workbooks" in lowercase, but then you have to stay with it. This would keep Python source clean for, say, PyLint. my 0.02 Euro - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From MHammond at skippinet.com.au Sun May 2 01:28:11 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Sun, 2 May 1999 09:28:11 +1000 Subject: [Python-Dev] More flexible namespaces. In-Reply-To: Message-ID: <00bc01be942a$47d94070$0801a8c0@bobcat> > I'm not sure whether any of the > proposals on > the table really do what's needed for e.g. case-insensitive namespace > handling. I can see how all of the proposals so far allow > case-insensitive reference name handling in the global namespace, but > don't we also need to hook into the local-namespace creation > process to > allow case-insensitivity to work throughout? Why not?
I pictured case-insensitive namespaces working so that they retain the case of the first assignment, but all lookups would be case-insensitive. Ohh - right! Python itself would need changing to support this. I suppose that faced with code such as:

    def func():
        if spam:
            Spam=1

Python would generate code that refers to "spam" as a local, and "Spam" as a global. Is this why you feel it won't work? Mark. From mal at lemburg.com Sun May 2 21:24:54 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Sun, 02 May 1999 21:24:54 +0200 Subject: [Python-Dev] More flexible namespaces. References: <372C4C75.5B7CCAC8@appliedbiometrics.com> Message-ID: <372CA686.215D71DF@lemburg.com> Christian Tismer wrote: > > David Ascher wrote: > [Marc:> > > > Since you put out the objectives, I'd like to propose a little > > > different approach... > > > > > > 1. Have eval/exec accept any mapping object as input > > > > > > 2. Make those two copy the content of the mapping object into real > > > dictionaries > > > > > > 3. Provide a hook into the dictionary implementation that can be > > > used to redirect KeyErrors and use that redirection to forward > > > the request to the original mapping objects > > I don't think that this proposal would give so much new > value. Since a mapping can also be implemented in arbitrary > ways, say by functions, a mapping is not necessarily finite > and might not be changeable into a dict. [Disclaimer: I'm not really keen on having the possibility of letting code execute in arbitrary namespace objects... it would make code optimizations even less manageable.] You can easily support infinite mappings by wrapping the function into an object which returns an empty list for .items() and then use the hook mentioned in 3 to redirect the lookup to that function. The proposal allows one to use such a proxy to simulate any kind of mapping -- it works much like the __getattr__ hook provided for instances. > [David:> > > Interesting counterproposal.
> I'm not sure whether any of the proposals on > the table really do what's needed for e.g. case-insensitive namespace > handling. I can see how all of the proposals so far allow > case-insensitive reference name handling in the global namespace, but > don't we also need to hook into the local-namespace creation process to > allow case-insensitivity to work throughout? > > Case-independant namespaces seem to be a minor point, > nice to have for interfacing to other products, but then, > in a function, I see no benefit in changing the semantics > of function locals? The lookup of foreign symbols would > always be through a mapping object. If you take COM for > instance, your access to a COM wrapper for an arbitrary > object would be through properties of this object. After > assignment to a local function variable, why should we > support case-insensitivity at all? > > I would think mapping objects would be a great > simplification of lazy imports in COM, where > we would like to avoid to import really huge > namespaces in one big slurp. Also the wrapper code > could be made quite a lot easier and faster without > so much getattr/setattr trapping. What do lazy imports have to do with case [in]sensitive namespaces? Anyway, how about a simple lazy import mechanism in the standard distribution, i.e. why not make all imports lazy? Since modules are first-class objects this should be easy to implement... > Does btw. anybody really want to see case-insensitivity > in Python programs? I'm quite happy with it as it is, > and I would even force the use to always use the same > case style after he has touched an external property > once. Example for Excel: You may write "xl.workbooks" > in lowercase, but then you have to stay with it. > This would keep Python source clean for, say, PyLint. "No" and "me too" ;-) -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 243 days left Business: http://www.lemburg.com/ Python Pages: http://starship.python.net/crew/lemburg/ From MHammond at skippinet.com.au Mon May 3 02:52:41 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Mon, 3 May 1999 10:52:41 +1000 Subject: [Python-Dev] More flexible namespaces. In-Reply-To: <372CA686.215D71DF@lemburg.com> Message-ID: <000e01be94ff$4047ef20$0801a8c0@bobcat> [Marc] > [Disclaimer: I'm not really keen on having the possibility of > letting code execute in arbitrary namespace objects... it would > make code optimizations even less manageable.] Good point - although surely that would simply mean (certain) optimisations can't be performed for code executing in that environment? How to detect this at "optimization time" may be a little difficult :-) However, this is the primary purpose of this thread - to work out _if_ it is a good idea, as much as working out _how_ to do it :-) > The proposal allows one to use such a proxy to simulate any > kind of mapping -- it works much like the __getattr__ hook > provided for instances. My only problem with Marc's proposal is that there already _is_ an established mapping protocol, and this doesn't use it; instead it invents a new one with the benefit being potentially less code breakage. And without attempting to sound flippant, I wonder how many extension modules will be affected? Module init code certainly assumes the module __dict__ is a dictionary, but none of my code assumes anything about other namespaces. Marc's extensions may be a special case, as AFAIK they inject objects into other dictionaries (ie, new builtins?).
> > Case-independant namespaces seem to be a minor point, > > nice to have for interfacing to other products, but then, > > in a function, I see no benefit in changing the semantics > > of function locals? The lookup of foreign symbols would I disagree here. Consider Alice, and similar projects, where a (arguably misplaced, but nonetheless) requirement is that the embedded language be case-insensitive. Period. The Alice people are somewhat special in that they had the resources to change the interpreters guts. Most people wont, and will look for a different language to embedd. Of course, I agree with you for the specific cases you are talking - COM, Active Scripting etc. Indeed, everything I would use this for would prefer to keep the local function semantics identical. > > Does btw. anybody really want to see case-insensitivity > > in Python programs? I'm quite happy with it as it is, > > and I would even force the use to always use the same > > case style after he has touched an external property > > once. Example for Excel: You may write "xl.workbooks" > > in lowercase, but then you have to stay with it. > > This would keep Python source clean for, say, PyLint. > > "No" and "me too" ;-) I think we are missing the point a little. If we focus on COM, we may come up with a different answer. Indeed, if we are to focus on COM integration with Python, there are other areas I would prefer to start with :-) IMO, we should attempt to come up with a more flexible namespace mechanism that is in the style of Python, and will not noticeably slowdown Python. Then COM etc can take advantage of it - much in the same way that Python's existing namespace model existed pre-COM, and COM had to take advantage of what it could! Of course, a key indicator of the likely success is how well COM _can_ take advantage of it, and how much Alice could have taken advantage of it - I cant think of any other yardsticks? Mark. 
From mal at lemburg.com Mon May 3 09:56:53 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Mon, 03 May 1999 09:56:53 +0200 Subject: [Python-Dev] More flexible namespaces. References: <000e01be94ff$4047ef20$0801a8c0@bobcat> Message-ID: <372D56C5.4738DE3D@lemburg.com> Mark Hammond wrote: > > [Marc] > > [Disclaimer: I'm not really keen on having the possibility of > > letting code execute in arbitrary namespace objects... it would > > make code optimizations even less manageable.] > > Good point - although surely that would simply mean (certain) optimisations > can't be performed for code executing in that environment? How to detect > this at "optimization time" may be a little difficult :-) > > However, this is the primary purpose of this thread - to workout _if_ it is > a good idea, as much as working out _how_ to do it :-) > > > The proposal allows one to use such a proxy to simulate any > > kind of mapping -- it works much like the __getattr__ hook > > provided for instances. > > My only problem with Marc's proposal is that there already _is_ an > established mapping protocol, and this doesnt use it; instead it invents a > new one with the benefit being potentially less code breakage. ...and that's the key point: you get the intended features and the core code will not have to be changed in significant ways. Basically, I think these kind of core extensions should be done in generic ways, e.g. by letting the eval/exec machinery accept subclasses of dictionaries, rather than trying to raise the abstraction level used and slowing things down in general just to be able to use the feature on very few occasions. > And without attempting to sound flippant, I wonder how many extension > modules will be affected? Module init code certainly assumes the module > __dict__ is a dictionary, but none of my code assumes anything about other > namespaces. Marc's extensions may be a special case, as AFAIK they inject > objects into other dictionaries (ie, new builtins?). 
> Again, not trying to > downplay this too much, but if it is only a problem for Marc's more > esoteric extensions, I dont feel that should hold up an otherwise solid > proposal. My mxTools extension does the assignment in Python, so it wouldn't be affected. The others only do the usual modinit() stuff. Before going any further on this thread we may have to ponder a little more on the objectives that we have. If it's only case-insensitive lookups then I guess a simple compile-time switch exchanging the implementations of string hash and compare functions would do the trick. If we're after doing wild things like lookups across networks, then a more specific approach is needed. So what is it that we want in 1.6 ? > [Chris, I think?] > > > Case-independant namespaces seem to be a minor point, > > > nice to have for interfacing to other products, but then, > > > in a function, I see no benefit in changing the semantics > > > of function locals? The lookup of foreign symbols would > > I disagree here. Consider Alice, and similar projects, where a (arguably > misplaced, but nonetheless) requirement is that the embedded language be > case-insensitive. Period. The Alice people are somewhat special in that > they had the resources to change the interpreters guts. Most people wont, > and will look for a different language to embedd. > > Of course, I agree with you for the specific cases you are talking - COM, > Active Scripting etc. Indeed, everything I would use this for would prefer > to keep the local function semantics identical. As I understand the needs in COM and AS you are talking about object attributes, right ? Making these case-insensitive is a job for a proxy or a __getattr__ hack. > > > Does btw. anybody really want to see case-insensitivity > > > in Python programs? I'm quite happy with it as it is, > > > and I would even force the use to always use the same > > > case style after he has touched an external property > > > once.
> > > Example for Excel: You may write "xl.workbooks" > > > in lowercase, but then you have to stay with it. > > > This would keep Python source clean for, say, PyLint. > > > > "No" and "me too" ;-) > > I think we are missing the point a little. If we focus on COM, we may come > up with a different answer. Indeed, if we are to focus on COM integration > with Python, there are other areas I would prefer to start with :-) > > IMO, we should attempt to come up with a more flexible namespace mechanism > that is in the style of Python, and will not noticeably slowdown Python. > Then COM etc can take advantage of it - much in the same way that Python's > existing namespace model existed pre-COM, and COM had to take advantage of > what it could! > > Of course, a key indicator of the likely success is how well COM _can_ take > advantage of it, and how much Alice could have taken advantage of it - I > cant think of any other yardsticks? -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 242 days left Business: http://www.lemburg.com/ Python Pages: http://starship.python.net/crew/lemburg/

From fredrik at pythonware.com Mon May 3 16:01:10 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 3 May 1999 16:01:10 +0200 Subject: [Python-Dev] Why Foo is better than Baz References: <000e01be94ff$4047ef20$0801a8c0@bobcat> Message-ID: <005b01be956d$66d48450$f29b12c2@pythonware.com> scriptics is positioning tcl as a perl killer: http://www.scriptics.com/scripting/perl.html afaict, unicode and event handling are the two main thingies missing from python 1.5.

-- unicode: is on its way.

-- event handling: asyncore/asynchat provides an awesome framework for event-driven socket programming. however, Python still lacks good cross-platform support for event-driven access to files and pipes. are threads good enough, or would it be cool to have something similar to Tcl's fileevent stuff in Python?
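Today the event-handling niche Fredrik describes is covered by the selectors module; here is a minimal sketch of fileevent-style readiness callbacks over a socket pair (on_readable and the tiny echo protocol are made up for illustration):

```python
import selectors
import socket

sel = selectors.DefaultSelector()
a, b = socket.socketpair()
b.setblocking(False)
a.settimeout(1)

def on_readable(sock):
    # Fired by the event loop when sock has data waiting.
    data = sock.recv(1024)
    sock.sendall(b"echo:" + data)

sel.register(b, selectors.EVENT_READ, data=on_readable)
a.sendall(b"hi")
for key, _events in sel.select(timeout=1):
    key.data(key.fileobj)          # dispatch to the registered callback
reply = a.recv(1024)               # the echoed b"echo:hi"
sel.close()
```

The same register/select/dispatch shape is what asyncore/asynchat wrapped around select() in 1999.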
-- regexps: has anyone compared the new unicode-aware regexp package in Tcl with pcre? comments? btw, the rebol folks have reached 2.0: http://www.rebol.com/ maybe 1.6 should be renamed to Python 6.0? From akuchlin at cnri.reston.va.us Mon May 3 17:14:15 1999 From: akuchlin at cnri.reston.va.us (Andrew M. Kuchling) Date: Mon, 3 May 1999 11:14:15 -0400 (EDT) Subject: [Python-Dev] Why Foo is better than Baz In-Reply-To: <005b01be956d$66d48450$f29b12c2@pythonware.com> References: <000e01be94ff$4047ef20$0801a8c0@bobcat> <005b01be956d$66d48450$f29b12c2@pythonware.com> Message-ID: <14125.47524.196878.583460@amarok.cnri.reston.va.us> Fredrik Lundh writes: >-- regexps: has anyone compared the new unicode-aware regexp package in Tcl with pcre? I looked at it a bit when Tcl 8.1 was in beta; it derives from Henry Spencer's 1998-vintage code, which seems to try to do a lot of optimization and analysis. It may even compile DFAs instead of NFAs when possible, though it's hard for me to be sure. This might give it a substantial speed advantage over engines that do less analysis, but I haven't benchmarked it. The code is easy to read, but difficult to understand because the theory underlying the analysis isn't explained in the comments; one feels there should be an accompanying paper to explain how everything works, and that's why I'm not sure if it really is producing DFAs for some expressions. Tcl seems to represent everything as UTF-8 internally, so there's only one regex engine; there's .
The code is scattered over more files:

amarok generic>ls re*.[ch]
regc_color.c   regc_locale.c  regcustom.h   regerrs.h    regfree.c
regc_cvec.c    regc_nfa.c     rege_dfa.c    regex.h      regfronts.c
regc_lex.c     regcomp.c      regerror.c    regexec.c    regguts.h
amarok generic>wc -l re*.[ch]
     742 regc_color.c
     170 regc_cvec.c
    1010 regc_lex.c
     781 regc_locale.c
    1528 regc_nfa.c
    2124 regcomp.c
      85 regcustom.h
     627 rege_dfa.c
      82 regerror.c
      18 regerrs.h
     308 regex.h
     952 regexec.c
      25 regfree.c
      56 regfronts.c
     388 regguts.h
    8896 total
amarok generic>

This would be an issue for using it with Python, since all these files would wind up scattered around the Modules directory. For comparison, pypcre.c is around 4700 lines of code. -- A.M. Kuchling http://starship.python.net/crew/amk/ Things need not have happened to be true. Tales and dreams are the shadow-truths that will endure when mere facts are dust and ashes, and forgot. -- Neil Gaiman, _Sandman_ #19: _A Midsummer Night's Dream_ From guido at CNRI.Reston.VA.US Mon May 3 17:32:09 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 03 May 1999 11:32:09 -0400 Subject: [Python-Dev] Why Foo is better than Baz In-Reply-To: Your message of "Mon, 03 May 1999 11:14:15 EDT." <14125.47524.196878.583460@amarok.cnri.reston.va.us> References: <000e01be94ff$4047ef20$0801a8c0@bobcat> <005b01be956d$66d48450$f29b12c2@pythonware.com> <14125.47524.196878.583460@amarok.cnri.reston.va.us> Message-ID: <199905031532.LAA05617@eric.cnri.reston.va.us> > I looked at it a bit when Tcl 8.1 was in beta; it derives from > Henry Spencer's 1998-vintage code, which seems to try to do a lot of > optimization and analysis. It may even compile DFAs instead of NFAs > when possible, though it's hard for me to be sure. This might give it > a substantial speed advantage over engines that do less analysis, but > I haven't benchmarked it.
The code is easy to read, but difficult to > understand because the theory underlying the analysis isn't explained > in the comments; one feels there should be an accompanying paper to > explain how everything works, and it's why I'm not sure if it really > is producing DFAs for some expressions. > > Tcl seems to represent everything as UTF-8 internally, so > there's only one regex engine; there's . Hmm... I looked when Tcl 8.1 was in alpha, and I *think* that at that point the regex engine was compiled twice, once for 8-bit chars and once for 16-bit chars. But this may have changed. I've noticed that Perl is taking the same position (everything is UTF-8 internally). On the other hand, Java distinguishes 16-bit chars from 8-bit bytes. Python is currently in the Java camp. This might be a good time to make sure that we're still convinced that this is the right thing to do!

> The code is scattered over more files:
>
> amarok generic>ls re*.[ch]
> regc_color.c   regc_locale.c  regcustom.h  regerrs.h   regfree.c
> regc_cvec.c    regc_nfa.c     rege_dfa.c   regex.h     regfronts.c
> regc_lex.c     regcomp.c      regerror.c   regexec.c   regguts.h
> amarok generic>wc -l re*.[ch]
>  742 regc_color.c
>  170 regc_cvec.c
> 1010 regc_lex.c
>  781 regc_locale.c
> 1528 regc_nfa.c
> 2124 regcomp.c
>   85 regcustom.h
>  627 rege_dfa.c
>   82 regerror.c
>   18 regerrs.h
>  308 regex.h
>  952 regexec.c
>   25 regfree.c
>   56 regfronts.c
>  388 regguts.h
> 8896 total
> amarok generic>
>
> This would be an issue for using it with Python, since all
> these files would wind up scattered around the Modules directory. For
> comparison, pypcre.c is around 4700 lines of code.

I'm sure that if it's good code, we'll find a way. Perhaps a more interesting question is whether it is Perl5 compatible. I contacted Henry Spencer at the time and he was willing to let us use his code.
--Guido van Rossum (home page: http://www.python.org/~guido/) From akuchlin at cnri.reston.va.us Mon May 3 17:56:46 1999 From: akuchlin at cnri.reston.va.us (Andrew M. Kuchling) Date: Mon, 3 May 1999 11:56:46 -0400 (EDT) Subject: [Python-Dev] Why Foo is better than Baz In-Reply-To: <199905031532.LAA05617@eric.cnri.reston.va.us> References: <000e01be94ff$4047ef20$0801a8c0@bobcat> <005b01be956d$66d48450$f29b12c2@pythonware.com> <14125.47524.196878.583460@amarok.cnri.reston.va.us> <199905031532.LAA05617@eric.cnri.reston.va.us> Message-ID: <14125.49911.982236.754340@amarok.cnri.reston.va.us> Guido van Rossum writes: >Hmm... I looked when Tcl 8.1 was in alpha, and I *think* that at that >point the regex engine was compiled twice, once for 8-bit chars and >once for 16-bit chars. But this may have changed. It doesn't seem to currently; the code in tclRegexp.c looks like this:

    /* Remember the UTF-8 string so Tcl_RegExpRange() can convert the
     * matches from character to byte offsets. */
    regexpPtr->string = string;
    Tcl_DStringInit(&stringBuffer);
    uniString = Tcl_UtfToUniCharDString(string, -1, &stringBuffer);
    numChars = Tcl_DStringLength(&stringBuffer) / sizeof(Tcl_UniChar);

    /* Perform the regexp match. */
    result = TclRegExpExecUniChar(interp, re, uniString, numChars, -1,
            ((string > start) ? REG_NOTBOL : 0));

ISTR the Spencer engine does, however, define a small and large representation for NFAs and have two versions of the engine, one for each representation. Perhaps that's what you're thinking of. >I've noticed that Perl is taking the same position (everything is >UTF-8 internally). On the other hand, Java distinguishes 16-bit chars >from 8-bit bytes. Python is currently in the Java camp. This might >be a good time to make sure that we're still convinced that this is >the right thing to do! I don't know.
There's certainly the fundamental dichotomy that strings are sometimes used to represent characters, where changing encodings on input and output is reasonable, and sometimes used to hold chunks of binary data, where any changes are incorrect. Perhaps Paul Prescod is right, and we should try to get some other data type (array.array()) for holding binary data, as distinct from strings. >I'm sure that if it's good code, we'll find a way. Perhaps a more >interesting question is whether it is Perl5 compatible. I contacted >Henry Spencer at the time and he was willing to let us use his code. Mostly Perl-compatible, though it doesn't look like the 5.005 features are there, and I haven't checked for every single 5.004 feature. Adding missing features might be problematic, because I don't really understand what the code is doing at a high level. Also, is there a user community for this code? Do any other projects use it? Philip Hazel has been quite helpful with PCRE, an important thing when making modifications to the code. Should I make a point of looking at what using the Spencer engine would entail? It might not be too difficult (an evening or two, maybe?) to write a re.py that sat on top of the Spencer code; that would at least let us do some benchmarking. -- A.M. Kuchling http://starship.python.net/crew/amk/ In Einstein's theory of relativity the observer is a man who sets out in quest of truth armed with a measuring-rod. In quantum theory he sets out with a sieve. -- Sir Arthur Eddington From guido at CNRI.Reston.VA.US Mon May 3 18:02:22 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 03 May 1999 12:02:22 -0400 Subject: [Python-Dev] Why Foo is better than Baz In-Reply-To: Your message of "Mon, 03 May 1999 11:56:46 EDT."
<14125.49911.982236.754340@amarok.cnri.reston.va.us> References: <000e01be94ff$4047ef20$0801a8c0@bobcat> <005b01be956d$66d48450$f29b12c2@pythonware.com> <14125.47524.196878.583460@amarok.cnri.reston.va.us> <199905031532.LAA05617@eric.cnri.reston.va.us> <14125.49911.982236.754340@amarok.cnri.reston.va.us> Message-ID: <199905031602.MAA05829@eric.cnri.reston.va.us> > Should I make a point of looking at what using the Spencer > engine would entail? It might not be too difficult (an evening or > two, maybe?) to write a re.py that sat on top of the Spencer code; > that would at least let us do some benchmarking. Surely this would be more helpful than weeks of speculative emails -- go for it! --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik at pythonware.com Mon May 3 19:10:55 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 3 May 1999 19:10:55 +0200 Subject: [Python-Dev] Why Foo is better than Baz References: <000e01be94ff$4047ef20$0801a8c0@bobcat><005b01be956d$66d48450$f29b12c2@pythonware.com><14125.47524.196878.583460@amarok.cnri.reston.va.us><199905031532.LAA05617@eric.cnri.reston.va.us> <14125.49911.982236.754340@amarok.cnri.reston.va.us> Message-ID: <005801be9588$7ad0fcc0$f29b12c2@pythonware.com> > Also, is there a user community for this code?
how about comp.lang.tcl ;-) From fredrik at pythonware.com Mon May 3 19:15:00 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 3 May 1999 19:15:00 +0200 Subject: [Python-Dev] Why Foo is better than Baz References: <000e01be94ff$4047ef20$0801a8c0@bobcat> <005b01be956d$66d48450$f29b12c2@pythonware.com> <14125.47524.196878.583460@amarok.cnri.reston.va.us> <199905031532.LAA05617@eric.cnri.reston.va.us> <14125.49911.982236.754340@amarok.cnri.reston.va.us> <199905031602.MAA05829@eric.cnri.reston.va.us> Message-ID: <005901be9588$7af59bc0$f29b12c2@pythonware.com> talking about regexps, here's another thing that would be quite nice to have in 1.6 (available from the Python level, that is). or is it already in there somewhere? ... http://www.dejanews.com/[ST_rn=qs]/getdoc.xp?AN=464362873 Tcl 8.1b3 Request: Generated by Scriptics' bug entry form at Submitted by: Frederic BONNET OperatingSystem: Windows 98 CustomShell: Applied patch to the regexp engine (the exec part) Synopsis: regexp improvements DesiredBehavior: As previously requested by Don Libes: > I see no way for Tcl_RegExpExec to indicate "could match" meaning > "could match if more characters arrive that were suitable for a > match". This is required for a class of applications involving > matching on a stream required by Expect's interact command. Henry > assured me that this facility would be in the engine (I'm not the only > one that needs it). Note that it is not sufficient to add one more > return value to Tcl_RegExpExec (i.e., 2) because one needs to know > both if something matches now and can match later. I recommend > another argument (canMatch *int) be added to Tcl_RegExpExec. /patch info follows/ ... From bwarsaw at cnri.reston.va.us Tue May 4 00:28:23 1999 From: bwarsaw at cnri.reston.va.us (Barry A. 
Warsaw) Date: Mon, 3 May 1999 18:28:23 -0400 (EDT) Subject: [Python-Dev] New mailing list: python-bugs-list Message-ID: <14126.8967.793734.892670@anthem.cnri.reston.va.us> I've been using Jitterbug for a couple of weeks now as my bug database for Mailman and JPython. So it was easy enough for me to set up a database for Python bug reports. Guido is in the process of tailoring the Jitterbug web interface to his liking and will announce it to the appropriate forums when he's ready. In the meantime, I've created YAML that you might be interested in. All bug reports entered into Jitterbug will be forwarded to python-bugs-list at python.org. You are invited to subscribe to the list by visiting http://www.python.org/mailman/listinfo/python-bugs-list Enjoy, -Barry From jeremy at cnri.reston.va.us Tue May 4 00:30:10 1999 From: jeremy at cnri.reston.va.us (Jeremy Hylton) Date: Mon, 3 May 1999 18:30:10 -0400 (EDT) Subject: [Python-Dev] New mailing list: python-bugs-list In-Reply-To: <14126.8967.793734.892670@anthem.cnri.reston.va.us> References: <14126.8967.793734.892670@anthem.cnri.reston.va.us> Message-ID: <14126.9061.558631.437892@bitdiddle.cnri.reston.va.us> Pretty low volume list, eh? From MHammond at skippinet.com.au Tue May 4 01:28:39 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Tue, 4 May 1999 09:28:39 +1000 Subject: [Python-Dev] New mailing list: python-bugs-list In-Reply-To: <14126.9061.558631.437892@bitdiddle.cnri.reston.va.us> Message-ID: <000701be95bc$ad0b45e0$0801a8c0@bobcat> ha - we wish. More likely to be full of detailed bug reports about how 1/2 != 0.5, or that "def foo(baz=[])" is buggy, etc :-) Mark. > Pretty low volume list, eh? 
From tim_one at email.msn.com Tue May 4 07:16:17 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 4 May 1999 01:16:17 -0400 Subject: [Python-Dev] Why Foo is better than Baz In-Reply-To: <199905031532.LAA05617@eric.cnri.reston.va.us> Message-ID: <000701be95ed$3d594180$dca22299@tim> [Guido & Andrew on Tcl's new regexp code] > I'm sure that if it's good code, we'll find a way. Perhaps a more > interesting question is whether it is Perl5 compatible. I contacted > Henry Spencer at the time and he was willing to let us use his code. Haven't looked at the code, but did read the manpage just now: http://www.scriptics.com/man/tcl8.1/TclCmd/regexp.htm WRT Perl5 compatibility, it sez: Incompatibilities of note include `\b', `\B', the lack of special treatment for a trailing newline, the addition of complemented bracket expressions to the things affected by newline-sensitive matching, the restrictions on parentheses and back references in lookahead constraints, and the longest/shortest-match (rather than first-match) matching semantics. So some gratuitous differences, and maybe a killer: Guido hasn't had much kind to say about "longest" (aka POSIX) matching semantics. An example from the page: (week|wee)(night|knights) matches all ten characters of `weeknights' which means it matched 'wee' and 'knights'; Python/Perl match 'week' and 'night'. It's the *natural* semantics if Andrew's suspicion that it's compiling a DFA is correct; indeed, it's a pain to get that behavior any other way! otoh-it's-potentially-very-much-faster-ly y'rs - tim From tim_one at email.msn.com Tue May 4 07:51:01 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 4 May 1999 01:51:01 -0400 Subject: [Python-Dev] Why Foo is better than Baz In-Reply-To: <000701be95ed$3d594180$dca22299@tim> Message-ID: <000901be95f2$195556c0$dca22299@tim> [Tim] > ... > It's the *natural* semantics if Andrew's suspicion that it's > compiling a DFA is correct ... 
More from the man page: AREs report the longest/shortest match for the RE, rather than the first found in a specified search order. This may affect some RREs which were written in the expectation that the first match would be reported. (The careful crafting of RREs to optimize the search order for fast matching is obsolete (AREs examine all possible matches in parallel, and their performance is largely insensitive to their complexity) but cases where the search order was exploited to deliberately find a match which was not the longest/shortest will need rewriting.) Nails it, yes? Now, in 10 seconds, try to remember a regexp where this really matters . Note in passing that IDLE's colorizer regexp *needs* to search for triple-quoted strings before single-quoted ones, else the P/P semantics would consider """ to be an empty single-quoted string followed by a double quote. This isn't a case where it matters in a bad way, though! The "longest" rule picks the correct alternative regardless of the order in which they're written. at-least-in-that-specific-regex<0.1-wink>-ly y'rs - tim From guido at CNRI.Reston.VA.US Tue May 4 14:26:04 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 04 May 1999 08:26:04 -0400 Subject: [Python-Dev] Why Foo is better than Baz In-Reply-To: Your message of "Tue, 04 May 1999 01:16:17 EDT." <000701be95ed$3d594180$dca22299@tim> References: <000701be95ed$3d594180$dca22299@tim> Message-ID: <199905041226.IAA07627@eric.cnri.reston.va.us> [Tim] > So some gratuitous differences, and maybe a killer: Guido hasn't had much > kind to say about "longest" (aka POSIX) matching semantics. > > An example from the page: > > (week|wee)(night|knights) > matches all ten characters of `weeknights' > > which means it matched 'wee' and 'knights'; Python/Perl match 'week' and > 'night'. > > It's the *natural* semantics if Andrew's suspicion that it's compiling a DFA > is correct; indeed, it's a pain to get that behavior any other way! 
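[For reference, the leftmost-first behavior quoted above is easy to check from Python's side with the current re module; a quick illustrative sketch, not code from any of the mails:]

```python
import re

# Leftmost-first (Perl/Python) alternation: "week" is tried before "wee",
# so it wins -- a POSIX "longest overall match" engine would instead
# report 'wee' + 'knights'.
m = re.match(r"(week|wee)(night|knights)", "weeknights")
print(m.groups())  # ('week', 'night')

# The same order-sensitivity is behind the IDLE-colorizer point: with the
# single-quoted alternative first, leftmost-first matching sees an empty
# string literal plus a stray quote.
wrong_order = re.match(r'"[^"]*"|""".*?"""', '"""text"""')
right_order = re.match(r'""".*?"""|"[^"]*"', '"""text"""')
print(wrong_order.group())  # '""'
print(right_order.group())  # '"""text"""'
```

[A "longest" engine picks the intended alternative here no matter which order the branches are written in.]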
Possibly contradicting what I once said about DFAs (I have no idea what I said any more :-): I think we shouldn't be hung up about the subtleties of DFA vs. NFA; for most people, the Perl-compatibility simply means that they can use the same metacharacters. My guess is that people don't so much translate long Perl regexp's to Python but simply transport their (always incomplete -- Larry Wall *wants* it that way :-) knowledge of Perl regexps to Python. My meta-guess is that this is also Henry Spencer's and John Ousterhout's guess. As for Larry Wall, I guess he really doesn't care :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From akuchlin at cnri.reston.va.us Tue May 4 18:14:41 1999 From: akuchlin at cnri.reston.va.us (Andrew M. Kuchling) Date: Tue, 4 May 1999 12:14:41 -0400 (EDT) Subject: [Python-Dev] Why Foo is better than Baz In-Reply-To: <199905041226.IAA07627@eric.cnri.reston.va.us> References: <000701be95ed$3d594180$dca22299@tim> <199905041226.IAA07627@eric.cnri.reston.va.us> Message-ID: <14127.6410.646122.342115@amarok.cnri.reston.va.us> Guido van Rossum writes: >Possibly contradicting what I once said about DFAs (I have no idea >what I said any more :-): I think we shouldn't be hung up about the >subtleties of DFA vs. NFA; for most people, the Perl-compatibility >simply means that they can use the same metacharacters. My guess is I don't like slipping in such a change to the semantics with no visible change to the module name or interface. On the other hand, if it's not NFA-based, then it can provide POSIX semantics without danger of taking exponential time to determine the longest match. BTW, there's an interesting reference, I assume to this code, in _Mastering Regular Expressions_; Spencer is quoted on page 121 as saying it's "at worst quadratic in text size." Anyway, we can let it slide until a Python interface gets written. -- A.M.
Kuchling http://starship.python.net/crew/amk/ In the black shadow of the Baba Yaga babies screamed and mothers miscarried; milk soured and men went mad. -- In SANDMAN #38: "The Hunt" From guido at CNRI.Reston.VA.US Tue May 4 18:19:06 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 04 May 1999 12:19:06 -0400 Subject: [Python-Dev] Why Foo is better than Baz In-Reply-To: Your message of "Tue, 04 May 1999 12:14:41 EDT." <14127.6410.646122.342115@amarok.cnri.reston.va.us> References: <000701be95ed$3d594180$dca22299@tim> <199905041226.IAA07627@eric.cnri.reston.va.us> <14127.6410.646122.342115@amarok.cnri.reston.va.us> Message-ID: <199905041619.MAA08408@eric.cnri.reston.va.us> > BTW, there's an interesting reference, I assume to this code, in > _Mastering Regular Expressions_; Spencer is quoted on page 121 as > saying it's "at worst quadratic in text size." Not sure if that was the same code -- this is *new* code, not Spencer's old code. I think Friedl's book is older than the current code. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim_one at email.msn.com Wed May 5 07:37:02 1999 From: tim_one at email.msn.com (Tim Peters) Date: Wed, 5 May 1999 01:37:02 -0400 Subject: [Python-Dev] Tcl 8.1's regexp code (was RE: [Python-Dev] Why Foo is better than Baz) In-Reply-To: <199905041226.IAA07627@eric.cnri.reston.va.us> Message-ID: <000701be96b9$4e434460$799e2299@tim> I've consistently found that the best way to kill a thread is to rename it accurately. Agree w/ Guido that few people really care about the differing semantics. Agree w/ Andrew that it's bad to pull a semantic switcheroo at this stage anyway: code will definitely break. Like \b(?: (?P<special>and|if|else|...) | (?P<general>[a-zA-Z_]\w*) )\b The (special)|(general) idiom relies on left-to-right match-and-out searching of alternatives to do its job correctly. Not to mention that \b is not a word-boundary assertion in the new pkg (talk about pointlessly irritating differences!
at least this one could be easily hidden via brainless preprocessing). Over the long run, moving to a DFA locks Python out of the directions Perl is *moving*, namely embedding all sorts of runtime gimmicks in regexps that exploit knowing the "state of the match so far". DFAs don't work that way. I don't mind losing those possibilities, because I think the regexp sublanguage is strained beyond its limits already. But that's a decision with Big Consequences, so deserves some thought. I'd definitely like the (sometimes dramatically) increased speed a DFA can offer (btw, this code appears to use a lazily-generated DFA, to avoid the exponential *compile*-time a straightforward DFA implementation can suffer -- the code is very complex and lacks any high-level internal docs, so we better hope Henry stays in love with it <0.5 wink>). > ... > My guess is that people don't so much translate long Perl regexp's > to Python but simply transport their (always incomplete -- Larry Wall > *wants* it that way :-) knowledge of Perl regexps to Python. This is directly proportional to the number of feeble CGI programmers Python attracts. The good news is that they wouldn't know an NFA from a DFA if Larry bit Henry on the ass ... > My meta-guess is that this is also Henry Spencer's and John > Ousterhout's guess. I think Spencer strongly favors DFA semantics regardless of fashion, and Ousterhout is a pragmatist. So I trust JO's judgment more <0.9 wink>. > As for Larry Wall, I guess he really doesn't care :-) I expect he cares a lot! Because a DFA would prevent Perl from going even more insane in its present direction. About the age of the code, postings to comp.lang.tcl have Henry saying he was working on the alpha version intensely as recently as December ('98). A few complaints about the alpha release trickled in, about regexp compile speed and regexp matching speed in specific cases.
Perhaps paradoxically, the latter were about especially simple regexps with long fixed substrings (where this mountain of sophisticated machinery is likely to get beat cold by an NFA with some fixed-substring lookahead smarts -- which latter Henry intended to graft into this pkg too). [Andrew] > BTW, there's an interesting reference, I assume to this code, in > _Mastering Regular Expressions_; Spencer is quoted on page 121 as > saying it's "at worst quadratic in text size.". [Guido] > Not sure if that was the same code -- this is *new* code, not > Spencer's old code. I think Friedl's book is older than the current > code. I expect this is an invariant, though: it's not natural for a DFA to know where subexpression matches begin and end, and there's a pile of xxx_dissect functions in regexec.c that use what strongly appear to be worst-case quadratic-time algorithms for figuring that out after it's known that the overall expression has *a* match. Expect too, but don't know, that only pathological cases are actually expensive. Question: has this package been released in any other context, or is it unique to Tcl? I searched in vain for an announcement (let alone code) from Henry, or any discussion of this code outside the Tcl world. whatever-happens-i-vote-we-let-them-debug-it-ly y'rs - tim From gstein at lyra.org Wed May 5 08:22:20 1999 From: gstein at lyra.org (Greg Stein) Date: Tue, 4 May 1999 23:22:20 -0700 (PDT) Subject: [Python-Dev] Tcl 8.1's regexp code In-Reply-To: <000701be96b9$4e434460$799e2299@tim> Message-ID: On Wed, 5 May 1999, Tim Peters wrote: >... > Question: has this package been released in any other context, or is it > unique to Tcl? I searched in vain for an announcement (let alone code) from > Henry, or any discussion of this code outside the Tcl world. Apache uses it. However, the Apache guys have considered the possibility of updating the thing. I gather that they have a pretty old snapshot.
Another guy mentioned PCRE and I pointed out that Python uses it for its regex support. In other words, if Apache *does* update the code, then it may be that Apache will drop the HS engine in favor of PCRE. Cheers, -g -- Greg Stein, http://www.lyra.org/ From Ivan.Porres at abo.fi Wed May 5 10:29:21 1999 From: Ivan.Porres at abo.fi (Ivan Porres Paltor) Date: Wed, 05 May 1999 11:29:21 +0300 Subject: [Python-Dev] Python for Small Systems patch Message-ID: <37300161.8DFD1D7F@abo.fi> Python for Small Systems is a minimal version of the python interpreter, intended to run on small embedded systems with a limited amount of memory. Since there is some interest in the newsgroup, we have decided to release an alpha version of the patch. You can download the patch from the following page: http://www.abo.fi/~iporres/python There is no documentation about the changes, but I guess that it is not so difficult to figure out what Raul has been doing. There are some simple examples in the Demo/hitachi directory. The configure scripts are broken. We plan to modify the configure scripts for cross-compilation. We are still testing, cleaning and trying to reduce the memory requirements of the patched interpreter. We also plan to write some documentation.
Please send comments to Raul (rparra at abo.fi) or to me (iporres at abo.fi), Regards, Ivan -- Ivan Porres Paltor Turku Centre for Computer Science Åbo Akademi, Department of Computer Science Phone: +358-2-2154033 Lemminkäinengatan 14A FIN-20520 Turku - Finland http://www.abo.fi/~iporres From tismer at appliedbiometrics.com Wed May 5 13:52:24 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Wed, 05 May 1999 13:52:24 +0200 Subject: [Python-Dev] Python for Small Systems patch References: <37300161.8DFD1D7F@abo.fi> Message-ID: <373030F8.21B73451@appliedbiometrics.com> Ivan Porres Paltor wrote: > > Python for Small Systems is a minimal version of the python interpreter, > intended to run on small embedded systems with a limited amount of > memory. > > Since there is some interest in the newsgroup, we have decided to release > an alpha version of the patch. You can download the patch from the > following page: > > http://www.abo.fi/~iporres/python > > There is no documentation about the changes, but I guess that it is not > so difficult to figure out what Raul has been doing. Ivan, small Python is a very interesting thing, thanks for the preview. But, aren't 12600 lines of diff a little too much to call it "not difficult to figure out"? :-) The very last line was indeed helpful: +++ Pss/miniconfigure Tue Mar 16 16:59:42 1999 @@ -0,0 +1 @@ +./configure --prefix="/home/rparra/python/Python-1.5.1" --without-complex --without-float --without-long --without-file --without-libm --without-libc --without-fpectl --without-threads --without-dec-threads --with-libs= But I'd be interested in a brief list of which other features are out, and even more which structures were changed. Would that be possible? thanks - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break!
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From Ivan.Porres at abo.fi Wed May 5 15:17:17 1999 From: Ivan.Porres at abo.fi (Ivan Porres Paltor) Date: Wed, 05 May 1999 16:17:17 +0300 Subject: [Python-Dev] Python for Small Systems patch References: <37300161.8DFD1D7F@abo.fi> <373030F8.21B73451@appliedbiometrics.com> Message-ID: <373044DD.FE4499E@abo.fi> Christian Tismer wrote: > Ivan, > small Python is a very interesting thing, > thanks for the preview. > > But, aren't 12600 lines of diff a little too much > to call it "not difficult to figure out"? :-) Raul Parra (rpb), the author of the patch, got the "source scissors" (#ifndef WITHOUT... #endif) and cut the interpreter until it fitted in an embedded system with some RAM, no keyboard, no screen and no OS. An example application can be a printer where the print jobs are python bytecompiled scripts (instead of postscript). We plan to write some documentation about the patch. Meanwhile, here are some of the changes: WITHOUT_PARSER, WITHOUT_COMPILER Defining WITHOUT_PARSER removes the parser. This has a lot of implications (no eval() !) but saves a lot of memory. The interpreter can only execute byte-compiled scripts, that is PyCodeObjects. Most embedded processors have poor floating point capabilities. (They can not compete with DSP's): WITHOUT-COMPLEX Removes support for complex numbers WITHOUT-LONG Removes long numbers WITHOUT-FLOAT Removes floating point numbers Dependencies with the OS: WITHOUT-FILE Removes file objects. No file, no print, no input, no interactive prompt. This is not too bad in a device without hard disk, keyboard or screen...
WITHOUT-GETPATH Removes dependencies with os path. (Probably this change should be integrated with WITHOUT-FILE) These changes render most of the standard modules unusable. There are no fundamental changes on the interpreter, just cut and cut.... Ivan -- Ivan Porres Paltor Turku Centre for Computer Science Åbo Akademi, Department of Computer Science Phone: +358-2-2154033 Lemminkäinengatan 14A FIN-20520 Turku - Finland http://www.abo.fi/~iporres From tismer at appliedbiometrics.com Wed May 5 15:31:05 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Wed, 05 May 1999 15:31:05 +0200 Subject: [Python-Dev] Python for Small Systems patch References: <37300161.8DFD1D7F@abo.fi> <373030F8.21B73451@appliedbiometrics.com> <373044DD.FE4499E@abo.fi> Message-ID: <37304819.AD636B67@appliedbiometrics.com> Ivan Porres Paltor wrote: > > Christian Tismer wrote: > > Ivan, > > small Python is a very interesting thing, > > thanks for the preview. > > > > But, aren't 12600 lines of diff a little too much > > to call it "not difficult to figure out"? :-) > > Raul Parra (rpb), the author of the patch, got the "source scissors" > (#ifndef WITHOUT... #endif) and cut the interpreter until it fitted in an > embedded system with some RAM, no keyboard, no screen and no OS. An > example application can be a printer where the print jobs are python > bytecompiled scripts (instead of postscript). > > We plan to write some documentation about the patch. Meanwhile, here are > some of the changes: Many thanks, this is really interesting > These changes render most of the standard modules unusable. > There are no fundamental changes on the interpreter, just cut and cut.... I see. A last thing I'm curious about is the executable size, if it can be compared to a Windows dll at all. Did you compile without the changes for your target as well? How is the ratio? The python15.dll file contains everything of core Python and is about 560 KB large.
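[The "source scissors" described above is plain compile-time feature excision. A hypothetical sketch of the shape such guards take -- illustrative only, not code from the patch, with made-up function names:]

```c
#include <string.h>

/* Illustrative sketch of the WITHOUT_* "source scissors" style: a whole
 * feature area sits between preprocessor guards, so building with e.g.
 * -DWITHOUT_FLOAT compiles it out entirely.  Hypothetical code, not
 * taken from the actual patch. */

#ifndef WITHOUT_FLOAT
static double half(double x) { return x / 2.0; }
#endif

static const char *float_status(void)
{
#ifndef WITHOUT_FLOAT
    return "float support compiled in";
#else
    return "float support compiled out";
#endif
}
```

[Built normally, both the helper and the "compiled in" branch survive; a small-target build drops them at compile time rather than run time, which is why the savings show up in the binary size.]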
If your engine goes down to, say, below 200 KB, this could be a great thing for embedding Python into other apps. ciao & thanks - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From bwarsaw at cnri.reston.va.us Wed May 5 16:55:40 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Wed, 5 May 1999 10:55:40 -0400 (EDT) Subject: [Python-Dev] Tcl 8.1's regexp code (was RE: [Python-Dev] Why Foo is better than Baz) References: <199905041226.IAA07627@eric.cnri.reston.va.us> <000701be96b9$4e434460$799e2299@tim> Message-ID: <14128.23532.499380.835737@anthem.cnri.reston.va.us> >>>>> "TP" == Tim Peters writes: TP> Over the long run, moving to a DFA locks Python out of the TP> directions Perl is *moving*, namely embedding all sorts of TP> runtime gimmicks in regexps that exploit knowing the "state of TP> the match so far". DFAs don't work that way. I don't mind TP> losing those possibilities, because I think the regexp TP> sublanguage is strained beyond its limits already. But that's TP> a decision with Big Consequences, so deserves some thought. I know zip about the internals of the various regexp packages. But as far as the Python-level interface goes, would it be feasible to support both as underlying regexp engines underneath re.py? The idea would be that you'd add an extra flag (re.PERL / re.TCL ? re.DFA / re.NFA ? re.POSIX / re.USEFUL ? :-) that would select the engine and compiler. Then all the rest of the magic happens behind the scenes, with appropriate exceptions thrown if there are syntax mismatches in the regexp that can't be worked around by preprocessors, etc. Or would that be more confusing than yet another different regexp module?
-Barry From tim_one at email.msn.com Wed May 5 17:55:20 1999 From: tim_one at email.msn.com (Tim Peters) Date: Wed, 5 May 1999 11:55:20 -0400 Subject: [Python-Dev] Tcl 8.1's regexp code In-Reply-To: Message-ID: <000601be970f$adef5740$a59e2299@tim> [Tim] > Question: has this package [Tcl's 8.1 regexp support] been released in > any other context, or is it unique to Tcl? I searched in vain for an > announcement (let alone code) from Henry, or any discussion of this code > outside the Tcl world. [Greg Stein] > Apache uses it. > > However, the Apache guys have considered possibly updating the thing. I > gather that they have a pretty old snapshot. Another guy mentioned PCRE > and I pointed out that Python uses it for its regex support. In other > words, if Apache *does* update the code, then it may be that Apache will > drop the HS engine in favor of PCRE. Hmm. I just downloaded the Apache 1.3.4 source to check on this, and it appears to be using a lightly massaged version of Spencer's old (circa '92-'94) just-POSIX regexp package. Henry has been distributing regexp pkgs for a loooong time . The Tcl 8.1 regexp pkg is much hairier. If the Apache folk want to switch in order to get the Perl regexp syntax extensions, this Tcl version is worth looking at too. If they want to switch for some other reason, it would be good to know what that is! The base pkg Apache uses is easily available all over the web; the pkg Tcl 8.1 is using I haven't found anywhere except in the Tcl download (which is why I'm wondering about it -- so far, it doesn't appear to be distributed by Spencer himself, in a non-Tcl-customized form).
looks-like-an-entirely-new-pkg-to-me-ly y'rs - tim From beazley at cs.uchicago.edu Wed May 5 18:54:45 1999 From: beazley at cs.uchicago.edu (David Beazley) Date: Wed, 5 May 1999 11:54:45 -0500 (CDT) Subject: [Python-Dev] My (possibly delusional) book project Message-ID: <199905051654.LAA11410@tartarus.cs.uchicago.edu> Although this is a little off-topic for the developer list, I want to fill people in on a new Python book project. A few months ago, I was approached about doing a new Python reference book and I've since decided to proceed with the project (after all, an increased presence at the bookstore is probably a good thing :-). In any event, my "vision" for this book is to take the material in the Python tutorial, language reference, library reference, and extension guide and squeeze it into a compact book no longer than 300 pages (and hopefully without having to use a 4-point font). Actually, what I'm really trying to do is write something in a style similar to the K&R C Programming book (very terse, straight to the point, and technically accurate). The book's target audience is experienced/expert programmers. With this said, I would really like to get feedback from the developer community about this project in a few areas. First, I want to make sure the language reference is in sync with the latest version of Python, that it is as accurate as possible, and that it doesn't leave out any important topics or recent developments. Second, I would be interested in knowing how to emphasize certain topics (for instance, should I emphasize class-based exceptions over string-based exceptions even though most books only cover the former case?). The other big area is the library reference. Given the size of the library, I'm going to cut a number of modules out. However, the choice of what to cut is not entirely clear (for now, it's a judgment call on my part). 
All of the work in progress for this project is online at: http://rustler.cs.uchicago.edu/~beazley/essential/reference.html I would love to get constructive feedback about this from other developers. Of course, I'll keep people posted in any case. Cheers, Dave From tim_one at email.msn.com Thu May 6 07:43:16 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 6 May 1999 01:43:16 -0400 Subject: [Python-Dev] Tcl 8.1's regexp code (was RE: [Python-Dev] Why Foo is better than Baz) In-Reply-To: <14128.23532.499380.835737@anthem.cnri.reston.va.us> Message-ID: <000d01be9783$57543940$2ca22299@tim> [Tim notes that moving to a DFA regexp engine would rule out some future aping of Perl mistakes ] [Barry "The Great Compromiser" Warsaw] > I know zip about the internals of the various regexp package. But as > far as the Python level interface, would it be feasible to support > both as underlying regexp engines underneath re.py? The idea would be > that you'd add an extra flag (re.PERL / re.TCL ? re.DFA / re.NFA ? > re.POSIX / re.USEFUL ? :-) that would select the engine and compiler. > Then all the rest of the magic happens behind the scenes, with > appropriate exceptions thrown if there are syntax mismatches in the > regexp that can't be worked around by preprocessors, etc. > > Or would that be more confusing than yet another different regexp > module? It depends some on what percentage of the Python distribution Guido wants to devote to regexp code <0.6 wink>; the Tcl pkg would be the largest block of code in Modules/, where regexp packages already consume more than anything else. It's a lot of delicate, difficult code. Someone would need to step up and champion each alternative package. I haven't asked Andrew lately, but I'd bet half a buck the thrill of supporting pcre has waned. If there were competing packages, your suggested interface is fine. 
I just doubt the Python developers will support more than one (Andrew may still be young, but he can't possibly still be naive enough to sign up for two of these nightmares ). i'm-so-old-i-never-signed-up-for-one-ly y'rs - tim From rushing at nightmare.com Thu May 13 08:34:19 1999 From: rushing at nightmare.com (rushing at nightmare.com) Date: Wed, 12 May 1999 23:34:19 -0700 (PDT) Subject: [Python-Dev] 'stackless' python? In-Reply-To: <199905070507.BAA22545@python.org> References: <199905070507.BAA22545@python.org> Message-ID: <14138.28243.553816.166686@seattle.nightmare.com> [list has been quiet, thought I'd liven things up a bit. 8^)] I'm not sure if this has been brought up before in other forums, but has there been discussion of separating the Python and C invocation stacks (i.e., removing recursive calls to the interpreter) to facilitate coroutines or first-class continuations? One of the biggest barriers to getting others to use asyncore/medusa is the need to program in continuation-passing style (callbacks, callbacks to callbacks, state machines, etc...). Usually there has to be an overriding requirement for speed/scalability before someone will even look into it. And even when you do 'get' it, there are limits to how inside-out your thinking can go. 8^) If Python had coroutines/continuations, it would be possible to hide asyncore-style select()/poll() machinery 'behind the scenes'. I believe that Concurrent ML does exactly this... Other advantages might be restartable exceptions, different threading models, etc... -Sam rushing at nightmare.com rushing at eGroups.net From mal at lemburg.com Thu May 13 10:23:13 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Thu, 13 May 1999 10:23:13 +0200 Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> Message-ID: <373A8BF1.AE124BF@lemburg.com> rushing at nightmare.com wrote: > > [list has been quiet, thought I'd liven things up a bit. 8^)] Well, there certainly is enough on the todo list... it's probably the usual "ain't got no time" thing. > I'm not sure if this has been brought up before in other forums, but > has there been discussion of separating the Python and C invocation > stacks (i.e., removing recursive calls to the interpreter) to > facilitate coroutines or first-class continuations? Wouldn't it be possible to move all the C variables passed to eval_code() via the execution frame ? AFAIK, the frame is generated on every call to eval_code() and thus could also be generated *before* calling it. > One of the biggest barriers to getting others to use asyncore/medusa > is the need to program in continuation-passing-style (callbacks, > callbacks to callbacks, state machines, etc...). Usually there has to > be an overriding requirement for speed/scalability before someone will > even look into it. And even when you do 'get' it, there are limits to > how inside-out your thinking can go. 8^) > > If Python had coroutines/continuations, it would be possible to hide > asyncore-style select()/poll() machinery 'behind the scenes'. I > believe that Concurrent ML does exactly this... > > Other advantages might be restartable exceptions, different threading > models, etc... Don't know if moving the C stack stuff into the frame objects will get you the desired effect: what about other things having state (e.g. connections or files), that are not even touched by this mechanism ?
-- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 232 days left Business: http://www.lemburg.com/ Python Pages: http://starship.python.net/crew/lemburg/ From rushing at nightmare.com Thu May 13 11:40:19 1999 From: rushing at nightmare.com (rushing at nightmare.com) Date: Thu, 13 May 1999 02:40:19 -0700 (PDT) Subject: [Python-Dev] 'stackless' python? In-Reply-To: <373A8BF1.AE124BF@lemburg.com> References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <373A8BF1.AE124BF@lemburg.com> Message-ID: <14138.38550.89759.752058@seattle.nightmare.com> M.-A. Lemburg writes: > Wouldn't it be possible to move all the C variables passed to > eval_code() via the execution frame ? AFAIK, the frame is > generated on every call to eval_code() and thus could also > be generated *before* calling it. I think this solves half of the problem. The C stack is both a value stack and an execution stack (i.e., it holds variables and return addresses). Getting rid of arguments (and a return value!) gets rid of the need for the 'value stack' aspect. In aiming for an enter-once, exit-once VM, the thorniest part is to somehow allow python->c->python calls. The second invocation could never save a continuation because its execution context includes a C frame. This is a general problem, not specific to Python; I probably should have thought about it a bit before posting... > Don't know if moving the C stack stuff into the frame objects > will get you the desired effect: what about other things having > state (e.g. connections or files), that are not even touched > by this mechanism ? I don't think either of those cause 'real' problems (i.e., nothing should crash that assumes an open file or socket), but there may be other stateful things that might. I don't think that refcounts would be a problem - a saved continuation wouldn't be all that different from an exception traceback. -Sam p.s.
Here's a tiny VM experiment I wrote a while back, to explain what I mean by 'stackless': http://www.nightmare.com/stuff/machine.h http://www.nightmare.com/stuff/machine.c Note how OP_INVOKE (the PROC_CLOSURE clause) pushes new context onto heap-allocated data structures rather than calling the VM recursively. From skip at mojam.com Thu May 13 13:38:39 1999 From: skip at mojam.com (Skip Montanaro) Date: Thu, 13 May 1999 07:38:39 -0400 (EDT) Subject: [Python-Dev] 'stackless' python? In-Reply-To: <14138.28243.553816.166686@seattle.nightmare.com> References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> Message-ID: <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> Sam> I'm not sure if this has been brought up before in other forums, Sam> but has there been discussion of separating the Python and C Sam> invocation stacks (i.e., removing recursive calls to the Sam> interpreter) to facilitate coroutines or first-class continuations? I thought Guido was working on that for the mobile agent stuff he was working on at CNRI. Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/ skip at mojam.com | Musi-Cal: http://www.musi-cal.com/ 518-372-5583 From bwarsaw at cnri.reston.va.us Thu May 13 17:10:52 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Thu, 13 May 1999 11:10:52 -0400 (EDT) Subject: [Python-Dev] 'stackless' python? References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> Message-ID: <14138.60284.584739.711112@anthem.cnri.reston.va.us> >>>>> "SM" == Skip Montanaro writes: SM> I thought Guido was working on that for the mobile agent stuff SM> he was working on at CNRI. Nope, we decided that we could accomplish everything we needed without this.
We occasionally revisit this but Guido keeps insisting it's a lot of work for not enough benefit :-) -Barry From guido at CNRI.Reston.VA.US Thu May 13 17:19:10 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 13 May 1999 11:19:10 -0400 Subject: [Python-Dev] 'stackless' python? In-Reply-To: Your message of "Thu, 13 May 1999 07:38:39 EDT." <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> Message-ID: <199905131519.LAA01097@eric.cnri.reston.va.us> Interesting topic! While I'm on the road, a few short notes. > I thought Guido was working on that for the mobile agent stuff he was > working on at CNRI. Indeed. At least I planned on working on it. I ended up abandoning the idea because I expected it would be a lot of work and I never had the time (same old story indeed). Sam also hit it on the nail: the hardest problem is what to do about all the places where C calls back into Python. I've come up with two partial solutions: (1) allow for a way to arrange for a call to be made immediately after you return to the VM from C; this would take care of apply() at least and a few other "tail-recursive" cases; (2) invoke a new VM when C code needs a Python result, requiring it to return. The latter clearly breaks certain uses of coroutines but could probably be made to work most of the time. Typical use of the 80-20 rule. And I've just come up with a third solution: a variation on (1) where you arrange *two* calls: one to Python and then one to C, with the result of the first. (And a bit saying whether you want the C call to be made even when an exception happened.) In general, I still think it's a cool idea, but I also still think that continuations are too complicated for most programmers. (This comes from the realization that they are too complicated for me!)
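Option (1) above — arrange the call and let the top-level loop perform it, instead of recursing — is the classic trampoline pattern. A minimal pure-Python sketch (all names here are invented for illustration; the real change would live in the C-level VM loop, not in Python code):

```python
# A toy trampoline: a "thunk" (zero-argument callable) either returns a
# final value or another thunk to run next.  The loop below performs each
# pending call itself, so no recursive interpreter invocations (and no
# deep C stack) build up.  Names are invented for illustration.
def trampoline(thunk):
    while callable(thunk):
        thunk = thunk()
    return thunk

def countdown(n, acc=0):
    # A "tail call" is expressed as a thunk handed back to the loop,
    # rather than made directly.
    if n == 0:
        return acc
    return lambda: countdown(n - 1, acc + n)

print(trampoline(lambda: countdown(100000)))  # 5000050000, no deep recursion
```

The same shape, applied at the C level, is what would keep the VM from piling up recursive C frames: the main loop performs each pending call itself and, in the variation (3), the follow-up C call as well.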
Corollary: even if we had continuations, I'm not sure if this would take away the resistance against asyncore/asynchat. Of course I could be wrong. Different suggestion: it would be cool to work on completely separating out the VM from the rest of Python, through some kind of C-level API specification. Two things should be possible with this new architecture: (1) small platform ports could cut out the interactive interpreter, the parser and compiler, and certain data types such as long, complex and files; (2) there could be alternative pluggable VMs with certain desirable properties such as platform-specific optimization (Christian, are you listening? :-). I think the most challenging part might be defining an API for passing in the set of supported object types and operations. E.g. the EXEC_STMT opcode needs to be implemented in a way that allows "exec" to be absent from the language. Perhaps an __exec__ function (analogous to __import__) is the way to go. The set of built-in functions should also be passed in, so that e.g. one can easily leave out open(), eval() and compile(), complex(), long(), float(), etc. I think it would be ideal if no #ifdefs were needed to remove features (at least not in the VM code proper). Fortunately, the VM doesn't really know about many object types -- frames, functions, methods, classes, ints, strings, dictionaries, tuples, tracebacks, that may be all it knows. (Lists?) Gotta run, --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik at pythonware.com Thu May 13 21:50:44 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu, 13 May 1999 21:50:44 +0200 Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <199905131519.LAA01097@eric.cnri.reston.va.us> Message-ID: <01d501be9d79$e4890060$f29b12c2@pythonware.com> > In general, I still think it's a cool idea, but I also still think > that continuations are too complicated for most programmers. (This > comes from the realization that they are too complicated for me!) in an earlier life, I used non-preemptive threads (that is, explicit yields) and co-routines to do some really cool stuff with very little code. looks like a stack-less interpreter would make it trivial to implement that. might just be nostalgia, but I think I would give an arm or two to get that (not necessarily my own, though ;-) From rushing at nightmare.com Fri May 14 04:00:09 1999 From: rushing at nightmare.com (rushing at nightmare.com) Date: Thu, 13 May 1999 19:00:09 -0700 (PDT) Subject: [Python-Dev] 'stackless' python? References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> Message-ID: <14139.30970.644343.612721@seattle.nightmare.com> Guido van Rossum writes: > I've come up with two partial solutions: (1) allow for a way to > arrange for a call to be made immediately after you return to the > VM from C; this would take care of apply() at least and a few > other "tail-recursive" cases; (2) invoke a new VM when C code > needs a Python result, requiring it to return. The latter clearly > breaks certain uses of coroutines but could probably be made to > work most of the time. Typical use of the 80-20 rule. I know this is disgusting, but could setjmp/longjmp 'automagically' force a 'recursive call' to jump back into the top-level loop? This would put some serious restraint on what C called from Python could do...
I think just about any Scheme implementation has to solve this same problem... I'll dig through my collection of them for ideas. > In general, I still think it's a cool idea, but I also still think > that continuations are too complicated for most programmers. (This > comes from the realization that they are too complicated for me!) > Corollary: even if we had continuations, I'm not sure if this would > take away the resistance against asyncore/asynchat. Of course I could > be wrong. Theoretically, you could have a bit of code that looked just like 'normal' imperative code, that would actually be entering and exiting the context for non-blocking i/o. If it were done right, the same exact code might even run under 'normal' threads. Recently I've written an async server that needed to talk to several other RPC servers, and a mysql server. Pseudo-example, with possibly-async calls in UPPERCASE: auth, archive = db.FETCH_USER_INFO (user) if verify_login(user,auth): rpc_server = self.archive_servers[archive] group_info = rpc_server.FETCH_GROUP_INFO (group) if valid (group_info): return rpc_server.FETCH_MESSAGE (message_number) else: ... else: ... This code in CPS is a horrible, complicated mess, it takes something like 8 callback methods, variables and exceptions have to be passed around in 'continuation' objects. It's hairy because there are three levels of callback state. Ugh. If Python had closures, then it would be a *little* easier, but would still make the average Pythoneer swoon. Closures would let you put the above logic all in one method, but the code would still be 'inside-out'. > Different suggestion: it would be cool to work on completely > separating out the VM from the rest of Python, through some kind of > C-level API specification. I think this is a great idea. I've been staring at python bytecodes a bit lately thinking about how to do something like this, for some subset of Python. [...] Ok, we've all seen the 'stick'. 
I guess I should give an example of the 'carrot': I think that a web server built on such a Python could have the performance/scalability of thttpd, with the ease-of-programming of Roxen. As far as I know, there's nothing like it out there. Medusa would be put out to pasture. 8^) -Sam From guido at CNRI.Reston.VA.US Fri May 14 14:03:31 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 14 May 1999 08:03:31 -0400 Subject: [Python-Dev] 'stackless' python? In-Reply-To: Your message of "Thu, 13 May 1999 19:00:09 PDT." <14139.30970.644343.612721@seattle.nightmare.com> References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com> Message-ID: <199905141203.IAA01808@eric.cnri.reston.va.us> > I know this is disgusting, but could setjmp/longjmp 'automagically' > force a 'recursive call' to jump back into the top-level loop? This > would put some serious restraint on what C called from Python could > do... Forget about it. setjmp/longjmp are invitations to problems. I also assume that they would interfere badly with C++. > I think just about any Scheme implementation has to solve this same > problem... I'll dig through my collection of them for ideas. Anything that assumes knowledge about how the C compiler and/or the CPU and OS lay out the stack is a no-no, because it means that the first thing one has to do for a port to a new architecture is figure out how the stack is laid out. Another thread in this list is porting Python to microplatforms like PalmOS. Typically the scheme Hackers are not afraid to delve deep into the machine, but I refuse to do that -- I think it's too risky. > > In general, I still think it's a cool idea, but I also still think > > that continuations are too complicated for most programmers. 
(This > > comes from the realization that they are too complicated for me!) > > Corollary: even if we had continuations, I'm not sure if this would > > take away the resistance against asyncore/asynchat. Of course I could > > be wrong. > > Theoretically, you could have a bit of code that looked just like > 'normal' imperative code, that would actually be entering and exiting > the context for non-blocking i/o. If it were done right, the same > exact code might even run under 'normal' threads. Yes -- I remember in 92 or 93 I worked out a way to emulate coroutines with regular threads. (I think in cooperation with Steve Majewski.) > Recently I've written an async server that needed to talk to several > other RPC servers, and a mysql server. Pseudo-example, with > possibly-async calls in UPPERCASE: > > auth, archive = db.FETCH_USER_INFO (user) > if verify_login(user,auth): > rpc_server = self.archive_servers[archive] > group_info = rpc_server.FETCH_GROUP_INFO (group) > if valid (group_info): > return rpc_server.FETCH_MESSAGE (message_number) > else: > ... > else: > ... > > This code in CPS is a horrible, complicated mess, it takes something > like 8 callback methods, variables and exceptions have to be passed > around in 'continuation' objects. It's hairy because there are three > levels of callback state. Ugh. Agreed. > If Python had closures, then it would be a *little* easier, but would > still make the average Pythoneer swoon. Closures would let you put > the above logic all in one method, but the code would still be > 'inside-out'. I forget how this worked :-( > > Different suggestion: it would be cool to work on completely > separating out the VM from the rest of Python, through some kind of > C-level API specification. > > I think this is a great idea. I've been staring at python bytecodes a > bit lately thinking about how to do something like this, for some > subset of Python. > > [...] > > Ok, we've all seen the 'stick'.
I guess I should give an example of > the 'carrot': I think that a web server built on such a Python could > have the performance/scalability of thttpd, with the > ease-of-programming of Roxen. As far as I know, there's nothing like > it out there. Medusa would be put out to pasture. 8^) I'm afraid I haven't kept up -- what are Roxen and thttpd? What do they do that Apache doesn't? --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik at pythonware.com Fri May 14 15:16:13 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 14 May 1999 15:16:13 +0200 Subject: [Python-Dev] 'stackless' python? References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us> Message-ID: <001701be9e0b$f1bc4930$f29b12c2@pythonware.com> > I'm afraid I haven't kept up -- what are Roxen and thttpd? What do > they do that Apache doesn't? http://www.roxen.com/ a lean and mean secure web server written in Pike (http://pike.idonex.se/), from a company here in Linköping. From tismer at appliedbiometrics.com Fri May 14 17:15:20 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Fri, 14 May 1999 17:15:20 +0200 Subject: [Python-Dev] 'stackless' python? References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us> Message-ID: <373C3E08.FCCB141B@appliedbiometrics.com> Guido van Rossum wrote: [setjmp/longjmp -no-no] > Forget about it. setjmp/longjmp are invitations to problems. I also > assume that they would interfere badly with C++.
> > > I think just about any Scheme implementation has to solve this same > > problem... I'll dig through my collection of them for ideas. > > Anything that assumes knowledge about how the C compiler and/or the > CPU and OS lay out the stack is a no-no, because it means that the > first thing one has to do for a port to a new architecture is figure > out how the stack is laid out. Another thread in this list is porting > Python to microplatforms like PalmOS. Typically the scheme Hackers > are not afraid to delve deep into the machine, but I refuse to do that > -- I think it's too risky. ... I agree that this is generally bad. While it's a cakewalk to do a stack swap for the few (X86 based:) platforms that I work with. This is much less than a thread change. But on the general issues: Can the Python-calls-C and C-calls-Python problem just be solved by turning the whole VM state into a data structure, including a Python call stack which is independent? Maybe this has been mentioned already. This might give a little slowdown, but opens possibilities like continuation-passing style, and context switches between different interpreter states would be under direct control. Just a little dreaming: Not using threads, but just tiny interpreter incarnations with local state, and a special C call or better a new opcode which activates the next state in some list (of course a Python list). This would automagically produce ICON iterators (duck) and coroutines (cover). If I guess right, continuation passing could be done by just shifting tiny tuples around. Well, Tim, help me :-) [closures] > > I think this is a great idea. I've been staring at python bytecodes a > > bit lately thinking about how to do something like this, for some > > subset of Python. Lumberjack? How is it going? [to Sam] ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break!
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From bwarsaw at cnri.reston.va.us Fri May 14 17:32:51 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Fri, 14 May 1999 11:32:51 -0400 (EDT) Subject: [Python-Dev] 'stackless' python? References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us> <001701be9e0b$f1bc4930$f29b12c2@pythonware.com> Message-ID: <14140.16931.987089.887772@anthem.cnri.reston.va.us> >>>>> "FL" == Fredrik Lundh writes: FL> a lean and mean secure web server written in Pike FL> (http://pike.idonex.se/), from a company here in FL> Linköping. Interesting off-topic Pike connection. My co-maintainer for CC-Mode originally came on board to add Pike support, which has a syntax similar enough to C to be easily integrated. I think I've had as much success convincing him to use Python as he's had convincing me to use Pike :-) -Barry From gstein at lyra.org Fri May 14 23:54:02 1999 From: gstein at lyra.org (Greg Stein) Date: Fri, 14 May 1999 14:54:02 -0700 Subject: [Python-Dev] Roxen (was Re: [Python-Dev] 'stackless' python?) References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us> <001701be9e0b$f1bc4930$f29b12c2@pythonware.com> <14140.16931.987089.887772@anthem.cnri.reston.va.us> Message-ID: <373C9B7A.3676A910@lyra.org> Barry A.
Warsaw wrote: > > >>>>> "FL" == Fredrik Lundh writes: > > FL> a lean and mean secure web server written in Pike > FL> (http://pike.idonex.se/), from a company here in > FL> Linköping. > > Interesting off-topic Pike connection. My co-maintainer for CC-Mode > originally came on board to add Pike support, which has a syntax similar > enough to C to be easily integrated. I think I've had as much success > convincing him to use Python as he's had convincing me to use Pike :-) Heh. Pike is an outgrowth of the MUD world's LPC programming language. A guy named "Profezzorn" started a project (in '94?) to redevelop an LPC compiler/interpreter ("driver") from scratch to avoid some licensing constraints. The project grew into a generalized network handler, since MUDs' typical designs are excellent for these tasks. From there, you get the Roxen web server. Cheers, -g -- Greg Stein, http://www.lyra.org/ From rushing at nightmare.com Sat May 15 01:36:11 1999 From: rushing at nightmare.com (rushing at nightmare.com) Date: Fri, 14 May 1999 16:36:11 -0700 (PDT) Subject: [Python-Dev] 'stackless' python? In-Reply-To: <199905141203.IAA01808@eric.cnri.reston.va.us> References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us> Message-ID: <14140.44469.848840.740112@seattle.nightmare.com> Guido van Rossum writes: > > If Python had closures, then it would be a *little* easier, but would > > still make the average Pythoneer swoon. Closures would let you put > > the above logic all in one method, but the code would still be > > 'inside-out'. > > I forget how this worked :-( [with a faked-up lambda-ish syntax] def thing (a): return do_async_job_1 (a, lambda (b): if (a>1): do_async_job_2a (b, lambda (c): [...] ) else: do_async_job_2b (a,b, lambda (d,e,f): [...]
) ) The call to do_async_job_1 passes 'a', and a callback, which is specified 'in-line'. You can follow the logic of something like this more easily than if each lambda is spun off into a different function/method. > > I think that a web server built on such a Python could have the > > performance/scalability of thttpd, with the ease-of-programming > > of Roxen. As far as I know, there's nothing like it out there. > > Medusa would be put out to pasture. 8^) > > I'm afraid I haven't kept up -- what are Roxen and thttpd? What do > they do that Apache doesn't? thttpd (& Zeus, Squid, Xitami) use select()/poll() to gain performance and scalability, but suffer from the same programmability problem as Medusa (only worse, 'cause they're in C). Roxen is written in Pike, a c-like language with gc, threads, etc... Roxen is I think now the official 'GNU Web Server'. Here's an interesting web-server comparison chart: http://www.acme.com/software/thttpd/benchmarks.html -Sam From guido at CNRI.Reston.VA.US Sat May 15 04:23:24 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 14 May 1999 22:23:24 -0400 Subject: [Python-Dev] 'stackless' python? In-Reply-To: Your message of "Fri, 14 May 1999 16:36:11 PDT." <14140.44469.848840.740112@seattle.nightmare.com> References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us> <14140.44469.848840.740112@seattle.nightmare.com> Message-ID: <199905150223.WAA02457@eric.cnri.reston.va.us> > def thing (a): > return do_async_job_1 (a, > lambda (b): > if (a>1): > do_async_job_2a (b, > lambda (c): > [...] > ) > else: > do_async_job_2b (a,b, > lambda (d,e,f): > [...] > ) > ) > > The call to do_async_job_1 passes 'a', and a callback, which is > specified 'in-line'. 
> You can follow the logic of something like this more easily than if
> each lambda is spun off into a different function/method.

I agree that it is still ugly. > http://www.acme.com/software/thttpd/benchmarks.html I see. Any pointers to a graph of thttpd market share? --Guido van Rossum (home page: http://www.python.org/~guido/) From tim_one at email.msn.com Sat May 15 09:51:00 1999 From: tim_one at email.msn.com (Tim Peters) Date: Sat, 15 May 1999 03:51:00 -0400 Subject: [Python-Dev] 'stackless' python? In-Reply-To: <199905141203.IAA01808@eric.cnri.reston.va.us> Message-ID: <000701be9ea7$acab7f40$159e2299@tim> [GvR] > ... > Anything that assumes knowledge about how the C compiler and/or the > CPU and OS lay out the stack is a no-no, because it means that the > first thing one has to do for a port to a new architecture is figure > out how the stack is laid out. Another thread in this list is porting > Python to microplatforms like PalmOS. Typically the Scheme hackers > are not afraid to delve deep into the machine, but I refuse to do that > -- I think it's too risky. The Icon language needs a bit of platform-specific context-switching assembly code to support its full coroutine features, although its bread-and-butter generators ("semi-coroutines") don't need anything special. The result is that Icon ports sometimes limp for a year before they support full coroutines, waiting for someone wizardly enough to write the necessary code. This can, in fact, be quite difficult; e.g., on machines with HW register windows (where "the stack" can be a complicated beast half buried in hidden machine state, sometimes needing kernel privilege to uncover). Not attractive. Generators are, though. threads-too-ly y'rs - tim From tim_one at email.msn.com Sat May 15 09:51:03 1999 From: tim_one at email.msn.com (Tim Peters) Date: Sat, 15 May 1999 03:51:03 -0400 Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <373C3E08.FCCB141B@appliedbiometrics.com> Message-ID: <000801be9ea7$ae45f560$159e2299@tim> [Christian Tismer] > ... > But on the general issues: > Can the Python-calls-C and C-calls-Python problem just be solved > by turning the whole VM state into a data structure, including > a Python call stack which is independent? Maybe this has been > mentioned already. The problem is that when C calls Python, any notion of continuation has to include C's state too, else resuming the continuation won't return into C correctly. The C code that *implements* Python could be reworked to support this, but in the general case you've got some external C extension module calling into Python, and then Python hasn't a clue about its caller's state. I'm not a fan of continuations myself; coroutines can be implemented faithfully via threads (I posted a rather complete set of Python classes for that in the pre-DejaNews days, a bit more flexible than Icon's coroutines); and: > This would automagically produce ICON iterators (duck) > and coroutines (cover). Icon iterators/generators could be implemented today if anyone bothered (Majewski essentially implemented them back around '93 already, but seemed to lose interest when he realized it couldn't be extended to full continuations, because of C/Python stack intertwingling). > If I guess right, continuation passing could be done > by just shifting tiny tuples around. Well, Tim, help me :-) Python-calling-Python continuations should be easily doable in a "stackless" Python; the key ideas were already covered in this thread, I think. The thing that makes generators so much easier is that they always return directly to their caller, at the point of call; so no C frame can get stuck in the middle even under today's implementation; it just requires not deleting the generator's frame object, and adding an opcode to *resume* the frame's execution the next time the generator is called. 
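Tim's sketch here (don't delete the generator's frame object; add an opcode that resumes it the next time the generator is called) is, in hindsight, exactly what 'yield' does in modern Python. A sketch in today's syntax, which did not exist in the 1.5 era; the (left, value, right) tuple encoding of a tree is invented for the example:

```python
# Modern-Python sketch of Tim's generator idea: the frame is not thrown
# away at a 'yield'; the next request resumes it where it left off.
# Trees are encoded as (left, value, right) tuples or None (made up here).
def inorder(tree):
    if tree is not None:
        left, value, right = tree
        for v in inorder(left):   # suspended sub-frames all stay alive
            yield v
        yield value               # suspend here; resume on next request
        for v in inorder(right):
            yield v

t = (((None, 1, None), 2, None), 3, (None, 4, None))
print(list(inorder(t)))   # [1, 2, 3, 4]
```

The tree traversal is no accident: it is the canonical case where the caller pulls values out one at a time while the producer's recursion stays suspended in mid-flight.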
Unlike in Icon, it wouldn't even need to be tied to a funky notion of goal-directed evaluation. don't-try-to-traverse-a-tree-without-it-ly y'rs - tim From gstein at lyra.org Sat May 15 10:17:15 1999 From: gstein at lyra.org (Greg Stein) Date: Sat, 15 May 1999 01:17:15 -0700 Subject: [Python-Dev] 'stackless' python? References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us> <14140.44469.848840.740112@seattle.nightmare.com> <199905150223.WAA02457@eric.cnri.reston.va.us> Message-ID: <373D2D8B.390C523C@lyra.org> Guido van Rossum wrote: > ... > > http://www.acme.com/software/thttpd/benchmarks.html > > I see. Any pointers to a graph of thttpd market share? thttpd currently has about 70k sites (of 5.4mil found by Netcraft). That puts it at #6. However, it is interesting to note that 60k of those sites are in the .uk domain. I can't figure out who is running it, but I would guess that a large UK-based ISP is hosting a bunch of domains on thttpd. It is somewhat difficult to navigate the various reports (and it never fails that the one you want is not present), but the data is from Netcraft's survey at: http://www.netcraft.com/survey/ Cheers, -g -- Greg Stein, http://www.lyra.org/ From rushing at nightmare.com Sun May 16 13:10:18 1999 From: rushing at nightmare.com (rushing at nightmare.com) Date: Sun, 16 May 1999 04:10:18 -0700 (PDT) Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <81365478@toto.iv> Message-ID: <14142.40867.103424.764346@seattle.nightmare.com> Tim Peters writes: > I'm not a fan of continuations myself; coroutines can be > implemented faithfully via threads (I posted a rather complete set > of Python classes for that in the pre-DejaNews days, a bit more > flexible than Icon's coroutines); and: Continuations are more powerful than coroutines, though I admit they're a bit esoteric. I programmed in Scheme for years without seeing the need for them. But when you need 'em, you *really* need 'em. No way around it. For my purposes (massively scalable single-process servers and clients) threads don't cut it... for example I have a mailing-list exploder that juggles up to 2048 simultaneous SMTP connections. I think it can go higher - I've tested select() on FreeBSD with 16,000 file descriptors. [...] BTW, I have actually made progress borrowing a bit of code from SCM. It uses the stack-copying technique, along with setjmp/longjmp. It's too ugly and unportable to be a real candidate for inclusion in Official Python. [i.e., if it could be made to work it should be considered a stopgap measure for the desperate]. I haven't tested it thoroughly, but I have successfully saved and invoked (and reinvoked) a continuation. Caveat: I have to turn off Py_DECREF in order to keep it from crashing.

| >>> import callcc
| >>> saved = None
| >>> def thing(n):
| ...     if n == 2:
| ...         global saved
| ...         saved = callcc.new()
| ...     print 'n==',n
| ...     if n == 0:
| ...         print 'Done!'
| ...     else:
| ...         thing (n-1)
| ...
| >>> thing (5)
| n== 5
| n== 4
| n== 3
| n== 2
| n== 1
| n== 0
| Done!
| >>> saved
| <Continuation object at 80d30d0>
| >>> saved.throw (0)
| n== 2
| n== 1
| n== 0
| Done!
| >>> saved.throw (0)
| n== 2
| n== 1
| n== 0
| Done!
| >>>

I will probably not be able to work on this for a while (baby due any day now), so anyone is welcome to dive right in.
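The observable behavior of Sam's session can be approximated without his callcc module by saving just enough state to re-enter the recursion; a closure-based stand-in (all names invented here), which of course captures none of the actual stack the way a real continuation does:

```python
# Closure-based imitation of the callcc session: instead of snapshotting
# the machine stack, remember only the value n == 2 and re-enter there.
log = []          # collected output, so the run can be inspected afterwards

saved = None      # plays the role of the saved continuation

def thing(n):
    global saved
    if n == 2 and saved is None:
        saved = lambda: thing(2)   # "capture": remember how to restart here
    log.append('n== %d' % n)
    if n == 0:
        log.append('Done!')
        return
    thing(n - 1)

thing(5)     # n== 5 ... n== 0, Done!
saved()      # re-enters at n == 2, like saved.throw (0) in the session
print(log)
```

This only works because the example happens to restart from a known point with no live intermediate state; that limitation is exactly why Sam needs the real stack-copying machinery.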
I don't have much experience wading through gdb tracking down reference bugs, so I'm hoping a brave soul will pick up where I left off. 8^) http://www.nightmare.com/stuff/python-callcc.tar.gz ftp://www.nightmare.com/stuff/python-callcc.tar.gz -Sam From tismer at appliedbiometrics.com Sun May 16 17:31:01 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Sun, 16 May 1999 17:31:01 +0200 Subject: [Python-Dev] 'stackless' python? References: <14142.40867.103424.764346@seattle.nightmare.com> Message-ID: <373EE4B5.6EE6A678@appliedbiometrics.com> rushing at nightmare.com wrote: [...] > BTW, I have actually made progress borrowing a bit of code from SCM. > It uses the stack-copying technique, along with setjmp/longjmp. It's > too ugly and unportable to be a real candidate for inclusion in > Official Python. [i.e., if it could be made to work it should be > considered a stopgap measure for the desperate]. I tried it and built it as a Win32 .pyd file, and it seems to work, but... > I haven't tested it thoroughly, but I have successfully saved and > invoked (and reinvoked) a continuation. Caveat: I have to turn off > Py_DECREF in order to keep it from crashing. Indeed, and this seems to be a problem too hard to solve without lots of work. Since you keep a snapshot of the current machine stack, it contains a number of object references which were valid when the snapshot was taken, but many are most probably invalid when you restart the continuation. I guess incref-ing all currently alive objects on the interpreter stack would be the minimum, maybe more. A tuple of necessary references could be used as an attribute of a Continuation object. I will look at how difficult this is. ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break!
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From tismer at appliedbiometrics.com Sun May 16 20:31:01 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Sun, 16 May 1999 20:31:01 +0200 Subject: [Python-Dev] 'stackless' python? References: <14142.40867.103424.764346@seattle.nightmare.com> <373EE4B5.6EE6A678@appliedbiometrics.com> Message-ID: <373F0EE5.A8DE00C5@appliedbiometrics.com> Christian Tismer wrote: > > rushing at nightmare.com wrote: [...] > > I haven't tested it thoroughly, but I have successfully saved and > > invoked (and reinvoked) a continuation. Caveat: I have to turn off > > Py_DECREF in order to keep it from crashing. It is possible, but a little hard. To take a working snapshot of the current thread's stack, one needs not only the stack snapshot which continue.c provides, but also a restorable copy of all frame objects involved so far. A copy of the current frame chain must be built, with proper reference counting of all involved elements. And this is the crux: The current stack pointer of the VM is not present in the frame objects, but hangs around somewhere on the machine stack. Two solutions: 1) modify PyFrameObject by adding a field which holds the stack pointer, when a function is called. I don't like to change the VM in any way for this. 2) use the lasti field which holds the last VM instruction offset. Then scan the opcodes of the code object and calculate the current stack level. This is possible since Guido's code generator creates code with the stack level lexically bound to the code offset. Now we can incref all the referenced objects in the frame. This must be done for the whole chain, which is copied and relinked during that. This chain is then held as a property of the continuation object. 
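Christian's option 2, recomputing the value-stack depth from the instruction offset, can be illustrated with the dis module of much later Pythons (dis.stack_effect only appeared in 3.4, so this is anachronistic); the straight-line walk below ignores jumps, which real code would have to follow, depth per branch target, just as the compiler does:

```python
import dis

def stack_depth_at(code, target_offset):
    # Sum the stack effect of every instruction before target_offset.
    # Valid only for straight-line bytecode: this is the idea of scanning
    # opcodes to recover the stack level bound to a given code offset.
    depth = 0
    for instr in dis.get_instructions(code):
        if instr.offset >= target_offset:
            break
        depth += dis.stack_effect(instr.opcode, instr.arg)
    return depth

def add(a, b):
    return a + b

print(stack_depth_at(add.__code__, 0))   # 0: nothing pushed yet
```

At offset 0 the depth is zero, and after the final RETURN the net effect is zero again; in between, the depth is exactly the quantity the frame-copying code would need in order to know how many stack slots to incref.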
To throw the continuation, the current frame chain must be cleared, and the saved one is inserted, together with the machine stack operation which Sam already has. A little hefty, isn't it? ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From tim_one at email.msn.com Mon May 17 07:42:59 1999 From: tim_one at email.msn.com (Tim Peters) Date: Mon, 17 May 1999 01:42:59 -0400 Subject: [Python-Dev] 'stackless' python? In-Reply-To: <14142.40867.103424.764346@seattle.nightmare.com> Message-ID: <000f01bea028$1f75c360$fb9e2299@tim> [Sam] > Continuations are more powerful than coroutines, though I admit > they're a bit esoteric. "More powerful" is a tedious argument you should always avoid. > I programmed in Scheme for years without seeing the need for them. > But when you need 'em, you *really* need 'em. No way around it. > > For my purposes (massively scalable single-process servers and > clients) threads don't cut it... for example I have a mailing-list > exploder that juggles up to 2048 simultaneous SMTP connections. I > think it can go higher - I've tested select() on FreeBSD with 16,000 > file descriptors. The other point being that you want to avoid "inside out" logic, though, right? Earlier you posted a kind of ideal: Recently I've written an async server that needed to talk to several other RPC servers, and a mysql server. Pseudo-example, with possibly-async calls in UPPERCASE:

    auth, archive = db.FETCH_USER_INFO (user)
    if verify_login(user,auth):
        rpc_server = self.archive_servers[archive]
        group_info = rpc_server.FETCH_GROUP_INFO (group)
        if valid (group_info):
            return rpc_server.FETCH_MESSAGE (message_number)
        else:
            ...
    else:
        ...
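Sam's pseudo-example above, straight-line logic with suspension points in UPPERCASE, is what async/await eventually made routine. A sketch in modern asyncio (a large anachronism for 1999), with every name invented and the real I/O replaced by stubs:

```python
import asyncio

# Stub "possibly-async" calls; each sleep(0) marks a point where the
# event loop could be servicing thousands of other connections.
async def fetch_user_info(user):
    await asyncio.sleep(0)
    return ('secret', 'archive-1')

async def fetch_group_info(group):
    await asyncio.sleep(0)
    return {'name': group}

async def fetch_message(number):
    await asyncio.sleep(0)
    return 'message #%d' % number

async def handler(user, group, number):
    # Same shape as the pseudo-example, but not inside-out: no callbacks.
    auth, archive = await fetch_user_info(user)
    if auth == 'secret':                      # stands in for verify_login
        group_info = await fetch_group_info(group)
        if group_info:                        # stands in for valid()
            return await fetch_message(number)
    return None

print(asyncio.run(handler('sam', 'grp', 42)))   # message #42
```

Each await is precisely the "capture a continuation, park it in the select loop" step discussed in this thread, only spelled as a keyword and scheduled by the runtime.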
I assume you want to capture a continuation object in the UPPERCASE methods, store it away somewhere, run off to your select/poll/whatever loop, and have it invoke the stored continuation objects as the data they're waiting for arrives. If so, that's got to be the nicest use for continuations I've seen! All invisible to the end user. I don't know how to fake it pleasantly without threads, either, and understand that threads aren't appropriate for resource reasons. So I don't have a nice alternative. > ...

> | >>> import callcc
> | >>> saved = None
> | >>> def thing(n):
> | ...     if n == 2:
> | ...         global saved
> | ...         saved = callcc.new()
> | ...     print 'n==',n
> | ...     if n == 0:
> | ...         print 'Done!'
> | ...     else:
> | ...         thing (n-1)
> | ...
> | >>> thing (5)
> | n== 5
> | n== 4
> | n== 3
> | n== 2
> | n== 1
> | n== 0
> | Done!
> | >>> saved
> | <Continuation object at 80d30d0>
> | >>> saved.throw (0)
> | n== 2
> | n== 1
> | n== 0
> | Done!
> | >>> saved.throw (0)
> | n== 2
> | n== 1
> | n== 0
> | Done!
> | >>>

Suppose the driver were in a script instead:

    thing(5)             # line 1
    print repr(saved)    # line 2
    saved.throw(0)       # line 3
    saved.throw(0)       # line 4

Then the continuation would (eventually) "return to" the "print repr(saved)" and we'd get an infinite output tail of:

    <Continuation object at 80d30d0>
    n== 2
    n== 1
    n== 0
    Done!
    <Continuation object at 80d30d0>
    n== 2
    n== 1
    n== 0
    Done!
    <Continuation object at 80d30d0>
    n== 2
    n== 1
    n== 0
    Done!
    <Continuation object at 80d30d0>
    n== 2
    n== 1
    n== 0
    Done!
    ...

and never reach line 4. Right? That's the part that Guido hates. takes-one-to-know-one-ly y'rs - tim From tismer at appliedbiometrics.com Mon May 17 09:07:22 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Mon, 17 May 1999 09:07:22 +0200 Subject: [Python-Dev] 'stackless' python? References: <000f01bea028$1f75c360$fb9e2299@tim> Message-ID: <373FC02A.69F2D912@appliedbiometrics.com> Tim Peters wrote: [to Sam] > The other point being that you want to avoid "inside out" logic, though, > right?
Earlier you posted a kind of ideal:
>
> Recently I've written an async server that needed to talk to several
> other RPC servers, and a mysql server. Pseudo-example, with
> possibly-async calls in UPPERCASE:
>
>     auth, archive = db.FETCH_USER_INFO (user)
>     if verify_login(user,auth):
>         rpc_server = self.archive_servers[archive]
>         group_info = rpc_server.FETCH_GROUP_INFO (group)
>         if valid (group_info):
>             return rpc_server.FETCH_MESSAGE (message_number)
>         else:
>             ...
>     else:
>         ...
>
> I assume you want to capture a continuation object in the UPPERCASE methods,
> store it away somewhere, run off to your select/poll/whatever loop, and have
> it invoke the stored continuation objects as the data they're waiting for
> arrives.
>
> If so, that's got to be the nicest use for continuations I've seen! All
> invisible to the end user. I don't know how to fake it pleasantly without
> threads, either, and understand that threads aren't appropriate for resource
> reasons. So I don't have a nice alternative.

It can always be done with threads, but also without. Tried it last night, with proper refcounting, and it wasn't too easy since I had to duplicate the Python frame chain. ...

> Suppose the driver were in a script instead:
>
>     thing(5)             # line 1
>     print repr(saved)    # line 2
>     saved.throw(0)       # line 3
>     saved.throw(0)       # line 4
>
> Then the continuation would (eventually) "return to" the "print repr(saved)"
> and we'd get an infinite output tail of:
>
>     <Continuation object at 80d30d0>
>     n== 2
>     n== 1
>     n== 0
>     Done!
>     <Continuation object at 80d30d0>
>     n== 2
>     n== 1
>     n== 0
>     Done!

This is at the moment exactly what happens, with the difference that after some repetitions we GPF due to dangling references to too-often-decref'ed objects. My incref-ing prepares for just one reincarnation and should prevent a second call. But this will be solved, soon. > and never reach line 4. Right? That's the part that Guido hates. Yup.
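Christian's aside that "it can always be done with threads" can be made concrete: a coroutine pair faked with one worker thread and two queues as handoff channels. This is only a sketch (not Tim's lost pre-DejaNews classes), and like generator .send the first transfer merely primes the coroutine:

```python
import threading, queue

class Coroutine:
    # Fake a coroutine with a daemon thread; the two queues make the
    # caller and the coroutine take strict turns, so at most one of
    # them is ever running.
    def __init__(self, func):
        self._in, self._out = queue.Queue(), queue.Queue()
        self._thread = threading.Thread(
            target=lambda: self._out.put(func(self)), daemon=True)
        self._started = False

    def transfer(self, value=None):
        # Send a value into the coroutine, block until it yields back.
        if not self._started:
            self._started = True
            self._thread.start()
        self._in.put(value)
        return self._out.get()

    def suspend(self, value=None):
        # Called inside the coroutine: hand a value back, wait for resume.
        self._out.put(value)
        return self._in.get()

def counter(co):
    result = None
    while True:
        n = co.suspend(result)   # yield previous result, receive next input
        result = n * 2

co = Coroutine(counter)
print(co.transfer(5))    # None: priming run up to the first suspend
print(co.transfer(21))   # 10
print(co.transfer(0))    # 42
```

The cost Sam objects to is visible here: every transfer pays for two queue operations and a thread context switch, against a frame swap for the stackless version.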
With a little counting, it was easy to survive:

    def main():
        global a
        a=2
        thing (5)
        a=a-1
        if a:
            saved.throw (0)

Weird enough, and it needs a much better interface. But finally I'm quite happy that it worked so smoothly after just a couple of hours (well, about six :) ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From rushing at nightmare.com Mon May 17 11:46:29 1999 From: rushing at nightmare.com (rushing at nightmare.com) Date: Mon, 17 May 1999 02:46:29 -0700 (PDT) Subject: [Python-Dev] 'stackless' python? In-Reply-To: <000f01bea028$1f75c360$fb9e2299@tim> References: <14142.40867.103424.764346@seattle.nightmare.com> <000f01bea028$1f75c360$fb9e2299@tim> Message-ID: <14143.56604.21827.891993@seattle.nightmare.com> Tim Peters writes: > [Sam] > > Continuations are more powerful than coroutines, though I admit > > they're a bit esoteric. > > "More powerful" is a tedious argument you should always avoid. More powerful in the sense that you can use continuations to build lots of different control structures (coroutines, backtracking, exceptions), but not vice versa. Kinda like a better tool for blowing one's own foot off. 8^)

> Suppose the driver were in a script instead:
>
>     thing(5)             # line 1
>     print repr(saved)    # line 2
>     saved.throw(0)       # line 3
>     saved.throw(0)       # line 4
>
> Then the continuation would (eventually) "return to" the "print repr(saved)"
> and we'd get an infinite output tail [...]
>
> and never reach line 4. Right? That's the part that Guido hates.

Yes... the continuation object so far isn't very usable. It needs a driver of some kind around it. In the Scheme world, there are two common ways of using continuations - let/cc and call/cc.
[call/cc is what is in the standard, it's official name is call-with-current-continuation] let/cc stores the continuation in a variable binding, while introducing a new scope. It requires a change to the underlying language: (+ 1 (let/cc escape (...) (escape 34))) => 35 'escape' is a function that when called will 'resume' with whatever follows the let/cc clause. In this case it would continue with the addition... call/cc is a little trickier, but doesn't require any change to the language... instead of making a new binding directly, you pass in a function that will receive the binding: (+ 1 (call/cc (lambda (escape) (...) (escape 34)))) => 35 In words, it's much more frightening: "call/cc is a function, that when called with a function as an argument, will pass that function an argument that is a new function, which when called with a value will resume the computation with that value as the result of the entire expression" Phew. In Python, an example might look like this: SAVED = None def save_continuation (k): global SAVED SAVED = k def thing(): [...] value = callcc (lambda k: save_continuation(k)) # or more succinctly: def thing(): [...] value = callcc (save_continuation) In order to do useful work like passing values back and forth between coroutines, we have to have some way of returning a value from the continuation when it is reinvoked. I should emphasize that most folks will never see call/cc 'in the raw', it will usually have some nice wrapper around to implement whatever construct is needed. -Sam From arw at ifu.net Mon May 17 20:06:18 1999 From: arw at ifu.net (Aaron Watters) Date: Mon, 17 May 1999 14:06:18 -0400 Subject: [Python-Dev] coroutines vs. continuations vs. threads Message-ID: <37405A99.1DBAF399@ifu.net> The illustrious Sam Rushing avers: >Continuations are more powerful than coroutines, though I admit >they're a bit esoteric. I programmed in Scheme for years without >seeing the need for them. But when you need 'em, you *really* need >'em. 
No way around it. Frankly, I think I thought I understood this once but now I know I don't. How're continuations more powerful than coroutines? And why can't they be implemented using threads (and semaphores etc)? ...I'm not promising I'll understand the answer... -- Aaron Watters === I taught I taw a putty-cat! From gmcm at hypernet.com Mon May 17 21:18:43 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Mon, 17 May 1999 14:18:43 -0500 Subject: [Python-Dev] coroutines vs. continuations vs. threads In-Reply-To: <37405A99.1DBAF399@ifu.net> Message-ID: <1285153546-166193857@hypernet.com> The estimable Aaron Watters queries: > The illustrious Sam Rushing avers: > >Continuations are more powerful than coroutines, though I admit > >they're a bit esoteric. I programmed in Scheme for years without > >seeing the need for them. But when you need 'em, you *really* need > >'em. No way around it. > > Frankly, I think I thought I understood this once but now I know I > don't. How're continuations more powerful than coroutines? And why > can't they be implemented using threads (and semaphores etc)? I think Sam's (immediate ) problem is that he can't afford threads - he may have hundreds to thousands of these suckers. As a fuddy-duddy old imperative programmer, I'm inclined to think "state machine". But I'd guess that functional-ophiles probably see that as inelegant. (Safe guess - they see _anything_ that isn't functional as inelegant!). crude-but-not-rude-ly y'rs - Gordon From jeremy at cnri.reston.va.us Mon May 17 20:43:34 1999 From: jeremy at cnri.reston.va.us (Jeremy Hylton) Date: Mon, 17 May 1999 14:43:34 -0400 (EDT) Subject: [Python-Dev] coroutines vs. continuations vs. 
threads In-Reply-To: <37405A99.1DBAF399@ifu.net> References: <37405A99.1DBAF399@ifu.net> Message-ID: <14144.24242.128959.726878@bitdiddle.cnri.reston.va.us> >>>>> "AW" == Aaron Watters writes: AW> The illustrious Sam Rushing avers: >> Continuations are more powerful than coroutines, though I admit >> they're a bit esoteric. I programmed in Scheme for years without >> seeing the need for them. But when you need 'em, you *really* >> need 'em. No way around it. AW> Frankly, I think I thought I understood this once but now I know AW> I don't. How're continuations more powerful than coroutines? AW> And why can't they be implemented using threads (and semaphores AW> etc)? I think I understood, too. I'm hoping that someone will debug my answer and enlighten us both. A continuation is a mechanism for making control flow explicit: a means of naming and manipulating "the rest of the program." In Scheme terms, the continuation is the function that the value of the current expression should be passed to. The call/cc mechanism lets you capture the current continuation and explicitly call on it. The most typical use of call/cc is non-local exits, but it gives you incredible flexibility for implementing your control flow. I'm fuzzy on coroutines, as I've only seen them in "Structured Programming" (which is as old as I am :-) and never actually used them. The basic idea is that when a coroutine calls another coroutine, control is transferred to the second coroutine at the point at which it last left off (by itself calling another coroutine or by detaching, which returns control to the lexically enclosing scope). It seems to me that coroutines are an example of the kind of control structure that you could build with continuations. It's not clear that the reverse is true. I have to admit that I'm a bit unclear on the motivation for all this. As Gordon said, the state machine approach seems like it would be a good approach.
Jeremy From klm at digicool.com Mon May 17 21:08:57 1999 From: klm at digicool.com (Ken Manheimer) Date: Mon, 17 May 1999 15:08:57 -0400 Subject: [Python-Dev] coroutines vs. continuations vs. threads Message-ID: <613145F79272D211914B0020AFF640190BEEDE@gandalf.digicool.com> Jeremy Hylton: > I have to admit that I'm a bit unclear on the motivation for all > this. As Gordon said, the state machine approach seems like it would > be a good approach. If i understand what you mean by state machine programming, it's pretty inherently uncompartmented: all the combinations of state variables need to be accounted for, so the number of states grows factorially with the number of state vars; in general it's awkward. The advantage of going with what functional folks come up with, like continuations, is that it tends to be well compartmented - functional. (Come to think of it, i suppose that compartmentalization as opposed to state is their mania.) As abstract as i can be (because i hardly know what i'm talking about) (but i have done some specifically finite state machine programming, and did not enjoy it), Ken klm at digicool.com From arw at ifu.net Mon May 17 21:20:13 1999 From: arw at ifu.net (Aaron Watters) Date: Mon, 17 May 1999 15:20:13 -0400 Subject: [Python-Dev] coroutines vs. continuations vs. threads References: <1285153546-166193857@hypernet.com> Message-ID: <37406BED.95AEB896@ifu.net> The ineffable Gordon McMillan retorts: > As a fuddy-duddy old imperative programmer, I'm inclined to think > "state machine". But I'd guess that functional-ophiles probably see > that as inelegant. (Safe guess - they see _anything_ that isn't > functional as inelegant!).
As a fellow fuddy-duddy I'd agree, except that if you write properly layered software you have to unroll and reroll all those layers for every transition of the multi-level state machine, and even though with proper discipline it can be implemented without becoming hideous, it still adds significant overhead compared to "stop right here and come back later", which could be implemented using threads/coroutines(?)/continuations. I think this is particularly true in Python with the relatively high function call overhead. Or maybe I'm out in left field doing cartwheels... I guess the question of interest is why are threads insufficient? I guess they have system limitations on the number of threads or other limitations that wouldn't be a problem with continuations? If there aren't a *lot* of situations where coroutines are vital, I'd be hesitant to do major surgery. But I'm a fuddy-duddy. -- Aaron Watters === I did! I did! From tismer at appliedbiometrics.com Mon May 17 22:03:01 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Mon, 17 May 1999 22:03:01 +0200 Subject: [Python-Dev] coroutines vs. continuations vs. threads References: <1285153546-166193857@hypernet.com> <37406BED.95AEB896@ifu.net> Message-ID: <374075F5.F29B4EAB@appliedbiometrics.com> Aaron Watters wrote: > > The ineffable Gordon McMillan retorts: > > > As a fuddy-duddy old imperative programmer, I'm inclined to think > > > "state machine". But I'd guess that functional-ophiles probably see > > > that as inelegant. (Safe guess - they see _anything_ that isn't > > > functional as inelegant!).
> > As a fellow fuddy-duddy I'd agree, except that if you write properly layered > > software you have to unroll and reroll all those layers for every > > transition of the multi-level state machine, and even though with proper > > discipline it can be implemented without becoming hideous, it still adds > > significant overhead compared to "stop right here and come back later", > > which could be implemented using threads/coroutines(?)/continuations. Coroutines are most elegant here, since (for a simple example) they are a symmetric pair of functions which call each other. There is neither the "one pulls, the other pushes" asymmetry, nor the need to maintain state and be controlled by a supervisor function. > I think this is particularly true in Python with the relatively high > function call overhead. Or maybe I'm out in left field doing cartwheels... > I guess the question of interest is why are threads insufficient? I guess > they have system limitations on the number of threads or other limitations > that wouldn't be a problem with continuations? If there aren't a *lot* of > situations where coroutines are vital, I'd be hesitant to do major > surgery. For me (as always) most interesting is the possible speed of coroutines. They involve no threads overhead, no locking, no nothing. Python supports it better than expected. If the stack level of two code objects is the same at a switching point, the whole switch is nothing more than swapping two frame objects, and we're done. This might be even cheaper than general call/cc, like a function call. Sam's prototype works already, with no change to the interpreter (but knowledge of Python frames, and a .dll of course). I think we'll continue a while. continuously - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break!
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From gmcm at hypernet.com Tue May 18 00:17:25 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Mon, 17 May 1999 17:17:25 -0500 Subject: [Python-Dev] coroutines vs. continuations vs. threads In-Reply-To: <374075F5.F29B4EAB@appliedbiometrics.com> Message-ID: <1285142823-166838954@hypernet.com> Co-Christian-routines Tismer continues: > Aaron Watters wrote: > > > > The ineffible Gordon McMillan retorts: > > > > > As a fuddy-duddy old imperative programmer, I'm inclined to think > > > "state machine". But I'd guess that functional-ophiles probably see > > > that as inelegant. (Safe guess - they see _anything_ that isn't > > > functional as inelegant!). > > > > As a fellow fuddy-duddy I'd agree except that if you write properlylayered > > software you have to unrole and rerole all those layers for every > > transition of the multi-level state machine, and even though with proper > > discipline it can be implemented without becoming hideous, it still adds > > significant overhead compared to "stop right here and come back later" > > which could be implemented using threads/coroutines(?)/continuations. > > Coroutines are most elegant here, since (fir a simple example) > they are a symmetric pair of functions which call each other. > There is neither the one-pulls, the other pushes asymmetry, nor the > need to maintain state and be controlled by a supervisor function. Well, the state maintains you, instead of the other way 'round. (Any other ex-Big-Blue-ers out there that used to play these games with checkpoint and SyncSort?). I won't argue elegance. Just a couple points: - there's an art to writing state machines which is largely unrecognized (most of them are unnecessarily horrid). 
- a multiplexed solution (vs a threaded solution) requires that something be inside out. In one case it's your code, in the other, your understanding of the problem. Neither is trivial. Not to be discouraging - as long as your solution doesn't involve using regexps on bytecode , I say go for it! - Gordon From guido at CNRI.Reston.VA.US Tue May 18 06:03:34 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 18 May 1999 00:03:34 -0400 Subject: [Python-Dev] 'stackless' python? In-Reply-To: Your message of "Mon, 17 May 1999 02:46:29 PDT." <14143.56604.21827.891993@seattle.nightmare.com> References: <14142.40867.103424.764346@seattle.nightmare.com> <000f01bea028$1f75c360$fb9e2299@tim> <14143.56604.21827.891993@seattle.nightmare.com> Message-ID: <199905180403.AAA04772@eric.cnri.reston.va.us> Sam (& others), I thought I understood what continuations were, but the examples of what you can do with them so far don't clarify the matter at all. Perhaps it would help to explain what a continuation actually does with the run-time environment, instead of giving examples of how to use them and what the result is? Here's a start of my own understanding (brief because I'm on a 28.8k connection which makes my ordinary typing habits in Emacs very painful). 1. All program state is somehow contained in a single execution stack. This includes globals (which are simply name bindings in the bottom stack frame). It also includes a code pointer for each stack frame indicating where the function corresponding to that stack frame is executing (this is the return address if there is a newer stack frame, or the current instruction for the newest frame). 2. A continuation does something equivalent to making a copy of the entire execution stack. This can probably be done lazily. There are probably lots of details. I also expect that Scheme's semantic model is different than Python here -- e.g. does it matter whether deep or shallow copies are made? I.e.
are there mutable *objects* in Scheme? (I know there are mutable and immutable *name bindings* -- I think.) 3. Calling a continuation probably makes the saved copy of the execution stack the current execution state; I presume there's also a way to pass an extra argument. 4. Coroutines (which I *do* understand) are probably done by swapping between two (or more) continuations. 5. Other control constructs can be done by various manipulations of continuations. I presume that in many situations the saved continuation becomes the main control locus permanently, and the (previously) current stack is simply garbage-collected. Of course the lazy copy makes this efficient. If this all is close enough to the truth, I think that continuations involving C stack frames are definitely out -- as Tim Peters mentioned, you don't know what the stuff on the C stack of extensions refers to. (My guess would be that Scheme implementations assume that any pointers on the C stack point to Scheme objects, so that C stack frames can be copied and conservative GC can be used -- this will never happen in Python.) Continuations involving only Python stack frames might be supported, if we can agree on the sharing / copying semantics. This is where I don't know enough (see questions at #2 above). --Guido van Rossum (home page: http://www.python.org/~guido/) From tim_one at email.msn.com Tue May 18 06:46:12 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 18 May 1999 00:46:12 -0400 Subject: [Python-Dev] coroutines vs. continuations vs. threads In-Reply-To: <37406BED.95AEB896@ifu.net> Message-ID: <000901bea0e9$5aa2dec0$829e2299@tim> [Aaron Watters] > ... > I guess the question of interest is why are threads insufficient? I > guess they have system limitations on the number of threads or other > limitations that wouldn't be a problem with continuations?
Sam is mucking with thousands of simultaneous I/O-bound socket connections, and makes a good case that threads simply don't fly here (each one consumes a stack, kernel resources, etc). It's unclear (to me) that thousands of continuations would be *much* better, though, by the time Christian gets done making thousands of copies of the Python stack chain. > If there aren't a *lot* of situations where coroutines are vital, I'd > be hesitant to do major surgery. But I'm a fuddy-duddy. Go to Sam's site (http://www.nightmare.com/), download Medusa, and read the docs. They're very well written and describe the problem space exquisitely. I don't have any problems like that I need to solve, but it's interesting to ponder! alas-no-time-for-it-now-ly y'rs - tim From tim_one at email.msn.com Tue May 18 06:45:52 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 18 May 1999 00:45:52 -0400 Subject: [Python-Dev] 'stackless' python? In-Reply-To: <373FC02A.69F2D912@appliedbiometrics.com> Message-ID: <000301bea0e9$4fd473a0$829e2299@tim> [Christian Tismer]
> ...
> Yup. With a little counting, it was easy to survive:
>
>     def main():
>         global a
>         a=2
>         thing (5)
>         a=a-1
>         if a:
>             saved.throw (0)

Did "a" really need to be global here? I hope you see the same behavior without the "global a"; e.g., this Scheme:

    (define -cont- #f)

    (define thing
      (lambda (n)
        (if (= n 2)
            (call/cc (lambda (k) (set! -cont- k))))
        (display "n == ")
        (display n)
        (newline)
        (if (= n 0)
            (begin (display "Done!") (newline))
            (thing (- n 1)))))

    (define main
      (lambda ()
        (let ((a 2))
          (thing 5)
          (display "a is ")
          (display a)
          (newline)
          (set! a (- a 1))
          (if (> a 0) (-cont- #f)))))

    (main)

prints:

    n == 5
    n == 4
    n == 3
    n == 2
    n == 1
    n == 0
    Done!
    a is 2
    n == 2
    n == 1
    n == 0
    Done!
    a is 1

Or does brute-force frame-copying cause the continuation to set "a" back to 2 each time? > Weird enough Par for the continuation course! They're nasty when eaten raw. > and needs a much better interface. Ya, like screw 'em and use threads .
> But finally I'm quite happy that it worked so smoothly > after just a couple of hours (well, about six :) Yup! Playing with Python internals is a treat. to-be-continued-ly y'rs - tim From tim_one at email.msn.com Tue May 18 06:45:57 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 18 May 1999 00:45:57 -0400 Subject: [Python-Dev] 'stackless' python? In-Reply-To: <14143.56604.21827.891993@seattle.nightmare.com> Message-ID: <000401bea0e9$51e467e0$829e2299@tim> [Sam] >>> Continuations are more powerful than coroutines, though I admit >>> they're a bit esoteric. [Tim] >> "More powerful" is a tedious argument you should always avoid . [Sam] > More powerful in the sense that you can use continuations to build > lots of different control structures (coroutines, backtracking, > exceptions), but not vice versa. "More powerful" is a tedious argument you should always avoid >. >> Then the continuation would (eventually) "return to" the >> "print repr(saved)" and we'd get an infinite output tail [...] >> and never reach line 4. Right? > Yes... the continuation object so far isn't very usable. But it's proper behavior for a continuation all the same! So this aspect shouldn't be "fixed". > ... > let/cc stores the continuation in a variable binding, while > introducing a new scope. It requires a change to the underlying > language: Isn't this often implemented via a macro, though, so that (let/cc name code) "acts like" (call/cc (lambda (name) code)) ? I haven't used a Scheme with native let/cc, but poking around it appears that the real intent is to support exception-style function exits with a mechanism cheaper than 1st-class continuations: twice saw the let/cc object (the thingie bound to "name") defined as being invalid the instant after "code" returns, so it's an "up the call stack" gimmick. That doesn't sound powerful enough for what you're after. > [nice let/cc call/cc tutorialette] > ... 
> In order to do useful work like passing values back and forth between > coroutines, we have to have some way of returning a value from the > continuation when it is reinvoked. Somehow, I suspect that's the least of our problems <0.5 wink>. If continuations are in Python's future, though, I agree with the need as stated. > I should emphasize that most folks will never see call/cc 'in the > raw', it will usually have some nice wrapper around to implement > whatever construct is needed. Python already has well-developed exception and thread facilities, so it's hard to make a case for continuations as a catch-all implementation mechanism. That may be the rub here: while any number of things *can* be implemented via continuations, I think very few *need* to be implemented that way, and full-blown continuations aren't easy to implement efficiently & portably. The Icon language was particularly concerned with backtracking searches, and came up with generators as another clearer/cheaper implementation technique. When it went on to full-blown coroutines, it's hard to say whether continuations would have been a better approach. But the coroutine implementation it has is sluggish and buggy and hard to port, so I doubt they could have done noticeably worse. Would full-blown coroutines be powerful enough for your needs? assuming-the-practical-defn-of-"powerful-enough"-ly y'rs - tim From rushing at nightmare.com Tue May 18 07:18:06 1999 From: rushing at nightmare.com (rushing at nightmare.com) Date: Mon, 17 May 1999 22:18:06 -0700 (PDT) Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <000401bea0e9$51e467e0$829e2299@tim> References: <14143.56604.21827.891993@seattle.nightmare.com> <000401bea0e9$51e467e0$829e2299@tim> Message-ID: <14144.61765.308962.101884@seattle.nightmare.com> Tim Peters writes: > Isn't this often implemented via a macro, though, so that > > (let/cc name code) > > "acts like" > > (call/cc (lambda (name) code)) Yup, they're equivalent, in the sense that given one you can make a macro to do the other. call/cc is preferred because it doesn't require a new binding construct. > ? I haven't used a Scheme with native let/cc, but poking around it > appears that the real intent is to support exception-style function > exits with a mechanism cheaper than 1st-class continuations: twice > saw the let/cc object (the thingie bound to "name") defined as > being invalid the instant after "code" returns, so it's an "up the > call stack" gimmick. That doesn't sound powerful enough for what > you're after. Except that since the escape procedure is 'first-class' it can be stored away and invoked (and reinvoked) later. [that's all that 'first-class' means: a thing that can be stored in a variable, returned from a function, used as an argument, etc..] I've never seen a let/cc that wasn't full-blown, but it wouldn't surprise me. > The Icon language was particularly concerned with backtracking > searches, and came up with generators as another clearer/cheaper > implementation technique. When it went on to full-blown > coroutines, it's hard to say whether continuations would have been > a better approach. But the coroutine implementation it has is > sluggish and buggy and hard to port, so I doubt they could have > done noticeably worse. Many Scheme implementors either skip it, or only support non-escaping call/cc (i.e., exceptions in Python). > Would full-blown coroutines be powerful enough for your needs? Yes, I think they would be. But I think with Python it's going to be just about as hard, either way. 
-Sam From rushing at nightmare.com Tue May 18 07:48:29 1999 From: rushing at nightmare.com (rushing at nightmare.com) Date: Mon, 17 May 1999 22:48:29 -0700 (PDT) Subject: [Python-Dev] coroutines vs. continuations vs. threads In-Reply-To: <51325225@toto.iv> Message-ID: <14144.63787.502454.111804@seattle.nightmare.com> Aaron Watters writes: > Frankly, I think I thought I understood this once but now I know I > don't. 8^) That's what I said when I backed into the idea via medusa a couple of years ago. > How're continuations more powerful than coroutines? And why can't > they be implemented using threads (and semaphores etc)? My understanding of the original 'coroutine' (from Pascal?) was that it allows two procedures to 'resume' each other. The classic coroutine example is the 'samefringe' problem: given two trees of differing structure, are they equal in the sense that a traversal of the leaves results in the same list? Coroutines let you do this efficiently, comparing leaf-by-leaf without storing the whole tree. continuations can do coroutines, but can also be used to implement backtracking, exceptions, threads... probably other stuff I've never heard of or needed. The reason that Scheme and ML are such big fans of continuations is because they can be used to implement all these other features. Look at how much try/except and threads complicate other language implementations. It's like a super-tool-widget - if you make sure it's in your toolbox, you can use it to build your circular saw and lathe from scratch. Unfortunately there aren't many good sites on the web with good explanatory material. The best reference I have is "Essentials of Programming Languages". 
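The samefringe problem described above can be sketched with (much later) Python generators — an illustrative aside, not Sam's Scheme code; trees are assumed to be nested lists, with anything else counting as a leaf:

```python
def leaves(tree):
    # Yield the leaf values of a nested-list tree, left to right.
    if isinstance(tree, list):
        for sub in tree:
            for leaf in leaves(sub):
                yield leaf
    else:
        yield tree

def samefringe(t1, t2):
    # Compare leaf-by-leaf; neither fringe is ever built in full.
    done = object()                     # sentinel: this generator is exhausted
    g1, g2 = leaves(t1), leaves(t2)
    while True:
        a, b = next(g1, done), next(g2, done)
        if a is done and b is done:
            return True                 # both fringes ended together
        if a is done or b is done or a != b:
            return False                # different lengths or different leaves

print(samefringe([1, [2, 3]], [[1, 2], 3]))   # -> True
print(samefringe([1, [2, 3]], [[1, 2], 4]))   # -> False
```

Each generator here plays one of the two coroutines: its frame stays suspended between leaves, so only a constant amount of traversal state is alive at any moment.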
For those that want to play with some of these ideas using little VM's written in Python: http://www.nightmare.com/software.html#EOPL -Sam From rushing at nightmare.com Tue May 18 07:56:37 1999 From: rushing at nightmare.com (rushing at nightmare.com) Date: Mon, 17 May 1999 22:56:37 -0700 (PDT) Subject: [Python-Dev] coroutines vs. continuations vs. threads In-Reply-To: <13631823@toto.iv> Message-ID: <14144.65355.400281.123856@seattle.nightmare.com> Jeremy Hylton writes: > I have to admit that I'm a bit unclear on the motivation for all > this. As Gordon said, the state machine approach seems like it would > be a good approach. For simple problems, state machines are ideal. Medusa uses state machines that are built out of Python methods. But past a certain level of complexity, they get too hairy to understand. A really good example can be found in /usr/src/linux/net/ipv4. 8^) -Sam From rushing at nightmare.com Tue May 18 09:05:20 1999 From: rushing at nightmare.com (rushing at nightmare.com) Date: Tue, 18 May 1999 00:05:20 -0700 (PDT) Subject: [Python-Dev] 'stackless' python? In-Reply-To: <60057226@toto.iv> Message-ID: <14145.927.588572.113256@seattle.nightmare.com> Guido van Rossum writes: > Perhaps it would help to explain what a continuation actually does > with the run-time environment, instead of giving examples of how to > use them and what the result it? This helped me a lot, and is the angle used in "Essentials of Programming Languages": Usually when folks refer to a 'stack', they're refering to an *implementation* of the stack data type: really an optimization that assumes an upper bound on stack size, and that things will only be pushed and popped in order. If you were to implement a language's variable and execution stacks with actual data structures (linked lists), then it's easy to see what's needed: the head of the list represents the current state. As functions exit, they pop things off the list. The reason I brought this up (during a lull!) 
was that Python is already paying all of the cost of heap-allocated frames, and it didn't seem to me too much of a leap from there. > 1. All program state is somehow contained in a single execution stack. Yup. > 2. A continuation does something equivalent to making a copy of the > entire execution stack. Yup. > I.e. are there mutable *objects* in Scheme? > (I know there are mutable and immutable *name bindings* -- I think.) Yes, Scheme is pro-functional... but it has arrays, i/o, and set-cdr!, all the things that make it 'impure'. I think shallow copies are what's expected. In the examples I have, the continuation is kept in a 'register', and call/cc merely packages it up with a little function wrapper. You are allowed to stomp all over lexical variables with "set!". > 3. Calling a continuation probably makes the saved copy of the > execution stack the current execution state; I presume there's also a > way to pass an extra argument. Yup. > 4. Coroutines (which I *do* understand) are probably done by swapping > between two (or more) continuations. Yup. Here's an example in Scheme: http://www.nightmare.com/stuff/samefringe.scm Somewhere I have an example of coroutines being used for parsing, very elegant. Something like one coroutine does lexing, and passes tokens one-by-one to the next level, which passes parsed expressions to a compiler, or whatever. Kinda like pipes. > 5. Other control constructs can be done by various manipulations of > continuations. I presume that in many situations the saved > continuation becomes the main control locus permanently, and the > (previously) current stack is simply garbage-collected. Of course > the lazy copy makes this efficient. Yes... I think backtracking would be an example of this. You're doing a search on a large space (say a chess game). After a certain point you want to try a previous fork, to see if it's promising, but you don't want to throw away your current work. 
Save it, then unwind back to the previous fork, try that option out... if it turns out to be better then toss the original. > If this all is close enough to the truth, I think that > continuations involving C stack frames are definitely out -- as Tim > Peters mentioned, you don't know what the stuff on the C stack of > extensions refers to. (My guess would be that Scheme > implementations assume that any pointers on the C stack point to > Scheme objects, so that C stack frames can be copied and > conservative GC can be used -- this will never happen in Python.) I think you're probably right here - usually there are heavy restrictions on what kind of data can pass through the C interface. But I know of at least one Scheme (mzscheme/PLT) that uses conservative gc and has c/c++ interfaces. [... dig dig ...] From rushing at nightmare.com Tue May 18 09:17:11 1999 From: rushing at nightmare.com (rushing at nightmare.com) Date: Tue, 18 May 1999 00:17:11 -0700 (PDT) Subject: [Python-Dev] another good motivation Message-ID: <14145.4917.164756.300678@seattle.nightmare.com> "Escaping the event loop: an alternative control structure for multi-threaded GUIs" http://cs.nyu.edu/phd_students/fuchs/ http://cs.nyu.edu/phd_students/fuchs/gui.ps -Sam From tismer at appliedbiometrics.com Tue May 18 15:46:53 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Tue, 18 May 1999 15:46:53 +0200 Subject: [Python-Dev] coroutines vs. continuations vs. threads References: <000901bea0e9$5aa2dec0$829e2299@tim> Message-ID: <37416F4D.8E95D71A@appliedbiometrics.com> Tim Peters wrote: > > [Aaron Watters] > > ... > > I guess the question of interest is why are threads insufficient? I > > guess they have system limitations on the number of threads or other > > limitations that wouldn't be a problem with continuations? 
> > Sam is mucking with thousands of simultaneous I/O-bound socket connections, > and makes a good case that threads simply don't fly here (each one consumes > a stack, kernel resources, etc). It's unclear (to me) that thousands of > continuations would be *much* better, though, by the time Christian gets > done making thousands of copies of the Python stack chain. Well, what he needs here are coroutines and just a single frame object for every minithread (I think this is a "fiber"?). If these fibers later do deep function calls before they switch, there will of course be more frames then. -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From tismer at appliedbiometrics.com Tue May 18 16:35:30 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Tue, 18 May 1999 16:35:30 +0200 Subject: [Python-Dev] 'stackless' python? References: <14142.40867.103424.764346@seattle.nightmare.com> <000f01bea028$1f75c360$fb9e2299@tim> <14143.56604.21827.891993@seattle.nightmare.com> <199905180403.AAA04772@eric.cnri.reston.va.us> Message-ID: <37417AB2.80920595@appliedbiometrics.com> Guido van Rossum wrote: > > Sam (& others), > > I thought I understood what continuations were, but the examples of > what you can do with them so far don't clarify the matter at all. > > Perhaps it would help to explain what a continuation actually does > with the run-time environment, instead of giving examples of how to > use them and what the result it? > > Here's a start of my own understanding (brief because I'm on a 28.8k > connection which makes my ordinary typing habits in Emacs very > painful). > > 1. All program state is somehow contained in a single execution stack. 
> This includes globals (which are simply name bindings in the botton > stack frame). It also includes a code pointer for each stack frame > indicating where the function corresponding to that stack frame is > executing (this is the return address if there is a newer stack frame, > or the current instruction for the newest frame). Right. For now, this information is on the C stack for each called function, although almost completely available in the frame chain. > 2. A continuation does something equivalent to making a copy of the > entire execution stack. This can probably be done lazily. There are > probably lots of details. I also expect that Scheme's semantic model > is different than Python here -- e.g. does it matter whether deep or > shallow copies are made? I.e. are there mutable *objects* in Scheme? > (I know there are mutable and immutable *name bindings* -- I think.) To make it lazy, a gatekeeper must be put on top of the two split frames, which catches the event that one of them returns. It appears to me that this is the same callcc.new() object which catches this, splitting frames when hit by a return. > 3. Calling a continuation probably makes the saved copy of the > execution stack the current execution state; I presume there's also a > way to pass an extra argument. > > 4. Coroutines (which I *do* understand) are probably done by swapping > between two (or more) continuations. Right, which is just two or three assignments. > 5. Other control constructs can be done by various manipulations of > continuations. I presume that in many situations the saved > continuation becomes the main control locus permanently, and the > (previously) current stack is simply garbage-collected. Of course the > lazy copy makes this efficient. Yes, great. It looks like switching continuations is not more expensive than a single Python function call.
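The "swap between two suspended frames" picture can be seen with (much later) Python generators: each generator holds a suspended frame, and a "switch" is little more than resuming the other one. Generators are asymmetric (a driver resumes them), so this only illustrates the control flow, not the symmetric stackless switch itself:

```python
log = []

def ping(n):
    for i in range(n):
        log.append("ping %d" % i)
        yield                      # suspend this frame; the driver resumes the other

def pong(n):
    for i in range(n):
        log.append("pong %d" % i)
        yield

a, b = ping(2), pong(2)            # two suspended frames
for g in (a, b, a, b):             # each "switch" just resumes a frame
    next(g)
print(log)                         # -> ['ping 0', 'pong 0', 'ping 1', 'pong 1']
```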
> Continuations involving only Python stack frames might be supported, > if we can agree on the the sharing / copying semantics. This is where > I don't know enough see questions at #2 above). This would mean to avoid creating incompatible continuations. A continuation may not switch to a frame chain which was created by a different VM incarnation, since this would later on corrupt the machine stack. One way to assure that would be a thread-safe function in sys, similar to sys.exc_info(), which gives an id for the current interpreter. Continuations living somewhere in globals would be marked by the interpreter which created them, and refuse to be thrown if they don't match. The necessary interpreter support appears to be small: Extend the PyFrame structure by two fields: - interpreter ID (addr of some local variable would do) - stack pointer at current instruction. Change the CALL_FUNCTION opcode to avoid calling eval recursively in the case of a Python function/method: instead, keep the current frame, build the new one, and start over. RETURN will pop a frame and reload its local variables instead of returning, as long as there is a frame to pop. I'm unclear how exceptions should be handled. Are they currently propagated up across different C calls other than ceval2 recursions? ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From jeremy at cnri.reston.va.us Tue May 18 17:05:39 1999 From: jeremy at cnri.reston.va.us (Jeremy Hylton) Date: Tue, 18 May 1999 11:05:39 -0400 (EDT) Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <14145.927.588572.113256@seattle.nightmare.com> References: <60057226@toto.iv> <14145.927.588572.113256@seattle.nightmare.com> Message-ID: <14145.33150.767551.472591@bitdiddle.cnri.reston.va.us> >>>>> "SR" == rushing writes: SR> Somewhere I have an example of coroutines being used for SR> parsing, very elegant. Something like one coroutine does SR> lexing, and passes tokens one-by-one to the next level, which SR> passes parsed expressions to a compiler, or whatever. Kinda SR> like pipes. This is the first example that's used in Structured Programming (Dahl, Dijkstra, and Hoare). I'd be happy to loan a copy to any of the Python-dev people who sit nearby. Jeremy From tismer at appliedbiometrics.com Tue May 18 17:31:11 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Tue, 18 May 1999 17:31:11 +0200 Subject: [Python-Dev] 'stackless' python? References: <000301bea0e9$4fd473a0$829e2299@tim> Message-ID: <374187BF.36CC65E7@appliedbiometrics.com> Tim Peters wrote:
>
> [Christian Tismer]
> > ...
> > Yup. With a little counting, it was easy to survive:
> >
> >     def main():
> >         global a
> >         a=2
> >         thing (5)
> >         a=a-1
> >         if a:
> >             saved.throw (0)
>
> Did "a" really need to be global here? I hope you see the same behavior
> without the "global a"; e.g., this Scheme:

(Hüstel) Actually, I inserted the "global" later. It worked as well with a local variable, but I didn't understand it. Still don't :-) > Or does brute-force frame-copying cause the continuation to set "a" back to > 2 each time? No, it doesn't. Behavior is exactly the same with or without global. I'm not sure whether this is a bug or a feature. I *think* 'a' as a local has a slot in the frame, so it's actually a different 'a' living in both copies. But this would not have worked. Can it be that before a function call, the interpreter turns its locals into a dict, using fast_to_locals? That would explain it. This is not what I think it should be! Locals need to be copied.
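The fast_to_locals suspicion can be checked from plain Python: inside a function, locals() hands back a dict snapshot of the frame's fast local slots, and writing to the snapshot never touches the slot itself (an illustrative check, not the frame-copying code):

```python
def f():
    x = 42
    snapshot = locals()   # fast_to_locals: copies the frame's slots into a dict
    snapshot["x"] = 99    # mutates only the snapshot...
    return x              # ...the slot still holds 42

print(f())  # -> 42
```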
> > and needs a much better interface. > > Ya, like screw 'em and use threads . Never liked threads. These fibers are so neat since they don't need threads, no locking, and they are available on systems without threads. > > But finally I'm quite happy that it worked so smoothly > > after just a couple of hours (well, about six :) > > Yup! Playing with Python internals is a treat. > > to-be-continued-ly y'rs - tim throw(42) - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From skip at mojam.com Tue May 18 17:49:42 1999 From: skip at mojam.com (Skip Montanaro) Date: Tue, 18 May 1999 11:49:42 -0400 Subject: [Python-Dev] Is there another way to solve the continuation problem? Message-ID: <199905181549.LAA03206@cm-29-94-2.nycap.rr.com> Okay, from my feeble understanding of the problem it appears that coroutines/continuations and threads are going to be problematic at best for Sam's needs. Are there other "solutions"? We know about state machines. They have the problem that the number of states grows exponentially (?) as the number of state variables increases. Can exceptions be coerced into providing the necessary structure without botching up the application too badly? Seems that at some point where you need to do some I/O, you could raise an exception whose second expression contains the necessary state to get back to where you need to be once the I/O is ready to go. The controller that catches the exceptions would use select or poll to prepare for the I/O then dispatch back to the handlers using the information from exceptions. 
    class IOSetup:
        pass

    class WaveHands:
        """maintains exception raise info and selects one to go to next"""
        def choose_one(r,w,e):
            pass
        def remember(info):
            pass

    def controller(...):
        waiters = WaveHands()
        while 1:
            r, w, e = select([...], [...], [...])
            # using r,w,e, select a waiter to call
            func, place = waiters.choose_one(r,w,e)
            try:
                func(place)
            except IOSetup, info:
                waiters.remember(info)

    def spam_func(place):
        if place == "spam":
            # whatever I/O we needed to do is ready to go
            bytes = read(some_fd)
            process(bytes)
            # need to read some more from some_fd. args are:
            # function, target, fd category (r, w), selectable object,
            raise IOSetup, (spam_func, "eggs" , "r", some_fd)
        elif place == "eggs":
            # that next chunk is ready - get it and proceed...
        elif yadda, yadda, yadda...

One thread, some craftiness needed to construct things. Seems like it might isolate some of the statefulness to smaller functional units than a pure state machine. Clearly not as clean as continuations would be. Totally bogus? Totally inadequate? Maybe Sam already does things this way? Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/ skip at mojam.com | Musi-Cal: http://www.musi-cal.com/ 518-372-5583 From tismer at appliedbiometrics.com Tue May 18 19:23:08 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Tue, 18 May 1999 19:23:08 +0200 Subject: [Python-Dev] 'stackless' python? References: <000301bea0e9$4fd473a0$829e2299@tim> Message-ID: <3741A1FC.E84DC926@appliedbiometrics.com> Tim Peters wrote:
>
> [Christian Tismer]
> > ...
> > Yup. With a little counting, it was easy to survive:
> >
> >     def main():
> >         global a
> >         a=2
> >         thing (5)
> >         a=a-1
> >         if a:
> >             saved.throw (0)
>
> Did "a" really need to be global here? I hope you see the same behavior
> without the "global a"; e.g., this Scheme:

Actually, the frame-copying was not enough to make this all behave correctly.
Since I didn't change the interpreter, the ceval.c incarnations still had copies to the old frames. The only effect which I achieved with frame copying was that the refcounts were increased correctly. I have to remove the hardware stack copying now. Will try to create a non-recursive version of the interpreter. ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From MHammond at skippinet.com.au Wed May 19 01:16:54 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Wed, 19 May 1999 09:16:54 +1000 Subject: [Python-Dev] Is there another way to solve the continuation problem? In-Reply-To: <199905181549.LAA03206@cm-29-94-2.nycap.rr.com> Message-ID: <006d01bea184$869f1480$0801a8c0@bobcat> > Sam's needs. Are there other "solutions"? We know about > state machines. > They have the problem that the number of states grows > exponentially (?) as > the number of state variables increases. Well, I can give you my feeble understanding of "IO Completion Ports", the technique Win32 provides to "solve" this problem. My experience is limited to how we used these in a server product designed to maintain thousands of long-term client connections each spooling large chunks of data (MSOffice docs - yes, that large :-). We too could obviously not afford a thread per connection. Searching through NT's documentation, completion ports are the technique they recommend for high-performance IO, and it appears to deliver. NT has the concept of a completion port, which in many ways is like an "inverted semaphore". You create a completion port with a "max number of threads" value. 
Then, for every IO object you need to use (files, sockets, pipes etc) you "attach" it to the completion port, along with an integer key. This key is (presumably) unique to the file, and usually a pointer to some structure maintaining the state of the file (ie, connection).

The general programming model is that you have a small number of threads (possibly 1), and a large number of io objects (eg files). Each of these threads is executing a state machine. When IO is "ready" for a particular file, one of the available threads is woken, and passed the "key" associated with the file. This key identifies the file, and more importantly the state of that file. The thread uses the state to perform the next IO operation, then immediately goes back to sleep. When that IO operation completes, some other thread is woken to handle that state change.

What makes this work of course is that _all_ IO is asynch - not a single IO call in this whole model can afford to block. NT provides asynch IO natively.

This sounds very similar to what Medusa does internally, although the NT model provides a "thread pooling" scheme built-in. Although our server performed very well with a single thread and hundreds of high-volume connections, we chose to run with a default of 5 threads here.

For those still interested, our project has the multi-threaded state machine I described above implemented in C. Most of the work is responsible for spooling the client request data (possibly 100s of kbs) before handing that data off to the real server. When the C code transitions the client through the state of "send/get from the real server", we actually set a different completion port. This other completion port wakes a thread written in Python. So our architecture consists of a C implemented thread-pool managing client connections, and a different Python implemented thread pool that does the real work for each of these client connections.
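[A rough sketch of the completion-port model described above, in today's Python, with a stdlib queue standing in for the NT completion port. All names are invented for illustration; nothing here is the Win32 API itself.]

```python
import queue
import threading

# A fixed pool of worker threads services one queue of (key, state)
# events.  The key identifies a "connection"; the state says which step
# of that connection's state machine to run next.

port = queue.Queue()          # stands in for the completion port
finished = []
lock = threading.Lock()

def handle_event(key, state):
    # One step of a per-connection state machine.
    if state == "read":
        port.put((key, "write"))     # pretend the read completed; advance
    elif state == "write":
        with lock:
            finished.append(key)     # this connection is done

def worker():
    while True:
        item = port.get()
        if item is None:             # shutdown sentinel
            port.task_done()
            break
        handle_event(*item)
        port.task_done()

pool = [threading.Thread(target=worker) for _ in range(5)]
for t in pool:
    t.start()
for key in range(10):                # ten simulated connections
    port.put((key, "read"))
port.join()                          # wait until every event is handled
for t in pool:
    port.put(None)
for t in pool:
    t.join()
print(sorted(finished))              # prints [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

The point of the analogy: no worker owns a connection; any thread can run any connection's next step, because all the state travels with the key.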
(The Python side of the world is bound by the server we are talking to, so Python performance doesn't matter as much - C wouldn't buy enough.) This means that our state machines are not that complex. Each "thread pool" is managing its own, fairly simple state. NT automatically allows you to associate state with the IO object, and as we have multiple thread pools, each one is simple - the one spooling client data is simple, the one doing the actual server work is simple. If we had to have a single, monolithic state machine managing all aspects of the client spooling, _and_ the server work, it would be horrid.

This is all in a shrink-wrapped relatively cheap "Document Management" product being targeted (successfully, it appears) at huge NT/Exchange based sites. Australia's largest Telco are implementing it, and indeed the company has VC from Intel! Lots of support from MS, as it helps compete with Domino. Not bad for a little startup - now they are wondering what to do with this Python-thingy they now have in their product that no one else has ever heard of; but they are planning on keeping it for now :-) [Funnily, when they started, they didn't think they even _needed_ a server, so I said "I'll just knock up a little one in Python", and we haven't looked back :-]

Mark.

From tim_one at email.msn.com Wed May 19 02:48:00 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 18 May 1999 20:48:00 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <199905180403.AAA04772@eric.cnri.reston.va.us>
Message-ID: <000701bea191$3f4d1a20$2e9e2299@tim>

[GvR]
> ...
> Perhaps it would help to explain what a continuation actually does
> with the run-time environment, instead of giving examples of how to
> use them and what the result is?

Paul Wilson (the GC guy) has a very nice-- but incomplete --intro to Scheme and its implementation:

ftp://ftp.cs.utexas.edu/pub/garbage/cs345/schintro-v14/schintro_toc.html

You can pick up a lot from that fast.
Is Steven (Majewski) on this list? He doped most of this out years ago.

> Here's a start of my own understanding (brief because I'm on a 28.8k
> connection which makes my ordinary typing habits in Emacs very
> painful).
>
> 1. All program state is somehow contained in a single execution stack.
> This includes globals (which are simply name bindings in the bottom
> stack frame).

Better to think of name resolution following lexical links. Lexical closures with indefinite extent are common in Scheme, so much so that name resolution is (at least conceptually) best viewed as distinct from execution stacks. Here's a key: continuations are entirely about capturing control flow state, and nothing about capturing binding or data state. Indeed, mutating bindings and/or non-local data are the ways distinct invocations of a continuation communicate with each other, and for this reason true functional languages generally don't support continuations of the call/cc flavor.

> It also includes a code pointer for each stack frame indicating where
> the function corresponding to that stack frame is executing (this is
> the return address if there is a newer stack frame, or the current
> instruction for the newest frame).

Yes, although the return address is one piece of information in the current frame's continuation object -- continuations are used internally for "regular calls" too. When a function returns, it passes control thru its continuation object. That process restores-- from the continuation object --what the caller needs to know (in concept: a pointer to *its* continuation object, its PC, its name-resolution chain pointer, and its local eval stack). Another key point: a continuation object is immutable.

> 2. A continuation does something equivalent to making a copy of the
> entire execution stack. This can probably be done lazily. There are
> probably lots of details.
The point of the above is to get across that for Scheme-calling-Scheme, creating a continuation object copies just a small, fixed number of pointers (the current continuation pointer, the current name-resolution chain pointer, the PC), plus the local eval stack. This is for a "stackless" interpreter that heap-allocates name-mapping and execution-frame and continuation objects. Half the literature is devoted to optimizing one or more of those away in special cases (e.g., for continuations provably "up-level", using a stack + setjmp/longjmp instead). > I also expect that Scheme's semantic model is different than Python > here -- e.g. does it matter whether deep or shallow copies are made? > I.e. are there mutable *objects* in Scheme? (I know there are mutable > and immutable *name bindings* -- I think.) Same as Python here; Scheme isn't a functional language; has mutable bindings and mutable objects; any copies needed should be shallow, since it's "a feature" that invoking a continuation doesn't restore bindings or object values (see above re communication). > 3. Calling a continuation probably makes the saved copy of the > execution stack the current execution state; I presume there's also a > way to pass an extra argument. Right, except "stack" is the wrong mental model in the presence of continuations; it's a general rooted graph (A calls B, B saves a continuation pointing back to A, B goes on to call A, A saves a continuation pointing back to B, etc). If the explicitly saved continuations are never *invoked*, control will eventually pop back to the root of the graph, so in that sense there's *a* stack implicit at any given moment. > 4. Coroutines (which I *do* understand) are probably done by swapping > between two (or more) continuations. > > 5. Other control constructs can be done by various manipulations of > continuations. 
I presume that in many situations the saved
> continuation becomes the main control locus permanently, and the
> (previously) current stack is simply garbage-collected. Of course the
> lazy copy makes this efficient.

There's much less copying going on in Scheme-to-Scheme than you might think; other than that, right on.

> If this all is close enough to the truth, I think that continuations
> involving C stack frames are definitely out -- as Tim Peters
> mentioned, you don't know what the stuff on the C stack of extensions
> refers to. (My guess would be that Scheme implementations assume that
> any pointers on the C stack point to Scheme objects, so that C stack
> frames can be copied and conservative GC can be used -- this will
> never happen in Python.)

"Scheme" has become a generic term covering dozens of implementations with varying semantics, and a quick tour of the web suggests that cross-language Schemes generally put severe restrictions on continuations across language boundaries. Most popular seems to be to outlaw them by decree.

> Continuations involving only Python stack frames might be supported,
> if we can agree on the sharing / copying semantics. This is where
> I don't know enough (see questions at #2 above).

I'd like to go back to examples of what they'd be used for -- but fully fleshed out. In the absence of Scheme's ubiquitous lexical closures and "lambdaness" and syntax-extension facilities, I'm unsure they're going to work out reasonably in Python practice; it's not enough that they can be very useful in Scheme, and Sam is highly motivated to go to extremes here.

give-me-a-womb-and-i-still-won't-give-birth-ly y'rs - tim

From tismer at appliedbiometrics.com Wed May 19 03:10:15 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Wed, 19 May 1999 03:10:15 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000701bea191$3f4d1a20$2e9e2299@tim>
Message-ID: <37420F77.48E9940F@appliedbiometrics.com>

Tim Peters wrote: ...
> > Continuations involving only Python stack frames might be supported,
> > if we can agree on the sharing / copying semantics. This is where
> > I don't know enough (see questions at #2 above).
>
> I'd like to go back to examples of what they'd be used for -- but
> fully fleshed out. In the absence of Scheme's ubiquitous lexical closures
> and "lambdaness" and syntax-extension facilities, I'm unsure they're going
> to work out reasonably in Python practice; it's not enough that they can be
> very useful in Scheme, and Sam is highly motivated to go to extremes here.
>
> give-me-a-womb-and-i-still-won't-give-birth-ly y'rs - tim

I've put quite many hours into a non-recursive ceval.c already. Should I continue? At least this would be a little improvement, also if the continuation thing will not be born. ?

- chris

-- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home

From rushing at nightmare.com Wed May 19 04:52:04 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Tue, 18 May 1999 19:52:04 -0700 (PDT)
Subject: [Python-Dev] Is there another way to solve the continuation problem?
In-Reply-To: <101382377@toto.iv>
Message-ID: <14146.8395.754509.591141@seattle.nightmare.com>

Skip Montanaro writes:
> Can exceptions be coerced into providing the necessary structure
> without botching up the application too badly? Seems that at some
> point where you need to do some I/O, you could raise an exception
> whose second expression contains the necessary state to get back to
> where you need to be once the I/O is ready to go.
The controller > that catches the exceptions would use select or poll to prepare for > the I/O then dispatch back to the handlers using the information > from exceptions. > [... code ...] Well, you just re-invented the 'Reactor' pattern! 8^) http://www.cs.wustl.edu/~schmidt/patterns-ace.html > One thread, some craftiness needed to construct things. Seems like > it might isolate some of the statefulness to smaller functional > units than a pure state machine. Clearly not as clean as > continuations would be. Totally bogus? Totally inadequate? Maybe > Sam already does things this way? What you just described is what Medusa does (well, actually, 'Python' does it now, because the two core libraries that implement this are now in the library - asyncore.py and asynchat.py). asyncore doesn't really use exceptions exactly that way, and asynchat allows you to add another layer of processing (basically, dividing the input into logical 'lines' or 'records' depending on a 'line terminator'). The same technique is at the heart of many well-known network servers, including INND, BIND, X11, Squid, etc.. It's really just a state machine underneath (with python functions or methods implementing the 'states'). As long as things don't get too complex. Python simplifies things enough to allow one to 'push the difficulty envelope' a bit further than one could reasonably tolerate in C. For example, Squid implements async HTTP (server and client, because it's a proxy) - but stops short of trying to implement async FTP. Medusa implements async FTP, but it's the largest file in the Medusa distribution, weighing in at a hefty 32KB. The hard part comes when you want to plug different pieces and protocols together. For example, building a simple HTTP or FTP server is relatively easy, but building an HTTP server *that proxied to an FTP server* is much more difficult. I've done these kinds of things, viewing each as a challenge; but past a certain point it boggles. 
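[The select-loop-plus-handlers shape described above, the heart of asyncore and Medusa, boils down to a few lines in today's Python. This is a toy sketch with invented names, not Medusa's actual code.]

```python
import select
import socket

class Echo:
    """A handler object: owns a socket and a callback for readable events."""
    def __init__(self, sock):
        self.sock = sock
        self.received = []

    def fileno(self):
        # select() accepts any object with a fileno() method,
        # so the handler itself can sit in the readable list.
        return self.sock.fileno()

    def handle_read(self):
        self.received.append(self.sock.recv(4096))

# One connected pair of sockets stands in for a client connection.
a, b = socket.socketpair()
handler = Echo(b)
a.sendall(b"spam")

# One turn of the event loop: wait for readiness, dispatch to handlers.
readable, _, _ = select.select([handler], [], [], 1.0)
for h in readable:
    h.handle_read()

print(handler.received)   # prints [b'spam']
```

A real reactor just wraps that last stanza in a `while` loop over a table of many such handlers; the per-connection state lives on the handler objects, exactly as in the state-machine discussion above.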
The paper I posted about earlier by Matthew Fuchs has a really good explanation of this, but in the context of GUI event loops... I think it ties in neatly with this discussion because at the heart of any X11 app is a little guy manipulating a file descriptor.

-Sam

From tim_one at email.msn.com Wed May 19 07:41:39 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Wed, 19 May 1999 01:41:39 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <14144.61765.308962.101884@seattle.nightmare.com>
Message-ID: <000b01bea1ba$443a1a00$2e9e2299@tim>

[Sam]
> ...
> Except that since the escape procedure is 'first-class' it can be
> stored away and invoked (and reinvoked) later. [that's all that
> 'first-class' means: a thing that can be stored in a variable,
> returned from a function, used as an argument, etc..]
>
> I've never seen a let/cc that wasn't full-blown, but it wouldn't
> surprise me.

The let/cc's in question were specifically defined to create continuations valid only during let/cc's dynamic extent, so that, sure, you could store them away, but trying to invoke one later could be an error. It's in that sense I meant they weren't "first class". Other flavors of Scheme appear to call this concept "weak continuation", and use a different verb to invoke it (like call-with-escaping-continuation, or call/ec). Suspect the let/cc oddballs I found were simply confused implementations (there are a lot of amateur Scheme implementations out there!).

>> Would full-blown coroutines be powerful enough for your needs?

> Yes, I think they would be. But I think with Python it's going to
> be just about as hard, either way.

Most people on this list are comfortable with coroutines already because they already understand them -- Jeremy can even reach across the hall and hand Guido a helpful book . So pondering coroutines increases the number of brain cells willing to think about the implementation.
continuation-examples-leave-people-still-going-"huh?"-after-an-hour-of-explanation-ly y'rs - tim

From tim_one at email.msn.com Wed May 19 07:41:45 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Wed, 19 May 1999 01:41:45 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <3741A1FC.E84DC926@appliedbiometrics.com>
Message-ID: <000e01bea1ba$47fe7500$2e9e2299@tim>

[Christian Tismer]
>>> ...
>>> Yup. With a little counting, it was easy to survive:
>>>
>>> def main():
>>>     global a
>>>     a=2
>>>     thing (5)
>>>     a=a-1
>>>     if a:
>>>         saved.throw (0)

[Tim]
>> Did "a" really need to be global here? I hope you see the same behavior
>> without the "global a"; [which he does, but for mysterious reasons]

[Christian]
> Actually, the frame-copying was not enough to make this
> all behave correctly. Since I didn't change the interpreter,
> the ceval.c incarnations still had copies to the old frames.
> The only effect which I achieved with frame copying was
> that the refcounts were increased correctly.

All right! Now you're closer to the real solution ; i.e., copying wasn't really needed here, but keeping stuff alive was. In Scheme terms, when we entered main originally a set of bindings was created for its locals, and it is that very same set of bindings to which the continuation returns. So the continuation *should* reuse them -- making a copy of the locals is semantically hosed.

This is clearer in Scheme because its "stack" holds *only* control-flow info (bindings follow a chain of static links, independent of the current "call stack"), so there's no temptation to run off copying bindings too.

elegant-and-baffling-for-the-price-of-one-ly y'rs - tim

From tim_one at email.msn.com Wed May 19 07:41:56 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Wed, 19 May 1999 01:41:56 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <37420F77.48E9940F@appliedbiometrics.com>
Message-ID: <001301bea1ba$4eb498c0$2e9e2299@tim>

[Christian Tismer]
> I've put quite many hours into a non-recursive ceval.c
> already.

Does that mean 6 or 600 ?

> Should I continue? At least this would be a little improvement, also
> if the continuation thing will not be born. ?

Guido wanted to move in the "flat interpreter" direction for Python2 anyway, so my belief is it's worth pursuing.

but-then-i-flipped-a-coin-with-two-heads-ly y'rs - tim

From arw at ifu.net Wed May 19 15:04:53 1999
From: arw at ifu.net (Aaron Watters)
Date: Wed, 19 May 1999 09:04:53 -0400
Subject: [Python-Dev] continuations and C extensions?
Message-ID: <3742B6F5.C6CB7313@ifu.net>

the immutable GvR intones:
> Continuations involving only Python stack frames might be supported,
> if we can agree on the sharing / copying semantics. This is where
> I don't know enough (see questions at #2 above).

What if there are native C calls mixed in (eg, list.sort calls back to myclass.__cmp__ which decides to do a call/cc)? One of the really big advantages of Python in my book is the relative simplicity of embedding and extensions, and this is generally one of the failings of lisp implementations. I understand lots of scheme implementations purport to be extendible and embeddable, but in practice you can't do it with *existing* code -- there is always a show stopper involving having to change the way some Oracle library which you don't have the source for does memory management or something... I've known several grad students who have been bitten by this...

I think having to unroll the C stack safely might be one problem area. With, eg, a netscape nsapi embedding you can actually get into netscape code calls my code calls netscape code calls my code... suspends in a continuation? How would that work? [my ignorance is torment!] Threading and extensions are probably also problematic, but at least it's better understood, I think.

Just kvetching.
Sorry. -- Aaron Watters ps: Of course there are valid reasons and excellent advantages to having continuations, but it's also interesting to consider the possible cost. There ain't no free lunch. From tismer at appliedbiometrics.com Wed May 19 21:30:18 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Wed, 19 May 1999 21:30:18 +0200 Subject: [Python-Dev] 'stackless' python? References: <000e01bea1ba$47fe7500$2e9e2299@tim> Message-ID: <3743114A.220FFA0B@appliedbiometrics.com> Tim Peters wrote: ... > [Christian] > > Actually, the frame-copying was not enough to make this > > all behave correctly. Since I didn't change the interpreter, > > the ceval.c incarnations still had copies to the old frames. > > The only effect which I achieved with frame copying was > > that the refcounts were increased correctly. > > All right! Now you're closer to the real solution ; i.e., copying > wasn't really needed here, but keeping stuff alive was. In Scheme terms, > when we entered main originally a set of bindings was created for its > locals, and it is that very same set of bindings to which the continuation > returns. So the continuation *should* reuse them -- making a copy of the > locals is semantically hosed. I tried the most simple thing, and this seemed to be duplicating the current state of the machine. The frame holds the stack, and references to all objects. By chance, the locals are not in a dict, but unpacked into the frame. (Sometimes I agree with Guido, that optimization is considered harmful :-) > This is clearer in Scheme because its "stack" holds *only* control-flow info > (bindings follow a chain of static links, independent of the current "call > stack"), so there's no temptation to run off copying bindings too. The Python stack, besides its intermingledness with the machine stack, is basically its chain of frames. The value stack pointer still hides in the machine stack, but that's easy to change. 
So the real Scheme-like part is this chain, methinks, with the current bytecode offset and value stack info.

Making a copy of this in a restartable way means to increase the refcount of all objects in a frame. Would it be correct to undo the effect of fast locals before splitting, and redoing it on activation? Or do I need to rethink the whole structure? What should be natural for Python, if at all?

ciao - chris

-- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home

From jeremy at cnri.reston.va.us Wed May 19 21:46:49 1999
From: jeremy at cnri.reston.va.us (Jeremy Hylton)
Date: Wed, 19 May 1999 15:46:49 -0400 (EDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <3743114A.220FFA0B@appliedbiometrics.com>
References: <000e01bea1ba$47fe7500$2e9e2299@tim> <3743114A.220FFA0B@appliedbiometrics.com>
Message-ID: <14147.4976.608139.212336@bitdiddle.cnri.reston.va.us>

>>>>> "CT" == Christian Tismer writes:

[Tim Peters]
>> This is clearer in Scheme because its "stack" holds *only*
>> control-flow info (bindings follow a chain of static links,
>> independent of the current "call stack"), so there's no
>> temptation to run off copying bindings too.

CT> The Python stack, besides its intermingledness with the machine
CT> stack, is basically its chain of frames. The value stack pointer
CT> still hides in the machine stack, but that's easy to change. So
CT> the real Scheme-like part is this chain, methinks, with the
CT> current bytecode offset and value stack info.

CT> Making a copy of this in a restartable way means to increase the
CT> refcount of all objects in a frame. Would it be correct to undo
CT> the effect of fast locals before splitting, and redoing it on
CT> activation?
Wouldn't it be easier to increase the refcount on the frame object? Then you wouldn't need to worry about the refcounts on all the objects in the frame, because they would only be decrefed when the frame is deallocated.

It seems like the two other things you would need are some way to get a copy of the current frame and a means to invoke eval_code2 with an already existing stack frame instead of a new one.

(This sounds too simple, so it's obviously wrong. I'm just not sure where. Is the problem that you really need a separate stack/graph to hold the frames? If we leave them on the Python stack, it could be hard to dis-entangle value objects from control objects.)

Jeremy

From tismer at appliedbiometrics.com Wed May 19 22:10:16 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Wed, 19 May 1999 22:10:16 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000e01bea1ba$47fe7500$2e9e2299@tim> <3743114A.220FFA0B@appliedbiometrics.com> <14147.4976.608139.212336@bitdiddle.cnri.reston.va.us>
Message-ID: <37431AA8.BC77C615@appliedbiometrics.com>

Jeremy Hylton wrote:

[TP+CT about frame copies et al]

> Wouldn't it be easier to increase the refcount on the frame object?
> Then you wouldn't need to worry about the refcounts on all the objects
> in the frame, because they would only be decrefed when the frame is
> deallocated.

Well, the frame is supposed to be run twice, since there are two incarnations of interpreters working on it: The original one, and later, when it is thrown, another one (or the same, but, in principle). The frame could have been in any state, with a couple of objects on the stack. My splitting function can be invoked in some nested context, so I have a current opcode position, and a current stack position. Running this once leaves the stack empty, since all the objects are decrefed. Running this a second time gives a GPF, since the stack is empty.
Therefore, I made a copy which means to create a duplicate frame with an extra refcount for all the objects. This makes sure that both can be restarted at any time.

> It seems like the two other things you would need are some way to get
> a copy of the current frame and a means to invoke eval_code2 with an
> already existing stack frame instead of a new one.

Well, that's exactly what I'm working on.

> (This sounds too simple, so it's obviously wrong. I'm just not sure
> where. Is the problem that you really need a separate stack/graph to
> hold the frames? If we leave them on the Python stack, it could be
> hard to dis-entangle value objects from control objects.)

Oh, perhaps I should explain it a bit clearer? What did you mean by the Python stack? The hardware machine stack?

What do we have at the moment: The stack is the linked list of frames. Every frame has a local Python evaluation stack. Calls of Python functions produce a new frame, and the old one is put beneath. This is the control stack. The additional info on the hardware stack happens to be a parallel friend of this chain, and currently holds extra info, but this is an artifact. Adding the current Python stack level to the frame makes the hardware stack totally unnecessary.

There is a possible speed loss, anyway. Today, the recursive call of ceval2 is optimized and quite fast. The non-recursive version will have to copy variables in and out from the frames, instead, so there is of course a little speed penalty to pay.
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home

From tismer at appliedbiometrics.com Wed May 19 23:38:07 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Wed, 19 May 1999 23:38:07 +0200
Subject: [Python-Dev] 'stackless' python?
References: <001301bea1ba$4eb498c0$2e9e2299@tim>
Message-ID: <37432F3F.2694DA0E@appliedbiometrics.com>

Tim Peters wrote:
>
> [Christian Tismer]
> > I've put quite many hours into a non-recursive ceval.c
> > already.
>
> Does that mean 6 or 600 ?

6, or 10, or 20, if I count the time from the first start with Sam's code, maybe.

> > Should I continue? At least this would be a little improvement, also
> > if the continuation thing will not be born. ?
>
> Guido wanted to move in the "flat interpreter" direction for Python2 anyway,
> so my belief is it's worth pursuing.
>
> but-then-i-flipped-a-coin-with-two-heads-ly y'rs - tim

Right. Whose faces? :-)

On the stackless thing, what should I do? I started to insert minimum patches, but it turns out that I have to change frames a little (extending). I can make quite small changes to the interpreter to replace the recursive calls, but this involves extra flags in some cases, where the interpreter is called the first time and so on.

What has more probability to be included into a future Python: Tweaking the current thing only minimally, to make it as similar as possible to the former? Or do as much redesign as I think is needed to do it in a clean way. This would mean to split eval_code2 into two functions, where one is the interpreter kernel, and one is the frame manager.

There are also other places which do quite deep function calls and finally call eval_code2. I think these should return a frame object now.
I could convince them to call or return frame, depending on a flag, but it would be clean to rename the functions, let them always deal with frames, and put the original function on top of it.

In short, I can do larger changes which clean this all a bit up, or I can make small changes which are more tricky to grasp, but give just small diffs. How to touch untouchable code the best? :-)

-- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home

From jeremy at cnri.reston.va.us Wed May 19 23:49:38 1999
From: jeremy at cnri.reston.va.us (Jeremy Hylton)
Date: Wed, 19 May 1999 17:49:38 -0400 (EDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <37432F3F.2694DA0E@appliedbiometrics.com>
References: <001301bea1ba$4eb498c0$2e9e2299@tim> <37432F3F.2694DA0E@appliedbiometrics.com>
Message-ID: <14147.12613.88669.456608@bitdiddle.cnri.reston.va.us>

I think it makes sense to avoid being obscure or unclear in order to minimize the size of the patch or the diff. Realistically, it's unlikely that anything like your original patch is going to make it into the CVS tree. Its primary value is as proof of concept and as code that the rest of us can try out. If you make large changes, but they are clearer, you'll help us out a lot.

We can worry about minimizing the impact of the changes on the codebase afterwards, once everyone has figured out what's going on and agrees that it's worth doing.

feeling-much-more-confident-because-I-didn't-say-continuation-ly yr's,
Jeremy

From tismer at appliedbiometrics.com Thu May 20 00:25:20 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Thu, 20 May 1999 00:25:20 +0200
Subject: [Python-Dev] 'stackless' python?
References: <001301bea1ba$4eb498c0$2e9e2299@tim> <37432F3F.2694DA0E@appliedbiometrics.com> <14147.12613.88669.456608@bitdiddle.cnri.reston.va.us>
Message-ID: <37433A50.31E66CB1@appliedbiometrics.com>

Jeremy Hylton wrote:
>
> I think it makes sense to avoid being obscure or unclear in order to
> minimize the size of the patch or the diff. Realistically, it's
> unlikely that anything like your original patch is going to make it
> into the CVS tree. Its primary value is as proof of concept and as
> code that the rest of us can try out. If you make large changes, but
> they are clearer, you'll help us out a lot.

Many many thanks. This is good advice. I will make absolutely clear what's going on, keep parts as untouched as possible, cut out the parts which must change, and I will not look into speed too much. Better to have one function call more and a bit less optimization, but a clear and rock-solid introduction of a concept.

> We can worry about minimizing the impact of the changes on the
> codebase afterwards, once everyone has figured out what's going on and
> agrees that it's worth doing.
>
> feeling-much-more-confident-because-I-didn't-say-continuation-ly yr's,
> Jeremy

Hihi - the new little slot with local variables of the interpreter happens to have the name "continuation". Maybe I'd better rename it to "activation record"?

Now there is no longer a recursive call. Instead, a frame object is returned, which is waiting to be activated by a dispatcher.

Some more ideas are popping up. Right now, only the recursive calls can vanish. Callbacks from C code which is called by the interpreter which is called by... are still a problem. But they might perhaps vanish completely. We have to see how high the cost is. But what if I can manage to make the interpreter duck and cover on every call to a builtin as well? The interpreter again returns to the dispatcher, which then calls the builtin. Well, if that builtin happens to call into the interpreter again, it will be a dispatcher again.
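The flat-dispatcher idea Christian describes can be caricatured in a few lines of ordinary Python: instead of the interpreter calling itself recursively (growing the C stack), each step hands the dispatcher the *next* step, and a flat loop fires them one after another. Everything below (dispatch, countdown, thunks standing in for frames) is invented purely for illustration; the real patch deals in genuine frame objects with far more state.

```python
def dispatch(step):
    # The "frame dispatcher": keep firing whatever the last step
    # returned, until a step returns a plain value instead of the next
    # callable.  The C stack never deepens, however long the chain.
    while callable(step):
        step = step()
    return step

def countdown(n, acc=0):
    # "Stackless" style: no self-recursion; return the next frame-like
    # thunk and let the dispatcher run it.
    if n == 0:
        return acc
    return lambda: countdown(n - 1, acc + n)

# Sums 1..100000 -- far beyond any plausible C recursion limit.
total = dispatch(lambda: countdown(100000))
```

The point of the sketch is only the shape of the control flow: every "call" becomes a return to the dispatcher, which is why callbacks from C code are the remaining hard case.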
The machine stack grows a little, but since everything is saved in the frames, these stacks are no longer related. This means the principle works with existing extension modules, since the interpreter world and the C-stack world are decoupled. To avoid stack growth, a number of builtins would of course be better changed, but that is not a must in the first place. execfile, for instance, is a candidate which needn't call the interpreter. It could just as well parse the file, generate the code object, build a frame, and simply return it. This is what the dispatcher likes: returned frames are put on the chain and fired.

waah, my bus - running - ciao - chris

--
Christian Tismer :^) Applied Biometrics GmbH : Have a break!
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home

From tim_one at email.msn.com Thu May 20 01:56:33 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Wed, 19 May 1999 19:56:33 -0400
Subject: [Python-Dev] A "real" continuation example
In-Reply-To: <3743114A.220FFA0B@appliedbiometrics.com>
Message-ID: <000701bea253$3a182a00$179e2299@tim>

I'm home sick today, so tortured myself <0.9 wink>.

Sam mentioned using coroutines to compare the fringes of two trees, and I picked a simpler problem: given a nested list structure, generate the leaf elements one at a time, in left-to-right order. A solution to Sam's problem can be built on that, by getting a generator for each tree and comparing the leaves a pair at a time until there's a difference.

Attached are solutions in Icon, Python and Scheme. I have the least experience with Scheme, but browsing around didn't find a better Scheme approach than this.
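The pairwise comparison Tim describes is itself tiny once a leaf generator exists. This sketch spells the generator with the `yield` syntax of much later Pythons purely for brevity; in 1999 it would be one of the three attached solutions feeding the same loop.

```python
def fringe(x):
    # Leaf generator: walk a nested list, producing leaves left to right.
    if type(x) is list:
        for item in x:
            for leaf in fringe(item):
                yield leaf
    else:
        yield x

def same_fringe(tree_a, tree_b):
    # Compare leaves a pair at a time, stopping at the first difference.
    done = object()                      # sentinel: generator exhausted
    gen_a, gen_b = fringe(tree_a), fringe(tree_b)
    while True:
        a, b = next(gen_a, done), next(gen_b, done)
        if a is done or b is done:
            return a is b is done        # equal only if both end together
        if a != b:
            return False
```

Differently shaped trees with the same leaf order compare equal, which is the whole point of generating the fringe lazily instead of flattening both trees up front.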
The Python solution is the least satisfactory, using an explicit stack to simulate recursion by hand; if you didn't know the routine's purpose in advance, you'd have a hard time guessing it. The Icon solution is very short and simple, and I'd guess obvious to an average Icon programmer. It uses the subset of Icon ("generators") that doesn't require any C-stack trickery. However, alone of the three, it doesn't create a function that could be explicitly called from several locations to produce "the next" result; Icon's generators are tied into Icon's unique control structures to work their magic, and breaking that connection requires moving to full-blown Icon coroutines. It doesn't need to be that way, though. The Scheme solution was the hardest to write, but is a largely mechanical transformation of a recursive fringe-lister that constructs the entire fringe in one shot. Continuations are used twice: to enable the recursive routine to resume itself where it left off, and to get each leaf value back to the caller. Getting that to work required rebinding non-local identifiers in delicate ways. I doubt the intent would be clear to an average Scheme programmer. So what would this look like in Continuation Python? Note that each place the Scheme says "lambda" or "letrec", it's creating a new lexical scope, and up-level references are very common. Two functions are defined at top level, but seven more at various levels of nesting; the latter can't be pulled up to the top because they refer to vrbls local to the top-level functions. Another (at least initially) discouraging thing to note is that Scheme schemes for hiding the pain of raw call/cc often use Scheme's macro facilities. 
may-not-be-as-fun-as-it-sounds-ly y'rs - tim

Here's the Icon:

procedure main()
    x := [[1, [[2, 3]]], [4], [], [[[5]], 6]]
    every writes(fringe(x), " ")
    write()
end

procedure fringe(node)
    if type(node) == "list" then
        suspend fringe(!node)
    else
        suspend node
end

Here's the Python:

from types import ListType

class Fringe:
    def __init__(self, value):
        self.stack = [(value, 0)]

    def __getitem__(self, ignored):
        while 1:
            # find topmost pending list with something to do
            while 1:
                if not self.stack:
                    raise IndexError
                v, i = self.stack[-1]
                if i < len(v):
                    break
                self.stack.pop()
            this = v[i]
            self.stack[-1] = (v, i+1)
            if type(this) is ListType:
                self.stack.append((this, 0))
            else:
                break
        return this

testcase = [[1, [[2, 3]]], [4], [], [[[5]], 6]]
for x in Fringe(testcase):
    print x,
print

Here's the Scheme:

(define list->generator
  ; Takes a list as argument.
  ; Returns a generator g such that each call to g returns
  ; the next element in the list's symmetric-order fringe.
  (lambda (x)
    (letrec {(produce-value #f)   ; set to return-to continuation
             (looper
              (lambda (x)
                (cond ((null? x) 'nada)   ; ignore null
                      ((list? x) (looper (car x)) (looper (cdr x)))
                      (else
                       ; want to produce this non-list fringe elt,
                       ; and also resume here
                       (call/cc
                        (lambda (here)
                          (set! getnext (lambda () (here 'keep-going)))
                          (produce-value x)))))))
             (getnext
              (lambda ()
                (looper x)
                ; have to signal end of sequence somehow;
                ; assume false isn't a legitimate fringe elt
                (produce-value #f)))}
      ; return niladic function that returns next value
      (lambda ()
        (call/cc
         (lambda (k)
           (set! produce-value k)
           (getnext)))))))

(define display-fringe
  (lambda (x)
    (letrec ((g (list->generator x))
             (thiselt #f)
             (looper
              (lambda ()
                (set! thiselt (g))
                (if thiselt
                    (begin (display thiselt) (display " ") (looper))))))
      (looper))))

(define test-case '((1 ((2 3))) (4) () (((5)) 6)))

(display-fringe test-case)

From MHammond at skippinet.com.au Thu May 20 02:14:24 1999
From: MHammond at skippinet.com.au (Mark Hammond)
Date: Thu, 20 May 1999 10:14:24 +1000
Subject: [Python-Dev] Interactive Debugging of Python
Message-ID: <008b01bea255$b80cf790$0801a8c0@bobcat>

All this talk about stack frames and manipulating them at runtime has reminded me of one of my biggest gripes about Python. When I say "biggest gripe", I really mean "biggest surprise" or "biggest shame".

That is, Python is very interactive and dynamic. However, when I am debugging Python, it seems to lose this. There is no way for me to effectively change a running program. Now with VC6, I can do this with C. Although it is slow and a little dumb, I can change the C side of my Python world while my program is running, but not the Python side of the world.

I'm wondering how feasible it would be to change Python code _while_ running under the debugger. Presumably this would require a way of recompiling the current block of code, patching this code back into the object, and somehow tricking the stack frame into using this new block of code; even if a first cut had to restart the block or somesuch...

Any thoughts on this?

Mark.

From tim_one at email.msn.com Thu May 20 04:41:03 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Wed, 19 May 1999 22:41:03 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <3743114A.220FFA0B@appliedbiometrics.com>
Message-ID: <000901bea26a$34526240$179e2299@tim>

[Christian Tismer]
> I tried the most simple thing, and this seemed to be duplicating
> the current state of the machine. The frame holds the stack,
> and references to all objects.
> By chance, the locals are not in a dict, but unpacked into
> the frame. (Sometimes I agree with Guido, that optimization
> is considered harmful :-)

I don't see that the locals are a problem here -- provided you simply leave them alone.

> The Python stack, besides its intermingledness with the machine
> stack, is basically its chain of frames.

Right.

> The value stack pointer still hides in the machine stack, but
> that's easy to change.

I'm not sure what "value stack" means here, or "machine stack". The latter means the C stack? Then I don't know which values you have in mind that are hiding in it (the locals are, as you say, unpacked in the frame, and the evaluation stack too). By "evaluation stack" I mean specifically f->f_valuestack; the current *top* of stack pointer (specifically stack_pointer) lives in the C stack -- is that what we're talking about? Whichever, when we're talking about the code, let's use the names the code uses.

> So the real Scheme-like part is this chain, methinks, with
> the current bytecode offset and value stack info.

Curiously, f->f_lasti is already materialized every time we make a call, in order to support tracing. So if capturing a continuation is done via a function call (hard to see any other way it could be done), a bytecode offset is already getting saved in the frame object.

> Making a copy of this in a restartable way means to increase
> the refcount of all objects in a frame.

You later had a vision of splitting the frame into two objects -- I think. Whichever part the locals live in should not be copied at all, but merely have its (single) refcount increased. The other part hinges on details of your approach I don't know. The nastiest part seems to be f->f_valuestack, which conceptually needs to be (shallow) copied in the current frame and in all other frames reachable from the current frame's continuation (the chain rooted at f->f_back today); that's the sum total (along with the same frames' bytecode offsets) of capturing the control flow state.
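The copying rule Tim states -- shallow-copy each reachable frame's value stack and bytecode offset, but only re-reference its locals -- fits in a few lines of toy Python. `Frame` and `capture` here are invented stand-ins for illustration, nothing like the real C structs:

```python
class Frame:
    # Toy model of a frame; the attribute names echo the real fields.
    def __init__(self, back=None):
        self.f_back = back       # chain of callers
        self.valuestack = []     # models f->f_valuestack
        self.lasti = 0           # models f->f_lasti
        self.locals = {}         # shared by all continuations

def capture(frame):
    # Capture the control-flow state of the whole chain rooted at frame:
    # shallow-copy value stacks and bytecode offsets, share the locals.
    if frame is None:
        return None
    snap = Frame(capture(frame.f_back))
    snap.valuestack = list(frame.valuestack)   # shallow copy: control state
    snap.lasti = frame.lasti
    snap.locals = frame.locals                 # same object: data state
    return snap
```

After a capture, pushes and pops on the live frame's value stack leave the snapshot untouched, while rebinding a (shared) local is visible from both -- which is exactly the distinction the rest of this thread argues about.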
> Would it be correct to undo the effect of fast locals before
> splitting, and redoing it on activation?

Unsure what splitting means, but in any case I can't conceive of a reason for doing anything to the locals. Their values aren't *supposed* to get restored upon continuation invocation, so there's no reason to do anything with their values upon continuation creation either. Right? Or are we talking about different things?

almost-as-good-as-pantomime-ly y'rs - tim

From rushing at nightmare.com Thu May 20 06:04:20 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Wed, 19 May 1999 21:04:20 -0700 (PDT)
Subject: [Python-Dev] A "real" continuation example
In-Reply-To: <50692631@toto.iv>
Message-ID: <14147.34175.950743.79464@seattle.nightmare.com>

Tim Peters writes:
> The Scheme solution was the hardest to write, but is a largely
> mechanical transformation of a recursive fringe-lister that
> constructs the entire fringe in one shot. Continuations are used
> twice: to enable the recursive routine to resume itself where it
> left off, and to get each leaf value back to the caller. Getting
> that to work required rebinding non-local identifiers in delicate
> ways. I doubt the intent would be clear to an average Scheme
> programmer.

It's the only way to do it - every example I've seen of using call/cc looks just like it.

I reworked your Scheme a bit. IMHO letrec is for compilers, not for people. The following should be equivalent:

(define (list->generator x)
  (let ((produce-value #f))
    (define (looper x)
      (cond ((null? x) 'nada)
            ((list? x) (looper (car x)) (looper (cdr x)))
            (else
             (call/cc
              (lambda (here)
                (set! getnext (lambda () (here 'keep-going)))
                (produce-value x))))))
    (define (getnext)
      (looper x)
      (produce-value #f))
    (lambda ()
      (call/cc
       (lambda (k)
         (set! produce-value k)
         (getnext))))))

(define (display-fringe x)
  (let ((g (list->generator x)))
    (let loop ((elt (g)))
      (if elt
          (begin (display elt) (display " ") (loop (g)))))))

(define test-case '((1 ((2 3))) (4) () (((5)) 6)))
(display-fringe test-case)

> So what would this look like in Continuation Python?

Here's my first hack at it. Most likely wrong. It is REALLY HARD to do this without having the feature to play with.

This presumes a function "call_cc" that behaves like Scheme's. I believe the extra level of indirection is necessary. (i.e., call_cc takes a function as an argument that takes a continuation function)

class list_generator:
    def __init__ (self, x):
        self.x = x
        self.k_suspend = None
        self.k_produce = None

    def walk (self, x):
        if type(x) == type([]):
            for item in x:
                self.walk (item)
        else:
            self.item = x
            # call self.suspend() with a continuation
            # that will continue walking the tree
            call_cc (self.suspend)

    def __call__ (self):
        # call self.resume() with a continuation
        # that will return the next fringe element
        return call_cc (self.resume)

    def resume (self, k_produce):
        self.k_produce = k_produce
        if self.k_suspend:
            # resume the suspended walk
            self.k_suspend (None)
        else:
            self.walk (self.x)

    def suspend (self, k_suspend):
        self.k_suspend = k_suspend
        # return a value for __call__
        self.k_produce (self.item)

Variables holding continuations have a 'k_' prefix. In real life it might be possible to put the suspend/call/resume machinery in a base class (Generator?), and override 'walk' as you please.

-Sam

From tim_one at email.msn.com Thu May 20 09:21:45 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 20 May 1999 03:21:45 -0400
Subject: [Python-Dev] A "real" continuation example
In-Reply-To: <14147.34175.950743.79464@seattle.nightmare.com>
Message-ID: <001d01bea291$6b3efbc0$179e2299@tim>

[Sam, takes up the Continuation Python Challenge]

Thanks, Sam! I think this is very helpful.

> ...
> It's the only way to do it - every example I've seen of using call/cc
> looks just like it.

Same here -- alas <0.5 wink>.

> I reworked your Scheme a bit. IMHO letrec is for compilers, not for
> people. The following should be equivalent:

I confess I stopped paying attention to Scheme after R4RS, largely because the std decreed that *so* many forms were optional. Your rework is certainly nicer, but internal defines and named let are two that R4RS refused to require, so I always avoided them. BTW, I *am* a compiler, so that never bothered me.

>> So what would this look like in Continuation Python?

> Here's my first hack at it. Most likely wrong. It is REALLY HARD to
> do this without having the feature to play with.

Fully understood. It's also really hard to implement the feature without knowing how someone who wants it would like it to behave. But I don't think anyone is getting graded on this, so let's have fun.

Ack! I have to sleep. Will study the code in detail later, but first impression was it looked good! Especially nice that it appears possible to package up most of the funky call_cc magic in a base class, so that non-wizards could reuse it by following a simple protocol.

great-fun-to-come-up-with-one-of-these-but-i'd-hate-to-have-to-redo-from-scratch-every-time-ly y'rs - tim

From skip at mojam.com Thu May 20 15:27:59 1999
From: skip at mojam.com (Skip Montanaro)
Date: Thu, 20 May 1999 09:27:59 -0400 (EDT)
Subject: [Python-Dev] A "real" continuation example
In-Reply-To: <14147.34175.950743.79464@seattle.nightmare.com>
References: <50692631@toto.iv> <14147.34175.950743.79464@seattle.nightmare.com>
Message-ID: <14148.3389.962368.221063@cm-29-94-2.nycap.rr.com>

Sam> I reworked your Scheme a bit. IMHO letrec is for compilers, not for
Sam> people.

Sam, you are aware of course that the timbot *is* a compiler, right? ;-)

>> So what would this look like in Continuation Python?

Sam> Here's my first hack at it. Most likely wrong.
Sam> It is REALLY HARD to do this without having the feature to play with.

The thought that it's unlikely one could arrive at a reasonable approximation of a correct solution for such a small problem without the ability to "play with" it is sort of scary.

Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/
skip at mojam.com | Musi-Cal: http://www.musi-cal.com/
518-372-5583

From tismer at appliedbiometrics.com Thu May 20 16:10:32 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Thu, 20 May 1999 16:10:32 +0200
Subject: [Python-Dev] Interactive Debugging of Python
References: <008b01bea255$b80cf790$0801a8c0@bobcat>
Message-ID: <374417D8.8DBCB617@appliedbiometrics.com>

Mark Hammond wrote:
>
> All this talk about stack frames and manipulating them at runtime has
> reminded me of one of my biggest gripes about Python. When I say "biggest
> gripe", I really mean "biggest surprise" or "biggest shame".
>
> That is, Python is very interactive and dynamic. However, when I am
> debugging Python, it seems to lose this. There is no way for me to
> effectively change a running program. Now with VC6, I can do this with C.
> Although it is slow and a little dumb, I can change the C side of my Python
> world while my program is running, but not the Python side of the world.
>
> I'm wondering how feasible it would be to change Python code _while_ running
> under the debugger. Presumably this would require a way of recompiling the
> current block of code, patching this code back into the object, and somehow
> tricking the stack frame into using this new block of code; even if a
> first cut had to restart the block or somesuch...
>
> Any thoughts on this?

I'm writing a prototype of a stackless Python, which means that you will be able to access the current state of the interpreter completely. The inner interpreter loop will be isolated from the frame dispatcher. It will break whenever the ticker hits zero.
If you set the ticker to one, you will be able to single-step on every opcode, and have the value stack, the frame chain, everything. I think you can do very much with this. But tell me if you want a callback hook somewhere.

ciao - chris

--
Christian Tismer :^) Applied Biometrics GmbH : Have a break!
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home

From tismer at appliedbiometrics.com Thu May 20 18:52:21 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Thu, 20 May 1999 18:52:21 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000901bea26a$34526240$179e2299@tim>
Message-ID: <37443DC5.1330EAC6@appliedbiometrics.com>

Cleaning up, clarifying, trying to understand...

Tim Peters wrote:
>
> [Christian Tismer]
> > I tried the most simple thing, and this seemed to be duplicating
> > the current state of the machine. The frame holds the stack,
> > and references to all objects.
> > By chance, the locals are not in a dict, but unpacked into
> > the frame. (Sometimes I agree with Guido, that optimization
> > is considered harmful :-)
>
> I don't see that the locals are a problem here -- provided you simply leave
> them alone.

This depends on whether I have to duplicate frames or not. Below...

> > The Python stack, besides its intermingledness with the machine
> > stack, is basically its chain of frames.
>
> Right.
>
> > The value stack pointer still hides in the machine stack, but
> > that's easy to change.
>
> I'm not sure what "value stack" means here, or "machine stack". The latter
> means the C stack? Then I don't know which values you have in mind that are
> hiding in it (the locals are, as you say, unpacked in the frame, and the
> evaluation stack too).
> By "evaluation stack" I mean specifically
> f->f_valuestack; the current *top* of stack pointer (specifically
> stack_pointer) lives in the C stack -- is that what we're talking about?

Exactly!

> Whichever, when we're talking about the code, let's use the names the code
> uses.

The evaluation stack pointer is a local variable in the C stack and must be written to the frame to become independent of the C stack. Sounds better now?

> > So the real Scheme-like part is this chain, methinks, with
> > the current bytecode offset and value stack info.
>
> Curiously, f->f_lasti is already materialized every time we make a call, in
> order to support tracing. So if capturing a continuation is done via a
> function call (hard to see any other way it could be done), a
> bytecode offset is already getting saved in the frame object.

You got me. I'm just completing what is partially there.

> > Making a copy of this in a restartable way means to increase
> > the refcount of all objects in a frame.
>
> You later had a vision of splitting the frame into two objects -- I think.

My wrong wording. Not splitting, but duplicating. If a frame is the current state, I make it two frames to have two current states. One will be saved, the other will be run. This is what I call "splitting". Actually, splitting must occur whenever a frame can be reached twice, in order to keep elements alive.

> Whichever part the locals live in should not be copied at all, but merely
> have its (single) refcount increased. The other part hinges on details of
> your approach I don't know. The nastiest part seems to be f->f_valuestack,
> which conceptually needs to be (shallow) copied in the current frame and in
> all other frames reachable from the current frame's continuation (the chain
> rooted at f->f_back today); that's the sum total (along with the same
> frames' bytecode offsets) of capturing the control flow state.

Well, I see. You want one locals and one globals, shared by two incarnations.
Gets me into trouble.

> > Would it be correct to undo the effect of fast locals before
> > splitting, and redoing it on activation?
>
> Unsure what splitting means, but in any case I can't conceive of a reason
> for doing anything to the locals. Their values aren't *supposed* to get
> restored upon continuation invocation, so there's no reason to do anything
> with their values upon continuation creation either. Right? Or are we
> talking about different things?

Let me explain. What Python does right now is: when a function is invoked, all local variables are copied into fast_locals -- well, of course just references are copied and counts increased. These fast locals give a lot of speed today; we must have them.

You are saying I have to share locals between frames. Besides the fact that this will be a noticeable slowdown, since an extra structure must be built and accessed indirectly (right now, it's all fast, living in the one frame buffer), I cannot say that I'm convinced that this is what we need.

Suppose you have a function

def f(x):
    # do something
    ...
    # in some context, wanna have a snapshot
    global snapshot  # initialized to None
    if not snapshot:
        snapshot = callcc.new()
    # continue computation
    x = x+1
    ...

What I want to achieve is that I can run this again, from my snapshot. But with shared locals, my parameter x of the snapshot would have changed to x+1, which I don't find useful. I want to fix a state of the current frame and still think it should "own" its locals. Globals are borrowed, anyway. Class instances will do what you want anyway, since the local "self" is a mutable object.

How do you want to keep computations independent when locals are shared? For me it's just easier to implement, and also to think, with the shallow copy. Otherwise, where is my private place? Open for becoming convinced, of course :-)

ciao - chris

--
Christian Tismer :^) Applied Biometrics GmbH : Have a break!
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From jeremy at cnri.reston.va.us Thu May 20 21:26:30 1999 From: jeremy at cnri.reston.va.us (Jeremy Hylton) Date: Thu, 20 May 1999 15:26:30 -0400 (EDT) Subject: [Python-Dev] 'stackless' python? In-Reply-To: <37443DC5.1330EAC6@appliedbiometrics.com> References: <000901bea26a$34526240$179e2299@tim> <37443DC5.1330EAC6@appliedbiometrics.com> Message-ID: <14148.21750.738559.424456@bitdiddle.cnri.reston.va.us> >>>>> "CT" == Christian Tismer writes: CT> What I want to achieve is that I can run this again, from my CT> snapshot. But with shared locals, my parameter x of the snapshot CT> would have changed to x+1, which I don't find useful. I want to CT> fix a state of the current frame and still think it should "own" CT> its locals. Globals are borrowed, anyway. Class instances will CT> anyway do what you want, since the local "self" is a mutable CT> object. CT> How do you want to keep computations independent when locals are CT> shared? For me it's just easier to implement and also to think CT> with the shallow copy. Otherwise, where is my private place? CT> Open for becoming convinced, of course :-) I think you're making things a lot more complicated by trying to instantiate new variable bindings for locals every time you create a continuation. Can you give an example of why that would be helpful? (Ok. I'm not sure I can offer a good example of why it would be helpful to share them, but it makes intuitive sense to me.) The call_cc mechanism is going to let you capture the current continuation, save it somewhere, and call on it again as often as you like. Would you get a fresh locals each time you used it? or just the first time? If only the first time, it doesn't seem that you've gained a whole lot. 
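The crux of Jeremy's question can be shown with a plain dict standing in for a frame's fast locals -- no continuations required. Capture-by-sharing and capture-by-shallow-copy diverge the moment the running frame rebinds a local. This is a toy model of the two proposals, not an implementation of either:

```python
live = {'x': 5}               # the running frame's locals

scheme_snapshot = live        # Scheme-style capture: same binding object
copied_snapshot = dict(live)  # Christian's proposal: shallow copy at capture

live['x'] = live['x'] - 1     # execution continues past the capture point

# Re-entering the continuation now sees...
shared_x = scheme_snapshot['x']   # 4 -- the later mutation is visible
copied_x = copied_snapshot['x']   # 5 -- restart from the captured value
```

Note that for a local bound to a mutable object the distinction evaporates: both snapshots hold a reference to the same object, which is Jeremy's point about ints being the oddballs.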
Also, all the locals that are references to mutable objects are already effectively shared. So it's only a few oddballs like ints that are an issue.

Jeremy

From tim_one at email.msn.com Fri May 21 00:04:04 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 20 May 1999 18:04:04 -0400
Subject: [Python-Dev] A "real" continuation example
In-Reply-To: <14148.3389.962368.221063@cm-29-94-2.nycap.rr.com>
Message-ID: <000601bea30c$ad51b220$9d9e2299@tim>

[Tim]
> So what would this look like in Continuation Python?

[Sam]
> Here's my first hack at it. Most likely wrong. It is
> REALLY HARD to do this without having the feature to play with.

[Skip]
> The thought that it's unlikely one could arrive at a reasonable
> approximation of a correct solution for such a small problem without the
> ability to "play with" it is sort of scary.

Yes it is. But while the problem is small, it's not easy, and only the Icon solution wrote itself (not a surprise -- Icon was designed for expressing this kind of algorithm, and the entire language is actually warped towards it). My first stab at the Python stack-fiddling solution had bugs too, but I conveniently didn't post that. After studying Sam's code, I expect it *would* work as written, so it's a decent bet that it's a reasonable approximation to a correct solution as-is.

A different Python approach using threads can be built using Demo/threads/Generator.py from the source distribution. To make that a fair comparison, I would have to post the supporting machinery from Generator.py too -- and we can ask Guido whether Generator.py worked right the first time he tried it.

The continuation solution is subtle, requiring real expertise; but the threads solution doesn't fare any better on that count (building the support machinery with threads is also a baffler if you don't have thread expertise). If we threw Python metaclasses into the pot too, they'd be a third kind of nightmare for the non-expert.
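For comparison, the thread-based machinery Tim mentions has roughly this shape -- a condensed sketch in the spirit of Demo/threads/Generator.py, not its actual code. The producer runs in its own thread, and two semaphores hand control back and forth so only one side runs at a time:

```python
import threading

class Generator:
    def __init__(self, func, *args):
        self._ready = threading.Semaphore(0)   # a value is waiting
        self._done = threading.Semaphore(0)    # consumer wants the next value
        self._value = None
        self._exhausted = False
        worker = threading.Thread(target=self._run, args=(func,) + args)
        worker.daemon = True
        worker.start()

    def _run(self, func, *args):
        self._done.acquire()          # wait for the first get()
        func(self, *args)             # producer calls self.put() repeatedly
        self._exhausted = True
        self._ready.release()

    def put(self, value):             # called from the producer thread
        self._value = value
        self._ready.release()         # hand the value over...
        self._done.acquire()          # ...and sleep until it's wanted again

    def get(self):                    # called from the consumer
        self._done.release()
        self._ready.acquire()
        if self._exhausted:
            raise EOFError
        return self._value

def fringe(gen, x):
    # The natural recursive walk: the worker thread's own stack holds the
    # state that the continuation and explicit-stack versions manage by hand.
    if type(x) is list:
        for item in x:
            fringe(gen, item)
    else:
        gen.put(x)
```

Used as `g = Generator(fringe, testcase)` with repeated `g.get()` calls until EOFError, it produces the same leaf sequence as the other versions -- at the cost of one thread per generator.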
So, if you're faced with this kind of task, there's simply no easy way to get it done. Thread- and (it appears) continuation-based machinery can be crafted once by an expert, then packaged into an easy-to-use protocol for non-experts.

All in all, I view continuations as a feature most people should actively avoid! I think it has that status in Scheme too (e.g., the famed Schemer's SICP textbook doesn't even mention call/cc). Its real value (if any) is as a Big Invisible Hammer for certified wizards. Where call_cc leaks into the user's view of the world, I'd try to hide it; e.g., where Sam has

    def walk (self, x):
        if type(x) == type([]):
            for item in x:
                self.walk (item)
        else:
            self.item = x
            # call self.suspend() with a continuation
            # that will continue walking the tree
            call_cc (self.suspend)

I'd do

    def walk(self, x):
        if type(x) == type([]):
            for item in x:
                self.walk(item)
        else:
            self.put(x)

where "put" is inherited from the base class (part of the protocol) and hides the call_cc business. Do enough of this, and we'll rediscover why Scheme demands that tail calls not push a new stack frame <0.9 wink>.

the-tradeoffs-are-murky-ly y'rs - tim

From tim_one at email.msn.com Fri May 21 00:04:09 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 20 May 1999 18:04:09 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <37443DC5.1330EAC6@appliedbiometrics.com>
Message-ID: <000701bea30c$af7a1060$9d9e2299@tim>

[Christian]
[... clarified stuff ... thanks! ... much clearer ...]

> ...
> If a frame is the current state, I make it two frames to have two
> current states. One will be saved, the other will be run. This is
> what I call "splitting". Actually, splitting must occur whenever
> a frame can be reached twice, in order to keep elements alive.
So while you may need to split stuff for *some* reasons, I can't see how keeping elements alive could be one of those reasons (unless you're zapping frame contents *before* the frame itself is garbage?). > ... > Well, I see. You want one locals and one globals, shared by two > incarnations. Gets me into trouble. Just clarifying what Scheme does. Since they've been doing this forever, I don't want to toss their semantics on a whim . It's at least a conceptual thing: why *should* locals follow different rules than globals? If Python2 grows lexical closures, the only thing special about today's "locals" is that they happen to be the first guys found on the search path. Conceptually, that's really all they are today too. Here's the clearest Scheme example I can dream up: (define k #f) (define (printi i) (display "i is ") (display i) (newline)) (define (test n) (let ((i n)) (printi i) (set! i (- i 1)) (printi i) (display "saving continuation") (newline) (call/cc (lambda (here) (set! k here))) (set! i (- i 1)) (printi i) (set! i (- i 1)) (printi i))) No loops, no recursive calls, just a straight chain of fiddle-a-local ops. Here's some output: > (test 5) i is 5 i is 4 saving continuation i is 3 i is 2 > (k #f) i is 1 i is 0 > (k #f) i is -1 i is -2 > (k #f) i is -3 i is -4 > So there's no question about what Scheme thinks is proper behavior here. > ... > Let me explain. What Python does right now is: > When a function is invoked, all local variables are copied > into fast_locals, well of course just references are copied > and counts increased. These fast locals give a lot of speed > today, we must have them. Scheme (most of 'em, anyway) also resolves locals via straight base + offset indexing. > You are saying I have to share locals between frames. 
> Besides
> that will be a reasonable slowdown, since an extra structure
> must be built and accessed indirectly (right now, it's all fast,
> living in the one frame buffer),

GETLOCAL and SETLOCAL simply index off of the fastlocals pointer; it doesn't care where that points *to*. I cannot say that I'm convinced that this is what we need.

>
> Suppose you have a function
>
>     def f(x):
>         # do something
>         ...
>         # in some context, wanna have a snapshot
>         global snapshot  # initialized to None
>         if not snapshot:
>             snapshot = callcc.new()
>         # continue computation
>         x = x+1
>         ...
>
> What I want to achieve is that I can run this again, from my
> snapshot. But with shared locals, my parameter x of the
> snapshot would have changed to x+1, which I don't find useful.

You need a completely fleshed-out example to score points here: the use of call/cc is subtle, hinging on details, and fragments ignore too much. If you do want the same x,

    commonx = x
    if not snapshot:
        # get the continuation
    # continue computation
    x = commonx
    x = x+1
    ...

That is, it's easy to get it. But if you *do* want to see changes to the locals (which is one way for those distinct continuation invocations to *cooperate* in solving a task -- see below), but the implementation doesn't allow for it, I don't know what you can do to worm around it short of making x global too. But then different *top* level invocations of f will stomp on that shared global, so that's not a solution either. Maybe forget functions entirely and make everything a class method.

> I want to fix a state of the current frame and still think
> it should "own" its locals. Globals are borrowed, anyway.
> Class instances will anyway do what you want, since
> the local "self" is a mutable object.
>
> How do you want to keep computations independent
> when locals are shared? For me it's just easier to
> implement and also to think with the shallow copy.
> Otherwise, where is my private place?
> Open for becoming convinced, of course :-)

I imagine it comes up less often in Scheme because it has no loops: communication among "iterations" is via function arguments or up-level lexical vrbls.

So recall your uses of Icon generators instead: like Python, Icon does have loops, and two-level scoping, and I routinely build loopy Icon generators that keep state in locals. Here's a dirt-simple example I emailed to Sam earlier this week:

    procedure main()
        every result := fib(0, 1) \ 10 do
            write(result)
    end

    procedure fib(i, j)
        local temp
        repeat {
            suspend i
            temp := i + j
            i := j
            j := temp
        }
    end

which prints

    0
    1
    1
    2
    3
    5
    8
    13
    21
    34

If Icon restored the locals (i, j, temp) upon each fib resumption, it would generate a zero followed by an infinite sequence of ones(!).

Think of a continuation as a *paused* computation (which it is) rather than an *independent* one (which it isn't), and I think it gets darned hard to argue.

theory-and-practice-agree-here-in-my-experience-ly y'rs  - tim

From MHammond at skippinet.com.au  Fri May 21 01:01:22 1999
From: MHammond at skippinet.com.au (Mark Hammond)
Date: Fri, 21 May 1999 09:01:22 +1000
Subject: [Python-Dev] Interactive Debugging of Python
In-Reply-To: <374417D8.8DBCB617@appliedbiometrics.com>
Message-ID: <00c001bea314$aefc5b40$0801a8c0@bobcat>

> I'm writing a prototype of a stackless Python, which means that
> you will be able to access the current state of the interpreter
> completely.
> The inner interpreter loop will be isolated from the frame
> dispatcher. It will break whenever the ticker goes to zero.
> If you set the ticker to one, you will be able to single
> step on every opcode, have the value stack, the frame chain,
> everything.

I think the main point is how to change code when a Python frame already references it. I don't think the structure of the frames is as important as the general concept.
But while we were talking frame-fiddling it seemed a good point to try and hijack it a little :-)

Would it be possible to recompile just a block of code (e.g., just the current function or method) and patch it back in such a way that the current frame continues execution of the new code?

I feel this is somewhat related to the inability to change class implementation for an existing instance. I know there have been hacks around this before, but they aren't completely reliable, and IMO it would be nice if the core Python made it easier to change already running code; whether that code is in an existing stack frame or just in an already created instance, it is currently very difficult to do.

This is an attempt to deflect some conversation away from changing Python as such towards an attempt at enhancing its _environment_. To paraphrase many people before me, even if we completely froze the language now there would still be plenty of work ahead of us :-)

Mark.

From guido at CNRI.Reston.VA.US  Fri May 21 02:06:51 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 20 May 1999 20:06:51 -0400
Subject: [Python-Dev] Interactive Debugging of Python
In-Reply-To: Your message of "Fri, 21 May 1999 09:01:22 +1000." <00c001bea314$aefc5b40$0801a8c0@bobcat>
References: <00c001bea314$aefc5b40$0801a8c0@bobcat>
Message-ID: <199905210006.UAA07900@eric.cnri.reston.va.us>

> I think the main point is how to change code when a Python frame already
> references it. I don't think the structure of the frames is as important as
> the general concept. But while we were talking frame-fiddling it seemed a
> good point to try and hijack it a little :-)
>
> Would it be possible to recompile just a block of code (e.g., just the
> current function or method) and patch it back in such a way that the
> current frame continues execution of the new code?
This topic sounds mostly unrelated to the stackless discussion -- in either case you need to be able to fiddle the contents of the frame and the bytecode pointer to reflect the changed function. Some issues:

- The slots containing local variables may be renumbered after recompilation; fortunately we know the name--number mapping, so we can move them to their new location. But it is still tricky.

- Should you be able to edit functions that are present on the call stack below the top? Suppose we have two functions:

      def f():
          return 1 + g()

      def g():
          return 0

  Suppose we set a break in g(), and then edit the source of f(). We can do all sorts of evil to f(): e.g. we could change it to return g() + 2, which affects the contents of the value stack when g() returns (originally, the value stack contained the value 1, now it is empty). Or we could even change f() to return 3, thereby eliminating the call to g() altogether!

What kind of limitations do other systems that support modifying a "live" program being debugged impose? Only allowing modification of the function at the top of the stack might eliminate some problems, although there are still ways to mess up. The value stack is not always empty even when we only stop at statement boundaries -- e.g. it contains 'for' loop indices, and there's also the 'block' stack, which contains try-except information. E.g. what should happen if we change

    def f():
        for i in range(10):
            print 1

stopped at the 'print 1' into

    def f():
        print 1

??? (Ditto for removing or adding a try/except block.)

> I feel this is somewhat related to the inability to change class
> implementation for an existing instance. I know there have been hacks
> around this before but they aren't completely reliable and IMO it would be
> nice if the core Python made it easier to change already running code -
> whether that code is in an existing stack frame, or just in an already
> created instance, it is very difficult to do.

I've been thinking a bit about this.
Function objects now have mutable func_code attributes (and also func_defaults); I think we can use this.

The hard part is to do the analysis needed to decide which functions to recompile! Ideally, we would simply edit a file and tell the programming environment "recompile this". The programming environment would compare the changed file with the old version that it had saved for this purpose, and notice (for example) that we changed two methods of class C. It would then recompile those methods only and stuff the new code objects in the corresponding function objects.

But what would it do when we changed a global variable? Say a module originally contains a statement "x = 0". Now we change the source code to say "x = 100". Should we change the variable x? Suppose that x is modified by some of the computations in the module, and that, after some computations, the actual value of x was 50. Should the "recompile" reset x to 100 or leave it alone?

One option would be to actually change the semantics of the class and def statements so that they modify an existing class or function rather than using assignment. Effectively, this proposal would change the semantics of

    class A:
        ...some code...

    class A:
        ...some more code...

to be the same as

    class A:
        ...some code...
        ...some more code...

This is somewhat similar to the way the module or package commands in some other dynamic languages work, I think; and I don't think this would break too much existing code.

The proposal would also change

    def f():
        ...some code...

    def f():
        ...other code...

but here the equivalence is not so easy to express, since I want different semantics (I don't want the second f's code to be tacked onto the end of the first f's code). If we understand that

    def f(): ...

really does the following:

    f = NewFunctionObject()
    f.func_code = ...code object...

then the construct above (def f(): ... def f(): ...) would do this:

    f = NewFunctionObject()
    f.func_code = ...some code...
    f.func_code = ...other code...
i.e. there is no assignment of a new function object for the second def. Of course if there is a variable f but it is not a function, it would have to be assigned a new function object first. But in the case of def, this *does* break existing code. E.g.

    # module A
    from B import f
     .
     .
     .
    if ...some test...:
        def f():
            ...some code...

This idiom conditionally redefines a function that was also imported from some other module. The proposed new semantics would change B.f in place!

So perhaps these new semantics should only be invoked when a special "reload-compile" is asked for... Or perhaps the programming environment could do this through source parsing as I proposed before...

> This is an attempt to deflect some conversation away from changing
> Python as such towards an attempt at enhancing its _environment_. To
> paraphrase many people before me, even if we completely froze the language
> now there would still be plenty of work ahead of us :-)

Please, no more posts about Scheme. Each new post mentioning call/cc makes it *less* likely that something like that will ever be part of Python. "What if Guido's brain exploded?" :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From skip at mojam.com  Fri May 21 03:13:28 1999
From: skip at mojam.com (Skip Montanaro)
Date: Thu, 20 May 1999 21:13:28 -0400 (EDT)
Subject: [Python-Dev] Interactive Debugging of Python
In-Reply-To: <199905210006.UAA07900@eric.cnri.reston.va.us>
References: <00c001bea314$aefc5b40$0801a8c0@bobcat> <199905210006.UAA07900@eric.cnri.reston.va.us>
Message-ID: <14148.45321.204380.19130@cm-29-94-2.nycap.rr.com>

Guido> What kind of limitations do other systems that support modifying
Guido> a "live" program being debugged impose? Only allowing
Guido> modification of the function at the top of the stack might
Guido> eliminate some problems, although there are still ways to mess
Guido> up.
Frame objects maintain pointers to the active code objects, locals and globals, so modifying a function object's code or globals shouldn't have any effect on currently executing frames, right? I assume frame objects do the usual INCREF/DECREF dance, so the old code object won't get deleted before the frame object is tossed.

Guido> But what would it do when we changed a global variable? Say a
Guido> module originally contains a statement "x = 0". Now we change
Guido> the source code to say "x = 100". Should we change the variable
Guido> x? Suppose that x is modified by some of the computations in the
Guido> module, and that, after some computations, the actual value
Guido> of x was 50. Should the "recompile" reset x to 100 or leave it
Guido> alone?

I think you should note the change for users and give them some way to easily pick between old initial value, new initial value or current value.

Guido> Please, no more posts about Scheme. Each new post mentioning
Guido> call/cc makes it *less* likely that something like that will ever
Guido> be part of Python. "What if Guido's brain exploded?" :-)

I agree. I see call/cc or set! and my eyes just glaze over...

Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/
skip at mojam.com | Musi-Cal: http://www.musi-cal.com/
518-372-5583

From MHammond at skippinet.com.au  Fri May 21 03:42:14 1999
From: MHammond at skippinet.com.au (Mark Hammond)
Date: Fri, 21 May 1999 11:42:14 +1000
Subject: [Python-Dev] Interactive Debugging of Python
In-Reply-To: <199905210006.UAA07900@eric.cnri.reston.va.us>
Message-ID: <00c501bea32b$277ce3d0$0801a8c0@bobcat>

[Guido writes...]
> This topic sounds mostly unrelated to the stackless discussion -- in

Sure is - I just saw that as an excuse to try and hijack it

> Some issues:
>
> - The slots containing local variables may be renumbered after

Generally, I think we could make something very useful even with a number of limitations.
For example, I would find a first cut completely acceptable and a great improvement on today if:

* Only the function at the top of the stack can be recompiled and have the code reflected while executing. This function also must be restarted after such an edit. If the function uses global variables or makes calls that restarting will screw up, then either a) make the code changes _before_ doing this stuff, or b) live with it for now, and help us remove the limitation :-)

That may make the locals being renumbered easier to deal with, and also remove some of the problems you discussed about editing functions below the top.

> What kind of limitations do other systems that support modifying a
> "live" program being debugged impose? Only allowing modification of

I can only speak for VC, and from experience at that - I haven't attempted to find documentation on it.

It accepts most changes while running. The current line is fine. If you create or change the definition of globals (and possibly even the type of locals?), the "incremental compilation" fails, and you are given the option of continuing with the old code, or stopping the process and doing a full build. When the debug session terminates, some link process (and maybe even compilation?) is done to bring the .exe on disk up to date with the changes.

If you do weird stuff like delete the line being executed, it usually gives you some warning message before either restarting the function or trying to pick a line somewhere near the line you deleted. Either way, it can screw up, moving the "current" line somewhere else - it doesn't crash the debugger, but may not do exactly what you expected. It is still a _huge_ win, and a great feature!

Ironically, I turn this feature _off_ for Python extensions. Although changing the C code is great, in 99% of the cases I also need to change some .py code, and as existing instances are affected I need to restart the app anyway - so I may as well do a normal build at that time.
i.e., C now lets me debug incrementally, but a far more dynamic language prevents this feature being useful ;-)

> the function at the top of the stack might eliminate some problems,
> although there are still ways to mess up. The value stack is not
> always empty even when we only stop at statement boundaries

If we forced a restart would this be better? Can we reliably reset the stack to the start of the current function?

> I've been thinking a bit about this. Function objects now have
> mutable func_code attributes (and also func_defaults), I think we can
> use this.
>
> The hard part is to do the analysis needed to decide which functions
> to recompile! Ideally, we would simply edit a file and tell the
> programming environment "recompile this". The programming environment
> would compare the changed file with the old version that it had saved
> for this purpose, and notice (for example) that we changed two methods
> of class C. It would then recompile those methods only and stuff the
> new code objects in the corresponding function objects.

If this would work for the few changed functions/methods, what would the impact be of doing it for _every_ function (changed or not)? Then the analysis can drop to the module level, which is much easier. I don't think a slight performance hit is a problem at all when doing this stuff.

> One option would be to actually change the semantics of the class and
> def statements so that they modify an existing class or function
> rather than using assignment. Effectively, this proposal would change
> the semantics of
>
>     class A:
>         ...some code...
>
>     class A:
>         ...some more code...
>
> to be the same as
>
>     class A:
>         ...some code...
>         ...some more code...

Or extending this (didn't this come up at the latest IPC?):

    # .\package\__init__.py
    class BigMutha: pass

    # .\package\something.py
    class package.BigMutha:
        def some_category_of_methods():
            ...

    # .\package\other.py
    class package.BigMutha:
        def other_category_of_methods():
            ...
[Of course, this won't fly as it stands; just a conceptual possibility]

> So perhaps these new semantics should only be invoked when a special
> "reload-compile" is asked for... Or perhaps the programming
> environment could do this through source parsing as I proposed
> before...

From guido at CNRI.Reston.VA.US  Fri May 21 05:02:49 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 20 May 1999 23:02:49 -0400
Subject: [Python-Dev] Interactive Debugging of Python
In-Reply-To: Your message of "Fri, 21 May 1999 11:42:14 +1000." <00c501bea32b$277ce3d0$0801a8c0@bobcat>
References: <00c501bea32b$277ce3d0$0801a8c0@bobcat>
Message-ID: <199905210302.XAA08129@eric.cnri.reston.va.us>

> Generally, I think we could make something very useful even with a number
> of limitations. For example, I would find a first cut completely
> acceptable and a great improvement on today if:
>
> * Only the function at the top of the stack can be recompiled and have the
> code reflected while executing. This function also must be restarted after
> such an edit. If the function uses global variables or makes calls that
> restarting will screw-up, then either a) make the code changes _before_
> doing this stuff, or b) live with it for now, and help us remove the
> limitation :-)

OK, restarting the function seems a reasonable compromise and would seem relatively easy to implement. Not *real* easy though: it turns out that eval_code2() is called with a code object as argument, and it's not entirely trivial to figure out the corresponding function object from which to grab the new code object. But it could be done -- give it a try. (Don't wait for me, I'm ducking for cover until at least mid June.)

> Ironically, I turn this feature _off_ for Python extensions. Although
> changing the C code is great, in 99% of the cases I also need to change
> some .py code, and as existing instances are affected I need to restart the
> app anyway - so I may as well do a normal build at that time.
ie, C now > lets me debug incrementally, but a far more dynamic language prevents this > feature being useful ;-) I hear you. > If we forced a restart would this be better? Can we reliably reset the > stack to the start of the current function? Yes, no problem. > If this would work for the few changed functions/methods, what would the > impact be of doing it for _every_ function (changed or not)? Then the > analysis can drop to the module level which is much easier. I dont think a > slight performace hit is a problem at all when doing this stuff. Yes, this would be fine too. > >"What if Guido's brain exploded?" :-) > > At least on that particular topic I didnt even consider I was the only one > in fear of that! But it is good to know that you specifically are too :-) Have no fear. I've learned to say no. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From tim_one at email.msn.com Fri May 21 07:36:44 1999 From: tim_one at email.msn.com (Tim Peters) Date: Fri, 21 May 1999 01:36:44 -0400 Subject: [Python-Dev] Interactive Debugging of Python In-Reply-To: <199905210006.UAA07900@eric.cnri.reston.va.us> Message-ID: <000401bea34b$e93fcda0$d89e2299@tim> [GvR] > ... > What kind of limitations do other systems that support modifying a > "live" program being debugged impose? As an ex-compiler guy, I should have something wise to say about that. Alas, I've never used a system that allowed more than poking new values into vrbls, and the thought of any more than that makes me vaguely ill! Oh, that's right -- I'm vaguely ill anyway today. Still-- oooooh -- the problems. This later got reduced to restarting the topmost function from scratch. That has some attraction, especially on the bang-for-buck-o-meter. > ... > Please, no more posts about Scheme. Each new post mentioning call/cc > makes it *less* likely that something like that will ever be part of > Python. "What if Guido's brain exploded?" :-) What a pussy . 
Really, overall continuations are much less trouble to understand than threads -- there's only one function in the entire interface!

OK. So how do you feel about coroutines? Would sure be nice to have *some* way to get pseudo-parallel semantics regardless of OS.

changing-code-on-the-fly-==-mutating-the-current-continuation-ly y'rs  - tim

From tismer at appliedbiometrics.com  Fri May 21 09:12:05 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Fri, 21 May 1999 09:12:05 +0200
Subject: [Python-Dev] Interactive Debugging of Python
References: <00c001bea314$aefc5b40$0801a8c0@bobcat>
Message-ID: <37450745.21D63A5@appliedbiometrics.com>

Mark Hammond wrote:
>
> > I'm writing a prototype of a stackless Python, which means that
> > you will be able to access the current state of the interpreter
> > completely.
> > The inner interpreter loop will be isolated from the frame
> > dispatcher. It will break whenever the ticker goes to zero.
> > If you set the ticker to one, you will be able to single
> > step on every opcode, have the value stack, the frame chain,
> > everything.
>
> I think the main point is how to change code when a Python frame already
> references it. I don't think the structure of the frames is as important as
> the general concept. But while we were talking frame-fiddling it seemed a
> good point to try and hijack it a little :-)
>
> Would it be possible to recompile just a block of code (e.g., just the
> current function or method) and patch it back in such a way that the
> current frame continues execution of the new code?

Sure. Since the frame holds a pointer to the code, and the current IP and SP, your code can easily change it (with care, or GPF :). It could even create a fresh code object and let it run only for the running instance. By instance, I mean a frame which is running a code object.

> I feel this is somewhat related to the inability to change class
> implementation for an existing instance.
> I know there have been hacks
> around this before but they aren't completely reliable and IMO it would be
> nice if the core Python made it easier to change already running code -
> whether that code is in an existing stack frame, or just in an already
> created instance, it is very difficult to do.

I think this has been difficult only because the information was hidden in the inner interpreter loop. Gonna change now.

ciao - chris

--
Christian Tismer             :^)
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9  9D15 D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home

From tismer at appliedbiometrics.com  Fri May 21 09:21:22 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Fri, 21 May 1999 09:21:22 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000901bea26a$34526240$179e2299@tim> <37443DC5.1330EAC6@appliedbiometrics.com> <14148.21750.738559.424456@bitdiddle.cnri.reston.va.us>
Message-ID: <37450972.D19E160@appliedbiometrics.com>

Jeremy Hylton wrote:
>
> >>>>> "CT" == Christian Tismer writes:
>
> CT> What I want to achieve is that I can run this again, from my
> CT> snapshot. But with shared locals, my parameter x of the snapshot
> CT> would have changed to x+1, which I don't find useful. I want to
> CT> fix a state of the current frame and still think it should "own"
> CT> its locals. Globals are borrowed, anyway. Class instances will
> CT> anyway do what you want, since the local "self" is a mutable
> CT> object.
>
> CT> How do you want to keep computations independent when locals are
> CT> shared? For me it's just easier to implement and also to think
> CT> with the shallow copy. Otherwise, where is my private place?
> CT> Open for becoming convinced, of course :-)
>
> I think you're making things a lot more complicated by trying to
> instantiate new variable bindings for locals every time you create a
> continuation. Can you give an example of why that would be helpful?

I'm not sure whether you all understand me, and vice versa. There is no copying at all, except for the frame. I copy the frame, which means I also incref all the objects which it holds. Done. This is the bare minimum which I must do.

> (Ok. I'm not sure I can offer a good example of why it would be
> helpful to share them, but it makes intuitive sense to me.)
>
> The call_cc mechanism is going to let you capture the current
> continuation, save it somewhere, and call on it again as often as you
> like. Would you get a fresh locals each time you used it? or just
> the first time? If only the first time, it doesn't seem that you've
> gained a whole lot.

call_cc does a copy of the state, which is the frame. This is stored away until it is revived. Nothing else happens. As Guido pointed out, virtually the whole frame chain is duplicated, but only on demand.

> Also, all the locals that are references to mutable objects are
> already effectively shared. So it's only a few oddballs like ints
> that are an issue.

Simply look at a frame, what it is. What do you need to do to run it again with a given state? You have to preserve the stack variables. And you have to preserve the current locals, since some of them might even have a copy on the stack, and we want to stay consistent.

I believe it would become obvious if you tried to implement it. Maybe I should close my ears and get something ready to show?

ciao - chris

--
Christian Tismer             :^)
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9  9D15 D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home

From tismer at appliedbiometrics.com  Fri May 21 11:00:26 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Fri, 21 May 1999 11:00:26 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000701bea30c$af7a1060$9d9e2299@tim>
Message-ID: <374520AA.2ADEA687@appliedbiometrics.com>

Tim Peters wrote:
>
> [Christian]
> [... clarified stuff ... thanks! ... much clearer ...]

But still not clear enough, I fear.

> > ...
> > If a frame is the current state, I make it two frames to have two
> > current states. One will be saved, the other will be run. This is
> > what I call "splitting". Actually, splitting must occur whenever
> > a frame can be reached twice, in order to keep elements alive.
>
> That part doesn't compute: if a frame can be reached by more than one path,
> its refcount must be at least equal to the number of its immediate
> predecessors, and its refcount won't fall to 0 before it becomes
> unreachable. So while you may need to split stuff for *some* reasons, I
> can't see how keeping elements alive could be one of those reasons (unless
> you're zapping frame contents *before* the frame itself is garbage?).

I was saying that under the side condition that I don't want to change frames as they are now. Maybe that's misconceived, but this is what I did: If a frame as we have it today shall be resumed twice, then it has to be copied, since the stack is in it and holds state which will change after resuming.

That was the whole problem with my first prototype, which was done hoping that I wouldn't need to change the interpreter at all. Wrong, bad, however. What I actually did was more than seems to be needed: I made a copy of the whole current frame chain.
Later on, Guido said this can be done on demand. He's right.

[Scheme sample - understood]

> GETLOCAL and SETLOCAL simply index off of the fastlocals pointer; it
> doesn't care where that points *to* other frame and ceval2 wouldn't know
> the difference). Maybe a frame entered due to continuation needs extra
> setup work? Scheme saves itself by putting name-resolution and
> continuation info into different structures; to mimic the semantics,
> Python would need to get the same end effect.

Point taken. The pointer doesn't save time of access, it just saves allocating another structure. So we can use something else without speed loss.

[have to cut a little]

> So recall your uses of Icon generators instead: like Python, Icon does have
> loops, and two-level scoping, and I routinely build loopy Icon generators
> that keep state in locals. Here's a dirt-simple example I emailed to Sam
> earlier this week:
>
>     procedure main()
>         every result := fib(0, 1) \ 10 do
>             write(result)
>     end
>
>     procedure fib(i, j)
>         local temp
>         repeat {
>             suspend i
>             temp := i + j
>             i := j
>             j := temp
>         }
>     end

[prints fib series]

> If Icon restored the locals (i, j, temp) upon each fib resumption, it would
> generate a zero followed by an infinite sequence of ones(!).

Now I'm completely missing the point. Why should I want to restore anything? At a suspend, which with continuations is done by temporarily having two identical states, one state is saved and the other is continued. The continued one in your example just returns the current value and immediately forgets about the locals. The other one is continued later, and of course with the same locals which were active when going to sleep.

> Think of a continuation as a *paused* computation (which it is) rather than
> an *independent* one (which it isn't), and I think it gets darned
> hard to argue.

No, you get me wrong. I understand what you mean.
It is just the decision whether a frame, which will be reactivated later as a continuation, should use a reference to locals like the reference which it has for the globals. This would force a major frame redesign.

Current design:

    A frame is: back chain, state, code, unpacked locals, globals, stack.
    Code and globals are shared. State, unpacked locals and stack are private.

Possible new design:

    A frame is: back chain, state, code, variables, globals, stack.
    variables is: unpacked locals.

This makes the variables into an extra structure which is shared. Probably a list would be the thing, or abusing a tuple as a mutable object. Hmm. I think I should get something ready, and we should keep this thread short, or we will lose the rest of Guido's goodwill (if not already).

ciao - chris

--
Christian Tismer             :^)
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9  9D15 D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home

From da at ski.org  Fri May 21 18:27:42 1999
From: da at ski.org (David Ascher)
Date: Fri, 21 May 1999 09:27:42 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] Interactive Debugging of Python
In-Reply-To: <000401bea34b$e93fcda0$d89e2299@tim>
Message-ID: 

On Fri, 21 May 1999, Tim Peters wrote:

> OK. So how do you feel about coroutines? Would sure be nice to have *some*
> way to get pseudo-parallel semantics regardless of OS.

I read about coroutines years ago on c.l.py, but I admit I forgot it all. Can you explain them briefly in pseudo-python?

--david

From tim_one at email.msn.com  Sat May 22 06:22:50 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Sat, 22 May 1999 00:22:50 -0400
Subject: [Python-Dev] Coroutines
In-Reply-To: 
Message-ID: <000401bea40a$c1d2d2c0$659e2299@tim>

[Tim]
> OK. So how do you feel about coroutines?
Would sure be nice
> to have *some* way to get pseudo-parallel semantics regardless of OS.

[David Ascher]
> I read about coroutines years ago on c.l.py, but I admit I forgot it all.
> Can you explain them briefly in pseudo-python?

How about real Python? http://www.python.org/tim_one/000169.html contains a complete coroutine implementation using threads under the covers (& exactly 5 years old tomorrow ). If I were to do it over again, I'd use a different object interface (making coroutines objects in their own right instead of funneling everything through a "coroutine controller" object), but the ideas are the same in every coroutine language. The post contains several executable examples, from simple to "literature standard". I had forgotten all about this: it contains solutions to the same "compare tree fringes" problem Sam mentioned, *and* the generator-based building block I posted three other solutions for in this thread. That last looks like:

# fringe visits a nested list in inorder, and detaches for each non-list
# element; raises EarlyExit after the list is exhausted
def fringe( co, list ):
    for x in list:
        if type(x) is type([]):
            fringe(co, x)
        else:
            co.detach(x)

def printinorder( list ):
    co = Coroutine()
    f = co.create(fringe, co, list)
    try:
        while 1:
            print co.tran(f),
    except EarlyExit:
        pass
    print

printinorder([1,2,3])          # 1 2 3
printinorder([[[[1,[2]]],3]])  # ditto
x = [0, 1, [2, [3]], [4,5], [[[6]]] ]
printinorder(x)                # 0 1 2 3 4 5 6

Generators are really "half a coroutine", so this doesn't show the full power (other examples in the post do). co.detach is a special way to deal with this asymmetry. In the general case you use co.tran all the time, where (see the post for more info)

    v = co.tran(c [, w])

means "resume coroutine c from the place it last did a co.tran, optionally passing it the value w, and when somebody does a co.tran back to *me*, resume me right here, binding v to the value *they* pass to co.tran ).
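[Editorial aside: for comparison with the threads-based version above, here is the same fringe/printinorder pair written with the generators Python eventually acquired -- a much later sketch, not something available when this was posted:]

```python
def fringe(lst):
    # Visit a nested list in inorder, yielding each non-list element;
    # the recursive call re-yields the elements of inner lists.
    for x in lst:
        if isinstance(x, list):
            yield from fringe(x)
        else:
            yield x

def printinorder(lst):
    print(*fringe(lst))

printinorder([1, 2, 3])                          # 1 2 3
printinorder([[[[1, [2]]], 3]])                  # ditto
printinorder([0, 1, [2, [3]], [4, 5], [[[6]]]])  # 0 1 2 3 4 5 6
```

Exhaustion simply ends the for loop, so no EarlyExit-style exception is needed. Note what is lost relative to the real thing: a generator can only suspend to whoever called it, so the symmetric co.tran, which resumes any partner anywhere, has no direct counterpart here.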
Knuth complains several times that it's very hard to come up with a coroutine example that's both simple and clear <0.5 wink>. In a nutshell, coroutines don't have a "caller/callee" relationship, they have a "we're all equal partners" relationship, where any coroutine is free to resume any other one where it left off. It's no coincidence that making coroutines easy to use was pioneered by simulation languages! Just try simulating a marriage where one partner is the master and the other a slave .

i-may-be-a-bachelor-but-i-have-eyes-ly y'rs - tim

From tim_one at email.msn.com Sat May 22 06:22:55 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Sat, 22 May 1999 00:22:55 -0400
Subject: [Python-Dev] Re: Coroutines
In-Reply-To:
Message-ID: <000501bea40a$c3d1fe20$659e2299@tim>

Thoughts o' the day:

+ Generators ("semi-coroutines") are wonderful tools and easy to implement without major changes to the PVM. Icon calls 'em generators, Sather calls 'em iterators, and they're exactly what you need to implement "for thing in object:" when object represents a collection that's tricky to materialize. Python needs something like that. OTOH, generators are pretty much limited to that.

+ Coroutines are more general but much harder to implement, because each coroutine needs its own stack (a generator only has one stack *frame* -- its own -- to worry about), and C-calling-Python can get into the act. As Sam said, they're probably no easier to implement than call/cc (but trivial to implement given call/cc).

+ What may be most *natural* is to forget all that and think about a variation of Python threads implemented directly via the interpreter, without using OS threads. The PVM already knows how to handle thread-state swapping. Given Christian's stackless interpreter, and barring C->Python cases, I suspect Python can fake threads all by itself, in the sense of interleaving their executions within a single "real" (OS) thread.
Given the global interpreter lock, Python effectively does only-one-at-a-time anyway. Threads are harder than generators or coroutines to learn, but

A) Many more people know how to use them already.

B) Generators and coroutines can be implemented using (real or fake) threads.

C) Python has offered threads since the beginning.

D) Threads offer a powerful mode of control transfer coroutines don't, namely "*anyone* else who can make progress now, feel encouraged to do so at my expense".

E) For whatever reasons, in my experience people find threads much easier to learn than call/cc -- perhaps because threads are *obviously* useful upon first sight, while it takes a real Zen Experience before call/cc begins to make sense.

F) Simulated threads could presumably produce much more informative error msgs (about deadlocks and such) than OS threads, so even people using real threads could find excellent debugging use for them.

Sam doesn't want to use "real threads" because they're pigs; fake threads don't have to be. Perhaps

    x = y.SOME_ASYNC_CALL(r, s, t)

could map to e.g.

import config
if config.USE_REAL_THREADS:
    import threading
else:
    from simulated_threading import threading
from config.shared import msg_queue

class Y:
    def __init__(self, ...):
        self.ready = threading.Event()
        ...

    def SOME_ASYNC_CALL(self, r, s, t):
        result = [None]  # mutable container to hold the result
        msg_queue.put((server_of_the_day, r, s, t, self.ready, result))
        self.ready.wait()
        self.ready.clear()
        return result[0]

where some other simulated thread polls the msg_queue and does ready.set() when it's done processing the msg enqueued by SOME_ASYNC_CALL.
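[Editorial aside: the unseen half of that sketch, the thread servicing msg_queue, can be made concrete. This stand-alone version uses real threads and invented names (server_loop, some_async_call) purely to show the round trip; it is an illustration of the pattern, not code from the thread:]

```python
import queue
import threading

msg_queue = queue.Queue()

def server_loop():
    # Service messages of the form (args, ready_event, result_holder):
    # compute, store the answer in the mutable holder, wake the caller.
    while True:
        msg = msg_queue.get()
        if msg is None:        # sentinel: shut the server down
            break
        args, ready, result = msg
        result[0] = sum(args)  # stand-in for the real remote/async work
        ready.set()

def some_async_call(*args):
    ready = threading.Event()
    result = [None]            # mutable container, as in the sketch above
    msg_queue.put((args, ready, result))
    ready.wait()               # block until the server has answered
    return result[0]

server = threading.Thread(target=server_loop, daemon=True)
server.start()
print(some_async_call(1, 2, 3))  # 6
```

In the simulated-threading world, the blocking ready.wait() is exactly the point where the scheduler would switch to another fake thread.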
For this to scale nicely, it's probably necessary for the PVM to cooperate with the simulated_threading implementation (e.g., a simulated thread that blocks (like on self.ready.wait()) should be taken out of the collection of simulated threads the PVM may attempt to resume -- else in Sam's case the PVM would repeatedly attempt to wake up thousands of blocked threads, and things would slow to a crawl). Of course, simulated_threading could be built on top of call/cc or coroutines too. The point to making threads the core concept is keeping Guido's brain from exploding. Plus, as above, you can switch to "real threads" by changing an import statement.

making-sure-the-global-lock-support-hair-stays-around-even-if-greg-renders-it-moot-for-real-threads-ly y'rs - tim

From tismer at appliedbiometrics.com Sat May 22 18:20:30 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Sat, 22 May 1999 18:20:30 +0200
Subject: [Python-Dev] Coroutines
References: <000401bea40a$c1d2d2c0$659e2299@tim>
Message-ID: <3746D94E.239D0B8E@appliedbiometrics.com>

Tim Peters wrote:
>
> [Tim]
> > OK. So how do you feel about coroutines? Would sure be nice
> > to have *some* way to get pseudo-parallel semantics regardless of OS.
>
> [David Ascher]
> > I read about coroutines years ago on c.l.py, but I admit I forgot it all.
> > Can you explain them briefly in pseudo-python?
>
> How about real Python? http://www.python.org/tim_one/000169.html contains a
> complete coroutine implementation using threads under the covers (& exactly
> 5 years old tomorrow ). If I were to do it over again, I'd use a
> different object interface (making coroutines objects in their own right
> instead of funneling everything through a "coroutine controller" object),
> but the ideas are the same in every coroutine language. The post contains
> several executable examples, from simple to "literature standard".

What an interesting thread!
Unfortunately, all the examples are messed up since some HTML formatter didn't take care of the Python code, rendering it unreadable. Is there a different version available?

Also, I'd like to read the rest of the threads in http://www.python.org/tim_one/ but it seems that only your messages are archived?

Anyway, the citations in http://www.python.org/tim_one/000146.html show me that you have been through all of this five years ago, with a five-years-younger Guido who sounds a bit different than today. I would have understood him better had I known that this is a re-iteration of a somehow dropped or entombed idea. (If someone has the original archives from that epoch, I'd be happy to get a copy. Actually, I'm missing everything up to the end of 1996.)

A short snapshot: Stackless Python is meanwhile nearly alive, with recursion avoided in ceval. Of course, some modules are left which still need work, but enough for a prototype. Frames now contain all necessary state and are prepared for execution and thrown back to the evaluator (elevator?).

The key idea was to change the deeply nested functions in a way that their last eval_code call happens to be tail-recursive. In ceval.c (and in other not yet changed places), functions do a lot of preparation, build some parameters, call eval_code and release the parameters. This was the crux, which I solved by a new field in the frame object, where such references can be stored. The routine can now return with the ready-packaged frame, instead of calling it.

As a minimum facility for future co-anythings, I provided a hook function for resuming frames, which causes no overhead in the usual case but allows overriding what a frame does when someone returns control to it. Implementing this is up to some extension module, whether that may be coroutines or your nice nano-threads; it's possible.

threadedly yours - chris

-- Christian Tismer :^) Applied Biometrics GmbH : Have a break!
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home

From tismer at appliedbiometrics.com Sat May 22 21:04:43 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Sat, 22 May 1999 21:04:43 +0200
Subject: [Python-Dev] How stackless can Python be?
Message-ID: <3746FFCB.CD506BE4@appliedbiometrics.com>

Hi,

to make the core interpreter stackless is one thing. Turning functions which call the interpreter from some deep nesting level into versions that instead return a frame object to be called is possible in many cases. Internals like apply are rather uncomplicated to convert. CallObjectWithKeywords is done.

What I have *no* good solution for is map. Map does an iteration over evaluations and keeps state while it is running. The same applies to reduce, but reduce seems not to be used so much. Map is.

I don't see at the moment if map could be a killer for Tim's nice mini-thread idea. How must map work if, for instance, a map is done with a function which then begins to switch between threads before the map is done? Can one imagine a problem? Maybe it is no issue, but I'd really like to know whether we need a stateless map. (without replacing it by a for loop :-)

ciao - chris

-- Christian Tismer :^) Applied Biometrics GmbH : Have a break!
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home

From tim_one at email.msn.com Sat May 22 21:35:58 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Sat, 22 May 1999 15:35:58 -0400
Subject: [Python-Dev] Coroutines
In-Reply-To: <3746D94E.239D0B8E@appliedbiometrics.com>
Message-ID: <000501bea48a$51563980$119e2299@tim>

>> http://www.python.org/tim_one/000169.html

[Christian]
> What an interesting thread! Unfortunately, all the examples are messed
> up since some HTML formatter didn't take care of the python code,
> rendering it unreadable. Is there a different version available?
>
> Also, I'd like to read the rest of the threads in
> http://www.python.org/tim_one/ but it seems that only your messages
> are archived?

Yes, that link is the old Pythonic Award Shrine erected in my memory -- it's all me, all the time, no mercy, no escape . It predates the DejaNews archive, but the context can still be found in

    http://www.python.org/search/hypermail/python-1994q2/index.html

There's a lot in that quarter about continuations & coroutines, most from Steven Majewski, who took a serious shot at implementing all this. Don't have the code in a more usable form; when my then-employer died, most of my files went with it.

You can save the file as text, though! The structure of the code is intact, it's simply that your browser squashes out the spaces when displaying it. Nuke the
leading "&nbsp;" entities
at the start of each code line and what remains is very close to what was originally posted. > Anyway, the citations in http://www.python.org/tim_one/000146.html > show me that you have been through all of this five years > ago, with a five years younger Guido which sounds a bit > different than today. > I had understood him better if I had known that this > is a re-iteration of a somehow dropped or entombed idea. You *used* to know that ! Thought you even got StevenM's old code from him a year or so ago. He went most of the way, up until hitting the C<->Python stack intertwingling barrier, and then dropped it. Plus Guido wrote generator.py to shut me up, which works, but is about 3x clumsier to use and runs about 50x slower than a generator should . > ... > Stackless Python is meanwhile nearly alive, with recursion > avoided in ceval. Of course, some modules are left which > still need work, but enough for a prototype. Frames contain > now all necessry state and are now prepared for execution > and thrown back to the evaluator (elevator?). > ... Excellent! Running off to a movie & dinner now, but will give a more careful reading tonight. co-dependent-ly y'rs - tim From tismer at appliedbiometrics.com Sun May 23 15:07:44 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Sun, 23 May 1999 15:07:44 +0200 Subject: [Python-Dev] How stackless can Python be? References: <3746FFCB.CD506BE4@appliedbiometrics.com> Message-ID: <3747FDA0.AD3E7095@appliedbiometrics.com> After a good sleep, I can answer this one by myself. I wrote: > to make the core interpreter stackless is one thing. ... > Internals like apply are rather uncomplicated to convert. > CallObjectWithKeywords is done. > > What I have *no* good solution for is map. > Map does an iteration over evaluations and keeps > state while it is running. The same applies to reduce, > but it seems to be not used so much. Map is. ... 
About stackless map: this applies to every extension module which *wants* to be stackless. We don't have to force everybody to be stackless, but there is a couple of modules which would benefit from it.

The problem with map is that it needs to keep state while repeatedly calling objects which might call the interpreter. Even if we kept local variables in the caller's frame, this would still not be stateless. The info that a map is running is sitting on the hardware stack, and that's wrong.

Now a solution. In my last post, I argued that I don't want to replace map by a slower Python function. But that gave me the key idea to solve this: C functions which cannot be tail-recursively unwound to return an executable frame object must instead return themselves as a frame object. That's it!

Frames again need to be a little extended. They have to name their interpreter, which normally is the old eval_code loop.

Anatomy of a standard frame invocation: A new frame is created, parameters are inserted, and the frame is returned to the frame dispatcher, which runs the inner eval_code loop until it bails out. On return, special cases of control flow are handled, as there are exception, returning, and now also calling. This is an eval_code frame, since eval_code is its execution handler.

Anatomy of a map frame invocation: Map has several phases. The first phase does argument checking and basic setup. The last phase is iteration over function calls and building the result. This phase must be split off as a second function, eval_map. A new frame is created, with all temporary variables placed there. eval_map is inserted as the execution handler.

Now, I think the analogy is obvious. By building proper frames, it should be possible to turn any extension function into a stackless function. The overall protocol is:

A C function which does a simple computation which cannot cause an interpreter invocation may simply evaluate and return a value.
A C function which might cause an interpreter invocation should return a freshly created frame as its return value.

- This can be done in a tail-recursive fashion if the last action of the C function would basically be calling the frame.
- If no tail recursion is possible, the function must return a new frame for itself, with an executor for its purpose.

A good stackless candidate is Fredrik's xmlop, which calls back into the interpreter. If that worked without the hardware stack, then we could build ultra-fast XML processors with co-routines!

As a side note: The frame structure which I sketched so far is still made for eval_code in the first place, but it has all necessary flexibility for pluggable interpreters. An extension module can now create its own frame, with its own execution handler, and throw it back to the frame dispatcher. In other words: people can create extensions and test their own VMs if they want. This was not my primary intent, but it comes for free as a consequence of having a stackless map.

ciao - chris

-- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home

From fredrik at pythonware.com Sun May 23 15:53:19 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sun, 23 May 1999 15:53:19 +0200
Subject: [Python-Dev] Coroutines
References: <000401bea40a$c1d2d2c0$659e2299@tim> <3746D94E.239D0B8E@appliedbiometrics.com>
Message-ID: <031e01bea524$8db41e70$f29b12c2@pythonware.com>

Christian Tismer wrote:
> (If someone has the original archives from that epoche,
> I'd be happy to get a copy. Actually, I'm missing all upto
> end of 1996.)

http://www.egroups.com/group/python-list/info.html has it all (almost), starting in 1991.
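[Editorial aside: Christian's protocol -- frames that name their own execution handler, a flat frame dispatcher, and a stackless map as a frame whose handler is eval_map -- can be caricatured in Python itself. Every name below is invented for illustration; the real work described here was C-level:]

```python
class Frame:
    # Back chain + execution handler + private state; the handler is
    # this frame's "interpreter" (normally eval_code, here eval_map).
    def __init__(self, handler, state, back=None):
        self.handler = handler
        self.state = state
        self.back = back
        self.result = None

def dispatch(frame):
    # The frame dispatcher: one flat loop, no C-level recursion.
    # A handler returns the next frame to run, or None when finished.
    top = frame
    while frame is not None:
        frame = frame.handler(frame)
    return top.result

def eval_map(frame):
    # Execution handler for a stackless map: one step per dispatch,
    # with func/seq/index/output kept in the frame, not on the stack.
    func, seq, i, out = frame.state
    if i == len(seq):
        frame.result = out
        return frame.back          # unwind to the caller (or finish)
    out.append(func(seq[i]))
    frame.state = (func, seq, i + 1, out)
    return frame                   # resume this same frame next time

def stackless_map(func, seq):
    return dispatch(Frame(eval_map, (func, list(seq), 0, [])))

print(stackless_map(lambda x: x * x, [1, 2, 3]))  # [1, 4, 9]
```

Because map's entire state lives in the frame, the dispatcher is free to interleave other frames between steps -- which is exactly what the mini-thread idea needs.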
I'm not sure whether any of the proposals on the table really do what's needed for e.g. case-insensitive namespace handling. I can see how all of the proposals so far allow case-insensitive reference name handling in the global namespace, but don't we also need to hook into the local-namespace creation process to allow case-insensitivity to work throughout? --david From da at ski.org Sun May 2 17:15:57 1999 From: da at ski.org (David Ascher) Date: Sun, 2 May 1999 08:15:57 -0700 (Pacific Daylight Time) Subject: [Python-Dev] More flexible namespaces. In-Reply-To: <00bc01be942a$47d94070$0801a8c0@bobcat> Message-ID: On Sun, 2 May 1999, Mark Hammond wrote: > > I'm not sure whether any of the > > proposals on > > the table really do what's needed for e.g. case-insensitive namespace > > handling. I can see how all of the proposals so far allow > > case-insensitive reference name handling in the global namespace, but > > don't we also need to hook into the local-namespace creation > > process to > > allow case-insensitivity to work throughout? > > Why not? I pictured case insensitive namespaces working so that they > retain the case of the first assignment, but all lookups would be > case-insensitive. > > Ohh - right! Python itself would need changing to support this. I suppose > that faced with code such as: > > def func(): > if spam: > Spam=1 > > Python would generate code that refers to "spam" as a local, and "Spam" as > a global. > > Is this why you feel it wont work? I hadn't thought of that, to be truthful, but I think it's more generic. [FWIW, I never much cared for the tag-variables-at-compile-time optimization in CPython, and wouldn't miss it if were lost.] The point is that if I eval or exec code which calls a function specifying some strange mapping as the namespaces (global and current-local) I presumably want to also specify how local namespaces work for the function calls within that code snippet. 
That means that somehow Python has to know what kind of namespace to use for local environments, and not use the standard dictionary. Maybe we can simply have it use a '.clear()'ed .__copy__ of the specified environment.

    exec 'foo()' in globals(), mylocals

would then call foo, and within foo, the local env't would be mylocals.__copy__.clear(). Anyway, something for those-with-the-patches to keep in mind.

--david

From tismer at appliedbiometrics.com Sun May 2 15:00:37 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Sun, 02 May 1999 15:00:37 +0200
Subject: [Python-Dev] More flexible namespaces.
References:
Message-ID: <372C4C75.5B7CCAC8@appliedbiometrics.com>

David Ascher wrote:
[Marc:>
> > Since you put out the objectives, I'd like to propose a little
> > different approach...
> >
> > 1. Have eval/exec accept any mapping object as input
> >
> > 2. Make those two copy the content of the mapping object into real
> > dictionaries
> >
> > 3. Provide a hook into the dictionary implementation that can be
> > used to redirect KeyErrors and use that redirection to forward
> > the request to the original mapping objects

I don't think that this proposal would give much new value. Since a mapping can also be implemented in arbitrary ways, say by functions, a mapping is not necessarily finite and might not be changeable into a dict.

[David:>
> Interesting counterproposal. I'm not sure whether any of the proposals on
> the table really do what's needed for e.g. case-insensitive namespace
> handling. I can see how all of the proposals so far allow
> case-insensitive reference name handling in the global namespace, but
> don't we also need to hook into the local-namespace creation process to
> allow case-insensitivity to work throughout?

Case-independent namespaces seem to be a minor point, nice to have for interfacing to other products, but then, in a function, I see no benefit in changing the semantics of function locals?
The lookup of foreign symbols would always be through a mapping object. If you take COM, for instance, your access to a COM wrapper for an arbitrary object would be through properties of this object. After assignment to a local function variable, why should we support case-insensitivity at all?

I would think mapping objects would be a great simplification of lazy imports in COM, where we would like to avoid importing really huge namespaces in one big slurp. Also, the wrapper code could be made quite a lot easier and faster without so much getattr/setattr trapping.

Does btw. anybody really want to see case-insensitivity in Python programs? I'm quite happy with it as it is, and I would even force the user to always use the same case style after he has touched an external property once. Example for Excel: You may write "xl.workbooks" in lowercase, but then you have to stay with it. This would keep Python source clean for, say, PyLint.

my 0.02 Euro - chris

-- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home

From MHammond at skippinet.com.au Sun May 2 01:28:11 1999
From: MHammond at skippinet.com.au (Mark Hammond)
Date: Sun, 2 May 1999 09:28:11 +1000
Subject: [Python-Dev] More flexible namespaces.
In-Reply-To:
Message-ID: <00bc01be942a$47d94070$0801a8c0@bobcat>

> I'm not sure whether any of the
> proposals on
> the table really do what's needed for e.g. case-insensitive namespace
> handling. I can see how all of the proposals so far allow
> case-insensitive reference name handling in the global namespace, but
> don't we also need to hook into the local-namespace creation
> process to
> allow case-insensitivity to work throughout?

Why not?
I pictured case insensitive namespaces working so that they retain the case of the first assignment, but all lookups would be case-insensitive.

Ohh - right! Python itself would need changing to support this. I suppose that faced with code such as:

def func():
    if spam:
        Spam = 1

Python would generate code that refers to "spam" as a local, and "Spam" as a global.

Is this why you feel it won't work?

Mark.

From mal at lemburg.com Sun May 2 21:24:54 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sun, 02 May 1999 21:24:54 +0200
Subject: [Python-Dev] More flexible namespaces.
References: <372C4C75.5B7CCAC8@appliedbiometrics.com>
Message-ID: <372CA686.215D71DF@lemburg.com>

Christian Tismer wrote:
>
> David Ascher wrote:
> [Marc:>
> > > Since you put out the objectives, I'd like to propose a little
> > > different approach...
> > >
> > > 1. Have eval/exec accept any mapping object as input
> > >
> > > 2. Make those two copy the content of the mapping object into real
> > > dictionaries
> > >
> > > 3. Provide a hook into the dictionary implementation that can be
> > > used to redirect KeyErrors and use that redirection to forward
> > > the request to the original mapping objects
>
> I don't think that this proposal would give so much new
> value. Since a mapping can also be implemented in arbitrary
> ways, say by functions, a mapping is not necessarily finite
> and might not be changeable into a dict.

[Disclaimer: I'm not really keen on having the possibility of letting code execute in arbitrary namespace objects... it would make code optimizations even less manageable.]

You can easily support infinite mappings by wrapping the function into an object which returns an empty list for .items() and then use the hook mentioned in 3 to redirect the lookup to that function. The proposal allows one to use such a proxy to simulate any kind of mapping -- it works much like the __getattr__ hook provided for instances.

> [David:>
> > Interesting counterproposal.
I'm not sure whether any of the proposals on > > the table really do what's needed for e.g. case-insensitive namespace > > handling. I can see how all of the proposals so far allow > > case-insensitive reference name handling in the global namespace, but > > don't we also need to hook into the local-namespace creation process to > > allow case-insensitivity to work throughout? > > Case-independant namespaces seem to be a minor point, > nice to have for interfacing to other products, but then, > in a function, I see no benefit in changing the semantics > of function locals? The lookup of foreign symbols would > always be through a mapping object. If you take COM for > instance, your access to a COM wrapper for an arbitrary > object would be through properties of this object. After > assignment to a local function variable, why should we > support case-insensitivity at all? > > I would think mapping objects would be a great > simplification of lazy imports in COM, where > we would like to avoid to import really huge > namespaces in one big slurp. Also the wrapper code > could be made quite a lot easier and faster without > so much getattr/setattr trapping. What do lazy imports have to do with case [in]sensitive namespaces ? Anyway, how about a simple lazy import mechanism in the standard distribution, i.e. why not make all imports lazy ? Since modules are first class objects this should be easy to implement... > Does btw. anybody really want to see case-insensitivity > in Python programs? I'm quite happy with it as it is, > and I would even force the use to always use the same > case style after he has touched an external property > once. Example for Excel: You may write "xl.workbooks" > in lowercase, but then you have to stay with it. > This would keep Python source clean for, say, PyLint. 
"No" and "me too" ;-) -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: Y2000: 243 days left Business: http://www.lemburg.com/ Python Pages: http://starship.python.net/crew/lemburg/ From MHammond at skippinet.com.au Mon May 3 02:52:41 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Mon, 3 May 1999 10:52:41 +1000 Subject: [Python-Dev] More flexible namespaces. In-Reply-To: <372CA686.215D71DF@lemburg.com> Message-ID: <000e01be94ff$4047ef20$0801a8c0@bobcat> [Marc] > [Disclaimer: I'm not really keen on having the possibility of > letting code execute in arbitrary namespace objects... it would > make code optimizations even less manageable.] Good point - although surely that would simply mean (certain) optimisations can't be performed for code executing in that environment? How to detect this at "optimization time" may be a little difficult :-) However, this is the primary purpose of this thread - to workout _if_ it is a good idea, as much as working out _how_ to do it :-) > The proposal allows one to use such a proxy to simulate any > kind of mapping -- it works much like the __getattr__ hook > provided for instances. My only problem with Marc's proposal is that there already _is_ an established mapping protocol, and this doesnt use it; instead it invents a new one with the benefit being potentially less code breakage. And without attempting to sound flippant, I wonder how many extension modules will be affected? Module init code certainly assumes the module __dict__ is a dictionary, but none of my code assumes anything about other namespaces. Marc's extensions may be a special case, as AFAIK they inject objects into other dictionaries (ie, new builtins?). Again, not trying to downplay this too much, but if it is only a problem for Marc's more esoteric extensions, I dont feel that should hold up an otherwise solid proposal. [Chris, I think?] 
> > Case-independant namespaces seem to be a minor point, > > nice to have for interfacing to other products, but then, > > in a function, I see no benefit in changing the semantics > > of function locals? The lookup of foreign symbols would I disagree here. Consider Alice, and similar projects, where a (arguably misplaced, but nonetheless) requirement is that the embedded language be case-insensitive. Period. The Alice people are somewhat special in that they had the resources to change the interpreters guts. Most people wont, and will look for a different language to embedd. Of course, I agree with you for the specific cases you are talking - COM, Active Scripting etc. Indeed, everything I would use this for would prefer to keep the local function semantics identical. > > Does btw. anybody really want to see case-insensitivity > > in Python programs? I'm quite happy with it as it is, > > and I would even force the use to always use the same > > case style after he has touched an external property > > once. Example for Excel: You may write "xl.workbooks" > > in lowercase, but then you have to stay with it. > > This would keep Python source clean for, say, PyLint. > > "No" and "me too" ;-) I think we are missing the point a little. If we focus on COM, we may come up with a different answer. Indeed, if we are to focus on COM integration with Python, there are other areas I would prefer to start with :-) IMO, we should attempt to come up with a more flexible namespace mechanism that is in the style of Python, and will not noticeably slowdown Python. Then COM etc can take advantage of it - much in the same way that Python's existing namespace model existed pre-COM, and COM had to take advantage of what it could! Of course, a key indicator of the likely success is how well COM _can_ take advantage of it, and how much Alice could have taken advantage of it - I cant think of any other yardsticks? Mark. 
From mal at lemburg.com Mon May 3 09:56:53 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Mon, 03 May 1999 09:56:53 +0200 Subject: [Python-Dev] More flexible namespaces. References: <000e01be94ff$4047ef20$0801a8c0@bobcat> Message-ID: <372D56C5.4738DE3D@lemburg.com> Mark Hammond wrote: > > [Marc] > > [Disclaimer: I'm not really keen on having the possibility of > > letting code execute in arbitrary namespace objects... it would > > make code optimizations even less manageable.] > > Good point - although surely that would simply mean (certain) optimisations > can't be performed for code executing in that environment? How to detect > this at "optimization time" may be a little difficult :-) > > However, this is the primary purpose of this thread - to work out _if_ it is > a good idea, as much as working out _how_ to do it :-) > > > The proposal allows one to use such a proxy to simulate any > > kind of mapping -- it works much like the __getattr__ hook > > provided for instances. > > My only problem with Marc's proposal is that there already _is_ an > established mapping protocol, and this doesn't use it; instead it invents a > new one with the benefit being potentially less code breakage. ...and that's the key point: you get the intended features and the core code will not have to be changed in significant ways. Basically, I think these kinds of core extensions should be done in generic ways, e.g. by letting the eval/exec machinery accept subclasses of dictionaries, rather than trying to raise the abstraction level used and slow things down in general just to be able to use the feature on very few occasions. > And without attempting to sound flippant, I wonder how many extension > modules will be affected? Module init code certainly assumes the module > __dict__ is a dictionary, but none of my code assumes anything about other > namespaces. Marc's extensions may be a special case, as AFAIK they inject > objects into other dictionaries (ie, new builtins?).
> Again, not trying to > downplay this too much, but if it is only a problem for Marc's more > esoteric extensions, I don't feel that should hold up an otherwise solid > proposal. My mxTools extension does the assignment in Python, so it wouldn't be affected. The others only do the usual modinit() stuff. Before going any further on this thread we may have to ponder a little more on the objectives that we have. If it's only case-insensitive lookups then I guess a simple compile-time switch exchanging the implementations of string hash and compare functions would do the trick. If we're after doing wild things like lookups across networks, then a more specific approach is needed. So what is it that we want in 1.6 ? > [Chris, I think?] > > > Case-independent namespaces seem to be a minor point, > > > nice to have for interfacing to other products, but then, > > > in a function, I see no benefit in changing the semantics > > > of function locals? The lookup of foreign symbols would > > I disagree here. Consider Alice, and similar projects, where a (arguably > misplaced, but nonetheless) requirement is that the embedded language be > case-insensitive. Period. The Alice people are somewhat special in that > they had the resources to change the interpreter's guts. Most people won't, > and will look for a different language to embed. > > Of course, I agree with you for the specific cases you are talking - COM, > Active Scripting etc. Indeed, everything I would use this for would prefer > to keep the local function semantics identical. As I understand the needs in COM and AS you are talking about object attributes, right ? Making these case-insensitive is a job for a proxy or a __getattr__ hack. > > > Does btw. anybody really want to see case-insensitivity > > > in Python programs? I'm quite happy with it as it is, > > > and I would even force the user to always use the same > > > case style after he has touched an external property > > > once.
Example for Excel: You may write "xl.workbooks" > > > in lowercase, but then you have to stay with it. > > > This would keep Python source clean for, say, PyLint. > > > > "No" and "me too" ;-) > > I think we are missing the point a little. If we focus on COM, we may come > up with a different answer. Indeed, if we are to focus on COM integration > with Python, there are other areas I would prefer to start with :-) > > IMO, we should attempt to come up with a more flexible namespace mechanism > that is in the style of Python, and will not noticeably slow down Python. > Then COM etc can take advantage of it - much in the same way that Python's > existing namespace model existed pre-COM, and COM had to take advantage of > what it could! > > Of course, a key indicator of the likely success is how well COM _can_ take > advantage of it, and how much Alice could have taken advantage of it - I > can't think of any other yardsticks? -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 242 days left Business: http://www.lemburg.com/ Python Pages: http://starship.python.net/crew/lemburg/ From fredrik at pythonware.com Mon May 3 16:01:10 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 3 May 1999 16:01:10 +0200 Subject: [Python-Dev] Why Foo is better than Baz References: <000e01be94ff$4047ef20$0801a8c0@bobcat> Message-ID: <005b01be956d$66d48450$f29b12c2@pythonware.com> scriptics is positioning tcl as a perl killer: http://www.scriptics.com/scripting/perl.html afaict, unicode and event handling are the two main thingies missing from python 1.5. -- unicode: is on its way. -- event handling: asynclib/asynchat provides an awesome framework for event-driven socket programming. however, Python still lacks good cross-platform support for event-driven access to files and pipes. are threads good enough, or would it be cool to have something similar to Tcl's fileevent stuff in Python?
-- regexps: has anyone compared the new unicode-aware regexp package in Tcl with pcre? comments? btw, the rebol folks have reached 2.0: http://www.rebol.com/ maybe 1.6 should be renamed to Python 6.0? From akuchlin at cnri.reston.va.us Mon May 3 17:14:15 1999 From: akuchlin at cnri.reston.va.us (Andrew M. Kuchling) Date: Mon, 3 May 1999 11:14:15 -0400 (EDT) Subject: [Python-Dev] Why Foo is better than Baz In-Reply-To: <005b01be956d$66d48450$f29b12c2@pythonware.com> References: <000e01be94ff$4047ef20$0801a8c0@bobcat> <005b01be956d$66d48450$f29b12c2@pythonware.com> Message-ID: <14125.47524.196878.583460@amarok.cnri.reston.va.us> Fredrik Lundh writes: >-- regexps: has anyone compared the new unicode-aware >regexp package in Tcl with pcre? I looked at it a bit when Tcl 8.1 was in beta; it derives from Henry Spencer's 1998-vintage code, which seems to try to do a lot of optimization and analysis. It may even compile DFAs instead of NFAs when possible, though it's hard for me to be sure. This might give it a substantial speed advantage over engines that do less analysis, but I haven't benchmarked it. The code is easy to read, but difficult to understand because the theory underlying the analysis isn't explained in the comments; one feels there should be an accompanying paper to explain how everything works, and it's why I'm not sure if it really is producing DFAs for some expressions. Tcl seems to represent everything as UTF-8 internally, so there's only one regex engine; there's .
The code is scattered over more files:

amarok generic>ls re*.[ch]
regc_color.c   regc_locale.c  regcustom.h  regerrs.h   regfree.c
regc_cvec.c    regc_nfa.c     rege_dfa.c   regex.h     regfronts.c
regc_lex.c     regcomp.c      regerror.c   regexec.c   regguts.h
amarok generic>wc -l re*.[ch]
     742 regc_color.c
     170 regc_cvec.c
    1010 regc_lex.c
     781 regc_locale.c
    1528 regc_nfa.c
    2124 regcomp.c
      85 regcustom.h
     627 rege_dfa.c
      82 regerror.c
      18 regerrs.h
     308 regex.h
     952 regexec.c
      25 regfree.c
      56 regfronts.c
     388 regguts.h
    8896 total
amarok generic>

This would be an issue for using it with Python, since all these files would wind up scattered around the Modules directory. For comparison, pypcre.c is around 4700 lines of code. -- A.M. Kuchling http://starship.python.net/crew/amk/ Things need not have happened to be true. Tales and dreams are the shadow-truths that will endure when mere facts are dust and ashes, and forgot. -- Neil Gaiman, _Sandman_ #19: _A Midsummer Night's Dream_ From guido at CNRI.Reston.VA.US Mon May 3 17:32:09 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 03 May 1999 11:32:09 -0400 Subject: [Python-Dev] Why Foo is better than Baz In-Reply-To: Your message of "Mon, 03 May 1999 11:14:15 EDT." <14125.47524.196878.583460@amarok.cnri.reston.va.us> References: <000e01be94ff$4047ef20$0801a8c0@bobcat> <005b01be956d$66d48450$f29b12c2@pythonware.com> <14125.47524.196878.583460@amarok.cnri.reston.va.us> Message-ID: <199905031532.LAA05617@eric.cnri.reston.va.us> > I looked at it a bit when Tcl 8.1 was in beta; it derives from > Henry Spencer's 1998-vintage code, which seems to try to do a lot of > optimization and analysis. It may even compile DFAs instead of NFAs > when possible, though it's hard for me to be sure. This might give it > a substantial speed advantage over engines that do less analysis, but > I haven't benchmarked it.
> The code is easy to read, but difficult to
> understand because the theory underlying the analysis isn't explained
> in the comments; one feels there should be an accompanying paper to
> explain how everything works, and it's why I'm not sure if it really
> is producing DFAs for some expressions.
>
> Tcl seems to represent everything as UTF-8 internally, so
> there's only one regex engine; there's .

Hmm... I looked when Tcl 8.1 was in alpha, and I *think* that at that point the regex engine was compiled twice, once for 8-bit chars and once for 16-bit chars. But this may have changed. I've noticed that Perl is taking the same position (everything is UTF-8 internally). On the other hand, Java distinguishes 16-bit chars from 8-bit bytes. Python is currently in the Java camp. This might be a good time to make sure that we're still convinced that this is the right thing to do!

> The code is scattered over more files:
>
> amarok generic>ls re*.[ch]
> regc_color.c   regc_locale.c  regcustom.h  regerrs.h   regfree.c
> regc_cvec.c    regc_nfa.c     rege_dfa.c   regex.h     regfronts.c
> regc_lex.c     regcomp.c      regerror.c   regexec.c   regguts.h
> amarok generic>wc -l re*.[ch]
>      742 regc_color.c
>      170 regc_cvec.c
>     1010 regc_lex.c
>      781 regc_locale.c
>     1528 regc_nfa.c
>     2124 regcomp.c
>       85 regcustom.h
>      627 rege_dfa.c
>       82 regerror.c
>       18 regerrs.h
>      308 regex.h
>      952 regexec.c
>       25 regfree.c
>       56 regfronts.c
>      388 regguts.h
>     8896 total
> amarok generic>
>
> This would be an issue for using it with Python, since all
> these files would wind up scattered around the Modules directory. For
> comparison, pypcre.c is around 4700 lines of code.

I'm sure that if it's good code, we'll find a way. Perhaps a more interesting question is whether it is Perl5 compatible. I contacted Henry Spencer at the time and he was willing to let us use his code.
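The compatibility question can be made concrete. Python's re module (like Perl) uses leftmost-first alternation: the first alternative that lets the overall match succeed wins, while a POSIX longest-match engine like the one under discussion reports the longest overall match instead. A quick check, using the weeknights example that comes up later in this thread:

```python
import re

# Perl/Python semantics: 'week' is tried before 'wee' and succeeds,
# so the engine settles for 'week' + 'night' -- nine of the ten chars.
m = re.match(r"(week|wee)(night|knights)", "weeknights")
print(m.groups())  # ('week', 'night')
```

A POSIX-longest engine would match all ten characters, pairing 'wee' with 'knights', which is exactly the semantic difference a drop-in replacement would have to paper over.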
--Guido van Rossum (home page: http://www.python.org/~guido/) From akuchlin at cnri.reston.va.us Mon May 3 17:56:46 1999 From: akuchlin at cnri.reston.va.us (Andrew M. Kuchling) Date: Mon, 3 May 1999 11:56:46 -0400 (EDT) Subject: [Python-Dev] Why Foo is better than Baz In-Reply-To: <199905031532.LAA05617@eric.cnri.reston.va.us> References: <000e01be94ff$4047ef20$0801a8c0@bobcat> <005b01be956d$66d48450$f29b12c2@pythonware.com> <14125.47524.196878.583460@amarok.cnri.reston.va.us> <199905031532.LAA05617@eric.cnri.reston.va.us> Message-ID: <14125.49911.982236.754340@amarok.cnri.reston.va.us> Guido van Rossum writes: >Hmm... I looked when Tcl 8.1 was in alpha, and I *think* that at that >point the regex engine was compiled twice, once for 8-bit chars and >once for 16-bit chars. But this may have changed. It doesn't seem to currently; the code in tclRegexp.c looks like this:

    /* Remember the UTF-8 string so Tcl_RegExpRange() can convert the
     * matches from character to byte offsets. */
    regexpPtr->string = string;

    Tcl_DStringInit(&stringBuffer);
    uniString = Tcl_UtfToUniCharDString(string, -1, &stringBuffer);
    numChars = Tcl_DStringLength(&stringBuffer) / sizeof(Tcl_UniChar);

    /* Perform the regexp match. */
    result = TclRegExpExecUniChar(interp, re, uniString, numChars, -1,
            ((string > start) ? REG_NOTBOL : 0));

ISTR the Spencer engine does, however, define a small and large representation for NFAs and have two versions of the engine, one for each representation. Perhaps that's what you're thinking of. >I've noticed that Perl is taking the same position (everything is >UTF-8 internally). On the other hand, Java distinguishes 16-bit chars >from 8-bit bytes. Python is currently in the Java camp. This might >be a good time to make sure that we're still convinced that this is >the right thing to do! I don't know.
There's certainly the fundamental dichotomy that strings are sometimes used to represent characters, where changing encodings on input and output is reasonable, and sometimes used to hold chunks of binary data, where any changes are incorrect. Perhaps Paul Prescod is right, and we should try to get some other data type (array.array()) for holding binary data, as distinct from strings. >I'm sure that if it's good code, we'll find a way. Perhaps a more >interesting question is whether it is Perl5 compatible. I contacted >Henry Spencer at the time and he was willing to let us use his code. Mostly Perl-compatible, though it doesn't look like the 5.005 features are there, and I haven't checked for every single 5.004 feature. Adding missing features might be problematic, because I don't really understand what the code is doing at a high level. Also, is there a user community for this code? Do any other projects use it? Philip Hazel has been quite helpful with PCRE, an important thing when making modifications to the code. Should I make a point of looking at what using the Spencer engine would entail? It might not be too difficult (an evening or two, maybe?) to write a re.py that sat on top of the Spencer code; that would at least let us do some benchmarking. -- A.M. Kuchling http://starship.python.net/crew/amk/ In Einstein's theory of relativity the observer is a man who sets out in quest of truth armed with a measuring-rod. In quantum theory he sets out with a sieve. -- Sir Arthur Eddington From guido at CNRI.Reston.VA.US Mon May 3 18:02:22 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 03 May 1999 12:02:22 -0400 Subject: [Python-Dev] Why Foo is better than Baz In-Reply-To: Your message of "Mon, 03 May 1999 11:56:46 EDT."
<14125.49911.982236.754340@amarok.cnri.reston.va.us> References: <000e01be94ff$4047ef20$0801a8c0@bobcat> <005b01be956d$66d48450$f29b12c2@pythonware.com> <14125.47524.196878.583460@amarok.cnri.reston.va.us> <199905031532.LAA05617@eric.cnri.reston.va.us> <14125.49911.982236.754340@amarok.cnri.reston.va.us> Message-ID: <199905031602.MAA05829@eric.cnri.reston.va.us> > Should I make a point of looking at what using the Spencer > engine would entail? It might not be too difficult (an evening or > two, maybe?) to write a re.py that sat on top of the Spencer code; > that would at least let us do some benchmarking. Surely this would be more helpful than weeks of speculative emails -- go for it! --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik at pythonware.com Mon May 3 19:10:55 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 3 May 1999 19:10:55 +0200 Subject: [Python-Dev] Why Foo is better than Baz References: <000e01be94ff$4047ef20$0801a8c0@bobcat><005b01be956d$66d48450$f29b12c2@pythonware.com><14125.47524.196878.583460@amarok.cnri.reston.va.us><199905031532.LAA05617@eric.cnri.reston.va.us> <14125.49911.982236.754340@amarok.cnri.reston.va.us> Message-ID: <005801be9588$7ad0fcc0$f29b12c2@pythonware.com> > Also, is there a user community for this code?
how about comp.lang.tcl ;-) From fredrik at pythonware.com Mon May 3 19:15:00 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 3 May 1999 19:15:00 +0200 Subject: [Python-Dev] Why Foo is better than Baz References: <000e01be94ff$4047ef20$0801a8c0@bobcat> <005b01be956d$66d48450$f29b12c2@pythonware.com> <14125.47524.196878.583460@amarok.cnri.reston.va.us> <199905031532.LAA05617@eric.cnri.reston.va.us> <14125.49911.982236.754340@amarok.cnri.reston.va.us> <199905031602.MAA05829@eric.cnri.reston.va.us> Message-ID: <005901be9588$7af59bc0$f29b12c2@pythonware.com> talking about regexps, here's another thing that would be quite nice to have in 1.6 (available from the Python level, that is). or is it already in there somewhere? ... http://www.dejanews.com/[ST_rn=qs]/getdoc.xp?AN=464362873 Tcl 8.1b3 Request: Generated by Scriptics' bug entry form at Submitted by: Frederic BONNET OperatingSystem: Windows 98 CustomShell: Applied patch to the regexp engine (the exec part) Synopsis: regexp improvements DesiredBehavior: As previously requested by Don Libes: > I see no way for Tcl_RegExpExec to indicate "could match" meaning > "could match if more characters arrive that were suitable for a > match". This is required for a class of applications involving > matching on a stream required by Expect's interact command. Henry > assured me that this facility would be in the engine (I'm not the only > one that needs it). Note that it is not sufficient to add one more > return value to Tcl_RegExpExec (i.e., 2) because one needs to know > both if something matches now and can match later. I recommend > another argument (canMatch *int) be added to Tcl_RegExpExec. /patch info follows/ ... From bwarsaw at cnri.reston.va.us Tue May 4 00:28:23 1999 From: bwarsaw at cnri.reston.va.us (Barry A. 
Warsaw) Date: Mon, 3 May 1999 18:28:23 -0400 (EDT) Subject: [Python-Dev] New mailing list: python-bugs-list Message-ID: <14126.8967.793734.892670@anthem.cnri.reston.va.us> I've been using Jitterbug for a couple of weeks now as my bug database for Mailman and JPython. So it was easy enough for me to set up a database for Python bug reports. Guido is in the process of tailoring the Jitterbug web interface to his liking and will announce it to the appropriate forums when he's ready. In the meantime, I've created YAML that you might be interested in. All bug reports entered into Jitterbug will be forwarded to python-bugs-list at python.org. You are invited to subscribe to the list by visiting http://www.python.org/mailman/listinfo/python-bugs-list Enjoy, -Barry From jeremy at cnri.reston.va.us Tue May 4 00:30:10 1999 From: jeremy at cnri.reston.va.us (Jeremy Hylton) Date: Mon, 3 May 1999 18:30:10 -0400 (EDT) Subject: [Python-Dev] New mailing list: python-bugs-list In-Reply-To: <14126.8967.793734.892670@anthem.cnri.reston.va.us> References: <14126.8967.793734.892670@anthem.cnri.reston.va.us> Message-ID: <14126.9061.558631.437892@bitdiddle.cnri.reston.va.us> Pretty low volume list, eh? From MHammond at skippinet.com.au Tue May 4 01:28:39 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Tue, 4 May 1999 09:28:39 +1000 Subject: [Python-Dev] New mailing list: python-bugs-list In-Reply-To: <14126.9061.558631.437892@bitdiddle.cnri.reston.va.us> Message-ID: <000701be95bc$ad0b45e0$0801a8c0@bobcat> ha - we wish. More likely to be full of detailed bug reports about how 1/2 != 0.5, or that "def foo(baz=[])" is buggy, etc :-) Mark. > Pretty low volume list, eh? 
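Mark's quip about "def foo(baz=[])" being reported as buggy refers to a perennial non-bug: default argument values are evaluated once, when the def statement executes, so a mutable default is shared across all calls. A minimal illustration:

```python
def foo(baz=[]):          # the one default list is created at 'def' time
    baz.append(1)
    return baz

print(foo())  # [1]
print(foo())  # [1, 1] -- the same list again; surprising, but by design
```

The usual remedy is a `baz=None` default with `if baz is None: baz = []` inside the function body.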
From tim_one at email.msn.com Tue May 4 07:16:17 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 4 May 1999 01:16:17 -0400 Subject: [Python-Dev] Why Foo is better than Baz In-Reply-To: <199905031532.LAA05617@eric.cnri.reston.va.us> Message-ID: <000701be95ed$3d594180$dca22299@tim> [Guido & Andrew on Tcl's new regexp code] > I'm sure that if it's good code, we'll find a way. Perhaps a more > interesting question is whether it is Perl5 compatible. I contacted > Henry Spencer at the time and he was willing to let us use his code. Haven't looked at the code, but did read the manpage just now: http://www.scriptics.com/man/tcl8.1/TclCmd/regexp.htm WRT Perl5 compatibility, it sez: Incompatibilities of note include `\b', `\B', the lack of special treatment for a trailing newline, the addition of complemented bracket expressions to the things affected by newline-sensitive matching, the restrictions on parentheses and back references in lookahead constraints, and the longest/shortest-match (rather than first-match) matching semantics. So some gratuitous differences, and maybe a killer: Guido hasn't had much kind to say about "longest" (aka POSIX) matching semantics. An example from the page: (week|wee)(night|knights) matches all ten characters of `weeknights' which means it matched 'wee' and 'knights'; Python/Perl match 'week' and 'night'. It's the *natural* semantics if Andrew's suspicion that it's compiling a DFA is correct; indeed, it's a pain to get that behavior any other way! otoh-it's-potentially-very-much-faster-ly y'rs - tim From tim_one at email.msn.com Tue May 4 07:51:01 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 4 May 1999 01:51:01 -0400 Subject: [Python-Dev] Why Foo is better than Baz In-Reply-To: <000701be95ed$3d594180$dca22299@tim> Message-ID: <000901be95f2$195556c0$dca22299@tim> [Tim] > ... > It's the *natural* semantics if Andrew's suspicion that it's > compiling a DFA is correct ... 
More from the man page: AREs report the longest/shortest match for the RE, rather than the first found in a specified search order. This may affect some RREs which were written in the expectation that the first match would be reported. (The careful crafting of RREs to optimize the search order for fast matching is obsolete (AREs examine all possible matches in parallel, and their performance is largely insensitive to their complexity) but cases where the search order was exploited to deliberately find a match which was not the longest/shortest will need rewriting.) Nails it, yes? Now, in 10 seconds, try to remember a regexp where this really matters . Note in passing that IDLE's colorizer regexp *needs* to search for triple-quoted strings before single-quoted ones, else the P/P semantics would consider """ to be an empty single-quoted string followed by a double quote. This isn't a case where it matters in a bad way, though! The "longest" rule picks the correct alternative regardless of the order in which they're written. at-least-in-that-specific-regex<0.1-wink>-ly y'rs - tim From guido at CNRI.Reston.VA.US Tue May 4 14:26:04 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 04 May 1999 08:26:04 -0400 Subject: [Python-Dev] Why Foo is better than Baz In-Reply-To: Your message of "Tue, 04 May 1999 01:16:17 EDT." <000701be95ed$3d594180$dca22299@tim> References: <000701be95ed$3d594180$dca22299@tim> Message-ID: <199905041226.IAA07627@eric.cnri.reston.va.us> [Tim] > So some gratuitous differences, and maybe a killer: Guido hasn't had much > kind to say about "longest" (aka POSIX) matching semantics. > > An example from the page: > > (week|wee)(night|knights) > matches all ten characters of `weeknights' > > which means it matched 'wee' and 'knights'; Python/Perl match 'week' and > 'night'. > > It's the *natural* semantics if Andrew's suspicion that it's compiling a DFA > is correct; indeed, it's a pain to get that behavior any other way! 
Possibly contradicting what I once said about DFAs (I have no idea what I said any more :-): I think we shouldn't be hung up about the subtleties of DFA vs. NFA; for most people, the Perl-compatibility simply means that they can use the same metacharacters. My guess is that people don't so much translate long Perl regexp's to Python but simply transport their (always incomplete -- Larry Wall *wants* it that way :-) knowledge of Perl regexps to Python. My meta-guess is that this is also Henry Spencer's and John Ousterhout's guess. As for Larry Wall, I guess he really doesn't care :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From akuchlin at cnri.reston.va.us Tue May 4 18:14:41 1999 From: akuchlin at cnri.reston.va.us (Andrew M. Kuchling) Date: Tue, 4 May 1999 12:14:41 -0400 (EDT) Subject: [Python-Dev] Why Foo is better than Baz In-Reply-To: <199905041226.IAA07627@eric.cnri.reston.va.us> References: <000701be95ed$3d594180$dca22299@tim> <199905041226.IAA07627@eric.cnri.reston.va.us> Message-ID: <14127.6410.646122.342115@amarok.cnri.reston.va.us> Guido van Rossum writes: >Possibly contradicting what I once said about DFAs (I have no idea >what I said any more :-): I think we shouldn't be hung up about the >subtleties of DFA vs. NFA; for most people, the Perl-compatibility >simply means that they can use the same metacharacters. My guess is I don't like slipping in such a change to the semantics with no visible change to the module name or interface. On the other hand, if it's not NFA-based, then it can provide POSIX semantics without danger of taking exponential time to determine the longest match. BTW, there's an interesting reference, I assume to this code, in _Mastering Regular Expressions_; Spencer is quoted on page 121 as saying it's "at worst quadratic in text size.". Anyway, we can let it slide until a Python interface gets written. -- A.M.
Kuchling http://starship.python.net/crew/amk/ In the black shadow of the Baba Yaga babies screamed and mothers miscarried; milk soured and men went mad. -- In SANDMAN #38: "The Hunt" From guido at CNRI.Reston.VA.US Tue May 4 18:19:06 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 04 May 1999 12:19:06 -0400 Subject: [Python-Dev] Why Foo is better than Baz In-Reply-To: Your message of "Tue, 04 May 1999 12:14:41 EDT." <14127.6410.646122.342115@amarok.cnri.reston.va.us> References: <000701be95ed$3d594180$dca22299@tim> <199905041226.IAA07627@eric.cnri.reston.va.us> <14127.6410.646122.342115@amarok.cnri.reston.va.us> Message-ID: <199905041619.MAA08408@eric.cnri.reston.va.us> > BTW, there's an interesting reference, I assume to this code, in > _Mastering Regular Expressions_; Spencer is quoted on page 121 as > saying it's "at worst quadratic in text size.". Not sure if that was the same code -- this is *new* code, not Spencer's old code. I think Friedl's book is older than the current code. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim_one at email.msn.com Wed May 5 07:37:02 1999 From: tim_one at email.msn.com (Tim Peters) Date: Wed, 5 May 1999 01:37:02 -0400 Subject: [Python-Dev] Tcl 8.1's regexp code (was RE: [Python-Dev] Why Foo is better than Baz) In-Reply-To: <199905041226.IAA07627@eric.cnri.reston.va.us> Message-ID: <000701be96b9$4e434460$799e2299@tim> I've consistently found that the best way to kill a thread is to rename it accurately . Agree w/ Guido that few people really care about the differing semantics. Agree w/ Andrew that it's bad to pull a semantic switcheroo at this stage anyway: code will definitely break. Like

    \b(?:
        (?P<keyword>and|if|else|...)
      | (?P<identifier>[a-zA-Z_]\w*)
    )\b

The (special)|(general) idiom relies on left-to-right match-and-out searching of alternatives to do its job correctly. Not to mention that \b is not a word-boundary assertion in the new pkg (talk about pointlessly irritating differences!
at least this one could be easily hidden via brainless preprocessing). Over the long run, moving to a DFA locks Python out of the directions Perl is *moving*, namely embedding all sorts of runtime gimmicks in regexps that exploit knowing the "state of the match so far". DFAs don't work that way. I don't mind losing those possibilities, because I think the regexp sublanguage is strained beyond its limits already. But that's a decision with Big Consequences, so deserves some thought. I'd definitely like the (sometimes dramatically) increased speed a DFA can offer (btw, this code appears to use a lazily-generated DFA, to avoid the exponential *compile*-time a straightforward DFA implementation can suffer -- the code is very complex and lacks any high-level internal docs, so we better hope Henry stays in love with it <0.5 wink>). > ... > My guess is that people don't so much translate long Perl regexp's > to Python but simply transport their (always incomplete -- Larry Wall > *wants* it that way :-) knowledge of Perl regexps to Python. This is directly proportional to the number of feeble CGI programmers Python attracts . The good news is that they wouldn't know an NFA from a DFA if Larry bit Henry on the ass ... > My meta-guess is that this is also Henry Spencer's and John > Ousterhout's guess. I think Spencer strongly favors DFA semantics regardless of fashion, and Ousterhout is a pragmatist. So I trust JO's judgment more <0.9 wink>. > As for Larry Wall, I guess he really doesn't care :-) I expect he cares a lot! Because a DFA would prevent Perl from going even more insane in its present direction. About the age of the code, postings to comp.lang.tcl have Henry saying he was working on the alpha version intensely as recently as December ('98). A few complaints about the alpha release trickled in, about regexp compile speed and regexp matching speed in specific cases.
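Tim's (special)|(general) point is easy to demonstrate with Python's re module, whose left-to-right alternative ordering the idiom depends on (the group names here are illustrative):

```python
import re

# Keyword branch first, general-identifier branch second.
token = re.compile(r"\b(?:(?P<keyword>and|if|else)|(?P<identifier>[a-zA-Z_]\w*))\b")

# 'if' satisfies both branches, but the keyword branch is tried first.
print(token.match("if").lastgroup)    # 'keyword'
print(token.match("iffy").lastgroup)  # 'identifier' -- \b rejects keyword 'if' here
```

Under longest-overall semantics the ordering guarantee disappears, which is exactly the kind of code breakage being warned about.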
Perhaps paradoxically, the latter were about especially simple regexps with long fixed substrings (where this mountain of sophisticated machinery is likely to get beat cold by an NFA with some fixed-substring lookahead smarts -- which latter Henry intended to graft into this pkg too). [Andrew] > BTW, there's an interesting reference, I assume to this code, in > _Mastering Regular Expressions_; Spencer is quoted on page 121 as > saying it's "at worst quadratic in text size.". [Guido] > Not sure if that was the same code -- this is *new* code, not > Spencer's old code. I think Friedl's book is older than the current > code. I expect this is an invariant, though: it's not natural for a DFA to know where subexpression matches begin and end, and there's a pile of xxx_dissect functions in regexec.c that use what strongly appear to be worst-case quadratic-time algorithms for figuring that out after it's known that the overall expression has *a* match. Expect too, but don't know, that only pathological cases are actually expensive. Question: has this package been released in any other context, or is it unique to Tcl? I searched in vain for an announcement (let alone code) from Henry, or any discussion of this code outside the Tcl world. whatever-happens-i-vote-we-let-them-debug-it-ly y'rs - tim From gstein at lyra.org Wed May 5 08:22:20 1999 From: gstein at lyra.org (Greg Stein) Date: Tue, 4 May 1999 23:22:20 -0700 (PDT) Subject: [Python-Dev] Tcl 8.1's regexp code In-Reply-To: <000701be96b9$4e434460$799e2299@tim> Message-ID: On Wed, 5 May 1999, Tim Peters wrote: >... > Question: has this package been released in any other context, or is it > unique to Tcl? I searched in vain for an announcement (let alone code) from > Henry, or any discussion of this code outside the Tcl world. Apache uses it. However, the Apache guys have considered the possibility of updating the thing. I gather that they have a pretty old snapshot.
Another guy mentioned PCRE and I pointed out that Python uses it for its regex support. In other words, if Apache *does* update the code, then it may be that Apache will drop the HS engine in favor of PCRE. Cheers, -g -- Greg Stein, http://www.lyra.org/ From Ivan.Porres at abo.fi Wed May 5 10:29:21 1999 From: Ivan.Porres at abo.fi (Ivan Porres Paltor) Date: Wed, 05 May 1999 11:29:21 +0300 Subject: [Python-Dev] Python for Small Systems patch Message-ID: <37300161.8DFD1D7F@abo.fi> Python for Small Systems is a minimal version of the python interpreter, intended to run on small embedded systems with a limited amount of memory. Since there is some interest in the newsgroup, we have decided to release an alpha version of the patch. You can download the patch from the following page: http://www.abo.fi/~iporres/python There is no documentation about the changes, but I guess that it is not so difficult to figure out what Raul has been doing. There are some simple examples in the Demo/hitachi directory. The configure scripts are broken. We plan to modify the configure scripts for cross-compilation. We are still testing, cleaning and trying to reduce the memory requirements of the patched interpreter. We also plan to write some documentation.
Please send comments to Raul (rparra at abo.fi) or to me (iporres at abo.fi),

Regards,
Ivan

--
Ivan Porres Paltor
Turku Centre for Computer Science
Åbo Akademi, Department of Computer Science
Phone: +358-2-2154033
Lemminkäinengatan 14A FIN-20520 Turku - Finland
http://www.abo.fi/~iporres

From tismer at appliedbiometrics.com Wed May 5 13:52:24 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Wed, 05 May 1999 13:52:24 +0200
Subject: [Python-Dev] Python for Small Systems patch
References: <37300161.8DFD1D7F@abo.fi>
Message-ID: <373030F8.21B73451@appliedbiometrics.com>

Ivan Porres Paltor wrote:
> 
> Python for Small Systems is a minimal version of the python interpreter,
> intended to run on small embedded systems with a limited amount of
> memory.
> 
> Since there is some interest in the newsgroup, we have decided to release
> an alpha version of the patch. You can download the patch from the
> following page:
> 
> http://www.abo.fi/~iporres/python
> 
> There is no documentation about the changes, but I guess that it is not
> so difficult to figure out what Raul has been doing.

Ivan, small Python is a very interesting thing, thanks for the preview.

But, aren't 12600 lines of diff a little too much to call it "not difficult to figure out"? :-)

The very last line was indeed helpful:

+++ Pss/miniconfigure Tue Mar 16 16:59:42 1999
@@ -0,0 +1 @@
+./configure --prefix="/home/rparra/python/Python-1.5.1" --without-complex --without-float --without-long --without-file --without-libm --without-libc --without-fpectl --without-threads --without-dec-threads --with-libs=

But I'd be interested in a brief list of which other features are out, and even more which structures were changed. Would that be possible?

thanks - chris

--
Christian Tismer :^) Applied Biometrics GmbH : Have a break!
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home

From Ivan.Porres at abo.fi Wed May 5 15:17:17 1999
From: Ivan.Porres at abo.fi (Ivan Porres Paltor)
Date: Wed, 05 May 1999 16:17:17 +0300
Subject: [Python-Dev] Python for Small Systems patch
References: <37300161.8DFD1D7F@abo.fi> <373030F8.21B73451@appliedbiometrics.com>
Message-ID: <373044DD.FE4499E@abo.fi>

Christian Tismer wrote:
> Ivan,
> small Python is a very interesting thing,
> thanks for the preview.
> 
> But, aren't 12600 lines of diff a little too much
> to call it "not difficult to figure out"? :-)

Raul Parra (rpb), the author of the patch, got the "source scissors" (#ifndef WITHOUT... #endif) and cut the interpreter until it fitted into an embedded system with some RAM, no keyboard, no screen and no OS. An example application can be a printer where the print jobs are Python byte-compiled scripts (instead of PostScript).

We plan to write some documentation about the patch. Meanwhile, here are some of the changes:

WITHOUT_PARSER, WITHOUT_COMPILER
Defining WITHOUT_PARSER removes the parser. This has a lot of implications (no eval() !) but saves a lot of memory. The interpreter can only execute byte-compiled scripts, that is PyCodeObjects.

Most embedded processors have poor floating point capabilities (they cannot compete with DSPs):

WITHOUT-COMPLEX Removes support for complex numbers
WITHOUT-LONG Removes long numbers
WITHOUT-FLOAT Removes floating point numbers

Dependencies on the OS:

WITHOUT-FILE Removes file objects. No file, no print, no input, no interactive prompt. This is not too bad in a device without hard disk, keyboard or screen...
WITHOUT-GETPATH Removes dependencies on the OS path. (Probably this change should be integrated with WITHOUT-FILE)

These changes render most of the standard modules unusable. There are no fundamental changes to the interpreter, just cut and cut....

Ivan

--
Ivan Porres Paltor
Turku Centre for Computer Science
Åbo Akademi, Department of Computer Science
Phone: +358-2-2154033
Lemminkäinengatan 14A FIN-20520 Turku - Finland
http://www.abo.fi/~iporres

From tismer at appliedbiometrics.com Wed May 5 15:31:05 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Wed, 05 May 1999 15:31:05 +0200
Subject: [Python-Dev] Python for Small Systems patch
References: <37300161.8DFD1D7F@abo.fi> <373030F8.21B73451@appliedbiometrics.com> <373044DD.FE4499E@abo.fi>
Message-ID: <37304819.AD636B67@appliedbiometrics.com>

Ivan Porres Paltor wrote:
> 
> Christian Tismer wrote:
> > Ivan,
> > small Python is a very interesting thing,
> > thanks for the preview.
> > 
> > But, aren't 12600 lines of diff a little too much
> > to call it "not difficult to figure out"? :-)
> 
> Raul Parra (rpb), the author of the patch, got the "source scissors"
> (#ifndef WITHOUT... #endif) and cut the interpreter until it fitted into
> an embedded system with some RAM, no keyboard, no screen and no OS. An
> example application can be a printer where the print jobs are Python
> byte-compiled scripts (instead of PostScript).
> 
> We plan to write some documentation about the patch. Meanwhile, here are
> some of the changes:

Many thanks, this is really interesting.

> These changes render most of the standard modules unusable.
> There are no fundamental changes to the interpreter, just cut and cut....

I see. One last thing I'm curious about is the executable size, if it can be compared to a Windows DLL at all. Did you compile without the changes for your target as well? How is the ratio? The python15.dll file contains everything of core Python and is about 560 KB large.
If your engine goes down to, say, below 200 KB, this could be a great thing for embedding Python into other apps.

ciao & thanks - chris

--
Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home

From bwarsaw at cnri.reston.va.us Wed May 5 16:55:40 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Wed, 5 May 1999 10:55:40 -0400 (EDT)
Subject: [Python-Dev] Tcl 8.1's regexp code (was RE: [Python-Dev] Why Foo is better than Baz)
References: <199905041226.IAA07627@eric.cnri.reston.va.us> <000701be96b9$4e434460$799e2299@tim>
Message-ID: <14128.23532.499380.835737@anthem.cnri.reston.va.us>

>>>>> "TP" == Tim Peters writes:

TP> Over the long run, moving to a DFA locks Python out of the
TP> directions Perl is *moving*, namely embedding all sorts of
TP> runtime gimmicks in regexps that exploit knowing the "state of
TP> the match so far". DFAs don't work that way. I don't mind
TP> losing those possibilities, because I think the regexp
TP> sublanguage is strained beyond its limits already. But that's
TP> a decision with Big Consequences, so deserves some thought.

I know zip about the internals of the various regexp packages. But as far as the Python-level interface, would it be feasible to support both as underlying regexp engines underneath re.py? The idea would be that you'd add an extra flag (re.PERL / re.TCL ? re.DFA / re.NFA ? re.POSIX / re.USEFUL ? :-) that would select the engine and compiler. Then all the rest of the magic happens behind the scenes, with appropriate exceptions thrown if there are syntax mismatches in the regexp that can't be worked around by preprocessors, etc.

Or would that be more confusing than yet another different regexp module?
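[Editorial note: none of these engine flags ever existed in re. Purely to illustrate the interface Barry is sketching, a thin dispatch layer over interchangeable engines could look like the following -- both table entries point at the real backtracking compiler here, with the DFA slot as a hypothetical stand-in:]

```python
import re

NFA, DFA = "nfa", "dfa"   # hypothetical engine selectors, not real re flags

_engines = {
    NFA: re.compile,      # today's backtracking (NFA-style) engine
    DFA: re.compile,      # placeholder: a DFA engine would be plugged in here
}

def compile_with(pattern, engine=NFA, flags=0):
    # Syntax the chosen engine can't handle would surface as re.error --
    # the "appropriate exceptions" Barry mentions.
    return _engines[engine](pattern, flags)

m = compile_with(r"(a+)b", engine=NFA).match("aaab")
print(m.group(1))  # aaa
```

The interesting (and unsolved) part is exactly what Barry flags: papering over the syntax and capture-group semantics that differ between engines.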
-Barry

From tim_one at email.msn.com Wed May 5 17:55:20 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Wed, 5 May 1999 11:55:20 -0400
Subject: [Python-Dev] Tcl 8.1's regexp code
In-Reply-To: 
Message-ID: <000601be970f$adef5740$a59e2299@tim>

[Tim]
> Question: has this package [Tcl's 8.1 regexp support] been released in
> any other context, or is it unique to Tcl? I searched in vain for an
> announcement (let alone code) from Henry, or any discussion of this code
> outside the Tcl world.

[Greg Stein]
> Apache uses it.
> 
> However, the Apache guys have considered the possibility of updating the
> thing. I gather that they have a pretty old snapshot. Another guy
> mentioned PCRE and I pointed out that Python uses it for its regex
> support. In other words, if Apache *does* update the code, then it may
> be that Apache will drop the HS engine in favor of PCRE.

Hmm. I just downloaded the Apache 1.3.4 source to check on this, and it appears to be using a lightly massaged version of Spencer's old (circa '92-'94) just-POSIX regexp package. Henry has been distributing regexp pkgs for a loooong time . The Tcl 8.1 regexp pkg is much hairier. If the Apache folk want to switch in order to get the Perl regexp syntax extensions, this Tcl version is worth looking at too. If they want to switch for some other reason, it would be good to know what that is! The base pkg Apache uses is easily available all over the web; the pkg Tcl 8.1 is using I haven't found anywhere except in the Tcl download (which is why I'm wondering about it -- so far, it doesn't appear to be distributed by Spencer himself, in a non-Tcl-customized form).
looks-like-an-entirely-new-pkg-to-me-ly y'rs - tim From beazley at cs.uchicago.edu Wed May 5 18:54:45 1999 From: beazley at cs.uchicago.edu (David Beazley) Date: Wed, 5 May 1999 11:54:45 -0500 (CDT) Subject: [Python-Dev] My (possibly delusional) book project Message-ID: <199905051654.LAA11410@tartarus.cs.uchicago.edu> Although this is a little off-topic for the developer list, I want to fill people in on a new Python book project. A few months ago, I was approached about doing a new Python reference book and I've since decided to proceed with the project (after all, an increased presence at the bookstore is probably a good thing :-). In any event, my "vision" for this book is to take the material in the Python tutorial, language reference, library reference, and extension guide and squeeze it into a compact book no longer than 300 pages (and hopefully without having to use a 4-point font). Actually, what I'm really trying to do is write something in a style similar to the K&R C Programming book (very terse, straight to the point, and technically accurate). The book's target audience is experienced/expert programmers. With this said, I would really like to get feedback from the developer community about this project in a few areas. First, I want to make sure the language reference is in sync with the latest version of Python, that it is as accurate as possible, and that it doesn't leave out any important topics or recent developments. Second, I would be interested in knowing how to emphasize certain topics (for instance, should I emphasize class-based exceptions over string-based exceptions even though most books only cover the former case?). The other big area is the library reference. Given the size of the library, I'm going to cut a number of modules out. However, the choice of what to cut is not entirely clear (for now, it's a judgment call on my part). 
All of the work in progress for this project is online at: http://rustler.cs.uchicago.edu/~beazley/essential/reference.html I would love to get constructive feedback about this from other developers. Of course, I'll keep people posted in any case. Cheers, Dave From tim_one at email.msn.com Thu May 6 07:43:16 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 6 May 1999 01:43:16 -0400 Subject: [Python-Dev] Tcl 8.1's regexp code (was RE: [Python-Dev] Why Foo is better than Baz) In-Reply-To: <14128.23532.499380.835737@anthem.cnri.reston.va.us> Message-ID: <000d01be9783$57543940$2ca22299@tim> [Tim notes that moving to a DFA regexp engine would rule out some future aping of Perl mistakes ] [Barry "The Great Compromiser" Warsaw] > I know zip about the internals of the various regexp package. But as > far as the Python level interface, would it be feasible to support > both as underlying regexp engines underneath re.py? The idea would be > that you'd add an extra flag (re.PERL / re.TCL ? re.DFA / re.NFA ? > re.POSIX / re.USEFUL ? :-) that would select the engine and compiler. > Then all the rest of the magic happens behind the scenes, with > appropriate exceptions thrown if there are syntax mismatches in the > regexp that can't be worked around by preprocessors, etc. > > Or would that be more confusing than yet another different regexp > module? It depends some on what percentage of the Python distribution Guido wants to devote to regexp code <0.6 wink>; the Tcl pkg would be the largest block of code in Modules/, where regexp packages already consume more than anything else. It's a lot of delicate, difficult code. Someone would need to step up and champion each alternative package. I haven't asked Andrew lately, but I'd bet half a buck the thrill of supporting pcre has waned. If there were competing packages, your suggested interface is fine. 
I just doubt the Python developers will support more than one (Andrew may still be young, but he can't possibly still be naive enough to sign up for two of these nightmares ).

i'm-so-old-i-never-signed-up-for-one-ly y'rs - tim

From rushing at nightmare.com Thu May 13 08:34:19 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Wed, 12 May 1999 23:34:19 -0700 (PDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <199905070507.BAA22545@python.org>
References: <199905070507.BAA22545@python.org>
Message-ID: <14138.28243.553816.166686@seattle.nightmare.com>

[list has been quiet, thought I'd liven things up a bit. 8^)]

I'm not sure if this has been brought up before in other forums, but has there been discussion of separating the Python and C invocation stacks (i.e., removing recursive calls to the interpreter) to facilitate coroutines or first-class continuations?

One of the biggest barriers to getting others to use asyncore/medusa is the need to program in continuation-passing style (callbacks, callbacks to callbacks, state machines, etc...). Usually there has to be an overriding requirement for speed/scalability before someone will even look into it. And even when you do 'get' it, there are limits to how inside-out your thinking can go. 8^)

If Python had coroutines/continuations, it would be possible to hide asyncore-style select()/poll() machinery 'behind the scenes'. I believe that Concurrent ML does exactly this...

Other advantages might be restartable exceptions, different threading models, etc...

-Sam
rushing at nightmare.com
rushing at eGroups.net

From mal at lemburg.com Thu May 13 10:23:13 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 13 May 1999 10:23:13 +0200
Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com>
Message-ID: <373A8BF1.AE124BF@lemburg.com>

rushing at nightmare.com wrote:
> 
> [list has been quiet, thought I'd liven things up a bit. 8^)]

Well, there certainly is enough on the todo list... it's probably the usual "ain't got no time" thing.

> I'm not sure if this has been brought up before in other forums, but
> has there been discussion of separating the Python and C invocation
> stacks (i.e., removing recursive calls to the interpreter) to
> facilitate coroutines or first-class continuations?

Wouldn't it be possible to move all the C variables passed to eval_code() via the execution frame? AFAIK, the frame is generated on every call to eval_code() and thus could also be generated *before* calling it.

> One of the biggest barriers to getting others to use asyncore/medusa
> is the need to program in continuation-passing style (callbacks,
> callbacks to callbacks, state machines, etc...). Usually there has to
> be an overriding requirement for speed/scalability before someone will
> even look into it. And even when you do 'get' it, there are limits to
> how inside-out your thinking can go. 8^)
> 
> If Python had coroutines/continuations, it would be possible to hide
> asyncore-style select()/poll() machinery 'behind the scenes'. I
> believe that Concurrent ML does exactly this...
> 
> Other advantages might be restartable exceptions, different threading
> models, etc...

Don't know if moving the C stack stuff into the frame objects will get you the desired effect: what about other things having state (e.g. connections or files) that are not even touched by this mechanism?
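[Editorial note: the frame object Marc-Andre wants to widen already carries most per-call state. A quick look -- using sys._getframe() introspection, which postdates this thread -- shows the code object and the locals living on the frame chain rather than only on the C stack:]

```python
import sys

def leaf():
    f = sys._getframe()      # the frame executing this very call
    g = f.f_back             # the caller's frame, one hop down the chain
    return f.f_code.co_name, g.f_code.co_name, sorted(g.f_locals)

def caller():
    x = 42                   # shows up in this frame's f_locals
    return leaf()

result = caller()
print(result)  # ('leaf', 'caller', ['x'])
```

What the frame does *not* carry is exactly Marc-Andre's point: the C-level locals and return addresses of eval_code itself, which is why a recursive C call still pins the Python frame in place.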
--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 232 days left
Business: http://www.lemburg.com/
Python Pages: http://starship.python.net/crew/lemburg/

From rushing at nightmare.com Thu May 13 11:40:19 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Thu, 13 May 1999 02:40:19 -0700 (PDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <373A8BF1.AE124BF@lemburg.com>
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <373A8BF1.AE124BF@lemburg.com>
Message-ID: <14138.38550.89759.752058@seattle.nightmare.com>

M.-A. Lemburg writes:
> Wouldn't it be possible to move all the C variables passed to
> eval_code() via the execution frame ? AFAIK, the frame is
> generated on every call to eval_code() and thus could also
> be generated *before* calling it.

I think this solves half of the problem. The C stack is both a value stack and an execution stack (i.e., it holds variables and return addresses). Getting rid of arguments (and a return value!) gets rid of the need for the 'value stack' aspect.

In aiming for an enter-once, exit-once VM, the thorniest part is to somehow allow python->c->python calls. The second invocation could never save a continuation because its execution context includes a C frame. This is a general problem, not specific to Python; I probably should have thought about it a bit before posting...

> Don't know if moving the C stack stuff into the frame objects
> will get you the desired effect: what about other things having
> state (e.g. connections or files), that are not even touched
> by this mechanism ?

I don't think either of those causes 'real' problems (i.e., nothing should crash that assumes an open file or socket), but there may be other stateful things that might. I don't think that refcounts would be a problem - a saved continuation wouldn't be all that different from an exception traceback.

-Sam

p.s.
Here's a tiny VM experiment I wrote a while back, to explain what I mean by 'stackless':

http://www.nightmare.com/stuff/machine.h
http://www.nightmare.com/stuff/machine.c

Note how OP_INVOKE (the PROC_CLOSURE clause) pushes new context onto heap-allocated data structures rather than calling the VM recursively.

From skip at mojam.com Thu May 13 13:38:39 1999
From: skip at mojam.com (Skip Montanaro)
Date: Thu, 13 May 1999 07:38:39 -0400 (EDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <14138.28243.553816.166686@seattle.nightmare.com>
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com>
Message-ID: <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com>

Sam> I'm not sure if this has been brought up before in other forums,
Sam> but has there been discussion of separating the Python and C
Sam> invocation stacks (i.e., removing recursive calls to the
Sam> interpreter) to facilitate coroutines or first-class continuations?

I thought Guido was working on that for the mobile agent stuff he was working on at CNRI.

Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/
skip at mojam.com | Musi-Cal: http://www.musi-cal.com/
518-372-5583

From bwarsaw at cnri.reston.va.us Thu May 13 17:10:52 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Thu, 13 May 1999 11:10:52 -0400 (EDT)
Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com>
Message-ID: <14138.60284.584739.711112@anthem.cnri.reston.va.us>

>>>>> "SM" == Skip Montanaro writes:

SM> I thought Guido was working on that for the mobile agent stuff
SM> he was working on at CNRI.

Nope, we decided that we could accomplish everything we needed without this.
We occasionally revisit this, but Guido keeps insisting it's a lot of work for not enough benefit :-)

-Barry

From guido at CNRI.Reston.VA.US Thu May 13 17:19:10 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 13 May 1999 11:19:10 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: Your message of "Thu, 13 May 1999 07:38:39 EDT." <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com>
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com>
Message-ID: <199905131519.LAA01097@eric.cnri.reston.va.us>

Interesting topic! While I'm on the road, a few short notes.

> I thought Guido was working on that for the mobile agent stuff he was
> working on at CNRI.

Indeed. At least I planned on working on it. I ended up abandoning the idea because I expected it would be a lot of work and I never had the time (same old story indeed).

Sam also hit the nail on the head: the hardest problem is what to do about all the places where C calls back into Python. I've come up with two partial solutions: (1) allow for a way to arrange for a call to be made immediately after you return to the VM from C; this would take care of apply() at least and a few other "tail-recursive" cases; (2) invoke a new VM when C code needs a Python result, requiring it to return. The latter clearly breaks certain uses of coroutines but could probably be made to work most of the time. Typical use of the 80-20 rule.

And I've just come up with a third solution: a variation on (1) where you arrange *two* calls: one to Python and then one to C, with the result of the first. (And a bit saying whether you want the C call to be made even when an exception happened.)

In general, I still think it's a cool idea, but I also still think that continuations are too complicated for most programmers. (This comes from the realization that they are too complicated for me!)
Corollary: even if we had continuations, I'm not sure if this would take away the resistance against asyncore/asynchat. Of course I could be wrong.

Different suggestion: it would be cool to work on completely separating out the VM from the rest of Python, through some kind of C-level API specification. Two things should be possible with this new architecture: (1) small platform ports could cut out the interactive interpreter, the parser and compiler, and certain data types such as long, complex and files; (2) there could be alternative pluggable VMs with certain desirable properties such as platform-specific optimization (Christian, are you listening? :-).

I think the most challenging part might be defining an API for passing in the set of supported object types and operations. E.g. the EXEC_STMT opcode needs to be implemented in a way that allows "exec" to be absent from the language. Perhaps an __exec__ function (analogous to __import__) is the way to go. The set of built-in functions should also be passed in, so that e.g. one can easily leave out open(), eval() and compile(), complex(), long(), float(), etc.

I think it would be ideal if no #ifdefs were needed to remove features (at least not in the VM code proper). Fortunately, the VM doesn't really know about many object types -- frames, functions, methods, classes, ints, strings, dictionaries, tuples, tracebacks, that may be all it knows. (Lists?)

Gotta run,

--Guido van Rossum (home page: http://www.python.org/~guido/)

From fredrik at pythonware.com Thu May 13 21:50:44 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu, 13 May 1999 21:50:44 +0200
Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <199905131519.LAA01097@eric.cnri.reston.va.us>
Message-ID: <01d501be9d79$e4890060$f29b12c2@pythonware.com>

> In general, I still think it's a cool idea, but I also still think
> that continuations are too complicated for most programmers. (This
> comes from the realization that they are too complicated for me!)

in an earlier life, I used non-preemptive threads (that is, explicit yields) and co-routines to do some really cool stuff with very little code. looks like a stackless interpreter would make it trivial to implement that.

might just be nostalgia, but I think I would give an arm or two to get that (not necessarily my own, though ;-)

From rushing at nightmare.com Fri May 14 04:00:09 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Thu, 13 May 1999 19:00:09 -0700 (PDT)
Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us>
Message-ID: <14139.30970.644343.612721@seattle.nightmare.com>

Guido van Rossum writes:
> I've come up with two partial solutions: (1) allow for a way to
> arrange for a call to be made immediately after you return to the
> VM from C; this would take care of apply() at least and a few
> other "tail-recursive" cases; (2) invoke a new VM when C code
> needs a Python result, requiring it to return. The latter clearly
> breaks certain uses of coroutines but could probably be made to
> work most of the time. Typical use of the 80-20 rule.

I know this is disgusting, but could setjmp/longjmp 'automagically' force a 'recursive call' to jump back into the top-level loop? This would put some serious restraint on what C called from Python could do...
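[Editorial note: a pure-Python analogue of "jump back into the top-level loop" is a trampoline: instead of letting a call recurse, each step returns a thunk for the next step, and one flat loop runs them all on a single stack frame -- the control shape a stackless VM is after. The even/odd pair below is just a toy workload, not anything from the interpreter:]

```python
# Mutually recursive the naive way, these would blow the recursion limit.
# Here each "tail call" is deferred as a zero-argument thunk instead.

def even(n):
    if n == 0:
        return True
    return lambda: odd(n - 1)   # defer, don't recurse

def odd(n):
    if n == 0:
        return False
    return lambda: even(n - 1)

def trampoline(thunk):
    # The one top-level loop: keep bouncing until a non-callable result.
    while callable(thunk):
        thunk = thunk()
    return thunk

print(trampoline(lambda: even(100000)))  # True, with no RecursionError
```

The hard part Sam identifies survives the analogy: the trick only works if nothing in the chain genuinely needs to sit in the middle of a native call while waiting for a result.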
I think just about any Scheme implementation has to solve this same problem... I'll dig through my collection of them for ideas.

> In general, I still think it's a cool idea, but I also still think
> that continuations are too complicated for most programmers. (This
> comes from the realization that they are too complicated for me!)
> Corollary: even if we had continuations, I'm not sure if this would
> take away the resistance against asyncore/asynchat. Of course I could
> be wrong.

Theoretically, you could have a bit of code that looked just like 'normal' imperative code, but that would actually be entering and exiting the context for non-blocking i/o. If it were done right, the same exact code might even run under 'normal' threads.

Recently I've written an async server that needed to talk to several other RPC servers, and a mysql server. Pseudo-example, with possibly-async calls in UPPERCASE:

auth, archive = db.FETCH_USER_INFO (user)
if verify_login(user,auth):
    rpc_server = self.archive_servers[archive]
    group_info = rpc_server.FETCH_GROUP_INFO (group)
    if valid (group_info):
        return rpc_server.FETCH_MESSAGE (message_number)
    else:
        ...
else:
    ...

This code in CPS is a horrible, complicated mess; it takes something like 8 callback methods, and variables and exceptions have to be passed around in 'continuation' objects. It's hairy because there are three levels of callback state. Ugh.

If Python had closures, then it would be a *little* easier, but would still make the average Pythoneer swoon. Closures would let you put the above logic all in one method, but the code would still be 'inside-out'.

> Different suggestion: it would be cool to work on completely
> separating out the VM from the rest of Python, through some kind of
> C-level API specification.

I think this is a great idea. I've been staring at python bytecodes a bit lately thinking about how to do something like this, for some subset of Python.

[...]

Ok, we've all seen the 'stick'.
I guess I should give an example of the 'carrot': I think that a web server built on such a Python could have the performance/scalability of thttpd, with the ease-of-programming of Roxen. As far as I know, there's nothing like it out there. Medusa would be put out to pasture. 8^)

-Sam

From guido at CNRI.Reston.VA.US Fri May 14 14:03:31 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 14 May 1999 08:03:31 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: Your message of "Thu, 13 May 1999 19:00:09 PDT." <14139.30970.644343.612721@seattle.nightmare.com>
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com>
Message-ID: <199905141203.IAA01808@eric.cnri.reston.va.us>

> I know this is disgusting, but could setjmp/longjmp 'automagically'
> force a 'recursive call' to jump back into the top-level loop? This
> would put some serious restraint on what C called from Python could
> do...

Forget about it. setjmp/longjmp are invitations to problems. I also assume that they would interfere badly with C++.

> I think just about any Scheme implementation has to solve this same
> problem... I'll dig through my collection of them for ideas.

Anything that assumes knowledge about how the C compiler and/or the CPU and OS lay out the stack is a no-no, because it means that the first thing one has to do for a port to a new architecture is figure out how the stack is laid out. Another thread in this list is porting Python to microplatforms like PalmOS. Typically the Scheme hackers are not afraid to delve deep into the machine, but I refuse to do that -- I think it's too risky.

> > In general, I still think it's a cool idea, but I also still think > > that continuations are too complicated for most programmers.
(This > > comes from the realization that they are too complicated for me!) > > Corollary: even if we had continuations, I'm not sure if this would > > take away the resistance against asyncore/asynchat. Of course I could > > be wrong. > > Theoretically, you could have a bit of code that looked just like > 'normal' imperative code, that would actually be entering and exiting > the context for non-blocking i/o. If it were done right, the same > exact code might even run under 'normal' threads. Yes -- I remember in 92 or 93 I worked out a way to emulat coroutines with regular threads. (I think in cooperation with Steve Majewski.) > Recently I've written an async server that needed to talk to several > other RPC servers, and a mysql server. Pseudo-example, with > possibly-async calls in UPPERCASE: > > auth, archive = db.FETCH_USER_INFO (user) > if verify_login(user,auth): > rpc_server = self.archive_servers[archive] > group_info = rpc_server.FETCH_GROUP_INFO (group) > if valid (group_info): > return rpc_server.FETCH_MESSAGE (message_number) > else: > ... > else: > ... > > This code in CPS is a horrible, complicated mess, it takes something > like 8 callback methods, variables and exceptions have to be passed > around in 'continuation' objects. It's hairy because there are three > levels of callback state. Ugh. Agreed. > If Python had closures, then it would be a *little* easier, but would > still make the average Pythoneer swoon. Closures would let you put > the above logic all in one method, but the code would still be > 'inside-out'. I forget how this worked :-( > > Different suggestion: it would be cool to work on completely > > separating out the VM from the rest of Python, through some kind of > > C-level API specification. > > I think this is a great idea. I've been staring at python bytecodes a > bit lately thinking about how to do something like this, for some > subset of Python. > > [...] > > Ok, we've all seen the 'stick'. 
> I guess I should give an example of
> the 'carrot': I think that a web server built on such a Python could
> have the performance/scalability of thttpd, with the
> ease-of-programming of Roxen. As far as I know, there's nothing like
> it out there. Medusa would be put out to pasture. 8^)

I'm afraid I haven't kept up -- what are Roxen and thttpd? What do they do that Apache doesn't?

--Guido van Rossum (home page: http://www.python.org/~guido/)

From fredrik at pythonware.com Fri May 14 15:16:13 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri, 14 May 1999 15:16:13 +0200
Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us>
Message-ID: <001701be9e0b$f1bc4930$f29b12c2@pythonware.com>

> I'm afraid I haven't kept up -- what are Roxen and thttpd? What do
> they do that Apache doesn't?

http://www.roxen.com/

a lean and mean secure web server written in Pike (http://pike.idonex.se/), from a company here in Linköping.

From tismer at appliedbiometrics.com Fri May 14 17:15:20 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Fri, 14 May 1999 17:15:20 +0200
Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us>
Message-ID: <373C3E08.FCCB141B@appliedbiometrics.com>

Guido van Rossum wrote:

[setjmp/longjmp -no-no]
> Forget about it. setjmp/longjmp are invitations to problems. I also
> assume that they would interfere badly with C++.
> > > I think just about any Scheme implementation has to solve this same > > problem... I'll dig through my collection of them for ideas. > > Anything that assumes knowledge about how the C compiler and/or the > CPU and OS lay out the stack is a no-no, because it means that the > first thing one has to do for a port to a new architecture is figure > out how the stack is laid out. Another thread in this list is porting > Python to microplatforms like PalmOS. Typically the scheme Hackers > are not afraid to delve deep into the machine, but I refuse to do that > -- I think it's too risky. ... I agree that this is generally bad, although it's a cakewalk to do a stack swap for the few (x86-based :) platforms I work with; that is much less costly than a thread switch. But on the general issues: Can the Python-calls-C and C-calls-Python problem just be solved by turning the whole VM state into a data structure, including a Python call stack which is independent? Maybe this has been mentioned already. This might give a little slowdown, but opens possibilities like continuation-passing style, and context switches between different interpreter states would be under direct control. Just a little dreaming: Not using threads, but just tiny interpreter incarnations with local state, and a special C call or better a new opcode which activates the next state in some list (of course a Python list). This would automagically produce Icon iterators (duck) and coroutines (cover). If I guess right, continuation passing could be done by just shifting tiny tuples around. Well, Tim, help me :-) [closures] > > I think this is a great idea. I've been staring at python bytecodes a > bit lately thinking about how to do something like this, for some > subset of Python. Lumberjack? How is it going? [to Sam] ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break!
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From bwarsaw at cnri.reston.va.us Fri May 14 17:32:51 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Fri, 14 May 1999 11:32:51 -0400 (EDT) Subject: [Python-Dev] 'stackless' python? References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us> <001701be9e0b$f1bc4930$f29b12c2@pythonware.com> Message-ID: <14140.16931.987089.887772@anthem.cnri.reston.va.us> >>>>> "FL" == Fredrik Lundh writes: FL> a lean and mean secure web server written in Pike FL> (http://pike.idonex.se/), from a company here in FL> Linköping. Interesting off-topic Pike connection. My co-maintainer for CC-Mode originally came on board to add Pike support, which has a syntax similar enough to C to be easily integrated. I think I've had as much success convincing him to use Python as he's had convincing me to use Pike :-) -Barry From gstein at lyra.org Fri May 14 23:54:02 1999 From: gstein at lyra.org (Greg Stein) Date: Fri, 14 May 1999 14:54:02 -0700 Subject: [Python-Dev] Roxen (was Re: [Python-Dev] 'stackless' python?) References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us> <001701be9e0b$f1bc4930$f29b12c2@pythonware.com> <14140.16931.987089.887772@anthem.cnri.reston.va.us> Message-ID: <373C9B7A.3676A910@lyra.org> Barry A.
Warsaw wrote: > > >>>>> "FL" == Fredrik Lundh writes: > > FL> a lean and mean secure web server written in Pike > FL> (http://pike.idonex.se/), from a company here in > FL> Linköping. > > Interesting off-topic Pike connection. My co-maintainer for CC-Mode > originally came on board to add Pike support, which has a syntax similar > enough to C to be easily integrated. I think I've had as much success > convincing him to use Python as he's had convincing me to use Pike :-) Heh. Pike is an outgrowth of the MUD world's LPC programming language. A guy named "Profezzorn" started a project (in '94?) to redevelop an LPC compiler/interpreter ("driver") from scratch to avoid some licensing constraints. The project grew into a generalized network handler, since MUDs' typical designs are excellent for these tasks. From there, you get the Roxen web server. Cheers, -g -- Greg Stein, http://www.lyra.org/ From rushing at nightmare.com Sat May 15 01:36:11 1999 From: rushing at nightmare.com (rushing at nightmare.com) Date: Fri, 14 May 1999 16:36:11 -0700 (PDT) Subject: [Python-Dev] 'stackless' python? In-Reply-To: <199905141203.IAA01808@eric.cnri.reston.va.us> References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us> Message-ID: <14140.44469.848840.740112@seattle.nightmare.com> Guido van Rossum writes: > > If Python had closures, then it would be a *little* easier, but would > > still make the average Pythoneer swoon. Closures would let you put > > the above logic all in one method, but the code would still be > > 'inside-out'. > > I forget how this worked :-( [with a faked-up lambda-ish syntax] def thing (a): return do_async_job_1 (a, lambda (b): if (a>1): do_async_job_2a (b, lambda (c): [...] ) else: do_async_job_2b (a,b, lambda (d,e,f): [...]
) ) The call to do_async_job_1 passes 'a', and a callback, which is specified 'in-line'. You can follow the logic of something like this more easily than if each lambda is spun off into a different function/method. > > I think that a web server built on such a Python could have the > > performance/scalability of thttpd, with the ease-of-programming > > of Roxen. As far as I know, there's nothing like it out there. > > Medusa would be put out to pasture. 8^) > > I'm afraid I haven't kept up -- what are Roxen and thttpd? What do > they do that Apache doesn't? thttpd (& Zeus, Squid, Xitami) use select()/poll() to gain performance and scalability, but suffer from the same programmability problem as Medusa (only worse, 'cause they're in C). Roxen is written in Pike, a c-like language with gc, threads, etc... Roxen is I think now the official 'GNU Web Server'. Here's an interesting web-server comparison chart: http://www.acme.com/software/thttpd/benchmarks.html -Sam From guido at CNRI.Reston.VA.US Sat May 15 04:23:24 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 14 May 1999 22:23:24 -0400 Subject: [Python-Dev] 'stackless' python? In-Reply-To: Your message of "Fri, 14 May 1999 16:36:11 PDT." <14140.44469.848840.740112@seattle.nightmare.com> References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us> <14140.44469.848840.740112@seattle.nightmare.com> Message-ID: <199905150223.WAA02457@eric.cnri.reston.va.us> > def thing (a): > return do_async_job_1 (a, > lambda (b): > if (a>1): > do_async_job_2a (b, > lambda (c): > [...] > ) > else: > do_async_job_2b (a,b, > lambda (d,e,f): > [...] > ) > ) > > The call to do_async_job_1 passes 'a', and a callback, which is > specified 'in-line'. 
You can follow the logic of something like this > more easily than if each lambda is spun off into a different > function/method. I agree that it is still ugly. > http://www.acme.com/software/thttpd/benchmarks.html I see. Any pointers to a graph of thttpd market share? --Guido van Rossum (home page: http://www.python.org/~guido/) From tim_one at email.msn.com Sat May 15 09:51:00 1999 From: tim_one at email.msn.com (Tim Peters) Date: Sat, 15 May 1999 03:51:00 -0400 Subject: [Python-Dev] 'stackless' python? In-Reply-To: <199905141203.IAA01808@eric.cnri.reston.va.us> Message-ID: <000701be9ea7$acab7f40$159e2299@tim> [GvR] > ... > Anything that assumes knowledge about how the C compiler and/or the > CPU and OS lay out the stack is a no-no, because it means that the > first thing one has to do for a port to a new architecture is figure > out how the stack is laid out. Another thread in this list is porting > Python to microplatforms like PalmOS. Typically the scheme Hackers > are not afraid to delve deep into the machine, but I refuse to do that > -- I think it's too risky. The Icon language needs a bit of platform-specific context-switching assembly code to support its full coroutine features, although its bread-and-butter generators ("semi coroutines") don't need anything special. The result is that Icon ports sometimes limp for a year before they support full coroutines, waiting for someone wizardly enough to write the necessary code. This can, in fact, be quite difficult; e.g., on machines with HW register windows (where "the stack" can be a complicated beast half buried in hidden machine state, sometimes needing kernel privilege to uncover). Not attractive. Generators are, though <wink>. threads-too-ly y'rs - tim From tim_one at email.msn.com Sat May 15 09:51:03 1999 From: tim_one at email.msn.com (Tim Peters) Date: Sat, 15 May 1999 03:51:03 -0400 Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <373C3E08.FCCB141B@appliedbiometrics.com> Message-ID: <000801be9ea7$ae45f560$159e2299@tim> [Christian Tismer] > ... > But on the general issues: > Can the Python-calls-C and C-calls-Python problem just be solved > by turning the whole VM state into a data structure, including > a Python call stack which is independent? Maybe this has been > mentioned already. The problem is that when C calls Python, any notion of continuation has to include C's state too, else resuming the continuation won't return into C correctly. The C code that *implements* Python could be reworked to support this, but in the general case you've got some external C extension module calling into Python, and then Python hasn't a clue about its caller's state. I'm not a fan of continuations myself; coroutines can be implemented faithfully via threads (I posted a rather complete set of Python classes for that in the pre-DejaNews days, a bit more flexible than Icon's coroutines); and: > This would automagically produce ICON iterators (duck) > and coroutines (cover). Icon iterators/generators could be implemented today if anyone bothered (Majewski essentially implemented them back around '93 already, but seemed to lose interest when he realized it couldn't be extended to full continuations, because of C/Python stack intertwingling). > If I guess right, continuation passing could be done > by just shifting tiny tuples around. Well, Tim, help me :-) Python-calling-Python continuations should be easily doable in a "stackless" Python; the key ideas were already covered in this thread, I think. The thing that makes generators so much easier is that they always return directly to their caller, at the point of call; so no C frame can get stuck in the middle even under today's implementation; it just requires not deleting the generator's frame object, and adding an opcode to *resume* the frame's execution the next time the generator is called. 
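[Editor's note: the resumable frame Tim describes is exactly what later became Python generators. A retrospective sketch, anachronistic for 1999 but runnable on a modern interpreter, shows the tree-traversal payoff he alludes to; the (left, value, right) tuple encoding for nodes is a hypothetical choice made here for illustration.]

```python
# A frame that is suspended at "yield" and *resumed* on the next call
# is what makes the traversal trivial: the walk's bookkeeping lives
# entirely in the suspended frames, not in any caller-managed stack.
def inorder(node):
    # node is (left, value, right) or None -- hypothetical encoding
    if node is not None:
        left, value, right = node
        for v in inorder(left):
            yield v
        yield value
        for v in inorder(right):
            yield v

tree = ((None, 1, None), 2, (None, 3, None))
walked = list(inorder(tree))
```

Each recursive call leaves a generator frame parked mid-loop; the caller just iterates, never seeing the machinery.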
Unlike as in Icon, it wouldn't even need to be tied to a funky notion of goal-directed evaluation. don't-try-to-traverse-a-tree-without-it-ly y'rs - tim From gstein at lyra.org Sat May 15 10:17:15 1999 From: gstein at lyra.org (Greg Stein) Date: Sat, 15 May 1999 01:17:15 -0700 Subject: [Python-Dev] 'stackless' python? References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us> <14140.44469.848840.740112@seattle.nightmare.com> <199905150223.WAA02457@eric.cnri.reston.va.us> Message-ID: <373D2D8B.390C523C@lyra.org> Guido van Rossum wrote: > ... > > http://www.acme.com/software/thttpd/benchmarks.html > > I see. Any pointers to a graph of thttp market share? thttpd currently has about 70k sites (of 5.4mil found by Netcraft). That puts it at #6. However, it is interesting to note that 60k of those sites are in the .uk domain. I can't figure out who is running it, but I would guess that a large UK-based ISP is hosting a bunch of domains on thttpd. It is somewhat difficult to navigate the various reports (and it never fails that the one you want is not present), but the data is from Netcraft's survey at: http://www.netcraft.com/survey/ Cheers, -g -- Greg Stein, http://www.lyra.org/ From rushing at nightmare.com Sun May 16 13:10:18 1999 From: rushing at nightmare.com (rushing at nightmare.com) Date: Sun, 16 May 1999 04:10:18 -0700 (PDT) Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <81365478@toto.iv> Message-ID: <14142.40867.103424.764346@seattle.nightmare.com> Tim Peters writes: > I'm not a fan of continuations myself; coroutines can be > implemented faithfully via threads (I posted a rather complete set > of Python classes for that in the pre-DejaNews days, a bit more > flexible than Icon's coroutines); and: Continuations are more powerful than coroutines, though I admit they're a bit esoteric. I programmed in Scheme for years without seeing the need for them. But when you need 'em, you *really* need 'em. No way around it. For my purposes (massively scalable single-process servers and clients) threads don't cut it... for example I have a mailing-list exploder that juggles up to 2048 simultaneous SMTP connections. I think it can go higher - I've tested select() on FreeBSD with 16,000 file descriptors. [...] BTW, I have actually made progress borrowing a bit of code from SCM. It uses the stack-copying technique, along with setjmp/longjmp. It's too ugly and unportable to be a real candidate for inclusion in Official Python. [i.e., if it could be made to work it should be considered a stopgap measure for the desperate]. I haven't tested it thoroughly, but I have successfully saved and invoked (and reinvoked) a continuation. Caveat: I have to turn off Py_DECREF in order to keep it from crashing.

| >>> import callcc
| >>> saved = None
| >>> def thing(n):
| ...     if n == 2:
| ...         global saved
| ...         saved = callcc.new()
| ...     print 'n==',n
| ...     if n == 0:
| ...         print 'Done!'
| ...     else:
| ...         thing (n-1)
| ...
| >>> thing (5)
| n== 5
| n== 4
| n== 3
| n== 2
| n== 1
| n== 0
| Done!
| >>> saved
| <Continuation object at 80d30d0>
| >>> saved.throw (0)
| n== 2
| n== 1
| n== 0
| Done!
| >>> saved.throw (0)
| n== 2
| n== 1
| n== 0
| Done!
| >>>

I will probably not be able to work on this for a while (baby due any day now), so anyone is welcome to dive right in.
I don't have much experience wading through gdb tracking down reference bugs, so I'm hoping a brave soul will pick up where I left off. 8^) http://www.nightmare.com/stuff/python-callcc.tar.gz ftp://www.nightmare.com/stuff/python-callcc.tar.gz -Sam From tismer at appliedbiometrics.com Sun May 16 17:31:01 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Sun, 16 May 1999 17:31:01 +0200 Subject: [Python-Dev] 'stackless' python? References: <14142.40867.103424.764346@seattle.nightmare.com> Message-ID: <373EE4B5.6EE6A678@appliedbiometrics.com> rushing at nightmare.com wrote: [...] > BTW, I have actually made progress borrowing a bit of code from SCM. > It uses the stack-copying technique, along with setjmp/longjmp. It's > too ugly and unportable to be a real candidate for inclusion in > Official Python. [i.e., if it could be made to work it should be > considered a stopgap measure for the desperate]. I tried it and built it as a Win32 .pyd file, and it seems to work, but... > I haven't tested it thoroughly, but I have successfully saved and > invoked (and reinvoked) a continuation. Caveat: I have to turn off > Py_DECREF in order to keep it from crashing. Indeed, and this seems to be a problem too hard to solve without lots of work. Since you keep a snapshot of the current machine stack, it contains a number of object references which were valid when the snapshot was taken, but many are most probably invalid when you restart the continuation. I guess incref-ing all currently alive objects on the interpreter stack would be the minimum, maybe more. A tuple of the necessary references could be used as an attribute of a Continuation object. I will look into how difficult this is. ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break!
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From tismer at appliedbiometrics.com Sun May 16 20:31:01 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Sun, 16 May 1999 20:31:01 +0200 Subject: [Python-Dev] 'stackless' python? References: <14142.40867.103424.764346@seattle.nightmare.com> <373EE4B5.6EE6A678@appliedbiometrics.com> Message-ID: <373F0EE5.A8DE00C5@appliedbiometrics.com> Christian Tismer wrote: > > rushing at nightmare.com wrote: [...] > > I haven't tested it thoroughly, but I have successfully saved and > > invoked (and reinvoked) a continuation. Caveat: I have to turn off > > Py_DECREF in order to keep it from crashing. It is possible, but a little hard. To take a working snapshot of the current thread's stack, one needs not only the stack snapshot which continue.c provides, but also a restorable copy of all frame objects involved so far. A copy of the current frame chain must be built, with proper reference counting of all involved elements. And this is the crux: The current stack pointer of the VM is not present in the frame objects, but hangs around somewhere on the machine stack. Two solutions: 1) modify PyFrameObject by adding a field which holds the stack pointer, when a function is called. I don't like to change the VM in any way for this. 2) use the lasti field which holds the last VM instruction offset. Then scan the opcodes of the code object and calculate the current stack level. This is possible since Guido's code generator creates code with the stack level lexically bound to the code offset. Now we can incref all the referenced objects in the frame. This must be done for the whole chain, which is copied and relinked during that. This chain is then held as a property of the continuation object. 
To throw the continuation, the current frame chain must be cleared, and the saved one is inserted, together with the machine stack operation which Sam already has. A little hefty, isn't it? ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From tim_one at email.msn.com Mon May 17 07:42:59 1999 From: tim_one at email.msn.com (Tim Peters) Date: Mon, 17 May 1999 01:42:59 -0400 Subject: [Python-Dev] 'stackless' python? In-Reply-To: <14142.40867.103424.764346@seattle.nightmare.com> Message-ID: <000f01bea028$1f75c360$fb9e2299@tim> [Sam] > Continuations are more powerful than coroutines, though I admit > they're a bit esoteric. "More powerful" is a tedious argument you should always avoid <wink>. > I programmed in Scheme for years without seeing the need for them. > But when you need 'em, you *really* need 'em. No way around it. > > For my purposes (massively scalable single-process servers and > clients) threads don't cut it... for example I have a mailing-list > exploder that juggles up to 2048 simultaneous SMTP connections. I > think it can go higher - I've tested select() on FreeBSD with 16,000 > file descriptors. The other point being that you want to avoid "inside out" logic, though, right? Earlier you posted a kind of ideal: Recently I've written an async server that needed to talk to several other RPC servers, and a mysql server. Pseudo-example, with possibly-async calls in UPPERCASE:

    auth, archive = db.FETCH_USER_INFO (user)
    if verify_login(user,auth):
        rpc_server = self.archive_servers[archive]
        group_info = rpc_server.FETCH_GROUP_INFO (group)
        if valid (group_info):
            return rpc_server.FETCH_MESSAGE (message_number)
        else:
            ...
    else:
        ...
I assume you want to capture a continuation object in the UPPERCASE methods, store it away somewhere, run off to your select/poll/whatever loop, and have it invoke the stored continuation objects as the data they're waiting for arrives. If so, that's got to be the nicest use for continuations I've seen! All invisible to the end user. I don't know how to fake it pleasantly without threads, either, and understand that threads aren't appropriate for resource reasons. So I don't have a nice alternative. > ...
> | >>> import callcc
> | >>> saved = None
> | >>> def thing(n):
> | ...     if n == 2:
> | ...         global saved
> | ...         saved = callcc.new()
> | ...     print 'n==',n
> | ...     if n == 0:
> | ...         print 'Done!'
> | ...     else:
> | ...         thing (n-1)
> | ...
> | >>> thing (5)
> | n== 5
> | n== 4
> | n== 3
> | n== 2
> | n== 1
> | n== 0
> | Done!
> | >>> saved
> | <Continuation object at 80d30d0>
> | >>> saved.throw (0)
> | n== 2
> | n== 1
> | n== 0
> | Done!
> | >>> saved.throw (0)
> | n== 2
> | n== 1
> | n== 0
> | Done!
> | >>>

Suppose the driver were in a script instead:

    thing(5)            # line 1
    print repr(saved)   # line 2
    saved.throw(0)      # line 3
    saved.throw(0)      # line 4

Then the continuation would (eventually) "return to" the "print repr(saved)" and we'd get an infinite output tail of:

<Continuation object at 80d30d0>
n== 2
n== 1
n== 0
Done!
<Continuation object at 80d30d0>
n== 2
n== 1
n== 0
Done!
<Continuation object at 80d30d0>
n== 2
n== 1
n== 0
Done!
<Continuation object at 80d30d0>
n== 2
n== 1
n== 0
Done!
...
> Earlier you posted a kind of ideal: > > Recently I've written an async server that needed to talk to several > other RPC servers, and a mysql server. Pseudo-example, with > possibly-async calls in UPPERCASE: > > auth, archive = db.FETCH_USER_INFO (user) > if verify_login(user,auth): > rpc_server = self.archive_servers[archive] > group_info = rpc_server.FETCH_GROUP_INFO (group) > if valid (group_info): > return rpc_server.FETCH_MESSAGE (message_number) > else: > ... > else: > ... > > I assume you want to capture a continuation object in the UPPERCASE methods, > store it away somewhere, run off to your select/poll/whatever loop, and have > it invoke the stored continuation objects as the data they're waiting for > arrives. > > If so, that's got to be the nicest use for continuations I've seen! All > invisible to the end user. I don't know how to fake it pleasantly without > threads, either, and understand that threads aren't appropriate for resource > reasons. So I don't have a nice alternative. It can always be done with threads, but also without. Tried it last night, with proper refcounting, and it wasn't too easy since I had to duplicate the Python frame chain. ... > Suppose the driver were in a script instead: > > thing(5) # line 1 > print repr(saved) # line 2 > saved.throw(0) # line 3 > saved.throw(0) # line 4 > > Then the continuation would (eventually) "return to" the "print repr(saved)" > and we'd get an infinite output tail of: > > <Continuation object at 80d30d0> > n== 2 > n== 1 > n== 0 > Done! > <Continuation object at 80d30d0> > n== 2 > n== 1 > n== 0 > Done! This is at the moment exactly what happens, with the difference that after some repetitions we GPF due to dangling references to too often decref'ed objects. My incref'ing prepares for just one re-incarnation and should prevent a second call. But this will be solved, soon. > and never reach line 4. Right? That's the part that Guido hates <wink>. Yup.
With a little counting, it was easy to survive:

    def main():
        global a
        a=2
        thing (5)
        a=a-1
        if a:
            saved.throw (0)

Weird enough, and it needs a much better interface. But finally I'm quite happy that it worked so smoothly after just a couple of hours (well, about six :) ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From rushing at nightmare.com Mon May 17 11:46:29 1999 From: rushing at nightmare.com (rushing at nightmare.com) Date: Mon, 17 May 1999 02:46:29 -0700 (PDT) Subject: [Python-Dev] 'stackless' python? In-Reply-To: <000f01bea028$1f75c360$fb9e2299@tim> References: <14142.40867.103424.764346@seattle.nightmare.com> <000f01bea028$1f75c360$fb9e2299@tim> Message-ID: <14143.56604.21827.891993@seattle.nightmare.com> Tim Peters writes: > [Sam] > > Continuations are more powerful than coroutines, though I admit > > they're a bit esoteric. > > "More powerful" is a tedious argument you should always avoid <wink>. More powerful in the sense that you can use continuations to build lots of different control structures (coroutines, backtracking, exceptions), but not vice versa. Kinda like a better tool for blowing one's own foot off. 8^) > Suppose the driver were in a script instead: > > thing(5) # line 1 > print repr(saved) # line 2 > saved.throw(0) # line 3 > saved.throw(0) # line 4 > > Then the continuation would (eventually) "return to" the "print repr(saved)" > and we'd get an infinite output tail [...] > > and never reach line 4. Right? That's the part that Guido hates <wink>. Yes... the continuation object so far isn't very usable. It needs a driver of some kind around it. In the Scheme world, there are two common ways of using continuations - let/cc and call/cc.
[call/cc is what is in the standard; its official name is call-with-current-continuation] let/cc stores the continuation in a variable binding, while introducing a new scope. It requires a change to the underlying language: (+ 1 (let/cc escape (...) (escape 34))) => 35 'escape' is a function that when called will 'resume' with whatever follows the let/cc clause. In this case it would continue with the addition... call/cc is a little trickier, but doesn't require any change to the language... instead of making a new binding directly, you pass in a function that will receive the binding: (+ 1 (call/cc (lambda (escape) (...) (escape 34)))) => 35 In words, it's much more frightening: "call/cc is a function, that when called with a function as an argument, will pass that function an argument that is a new function, which when called with a value will resume the computation with that value as the result of the entire expression" Phew. In Python, an example might look like this:

    SAVED = None
    def save_continuation (k):
        global SAVED
        SAVED = k

    def thing():
        [...]
        value = callcc (lambda k: save_continuation(k))

    # or more succinctly:
    def thing():
        [...]
        value = callcc (save_continuation)

In order to do useful work like passing values back and forth between coroutines, we have to have some way of returning a value from the continuation when it is reinvoked. I should emphasize that most folks will never see call/cc 'in the raw'; it will usually have some nice wrapper around it to implement whatever construct is needed. -Sam From arw at ifu.net Mon May 17 20:06:18 1999 From: arw at ifu.net (Aaron Watters) Date: Mon, 17 May 1999 14:06:18 -0400 Subject: [Python-Dev] coroutines vs. continuations vs. threads Message-ID: <37405A99.1DBAF399@ifu.net> The illustrious Sam Rushing avers: >Continuations are more powerful than coroutines, though I admit >they're a bit esoteric. I programmed in Scheme for years without >seeing the need for them. But when you need 'em, you *really* need >'em.
No way around it. Frankly, I think I thought I understood this once but now I know I don't. How're continuations more powerful than coroutines? And why can't they be implemented using threads (and semaphores etc)? ...I'm not promising I'll understand the answer... -- Aaron Watters === I taught I taw a putty-cat! From gmcm at hypernet.com Mon May 17 21:18:43 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Mon, 17 May 1999 14:18:43 -0500 Subject: [Python-Dev] coroutines vs. continuations vs. threads In-Reply-To: <37405A99.1DBAF399@ifu.net> Message-ID: <1285153546-166193857@hypernet.com> The estimable Aaron Watters queries: > The illustrious Sam Rushing avers: > >Continuations are more powerful than coroutines, though I admit > >they're a bit esoteric. I programmed in Scheme for years without > >seeing the need for them. But when you need 'em, you *really* need > >'em. No way around it. > > Frankly, I think I thought I understood this once but now I know I > don't. How're continuations more powerful than coroutines? And why > can't they be implemented using threads (and semaphores etc)? I think Sam's (immediate ) problem is that he can't afford threads - he may have hundreds to thousands of these suckers. As a fuddy-duddy old imperative programmer, I'm inclined to think "state machine". But I'd guess that functional-ophiles probably see that as inelegant. (Safe guess - they see _anything_ that isn't functional as inelegant!). crude-but-not-rude-ly y'rs - Gordon From jeremy at cnri.reston.va.us Mon May 17 20:43:34 1999 From: jeremy at cnri.reston.va.us (Jeremy Hylton) Date: Mon, 17 May 1999 14:43:34 -0400 (EDT) Subject: [Python-Dev] coroutines vs. continuations vs. 
threads In-Reply-To: <37405A99.1DBAF399@ifu.net>
References: <37405A99.1DBAF399@ifu.net>
Message-ID: <14144.24242.128959.726878@bitdiddle.cnri.reston.va.us>

>>>>> "AW" == Aaron Watters writes:

AW> The illustrious Sam Rushing avers:
>> Continuations are more powerful than coroutines, though I admit
>> they're a bit esoteric. I programmed in Scheme for years without
>> seeing the need for them. But when you need 'em, you *really*
>> need 'em. No way around it.

AW> Frankly, I think I thought I understood this once but now I know
AW> I don't. How're continuations more powerful than coroutines?
AW> And why can't they be implemented using threads (and semaphores
AW> etc)?

I think I understood, too. I'm hoping that someone will debug my answer and enlighten us both.

A continuation is a mechanism for making control flow explicit: a means of naming and manipulating "the rest of the program." In Scheme terms, the continuation is the function that the value of the current expression should be passed to. The call/cc mechanism lets you capture the current continuation and explicitly call on it. The most typical use of call/cc is non-local exits, but it gives you incredible flexibility for implementing your control flow.

I'm fuzzy on coroutines, as I've only seen them in "Structured Programming" (which is as old as I am :-) and never actually used them. The basic idea is that when a coroutine calls another coroutine, control is transferred to the second coroutine at the point at which it last left off (by itself calling another coroutine or by detaching, which returns control to the lexically enclosing scope).

It seems to me that coroutines are an example of the kind of control structure that you could build with continuations. It's not clear that the reverse is true. I have to admit that I'm a bit unclear on the motivation for all this. As Gordon said, the state machine approach seems like it would be a good approach.
Jeremy

From klm at digicool.com Mon May 17 21:08:57 1999
From: klm at digicool.com (Ken Manheimer)
Date: Mon, 17 May 1999 15:08:57 -0400
Subject: [Python-Dev] coroutines vs. continuations vs. threads
Message-ID: <613145F79272D211914B0020AFF640190BEEDE@gandalf.digicool.com>

Jeremy Hylton:
> I have to admit that I'm a bit unclear on the motivation for all
> this. As Gordon said, the state machine approach seems like it would
> be a good approach.

If i understand what you mean by state machine programming, it's pretty inherently uncompartmented: all the combinations of state variables need to be accounted for, so the number of states grows factorially with the number of state vars; in general it's awkward. The advantage of going with what functional folks come up with, like continuations, is that it tends to be well compartmented - functional. (Come to think of it, i suppose that compartmentalization as opposed to state is their mania.)

As abstract as i can be (because i hardly know what i'm talking about) (but i have done some specifically finite state machine programming, and did not enjoy it),

Ken
klm at digicool.com

From arw at ifu.net Mon May 17 21:20:13 1999
From: arw at ifu.net (Aaron Watters)
Date: Mon, 17 May 1999 15:20:13 -0400
Subject: [Python-Dev] coroutines vs. continuations vs. threads
References: <1285153546-166193857@hypernet.com>
Message-ID: <37406BED.95AEB896@ifu.net>

The ineffable Gordon McMillan retorts:

> As a fuddy-duddy old imperative programmer, I'm inclined to think
> "state machine". But I'd guess that functional-ophiles probably see
> that as inelegant. (Safe guess - they see _anything_ that isn't
> functional as inelegant!).
As a fellow fuddy-duddy I'd agree, except that if you write properly layered software you have to unroll and reroll all those layers for every transition of the multi-level state machine, and even though with proper discipline it can be implemented without becoming hideous, it still adds significant overhead compared to "stop right here and come back later", which could be implemented using threads/coroutines(?)/continuations. I think this is particularly true in Python with the relatively high function call overhead. Or maybe I'm out in left field doing cartwheels...

I guess the question of interest is why are threads insufficient? I guess they have system limitations on the number of threads or other limitations that wouldn't be a problem with continuations? If there aren't a *lot* of situations where coroutines are vital, I'd be hesitant to do major surgery. But I'm a fuddy-duddy.

-- Aaron Watters
===
I did! I did!

From tismer at appliedbiometrics.com Mon May 17 22:03:01 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Mon, 17 May 1999 22:03:01 +0200
Subject: [Python-Dev] coroutines vs. continuations vs. threads
References: <1285153546-166193857@hypernet.com> <37406BED.95AEB896@ifu.net>
Message-ID: <374075F5.F29B4EAB@appliedbiometrics.com>

Aaron Watters wrote:
>
> The ineffable Gordon McMillan retorts:
>
> > As a fuddy-duddy old imperative programmer, I'm inclined to think
> > "state machine". But I'd guess that functional-ophiles probably see
> > that as inelegant. (Safe guess - they see _anything_ that isn't
> > functional as inelegant!).
> > As a fellow fuddy-duddy I'd agree, except that if you write properly layered
> > software you have to unroll and reroll all those layers for every
> > transition of the multi-level state machine, and even though with proper
> > discipline it can be implemented without becoming hideous, it still adds
> > significant overhead compared to "stop right here and come back later"
> > which could be implemented using threads/coroutines(?)/continuations.

Coroutines are most elegant here, since (for a simple example) they are a symmetric pair of functions which call each other. There is neither the "one pulls, the other pushes" asymmetry, nor the need to maintain state and be controlled by a supervisor function.

> I think this is particularly true in Python with the relatively high
> function call overhead. Or maybe I'm out in left field doing cartwheels...
> I guess the question of interest is why are threads insufficient? I guess
> they have system limitations on the number of threads or other limitations
> that wouldn't be a problem with continuations? If there aren't a *lot* of
> situations where coroutines are vital, I'd be hesitant to do major
> surgery.

For me (as always) most interesting is the possible speed of coroutines. They involve no thread overhead, no locking, no nothing. Python supports it better than expected. If the stack level of two code objects is the same at a switching point, the whole switch is nothing more than swapping two frame objects, and we're done. This might be even cheaper than general call/cc, like a function call. Sam's prototype works already, with no change to the interpreter (but knowledge of Python frames, and a .dll of course). I think we'll continue a while.

continuously - chris

--
Christian Tismer :^) Applied Biometrics GmbH : Have a break!
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From gmcm at hypernet.com Tue May 18 00:17:25 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Mon, 17 May 1999 17:17:25 -0500 Subject: [Python-Dev] coroutines vs. continuations vs. threads In-Reply-To: <374075F5.F29B4EAB@appliedbiometrics.com> Message-ID: <1285142823-166838954@hypernet.com> Co-Christian-routines Tismer continues: > Aaron Watters wrote: > > > > The ineffible Gordon McMillan retorts: > > > > > As a fuddy-duddy old imperative programmer, I'm inclined to think > > > "state machine". But I'd guess that functional-ophiles probably see > > > that as inelegant. (Safe guess - they see _anything_ that isn't > > > functional as inelegant!). > > > > As a fellow fuddy-duddy I'd agree except that if you write properlylayered > > software you have to unrole and rerole all those layers for every > > transition of the multi-level state machine, and even though with proper > > discipline it can be implemented without becoming hideous, it still adds > > significant overhead compared to "stop right here and come back later" > > which could be implemented using threads/coroutines(?)/continuations. > > Coroutines are most elegant here, since (fir a simple example) > they are a symmetric pair of functions which call each other. > There is neither the one-pulls, the other pushes asymmetry, nor the > need to maintain state and be controlled by a supervisor function. Well, the state maintains you, instead of the other way 'round. (Any other ex-Big-Blue-ers out there that used to play these games with checkpoint and SyncSort?). I won't argue elegance. Just a couple points: - there's an art to writing state machines which is largely unrecognized (most of them are unnecessarily horrid). 
- a multiplexed solution (vs a threaded solution) requires that something be inside out. In one case it's your code, in the other, your understanding of the problem. Neither is trivial.

Not to be discouraging - as long as your solution doesn't involve using regexps on bytecode , I say go for it!

- Gordon

From guido at CNRI.Reston.VA.US Tue May 18 06:03:34 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 18 May 1999 00:03:34 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: Your message of "Mon, 17 May 1999 02:46:29 PDT." <14143.56604.21827.891993@seattle.nightmare.com>
References: <14142.40867.103424.764346@seattle.nightmare.com> <000f01bea028$1f75c360$fb9e2299@tim> <14143.56604.21827.891993@seattle.nightmare.com>
Message-ID: <199905180403.AAA04772@eric.cnri.reston.va.us>

Sam (& others),

I thought I understood what continuations were, but the examples of what you can do with them so far don't clarify the matter at all. Perhaps it would help to explain what a continuation actually does with the run-time environment, instead of giving examples of how to use them and what the result is?

Here's a start of my own understanding (brief because I'm on a 28.8k connection which makes my ordinary typing habits in Emacs very painful).

1. All program state is somehow contained in a single execution stack. This includes globals (which are simply name bindings in the bottom stack frame). It also includes a code pointer for each stack frame indicating where the function corresponding to that stack frame is executing (this is the return address if there is a newer stack frame, or the current instruction for the newest frame).

2. A continuation does something equivalent to making a copy of the entire execution stack. This can probably be done lazily. There are probably lots of details. I also expect that Scheme's semantic model is different than Python here -- e.g. does it matter whether deep or shallow copies are made? I.e.
are there mutable *objects* in Scheme? (I know there are mutable and immutable *name bindings* -- I think.)

3. Calling a continuation probably makes the saved copy of the execution stack the current execution state; I presume there's also a way to pass an extra argument.

4. Coroutines (which I *do* understand) are probably done by swapping between two (or more) continuations.

5. Other control constructs can be done by various manipulations of continuations. I presume that in many situations the saved continuation becomes the main control locus permanently, and the (previously) current stack is simply garbage-collected. Of course the lazy copy makes this efficient.

If this all is close enough to the truth, I think that continuations involving C stack frames are definitely out -- as Tim Peters mentioned, you don't know what the stuff on the C stack of extensions refers to. (My guess would be that Scheme implementations assume that any pointers on the C stack point to Scheme objects, so that C stack frames can be copied and conservative GC can be used -- this will never happen in Python.)

Continuations involving only Python stack frames might be supported, if we can agree on the sharing / copying semantics. This is where I don't know enough (see questions at #2 above).

--Guido van Rossum (home page: http://www.python.org/~guido/)

From tim_one at email.msn.com Tue May 18 06:46:12 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 18 May 1999 00:46:12 -0400
Subject: [Python-Dev] coroutines vs. continuations vs. threads
In-Reply-To: <37406BED.95AEB896@ifu.net>
Message-ID: <000901bea0e9$5aa2dec0$829e2299@tim>

[Aaron Watters]
> ...
> I guess the question of interest is why are threads insufficient? I
> guess they have system limitations on the number of threads or other
> limitations that wouldn't be a problem with continuations?
Sam is mucking with thousands of simultaneous I/O-bound socket connections, and makes a good case that threads simply don't fly here (each one consumes a stack, kernel resources, etc). It's unclear (to me) that thousands of continuations would be *much* better, though, by the time Christian gets done making thousands of copies of the Python stack chain.

> If there aren't a *lot* of situations where coroutines are vital, I'd
> be hesitant to do major surgery. But I'm a fuddy-duddy.

Go to Sam's site (http://www.nightmare.com/), download Medusa, and read the docs. They're very well written and describe the problem space exquisitely. I don't have any problems like that I need to solve, but it's interesting to ponder!

alas-no-time-for-it-now-ly y'rs - tim

From tim_one at email.msn.com Tue May 18 06:45:52 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 18 May 1999 00:45:52 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <373FC02A.69F2D912@appliedbiometrics.com>
Message-ID: <000301bea0e9$4fd473a0$829e2299@tim>

[Christian Tismer]
> ...
> Yup. With a little counting, it was easy to survive:
>
> def main():
>     global a
>     a=2
>     thing (5)
>     a=a-1
>     if a:
>         saved.throw (0)

Did "a" really need to be global here? I hope you see the same behavior without the "global a"; e.g., this Scheme:

(define -cont- #f)

(define thing
  (lambda (n)
    (if (= n 2)
        (call/cc (lambda (k) (set! -cont- k))))
    (display "n == ")
    (display n)
    (newline)
    (if (= n 0)
        (begin (display "Done!") (newline))
        (thing (- n 1)))))

(define main
  (lambda ()
    (let ((a 2))
      (thing 5)
      (display "a is ")
      (display a)
      (newline)
      (set! a (- a 1))
      (if (> a 0)
          (-cont- #f)))))

(main)

prints:

n == 5
n == 4
n == 3
n == 2
n == 1
n == 0
Done!
a is 2
n == 2
n == 1
n == 0
Done!
a is 1

Or does brute-force frame-copying cause the continuation to set "a" back to 2 each time?

> Weird enough

Par for the continuation course! They're nasty when eaten raw.

> and needs a much better interface.

Ya, like screw 'em and use threads .
> But finally I'm quite happy that it worked so smoothly > after just a couple of hours (well, about six :) Yup! Playing with Python internals is a treat. to-be-continued-ly y'rs - tim From tim_one at email.msn.com Tue May 18 06:45:57 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 18 May 1999 00:45:57 -0400 Subject: [Python-Dev] 'stackless' python? In-Reply-To: <14143.56604.21827.891993@seattle.nightmare.com> Message-ID: <000401bea0e9$51e467e0$829e2299@tim> [Sam] >>> Continuations are more powerful than coroutines, though I admit >>> they're a bit esoteric. [Tim] >> "More powerful" is a tedious argument you should always avoid . [Sam] > More powerful in the sense that you can use continuations to build > lots of different control structures (coroutines, backtracking, > exceptions), but not vice versa. "More powerful" is a tedious argument you should always avoid >. >> Then the continuation would (eventually) "return to" the >> "print repr(saved)" and we'd get an infinite output tail [...] >> and never reach line 4. Right? > Yes... the continuation object so far isn't very usable. But it's proper behavior for a continuation all the same! So this aspect shouldn't be "fixed". > ... > let/cc stores the continuation in a variable binding, while > introducing a new scope. It requires a change to the underlying > language: Isn't this often implemented via a macro, though, so that (let/cc name code) "acts like" (call/cc (lambda (name) code)) ? I haven't used a Scheme with native let/cc, but poking around it appears that the real intent is to support exception-style function exits with a mechanism cheaper than 1st-class continuations: twice saw the let/cc object (the thingie bound to "name") defined as being invalid the instant after "code" returns, so it's an "up the call stack" gimmick. That doesn't sound powerful enough for what you're after. > [nice let/cc call/cc tutorialette] > ... 
> In order to do useful work like passing values back and forth between > coroutines, we have to have some way of returning a value from the > continuation when it is reinvoked. Somehow, I suspect that's the least of our problems <0.5 wink>. If continuations are in Python's future, though, I agree with the need as stated. > I should emphasize that most folks will never see call/cc 'in the > raw', it will usually have some nice wrapper around to implement > whatever construct is needed. Python already has well-developed exception and thread facilities, so it's hard to make a case for continuations as a catch-all implementation mechanism. That may be the rub here: while any number of things *can* be implementated via continuations, I think very few *need* to be implemented that way, and full-blown continuations aren't easy to implement efficiently & portably. The Icon language was particularly concerned with backtracking searches, and came up with generators as another clearer/cheaper implementation technique. When it went on to full-blown coroutines, it's hard to say whether continuations would have been a better approach. But the coroutine implementation it has is sluggish and buggy and hard to port, so I doubt they could have done noticeably worse. Would full-blown coroutines be powerful enough for your needs? assuming-the-practical-defn-of-"powerful-enough"-ly y'rs - tim From rushing at nightmare.com Tue May 18 07:18:06 1999 From: rushing at nightmare.com (rushing at nightmare.com) Date: Mon, 17 May 1999 22:18:06 -0700 (PDT) Subject: [Python-Dev] 'stackless' python? 
In-Reply-To: <000401bea0e9$51e467e0$829e2299@tim> References: <14143.56604.21827.891993@seattle.nightmare.com> <000401bea0e9$51e467e0$829e2299@tim> Message-ID: <14144.61765.308962.101884@seattle.nightmare.com> Tim Peters writes: > Isn't this often implemented via a macro, though, so that > > (let/cc name code) > > "acts like" > > (call/cc (lambda (name) code)) Yup, they're equivalent, in the sense that given one you can make a macro to do the other. call/cc is preferred because it doesn't require a new binding construct. > ? I haven't used a Scheme with native let/cc, but poking around it > appears that the real intent is to support exception-style function > exits with a mechanism cheaper than 1st-class continuations: twice > saw the let/cc object (the thingie bound to "name") defined as > being invalid the instant after "code" returns, so it's an "up the > call stack" gimmick. That doesn't sound powerful enough for what > you're after. Except that since the escape procedure is 'first-class' it can be stored away and invoked (and reinvoked) later. [that's all that 'first-class' means: a thing that can be stored in a variable, returned from a function, used as an argument, etc..] I've never seen a let/cc that wasn't full-blown, but it wouldn't surprise me. > The Icon language was particularly concerned with backtracking > searches, and came up with generators as another clearer/cheaper > implementation technique. When it went on to full-blown > coroutines, it's hard to say whether continuations would have been > a better approach. But the coroutine implementation it has is > sluggish and buggy and hard to port, so I doubt they could have > done noticeably worse. Many Scheme implementors either skip it, or only support non-escaping call/cc (i.e., exceptions in Python). > Would full-blown coroutines be powerful enough for your needs? Yes, I think they would be. But I think with Python it's going to be just about as hard, either way. 
-Sam From rushing at nightmare.com Tue May 18 07:48:29 1999 From: rushing at nightmare.com (rushing at nightmare.com) Date: Mon, 17 May 1999 22:48:29 -0700 (PDT) Subject: [Python-Dev] coroutines vs. continuations vs. threads In-Reply-To: <51325225@toto.iv> Message-ID: <14144.63787.502454.111804@seattle.nightmare.com> Aaron Watters writes: > Frankly, I think I thought I understood this once but now I know I > don't. 8^) That's what I said when I backed into the idea via medusa a couple of years ago. > How're continuations more powerful than coroutines? And why can't > they be implemented using threads (and semaphores etc)? My understanding of the original 'coroutine' (from Pascal?) was that it allows two procedures to 'resume' each other. The classic coroutine example is the 'samefringe' problem: given two trees of differing structure, are they equal in the sense that a traversal of the leaves results in the same list? Coroutines let you do this efficiently, comparing leaf-by-leaf without storing the whole tree. continuations can do coroutines, but can also be used to implement backtracking, exceptions, threads... probably other stuff I've never heard of or needed. The reason that Scheme and ML are such big fans of continuations is because they can be used to implement all these other features. Look at how much try/except and threads complicate other language implementations. It's like a super-tool-widget - if you make sure it's in your toolbox, you can use it to build your circular saw and lathe from scratch. Unfortunately there aren't many good sites on the web with good explanatory material. The best reference I have is "Essentials of Programming Languages". 
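[Editorial aside: Sam's samefringe description above maps neatly onto Python generators, a feature that arrived only after this thread (one-way coroutines, in effect). A minimal sketch; `Node`, `fringe`, and `samefringe` are illustrative names, not anything from the thread:]

```python
from itertools import zip_longest


class Node:
    """A binary interior node; anything that isn't a Node is a leaf."""
    def __init__(self, left, right):
        self.left, self.right = left, right


def fringe(tree):
    """Yield leaves left-to-right without materializing the whole list."""
    if isinstance(tree, Node):
        yield from fringe(tree.left)
        yield from fringe(tree.right)
    else:
        yield tree


def samefringe(t1, t2):
    """Compare leaf-by-leaf, stopping at the first mismatch."""
    sentinel = object()  # marks the shorter fringe running out
    return all(a is not sentinel and b is not sentinel and a == b
               for a, b in zip_longest(fringe(t1), fringe(t2),
                                       fillvalue=sentinel))


t1 = Node(Node(1, 2), 3)   # different shapes,
t2 = Node(1, Node(2, 3))   # same fringe: 1, 2, 3
print(samefringe(t1, t2))  # -> True
```

Each `fringe` generator suspends after producing a leaf and resumes where it left off, which is exactly the "compare leaf-by-leaf without storing the whole tree" behavior Sam describes.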
For those that want to play with some of these ideas using little VM's written in Python: http://www.nightmare.com/software.html#EOPL -Sam From rushing at nightmare.com Tue May 18 07:56:37 1999 From: rushing at nightmare.com (rushing at nightmare.com) Date: Mon, 17 May 1999 22:56:37 -0700 (PDT) Subject: [Python-Dev] coroutines vs. continuations vs. threads In-Reply-To: <13631823@toto.iv> Message-ID: <14144.65355.400281.123856@seattle.nightmare.com> Jeremy Hylton writes: > I have to admit that I'm a bit unclear on the motivation for all > this. As Gordon said, the state machine approach seems like it would > be a good approach. For simple problems, state machines are ideal. Medusa uses state machines that are built out of Python methods. But past a certain level of complexity, they get too hairy to understand. A really good example can be found in /usr/src/linux/net/ipv4. 8^) -Sam From rushing at nightmare.com Tue May 18 09:05:20 1999 From: rushing at nightmare.com (rushing at nightmare.com) Date: Tue, 18 May 1999 00:05:20 -0700 (PDT) Subject: [Python-Dev] 'stackless' python? In-Reply-To: <60057226@toto.iv> Message-ID: <14145.927.588572.113256@seattle.nightmare.com> Guido van Rossum writes: > Perhaps it would help to explain what a continuation actually does > with the run-time environment, instead of giving examples of how to > use them and what the result it? This helped me a lot, and is the angle used in "Essentials of Programming Languages": Usually when folks refer to a 'stack', they're refering to an *implementation* of the stack data type: really an optimization that assumes an upper bound on stack size, and that things will only be pushed and popped in order. If you were to implement a language's variable and execution stacks with actual data structures (linked lists), then it's easy to see what's needed: the head of the list represents the current state. As functions exit, they pop things off the list. The reason I brought this up (during a lull!) 
was that Python is already paying all of the cost of heap-allocated frames, and it didn't seem to me too much of a leap from there. > 1. All program state is somehow contained in a single execution stack. Yup. > 2. A continuation does something equivalent to making a copy of the > entire execution stack. Yup. > I.e. are there mutable *objects* in Scheme? > (I know there are mutable and immutable *name bindings* -- I think.) Yes, Scheme is pro-functional... but it has arrays, i/o, and set-cdr!, all the things that make it 'impure'. I think shallow copies are what's expected. In the examples I have, the continuation is kept in a 'register', and call/cc merely packages it up with a little function wrapper. You are allowed to stomp all over lexical variables with "set!". > 3. Calling a continuation probably makes the saved copy of the > execution stack the current execution state; I presume there's also a > way to pass an extra argument. Yup. > 4. Coroutines (which I *do* understand) are probably done by swapping > between two (or more) continuations. Yup. Here's an example in Scheme: http://www.nightmare.com/stuff/samefringe.scm Somewhere I have an example of coroutines being used for parsing, very elegant. Something like one coroutine does lexing, and passes tokens one-by-one to the next level, which passes parsed expressions to a compiler, or whatever. Kinda like pipes. > 5. Other control constructs can be done by various manipulations of > continuations. I presume that in many situations the saved > continuation becomes the main control locus permanently, and the > (previously) current stack is simply garbage-collected. Of course > the lazy copy makes this efficient. Yes... I think backtracking would be an example of this. You're doing a search on a large space (say a chess game). After a certain point you want to try a previous fork, to see if it's promising, but you don't want to throw away your current work. 
Save it, then unwind back to the previous fork, try that option out... if it turns out to be better then toss the original. > If this all is close enough to the truth, I think that > continuations involving C stack frames are definitely out -- as Tim > Peters mentioned, you don't know what the stuff on the C stack of > extensions refers to. (My guess would be that Scheme > implementations assume that any pointers on the C stack point to > Scheme objects, so that C stack frames can be copied and > conservative GC can be used -- this will never happen in Python.) I think you're probably right here - usually there are heavy restrictions on what kind of data can pass through the C interface. But I know of at least one Scheme (mzscheme/PLT) that uses conservative gc and has c/c++ interfaces. [... dig dig ...] From rushing at nightmare.com Tue May 18 09:17:11 1999 From: rushing at nightmare.com (rushing at nightmare.com) Date: Tue, 18 May 1999 00:17:11 -0700 (PDT) Subject: [Python-Dev] another good motivation Message-ID: <14145.4917.164756.300678@seattle.nightmare.com> "Escaping the event loop: an alternative control structure for multi-threaded GUIs" http://cs.nyu.edu/phd_students/fuchs/ http://cs.nyu.edu/phd_students/fuchs/gui.ps -Sam From tismer at appliedbiometrics.com Tue May 18 15:46:53 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Tue, 18 May 1999 15:46:53 +0200 Subject: [Python-Dev] coroutines vs. continuations vs. threads References: <000901bea0e9$5aa2dec0$829e2299@tim> Message-ID: <37416F4D.8E95D71A@appliedbiometrics.com> Tim Peters wrote: > > [Aaron Watters] > > ... > > I guess the question of interest is why are threads insufficient? I > > guess they have system limitations on the number of threads or other > > limitations that wouldn't be a problem with continuations? 
> > Sam is mucking with thousands of simultaneous I/O-bound socket connections, > and makes a good case that threads simply don't fly here (each one consumes > a stack, kernel resources, etc). It's unclear (to me) that thousands of > continuations would be *much* better, though, by the time Christian gets > done making thousands of copies of the Python stack chain. Well, what he needs here are coroutines and just a single frame object for every minithread (I think this is a "fiber"?). If these fibers later do deep function calls before they switch, there will of course be more frames then. -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From tismer at appliedbiometrics.com Tue May 18 16:35:30 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Tue, 18 May 1999 16:35:30 +0200 Subject: [Python-Dev] 'stackless' python? References: <14142.40867.103424.764346@seattle.nightmare.com> <000f01bea028$1f75c360$fb9e2299@tim> <14143.56604.21827.891993@seattle.nightmare.com> <199905180403.AAA04772@eric.cnri.reston.va.us> Message-ID: <37417AB2.80920595@appliedbiometrics.com> Guido van Rossum wrote: > > Sam (& others), > > I thought I understood what continuations were, but the examples of > what you can do with them so far don't clarify the matter at all. > > Perhaps it would help to explain what a continuation actually does > with the run-time environment, instead of giving examples of how to > use them and what the result it? > > Here's a start of my own understanding (brief because I'm on a 28.8k > connection which makes my ordinary typing habits in Emacs very > painful). > > 1. All program state is somehow contained in a single execution stack. 
> This includes globals (which are simply name bindings in the bottom
> stack frame). It also includes a code pointer for each stack frame
> indicating where the function corresponding to that stack frame is
> executing (this is the return address if there is a newer stack frame,
> or the current instruction for the newest frame).

Right. For now, this information is on the C stack for each called function, although almost completely available in the frame chain.

> 2. A continuation does something equivalent to making a copy of the
> entire execution stack. This can probably be done lazily. There are
> probably lots of details. I also expect that Scheme's semantic model
> is different than Python here -- e.g. does it matter whether deep or
> shallow copies are made? I.e. are there mutable *objects* in Scheme?
> (I know there are mutable and immutable *name bindings* -- I think.)

To make it lazy, a gatekeeper must be put on top of the two split frames, which catches the event that one of them returns. It appears to me that this is the same callcc.new() object which catches this, splitting frames when hit by a return.

> 3. Calling a continuation probably makes the saved copy of the
> execution stack the current execution state; I presume there's also a
> way to pass an extra argument.
>
> 4. Coroutines (which I *do* understand) are probably done by swapping
> between two (or more) continuations.

Right, which is just two or three assignments.

> 5. Other control constructs can be done by various manipulations of
> continuations. I presume that in many situations the saved
> continuation becomes the main control locus permanently, and the
> (previously) current stack is simply garbage-collected. Of course the
> lazy copy makes this efficient.

Yes, great. It looks like switching continuations is no more expensive than a single Python function call.
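[Editorial aside: Guido's numbered model and Sam's frames-as-linked-list picture can be made concrete with a toy: if frames are heap-allocated links rather than slots on a machine stack, capturing a continuation is copying the chain, and resuming is just installing a chain. `Frame` and `capture` are invented names for this sketch, not CPython's actual frame layout:]

```python
# Toy model of the execution stack as a linked list of frames.
# Capturing a continuation = copying the chain (one dict copy per frame);
# the live stack can then mutate freely without disturbing the capture.

class Frame:
    def __init__(self, locals_, back=None):
        self.locals = locals_   # name bindings for this call
        self.back = back        # caller's frame -- "the rest of the program"


def capture(top):
    """Return a copy of the whole frame chain rooted at `top`."""
    chain = []
    f = top
    while f is not None:        # walk down to the bottom frame
        chain.append(f)
        f = f.back
    new = None
    for f in reversed(chain):   # rebuild bottom-up, copying bindings
        new = Frame(dict(f.locals), back=new)
    return new


# Build main() -> thing() and capture inside thing():
main_f = Frame({'a': 2})
thing_f = Frame({'n': 5}, back=main_f)
k = capture(thing_f)

thing_f.locals['n'] = 0          # mutate the "live" stack...
print(k.locals['n'])             # -> 5  (the captured chain is unaffected)
```

Note the copy is shallow per frame: bindings are copied, but the objects they refer to are shared, which is roughly the deep-vs-shallow question Guido raises in his point #2.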
> Continuations involving only Python stack frames might be supported,
> if we can agree on the sharing / copying semantics. This is where
> I don't know enough (see questions at #2 above).

This would mean we must avoid creating incompatible continuations. A continuation may not switch to a frame chain which was created by a different VM incarnation, since this would later on corrupt the machine stack. One way to assure that would be a thread-safe function in sys, similar to sys.exc_info(), which gives an id for the current interpreter. Continuations living somewhere in globals would be marked by the interpreter which created them, and would refuse to be thrown if they don't match.

The necessary interpreter support appears to be small. Extend the PyFrame structure by two fields:

- interpreter ID (addr of some local variable would do)
- stack pointer at current instruction.

Change the CALL_FUNCTION opcode to avoid calling eval recursively in the case of a Python function/method: instead, keep the current frame, build the new one, and start over. RETURN will pop a frame and reload its local variables instead of returning, as long as there is a frame to pop. I'm unclear how exceptions should be handled. Are they currently propagated up across different C calls other than ceval2 recursions?

ciao - chris

--
Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home

From jeremy at cnri.reston.va.us Tue May 18 17:05:39 1999
From: jeremy at cnri.reston.va.us (Jeremy Hylton)
Date: Tue, 18 May 1999 11:05:39 -0400 (EDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <14145.927.588572.113256@seattle.nightmare.com>
References: <60057226@toto.iv> <14145.927.588572.113256@seattle.nightmare.com>
Message-ID: <14145.33150.767551.472591@bitdiddle.cnri.reston.va.us>

>>>>> "SR" == rushing writes:

SR> Somewhere I have an example of coroutines being used for
SR> parsing, very elegant. Something like one coroutine does
SR> lexing, and passes tokens one-by-one to the next level, which
SR> passes parsed expressions to a compiler, or whatever. Kinda
SR> like pipes.

This is the first example that's used in Structured Programming (Dahl, Dijkstra, and Hoare). I'd be happy to loan a copy to any of the Python-dev people who sit nearby.

Jeremy

From tismer at appliedbiometrics.com Tue May 18 17:31:11 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Tue, 18 May 1999 17:31:11 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000301bea0e9$4fd473a0$829e2299@tim>
Message-ID: <374187BF.36CC65E7@appliedbiometrics.com>

Tim Peters wrote:
>
> [Christian Tismer]
> > ...
> > Yup. With a little counting, it was easy to survive:
> >
> > def main():
> >     global a
> >     a=2
> >     thing (5)
> >     a=a-1
> >     if a:
> >         saved.throw (0)
>
> Did "a" really need to be global here? I hope you see the same behavior
> without the "global a"; e.g., this Scheme:

Actually, I inserted the "global" later. It worked as well with a local variable, but I didn't understand it. Still don't :-)

> Or does brute-force frame-copying cause the continuation to set "a" back to
> 2 each time?

No, it doesn't. Behavior is exactly the same with or without global. I'm not sure whether this is a bug or a feature. I *think* 'a' as a local has a slot in the frame, so it's actually a different 'a' living in both copies. But this would not have worked. Can it be that before a function call, the interpreter turns its locals into a dict, using fast_to_locals? That would explain it.

This is not what I think it should be! Locals need to be copied.
> > and needs a much better interface.
>
> Ya, like screw 'em and use threads .

Never liked threads. These fibers are so neat, since they need no threads and no locking, and they are available on systems without threads.

> > But finally I'm quite happy that it worked so smoothly
> > after just a couple of hours (well, about six :)
>
> Yup! Playing with Python internals is a treat.
>
> to-be-continued-ly y'rs - tim

throw(42) - chris

--
Christian Tismer :^)
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home

From skip at mojam.com Tue May 18 17:49:42 1999
From: skip at mojam.com (Skip Montanaro)
Date: Tue, 18 May 1999 11:49:42 -0400
Subject: [Python-Dev] Is there another way to solve the continuation problem?
Message-ID: <199905181549.LAA03206@cm-29-94-2.nycap.rr.com>

Okay, from my feeble understanding of the problem, it appears that coroutines/continuations and threads are going to be problematic at best for Sam's needs. Are there other "solutions"?

We know about state machines. They have the problem that the number of states grows exponentially (?) as the number of state variables increases.

Can exceptions be coerced into providing the necessary structure without botching up the application too badly? It seems that at some point where you need to do some I/O, you could raise an exception whose second expression contains the necessary state to get back to where you need to be once the I/O is ready to go. The controller that catches the exceptions would use select or poll to prepare for the I/O, then dispatch back to the handlers using the information from the exceptions.
class IOSetup:
    pass

class WaveHands:
    """maintains exception raise info and selects one to go to next"""
    def choose_one(r,w,e):
        pass
    def remember(info):
        pass

def controller(...):
    waiters = WaveHands()
    while 1:
        r, w, e = select([...], [...], [...])
        # using r,w,e, select a waiter to call
        func, place = waiters.choose_one(r,w,e)
        try:
            func(place)
        except IOSetup, info:
            waiters.remember(info)

def spam_func(place):
    if place == "spam":
        # whatever I/O we needed to do is ready to go
        bytes = read(some_fd)
        process(bytes)
        # need to read some more from some_fd.  args are:
        # function, target, fd category (r, w), selectable object
        raise IOSetup, (spam_func, "eggs", "r", some_fd)
    elif place == "eggs":
        # that next chunk is ready - get it and proceed...
    elif yadda, yadda, yadda...

One thread, some craftiness needed to construct things. Seems like it might isolate some of the statefulness to smaller functional units than a pure state machine. Clearly not as clean as continuations would be. Totally bogus? Totally inadequate? Maybe Sam already does things this way?

Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/
skip at mojam.com | Musi-Cal: http://www.musi-cal.com/
518-372-5583

From tismer at appliedbiometrics.com Tue May 18 19:23:08 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Tue, 18 May 1999 19:23:08 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000301bea0e9$4fd473a0$829e2299@tim>
Message-ID: <3741A1FC.E84DC926@appliedbiometrics.com>

Tim Peters wrote:
>
> [Christian Tismer]
> > ...
> > Yup. With a little counting, it was easy to survive:
> >
> > def main():
> >     global a
> >     a=2
> >     thing (5)
> >     a=a-1
> >     if a:
> >         saved.throw (0)
>
> Did "a" really need to be global here? I hope you see the same behavior
> without the "global a"; e.g., this Scheme:

Actually, the frame-copying was not enough to make this all behave correctly.
Since I didn't change the interpreter, the ceval.c incarnations still had references to the old frames. The only effect which I achieved with frame copying was that the refcounts were increased correctly.

I have to remove the hardware stack copying now. Will try to create a non-recursive version of the interpreter.

ciao - chris

--
Christian Tismer :^)
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home

From MHammond at skippinet.com.au Wed May 19 01:16:54 1999
From: MHammond at skippinet.com.au (Mark Hammond)
Date: Wed, 19 May 1999 09:16:54 +1000
Subject: [Python-Dev] Is there another way to solve the continuation problem?
In-Reply-To: <199905181549.LAA03206@cm-29-94-2.nycap.rr.com>
Message-ID: <006d01bea184$869f1480$0801a8c0@bobcat>

> Sam's needs. Are there other "solutions"? We know about
> state machines.
> They have the problem that the number of states grows
> exponentially (?) as
> the number of state variables increases.

Well, I can give you my feeble understanding of "IO Completion Ports", the technique Win32 provides to "solve" this problem. My experience is limited to how we used these in a server product designed to maintain thousands of long-term client connections, each spooling large chunks of data (MSOffice docs - yes, that large :-). We too could obviously not afford a thread per connection. Searching through NT's documentation, completion ports are the technique they recommend for high-performance IO, and they appear to deliver.

NT has the concept of a completion port, which in many ways is like an "inverted semaphore". You create a completion port with a "max number of threads" value.
Then, for every IO object you need to use (files, sockets, pipes etc.) you "attach" it to the completion port, along with an integer key. This key is (presumably) unique to the file, and usually a pointer to some structure maintaining the state of the file (ie, the connection).

The general programming model is that you have a small number of threads (possibly 1), and a large number of io objects (eg, files). Each of these threads is executing a state machine. When IO is "ready" for a particular file, one of the available threads is woken and passed the "key" associated with the file. This key identifies the file, and more importantly the state of that file. The thread uses the state to perform the next IO operation, then immediately goes back to sleep. When that IO operation completes, some other thread is woken to handle that state change.

What makes this work, of course, is that _all_ IO is asynch - not a single IO call in this whole model can afford to block. NT provides asynch IO natively. This sounds very similar to what Medusa does internally, although the NT model provides a "thread pooling" scheme built-in. Although our server performed very well with a single thread and hundreds of high-volume connections, we chose to run with a default of 5 threads here.

For those still interested, our project has the multi-threaded state machine I described above implemented in C. Most of that code is responsible for spooling the client request data (possibly 100s of KBs) before handing that data off to the real server. When the C code transitions the client through the state of "send/get from the real server", we actually post to a different completion port. This other completion port wakes a thread written in Python. So our architecture consists of a C-implemented thread pool managing client connections, and a different Python-implemented thread pool that does the real work for each of these client connections.
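The completion-port *model* Mark describes (workers sleeping on one port, each posted event carrying a "key" that identifies a connection and its state) can be sketched portably with a queue and a thread pool; this is an emulation for illustration, not the Win32 API, and all names are made up:

```python
import queue
import threading

# A tiny, portable sketch of the completion-port model: a pool of
# workers sleeps on one port; each posted event carries a "key"
# identifying the connection plus its current state.
class CompletionPort:
    def __init__(self, nthreads, handler):
        self.events = queue.Queue()      # posted (key, state) events
        self.results = queue.Queue()     # what the handlers produce
        for _ in range(nthreads):
            t = threading.Thread(target=self._worker, args=(handler,))
            t.daemon = True
            t.start()

    def post(self, key, state):
        self.events.put((key, state))

    def _worker(self, handler):
        while True:
            key, state = self.events.get()   # sleep until IO is "ready"
            self.results.put(handler(key, state))

def handler(key, state):
    # one step of the per-connection state machine
    return (key, state + "-done")

port = CompletionPort(nthreads=5, handler=handler)
port.post(1, "read")
port.post(2, "write")
```

Whichever worker happens to be idle picks up each event, exactly as in the NT scheme where any pool thread may service any connection's next state transition.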
(The Python side of the world is bound by the server we are talking to, so Python performance doesn't matter as much - C wouldn't buy enough.)

This means that our state machines are not that complex. Each "thread pool" is managing its own, fairly simple state. NT automatically allows you to associate state with the IO object, and as we have multiple thread pools, each one is simple - the one spooling client data is simple, the one doing the actual server work is simple. If we had to have a single, monolithic state machine managing all aspects of the client spooling _and_ the server work, it would be horrid.

This is all in a shrink-wrapped, relatively cheap "Document Management" product being targeted (successfully, it appears) at huge NT/Exchange based sites. Australia's largest telco is implementing it, and indeed the company has VC from Intel! Lots of support from MS, as it helps compete with Domino. Not bad for a little startup - now they are wondering what to do with this Python-thingy they now have in their product that no one else has ever heard of; but they are planning on keeping it for now :-)

[Funnily, when they started, they didn't think they even _needed_ a server, so I said "I'll just knock up a little one in Python", and we haven't looked back :-]

Mark.

From tim_one at email.msn.com Wed May 19 02:48:00 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 18 May 1999 20:48:00 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <199905180403.AAA04772@eric.cnri.reston.va.us>
Message-ID: <000701bea191$3f4d1a20$2e9e2299@tim>

[GvR]
> ...
> Perhaps it would help to explain what a continuation actually does
> with the run-time environment, instead of giving examples of how to
> use them and what the result is?

Paul Wilson (the GC guy) has a very nice-- but incomplete --intro to Scheme and its implementation:

ftp://ftp.cs.utexas.edu/pub/garbage/cs345/schintro-v14/schintro_toc.html

You can pick up a lot from that fast.
Is Steven (Majewski) on this list? He doped most of this out years ago.

> Here's a start of my own understanding (brief because I'm on a 28.8k
> connection which makes my ordinary typing habits in Emacs very
> painful).
>
> 1. All program state is somehow contained in a single execution stack.
> This includes globals (which are simply name bindings in the bottom
> stack frame).

Better to think of name resolution following lexical links. Lexical closures with indefinite extent are common in Scheme, so much so that name resolution is (at least conceptually) best viewed as distinct from execution stacks.

Here's a key: continuations are entirely about capturing control flow state, and nothing about capturing binding or data state. Indeed, mutating bindings and/or non-local data are the ways distinct invocations of a continuation communicate with each other, and for this reason true functional languages generally don't support continuations of the call/cc flavor.

> It also includes a code pointer for each stack frame indicating where
> the function corresponding to that stack frame is executing (this is
> the return address if there is a newer stack frame, or the current
> instruction for the newest frame).

Yes, although the return address is one piece of information in the current frame's continuation object -- continuations are used internally for "regular calls" too. When a function returns, it passes control thru its continuation object. That process restores-- from the continuation object --what the caller needs to know (in concept: a pointer to *its* continuation object, its PC, its name-resolution chain pointer, and its local eval stack). Another key point: a continuation object is immutable.

> 2. A continuation does something equivalent to making a copy of the
> entire execution stack. This can probably be done lazily. There are
> probably lots of details.
The point of the above is to get across that for Scheme-calling-Scheme, creating a continuation object copies just a small, fixed number of pointers (the current continuation pointer, the current name-resolution chain pointer, the PC), plus the local eval stack. This is for a "stackless" interpreter that heap-allocates name-mapping and execution-frame and continuation objects. Half the literature is devoted to optimizing one or more of those away in special cases (e.g., for continuations provably "up-level", using a stack + setjmp/longjmp instead).

> I also expect that Scheme's semantic model is different than Python
> here -- e.g. does it matter whether deep or shallow copies are made?
> I.e. are there mutable *objects* in Scheme? (I know there are mutable
> and immutable *name bindings* -- I think.)

Same as Python here; Scheme isn't a functional language; it has mutable bindings and mutable objects; any copies needed should be shallow, since it's "a feature" that invoking a continuation doesn't restore bindings or object values (see above re communication).

> 3. Calling a continuation probably makes the saved copy of the
> execution stack the current execution state; I presume there's also a
> way to pass an extra argument.

Right, except "stack" is the wrong mental model in the presence of continuations; it's a general rooted graph (A calls B, B saves a continuation pointing back to A, B goes on to call A, A saves a continuation pointing back to B, etc). If the explicitly saved continuations are never *invoked*, control will eventually pop back to the root of the graph, so in that sense there's *a* stack implicit at any given moment.

> 4. Coroutines (which I *do* understand) are probably done by swapping
> between two (or more) continuations.
>
> 5. Other control constructs can be done by various manipulations of
> continuations.
> I presume that in many situations the saved
> continuation becomes the main control locus permanently, and the
> (previously) current stack is simply garbage-collected. Of course the
> lazy copy makes this efficient.

There's much less copying going on in Scheme-to-Scheme than you might think; other than that, right on.

> If this all is close enough to the truth, I think that continuations
> involving C stack frames are definitely out -- as Tim Peters
> mentioned, you don't know what the stuff on the C stack of extensions
> refers to. (My guess would be that Scheme implementations assume that
> any pointers on the C stack point to Scheme objects, so that C stack
> frames can be copied and conservative GC can be used -- this will
> never happen in Python.)

"Scheme" has become a generic term covering dozens of implementations with varying semantics, and a quick tour of the web suggests that cross-language Schemes generally put severe restrictions on continuations across language boundaries. Most popular seems to be to outlaw them by decree.

> Continuations involving only Python stack frames might be supported,
> if we can agree on the sharing / copying semantics. This is where
> I don't know enough (see questions at #2 above).

I'd like to go back to examples of what they'd be used for -- but fully fleshed out. In the absence of Scheme's ubiquitous lexical closures and "lambdaness" and syntax-extension facilities, I'm unsure they're going to work out reasonably in Python practice; it's not enough that they can be very useful in Scheme, and Sam is highly motivated to go to extremes here.

give-me-a-womb-and-i-still-won't-give-birth-ly y'rs - tim

From tismer at appliedbiometrics.com Wed May 19 03:10:15 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Wed, 19 May 1999 03:10:15 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000701bea191$3f4d1a20$2e9e2299@tim>
Message-ID: <37420F77.48E9940F@appliedbiometrics.com>

Tim Peters wrote:
...
> > Continuations involving only Python stack frames might be supported,
> > if we can agree on the sharing / copying semantics. This is where
> > I don't know enough (see questions at #2 above).
>
> I'd like to go back to examples of what they'd be used for -- but
> fully fleshed out. In the absence of Scheme's ubiquitous lexical closures
> and "lambdaness" and syntax-extension facilities, I'm unsure they're going
> to work out reasonably in Python practice; it's not enough that they can be
> very useful in Scheme, and Sam is highly motivated to go to extremes here.
>
> give-me-a-womb-and-i-still-won't-give-birth-ly y'rs - tim

I've put quite a few hours into a non-recursive ceval.c already. Should I continue? At least this would be a little improvement, even if the continuation thing is never born.

? - chris

--
Christian Tismer :^)
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home

From rushing at nightmare.com Wed May 19 04:52:04 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Tue, 18 May 1999 19:52:04 -0700 (PDT)
Subject: [Python-Dev] Is there another way to solve the continuation problem?
In-Reply-To: <101382377@toto.iv>
Message-ID: <14146.8395.754509.591141@seattle.nightmare.com>

Skip Montanaro writes:

> Can exceptions be coerced into providing the necessary structure
> without botching up the application too badly? Seems that at some
> point where you need to do some I/O, you could raise an exception
> whose second expression contains the necessary state to get back to
> where you need to be once the I/O is ready to go.
> The controller
> that catches the exceptions would use select or poll to prepare for
> the I/O then dispatch back to the handlers using the information
> from exceptions.
> [... code ...]

Well, you just re-invented the 'Reactor' pattern! 8^)

http://www.cs.wustl.edu/~schmidt/patterns-ace.html

> One thread, some craftiness needed to construct things. Seems like
> it might isolate some of the statefulness to smaller functional
> units than a pure state machine. Clearly not as clean as
> continuations would be. Totally bogus? Totally inadequate? Maybe
> Sam already does things this way?

What you just described is what Medusa does (well, actually, 'Python' does it now, because the two core libraries that implement this are now in the library - asyncore.py and asynchat.py). asyncore doesn't really use exceptions exactly that way, and asynchat allows you to add another layer of processing (basically, dividing the input into logical 'lines' or 'records' depending on a 'line terminator').

The same technique is at the heart of many well-known network servers, including INND, BIND, X11, Squid, etc. It's really just a state machine underneath (with Python functions or methods implementing the 'states'), as long as things don't get too complex. Python simplifies things enough to let one 'push the difficulty envelope' a bit further than one could reasonably tolerate in C. For example, Squid implements async HTTP (server and client, because it's a proxy) - but stops short of trying to implement async FTP. Medusa implements async FTP, but it's the largest file in the Medusa distribution, weighing in at a hefty 32KB.

The hard part comes when you want to plug different pieces and protocols together. For example, building a simple HTTP or FTP server is relatively easy, but building an HTTP server *that proxies to an FTP server* is much more difficult. I've done these kinds of things, viewing each as a challenge; but past a certain point it boggles.
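The Reactor pattern just mentioned can be stripped to a very small skeleton: one select() loop dispatching "readable" events to per-connection handlers. This is only a sketch of the idea (the class and method names are made up, not asyncore's API):

```python
import select
import socket

# Bare-bones Reactor: one select() loop dispatching readable sockets
# to handler callables registered per connection.
class Reactor:
    def __init__(self):
        self.handlers = {}               # fileno -> (socket, handler)

    def register(self, sock, handler):
        self.handlers[sock.fileno()] = (sock, handler)

    def poll_once(self, timeout=0.1):
        socks = [s for s, _ in self.handlers.values()]
        readable, _, _ = select.select(socks, [], [], timeout)
        for sock in readable:
            _, handler = self.handlers[sock.fileno()]
            handler(sock)                # one state-machine step

# Demo: push one message through a socketpair via the loop.
a, b = socket.socketpair()
reactor = Reactor()
received = []
reactor.register(b, lambda s: received.append(s.recv(1024)))
a.sendall(b"ping")
reactor.poll_once()
```

Each handler call is one step of the connection's state machine; a real server would have the handler re-register a different callable to move to the next state.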
The paper I posted about earlier by Matthew Fuchs has a really good explanation of this, but in the context of GUI event loops... I think it ties in neatly with this discussion because at the heart of any X11 app is a little guy manipulating a file descriptor.

-Sam

From tim_one at email.msn.com Wed May 19 07:41:39 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Wed, 19 May 1999 01:41:39 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <14144.61765.308962.101884@seattle.nightmare.com>
Message-ID: <000b01bea1ba$443a1a00$2e9e2299@tim>

[Sam]
> ...
> Except that since the escape procedure is 'first-class' it can be
> stored away and invoked (and reinvoked) later. [that's all that
> 'first-class' means: a thing that can be stored in a variable,
> returned from a function, used as an argument, etc..]
>
> I've never seen a let/cc that wasn't full-blown, but it wouldn't
> surprise me.

The let/cc's in question were specifically defined to create continuations valid only during let/cc's dynamic extent, so that, sure, you could store them away, but trying to invoke one later could be an error. It's in that sense I meant they weren't "first class". Other flavors of Scheme appear to call this concept a "weak continuation", and use a different verb to invoke it (like call-with-escaping-continuation, or call/ec). I suspect the let/cc oddballs I found were simply confused implementations (there are a lot of amateur Scheme implementations out there!).

>> Would full-blown coroutines be powerful enough for your needs?

> Yes, I think they would be. But I think with Python it's going to
> be just about as hard, either way.

Most people on this list are comfortable with coroutines already because they already understand them -- Jeremy can even reach across the hall and hand Guido a helpful book . So pondering coroutines increases the number of brain cells willing to think about the implementation.
continuation-examples-leave-people-still-going-"huh?"-after-an-hour-of-explanation-ly y'rs - tim

From tim_one at email.msn.com Wed May 19 07:41:45 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Wed, 19 May 1999 01:41:45 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <3741A1FC.E84DC926@appliedbiometrics.com>
Message-ID: <000e01bea1ba$47fe7500$2e9e2299@tim>

[Christian Tismer]
>>> ...
>>> Yup. With a little counting, it was easy to survive:
>>>
>>> def main():
>>>     global a
>>>     a=2
>>>     thing (5)
>>>     a=a-1
>>>     if a:
>>>         saved.throw (0)

[Tim]
>> Did "a" really need to be global here? I hope you see the same behavior
>> without the "global a"; [which he does, but for mysterious reasons]

[Christian]
> Actually, the frame-copying was not enough to make this
> all behave correctly. Since I didn't change the interpreter,
> the ceval.c incarnations still had references to the old frames.
> The only effect which I achieved with frame copying was
> that the refcounts were increased correctly.

All right! Now you're closer to the real solution ; i.e., copying wasn't really needed here, but keeping stuff alive was. In Scheme terms, when we entered main originally a set of bindings was created for its locals, and it is that very same set of bindings to which the continuation returns. So the continuation *should* reuse them -- making a copy of the locals is semantically hosed.

This is clearer in Scheme because its "stack" holds *only* control-flow info (bindings follow a chain of static links, independent of the current "call stack"), so there's no temptation to run off copying bindings too.

elegant-and-baffling-for-the-price-of-one-ly y'rs - tim

From tim_one at email.msn.com Wed May 19 07:41:56 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Wed, 19 May 1999 01:41:56 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <37420F77.48E9940F@appliedbiometrics.com>
Message-ID: <001301bea1ba$4eb498c0$2e9e2299@tim>

[Christian Tismer]
> I've put quite many hours into a non-recursive ceval.c
> already.

Does that mean 6 or 600 ?

> Should I continue? At least this would be a little improvement, also
> if the continuation thing will not be born. ?

Guido wanted to move in the "flat interpreter" direction for Python2 anyway, so my belief is it's worth pursuing.

but-then-i-flipped-a-coin-with-two-heads-ly y'rs - tim

From arw at ifu.net Wed May 19 15:04:53 1999
From: arw at ifu.net (Aaron Watters)
Date: Wed, 19 May 1999 09:04:53 -0400
Subject: [Python-Dev] continuations and C extensions?
Message-ID: <3742B6F5.C6CB7313@ifu.net>

the immutable GvR intones:
> Continuations involving only Python stack frames might be supported,
> if we can agree on the sharing / copying semantics. This is where
> I don't know enough (see questions at #2 above).

What if there are native C calls mixed in (eg, list.sort calls back to myclass.__cmp__, which decides to do a call/cc)?

One of the really big advantages of Python in my book is the relative simplicity of embedding and extensions, and this is generally one of the failings of lisp implementations. I understand lots of scheme implementations purport to be extendible and embeddable, but in practice you can't do it with *existing* code -- there is always a show stopper involving having to change the way some Oracle library which you don't have the source for does memory management or something... I've known several grad students who have been bitten by this...

I think having to unroll the C stack safely might be one problem area. With, eg, a netscape nsapi embedding you can actually get into netscape code calls my code calls netscape code calls my code... suspends in a continuation? How would that work? [my ignorance is torment!]

Threading and extensions are probably also problematic, but at least that's better understood, I think.

Just kvetching.
Sorry.

-- Aaron Watters

ps: Of course there are valid reasons and excellent advantages to having continuations, but it's also interesting to consider the possible cost. There ain't no free lunch.

From tismer at appliedbiometrics.com Wed May 19 21:30:18 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Wed, 19 May 1999 21:30:18 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000e01bea1ba$47fe7500$2e9e2299@tim>
Message-ID: <3743114A.220FFA0B@appliedbiometrics.com>

Tim Peters wrote:
...
> [Christian]
> > Actually, the frame-copying was not enough to make this
> > all behave correctly. Since I didn't change the interpreter,
> > the ceval.c incarnations still had references to the old frames.
> > The only effect which I achieved with frame copying was
> > that the refcounts were increased correctly.
>
> All right! Now you're closer to the real solution ; i.e., copying
> wasn't really needed here, but keeping stuff alive was. In Scheme terms,
> when we entered main originally a set of bindings was created for its
> locals, and it is that very same set of bindings to which the continuation
> returns. So the continuation *should* reuse them -- making a copy of the
> locals is semantically hosed.

I tried the most simple thing, and this seemed to be duplicating the current state of the machine. The frame holds the stack, and references to all objects. By chance, the locals are not in a dict, but unpacked into the frame. (Sometimes I agree with Guido, that optimization is considered harmful :-)

> This is clearer in Scheme because its "stack" holds *only* control-flow info
> (bindings follow a chain of static links, independent of the current "call
> stack"), so there's no temptation to run off copying bindings too.

The Python stack, besides its intermingledness with the machine stack, is basically its chain of frames. The value stack pointer still hides in the machine stack, but that's easy to change.
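That chain of frames is visible from Python itself; a small sketch using CPython's sys._getframe introspection hook (the function names here are made up) walks the f_back links exactly as described:

```python
import sys

# Walk the interpreter's frame chain via f_back links: this linked
# list of frame objects *is* the Python-level "stack" being discussed.
def frame_chain():
    names = []
    f = sys._getframe()          # the frame executing frame_chain()
    while f is not None:
        names.append(f.f_code.co_name)
        f = f.f_back             # follow the chain toward the bottom frame
    return names

def outer():
    return inner()

def inner():
    return frame_chain()
```

Calling `outer()` yields the chain from the innermost frame outward, starting `['frame_chain', 'inner', 'outer', ...]` down to the bottom frame.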
So the real Scheme-like part is this chain, methinks, with the current bytecode offset and value stack info.

Making a copy of this in a restartable way means increasing the refcount of all objects in a frame. Would it be correct to undo the effect of fast locals before splitting, and to redo it on activation? Or do I need to rethink the whole structure? What would be natural for Python, if at all?

ciao - chris

--
Christian Tismer :^)
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home

From jeremy at cnri.reston.va.us Wed May 19 21:46:49 1999
From: jeremy at cnri.reston.va.us (Jeremy Hylton)
Date: Wed, 19 May 1999 15:46:49 -0400 (EDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <3743114A.220FFA0B@appliedbiometrics.com>
References: <000e01bea1ba$47fe7500$2e9e2299@tim> <3743114A.220FFA0B@appliedbiometrics.com>
Message-ID: <14147.4976.608139.212336@bitdiddle.cnri.reston.va.us>

>>>>> "CT" == Christian Tismer writes:

[Tim Peters]
>> This is clearer in Scheme because its "stack" holds *only*
>> control-flow info (bindings follow a chain of static links,
>> independent of the current "call stack"), so there's no
>> temptation to run off copying bindings too.

CT> The Python stack, besides its intermingledness with the machine
CT> stack, is basically its chain of frames. The value stack pointer
CT> still hides in the machine stack, but that's easy to change. So
CT> the real Scheme-like part is this chain, methinks, with the
CT> current bytecode offset and value stack info.

CT> Making a copy of this in a restartable way means to increase the
CT> refcount of all objects in a frame. Would it be correct to undo
CT> the effect of fast locals before splitting, and redoing it on
CT> activation?
Wouldn't it be easier to increase the refcount on the frame object? Then you wouldn't need to worry about the refcounts on all the objects in the frame, because they would only be decrefed when the frame is deallocated.

It seems like the two other things you would need are some way to get a copy of the current frame and a means to invoke eval_code2 with an already existing stack frame instead of a new one.

(This sounds too simple, so it's obviously wrong. I'm just not sure where. Is the problem that you really need a separate stack/graph to hold the frames? If we leave them on the Python stack, it could be hard to dis-entangle value objects from control objects.)

Jeremy

From tismer at appliedbiometrics.com Wed May 19 22:10:16 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Wed, 19 May 1999 22:10:16 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000e01bea1ba$47fe7500$2e9e2299@tim> <3743114A.220FFA0B@appliedbiometrics.com> <14147.4976.608139.212336@bitdiddle.cnri.reston.va.us>
Message-ID: <37431AA8.BC77C615@appliedbiometrics.com>

Jeremy Hylton wrote:

[TP+CT about frame copies et al]

> Wouldn't it be easier to increase the refcount on the frame object?
> Then you wouldn't need to worry about the refcounts on all the objects
> in the frame, because they would only be decrefed when the frame is
> deallocated.

Well, the frame is supposed to be run twice, since there are two incarnations of interpreters working on it: the original one, and later, when it is thrown, another one (or the same one, in principle). The frame could have been in any state, with a couple of objects on the stack. My splitting function can be invoked in some nested context, so I have a current opcode position and a current stack position.

Running this once leaves the stack empty, since all the objects are decrefed. Running this a second time gives a GPF, since the stack is empty.
Therefore, I made a copy which means to create a duplicate frame with an extra refcount for all the objects. This makes sure that both can be restarted at any time. > It seems like the two other things you would need are some way to get > a copy of the current frame and a means to invoke eval_code2 with an > already existing stack frame instead of a new one. Well, that's exactly what I'm working on. > (This sounds too simple, so it's obviously wrong. I'm just not sure > where. Is the problem that you really need a separate stack/graph to > hold the frames? If we leave them on the Python stack, it could be > hard to disentangle value objects from control objects.) Oh, perhaps I should explain it a bit more clearly? What did you mean by the Python stack? The hardware machine stack? What do we have at the moment: The stack is the linked list of frames. Every frame has a local Python evaluation stack. Calls to Python functions produce a new frame, and the old one is put beneath. This is the control stack. The additional info on the hardware stack happens to be a parallel friend of this chain, and currently holds extra info, but this is an artifact. Adding the current Python stack level to the frame makes the hardware stack totally unnecessary. There is a possible speed loss, though. Today, the recursive call of ceval2 is optimized and quite fast. The non-recursive version will have to copy variables in and out from the frames, instead, so there is of course a little speed penalty to pay. ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break!
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From tismer at appliedbiometrics.com Wed May 19 23:38:07 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Wed, 19 May 1999 23:38:07 +0200 Subject: [Python-Dev] 'stackless' python? References: <001301bea1ba$4eb498c0$2e9e2299@tim> Message-ID: <37432F3F.2694DA0E@appliedbiometrics.com> Tim Peters wrote: > > [Christian Tismer] > > I've put quite many hours into a non-recursive ceval.c > > already. > > Does that mean 6 or 600 ? 6, or 10, or 20, if I count the time from the first start with Sam's code, maybe. > > > Should I continue? At least this would be a little improvement, also > > if the continuation thing will not be born. ? > > Guido wanted to move in the "flat interpreter" direction for Python2 anyway, > so my belief is it's worth pursuing. > > but-then-i-flipped-a-coin-with-two-heads-ly y'rs - tim Right. Whose faces? :-) On the stackless thing, what should I do? I started to insert minimal patches, but it turns out that I have to change frames a little (extending). I can make quite small changes to the interpreter to replace the recursive calls, but this involves extra flags in some cases, where the interpreter is called the first time and so on. Which has a better chance of being included in a future Python: Tweaking the current thing only minimally, to make it as similar as possible to the former? Or doing as much redesign as I think is needed to do it in a clean way? This would mean to split eval_code2 into two functions, where one is the interpreter kernel, and one is the frame manager. There are also other places which do quite deep function calls and finally call eval_code2. I think these should return a frame object now.
I could convince them to call or return frame, depending on a flag, but it would be clean to rename the functions, let them always deal with frames, and put the original function on top of it. In short, I can make larger changes which clean this all up a bit, or I can make small changes which are more tricky to grasp, but give just small diffs. How best to touch untouchable code? :-) -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From jeremy at cnri.reston.va.us Wed May 19 23:49:38 1999 From: jeremy at cnri.reston.va.us (Jeremy Hylton) Date: Wed, 19 May 1999 17:49:38 -0400 (EDT) Subject: [Python-Dev] 'stackless' python? In-Reply-To: <37432F3F.2694DA0E@appliedbiometrics.com> References: <001301bea1ba$4eb498c0$2e9e2299@tim> <37432F3F.2694DA0E@appliedbiometrics.com> Message-ID: <14147.12613.88669.456608@bitdiddle.cnri.reston.va.us> I think it makes sense to avoid being obscure or unclear just to minimize the size of the patch or the diff. Realistically, it's unlikely that anything like your original patch is going to make it into the CVS tree. Its primary value is as proof of concept and as code that the rest of us can try out. If you make large changes, but they are clearer, you'll help us out a lot. We can worry about minimizing the impact of the changes on the codebase later, after everyone has figured out what's going on and agrees that it's worth doing. feeling-much-more-confident-because-I-didn't-say-continuation-ly yr's, Jeremy From tismer at appliedbiometrics.com Thu May 20 00:25:20 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Thu, 20 May 1999 00:25:20 +0200 Subject: [Python-Dev] 'stackless' python?
References: <001301bea1ba$4eb498c0$2e9e2299@tim> <37432F3F.2694DA0E@appliedbiometrics.com> <14147.12613.88669.456608@bitdiddle.cnri.reston.va.us> Message-ID: <37433A50.31E66CB1@appliedbiometrics.com> Jeremy Hylton wrote: > > I think it makes sense to avoid being obscure or unclear just to > minimize the size of the patch or the diff. Realistically, it's > unlikely that anything like your original patch is going to make it > into the CVS tree. Its primary value is as proof of concept and as > code that the rest of us can try out. If you make large changes, but > they are clearer, you'll help us out a lot. Many many thanks. This is good advice. I will make absolutely clear what's going on, keep parts as untouched as possible, cut out parts which must change, and I will not look into speed too much. Better to have one function call more and a bit less optimization, but a clear and rock-solid introduction of a concept. > We can worry about minimizing the impact of the changes on the > codebase later, after everyone has figured out what's going on and > agrees that it's worth doing. > > feeling-much-more-confident-because-I-didn't-say-continuation-ly yr's, > Jeremy Hihi - the new little slot with local variables of the interpreter happens to have the name "continuation". Maybe I'd better rename it to "activation record"? Now, there is no longer a recursive call. Instead, a frame object is returned, which is waiting to be activated by a dispatcher. Some more ideas are popping up. Right now, only the recursive calls can vanish. Callbacks from C code which is called by the interpreter which is called by... are still a problem. But they might perhaps vanish completely. We have to see how much the cost is. But what if I can manage to let the interpreter duck and cover on every call to a builtin, too? The interpreter again returns to the dispatcher, which then calls the builtin. Well, if that builtin happens to call into the interpreter again, it will be a dispatcher again.
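The dispatcher idea can be sketched in a few lines of Python (every name below is invented for illustration; the real interpreter is eval_code2, nothing like this toy): the "interpreter" never calls itself recursively — each step over a frame either finishes it or hands the dispatcher another frame to run first, so the C-level stack never deepens.

```python
class Frame:
    # a toy frame: just an argument and a back-link to the caller
    def __init__(self, n, back=None):
        self.n, self.back = n, back

def step(frame, retval):
    # one run of the "interpreter" over a frame (computing factorial):
    # instead of calling itself, it either finishes, or returns a new
    # frame for the dispatcher to run and come back with its result
    if retval is None:
        if frame.n <= 1:
            return 'done', 1
        return 'call', Frame(frame.n - 1, back=frame)
    return 'done', frame.n * retval

def dispatch(frame):
    retval = None
    while frame is not None:
        action, value = step(frame, retval)
        if action == 'call':
            frame, retval = value, None        # "push": run the callee
        else:
            frame, retval = frame.back, value  # "pop": resume the caller
    return retval

assert dispatch(Frame(10)) == 3628800  # no C-level recursion anywhere
```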
The machine stack grows a little, but since everything is saved in the frames, these stacks are no longer related. This means, the principle works with existing extension modules, since interpreter-world and C-stack world are decoupled. To avoid stack growth, of course a number of builtins would be better changed, but it is no must in the first place. execfile for instance is a candidate which needn't call the interpreter. It could equally parse the file, generate the code object, build a frame and just return it. This is what the dispatcher likes: returned frames are put on the chain and fired. waah, my bus - running - ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From tim_one at email.msn.com Thu May 20 01:56:33 1999 From: tim_one at email.msn.com (Tim Peters) Date: Wed, 19 May 1999 19:56:33 -0400 Subject: [Python-Dev] A "real" continuation example In-Reply-To: <3743114A.220FFA0B@appliedbiometrics.com> Message-ID: <000701bea253$3a182a00$179e2299@tim> I'm home sick today, so tortured myself <0.9 wink>. Sam mentioned using coroutines to compare the fringes of two trees, and I picked a simpler problem: given a nested list structure, generate the leaf elements one at a time, in left-to-right order. A solution to Sam's problem can be built on that, by getting a generator for each tree and comparing the leaves a pair at a time until there's a difference. Attached are solutions in Icon, Python and Scheme. I have the least experience with Scheme, but browsing around didn't find a better Scheme approach than this. 
The Python solution is the least satisfactory, using an explicit stack to simulate recursion by hand; if you didn't know the routine's purpose in advance, you'd have a hard time guessing it. The Icon solution is very short and simple, and I'd guess obvious to an average Icon programmer. It uses the subset of Icon ("generators") that doesn't require any C-stack trickery. However, alone of the three, it doesn't create a function that could be explicitly called from several locations to produce "the next" result; Icon's generators are tied into Icon's unique control structures to work their magic, and breaking that connection requires moving to full-blown Icon coroutines. It doesn't need to be that way, though. The Scheme solution was the hardest to write, but is a largely mechanical transformation of a recursive fringe-lister that constructs the entire fringe in one shot. Continuations are used twice: to enable the recursive routine to resume itself where it left off, and to get each leaf value back to the caller. Getting that to work required rebinding non-local identifiers in delicate ways. I doubt the intent would be clear to an average Scheme programmer. So what would this look like in Continuation Python? Note that each place the Scheme says "lambda" or "letrec", it's creating a new lexical scope, and up-level references are very common. Two functions are defined at top level, but seven more at various levels of nesting; the latter can't be pulled up to the top because they refer to vrbls local to the top-level functions. Another (at least initially) discouraging thing to note is that Scheme schemes for hiding the pain of raw call/cc often use Scheme's macro facilities. 
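For contrast with all three solutions, here is what the traversal looks like given an Icon-style generator facility in Python itself. The `yield` spelling below is pure conjecture about what such a feature might look like, not anything the Python under discussion has; the point is that the recursive shape survives intact, which is exactly what the explicit-stack version has to simulate by hand.

```python
def fringe(node):
    # walk a nested list, suspending at each leaf
    if type(node) == type([]):
        for item in node:
            for leaf in fringe(item):
                yield leaf
    else:
        yield node

testcase = [[1, [[2, 3]]], [4], [], [[[5]], 6]]
assert list(fringe(testcase)) == [1, 2, 3, 4, 5, 6]
```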
may-not-be-as-fun-as-it-sounds-ly y'rs - tim

Here's the Icon:

procedure main()
    x := [[1, [[2, 3]]], [4], [], [[[5]], 6]]
    every writes(fringe(x), " ")
    write()
end

procedure fringe(node)
    if type(node) == "list" then
        suspend fringe(!node)
    else
        suspend node
end

Here's the Python:

from types import ListType

class Fringe:
    def __init__(self, value):
        self.stack = [(value, 0)]

    def __getitem__(self, ignored):
        while 1:
            # find topmost pending list with something to do
            while 1:
                if not self.stack:
                    raise IndexError
                v, i = self.stack[-1]
                if i < len(v):
                    break
                self.stack.pop()
            this = v[i]
            self.stack[-1] = (v, i+1)
            if type(this) is ListType:
                self.stack.append((this, 0))
            else:
                break
        return this

testcase = [[1, [[2, 3]]], [4], [], [[[5]], 6]]
for x in Fringe(testcase):
    print x,
print

Here's the Scheme:

(define list->generator
  ; Takes a list as argument.
  ; Returns a generator g such that each call to g returns
  ; the next element in the list's symmetric-order fringe.
  (lambda (x)
    (letrec {(produce-value #f) ; set to return-to continuation
             (looper
              (lambda (x)
                (cond ((null? x) 'nada) ; ignore null
                      ((list? x) (looper (car x)) (looper (cdr x)))
                      (else
                       ; want to produce this non-list fringe elt,
                       ; and also resume here
                       (call/cc
                        (lambda (here)
                          (set! getnext (lambda () (here 'keep-going)))
                          (produce-value x)))))))
             (getnext
              (lambda ()
                (looper x)
                ; have to signal end of sequence somehow;
                ; assume false isn't a legitimate fringe elt
                (produce-value #f)))}
      ; return niladic function that returns next value
      (lambda ()
        (call/cc
         (lambda (k)
           (set! produce-value k)
           (getnext)))))))

(define display-fringe
  (lambda (x)
    (letrec ((g (list->generator x))
             (thiselt #f)
             (looper
              (lambda ()
                (set! thiselt (g))
                (if thiselt
                    (begin (display thiselt) (display " ") (looper))))))
      (looper))))

(define test-case '((1 ((2 3))) (4) () (((5)) 6)))

(display-fringe test-case)

From MHammond at skippinet.com.au Thu May 20 02:14:24 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Thu, 20 May 1999 10:14:24 +1000 Subject: [Python-Dev] Interactive Debugging of Python Message-ID: <008b01bea255$b80cf790$0801a8c0@bobcat> All this talk about stack frames and manipulating them at runtime has reminded me of one of my biggest gripes about Python. When I say "biggest gripe", I really mean "biggest surprise" or "biggest shame". That is, Python is very interactive and dynamic. However, when I am debugging Python, it seems to lose this. There is no way for me to effectively change a running program. Now with VC6, I can do this with C. Although it is slow and a little dumb, I can change the C side of my Python world while my program is running, but not the Python side of the world. I'm wondering how feasible it would be to change Python code _while_ running under the debugger. Presumably this would require a way of recompiling the current block of code, patching this code back into the object, and somehow tricking the stack frame to use this new block of code; even if a first-cut had to restart the block or somesuch... Any thoughts on this? Mark. From tim_one at email.msn.com Thu May 20 04:41:03 1999 From: tim_one at email.msn.com (Tim Peters) Date: Wed, 19 May 1999 22:41:03 -0400 Subject: [Python-Dev] 'stackless' python? In-Reply-To: <3743114A.220FFA0B@appliedbiometrics.com> Message-ID: <000901bea26a$34526240$179e2299@tim> [Christian Tismer] > I tried the most simple thing, and this seemed to be duplicating > the current state of the machine. The frame holds the stack, > and references to all objects. > By chance, the locals are not in a dict, but unpacked into > the frame.
(Sometimes I agree with Guido, that optimization > is considered harmful :-) I don't see that the locals are a problem here -- provided you simply leave them alone . > The Python stack, besides its intermingledness with the machine > stack, is basically its chain of frames. Right. > The value stack pointer still hides in the machine stack, but > that's easy to change. I'm not sure what "value stack" means here, or "machine stack". The latter means the C stack? Then I don't know which values you have in mind that are hiding in it (the locals are, as you say, unpacked in the frame, and the evaluation stack too). By "evaluation stack" I mean specifically f->f_valuestack; the current *top* of stack pointer (specifically stack_pointer) lives in the C stack -- is that what we're talking about? Whichever, when we're talking about the code, let's use the names the code uses . > So the real Scheme-like part is this chain, methinks, with > the current bytecode offset and value stack info. Curiously, f->f_lasti is already materialized every time we make a call, in order to support tracing. So if capturing a continuation is done via a function call (hard to see any other way it could be done ), a bytecode offset is already getting saved in the frame object. > Making a copy of this in a restartable way means to increase > the refcount of all objects in a frame. You later had a vision of splitting the frame into two objects -- I think. Whichever part the locals live in should not be copied at all, but merely have its (single) refcount increased. The other part hinges on details of your approach I don't know. The nastiest part seems to be f->f_valuestack, which conceptually needs to be (shallow) copied in the current frame and in all other frames reachable from the current frame's continuation (the chain rooted at f->f_back today); that's the sum total (along with the same frames' bytecode offsets) of capturing the control flow state. 
> Would it be correct to undo the effect of fast locals before > splitting, and redoing it on activation? Unsure what splitting means, but in any case I can't conceive of a reason for doing anything to the locals. Their values aren't *supposed* to get restored upon continuation invocation, so there's no reason to do anything with their values upon continuation creation either. Right? Or are we talking about different things? almost-as-good-as-pantomime-ly y'rs - tim From rushing at nightmare.com Thu May 20 06:04:20 1999 From: rushing at nightmare.com (rushing at nightmare.com) Date: Wed, 19 May 1999 21:04:20 -0700 (PDT) Subject: [Python-Dev] A "real" continuation example In-Reply-To: <50692631@toto.iv> Message-ID: <14147.34175.950743.79464@seattle.nightmare.com> Tim Peters writes: > The Scheme solution was the hardest to write, but is a largely > mechanical transformation of a recursive fringe-lister that > constructs the entire fringe in one shot. Continuations are used > twice: to enable the recursive routine to resume itself where it > left off, and to get each leaf value back to the caller. Getting > that to work required rebinding non-local identifiers in delicate > ways. I doubt the intent would be clear to an average Scheme > programmer. It's the only way to do it - every example I've seen of using call/cc looks just like it. I reworked your Scheme a bit. IMHO letrec is for compilers, not for people. The following should be equivalent:

(define (list->generator x)
  (let ((produce-value #f))
    (define (looper x)
      (cond ((null? x) 'nada)
            ((list? x) (looper (car x)) (looper (cdr x)))
            (else
             (call/cc
              (lambda (here)
                (set! getnext (lambda () (here 'keep-going)))
                (produce-value x))))))
    (define (getnext)
      (looper x)
      (produce-value #f))
    (lambda ()
      (call/cc
       (lambda (k)
         (set! produce-value k)
         (getnext))))))

(define (display-fringe x)
  (let ((g (list->generator x)))
    (let loop ((elt (g)))
      (if elt
          (begin (display elt) (display " ") (loop (g)))))))

(define test-case '((1 ((2 3))) (4) () (((5)) 6)))

(display-fringe test-case)

> So what would this look like in Continuation Python? Here's my first hack at it. Most likely wrong. It is REALLY HARD to do this without having the feature to play with. This presumes a function "call_cc" that behaves like Scheme's. I believe the extra level of indirection is necessary. (i.e., call_cc takes a function as an argument that takes a continuation function)

class list_generator:
    def __init__ (self, x):
        self.x = x
        self.k_suspend = None
        self.k_produce = None

    def walk (self, x):
        if type(x) == type([]):
            for item in x:
                self.walk (item)
        else:
            self.item = x
            # call self.suspend() with a continuation
            # that will continue walking the tree
            call_cc (self.suspend)

    def __call__ (self):
        # call self.resume() with a continuation
        # that will return the next fringe element
        return call_cc (self.resume)

    def resume (self, k_produce):
        self.k_produce = k_produce
        if self.k_suspend:
            # resume the suspended walk
            self.k_suspend (None)
        else:
            self.walk (self.x)

    def suspend (self, k_suspend):
        self.k_suspend = k_suspend
        # return a value for __call__
        self.k_produce (self.item)
> It's the only way to do it - every example I've seen of using call/cc > looks just like it. Same here -- alas <0.5 wink>. > I reworked your Scheme a bit. IMHO letrec is for compilers, not for > people. The following should be equivalent: I confess I stopped paying attention to Scheme after R4RS, and largely because the std decreed that *so* many forms were optional. Your rework is certainly nicer, but internal defines and named let are two that R4RS refused to require, so I always avoided them. BTW, I *am* a compiler, so that never bothered me . >> So what would this look like in Continuation Python? > Here's my first hack at it. Most likely wrong. It is REALLY HARD to > do this without having the feature to play with. Fully understood. It's also really hard to implement the feature without knowing how someone who wants it would like it to behave. But I don't think anyone is getting graded on this, so let's have fun . Ack! I have to sleep. Will study the code in detail later, but first impression was it looked good! Especially nice that it appears possible to package up most of the funky call_cc magic in a base class, so that non-wizards could reuse it by following a simple protocol. great-fun-to-come-up-with-one-of-these-but-i'd-hate-to-have-to-redo- from-scratch-every-time-ly y'rs - tim From skip at mojam.com Thu May 20 15:27:59 1999 From: skip at mojam.com (Skip Montanaro) Date: Thu, 20 May 1999 09:27:59 -0400 (EDT) Subject: [Python-Dev] A "real" continuation example In-Reply-To: <14147.34175.950743.79464@seattle.nightmare.com> References: <50692631@toto.iv> <14147.34175.950743.79464@seattle.nightmare.com> Message-ID: <14148.3389.962368.221063@cm-29-94-2.nycap.rr.com> Sam> I reworked your Scheme a bit. IMHO letrec is for compilers, not for Sam> people. Sam, you are aware of course that the timbot *is* a compiler, right? ;-) >> So what would this look like in Continuation Python? Sam> Here's my first hack at it. Most likely wrong. 
It is REALLY HARD to Sam> do this without having the feature to play with. The thought that it's unlikely one could arrive at a reasonable approximation of a correct solution for such a small problem without the ability to "play with" it is sort of scary. Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/ skip at mojam.com | Musi-Cal: http://www.musi-cal.com/ 518-372-5583 From tismer at appliedbiometrics.com Thu May 20 16:10:32 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Thu, 20 May 1999 16:10:32 +0200 Subject: [Python-Dev] Interactive Debugging of Python References: <008b01bea255$b80cf790$0801a8c0@bobcat> Message-ID: <374417D8.8DBCB617@appliedbiometrics.com> Mark Hammond wrote: > > All this talk about stack frames and manipulating them at runtime has > reminded me of one of my biggest gripes about Python. When I say "biggest > gripe", I really mean "biggest surprise" or "biggest shame". > > That is, Python is very interactive and dynamic. However, when I am > debugging Python, it seems to lose this. There is no way for me to > effectively change a running program. Now with VC6, I can do this with C. > Although it is slow and a little dumb, I can change the C side of my Python > world while my program is running, but not the Python side of the world. > > Im wondering how feasable it would be to change Python code _while_ running > under the debugger. Presumably this would require a way of recompiling the > current block of code, patching this code back into the object, and somehow > tricking the stack frame to use this new block of code; even if a first-cut > had to restart the block or somesuch... > > Any thoughts on this? I'm writing a prototype of a stackless Python, which means that you will be able to access the current state of the interpreter completely. The inner interpreter loop will be isolated from the frame dispatcher. It will break whenever the ticker goes zero. 
If you set the ticker to one, you will be able to single step on every opcode, have the value stack, the frame chain, everything. I think that with this you can do very much. But tell me if you want a callback hook somewhere. ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From tismer at appliedbiometrics.com Thu May 20 18:52:21 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Thu, 20 May 1999 18:52:21 +0200 Subject: [Python-Dev] 'stackless' python? References: <000901bea26a$34526240$179e2299@tim> Message-ID: <37443DC5.1330EAC6@appliedbiometrics.com> Cleaning up, clarifying, trying to understand... Tim Peters wrote: > > [Christian Tismer] > > I tried the most simple thing, and this seemed to be duplicating > > the current state of the machine. The frame holds the stack, > > and references to all objects. > > By chance, the locals are not in a dict, but unpacked into > > the frame. (Sometimes I agree with Guido, that optimization > > is considered harmful :-) > > I don't see that the locals are a problem here -- provided you simply leave > them alone . This depends on whether I have to duplicate frames or not. Below... > > The Python stack, besides its intermingledness with the machine > > stack, is basically its chain of frames. > > Right. > > > The value stack pointer still hides in the machine stack, but > > that's easy to change. > > I'm not sure what "value stack" means here, or "machine stack". The latter > means the C stack? Then I don't know which values you have in mind that are > hiding in it (the locals are, as you say, unpacked in the frame, and the > evaluation stack too).
By "evaluation stack" I mean specifically > f->f_valuestack; the current *top* of stack pointer (specifically > stack_pointer) lives in the C stack -- is that what we're talking about? Exactly! > Whichever, when we're talking about the code, let's use the names the code > uses . The evaluation stack pointer is a local variable in the C stack and must be written to the frame to become independent from the C stack. Sounds better now? > > > So the real Scheme-like part is this chain, methinks, with > > the current bytecode offset and value stack info. > > Curiously, f->f_lasti is already materialized every time we make a call, in > order to support tracing. So if capturing a continuation is done via a > function call (hard to see any other way it could be done ), a > bytecode offset is already getting saved in the frame object. You got me. I'm just completing what is partially there. > > Making a copy of this in a restartable way means to increase > > the refcount of all objects in a frame. > > You later had a vision of splitting the frame into two objects -- I think. My wrong wording. Not splitting, but duplicating. If a frame is the current state, I make it two frames to have two current states. One will be saved, the other will be run. This is what I call "splitting". Actually, splitting must occur whenever a frame can be reached twice, in order to keep elements alive. > Whichever part the locals live in should not be copied at all, but merely > have its (single) refcount increased. The other part hinges on details of > your approach I don't know. The nastiest part seems to be f->f_valuestack, > which conceptually needs to be (shallow) copied in the current frame and in > all other frames reachable from the current frame's continuation (the chain > rooted at f->f_back today); that's the sum total (along with the same > frames' bytecode offsets) of capturing the control flow state. Well, I see. You want one locals and one globals, shared by two incarnations.
Gets me into trouble. > > Would it be correct to undo the effect of fast locals before > > splitting, and redoing it on activation? > > Unsure what splitting means, but in any case I can't conceive of a reason > for doing anything to the locals. Their values aren't *supposed* to get > restored upon continuation invocation, so there's no reason to do anything > with their values upon continuation creation either. Right? Or are we > talking about different things? Let me explain. What Python does right now is: When a function is invoked, all local variables are copied into fast_locals, well of course just references are copied and counts increased. These fast locals give a lot of speed today, we must have them. You are saying I have to share locals between frames. Besides the fact that this will mean a reasonable slowdown, since an extra structure must be built and accessed indirectly (right now, it's all fast, living in the one frame buffer), I cannot say that I'm convinced that this is what we need. Suppose you have a function

def f(x):
    # do something
    ...
    # in some context, wanna have a snapshot
    global snapshot     # initialized to None
    if not snapshot:
        snapshot = callcc.new()
    # continue computation
    x = x+1
    ...

What I want to achieve is that I can run this again, from my snapshot. But with shared locals, my parameter x of the snapshot would have changed to x+1, which I don't find useful. I want to fix a state of the current frame and still think it should "own" its locals. Globals are borrowed, anyway. Class instances will anyway do what you want, since the local "self" is a mutable object. How do you want to keep computations independent when locals are shared? For me it's just easier to implement and also to think with the shallow copy. Otherwise, where is my private place? Open for becoming convinced, of course :-) ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break!
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From jeremy at cnri.reston.va.us Thu May 20 21:26:30 1999 From: jeremy at cnri.reston.va.us (Jeremy Hylton) Date: Thu, 20 May 1999 15:26:30 -0400 (EDT) Subject: [Python-Dev] 'stackless' python? In-Reply-To: <37443DC5.1330EAC6@appliedbiometrics.com> References: <000901bea26a$34526240$179e2299@tim> <37443DC5.1330EAC6@appliedbiometrics.com> Message-ID: <14148.21750.738559.424456@bitdiddle.cnri.reston.va.us> >>>>> "CT" == Christian Tismer writes: CT> What I want to achieve is that I can run this again, from my CT> snapshot. But with shared locals, my parameter x of the snapshot CT> would have changed to x+1, which I don't find useful. I want to CT> fix a state of the current frame and still think it should "own" CT> its locals. Globals are borrowed, anyway. Class instances will CT> anyway do what you want, since the local "self" is a mutable CT> object. CT> How do you want to keep computations independent when locals are CT> shared? For me it's just easier to implement and also to think CT> with the shallow copy. Otherwise, where is my private place? CT> Open for becoming convinced, of course :-) I think you're making things a lot more complicated by trying to instantiate new variable bindings for locals every time you create a continuation. Can you give an example of why that would be helpful? (Ok. I'm not sure I can offer a good example of why it would be helpful to share them, but it makes intuitive sense to me.) The call_cc mechanism is going to let you capture the current continuation, save it somewhere, and call on it again as often as you like. Would you get a fresh locals each time you used it? or just the first time? If only the first time, it doesn't seem that you've gained a whole lot. 
Also, all the locals that are references to mutable objects are already effectively shared. So it's only a few oddballs like ints that are an issue. Jeremy From tim_one at email.msn.com Fri May 21 00:04:04 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 20 May 1999 18:04:04 -0400 Subject: [Python-Dev] A "real" continuation example In-Reply-To: <14148.3389.962368.221063@cm-29-94-2.nycap.rr.com> Message-ID: <000601bea30c$ad51b220$9d9e2299@tim> [Tim] > So what would this look like in Continuation Python? [Sam] > Here's my first hack at it. Most likely wrong. It is > REALLY HARD to do this without having the feature to play with. [Skip] > The thought that it's unlikely one could arrive at a reasonable > approximation of a correct solution for such a small problem without the > ability to "play with" it is sort of scary. Yes it is. But while the problem is small, it's not easy, and only the Icon solution wrote itself (not a surprise -- Icon was designed for expressing this kind of algorithm, and the entire language is actually warped towards it). My first stab at the Python stack-fiddling solution had bugs too, but I conveniently didn't post that . After studying Sam's code, I expect it *would* work as written, so it's a decent bet that it's a reasonable approximation to a correct solution as-is. A different Python approach using threads can be built using Demo/threads/Generator.py from the source distribution. To make that a fair comparison, I would have to post the supporting machinery from Generator.py too -- and we can ask Guido whether Generator.py worked right the first time he tried it . The continuation solution is subtle, requiring real expertise; but the threads solution doesn't fare any better on that count (building the support machinery with threads is also a baffler if you don't have thread expertise). If we threw Python metaclasses into the pot too, they'd be a third kind of nightmare for the non-expert. 
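The threads-based approach Tim alludes to can be sketched in miniature (all names below are hypothetical; the real Demo/threads/Generator.py differs in its details, e.g. it has a kill protocol and error handling): the producer runs in its own thread and trades values with the consumer through two semaphores, so each side blocks until the other is ready.

```python
import threading

class ThreadGenerator:
    def __init__(self, func):
        self._ready = threading.Semaphore(0)  # a value is waiting
        self._go = threading.Semaphore(0)     # consumer asked for one
        self._value = None
        self._done = False
        threading.Thread(target=self._run, args=(func,), daemon=True).start()

    def _run(self, func):
        self._go.acquire()    # wait for the first request
        func(self.put)        # producer calls put() once per value
        self._done = True
        self._ready.release()

    def put(self, value):     # called from the producer thread
        self._value = value
        self._ready.release()  # hand the value over ...
        self._go.acquire()     # ... and block until asked again

    def next(self):           # called by the consumer; None marks the end
        self._go.release()
        self._ready.acquire()
        return None if self._done else self._value

def produce_fringe(put):
    # the familiar tree walk; put() suspends this thread at each leaf
    def walk(node):
        if isinstance(node, list):
            for item in node:
                walk(item)
        else:
            put(node)
    walk([[1, [[2, 3]]], [4], [], [[[5]], 6]])

g = ThreadGenerator(produce_fringe)
result = []
while True:
    v = g.next()
    if v is None:
        break
    result.append(v)
assert result == [1, 2, 3, 4, 5, 6]
```

The recursive shape of the walk survives untouched, as in the Icon version; the price is a thread per generator and two context switches per value.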
So, if you're faced with this kind of task, there's simply no easy way to get it done. Thread- and (it appears) continuation- based machinery can be crafted once by an expert, then packaged into an easy-to-use protocol for non-experts. All in all, I view continuations as a feature most people should actively avoid! I think it has that status in Scheme too (e.g., the famed Schemer's SICP textbook doesn't even mention call/cc). Its real value (if any ) is as a Big Invisible Hammer for certified wizards. Where call_cc leaks into the user's view of the world I'd try to hide it; e.g., where Sam has def walk (self, x): if type(x) == type([]): for item in x: self.walk (item) else: self.item = x # call self.suspend() with a continuation # that will continue walking the tree call_cc (self.suspend) I'd do def walk(self, x): if type(x) == type([]): for item in x: self.walk(item) else: self.put(x) where "put" is inherited from the base class (part of the protocol) and hides the call_cc business. Do enough of this, and we'll rediscover why Scheme demands that tail calls not push a new stack frame <0.9 wink>. the-tradeoffs-are-murky-ly y'rs - tim From tim_one at email.msn.com Fri May 21 00:04:09 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 20 May 1999 18:04:09 -0400 Subject: [Python-Dev] 'stackless' python? In-Reply-To: <37443DC5.1330EAC6@appliedbiometrics.com> Message-ID: <000701bea30c$af7a1060$9d9e2299@tim> [Christian] [... clarified stuff ... thanks! ... much clearer ...] > ... > If a frame is the current state, I make it two frames to have two > current states. One will be saved, the other will be run. This is > what I call "splitting". Actually, splitting must occour whenever > a frame can be reached twice, in order to keep elements alive. That part doesn't compute: if a frame can be reached by more than one path, its refcount must be at least equal to the number of its immediate predecessors, and its refcount won't fall to 0 before it becomes unreachable. 
So while you may need to split stuff for *some* reasons, I can't see how keeping elements alive could be one of those reasons (unless you're zapping frame contents *before* the frame itself is garbage?). > ... > Well, I see. You want one locals and one globals, shared by two > incarnations. Gets me into trouble. Just clarifying what Scheme does. Since they've been doing this forever, I don't want to toss their semantics on a whim . It's at least a conceptual thing: why *should* locals follow different rules than globals? If Python2 grows lexical closures, the only thing special about today's "locals" is that they happen to be the first guys found on the search path. Conceptually, that's really all they are today too. Here's the clearest Scheme example I can dream up: (define k #f) (define (printi i) (display "i is ") (display i) (newline)) (define (test n) (let ((i n)) (printi i) (set! i (- i 1)) (printi i) (display "saving continuation") (newline) (call/cc (lambda (here) (set! k here))) (set! i (- i 1)) (printi i) (set! i (- i 1)) (printi i))) No loops, no recursive calls, just a straight chain of fiddle-a-local ops. Here's some output: > (test 5) i is 5 i is 4 saving continuation i is 3 i is 2 > (k #f) i is 1 i is 0 > (k #f) i is -1 i is -2 > (k #f) i is -3 i is -4 > So there's no question about what Scheme thinks is proper behavior here. > ... > Let me explain. What Python does right now is: > When a function is invoked, all local variables are copied > into fast_locals, well of course just references are copied > and counts increased. These fast locals give a lot of speed > today, we must have them. Scheme (most of 'em, anyway) also resolves locals via straight base + offset indexing. > You are saying I have to share locals between frames. 
Besides > that will be a reasonable slowdown, since an extra structure > must be built and accessed indirectly (right now, it's all fast, > living in the one frame buffer), GETLOCAL and SETLOCAL simply index off of the fastlocals pointer; it doesn't care where that points *to*. I cannot say that I'm convinced that this is what we need. > > Suppose you have a function > > def f(x): > # do something > ... > # in some context, wanna have a snapshot > global snapshot # initialized to None > if not snapshot: > snapshot = callcc.new() > # continue computation > x = x+1 > ... > > What I want to achieve is that I can run this again, from my > snapshot. But with shared locals, my parameter x of the > snapshot would have changed to x+1, which I don't find useful. You need a completely fleshed-out example to score points here: the use of call/cc is subtle, hinging on details, and fragments ignore too much. If you do want the same x, commonx = x if not snapshot: # get the continuation # continue computation x = commonx x = x+1 ... That is, it's easy to get it. But if you *do* want to see changes to the locals (which is one way for those distinct continuation invocations to *cooperate* in solving a task -- see below), but the implementation doesn't allow for it, I don't know what you can do to worm around it short of making x global too. But then different *top* level invocations of f will stomp on that shared global, so that's not a solution either. Maybe forget functions entirely and make everything a class method.
> Open for becoming convinced, of course :-) I imagine it comes up less often in Scheme because it has no loops: communication among "iterations" is via function arguments or up-level lexical vrbls. So recall your uses of Icon generators instead: like Python, Icon does have loops, and two-level scoping, and I routinely build loopy Icon generators that keep state in locals. Here's a dirt-simple example I emailed to Sam earlier this week: procedure main() every result := fib(0, 1) \ 10 do write(result) end procedure fib(i, j) local temp repeat { suspend i temp := i + j i := j j := temp } end which prints 0 1 1 2 3 5 8 13 21 34 If Icon restored the locals (i, j, temp) upon each fib resumption, it would generate a zero followed by an infinite sequence of ones(!). Think of a continuation as a *paused* computation (which it is) rather than an *independent* one (which it isn't ), and I think it gets darned hard to argue. theory-and-practice-agree-here-in-my-experience-ly y'rs - tim From MHammond at skippinet.com.au Fri May 21 01:01:22 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Fri, 21 May 1999 09:01:22 +1000 Subject: [Python-Dev] Interactive Debugging of Python In-Reply-To: <374417D8.8DBCB617@appliedbiometrics.com> Message-ID: <00c001bea314$aefc5b40$0801a8c0@bobcat> > I'm writing a prototype of a stackless Python, which means that > you will be able to access the current state of the interpreter > completely. > The inner interpreter loop will be isolated from the frame > dispatcher. It will break whenever the ticker goes zero. > If you set the ticker to one, you will be able to single > step on every opcode, have the value stack, the frame chain, > everything. I think the main point is how to change code when a Python frame already references it. I dont think the structure of the frames is as important as the general concept. 
But while we were talking frame-fiddling it seemed a good point to try and hijack it a little :-) Would it be possible to recompile just a block of code (eg, just the current function or method) and patch it back in such a way that the current frame continues execution of the new code? I feel this is somewhat related to the inability to change class implementation for an existing instance. I know there have been hacks around this before but they arent completely reliable and IMO it would be nice if the core Python made it easier to change already running code - whether that code is in an existing stack frame, or just in an already created instance, it is very difficult to do. This has come to try and deflect some conversation away from changing Python as such towards an attempt at enhancing its _environment_. To paraphrase many people before me, even if we completely froze the language now there would still be plenty of work ahead of us :-) Mark. From guido at CNRI.Reston.VA.US Fri May 21 02:06:51 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 20 May 1999 20:06:51 -0400 Subject: [Python-Dev] Interactive Debugging of Python In-Reply-To: Your message of "Fri, 21 May 1999 09:01:22 +1000." <00c001bea314$aefc5b40$0801a8c0@bobcat> References: <00c001bea314$aefc5b40$0801a8c0@bobcat> Message-ID: <199905210006.UAA07900@eric.cnri.reston.va.us> > I think the main point is how to change code when a Python frame already > references it. I dont think the structure of the frames is as important as > the general concept. But while we were talking frame-fiddling it seemed a > good point to try and hijack it a little :-) > > Would it be possible to recompile just a block of code (eg, just the > current function or method) and patch it back in such a way that the > current frame continues execution of the new code?
This topic sounds mostly unrelated to the stackless discussion -- in either case you need to be able to fiddle the contents of the frame and the bytecode pointer to reflect the changed function. Some issues: - The slots containing local variables may be renumbered after recompilation; fortunately we know the name--number mapping so we can move them to their new location. But it is still tricky. - Should you be able to edit functions that are present on the call stack below the top? Suppose we have two functions: def f(): return 1 + g() def g(): return 0 Suppose we set a break in g(), and then edit the source of f(). We can do all sorts of evil to f(): e.g. we could change it to return g() + 2 which affects the contents of the value stack when g() returns (originally, the value stack contained the value 1, now it is empty). Or we could even change f() to return 3 thereby eliminating the call to g() altogether! What kind of limitations do other systems that support modifying a "live" program being debugged impose? Only allowing modification of the function at the top of the stack might eliminate some problems, although there are still ways to mess up. The value stack is not always empty even when we only stop at statement boundaries -- e.g. it contains 'for' loop indices, and there's also the 'block' stack, which contains try-except information. E.g. what should happen if we change def f(): for i in range(10): print 1 stopped at the 'print 1' into def f(): print 1 ??? (Ditto for removing or adding a try/except block.)
Function objects now have mutable func_code attributes (and also func_defaults); I think we can use this. The hard part is to do the analysis needed to decide which functions to recompile! Ideally, we would simply edit a file and tell the programming environment "recompile this". The programming environment would compare the changed file with the old version that it had saved for this purpose, and notice (for example) that we changed two methods of class C. It would then recompile those methods only and stuff the new code objects in the corresponding function objects. But what would it do when we changed a global variable? Say a module originally contains a statement "x = 0". Now we change the source code to say "x = 100". Should we change the variable x? Suppose that x is modified by some of the computations in the module, and that, after some computations, the actual value of x was 50. Should the "recompile" reset x to 100 or leave it alone? One option would be to actually change the semantics of the class and def statements so that they modify an existing class or function rather than using assignment. Effectively, this proposal would change the semantics of class A: ...some code... class A: ...some more code... to be the same as class A: ...some code... ...some more code... This is somewhat similar to the way the module or package commands in some other dynamic languages work, I think; and I don't think this would break too much existing code. The proposal would also change def f(): ...some code... def f(): ...other code... but here the equivalence is not so easy to express, since I want different semantics (I don't want the second f's code to be tacked onto the end of the first f's code). If we understand that def f(): ... really does the following: f = NewFunctionObject() f.func_code = ...code object... then the construct above (def f():... def f(): ...) would do this: f = NewFunctionObject() f.func_code = ...some code... f.func_code = ...other code...
i.e. there is no assignment of a new function object for the second def. Of course if there is a variable f but it is not a function, it would have to be assigned a new function object first. But in the case of def, this *does* break existing code. E.g. # module A from B import f . . . if ...some test...: def f(): ...some code... This idiom conditionally redefines a function that was also imported from some other module. The proposed new semantics would change B.f in place! So perhaps these new semantics should only be invoked when a special "reload-compile" is asked for... Or perhaps the programming environment could do this through source parsing as I proposed before... > This has come to try and deflect some conversation away from changing > Python as such towards an attempt at enhancing its _environment_. To > paraphrase many people before me, even if we completely froze the language > now there would still plenty of work ahead of us :-) Please, no more posts about Scheme. Each new post mentioning call/cc makes it *less* likely that something like that will ever be part of Python. "What if Guido's brain exploded?" :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From skip at mojam.com Fri May 21 03:13:28 1999 From: skip at mojam.com (Skip Montanaro) Date: Thu, 20 May 1999 21:13:28 -0400 (EDT) Subject: [Python-Dev] Interactive Debugging of Python In-Reply-To: <199905210006.UAA07900@eric.cnri.reston.va.us> References: <00c001bea314$aefc5b40$0801a8c0@bobcat> <199905210006.UAA07900@eric.cnri.reston.va.us> Message-ID: <14148.45321.204380.19130@cm-29-94-2.nycap.rr.com> Guido> What kind of limitations do other systems that support modifying Guido> a "live" program being debugged impose? Only allowing Guido> modification of the function at the top of the stack might Guido> eliminate some problems, although there are still ways to mess Guido> up. 
Frame objects maintain pointers to the active code objects, locals and globals, so modifying a function object's code or globals shouldn't have any effect on currently executing frames, right? I assume frame objects do the usual INCREF/DECREF dance, so the old code object won't get deleted before the frame object is tossed. Guido> But what would it do when we changed a global variable? Say a Guido> module originally contains a statement "x = 0". Now we change Guido> the source code to say "x = 100". Should we change the variable Guido> x? Suppose that x is modified by some of the computations in the Guido> module, and the that, after some computations, the actual value Guido> of x was 50. Should the "recompile" reset x to 100 or leave it Guido> alone? I think you should note the change for users and give them some way to easily pick between old initial value, new initial value or current value. Guido> Please, no more posts about Scheme. Each new post mentioning Guido> call/cc makes it *less* likely that something like that will ever Guido> be part of Python. "What if Guido's brain exploded?" :-) I agree. I see call/cc or set! and my eyes just glaze over... Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/ skip at mojam.com | Musi-Cal: http://www.musi-cal.com/ 518-372-5583 From MHammond at skippinet.com.au Fri May 21 03:42:14 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Fri, 21 May 1999 11:42:14 +1000 Subject: [Python-Dev] Interactive Debugging of Python In-Reply-To: <199905210006.UAA07900@eric.cnri.reston.va.us> Message-ID: <00c501bea32b$277ce3d0$0801a8c0@bobcat> [Guido writes...] > This topic sounds mostly unrelated to the stackless discussion -- in Sure is - I just saw that as an excuse to try and hijack it > Some issues: > > - The slots containing local variables may be renumbered after Generally, I think we could make something very useful even with a number of limitations. 
For example, I would find a first cut completely acceptable and a great improvement on today if: * Only the function at the top of the stack can be recompiled and have the code reflected while executing. This function also must be restarted after such an edit. If the function uses global variables or makes calls that restarting will screw-up, then either a) make the code changes _before_ doing this stuff, or b) live with it for now, and help us remove the limitation :-) That may make the locals being renumbered easier to deal with, and also remove some of the problems you discussed about editing functions below the top. > What kind of limitations do other systems that support modifying a > "live" program being debugged impose? Only allowing modification of I can only speak for VC, and from experience at that - I havent attempted to find documentation on it. It accepts most changes while running. The current line is fine. If you create or change the definition of globals (and possibly even the type of locals?), the "incremental compilation" fails, and you are given the option of continuing with the old code, or stopping the process and doing a full build. When the debug session terminates, some link process (and maybe even compilation?) is done to bring the .exe on disk up to date with the changes. If you do weird stuff like delete the line being executed, it usually gives you some warning message before either restarting the function or trying to pick a line somewhere near the line you deleted. Either way, it can screw up, moving the "current" line somewhere else - it doesnt crash the debugger, but may not do exactly what you expected. It is still a _huge_ win, and a great feature! Ironically, I turn this feature _off_ for Python extensions. Although changing the C code is great, in 99% of the cases I also need to change some .py code, and as existing instances are affected I need to restart the app anyway - so I may as well do a normal build at that time.
ie, C now lets me debug incrementally, but a far more dynamic language prevents this feature being useful ;-) > the function at the top of the stack might eliminate some problems, > although there are still ways to mess up. The value stack is not > always empty even when we only stop at statement boundaries If we forced a restart would this be better? Can we reliably reset the stack to the start of the current function? > I've been thinking a bit about this. Function objects now have > mutable func_code attributes (and also func_defaults), I think we can > use this. > > The hard part is to do the analysis needed to decide which functions > to recompile! Ideally, we would simply edit a file and tell the > programming environment "recompile this". The programming environment > would compare the changed file with the old version that it had saved > for this purpose, and notice (for example) that we changed two methods > of class C. It would then recompile those methods only and stuff the > new code objects in the corresponding function objects. If this would work for the few changed functions/methods, what would the impact be of doing it for _every_ function (changed or not)? Then the analysis can drop to the module level which is much easier. I dont think a slight performance hit is a problem at all when doing this stuff. > One option would be to actually change the semantics of the class and > def statements so that they modify an existing class or function > rather than using assignment. Effectively, this proposal would change > the semantics of > > class A: > ...some code... > > class A: > ...some more code... > > to be the same as > > class A: > ...some code... > ...some more code... Or extending this (didnt this come up at the latest IPC?) # .\package\__init__.py class BigMutha: pass # .\package\something.py class package.BigMutha: def some_category_of_methods(): ... # .\package\other.py class package.BigMutha: def other_category_of_methods(): ...
[Of course, this wont fly as it stands; just a conceptual possibility] > So perhaps these new semantics should only be invoked when a special > "reload-compile" is asked for... Or perhaps the programming > environment could do this through source parsing as I proposed > before... From guido at CNRI.Reston.VA.US Fri May 21 05:02:49 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 20 May 1999 23:02:49 -0400 Subject: [Python-Dev] Interactive Debugging of Python In-Reply-To: Your message of "Fri, 21 May 1999 11:42:14 +1000." <00c501bea32b$277ce3d0$0801a8c0@bobcat> References: <00c501bea32b$277ce3d0$0801a8c0@bobcat> Message-ID: <199905210302.XAA08129@eric.cnri.reston.va.us> > Generally, I think we could make something very useful even with a number > of limitations. For example, I would find a first cut completely > acceptable and a great improvement on today if: > > * Only the function at the top of the stack can be recompiled and have the > code reflected while executing. This function also must be restarted after > such an edit. If the function uses global variables or makes calls that > restarting will screw-up, then either a) make the code changes _before_ > doing this stuff, or b) live with it for now, and help us remove the > limitation :-) OK, restarting the function seems a reasonable compromise and would seem relatively easy to implement. Not *real* easy though: it turns out that eval_code2() is called with a code object as argument, and it's not entirely trivial to figure out the corresponding function object from which to grab the new code object. But it could be done -- give it a try. (Don't wait for me, I'm ducking for cover until at least mid June.) > Ironically, I turn this feature _off_ for Python extensions. Although > changing the C code is great, in 99% of the cases I also need to change > some .py code, and as existing instances are affected I need to restart the > app anyway - so I may as well do a normal build at that time. 
ie, C now > lets me debug incrementally, but a far more dynamic language prevents this > feature being useful ;-) I hear you. > If we forced a restart would this be better? Can we reliably reset the > stack to the start of the current function? Yes, no problem. > If this would work for the few changed functions/methods, what would the > impact be of doing it for _every_ function (changed or not)? Then the > analysis can drop to the module level which is much easier. I dont think a > slight performace hit is a problem at all when doing this stuff. Yes, this would be fine too. > >"What if Guido's brain exploded?" :-) > > At least on that particular topic I didnt even consider I was the only one > in fear of that! But it is good to know that you specifically are too :-) Have no fear. I've learned to say no. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From tim_one at email.msn.com Fri May 21 07:36:44 1999 From: tim_one at email.msn.com (Tim Peters) Date: Fri, 21 May 1999 01:36:44 -0400 Subject: [Python-Dev] Interactive Debugging of Python In-Reply-To: <199905210006.UAA07900@eric.cnri.reston.va.us> Message-ID: <000401bea34b$e93fcda0$d89e2299@tim> [GvR] > ... > What kind of limitations do other systems that support modifying a > "live" program being debugged impose? As an ex-compiler guy, I should have something wise to say about that. Alas, I've never used a system that allowed more than poking new values into vrbls, and the thought of any more than that makes me vaguely ill! Oh, that's right -- I'm vaguely ill anyway today. Still-- oooooh -- the problems. This later got reduced to restarting the topmost function from scratch. That has some attraction, especially on the bang-for-buck-o-meter. > ... > Please, no more posts about Scheme. Each new post mentioning call/cc > makes it *less* likely that something like that will ever be part of > Python. "What if Guido's brain exploded?" :-) What a pussy . 
Really, overall continuations are much less trouble to understand than threads -- there's only one function in the entire interface! OK. So how do you feel about coroutines? Would sure be nice to have *some* way to get pseudo-parallel semantics regardless of OS. changing-code-on-the-fly-==-mutating-the-current-continuation-ly y'rs - tim From da at ski.org Sun May 2 17:15:57 1999 From: da at ski.org (David Ascher) Date: Sun, 2 May 1999 08:15:57 -0700 (Pacific Daylight Time) Subject: [Python-Dev] More flexible namespaces. In-Reply-To: <00bc01be942a$47d94070$0801a8c0@bobcat> Message-ID: On Sun, 2 May 1999, Mark Hammond wrote: > > I'm not sure whether any of the > > proposals on > > the table really do what's needed for e.g. case-insensitive namespace > > handling. I can see how all of the proposals so far allow > > case-insensitive reference name handling in the global namespace, but > > don't we also need to hook into the local-namespace creation > > process to > > allow case-insensitivity to work throughout? > > Why not? I pictured case insensitive namespaces working so that they > retain the case of the first assignment, but all lookups would be > case-insensitive. > > Ohh - right! Python itself would need changing to support this. I suppose > that faced with code such as: > > def func(): > if spam: > Spam=1 > > Python would generate code that refers to "spam" as a local, and "Spam" as > a global. > > Is this why you feel it wont work? I hadn't thought of that, to be truthful, but I think it's more generic.
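(Much later Python versions make this easy to try: exec accepts a mapping -- here a dict subclass -- as the local namespace, so Mark's idea can be sketched directly. This version simply folds names to lower case rather than retaining the case of the first assignment:)

```python
class CaseInsensitiveNS(dict):
    # Sketch only: every name is folded to lower case, rather than
    # retaining the case of the first assignment as Mark suggests.
    def __setitem__(self, key, value):
        dict.__setitem__(self, key.lower(), value)

    def __getitem__(self, key):
        return dict.__getitem__(self, key.lower())

ns = CaseInsensitiveNS()
# STORE_NAME/LOAD_NAME go through the mapping's methods because ns is
# not exactly a dict, so "Spam" and "SPAM" name the same binding.
exec("Spam = 1\nresult = SPAM + 41", {}, ns)
print(ns["spam"], ns["RESULT"])  # 1 42
```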
[FWIW, I never much cared for the tag-variables-at-compile-time optimization in CPython, and wouldn't miss it if were lost.] The point is that if I eval or exec code which calls a function specifying some strange mapping as the namespaces (global and current-local) I presumably want to also specify how local namespaces work for the function calls within that code snippet. That means that somehow Python has to know what kind of namespace to use for local environments, and not use the standard dictionary. Maybe we can simply have it use a '.clear()'ed .__copy__ of the specified environment. exec 'foo()' in globals(), mylocals would then call foo and within foo, the local env't would be mylocals.__copy__.clear(). Anyway, something for those-with-the-patches to keep in mind. --david From tismer at appliedbiometrics.com Sun May 2 15:00:37 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Sun, 02 May 1999 15:00:37 +0200 Subject: [Python-Dev] More flexible namespaces. References: Message-ID: <372C4C75.5B7CCAC8@appliedbiometrics.com> David Ascher wrote: [Marc:> > > Since you put out to objectives, I'd like to propose a little > > different approach... > > > > 1. Have eval/exec accept any mapping object as input > > > > 2. Make those two copy the content of the mapping object into real > > dictionaries > > > > 3. Provide a hook into the dictionary implementation that can be > > used to redirect KeyErrors and use that redirection to forward > > the request to the original mapping objects I don't think that this proposal would give so much new value. Since a mapping can also be implemented in arbitrary ways, say by functions, a mapping is not necessarily finite and might not be changeable into a dict. [David:> > Interesting counterproposal. I'm not sure whether any of the proposals on > the table really do what's needed for e.g. case-insensitive namespace > handling. 
I can see how all of the proposals so far allow > case-insensitive reference name handling in the global namespace, but > don't we also need to hook into the local-namespace creation process to > allow case-insensitivity to work throughout? Case-independent namespaces seem to be a minor point, nice to have for interfacing to other products, but then, in a function, I see no benefit in changing the semantics of function locals? The lookup of foreign symbols would always be through a mapping object. If you take COM for instance, your access to a COM wrapper for an arbitrary object would be through properties of this object. After assignment to a local function variable, why should we support case-insensitivity at all? I would think mapping objects would be a great simplification of lazy imports in COM, where we would like to avoid importing really huge namespaces in one big slurp. Also the wrapper code could be made quite a lot easier and faster without so much getattr/setattr trapping. Does btw. anybody really want to see case-insensitivity in Python programs? I'm quite happy with it as it is, and I would even force the user to always use the same case style after he has touched an external property once. Example for Excel: You may write "xl.workbooks" in lowercase, but then you have to stay with it. This would keep Python source clean for, say, PyLint. my 0.02 Euro - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From MHammond at skippinet.com.au Sun May 2 01:28:11 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Sun, 2 May 1999 09:28:11 +1000 Subject: [Python-Dev] More flexible namespaces.
In-Reply-To: Message-ID: <00bc01be942a$47d94070$0801a8c0@bobcat> > I'm not sure whether any of the > proposals on > the table really do what's needed for e.g. case-insensitive namespace > handling. I can see how all of the proposals so far allow > case-insensitive reference name handling in the global namespace, but > don't we also need to hook into the local-namespace creation > process to > allow case-insensitivity to work throughout? Why not? I pictured case insensitive namespaces working so that they retain the case of the first assignment, but all lookups would be case-insensitive. Ohh - right! Python itself would need changing to support this. I suppose that faced with code such as:

    def func():
        if spam:
            Spam=1

Python would generate code that refers to "spam" as a local, and "Spam" as a global. Is this why you feel it wont work? Mark. From mal at lemburg.com Sun May 2 21:24:54 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Sun, 02 May 1999 21:24:54 +0200 Subject: [Python-Dev] More flexible namespaces. References: <372C4C75.5B7CCAC8@appliedbiometrics.com> Message-ID: <372CA686.215D71DF@lemburg.com> Christian Tismer wrote: > > David Ascher wrote: > [Marc:> > > Since you put out the objectives, I'd like to propose a little > > > different approach... > > > > > > 1. Have eval/exec accept any mapping object as input > > > > > > 2. Make those two copy the content of the mapping object into real > > > dictionaries > > > > > > 3. Provide a hook into the dictionary implementation that can be > > > used to redirect KeyErrors and use that redirection to forward > > > the request to the original mapping objects > > I don't think that this proposal would give so much new > value. Since a mapping can also be implemented in arbitrary > ways, say by functions, a mapping is not necessarily finite > and might not be changeable into a dict. [Disclaimer: I'm not really keen on having the possibility of letting code execute in arbitrary namespace objects...
it would make code optimizations even less manageable.] You can easily support infinite mappings by wrapping the function into an object which returns an empty list for .items() and then use the hook mentioned in 3 to redirect the lookup to that function. The proposal allows one to use such a proxy to simulate any kind of mapping -- it works much like the __getattr__ hook provided for instances. > [David:> > > Interesting counterproposal. I'm not sure whether any of the proposals on > > the table really do what's needed for e.g. case-insensitive namespace > > handling. I can see how all of the proposals so far allow > > case-insensitive reference name handling in the global namespace, but > > don't we also need to hook into the local-namespace creation process to > > allow case-insensitivity to work throughout? > > Case-independant namespaces seem to be a minor point, > nice to have for interfacing to other products, but then, > in a function, I see no benefit in changing the semantics > of function locals? The lookup of foreign symbols would > always be through a mapping object. If you take COM for > instance, your access to a COM wrapper for an arbitrary > object would be through properties of this object. After > assignment to a local function variable, why should we > support case-insensitivity at all? > > I would think mapping objects would be a great > simplification of lazy imports in COM, where > we would like to avoid to import really huge > namespaces in one big slurp. Also the wrapper code > could be made quite a lot easier and faster without > so much getattr/setattr trapping. What do lazy imports have to do with case [in]sensitive namespaces ? Anyway, how about a simple lazy import mechanism in the standard distribution, i.e. why not make all imports lazy ? Since modules are first class objects this should be easy to implement... > Does btw. anybody really want to see case-insensitivity > in Python programs? 
I'm quite happy with it as it is, > and I would even force the use to always use the same > case style after he has touched an external property > once. Example for Excel: You may write "xl.workbooks" > in lowercase, but then you have to stay with it. > This would keep Python source clean for, say, PyLint. "No" and "me too" ;-) -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: Y2000: 243 days left Business: http://www.lemburg.com/ Python Pages: http://starship.python.net/crew/lemburg/ From MHammond at skippinet.com.au Mon May 3 02:52:41 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Mon, 3 May 1999 10:52:41 +1000 Subject: [Python-Dev] More flexible namespaces. In-Reply-To: <372CA686.215D71DF@lemburg.com> Message-ID: <000e01be94ff$4047ef20$0801a8c0@bobcat> [Marc] > [Disclaimer: I'm not really keen on having the possibility of > letting code execute in arbitrary namespace objects... it would > make code optimizations even less manageable.] Good point - although surely that would simply mean (certain) optimisations can't be performed for code executing in that environment? How to detect this at "optimization time" may be a little difficult :-) However, this is the primary purpose of this thread - to workout _if_ it is a good idea, as much as working out _how_ to do it :-) > The proposal allows one to use such a proxy to simulate any > kind of mapping -- it works much like the __getattr__ hook > provided for instances. My only problem with Marc's proposal is that there already _is_ an established mapping protocol, and this doesnt use it; instead it invents a new one with the benefit being potentially less code breakage. And without attempting to sound flippant, I wonder how many extension modules will be affected? Module init code certainly assumes the module __dict__ is a dictionary, but none of my code assumes anything about other namespaces. 
Marc's extensions may be a special case, as AFAIK they inject objects into other dictionaries (ie, new builtins?). Again, not trying to downplay this too much, but if it is only a problem for Marc's more esoteric extensions, I dont feel that should hold up an otherwise solid proposal. [Chris, I think?] > > Case-independant namespaces seem to be a minor point, > > nice to have for interfacing to other products, but then, > > in a function, I see no benefit in changing the semantics > > of function locals? The lookup of foreign symbols would I disagree here. Consider Alice, and similar projects, where a (arguably misplaced, but nonetheless) requirement is that the embedded language be case-insensitive. Period. The Alice people are somewhat special in that they had the resources to change the interpreters guts. Most people wont, and will look for a different language to embedd. Of course, I agree with you for the specific cases you are talking - COM, Active Scripting etc. Indeed, everything I would use this for would prefer to keep the local function semantics identical. > > Does btw. anybody really want to see case-insensitivity > > in Python programs? I'm quite happy with it as it is, > > and I would even force the use to always use the same > > case style after he has touched an external property > > once. Example for Excel: You may write "xl.workbooks" > > in lowercase, but then you have to stay with it. > > This would keep Python source clean for, say, PyLint. > > "No" and "me too" ;-) I think we are missing the point a little. If we focus on COM, we may come up with a different answer. Indeed, if we are to focus on COM integration with Python, there are other areas I would prefer to start with :-) IMO, we should attempt to come up with a more flexible namespace mechanism that is in the style of Python, and will not noticeably slowdown Python. 
Then COM etc can take advantage of it - much in the same way that Python's existing namespace model existed pre-COM, and COM had to take advantage of what it could! Of course, a key indicator of the likely success is how well COM _can_ take advantage of it, and how much Alice could have taken advantage of it - I cant think of any other yardsticks? Mark. From mal at lemburg.com Mon May 3 09:56:53 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Mon, 03 May 1999 09:56:53 +0200 Subject: [Python-Dev] More flexible namespaces. References: <000e01be94ff$4047ef20$0801a8c0@bobcat> Message-ID: <372D56C5.4738DE3D@lemburg.com> Mark Hammond wrote: > > [Marc] > > [Disclaimer: I'm not really keen on having the possibility of > > letting code execute in arbitrary namespace objects... it would > > make code optimizations even less manageable.] > > Good point - although surely that would simply mean (certain) optimisations > can't be performed for code executing in that environment? How to detect > this at "optimization time" may be a little difficult :-) > > However, this is the primary purpose of this thread - to workout _if_ it is > a good idea, as much as working out _how_ to do it :-) > > > The proposal allows one to use such a proxy to simulate any > > kind of mapping -- it works much like the __getattr__ hook > > provided for instances. > > My only problem with Marc's proposal is that there already _is_ an > established mapping protocol, and this doesnt use it; instead it invents a > new one with the benefit being potentially less code breakage. ...and that's the key point: you get the intended features and the core code will not have to be changed in significant ways. Basically, I think these kind of core extensions should be done in generic ways, e.g. by letting the eval/exec machinery accept subclasses of dictionaries, rather than trying to raise the abstraction level used and slowing things down in general just to be able to use the feature on very few occasions. 
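A retrospective sketch of the kind of "dictionary subclass with a KeyError redirection hook" being proposed here, in modern Python 3 syntax (none of this existed in 1999): today's exec() accepts any mapping as its locals argument, and a dict subclass keeps its overridden __setitem__/__getitem__ for name stores and lookups. The class name and the case-folding policy below are purely illustrative, not anything that was adopted.

```python
# Sketch (modern Python 3): exec() honors __setitem__/__getitem__ on a
# dict subclass passed as the *locals* mapping, which is enough to fake
# a case-insensitive namespace.
class CaseInsensitiveNS(dict):
    """Illustrative only: folds every name to lowercase."""

    def __setitem__(self, name, value):
        super().__setitem__(name.lower(), value)

    def __getitem__(self, name):
        # A failed lookup raises KeyError, which exec's name resolution
        # forwards to globals and then builtins -- the "redirection" hook.
        return super().__getitem__(name.lower())


ns = CaseInsensitiveNS()
exec("Spam = 1\nresult = SPAM + 1", {}, ns)
print(ns["Result"])  # 2 -- 'Spam', 'SPAM' and 'Result' all case-fold
```

Note that the globals argument still has to be a real dict; only the locals slot takes an arbitrary mapping, which is one reason the local-namespace question raised earlier in the thread is the harder half of the problem.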
> And without attempting to sound flippant, I wonder how many extension > modules will be affected? Module init code certainly assumes the module > __dict__ is a dictionary, but none of my code assumes anything about other > namespaces. Marc's extensions may be a special case, as AFAIK they inject > objects into other dictionaries (ie, new builtins?). Again, not trying to > downplay this too much, but if it is only a problem for Marc's more > esoteric extensions, I dont feel that should hold up an otherwise solid > proposal. My mxTools extension does the assignment in Python, so it wouldn't be affected. The others only do the usual modinit() stuff. Before going any further on this thread we may have to ponder a little more on the objectives that we have. If it's only case-insensitive lookups then I guess a simple compile time switch exchanging the implementations of string hash and compare functions would do the trick. If we're after doing wild things like lookups accross networks, then a more specific approach is needed. So what is it that we want in 1.6 ? > [Chris, I think?] > > > Case-independant namespaces seem to be a minor point, > > > nice to have for interfacing to other products, but then, > > > in a function, I see no benefit in changing the semantics > > > of function locals? The lookup of foreign symbols would > > I disagree here. Consider Alice, and similar projects, where a (arguably > misplaced, but nonetheless) requirement is that the embedded language be > case-insensitive. Period. The Alice people are somewhat special in that > they had the resources to change the interpreters guts. Most people wont, > and will look for a different language to embedd. > > Of course, I agree with you for the specific cases you are talking - COM, > Active Scripting etc. Indeed, everything I would use this for would prefer > to keep the local function semantics identical. As I understand the needs in COM and AS you are talking about object attributes, right ? 
Making these case-insensitive is a job for a proxy or a __getattr__ hack. > > > Does btw. anybody really want to see case-insensitivity > > > in Python programs? I'm quite happy with it as it is, > > > and I would even force the use to always use the same > > > case style after he has touched an external property > > > once. Example for Excel: You may write "xl.workbooks" > > > in lowercase, but then you have to stay with it. > > > This would keep Python source clean for, say, PyLint. > > > > "No" and "me too" ;-) > > I think we are missing the point a little. If we focus on COM, we may come > up with a different answer. Indeed, if we are to focus on COM integration > with Python, there are other areas I would prefer to start with :-) > > IMO, we should attempt to come up with a more flexible namespace mechanism > that is in the style of Python, and will not noticeably slowdown Python. > Then COM etc can take advantage of it - much in the same way that Python's > existing namespace model existed pre-COM, and COM had to take advantage of > what it could! > > Of course, a key indicator of the likely success is how well COM _can_ take > advantage of it, and how much Alice could have taken advantage of it - I > cant think of any other yardsticks? -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: Y2000: 242 days left Business: http://www.lemburg.com/ Python Pages: http://starship.python.net/crew/lemburg/ From fredrik at pythonware.com Mon May 3 16:01:10 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 3 May 1999 16:01:10 +0200 Subject: [Python-Dev] Why Foo is better than Baz References: <000e01be94ff$4047ef20$0801a8c0@bobcat> Message-ID: <005b01be956d$66d48450$f29b12c2@pythonware.com> scriptics is positioning tcl as a perl killer: http://www.scriptics.com/scripting/perl.html afaict, unicode and event handling are the two main thingies missing from python 1.5. -- unicode: is on its way. 
-- event handling: asynclib/asynchat provides an awesome framework for event-driven socket programming. however, Python still lacks good cross-platform support for event-driven access to files and pipes. are threads good enough, or would it be cool to have something similar to Tcl's fileevent stuff in Python? -- regexps: has anyone compared the new unicode-aware regexp package in Tcl with pcre? comments? btw, the rebol folks have reached 2.0: http://www.rebol.com/ maybe 1.6 should be renamed to Python 6.0? From akuchlin at cnri.reston.va.us Mon May 3 17:14:15 1999 From: akuchlin at cnri.reston.va.us (Andrew M. Kuchling) Date: Mon, 3 May 1999 11:14:15 -0400 (EDT) Subject: [Python-Dev] Why Foo is better than Baz In-Reply-To: <005b01be956d$66d48450$f29b12c2@pythonware.com> References: <000e01be94ff$4047ef20$0801a8c0@bobcat> <005b01be956d$66d48450$f29b12c2@pythonware.com> Message-ID: <14125.47524.196878.583460@amarok.cnri.reston.va.us> Fredrik Lundh writes: >-- regexps: has anyone compared the new >unicode-aware regexp package in Tcl with pcre? I looked at it a bit when Tcl 8.1 was in beta; it derives from Henry Spencer's 1998-vintage code, which seems to try to do a lot of optimization and analysis. It may even compile DFAs instead of NFAs when possible, though it's hard for me to be sure. This might give it a substantial speed advantage over engines that do less analysis, but I haven't benchmarked it. The code is easy to read, but difficult to understand because the theory underlying the analysis isn't explained in the comments; one feels there should be an accompanying paper to explain how everything works, and it's why I'm not sure if it really is producing DFAs for some expressions. Tcl seems to represent everything as UTF-8 internally, so there's only one regex engine; there's .
The code is scattered over more files:

amarok generic>ls re*.[ch]
regc_color.c    regc_locale.c   regcustom.h     regerrs.h       regfree.c
regc_cvec.c     regc_nfa.c      rege_dfa.c      regex.h         regfronts.c
regc_lex.c      regcomp.c       regerror.c      regexec.c       regguts.h
amarok generic>wc -l re*.[ch]
     742 regc_color.c
     170 regc_cvec.c
    1010 regc_lex.c
     781 regc_locale.c
    1528 regc_nfa.c
    2124 regcomp.c
      85 regcustom.h
     627 rege_dfa.c
      82 regerror.c
      18 regerrs.h
     308 regex.h
     952 regexec.c
      25 regfree.c
      56 regfronts.c
     388 regguts.h
    8896 total
amarok generic>

This would be an issue for using it with Python, since all these files would wind up scattered around the Modules directory. For comparison, pypcre.c is around 4700 lines of code. -- A.M. Kuchling http://starship.python.net/crew/amk/ Things need not have happened to be true. Tales and dreams are the shadow-truths that will endure when mere facts are dust and ashes, and forgot. -- Neil Gaiman, _Sandman_ #19: _A Midsummer Night's Dream_ From guido at CNRI.Reston.VA.US Mon May 3 17:32:09 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 03 May 1999 11:32:09 -0400 Subject: [Python-Dev] Why Foo is better than Baz In-Reply-To: Your message of "Mon, 03 May 1999 11:14:15 EDT." <14125.47524.196878.583460@amarok.cnri.reston.va.us> References: <000e01be94ff$4047ef20$0801a8c0@bobcat> <005b01be956d$66d48450$f29b12c2@pythonware.com> <14125.47524.196878.583460@amarok.cnri.reston.va.us> Message-ID: <199905031532.LAA05617@eric.cnri.reston.va.us> > I looked at it a bit when Tcl 8.1 was in beta; it derives from > Henry Spencer's 1998-vintage code, which seems to try to do a lot of > optimization and analysis. It may even compile DFAs instead of NFAs > when possible, though it's hard for me to be sure. This might give it > a substantial speed advantage over engines that do less analysis, but > I haven't benchmarked it.
The code is easy to read, but difficult to > understand because the theory underlying the analysis isn't explained > in the comments; one feels there should be an accompanying paper to > explain how everything works, and it's why I'm not sure if it really > is producing DFAs for some expressions. > > Tcl seems to represent everything as UTF-8 internally, so > there's only one regex engine; there's . Hmm... I looked when Tcl 8.1 was in alpha, and I *think* that at that point the regex engine was compiled twice, once for 8-bit chars and once for 16-bit chars. But this may have changed. I've noticed that Perl is taking the same position (everything is UTF-8 internally). On the other hand, Java distinguishes 16-bit chars from 8-bit bytes. Python is currently in the Java camp. This might be a good time to make sure that we're still convinced that this is the right thing to do!

> The code is scattered over
> more files:
>
> amarok generic>ls re*.[ch]
> regc_color.c regc_locale.c regcustom.h regerrs.h regfree.c
> regc_cvec.c regc_nfa.c rege_dfa.c regex.h regfronts.c
> regc_lex.c regcomp.c regerror.c regexec.c regguts.h
> amarok generic>wc -l re*.[ch]
>      742 regc_color.c
>      170 regc_cvec.c
>     1010 regc_lex.c
>      781 regc_locale.c
>     1528 regc_nfa.c
>     2124 regcomp.c
>       85 regcustom.h
>      627 rege_dfa.c
>       82 regerror.c
>       18 regerrs.h
>      308 regex.h
>      952 regexec.c
>       25 regfree.c
>       56 regfronts.c
>      388 regguts.h
>     8896 total
> amarok generic>
>
> This would be an issue for using it with Python, since all
> these files would wind up scattered around the Modules directory. For
> comparison, pypcre.c is around 4700 lines of code.

I'm sure that if it's good code, we'll find a way. Perhaps a more interesting question is whether it is Perl5 compatible. I contacted Henry Spencer at the time and he was willing to let us use his code.
--Guido van Rossum (home page: http://www.python.org/~guido/) From akuchlin at cnri.reston.va.us Mon May 3 17:56:46 1999 From: akuchlin at cnri.reston.va.us (Andrew M. Kuchling) Date: Mon, 3 May 1999 11:56:46 -0400 (EDT) Subject: [Python-Dev] Why Foo is better than Baz In-Reply-To: <199905031532.LAA05617@eric.cnri.reston.va.us> References: <000e01be94ff$4047ef20$0801a8c0@bobcat> <005b01be956d$66d48450$f29b12c2@pythonware.com> <14125.47524.196878.583460@amarok.cnri.reston.va.us> <199905031532.LAA05617@eric.cnri.reston.va.us> Message-ID: <14125.49911.982236.754340@amarok.cnri.reston.va.us> Guido van Rossum writes: >Hmm... I looked when Tcl 8.1 was in alpha, and I *think* that at that >point the regex engine was compiled twice, once for 8-bit chars and >once for 16-bit chars. But this may have changed. It doesn't seem to currently; the code in tclRegexp.c looks like this:

    /* Remember the UTF-8 string so Tcl_RegExpRange() can convert the
     * matches from character to byte offsets. */
    regexpPtr->string = string;
    Tcl_DStringInit(&stringBuffer);
    uniString = Tcl_UtfToUniCharDString(string, -1, &stringBuffer);
    numChars = Tcl_DStringLength(&stringBuffer) / sizeof(Tcl_UniChar);

    /* Perform the regexp match. */
    result = TclRegExpExecUniChar(interp, re, uniString, numChars, -1,
            ((string > start) ? REG_NOTBOL : 0));

ISTR the Spencer engine does, however, define a small and large representation for NFAs and have two versions of the engine, one for each representation. Perhaps that's what you're thinking of. >I've noticed that Perl is taking the same position (everything is >UTF-8 internally). On the other hand, Java distinguishes 16-bit chars >from 8-bit bytes. Python is currently in the Java camp. This might >be a good time to make sure that we're still convinced that this is >the right thing to do! I don't know.
There's certainly the fundamental dichotomy that strings are sometimes used to represent characters, where changing encodings on input and output is reasonable, and sometimes used to hold chunks of binary data, where any changes are incorrect. Perhaps Paul Prescod is right, and we should try to get some other data type (array.array()) for holding binary data, as distinct from strings. >I'm sure that if it's good code, we'll find a way. Perhaps a more >interesting question is whether it is Perl5 compatible. I contacted >Henry Spencer at the time and he was willing to let us use his code. Mostly Perl-compatible, though it doesn't look like the 5.005 features are there, and I haven't checked for every single 5.004 feature. Adding missing features might be problematic, because I don't really understand what the code is doing at a high level. Also, is there a user community for this code? Do any other projects use it? Philip Hazel has been quite helpful with PCRE, an important thing when making modifications to the code. Should I make a point of looking at what using the Spencer engine would entail? It might not be too difficult (an evening or two, maybe?) to write a re.py that sat on top of the Spencer code; that would at least let us do some benchmarking. -- A.M. Kuchling http://starship.python.net/crew/amk/ In Einstein's theory of relativity the observer is a man who sets out in quest of truth armed with a measuring-rod. In quantum theory he sets out with a sieve. -- Sir Arthur Eddington From guido at CNRI.Reston.VA.US Mon May 3 18:02:22 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 03 May 1999 12:02:22 -0400 Subject: [Python-Dev] Why Foo is better than Baz In-Reply-To: Your message of "Mon, 03 May 1999 11:56:46 EDT."
<14125.49911.982236.754340@amarok.cnri.reston.va.us> References: <000e01be94ff$4047ef20$0801a8c0@bobcat> <005b01be956d$66d48450$f29b12c2@pythonware.com> <14125.47524.196878.583460@amarok.cnri.reston.va.us> <199905031532.LAA05617@eric.cnri.reston.va.us> <14125.49911.982236.754340@amarok.cnri.reston.va.us> Message-ID: <199905031602.MAA05829@eric.cnri.reston.va.us> > Should I make a point of looking at what using the Spencer > engine would entail? It might not be too difficult (an evening or > two, maybe?) to write a re.py that sat on top of the Spencer code; > that would at least let us do some benchmarking. Surely this would be more helpful than weeks of speculative emails -- go for it! --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik at pythonware.com Mon May 3 19:10:55 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 3 May 1999 19:10:55 +0200 Subject: [Python-Dev] Why Foo is better than Baz References: <000e01be94ff$4047ef20$0801a8c0@bobcat><005b01be956d$66d48450$f29b12c2@pythonware.com><14125.47524.196878.583460@amarok.cnri.reston.va.us><199905031532.LAA05617@eric.cnri.reston.va.us> <14125.49911.982236.754340@amarok.cnri.reston.va.us> Message-ID: <005801be9588$7ad0fcc0$f29b12c2@pythonware.com> > Also, is there a user community for this code?
how about comp.lang.tcl ;-) From fredrik at pythonware.com Mon May 3 19:15:00 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 3 May 1999 19:15:00 +0200 Subject: [Python-Dev] Why Foo is better than Baz References: <000e01be94ff$4047ef20$0801a8c0@bobcat> <005b01be956d$66d48450$f29b12c2@pythonware.com> <14125.47524.196878.583460@amarok.cnri.reston.va.us> <199905031532.LAA05617@eric.cnri.reston.va.us> <14125.49911.982236.754340@amarok.cnri.reston.va.us> <199905031602.MAA05829@eric.cnri.reston.va.us> Message-ID: <005901be9588$7af59bc0$f29b12c2@pythonware.com> talking about regexps, here's another thing that would be quite nice to have in 1.6 (available from the Python level, that is). or is it already in there somewhere? ... http://www.dejanews.com/[ST_rn=qs]/getdoc.xp?AN=464362873 Tcl 8.1b3 Request: Generated by Scriptics' bug entry form at Submitted by: Frederic BONNET OperatingSystem: Windows 98 CustomShell: Applied patch to the regexp engine (the exec part) Synopsis: regexp improvements DesiredBehavior: As previously requested by Don Libes: > I see no way for Tcl_RegExpExec to indicate "could match" meaning > "could match if more characters arrive that were suitable for a > match". This is required for a class of applications involving > matching on a stream required by Expect's interact command. Henry > assured me that this facility would be in the engine (I'm not the only > one that needs it). Note that it is not sufficient to add one more > return value to Tcl_RegExpExec (i.e., 2) because one needs to know > both if something matches now and can match later. I recommend > another argument (canMatch *int) be added to Tcl_RegExpExec. /patch info follows/ ... From bwarsaw at cnri.reston.va.us Tue May 4 00:28:23 1999 From: bwarsaw at cnri.reston.va.us (Barry A. 
Warsaw) Date: Mon, 3 May 1999 18:28:23 -0400 (EDT) Subject: [Python-Dev] New mailing list: python-bugs-list Message-ID: <14126.8967.793734.892670@anthem.cnri.reston.va.us> I've been using Jitterbug for a couple of weeks now as my bug database for Mailman and JPython. So it was easy enough for me to set up a database for Python bug reports. Guido is in the process of tailoring the Jitterbug web interface to his liking and will announce it to the appropriate forums when he's ready. In the meantime, I've created YAML that you might be interested in. All bug reports entered into Jitterbug will be forwarded to python-bugs-list at python.org. You are invited to subscribe to the list by visiting http://www.python.org/mailman/listinfo/python-bugs-list Enjoy, -Barry From jeremy at cnri.reston.va.us Tue May 4 00:30:10 1999 From: jeremy at cnri.reston.va.us (Jeremy Hylton) Date: Mon, 3 May 1999 18:30:10 -0400 (EDT) Subject: [Python-Dev] New mailing list: python-bugs-list In-Reply-To: <14126.8967.793734.892670@anthem.cnri.reston.va.us> References: <14126.8967.793734.892670@anthem.cnri.reston.va.us> Message-ID: <14126.9061.558631.437892@bitdiddle.cnri.reston.va.us> Pretty low volume list, eh? From MHammond at skippinet.com.au Tue May 4 01:28:39 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Tue, 4 May 1999 09:28:39 +1000 Subject: [Python-Dev] New mailing list: python-bugs-list In-Reply-To: <14126.9061.558631.437892@bitdiddle.cnri.reston.va.us> Message-ID: <000701be95bc$ad0b45e0$0801a8c0@bobcat> ha - we wish. More likely to be full of detailed bug reports about how 1/2 != 0.5, or that "def foo(baz=[])" is buggy, etc :-) Mark. > Pretty low volume list, eh? 
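Both of Mark's example "bug reports" are documented behaviour, not bugs. The shared-mutable-default one is easy to demonstrate; this retrospective sketch uses modern Python 3 syntax (where, amusingly, 1/2 really is 0.5 nowadays, since / became true division):

```python
def foo(baz=[]):
    # The default list is evaluated once, at def time, and then shared
    # by every call that omits the argument.
    baz.append(1)
    return baz

print(foo())  # [1]
print(foo())  # [1, 1] -- same list object, growing across calls
```

The usual idiom is `def foo(baz=None): if baz is None: baz = []`, which makes a fresh list per call.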
From tim_one at email.msn.com Tue May 4 07:16:17 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 4 May 1999 01:16:17 -0400 Subject: [Python-Dev] Why Foo is better than Baz In-Reply-To: <199905031532.LAA05617@eric.cnri.reston.va.us> Message-ID: <000701be95ed$3d594180$dca22299@tim> [Guido & Andrew on Tcl's new regexp code] > I'm sure that if it's good code, we'll find a way. Perhaps a more > interesting question is whether it is Perl5 compatible. I contacted > Henry Spencer at the time and he was willing to let us use his code. Haven't looked at the code, but did read the manpage just now: http://www.scriptics.com/man/tcl8.1/TclCmd/regexp.htm WRT Perl5 compatibility, it sez: Incompatibilities of note include `\b', `\B', the lack of special treatment for a trailing newline, the addition of complemented bracket expressions to the things affected by newline-sensitive matching, the restrictions on parentheses and back references in lookahead constraints, and the longest/shortest-match (rather than first-match) matching semantics. So some gratuitous differences, and maybe a killer: Guido hasn't had much kind to say about "longest" (aka POSIX) matching semantics. An example from the page: (week|wee)(night|knights) matches all ten characters of `weeknights' which means it matched 'wee' and 'knights'; Python/Perl match 'week' and 'night'. It's the *natural* semantics if Andrew's suspicion that it's compiling a DFA is correct; indeed, it's a pain to get that behavior any other way! otoh-it's-potentially-very-much-faster-ly y'rs - tim From tim_one at email.msn.com Tue May 4 07:51:01 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 4 May 1999 01:51:01 -0400 Subject: [Python-Dev] Why Foo is better than Baz In-Reply-To: <000701be95ed$3d594180$dca22299@tim> Message-ID: <000901be95f2$195556c0$dca22299@tim> [Tim] > ... > It's the *natural* semantics if Andrew's suspicion that it's > compiling a DFA is correct ... 
More from the man page: AREs report the longest/shortest match for the RE, rather than the first found in a specified search order. This may affect some RREs which were written in the expectation that the first match would be reported. (The careful crafting of RREs to optimize the search order for fast matching is obsolete (AREs examine all possible matches in parallel, and their performance is largely insensitive to their complexity) but cases where the search order was exploited to deliberately find a match which was not the longest/shortest will need rewriting.) Nails it, yes? Now, in 10 seconds, try to remember a regexp where this really matters . Note in passing that IDLE's colorizer regexp *needs* to search for triple-quoted strings before single-quoted ones, else the P/P semantics would consider """ to be an empty single-quoted string followed by a double quote. This isn't a case where it matters in a bad way, though! The "longest" rule picks the correct alternative regardless of the order in which they're written. at-least-in-that-specific-regex<0.1-wink>-ly y'rs - tim From guido at CNRI.Reston.VA.US Tue May 4 14:26:04 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 04 May 1999 08:26:04 -0400 Subject: [Python-Dev] Why Foo is better than Baz In-Reply-To: Your message of "Tue, 04 May 1999 01:16:17 EDT." <000701be95ed$3d594180$dca22299@tim> References: <000701be95ed$3d594180$dca22299@tim> Message-ID: <199905041226.IAA07627@eric.cnri.reston.va.us> [Tim] > So some gratuitous differences, and maybe a killer: Guido hasn't had much > kind to say about "longest" (aka POSIX) matching semantics. > > An example from the page: > > (week|wee)(night|knights) > matches all ten characters of `weeknights' > > which means it matched 'wee' and 'knights'; Python/Perl match 'week' and > 'night'. > > It's the *natural* semantics if Andrew's suspicion that it's compiling a DFA > is correct; indeed, it's a pain to get that behavior any other way! 
Possibly contradicting what I once said about DFAs (I have no idea what I said any more :-): I think we shouldn't be hung up about the subtleties of DFA vs. NFA; for most people, the Perl-compatibility simply means that they can use the same metacharacters. My guess is that people don't so much translate long Perl regexp's to Python but simply transport their (always incomplete -- Larry Wall *wants* it that way :-) knowledge of Perl regexps to Python. My meta-guess is that this is also Henry Spencer's and John Ousterhout's guess. As for Larry Wall, I guess he really doesn't care :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From akuchlin at cnri.reston.va.us Tue May 4 18:14:41 1999 From: akuchlin at cnri.reston.va.us (Andrew M. Kuchling) Date: Tue, 4 May 1999 12:14:41 -0400 (EDT) Subject: [Python-Dev] Why Foo is better than Baz In-Reply-To: <199905041226.IAA07627@eric.cnri.reston.va.us> References: <000701be95ed$3d594180$dca22299@tim> <199905041226.IAA07627@eric.cnri.reston.va.us> Message-ID: <14127.6410.646122.342115@amarok.cnri.reston.va.us> Guido van Rossum writes: >Possibly contradicting what I once said about DFAs (I have no idea >what I said any more :-): I think we shouldn't be hung up about the >subtleties of DFA vs. NFA; for most people, the Perl-compatibility >simply means that they can use the same metacharacters. My guess is I don't like slipping in such a change to the semantics with no visible change to the module name or interface. On the other hand, if it's not NFA-based, then it can provide POSIX semantics without danger of taking exponential time to determine the longest match. BTW, there's an interesting reference, I assume to this code, in _Mastering Regular Expressions_; Spencer is quoted on page 121 as saying it's "at worst quadratic in text size." Anyway, we can let it slide until a Python interface gets written. -- A.M.
Kuchling http://starship.python.net/crew/amk/ In the black shadow of the Baba Yaga babies screamed and mothers miscarried; milk soured and men went mad. -- In SANDMAN #38: "The Hunt" From guido at CNRI.Reston.VA.US Tue May 4 18:19:06 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 04 May 1999 12:19:06 -0400 Subject: [Python-Dev] Why Foo is better than Baz In-Reply-To: Your message of "Tue, 04 May 1999 12:14:41 EDT." <14127.6410.646122.342115@amarok.cnri.reston.va.us> References: <000701be95ed$3d594180$dca22299@tim> <199905041226.IAA07627@eric.cnri.reston.va.us> <14127.6410.646122.342115@amarok.cnri.reston.va.us> Message-ID: <199905041619.MAA08408@eric.cnri.reston.va.us> > BTW, there's an interesting reference, I assume to this code, in > _Mastering Regular Expressions_; Spencer is quoted on page 121 as > saying it's "at worst quadratic in text size.". Not sure if that was the same code -- this is *new* code, not Spencer's old code. I think Friedl's book is older than the current code. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim_one at email.msn.com Wed May 5 07:37:02 1999 From: tim_one at email.msn.com (Tim Peters) Date: Wed, 5 May 1999 01:37:02 -0400 Subject: [Python-Dev] Tcl 8.1's regexp code (was RE: [Python-Dev] Why Foo is better than Baz) In-Reply-To: <199905041226.IAA07627@eric.cnri.reston.va.us> Message-ID: <000701be96b9$4e434460$799e2299@tim> I've consistently found that the best way to kill a thread is to rename it accurately. Agree w/ Guido that few people really care about the differing semantics. Agree w/ Andrew that it's bad to pull a semantic switcheroo at this stage anyway: code will definitely break. Like \b(?: (?P<special>and|if|else|...) | (?P<general>[a-zA-Z_]\w*) )\b The (special)|(general) idiom relies on left-to-right match-and-out searching of alternatives to do its job correctly. Not to mention that \b is not a word-boundary assertion in the new pkg (talk about pointlessly irritating differences!
at least this one could be easily hidden via brainless preprocessing). Over the long run, moving to a DFA locks Python out of the directions Perl is *moving*, namely embedding all sorts of runtime gimmicks in regexps that exploit knowing the "state of the match so far". DFAs don't work that way. I don't mind losing those possibilities, because I think the regexp sublanguage is strained beyond its limits already. But that's a decision with Big Consequences, so deserves some thought. I'd definitely like the (sometimes dramatically) increased speed a DFA can offer (btw, this code appears to use a lazily-generated DFA, to avoid the exponential *compile*-time a straightforward DFA implementation can suffer -- the code is very complex and lacks any high-level internal docs, so we better hope Henry stays in love with it <0.5 wink>). > ... > My guess is that people don't so much translate long Perl regexp's > to Python but simply transport their (always incomplete -- Larry Wall > *wants* it that way :-) knowledge of Perl regexps to Python. This is directly proportional to the number of feeble CGI programmers Python attracts. The good news is that they wouldn't know an NFA from a DFA if Larry bit Henry on the ass ... > My meta-guess is that this is also Henry Spencer's and John > Ousterhout's guess. I think Spencer strongly favors DFA semantics regardless of fashion, and Ousterhout is a pragmatist. So I trust JO's judgment more <0.9 wink>. > As for Larry Wall, I guess he really doesn't care :-) I expect he cares a lot! Because a DFA would prevent Perl from going even more insane in its present direction. About the age of the code, postings to comp.lang.tcl have Henry saying he was working on the alpha version intensely as recently as December ('98). A few complaints about the alpha release trickled in, about regexp compile speed and regexp matching speed in specific cases.
Perhaps paradoxically, the latter were about especially simple regexps with long fixed substrings (where this mountain of sophisticated machinery is likely to get beat cold by an NFA with some fixed-substring lookahead smarts -- which latter Henry intended to graft into this pkg too). [Andrew] > BTW, there's an interesting reference, I assume to this code, in > _Mastering Regular Expressions_; Spencer is quoted on page 121 as > saying it's "at worst quadratic in text size.". [Guido] > Not sure if that was the same code -- this is *new* code, not > Spencer's old code. I think Friedl's book is older than the current > code. I expect this is an invariant, though: it's not natural for a DFA to know where subexpression matches begin and end, and there's a pile of xxx_dissect functions in regexec.c that use what strongly appear to be worst-case quadratic-time algorithms for figuring that out after it's known that the overall expression has *a* match. Expect too, but don't know, that only pathological cases are actually expensive. Question: has this package been released in any other context, or is it unique to Tcl? I searched in vain for an announcement (let alone code) from Henry, or any discussion of this code outside the Tcl world. whatever-happens-i-vote-we-let-them-debug-it-ly y'rs - tim From gstein at lyra.org Wed May 5 08:22:20 1999 From: gstein at lyra.org (Greg Stein) Date: Tue, 4 May 1999 23:22:20 -0700 (PDT) Subject: [Python-Dev] Tcl 8.1's regexp code In-Reply-To: <000701be96b9$4e434460$799e2299@tim> Message-ID: On Wed, 5 May 1999, Tim Peters wrote: >... > Question: has this package been released in any other context, or is it > unique to Tcl? I searched in vain for an announcement (let alone code) from > Henry, or any discussion of this code outside the Tcl world. Apache uses it. However, the Apache guys have considered the possibility of updating the thing. I gather that they have a pretty old snapshot.
Another guy mentioned PCRE and I pointed out that Python uses it for its regex support. In other words, if Apache *does* update the code, then it may be that Apache will drop the HS engine in favor of PCRE. Cheers, -g -- Greg Stein, http://www.lyra.org/ From Ivan.Porres at abo.fi Wed May 5 10:29:21 1999 From: Ivan.Porres at abo.fi (Ivan Porres Paltor) Date: Wed, 05 May 1999 11:29:21 +0300 Subject: [Python-Dev] Python for Small Systems patch Message-ID: <37300161.8DFD1D7F@abo.fi> Python for Small Systems is a minimal version of the python interpreter, intended to run on small embedded systems with a limited amount of memory. Since there is some interest in the newsgroup, we have decided to release an alpha version of the patch. You can download the patch from the following page: http://www.abo.fi/~iporres/python There is no documentation about the changes, but I guess that it is not so difficult to figure out what Raul has been doing. There are some simple examples in the Demo/hitachi directory. The configure scripts are broken. We plan to modify the configure scripts for cross-compilation. We are still testing, cleaning and trying to reduce the memory requirements of the patched interpreter. We also plan to write some documentation.
Please send comments to Raul (rparra at abo.fi) or to me (iporres at abo.fi), Regards, Ivan -- Ivan Porres Paltor Turku Centre for Computer Science Åbo Akademi, Department of Computer Science Phone: +358-2-2154033 Lemminkäinengatan 14A FIN-20520 Turku - Finland http://www.abo.fi/~iporres From tismer at appliedbiometrics.com Wed May 5 13:52:24 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Wed, 05 May 1999 13:52:24 +0200 Subject: [Python-Dev] Python for Small Systems patch References: <37300161.8DFD1D7F@abo.fi> Message-ID: <373030F8.21B73451@appliedbiometrics.com> Ivan Porres Paltor wrote: > > Python for Small Systems is a minimal version of the python interpreter, > intended to run on small embedded systems with a limited amount of > memory. > > Since there is some interest in the newsgroup, we have decided to release > an alpha version of the patch. You can download the patch from the > following page: > > http://www.abo.fi/~iporres/python > > There is no documentation about the changes, but I guess that it is not > so difficult to figure out what Raul has been doing. Ivan, small Python is a very interesting thing, thanks for the preview. But, aren't 12600 lines of diff a little too much to call it "not difficult to figure out"? :-) The very last line was indeed helpful: +++ Pss/miniconfigure Tue Mar 16 16:59:42 1999 @@ -0,0 +1 @@ +./configure --prefix="/home/rparra/python/Python-1.5.1" --without-complex --without-float --without-long --without-file --without-libm --without-libc --without-fpectl --without-threads --without-dec-threads --with-libs= But I'd be interested in a brief list of which other features are out, and even more which structures were changed. Would that be possible? thanks - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break!
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From Ivan.Porres at abo.fi Wed May 5 15:17:17 1999 From: Ivan.Porres at abo.fi (Ivan Porres Paltor) Date: Wed, 05 May 1999 16:17:17 +0300 Subject: [Python-Dev] Python for Small Systems patch References: <37300161.8DFD1D7F@abo.fi> <373030F8.21B73451@appliedbiometrics.com> Message-ID: <373044DD.FE4499E@abo.fi> Christian Tismer wrote: > Ivan, > small Python is a very interesting thing, > thanks for the preview. > > But, aren't 12600 lines of diff a little too much > to call it "not difficult to figure out"? :-) Raul Parra (rpb), the author of the patch, got the "source scissors" (#ifndef WITHOUT... #endif) and cut the interpreter until it fitted in an embedded system with some RAM, no keyboard, no screen and no OS. An example application can be a printer where the print jobs are python bytecompiled scripts (instead of postscript). We plan to write some documentation about the patch. Meanwhile, here are some of the changes: WITHOUT_PARSER, WITHOUT_COMPILER Defining WITHOUT_PARSER removes the parser. This has a lot of implications (no eval() !) but saves a lot of memory. The interpreter can only execute byte-compiled scripts, that is PyCodeObjects. Most embedded processors have poor floating point capabilities. (They can not compete with DSP's): WITHOUT-COMPLEX Removes support for complex numbers WITHOUT-LONG Removes long numbers WITHOUT-FLOAT Removes floating point numbers Dependencies with the OS: WITHOUT-FILE Removes file objects. No file, no print, no input, no interactive prompt. This is not too bad in a device without hard disk, keyboard or screen...
WITHOUT-GETPATH Removes dependencies with os path. (Probably this change should be integrated with WITHOUT-FILE) These changes render most of the standard modules unusable. There are no fundamental changes on the interpreter, just cut and cut.... Ivan -- Ivan Porres Paltor Turku Centre for Computer Science Åbo Akademi, Department of Computer Science Phone: +358-2-2154033 Lemminkäinengatan 14A FIN-20520 Turku - Finland http://www.abo.fi/~iporres From tismer at appliedbiometrics.com Wed May 5 15:31:05 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Wed, 05 May 1999 15:31:05 +0200 Subject: [Python-Dev] Python for Small Systems patch References: <37300161.8DFD1D7F@abo.fi> <373030F8.21B73451@appliedbiometrics.com> <373044DD.FE4499E@abo.fi> Message-ID: <37304819.AD636B67@appliedbiometrics.com> Ivan Porres Paltor wrote: > > Christian Tismer wrote: > > Ivan, > > small Python is a very interesting thing, > > thanks for the preview. > > > > But, aren't 12600 lines of diff a little too much > > to call it "not difficult to figure out"? :-) > > Raul Parra (rpb), the author of the patch, got the "source scissors" > (#ifndef WITHOUT... #endif) and cut the interpreter until it fitted in an > embedded system with some RAM, no keyboard, no screen and no OS. An > example application can be a printer where the print jobs are python > bytecompiled scripts (instead of postscript). > > We plan to write some documentation about the patch. Meanwhile, here are > some of the changes: Many thanks, this is really interesting. > These changes render most of the standard modules unusable. > There are no fundamental changes on the interpreter, just cut and cut.... I see. A last thing which I'm curious about is the executable size. If this can be compared to a Windows dll at all. Did you compile without the changes for your target as well? How is the ratio? The python15.dll file contains everything of core Python and is about 560 KB large.
If your engine goes down to, say below 200 KB, this could be a great thing for embedding Python into other apps. ciao & thanks - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From bwarsaw at cnri.reston.va.us Wed May 5 16:55:40 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Wed, 5 May 1999 10:55:40 -0400 (EDT) Subject: [Python-Dev] Tcl 8.1's regexp code (was RE: [Python-Dev] Why Foo is better than Baz) References: <199905041226.IAA07627@eric.cnri.reston.va.us> <000701be96b9$4e434460$799e2299@tim> Message-ID: <14128.23532.499380.835737@anthem.cnri.reston.va.us> >>>>> "TP" == Tim Peters writes: TP> Over the long run, moving to a DFA locks Python out of the TP> directions Perl is *moving*, namely embedding all sorts of TP> runtime gimmicks in regexps that exploit knowing the "state of TP> the match so far". DFAs don't work that way. I don't mind TP> losing those possibilities, because I think the regexp TP> sublanguage is strained beyond its limits already. But that's TP> a decision with Big Consequences, so deserves some thought. I know zip about the internals of the various regexp package. But as far as the Python level interface, would it be feasible to support both as underlying regexp engines underneath re.py? The idea would be that you'd add an extra flag (re.PERL / re.TCL ? re.DFA / re.NFA ? re.POSIX / re.USEFUL ? :-) that would select the engine and compiler. Then all the rest of the magic happens behind the scenes, with appropriate exceptions thrown if there are syntax mismatches in the regexp that can't be worked around by preprocessors, etc. Or would that be more confusing than yet another different regexp module? 
-Barry From tim_one at email.msn.com Wed May 5 17:55:20 1999 From: tim_one at email.msn.com (Tim Peters) Date: Wed, 5 May 1999 11:55:20 -0400 Subject: [Python-Dev] Tcl 8.1's regexp code In-Reply-To: Message-ID: <000601be970f$adef5740$a59e2299@tim> [Tim] > Question: has this package [Tcl's 8.1 regexp support] been released in > any other context, or is it unique to Tcl? I searched in vain for an > announcement (let alone code) from Henry, or any discussion of this code > outside the Tcl world. [Greg Stein] > Apache uses it. > > However, the Apache guys have considered possibility updating the thing. I > gather that they have a pretty old snapshot. Another guy mentioned PCRE > and I pointed out that Python uses it for its regex support. In other > words, if Apache *does* update the code, then it may be that Apache will > drop the HS engine in favor of PCRE. Hmm. I just downloaded the Apache 1.3.4 source to check on this, and it appears to be using a lightly massaged version of Spencer's old (circa '92-'94) just-POSIX regexp package. Henry has been distributing regexp pkgs for a loooong time . The Tcl 8.1 regexp pkg is much hairier. If the Apache folk want to switch in order to get the Perl regexp syntax extensions, this Tcl version is worth looking at too. If they want to switch for some other reason, it would be good to know what that is! The base pkg Apache uses is easily available all over the web; the pkg Tcl 8.1 is using I haven't found anywhere except in the Tcl download (which is why I'm wondering about it -- so far, it doesn't appear to be distributed by Spencer himself, in a non-Tcl-customized form). 
looks-like-an-entirely-new-pkg-to-me-ly y'rs - tim From beazley at cs.uchicago.edu Wed May 5 18:54:45 1999 From: beazley at cs.uchicago.edu (David Beazley) Date: Wed, 5 May 1999 11:54:45 -0500 (CDT) Subject: [Python-Dev] My (possibly delusional) book project Message-ID: <199905051654.LAA11410@tartarus.cs.uchicago.edu> Although this is a little off-topic for the developer list, I want to fill people in on a new Python book project. A few months ago, I was approached about doing a new Python reference book and I've since decided to proceed with the project (after all, an increased presence at the bookstore is probably a good thing :-). In any event, my "vision" for this book is to take the material in the Python tutorial, language reference, library reference, and extension guide and squeeze it into a compact book no longer than 300 pages (and hopefully without having to use a 4-point font). Actually, what I'm really trying to do is write something in a style similar to the K&R C Programming book (very terse, straight to the point, and technically accurate). The book's target audience is experienced/expert programmers. With this said, I would really like to get feedback from the developer community about this project in a few areas. First, I want to make sure the language reference is in sync with the latest version of Python, that it is as accurate as possible, and that it doesn't leave out any important topics or recent developments. Second, I would be interested in knowing how to emphasize certain topics (for instance, should I emphasize class-based exceptions over string-based exceptions even though most books only cover the former case?). The other big area is the library reference. Given the size of the library, I'm going to cut a number of modules out. However, the choice of what to cut is not entirely clear (for now, it's a judgment call on my part). 
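On the exceptions question, one argument for leading with the class-based style is that it subsumes the string-based style (a string exception was just a name bound to a string, raised and caught by object identity) while also carrying data and allowing catch-by-superclass. A minimal sketch in present-day syntax — the exception name here is made up for illustration:

```python
# Class-based exceptions can carry context and be caught via a base
# class; string exceptions (e.g. ``MyError = 'my error'``) could not.
class ArchiveError(Exception):
    pass  # hypothetical name, for illustration only

def fetch_message(number):
    if number < 0:
        # Attach detail to the exception instance itself.
        raise ArchiveError("no such message: %d" % number)
    return "message %d" % number

try:
    fetch_message(-1)
except ArchiveError as exc:   # also catchable as plain Exception
    caught = str(exc)
print(caught)   # -> no such message: -1
```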
All of the work in progress for this project is online at: http://rustler.cs.uchicago.edu/~beazley/essential/reference.html I would love to get constructive feedback about this from other developers. Of course, I'll keep people posted in any case. Cheers, Dave From tim_one at email.msn.com Thu May 6 07:43:16 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 6 May 1999 01:43:16 -0400 Subject: [Python-Dev] Tcl 8.1's regexp code (was RE: [Python-Dev] Why Foo is better than Baz) In-Reply-To: <14128.23532.499380.835737@anthem.cnri.reston.va.us> Message-ID: <000d01be9783$57543940$2ca22299@tim> [Tim notes that moving to a DFA regexp engine would rule out some future aping of Perl mistakes ] [Barry "The Great Compromiser" Warsaw] > I know zip about the internals of the various regexp package. But as > far as the Python level interface, would it be feasible to support > both as underlying regexp engines underneath re.py? The idea would be > that you'd add an extra flag (re.PERL / re.TCL ? re.DFA / re.NFA ? > re.POSIX / re.USEFUL ? :-) that would select the engine and compiler. > Then all the rest of the magic happens behind the scenes, with > appropriate exceptions thrown if there are syntax mismatches in the > regexp that can't be worked around by preprocessors, etc. > > Or would that be more confusing than yet another different regexp > module? It depends some on what percentage of the Python distribution Guido wants to devote to regexp code <0.6 wink>; the Tcl pkg would be the largest block of code in Modules/, where regexp packages already consume more than anything else. It's a lot of delicate, difficult code. Someone would need to step up and champion each alternative package. I haven't asked Andrew lately, but I'd bet half a buck the thrill of supporting pcre has waned. If there were competing packages, your suggested interface is fine. 
I just doubt the Python developers will support more than one (Andrew may still be young, but he can't possibly still be naive enough to sign up for two of these nightmares). i'm-so-old-i-never-signed-up-for-one-ly y'rs - tim From rushing at nightmare.com Thu May 13 08:34:19 1999 From: rushing at nightmare.com (rushing at nightmare.com) Date: Wed, 12 May 1999 23:34:19 -0700 (PDT) Subject: [Python-Dev] 'stackless' python? In-Reply-To: <199905070507.BAA22545@python.org> References: <199905070507.BAA22545@python.org> Message-ID: <14138.28243.553816.166686@seattle.nightmare.com> [list has been quiet, thought I'd liven things up a bit. 8^)] I'm not sure if this has been brought up before in other forums, but has there been discussion of separating the Python and C invocation stacks, (i.e., removing recursive calls to the interpreter) to facilitate coroutines or first-class continuations? One of the biggest barriers to getting others to use asyncore/medusa is the need to program in continuation-passing-style (callbacks, callbacks to callbacks, state machines, etc...). Usually there has to be an overriding requirement for speed/scalability before someone will even look into it. And even when you do 'get' it, there are limits to how inside-out your thinking can go. 8^) If Python had coroutines/continuations, it would be possible to hide asyncore-style select()/poll() machinery 'behind the scenes'. I believe that Concurrent ML does exactly this... Other advantages might be restartable exceptions, different threading models, etc... -Sam rushing at nightmare.com rushing at eGroups.net From mal at lemburg.com Thu May 13 10:23:13 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Thu, 13 May 1999 10:23:13 +0200 Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> Message-ID: <373A8BF1.AE124BF@lemburg.com> rushing at nightmare.com wrote: > > [list has been quiet, thought I'd liven things up a bit. 8^)] Well, there certainly is enough on the todo list... it's probably the usual "ain't got no time" thing. > I'm not sure if this has been brought up before in other forums, but > has there been discussion of separating the Python and C invocation > stacks, (i.e., removing recursive calls to the interpreter) to > facilitate coroutines or first-class continuations? Wouldn't it be possible to move all the C variables passed to eval_code() via the execution frame ? AFAIK, the frame is generated on every call to eval_code() and thus could also be generated *before* calling it. > One of the biggest barriers to getting others to use asyncore/medusa > is the need to program in continuation-passing-style (callbacks, > callbacks to callbacks, state machines, etc...). Usually there has to > be an overriding requirement for speed/scalability before someone will > even look into it. And even when you do 'get' it, there are limits to > how inside-out your thinking can go. 8^) > > If Python had coroutines/continuations, it would be possible to hide > asyncore-style select()/poll() machinery 'behind the scenes'. I > believe that Concurrent ML does exactly this... > > Other advantages might be restartable exceptions, different threading > models, etc... Don't know if moving the C stack stuff into the frame objects will get you the desired effect: what about other things having state (e.g. connections or files), that are not even touched by this mechanism ?
-- Marc-Andre Lemburg ______________________________________________________________________ Y2000: Y2000: 232 days left Business: http://www.lemburg.com/ Python Pages: http://starship.python.net/crew/lemburg/ From rushing at nightmare.com Thu May 13 11:40:19 1999 From: rushing at nightmare.com (rushing at nightmare.com) Date: Thu, 13 May 1999 02:40:19 -0700 (PDT) Subject: [Python-Dev] 'stackless' python? In-Reply-To: <373A8BF1.AE124BF@lemburg.com> References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <373A8BF1.AE124BF@lemburg.com> Message-ID: <14138.38550.89759.752058@seattle.nightmare.com> M.-A. Lemburg writes: > Wouldn't it be possible to move all the C variables passed to > eval_code() via the execution frame ? AFAIK, the frame is > generated on every call to eval_code() and thus could also > be generated *before* calling it. I think this solves half of the problem. The C stack is both a value stack and an execution stack (i.e., it holds variables and return addresses). Getting rid of arguments (and a return value!) gets rid of the need for the 'value stack' aspect. In aiming for an enter-once, exit-once VM, the thorniest part is to somehow allow python->c->python calls. The second invocation could never save a continuation because its execution context includes a C frame. This is a general problem, not specific to Python; I probably should have thought about it a bit before posting... > Don't know if moving the C stack stuff into the frame objects > will get you the desired effect: what about other things having > state (e.g. connections or files), that are not even touched > by this mechanism ? I don't think either of those cause 'real' problems (i.e., nothing should crash that assumes an open file or socket), but there may be other stateful things that might. I don't think that refcounts would be a problem - a saved continuation wouldn't be all that different from an exception traceback. -Sam p.s. 
Here's a tiny VM experiment I wrote a while back, to explain what I mean by 'stackless': http://www.nightmare.com/stuff/machine.h http://www.nightmare.com/stuff/machine.c Note how OP_INVOKE (the PROC_CLOSURE clause) pushes new context onto heap-allocated data structures rather than calling the VM recursively. From skip at mojam.com Thu May 13 13:38:39 1999 From: skip at mojam.com (Skip Montanaro) Date: Thu, 13 May 1999 07:38:39 -0400 (EDT) Subject: [Python-Dev] 'stackless' python? In-Reply-To: <14138.28243.553816.166686@seattle.nightmare.com> References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> Message-ID: <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> Sam> I'm not sure if this has been brought up before in other forums, Sam> but has there been discussion of separating the Python and C Sam> invocation stacks, (i.e., removing recursive calls to the Sam> interpreter) to facilitate coroutines or first-class continuations? I thought Guido was working on that for the mobile agent stuff he was working on at CNRI. Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/ skip at mojam.com | Musi-Cal: http://www.musi-cal.com/ 518-372-5583 From bwarsaw at cnri.reston.va.us Thu May 13 17:10:52 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Thu, 13 May 1999 11:10:52 -0400 (EDT) Subject: [Python-Dev] 'stackless' python? References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> Message-ID: <14138.60284.584739.711112@anthem.cnri.reston.va.us> >>>>> "SM" == Skip Montanaro writes: SM> I thought Guido was working on that for the mobile agent stuff SM> he was working on at CNRI. Nope, we decided that we could accomplish everything we needed without this.
We occasionally revisit this but Guido keeps insisting it's a lot of work for not enough benefit :-) -Barry From guido at CNRI.Reston.VA.US Thu May 13 17:19:10 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 13 May 1999 11:19:10 -0400 Subject: [Python-Dev] 'stackless' python? In-Reply-To: Your message of "Thu, 13 May 1999 07:38:39 EDT." <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> Message-ID: <199905131519.LAA01097@eric.cnri.reston.va.us> Interesting topic! While I'm on the road, a few short notes. > I thought Guido was working on that for the mobile agent stuff he was > working on at CNRI. Indeed. At least I planned on working on it. I ended up abandoning the idea because I expected it would be a lot of work and I never had the time (same old story indeed). Sam also hit the nail on the head: the hardest problem is what to do about all the places where C calls back into Python. I've come up with two partial solutions: (1) allow for a way to arrange for a call to be made immediately after you return to the VM from C; this would take care of apply() at least and a few other "tail-recursive" cases; (2) invoke a new VM when C code needs a Python result, requiring it to return. The latter clearly breaks certain uses of coroutines but could probably be made to work most of the time. Typical use of the 80-20 rule. And I've just come up with a third solution: a variation on (1) where you arrange *two* calls: one to Python and then one to C, with the result of the first. (And a bit saying whether you want the C call to be made even when an exception happened.) In general, I still think it's a cool idea, but I also still think that continuations are too complicated for most programmers. (This comes from the realization that they are too complicated for me!)
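Solution (1) above — arranging for the call to be made after returning to the VM, instead of nesting interpreter invocations — is in spirit a trampoline. A toy Python-level sketch of that control transfer; the thunk convention is illustrative only, not the actual VM mechanism:

```python
# A trampoline keeps the call depth constant: instead of recursing, a
# callee hands back a zero-argument callable meaning "call this next",
# and the top-level loop makes that call itself.
def trampoline(thunk):
    result = thunk()
    while callable(result):     # a callable result requests another call
        result = result()
    return result               # a plain value ends the loop

def countdown(n):
    if n == 0:
        return 0                        # final (non-callable) result
    return lambda: countdown(n - 1)     # next step, no nested call

# 50000 logical "calls" without growing the C stack.
print(trampoline(lambda: countdown(50000)))   # -> 0
```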
Corollary: even if we had continuations, I'm not sure if this would take away the resistance against asyncore/asynchat. Of course I could be wrong. Different suggestion: it would be cool to work on completely separating out the VM from the rest of Python, through some kind of C-level API specification. Two things should be possible with this new architecture: (1) small platform ports could cut out the interactive interpreter, the parser and compiler, and certain data types such as long, complex and files; (2) there could be alternative pluggable VMs with certain desirable properties such as platform-specific optimization (Christian, are you listening? :-). I think the most challenging part might be defining an API for passing in the set of supported object types and operations. E.g. the EXEC_STMT opcode needs to be implemented in a way that allows "exec" to be absent from the language. Perhaps an __exec__ function (analogous to __import__) is the way to go. The set of built-in functions should also be passed in, so that e.g. one can easily leave out open(), eval() and compile(), complex(), long(), float(), etc. I think it would be ideal if no #ifdefs were needed to remove features (at least not in the VM code proper). Fortunately, the VM doesn't really know about many object types -- frames, functions, methods, classes, ints, strings, dictionaries, tuples, tracebacks, that may be all it knows. (Lists?) Gotta run, --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik at pythonware.com Thu May 13 21:50:44 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu, 13 May 1999 21:50:44 +0200 Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <199905131519.LAA01097@eric.cnri.reston.va.us> Message-ID: <01d501be9d79$e4890060$f29b12c2@pythonware.com> > In general, I still think it's a cool idea, but I also still think > that continuations are too complicated for most programmers. (This > comes from the realization that they are too complicated for me!) in an earlier life, I used non-preemptive threads (that is, explicit yields) and co-routines to do some really cool stuff with very little code. looks like a stack-less interpreter would make it trivial to implement that. might just be nostalgia, but I think I would give an arm or two to get that (not necessarily my own, though ;-) From rushing at nightmare.com Fri May 14 04:00:09 1999 From: rushing at nightmare.com (rushing at nightmare.com) Date: Thu, 13 May 1999 19:00:09 -0700 (PDT) Subject: [Python-Dev] 'stackless' python? References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> Message-ID: <14139.30970.644343.612721@seattle.nightmare.com>
I think just about any Scheme implementation has to solve this same problem... I'll dig through my collection of them for ideas. > In general, I still think it's a cool idea, but I also still think > that continuations are too complicated for most programmers. (This > comes from the realization that they are too complicated for me!) > Corollary: even if we had continuations, I'm not sure if this would > take away the resistance against asyncore/asynchat. Of course I could > be wrong. Theoretically, you could have a bit of code that looked just like 'normal' imperative code, that would actually be entering and exiting the context for non-blocking i/o. If it were done right, the same exact code might even run under 'normal' threads. Recently I've written an async server that needed to talk to several other RPC servers, and a mysql server. Pseudo-example, with possibly-async calls in UPPERCASE: auth, archive = db.FETCH_USER_INFO (user) if verify_login(user,auth): rpc_server = self.archive_servers[archive] group_info = rpc_server.FETCH_GROUP_INFO (group) if valid (group_info): return rpc_server.FETCH_MESSAGE (message_number) else: ... else: ... This code in CPS is a horrible, complicated mess; it takes something like 8 callback methods, and variables and exceptions have to be passed around in 'continuation' objects. It's hairy because there are three levels of callback state. Ugh. If Python had closures, then it would be a *little* easier, but would still make the average Pythoneer swoon. Closures would let you put the above logic all in one method, but the code would still be 'inside-out'. > Different suggestion: it would be cool to work on completely > separating out the VM from the rest of Python, through some kind of > C-level API specification. I think this is a great idea. I've been staring at python bytecodes a bit lately thinking about how to do something like this, for some subset of Python. [...] Ok, we've all seen the 'stick'.
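To make the 'inside-out' complaint concrete, here is a toy, self-contained rendering of the callback style Sam is describing. All names are invented, and a canned queue stands in for the select() loop a real server would use:

```python
import collections

# Toy event loop: an "async" call just queues its callback with a
# canned result, standing in for select()-driven I/O completion.
pending = collections.deque()

def async_call(result, callback):
    pending.append((callback, result))

def run():
    while pending:
        callback, result = pending.popleft()
        callback(result)

# CPS version of the fragment: each step is a separate callback, and
# intermediate state (user, archive, ...) must be carried by hand.
class FetchMessage:
    def __init__(self, user, group, msgno, done):
        self.user, self.group, self.msgno, self.done = user, group, msgno, done
        async_call(("secret", "arch7"), self.got_user_info)

    def got_user_info(self, result):
        auth, archive = result
        if auth != "secret":          # stands in for verify_login()
            return self.done(None)
        self.archive = archive
        async_call({"name": self.group}, self.got_group_info)

    def got_group_info(self, group_info):
        if not group_info:
            return self.done(None)
        async_call("message %d from %s" % (self.msgno, self.archive),
                   self.done)

results = []
FetchMessage("sam", "py-dev", 42, results.append)
run()
print(results)   # -> ['message 42 from arch7']
```

Even this trimmed version needs three callbacks plus an object to lug state between them; the straight-line pseudo-code above needs none.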
I guess I should give an example of the 'carrot': I think that a web server built on such a Python could have the performance/scalability of thttpd, with the ease-of-programming of Roxen. As far as I know, there's nothing like it out there. Medusa would be put out to pasture. 8^) -Sam From guido at CNRI.Reston.VA.US Fri May 14 14:03:31 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 14 May 1999 08:03:31 -0400 Subject: [Python-Dev] 'stackless' python? In-Reply-To: Your message of "Thu, 13 May 1999 19:00:09 PDT." <14139.30970.644343.612721@seattle.nightmare.com> References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com> Message-ID: <199905141203.IAA01808@eric.cnri.reston.va.us> > I know this is disgusting, but could setjmp/longjmp 'automagically' > force a 'recursive call' to jump back into the top-level loop? This > would put some serious restraint on what C called from Python could > do... Forget about it. setjmp/longjmp are invitations to problems. I also assume that they would interfere badly with C++. > I think just about any Scheme implementation has to solve this same > problem... I'll dig through my collection of them for ideas. Anything that assumes knowledge about how the C compiler and/or the CPU and OS lay out the stack is a no-no, because it means that the first thing one has to do for a port to a new architecture is figure out how the stack is laid out. Another thread in this list is porting Python to microplatforms like PalmOS. Typically the Scheme hackers are not afraid to delve deep into the machine, but I refuse to do that -- I think it's too risky. > > In general, I still think it's a cool idea, but I also still think > > that continuations are too complicated for most programmers.
(This > > comes from the realization that they are too complicated for me!) > > Corollary: even if we had continuations, I'm not sure if this would > > take away the resistance against asyncore/asynchat. Of course I could > > be wrong. > > Theoretically, you could have a bit of code that looked just like > 'normal' imperative code, that would actually be entering and exiting > the context for non-blocking i/o. If it were done right, the same > exact code might even run under 'normal' threads. Yes -- I remember in '92 or '93 I worked out a way to emulate coroutines with regular threads. (I think in cooperation with Steve Majewski.) > Recently I've written an async server that needed to talk to several > other RPC servers, and a mysql server. Pseudo-example, with > possibly-async calls in UPPERCASE: > > auth, archive = db.FETCH_USER_INFO (user) > if verify_login(user,auth): > rpc_server = self.archive_servers[archive] > group_info = rpc_server.FETCH_GROUP_INFO (group) > if valid (group_info): > return rpc_server.FETCH_MESSAGE (message_number) > else: > ... > else: > ... > > This code in CPS is a horrible, complicated mess, it takes something > like 8 callback methods, variables and exceptions have to be passed > around in 'continuation' objects. It's hairy because there are three > levels of callback state. Ugh. Agreed. > If Python had closures, then it would be a *little* easier, but would > still make the average Pythoneer swoon. Closures would let you put > the above logic all in one method, but the code would still be > 'inside-out'. I forget how this worked :-( > > Different suggestion: it would be cool to work on completely > > separating out the VM from the rest of Python, through some kind of > > C-level API specification. > > I think this is a great idea. I've been staring at python bytecodes a > bit lately thinking about how to do something like this, for some > subset of Python. > > [...] > > Ok, we've all seen the 'stick'.
I guess I should give an example of > the 'carrot': I think that a web server built on such a Python could > have the performance/scalability of thttpd, with the > ease-of-programming of Roxen. As far as I know, there's nothing like > it out there. Medusa would be put out to pasture. 8^) I'm afraid I haven't kept up -- what are Roxen and thttpd? What do they do that Apache doesn't? --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik at pythonware.com Fri May 14 15:16:13 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 14 May 1999 15:16:13 +0200 Subject: [Python-Dev] 'stackless' python? References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us> Message-ID: <001701be9e0b$f1bc4930$f29b12c2@pythonware.com> > I'm afraid I haven't kept up -- what are Roxen and thttpd? What do > they do that Apache doesn't? http://www.roxen.com/ a lean and mean secure web server written in Pike (http://pike.idonex.se/), from a company here in Linköping. From tismer at appliedbiometrics.com Fri May 14 17:15:20 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Fri, 14 May 1999 17:15:20 +0200 Subject: [Python-Dev] 'stackless' python? References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us> Message-ID: <373C3E08.FCCB141B@appliedbiometrics.com> Guido van Rossum wrote: [setjmp/longjmp -no-no] > Forget about it. setjmp/longjmp are invitations to problems. I also > assume that they would interfere badly with C++.
> > > I think just about any Scheme implementation has to solve this same > > problem... I'll dig through my collection of them for ideas. > > Anything that assumes knowledge about how the C compiler and/or the > CPU and OS lay out the stack is a no-no, because it means that the > first thing one has to do for a port to a new architecture is figure > out how the stack is laid out. Another thread in this list is porting > Python to microplatforms like PalmOS. Typically the Scheme hackers > are not afraid to delve deep into the machine, but I refuse to do that > -- I think it's too risky. ... I agree that this is generally bad, although it's a cakewalk to do a stack swap for the few (x86-based :) platforms I work with. This is much less than a thread change. But on the general issues: Can the Python-calls-C and C-calls-Python problem just be solved by turning the whole VM state into a data structure, including a Python call stack which is independent? Maybe this has been mentioned already. This might give a little slowdown, but opens possibilities like continuation-passing style, and context switches between different interpreter states would be under direct control. Just a little dreaming: Not using threads, but just tiny interpreter incarnations with local state, and a special C call or better a new opcode which activates the next state in some list (of course a Python list). This would automagically produce ICON iterators (duck) and coroutines (cover). If I guess right, continuation passing could be done by just shifting tiny tuples around. Well, Tim, help me :-) [closures] > > I think this is a great idea. I've been staring at python bytecodes a > > bit lately thinking about how to do something like this, for some > > subset of Python. Lumberjack? How is it going? [to Sam] ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break!
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From bwarsaw at cnri.reston.va.us Fri May 14 17:32:51 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Fri, 14 May 1999 11:32:51 -0400 (EDT) Subject: [Python-Dev] 'stackless' python? References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us> <001701be9e0b$f1bc4930$f29b12c2@pythonware.com> Message-ID: <14140.16931.987089.887772@anthem.cnri.reston.va.us> >>>>> "FL" == Fredrik Lundh writes: FL> a lean and mean secure web server written in Pike FL> (http://pike.idonex.se/), from a company here in FL> Linköping. Interesting off-topic Pike connection. My co-maintainer for CC-Mode originally came on board to add support for Pike, which has a syntax similar enough to C to be easily integrated. I think I've had as much success convincing him to use Python as he's had convincing me to use Pike :-) -Barry From gstein at lyra.org Fri May 14 23:54:02 1999 From: gstein at lyra.org (Greg Stein) Date: Fri, 14 May 1999 14:54:02 -0700 Subject: [Python-Dev] Roxen (was Re: [Python-Dev] 'stackless' python?) References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us> <001701be9e0b$f1bc4930$f29b12c2@pythonware.com> <14140.16931.987089.887772@anthem.cnri.reston.va.us> Message-ID: <373C9B7A.3676A910@lyra.org> Barry A.
Warsaw wrote: > > >>>>> "FL" == Fredrik Lundh writes: > > FL> a lean and mean secure web server written in Pike > FL> (http://pike.idonex.se/), from a company here in > FL> Linköping. > > Interesting off-topic Pike connection. My co-maintainer for CC-Mode > originally came on board to add support for Pike, which has a syntax similar > enough to C to be easily integrated. I think I've had as much success > convincing him to use Python as he's had convincing me to use Pike :-) Heh. Pike is an outgrowth of the MUD world's LPC programming language. A guy named "Profezzorn" started a project (in '94?) to redevelop an LPC compiler/interpreter ("driver") from scratch to avoid some licensing constraints. The project grew into a generalized network handler, since MUDs' typical designs are excellent for these tasks. From there, you get the Roxen web server. Cheers, -g -- Greg Stein, http://www.lyra.org/ From rushing at nightmare.com Sat May 15 01:36:11 1999 From: rushing at nightmare.com (rushing at nightmare.com) Date: Fri, 14 May 1999 16:36:11 -0700 (PDT) Subject: [Python-Dev] 'stackless' python? In-Reply-To: <199905141203.IAA01808@eric.cnri.reston.va.us> References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us> Message-ID: <14140.44469.848840.740112@seattle.nightmare.com> Guido van Rossum writes: > > If Python had closures, then it would be a *little* easier, but would > > still make the average Pythoneer swoon. Closures would let you put > > the above logic all in one method, but the code would still be > > 'inside-out'. > > I forget how this worked :-( [with a faked-up lambda-ish syntax] def thing (a): return do_async_job_1 (a, lambda (b): if (a>1): do_async_job_2a (b, lambda (c): [...] ) else: do_async_job_2b (a,b, lambda (d,e,f): [...]
) ) The call to do_async_job_1 passes 'a', and a callback, which is specified 'in-line'. You can follow the logic of something like this more easily than if each lambda is spun off into a different function/method. > > I think that a web server built on such a Python could have the > > performance/scalability of thttpd, with the ease-of-programming > > of Roxen. As far as I know, there's nothing like it out there. > > Medusa would be put out to pasture. 8^) > > I'm afraid I haven't kept up -- what are Roxen and thttpd? What do > they do that Apache doesn't? thttpd (& Zeus, Squid, Xitami) use select()/poll() to gain performance and scalability, but suffer from the same programmability problem as Medusa (only worse, 'cause they're in C). Roxen is written in Pike, a c-like language with gc, threads, etc... Roxen is I think now the official 'GNU Web Server'. Here's an interesting web-server comparison chart: http://www.acme.com/software/thttpd/benchmarks.html -Sam From guido at CNRI.Reston.VA.US Sat May 15 04:23:24 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 14 May 1999 22:23:24 -0400 Subject: [Python-Dev] 'stackless' python? In-Reply-To: Your message of "Fri, 14 May 1999 16:36:11 PDT." <14140.44469.848840.740112@seattle.nightmare.com> References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us> <14140.44469.848840.740112@seattle.nightmare.com> Message-ID: <199905150223.WAA02457@eric.cnri.reston.va.us> > def thing (a): > return do_async_job_1 (a, > lambda (b): > if (a>1): > do_async_job_2a (b, > lambda (c): > [...] > ) > else: > do_async_job_2b (a,b, > lambda (d,e,f): > [...] > ) > ) > > The call to do_async_job_1 passes 'a', and a callback, which is > specified 'in-line'. 
You can follow the logic of something like this > more easily than if each lambda is spun off into a different > function/method. I agree that it is still ugly. > http://www.acme.com/software/thttpd/benchmarks.html I see. Any pointers to a graph of thttpd market share? --Guido van Rossum (home page: http://www.python.org/~guido/) From tim_one at email.msn.com Sat May 15 09:51:00 1999 From: tim_one at email.msn.com (Tim Peters) Date: Sat, 15 May 1999 03:51:00 -0400 Subject: [Python-Dev] 'stackless' python? In-Reply-To: <199905141203.IAA01808@eric.cnri.reston.va.us> Message-ID: <000701be9ea7$acab7f40$159e2299@tim> [GvR] > ... > Anything that assumes knowledge about how the C compiler and/or the > CPU and OS lay out the stack is a no-no, because it means that the > first thing one has to do for a port to a new architecture is figure > out how the stack is laid out. Another thread in this list is porting > Python to microplatforms like PalmOS. Typically the Scheme hackers > are not afraid to delve deep into the machine, but I refuse to do that > -- I think it's too risky. The Icon language needs a bit of platform-specific context-switching assembly code to support its full coroutine features, although its bread-and-butter generators ("semi-coroutines") don't need anything special. The result is that Icon ports sometimes limp for a year before they support full coroutines, waiting for someone wizardly enough to write the necessary code. This can, in fact, be quite difficult; e.g., on machines with HW register windows (where "the stack" can be a complicated beast half buried in hidden machine state, sometimes needing kernel privilege to uncover). Not attractive. Generators are, though. threads-too-ly y'rs - tim From tim_one at email.msn.com Sat May 15 09:51:03 1999 From: tim_one at email.msn.com (Tim Peters) Date: Sat, 15 May 1999 03:51:03 -0400 Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <373C3E08.FCCB141B@appliedbiometrics.com> Message-ID: <000801be9ea7$ae45f560$159e2299@tim> [Christian Tismer] > ... > But on the general issues: > Can the Python-calls-C and C-calls-Python problem just be solved > by turning the whole VM state into a data structure, including > a Python call stack which is independent? Maybe this has been > mentioned already. The problem is that when C calls Python, any notion of continuation has to include C's state too, else resuming the continuation won't return into C correctly. The C code that *implements* Python could be reworked to support this, but in the general case you've got some external C extension module calling into Python, and then Python hasn't a clue about its caller's state. I'm not a fan of continuations myself; coroutines can be implemented faithfully via threads (I posted a rather complete set of Python classes for that in the pre-DejaNews days, a bit more flexible than Icon's coroutines); and: > This would automagically produce ICON iterators (duck) > and coroutines (cover). Icon iterators/generators could be implemented today if anyone bothered (Majewski essentially implemented them back around '93 already, but seemed to lose interest when he realized it couldn't be extended to full continuations, because of C/Python stack intertwingling). > If I guess right, continuation passing could be done > by just shifting tiny tuples around. Well, Tim, help me :-) Python-calling-Python continuations should be easily doable in a "stackless" Python; the key ideas were already covered in this thread, I think. The thing that makes generators so much easier is that they always return directly to their caller, at the point of call; so no C frame can get stuck in the middle even under today's implementation; it just requires not deleting the generator's frame object, and adding an opcode to *resume* the frame's execution the next time the generator is called. 
Unlike as in Icon, it wouldn't even need to be tied to a funky notion of goal-directed evaluation. don't-try-to-traverse-a-tree-without-it-ly y'rs - tim From gstein at lyra.org Sat May 15 10:17:15 1999 From: gstein at lyra.org (Greg Stein) Date: Sat, 15 May 1999 01:17:15 -0700 Subject: [Python-Dev] 'stackless' python? References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us> <14140.44469.848840.740112@seattle.nightmare.com> <199905150223.WAA02457@eric.cnri.reston.va.us> Message-ID: <373D2D8B.390C523C@lyra.org> Guido van Rossum wrote: > ... > > http://www.acme.com/software/thttpd/benchmarks.html > > I see. Any pointers to a graph of thttp market share? thttpd currently has about 70k sites (of 5.4mil found by Netcraft). That puts it at #6. However, it is interesting to note that 60k of those sites are in the .uk domain. I can't figure out who is running it, but I would guess that a large UK-based ISP is hosting a bunch of domains on thttpd. It is somewhat difficult to navigate the various reports (and it never fails that the one you want is not present), but the data is from Netcraft's survey at: http://www.netcraft.com/survey/ Cheers, -g -- Greg Stein, http://www.lyra.org/ From rushing at nightmare.com Sun May 16 13:10:18 1999 From: rushing at nightmare.com (rushing at nightmare.com) Date: Sun, 16 May 1999 04:10:18 -0700 (PDT) Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <81365478@toto.iv> Message-ID: <14142.40867.103424.764346@seattle.nightmare.com> Tim Peters writes: > I'm not a fan of continuations myself; coroutines can be > implemented faithfully via threads (I posted a rather complete set > of Python classes for that in the pre-DejaNews days, a bit more > flexible than Icon's coroutines); and: Continuations are more powerful than coroutines, though I admit they're a bit esoteric. I programmed in Scheme for years without seeing the need for them. But when you need 'em, you *really* need 'em. No way around it. For my purposes (massively scalable single-process servers and clients) threads don't cut it... for example I have a mailing-list exploder that juggles up to 2048 simultaneous SMTP connections. I think it can go higher - I've tested select() on FreeBSD with 16,000 file descriptors. [...] BTW, I have actually made progress borrowing a bit of code from SCM. It uses the stack-copying technique, along with setjmp/longjmp. It's too ugly and unportable to be a real candidate for inclusion in Official Python. [i.e., if it could be made to work it should be considered a stopgap measure for the desperate]. I haven't tested it thoroughly, but I have successfully saved and invoked (and reinvoked) a continuation. Caveat: I have to turn off Py_DECREF in order to keep it from crashing. | >>> import callcc | >>> saved = None | >>> def thing(n): | ... if n == 2: | ... global saved | ... saved = callcc.new() | ... print 'n==',n | ... if n == 0: | ... print 'Done!' | ... else: | ... thing (n-1) | ... | >>> thing (5) | n== 5 | n== 4 | n== 3 | n== 2 | n== 1 | n== 0 | Done! | >>> saved | <Continuation object at 80d30d0> | >>> saved.throw (0) | n== 2 | n== 1 | n== 0 | Done! | >>> saved.throw (0) | n== 2 | n== 1 | n== 0 | Done! | >>> I will probably not be able to work on this for a while (baby due any day now), so anyone is welcome to dive right in.
I don't have much experience wading through gdb tracking down reference bugs, so I'm hoping a brave soul will pick up where I left off. 8^) http://www.nightmare.com/stuff/python-callcc.tar.gz ftp://www.nightmare.com/stuff/python-callcc.tar.gz -Sam From tismer at appliedbiometrics.com Sun May 16 17:31:01 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Sun, 16 May 1999 17:31:01 +0200 Subject: [Python-Dev] 'stackless' python? References: <14142.40867.103424.764346@seattle.nightmare.com> Message-ID: <373EE4B5.6EE6A678@appliedbiometrics.com> rushing at nightmare.com wrote: [...] > BTW, I have actually made progress borrowing a bit of code from SCM. > It uses the stack-copying technique, along with setjmp/longjmp. It's > too ugly and unportable to be a real candidate for inclusion in > Official Python. [i.e., if it could be made to work it should be > considered a stopgap measure for the desperate]. I tried it and built it as a Win32 .pyd file, and it seems to work, but... > I haven't tested it thoroughly, but I have successfully saved and > invoked (and reinvoked) a continuation. Caveat: I have to turn off > Py_DECREF in order to keep it from crashing. Indeed, and this seems to be a problem too hard to solve without lots of work. Since you keep a snapshot of the current machine stack, it contains a number of object references which were valid when the snapshot was taken, but many are most probably invalid when you restart the continuation. I guess incref-ing all currently alive objects on the interpreter stack would be the minimum, maybe more. A tuple of necessary references could be used as an attribute of a Continuation object. I will look at how difficult this is. ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break!
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From tismer at appliedbiometrics.com Sun May 16 20:31:01 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Sun, 16 May 1999 20:31:01 +0200 Subject: [Python-Dev] 'stackless' python? References: <14142.40867.103424.764346@seattle.nightmare.com> <373EE4B5.6EE6A678@appliedbiometrics.com> Message-ID: <373F0EE5.A8DE00C5@appliedbiometrics.com> Christian Tismer wrote: > > rushing at nightmare.com wrote: [...] > > I haven't tested it thoroughly, but I have successfully saved and > > invoked (and reinvoked) a continuation. Caveat: I have to turn off > > Py_DECREF in order to keep it from crashing. It is possible, but a little hard. To take a working snapshot of the current thread's stack, one needs not only the stack snapshot which continue.c provides, but also a restorable copy of all frame objects involved so far. A copy of the current frame chain must be built, with proper reference counting of all involved elements. And this is the crux: The current stack pointer of the VM is not present in the frame objects, but hangs around somewhere on the machine stack. Two solutions: 1) modify PyFrameObject by adding a field which holds the stack pointer, when a function is called. I don't like to change the VM in any way for this. 2) use the lasti field which holds the last VM instruction offset. Then scan the opcodes of the code object and calculate the current stack level. This is possible since Guido's code generator creates code with the stack level lexically bound to the code offset. Now we can incref all the referenced objects in the frame. This must be done for the whole chain, which is copied and relinked in the process. This chain is then held as a property of the continuation object.
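Christian's solution 2 -- recovering the stack level by scanning opcodes up to f_lasti -- can be sketched with the present-day dis module (dis.stack_effect postdates this thread by many years). The naive scan below ignores jumps, so it is only valid for straight-line code:

```python
import dis

def stack_depth_at(code, lasti):
    # Sum the stack effect of each instruction up to and including
    # offset `lasti`; valid only when no jumps are involved.
    depth = 0
    for instr in dis.get_instructions(code):
        depth += dis.stack_effect(instr.opcode, instr.arg)
        if instr.offset >= lasti:
            break
    return depth

code = compile("a = 1 + 2", "<demo>", "exec")
# Sanity check: walking a whole straight-line code object must bring
# the depth back to 0 at the end.
total = sum(dis.stack_effect(i.opcode, i.arg)
            for i in dis.get_instructions(code))
print(total)   # -> 0
```

CPython's compiler does keep the stack level a pure function of the code offset for the code it generates, which is exactly the property this solution relies on.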
To throw the continuation, the current frame chain must be cleared, and the saved one is inserted, together with the machine stack operation which Sam already has. A little hefty, isn't it? ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From tim_one at email.msn.com Mon May 17 07:42:59 1999 From: tim_one at email.msn.com (Tim Peters) Date: Mon, 17 May 1999 01:42:59 -0400 Subject: [Python-Dev] 'stackless' python? In-Reply-To: <14142.40867.103424.764346@seattle.nightmare.com> Message-ID: <000f01bea028$1f75c360$fb9e2299@tim> [Sam] > Continuations are more powerful than coroutines, though I admit > they're a bit esoteric.
I assume you want to capture a continuation object in the UPPERCASE methods, store it away somewhere, run off to your select/poll/whatever loop, and have it invoke the stored continuation objects as the data they're waiting for arrives. If so, that's got to be the nicest use for continuations I've seen! All invisible to the end user. I don't know how to fake it pleasantly without threads, either, and understand that threads aren't appropriate for resource reasons. So I don't have a nice alternative. > ... > | >>> import callcc > | >>> saved = None > | >>> def thing(n): > | ... if n == 2: > | ... global saved > | ... saved = callcc.new() > | ... print 'n==',n > | ... if n == 0: > | ... print 'Done!' > | ... else: > | ... thing (n-1) > | ... > | >>> thing (5) > | n== 5 > | n== 4 > | n== 3 > | n== 2 > | n== 1 > | n== 0 > | Done! > | >>> saved > | > | >>> saved.throw (0) > | n== 2 > | n== 1 > | n== 0 > | Done! > | >>> saved.throw (0) > | n== 2 > | n== 1 > | n== 0 > | Done! > | >>> Suppose the driver were in a script instead: thing(5) # line 1 print repr(saved) # line 2 saved.throw(0) # line 3 saved.throw(0) # line 4 Then the continuation would (eventually) "return to" the "print repr(saved)" and we'd get an infinite output tail of: Continuation object at 80d30d0> n== 2 n== 1 n== 0 Done! Continuation object at 80d30d0> n== 2 n== 1 n== 0 Done! Continuation object at 80d30d0> n== 2 n== 1 n== 0 Done! Continuation object at 80d30d0> n== 2 n== 1 n== 0 Done! ... and never reach line 4. Right? That's the part that Guido hates . takes-one-to-know-one-ly y'rs - tim From tismer at appliedbiometrics.com Mon May 17 09:07:22 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Mon, 17 May 1999 09:07:22 +0200 Subject: [Python-Dev] 'stackless' python? References: <000f01bea028$1f75c360$fb9e2299@tim> Message-ID: <373FC02A.69F2D912@appliedbiometrics.com> Tim Peters wrote: [to Sam] > The other point being that you want to avoid "inside out" logic, though, > right? 
Earlier you posted a kind of ideal: > > Recently I've written an async server that needed to talk to several > other RPC servers, and a mysql server. Pseudo-example, with > possibly-async calls in UPPERCASE: > > auth, archive = db.FETCH_USER_INFO (user) > if verify_login(user,auth): > rpc_server = self.archive_servers[archive] > group_info = rpc_server.FETCH_GROUP_INFO (group) > if valid (group_info): > return rpc_server.FETCH_MESSAGE (message_number) > else: > ... > else: > ... > > I assume you want to capture a continuation object in the UPPERCASE methods, > store it away somewhere, run off to your select/poll/whatever loop, and have > it invoke the stored continuation objects as the data they're waiting for > arrives. > > If so, that's got to be the nicest use for continuations I've seen! All > invisible to the end user. I don't know how to fake it pleasantly without > threads, either, and understand that threads aren't appropriate for resource > reasons. So I don't have a nice alternative. It can always be done with threads, but also without. Tried it last night, with proper refcounting, and it wasn't too easy since I had to duplicate the Python frame chain. ... > Suppose the driver were in a script instead: > > thing(5) # line 1 > print repr(saved) # line 2 > saved.throw(0) # line 3 > saved.throw(0) # line 4 > > Then the continuation would (eventually) "return to" the "print repr(saved)" > and we'd get an infinite output tail of: > > <Continuation object at 80d30d0> > n== 2 > n== 1 > n== 0 > Done! > <Continuation object at 80d30d0> > n== 2 > n== 1 > n== 0 > Done! This is at the moment exactly what happens, with the difference that after some repetitions we GPF due to dangling references to too often decref'ed objects. My incref'ing prepares for just one re-incarnation and should prevent a second call. But this will be solved, soon. > and never reach line 4. Right? That's the part that Guido hates <wink>. Yup.
With a little counting, it was easy to survive: def main(): global a a=2 thing (5) a=a-1 if a: saved.throw (0) Weird enough and needs a much better interface. But finally I'm quite happy that it worked so smoothly after just a couple of hours (well, about six :) ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From rushing at nightmare.com Mon May 17 11:46:29 1999 From: rushing at nightmare.com (rushing at nightmare.com) Date: Mon, 17 May 1999 02:46:29 -0700 (PDT) Subject: [Python-Dev] 'stackless' python? In-Reply-To: <000f01bea028$1f75c360$fb9e2299@tim> References: <14142.40867.103424.764346@seattle.nightmare.com> <000f01bea028$1f75c360$fb9e2299@tim> Message-ID: <14143.56604.21827.891993@seattle.nightmare.com> Tim Peters writes: > [Sam] > > Continuations are more powerful than coroutines, though I admit > > they're a bit esoteric. > > "More powerful" is a tedious argument you should always avoid . More powerful in the sense that you can use continuations to build lots of different control structures (coroutines, backtracking, exceptions), but not vice versa. Kinda like a better tool for blowing one's own foot off. 8^) > Suppose the driver were in a script instead: > > thing(5) # line 1 > print repr(saved) # line 2 > saved.throw(0) # line 3 > saved.throw(0) # line 4 > > Then the continuation would (eventually) "return to" the "print repr(saved)" > and we'd get an infinite output tail [...] > > and never reach line 4. Right? That's the part that Guido hates . Yes... the continuation object so far isn't very usable. It needs a driver of some kind around it. In the Scheme world, there are two common ways of using continuations - let/cc and call/cc. 
[call/cc is what is in the standard, its official name is call-with-current-continuation] let/cc stores the continuation in a variable binding, while introducing a new scope. It requires a change to the underlying language: (+ 1 (let/cc escape (...) (escape 34))) => 35 'escape' is a function that when called will 'resume' with whatever follows the let/cc clause. In this case it would continue with the addition... call/cc is a little trickier, but doesn't require any change to the language... instead of making a new binding directly, you pass in a function that will receive the binding: (+ 1 (call/cc (lambda (escape) (...) (escape 34)))) => 35 In words, it's much more frightening: "call/cc is a function, that when called with a function as an argument, will pass that function an argument that is a new function, which when called with a value will resume the computation with that value as the result of the entire expression" Phew. In Python, an example might look like this: SAVED = None def save_continuation (k): global SAVED SAVED = k def thing(): [...] value = callcc (lambda k: save_continuation(k)) # or more succinctly: def thing(): [...] value = callcc (save_continuation) In order to do useful work like passing values back and forth between coroutines, we have to have some way of returning a value from the continuation when it is reinvoked. I should emphasize that most folks will never see call/cc 'in the raw', it will usually have some nice wrapper around to implement whatever construct is needed. -Sam From arw at ifu.net Mon May 17 20:06:18 1999 From: arw at ifu.net (Aaron Watters) Date: Mon, 17 May 1999 14:06:18 -0400 Subject: [Python-Dev] coroutines vs. continuations vs. threads Message-ID: <37405A99.1DBAF399@ifu.net> The illustrious Sam Rushing avers: >Continuations are more powerful than coroutines, though I admit >they're a bit esoteric. I programmed in Scheme for years without >seeing the need for them. But when you need 'em, you *really* need >'em.
No way around it. Frankly, I think I thought I understood this once but now I know I don't. How're continuations more powerful than coroutines? And why can't they be implemented using threads (and semaphores etc)? ...I'm not promising I'll understand the answer... -- Aaron Watters === I taught I taw a putty-cat! From gmcm at hypernet.com Mon May 17 21:18:43 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Mon, 17 May 1999 14:18:43 -0500 Subject: [Python-Dev] coroutines vs. continuations vs. threads In-Reply-To: <37405A99.1DBAF399@ifu.net> Message-ID: <1285153546-166193857@hypernet.com> The estimable Aaron Watters queries: > The illustrious Sam Rushing avers: > >Continuations are more powerful than coroutines, though I admit > >they're a bit esoteric. I programmed in Scheme for years without > >seeing the need for them. But when you need 'em, you *really* need > >'em. No way around it. > > Frankly, I think I thought I understood this once but now I know I > don't. How're continuations more powerful than coroutines? And why > can't they be implemented using threads (and semaphores etc)? I think Sam's (immediate ) problem is that he can't afford threads - he may have hundreds to thousands of these suckers. As a fuddy-duddy old imperative programmer, I'm inclined to think "state machine". But I'd guess that functional-ophiles probably see that as inelegant. (Safe guess - they see _anything_ that isn't functional as inelegant!). crude-but-not-rude-ly y'rs - Gordon From jeremy at cnri.reston.va.us Mon May 17 20:43:34 1999 From: jeremy at cnri.reston.va.us (Jeremy Hylton) Date: Mon, 17 May 1999 14:43:34 -0400 (EDT) Subject: [Python-Dev] coroutines vs. continuations vs. 
threads In-Reply-To: <37405A99.1DBAF399@ifu.net> References: <37405A99.1DBAF399@ifu.net> Message-ID: <14144.24242.128959.726878@bitdiddle.cnri.reston.va.us> >>>>> "AW" == Aaron Watters writes: AW> The illustrious Sam Rushing avers: >> Continuations are more powerful than coroutines, though I admit >> they're a bit esoteric. I programmed in Scheme for years without >> seeing the need for them. But when you need 'em, you *really* >> need 'em. No way around it. AW> Frankly, I think I thought I understood this once but now I know AW> I don't. How're continuations more powerful than coroutines? AW> And why can't they be implemented using threads (and semaphores AW> etc)? I think I understood, too. I'm hoping that someone will debug my answer and enlighten us both. A continuation is a mechanism for making control flow explicit. A continuation is a means of naming and manipulating "the rest of the program." In Scheme terms, the continuation is the function that the value of the current expression should be passed to. The call/cc mechanism lets you capture the current continuation and explicitly call on it. The most typical use of call/cc is non-local exits, but it gives you incredible flexibility for implementing your control flow. I'm fuzzy on coroutines, as I've only seen them in "Structured Programming" (which is as old as I am :-) and never actually used them. The basic idea is that when a coroutine calls another coroutine, control is transferred to the second coroutine at the point at which it last left off (by itself calling another coroutine or by detaching, which returns control to the lexically enclosing scope). It seems to me that coroutines are an example of the kind of control structure that you could build with continuations. It's not clear that the reverse is true. I have to admit that I'm a bit unclear on the motivation for all this. As Gordon said, the state machine approach seems like it would be a good approach.
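Jeremy's "most typical use" can be sketched in plain Python: an escape-only call/cc modelled with an exception. This supports the non-local exit, but not the re-entry into a finished call that full first-class continuations allow. The names callcc and _Escape are purely illustrative, not a real Python API:

```python
# Escape-only call/cc modelled with an exception: the "continuation" k can
# only be used to jump back up the stack, never to resume a finished call.
class _Escape(Exception):
    def __init__(self, tag, value):
        self.tag, self.value = tag, value

def callcc(f):
    tag = object()                  # unique marker per invocation
    def k(value):                   # the escape continuation
        raise _Escape(tag, value)
    try:
        return f(k)                 # ordinary return if k is never called
    except _Escape as e:
        if e.tag is tag:
            return e.value          # k was called: unwind to here
        raise                       # someone else's escape: keep unwinding

# Mirrors (+ 1 (call/cc (lambda (escape) (escape 34)))) => 35
assert 1 + callcc(lambda k: k(34)) == 35
assert callcc(lambda k: 10) == 10   # k unused: plain return value
```

Saving k in a global and calling it after callcc has returned would raise an uncaught _Escape; that is precisely the gap between this sketch and the continuations being discussed.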
Jeremy From klm at digicool.com Mon May 17 21:08:57 1999 From: klm at digicool.com (Ken Manheimer) Date: Mon, 17 May 1999 15:08:57 -0400 Subject: [Python-Dev] coroutines vs. continuations vs. threads Message-ID: <613145F79272D211914B0020AFF640190BEEDE@gandalf.digicool.com> Jeremy Hylton: > I have to admit that I'm a bit unclear on the motivation for all > this. As Gordon said, the state machine approach seems like it would > be a good approach. If i understand what you mean by state machine programming, it's pretty inherently uncompartmented: all the combinations of state variables need to be accounted for, so the number of states grows factorially with the number of state vars; in general it's awkward. The advantage of going with what functional folks come up with, like continuations, is that it tends to be well compartmented - functional. (Come to think of it, i suppose that compartmentalization as opposed to state is their mania.) As abstract as i can be (because i hardly know what i'm talking about) (but i have done some specifically finite state machine programming, and did not enjoy it), Ken klm at digicool.com From arw at ifu.net Mon May 17 21:20:13 1999 From: arw at ifu.net (Aaron Watters) Date: Mon, 17 May 1999 15:20:13 -0400 Subject: [Python-Dev] coroutines vs. continuations vs. threads References: <1285153546-166193857@hypernet.com> Message-ID: <37406BED.95AEB896@ifu.net> The ineffable Gordon McMillan retorts: > As a fuddy-duddy old imperative programmer, I'm inclined to think > "state machine". But I'd guess that functional-ophiles probably see > that as inelegant. (Safe guess - they see _anything_ that isn't > functional as inelegant!).
As a fellow fuddy-duddy I'd agree except that if you write properly layered software you have to unroll and reroll all those layers for every transition of the multi-level state machine, and even though with proper discipline it can be implemented without becoming hideous, it still adds significant overhead compared to "stop right here and come back later" which could be implemented using threads/coroutines(?)/continuations. I think this is particularly true in Python with the relatively high function call overhead. Or maybe I'm out in left field doing cartwheels... I guess the question of interest is why are threads insufficient? I guess they have system limitations on the number of threads or other limitations that wouldn't be a problem with continuations? If there aren't a *lot* of situations where coroutines are vital, I'd be hesitant to do major surgery. But I'm a fuddy-duddy. -- Aaron Watters === I did! I did! From tismer at appliedbiometrics.com Mon May 17 22:03:01 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Mon, 17 May 1999 22:03:01 +0200 Subject: [Python-Dev] coroutines vs. continuations vs. threads References: <1285153546-166193857@hypernet.com> <37406BED.95AEB896@ifu.net> Message-ID: <374075F5.F29B4EAB@appliedbiometrics.com> Aaron Watters wrote: > > The ineffable Gordon McMillan retorts: > > > As a fuddy-duddy old imperative programmer, I'm inclined to think > > "state machine". But I'd guess that functional-ophiles probably see > > that as inelegant. (Safe guess - they see _anything_ that isn't > > functional as inelegant!).
> > As a fellow fuddy-duddy I'd agree except that if you write properly layered > software you have to unroll and reroll all those layers for every > transition of the multi-level state machine, and even though with proper > discipline it can be implemented without becoming hideous, it still adds > significant overhead compared to "stop right here and come back later" > which could be implemented using threads/coroutines(?)/continuations. Coroutines are most elegant here, since (for a simple example) they are a symmetric pair of functions which call each other. There is neither the one-pulls-the-other-pushes asymmetry, nor the need to maintain state and be controlled by a supervisor function. > I think this is particularly true in Python with the relatively high > function > call overhead. Or maybe I'm out in left field doing cartwheels... > I guess the question of interest is why are threads insufficient? I guess > they have system limitations on the number of threads or other limitations > that wouldn't be a problem with continuations? If there aren't a *lot* of > situations where coroutines are vital, I'd be hesitant to do major > surgery. For me (as always) most interesting is the possible speed of coroutines. They involve no threads overhead, no locking, no nothing. Python supports it better than expected. If the stack level of two code objects is the same at a switching point, the whole switch is nothing more than swapping two frame objects, and we're done. This might be even cheaper than general call/cc, like a function call. Sam's prototype works already, with no change to the interpreter (but knowledge of Python frames, and a .dll of course). I think we'll continue a while. continuously - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break!
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From gmcm at hypernet.com Tue May 18 00:17:25 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Mon, 17 May 1999 17:17:25 -0500 Subject: [Python-Dev] coroutines vs. continuations vs. threads In-Reply-To: <374075F5.F29B4EAB@appliedbiometrics.com> Message-ID: <1285142823-166838954@hypernet.com> Co-Christian-routines Tismer continues: > Aaron Watters wrote: > > > > The ineffible Gordon McMillan retorts: > > > > > As a fuddy-duddy old imperative programmer, I'm inclined to think > > > "state machine". But I'd guess that functional-ophiles probably see > > > that as inelegant. (Safe guess - they see _anything_ that isn't > > > functional as inelegant!). > > > > As a fellow fuddy-duddy I'd agree except that if you write properlylayered > > software you have to unrole and rerole all those layers for every > > transition of the multi-level state machine, and even though with proper > > discipline it can be implemented without becoming hideous, it still adds > > significant overhead compared to "stop right here and come back later" > > which could be implemented using threads/coroutines(?)/continuations. > > Coroutines are most elegant here, since (fir a simple example) > they are a symmetric pair of functions which call each other. > There is neither the one-pulls, the other pushes asymmetry, nor the > need to maintain state and be controlled by a supervisor function. Well, the state maintains you, instead of the other way 'round. (Any other ex-Big-Blue-ers out there that used to play these games with checkpoint and SyncSort?). I won't argue elegance. Just a couple points: - there's an art to writing state machines which is largely unrecognized (most of them are unnecessarily horrid). 
- a multiplexed solution (vs a threaded solution) requires that something be inside out. In one case it's your code, in the other, your understanding of the problem. Neither is trivial. Not to be discouraging - as long as your solution doesn't involve using regexps on bytecode <wink>, I say go for it! - Gordon From guido at CNRI.Reston.VA.US Tue May 18 06:03:34 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 18 May 1999 00:03:34 -0400 Subject: [Python-Dev] 'stackless' python? In-Reply-To: Your message of "Mon, 17 May 1999 02:46:29 PDT." <14143.56604.21827.891993@seattle.nightmare.com> References: <14142.40867.103424.764346@seattle.nightmare.com> <000f01bea028$1f75c360$fb9e2299@tim> <14143.56604.21827.891993@seattle.nightmare.com> Message-ID: <199905180403.AAA04772@eric.cnri.reston.va.us> Sam (& others), I thought I understood what continuations were, but the examples of what you can do with them so far don't clarify the matter at all. Perhaps it would help to explain what a continuation actually does with the run-time environment, instead of giving examples of how to use them and what the result is? Here's a start of my own understanding (brief because I'm on a 28.8k connection which makes my ordinary typing habits in Emacs very painful). 1. All program state is somehow contained in a single execution stack. This includes globals (which are simply name bindings in the bottom stack frame). It also includes a code pointer for each stack frame indicating where the function corresponding to that stack frame is executing (this is the return address if there is a newer stack frame, or the current instruction for the newest frame). 2. A continuation does something equivalent to making a copy of the entire execution stack. This can probably be done lazily. There are probably lots of details. I also expect that Scheme's semantic model is different than Python here -- e.g. does it matter whether deep or shallow copies are made? I.e.
are there mutable *objects* in Scheme? (I know there are mutable and immutable *name bindings* -- I think.) 3. Calling a continuation probably makes the saved copy of the execution stack the current execution state; I presume there's also a way to pass an extra argument. 4. Coroutines (which I *do* understand) are probably done by swapping between two (or more) continuations. 5. Other control constructs can be done by various manipulations of continuations. I presume that in many situations the saved continuation becomes the main control locus permanently, and the (previously) current stack is simply garbage-collected. Of course the lazy copy makes this efficient. If this all is close enough to the truth, I think that continuations involving C stack frames are definitely out -- as Tim Peters mentioned, you don't know what the stuff on the C stack of extensions refers to. (My guess would be that Scheme implementations assume that any pointers on the C stack point to Scheme objects, so that C stack frames can be copied and conservative GC can be used -- this will never happen in Python.) Continuations involving only Python stack frames might be supported, if we can agree on the sharing / copying semantics. This is where I don't know enough (see questions at #2 above). --Guido van Rossum (home page: http://www.python.org/~guido/) From tim_one at email.msn.com Tue May 18 06:46:12 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 18 May 1999 00:46:12 -0400 Subject: [Python-Dev] coroutines vs. continuations vs. threads In-Reply-To: <37406BED.95AEB896@ifu.net> Message-ID: <000901bea0e9$5aa2dec0$829e2299@tim> [Aaron Watters] > ... > I guess the question of interest is why are threads insufficient? I > guess they have system limitations on the number of threads or other > limitations that wouldn't be a problem with continuations?
Sam is mucking with thousands of simultaneous I/O-bound socket connections, and makes a good case that threads simply don't fly here (each one consumes a stack, kernel resources, etc). It's unclear (to me) that thousands of continuations would be *much* better, though, by the time Christian gets done making thousands of copies of the Python stack chain. > If there aren't a *lot* of situations where coroutines are vital, I'd > be hesitant to do major surgery. But I'm a fuddy-duddy. Go to Sam's site (http://www.nightmare.com/), download Medusa, and read the docs. They're very well written and describe the problem space exquisitely. I don't have any problems like that I need to solve, but it's interesting to ponder! alas-no-time-for-it-now-ly y'rs - tim From tim_one at email.msn.com Tue May 18 06:45:52 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 18 May 1999 00:45:52 -0400 Subject: [Python-Dev] 'stackless' python? In-Reply-To: <373FC02A.69F2D912@appliedbiometrics.com> Message-ID: <000301bea0e9$4fd473a0$829e2299@tim> [Christian Tismer] > ... > Yup. With a little counting, it was easy to survive: > > def main(): > global a > a=2 > thing (5) > a=a-1 > if a: > saved.throw (0) Did "a" really need to be global here? I hope you see the same behavior without the "global a"; e.g., this Scheme: (define -cont- #f) (define thing (lambda (n) (if (= n 2) (call/cc (lambda (k) (set! -cont- k)))) (display "n == ") (display n) (newline) (if (= n 0) (begin (display "Done!") (newline)) (thing (- n 1))))) (define main (lambda () (let ((a 2)) (thing 5) (display "a is ") (display a) (newline) (set! a (- a 1)) (if (> a 0) (-cont- #f))))) (main) prints: n == 5 n == 4 n == 3 n == 2 n == 1 n == 0 Done! a is 2 n == 2 n == 1 n == 0 Done! a is 1 Or does brute-force frame-copying cause the continuation to set "a" back to 2 each time? > Weird enough Par for the continuation course! They're nasty when eaten raw. > and needs a much better interface. Ya, like screw 'em and use threads . 
> But finally I'm quite happy that it worked so smoothly > after just a couple of hours (well, about six :) Yup! Playing with Python internals is a treat. to-be-continued-ly y'rs - tim From tim_one at email.msn.com Tue May 18 06:45:57 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 18 May 1999 00:45:57 -0400 Subject: [Python-Dev] 'stackless' python? In-Reply-To: <14143.56604.21827.891993@seattle.nightmare.com> Message-ID: <000401bea0e9$51e467e0$829e2299@tim> [Sam] >>> Continuations are more powerful than coroutines, though I admit >>> they're a bit esoteric. [Tim] >> "More powerful" is a tedious argument you should always avoid <wink>. [Sam] > More powerful in the sense that you can use continuations to build > lots of different control structures (coroutines, backtracking, > exceptions), but not vice versa. "More powerful" is a tedious argument you should always avoid <wink>. >> Then the continuation would (eventually) "return to" the >> "print repr(saved)" and we'd get an infinite output tail [...] >> and never reach line 4. Right? > Yes... the continuation object so far isn't very usable. But it's proper behavior for a continuation all the same! So this aspect shouldn't be "fixed". > ... > let/cc stores the continuation in a variable binding, while > introducing a new scope. It requires a change to the underlying > language: Isn't this often implemented via a macro, though, so that (let/cc name code) "acts like" (call/cc (lambda (name) code)) ? I haven't used a Scheme with native let/cc, but poking around it appears that the real intent is to support exception-style function exits with a mechanism cheaper than 1st-class continuations: twice saw the let/cc object (the thingie bound to "name") defined as being invalid the instant after "code" returns, so it's an "up the call stack" gimmick. That doesn't sound powerful enough for what you're after. > [nice let/cc call/cc tutorialette] > ...
> In order to do useful work like passing values back and forth between > coroutines, we have to have some way of returning a value from the > continuation when it is reinvoked. Somehow, I suspect that's the least of our problems <0.5 wink>. If continuations are in Python's future, though, I agree with the need as stated. > I should emphasize that most folks will never see call/cc 'in the > raw', it will usually have some nice wrapper around to implement > whatever construct is needed. Python already has well-developed exception and thread facilities, so it's hard to make a case for continuations as a catch-all implementation mechanism. That may be the rub here: while any number of things *can* be implementated via continuations, I think very few *need* to be implemented that way, and full-blown continuations aren't easy to implement efficiently & portably. The Icon language was particularly concerned with backtracking searches, and came up with generators as another clearer/cheaper implementation technique. When it went on to full-blown coroutines, it's hard to say whether continuations would have been a better approach. But the coroutine implementation it has is sluggish and buggy and hard to port, so I doubt they could have done noticeably worse. Would full-blown coroutines be powerful enough for your needs? assuming-the-practical-defn-of-"powerful-enough"-ly y'rs - tim From rushing at nightmare.com Tue May 18 07:18:06 1999 From: rushing at nightmare.com (rushing at nightmare.com) Date: Mon, 17 May 1999 22:18:06 -0700 (PDT) Subject: [Python-Dev] 'stackless' python? 
In-Reply-To: <000401bea0e9$51e467e0$829e2299@tim> References: <14143.56604.21827.891993@seattle.nightmare.com> <000401bea0e9$51e467e0$829e2299@tim> Message-ID: <14144.61765.308962.101884@seattle.nightmare.com> Tim Peters writes: > Isn't this often implemented via a macro, though, so that > > (let/cc name code) > > "acts like" > > (call/cc (lambda (name) code)) Yup, they're equivalent, in the sense that given one you can make a macro to do the other. call/cc is preferred because it doesn't require a new binding construct. > ? I haven't used a Scheme with native let/cc, but poking around it > appears that the real intent is to support exception-style function > exits with a mechanism cheaper than 1st-class continuations: twice > saw the let/cc object (the thingie bound to "name") defined as > being invalid the instant after "code" returns, so it's an "up the > call stack" gimmick. That doesn't sound powerful enough for what > you're after. Except that since the escape procedure is 'first-class' it can be stored away and invoked (and reinvoked) later. [that's all that 'first-class' means: a thing that can be stored in a variable, returned from a function, used as an argument, etc..] I've never seen a let/cc that wasn't full-blown, but it wouldn't surprise me. > The Icon language was particularly concerned with backtracking > searches, and came up with generators as another clearer/cheaper > implementation technique. When it went on to full-blown > coroutines, it's hard to say whether continuations would have been > a better approach. But the coroutine implementation it has is > sluggish and buggy and hard to port, so I doubt they could have > done noticeably worse. Many Scheme implementors either skip it, or only support non-escaping call/cc (i.e., exceptions in Python). > Would full-blown coroutines be powerful enough for your needs? Yes, I think they would be. But I think with Python it's going to be just about as hard, either way. 
-Sam From rushing at nightmare.com Tue May 18 07:48:29 1999 From: rushing at nightmare.com (rushing at nightmare.com) Date: Mon, 17 May 1999 22:48:29 -0700 (PDT) Subject: [Python-Dev] coroutines vs. continuations vs. threads In-Reply-To: <51325225@toto.iv> Message-ID: <14144.63787.502454.111804@seattle.nightmare.com> Aaron Watters writes: > Frankly, I think I thought I understood this once but now I know I > don't. 8^) That's what I said when I backed into the idea via medusa a couple of years ago. > How're continuations more powerful than coroutines? And why can't > they be implemented using threads (and semaphores etc)? My understanding of the original 'coroutine' (from Pascal?) was that it allows two procedures to 'resume' each other. The classic coroutine example is the 'samefringe' problem: given two trees of differing structure, are they equal in the sense that a traversal of the leaves results in the same list? Coroutines let you do this efficiently, comparing leaf-by-leaf without storing the whole tree. continuations can do coroutines, but can also be used to implement backtracking, exceptions, threads... probably other stuff I've never heard of or needed. The reason that Scheme and ML are such big fans of continuations is because they can be used to implement all these other features. Look at how much try/except and threads complicate other language implementations. It's like a super-tool-widget - if you make sure it's in your toolbox, you can use it to build your circular saw and lathe from scratch. Unfortunately there aren't many good sites on the web with good explanatory material. The best reference I have is "Essentials of Programming Languages". 
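The samefringe problem Sam describes can be sketched with lazy leaf-by-leaf traversal. Here generator functions play the coroutine role, each side yielding one leaf at a time so neither fringe is ever built in full; nothing like this existed in the Python of the time, so this is purely an illustrative stand-in:

```python
from itertools import zip_longest

def fringe(tree):
    # Yield leaves left-to-right, descending into nested lists lazily.
    for node in tree:
        if isinstance(node, list):
            for leaf in fringe(node):
                yield leaf
        else:
            yield node

def samefringe(t1, t2):
    # Compare leaf-by-leaf; stops at the first mismatch, and the
    # unique fillvalue makes unequal lengths compare unequal.
    missing = object()
    return all(a == b for a, b in
               zip_longest(fringe(t1), fringe(t2), fillvalue=missing))

assert samefringe([1, [2, [3]]], [[1], 2, 3])   # same leaves, different shape
assert not samefringe([1, 2], [1, 2, 3])        # different fringes
```

The efficiency claim above is visible here: a mismatch in the first leaves means the rest of both trees is never touched.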
For those that want to play with some of these ideas using little VM's written in Python: http://www.nightmare.com/software.html#EOPL -Sam From rushing at nightmare.com Tue May 18 07:56:37 1999 From: rushing at nightmare.com (rushing at nightmare.com) Date: Mon, 17 May 1999 22:56:37 -0700 (PDT) Subject: [Python-Dev] coroutines vs. continuations vs. threads In-Reply-To: <13631823@toto.iv> Message-ID: <14144.65355.400281.123856@seattle.nightmare.com> Jeremy Hylton writes: > I have to admit that I'm a bit unclear on the motivation for all > this. As Gordon said, the state machine approach seems like it would > be a good approach. For simple problems, state machines are ideal. Medusa uses state machines that are built out of Python methods. But past a certain level of complexity, they get too hairy to understand. A really good example can be found in /usr/src/linux/net/ipv4. 8^) -Sam From rushing at nightmare.com Tue May 18 09:05:20 1999 From: rushing at nightmare.com (rushing at nightmare.com) Date: Tue, 18 May 1999 00:05:20 -0700 (PDT) Subject: [Python-Dev] 'stackless' python? In-Reply-To: <60057226@toto.iv> Message-ID: <14145.927.588572.113256@seattle.nightmare.com> Guido van Rossum writes: > Perhaps it would help to explain what a continuation actually does > with the run-time environment, instead of giving examples of how to > use them and what the result it? This helped me a lot, and is the angle used in "Essentials of Programming Languages": Usually when folks refer to a 'stack', they're refering to an *implementation* of the stack data type: really an optimization that assumes an upper bound on stack size, and that things will only be pushed and popped in order. If you were to implement a language's variable and execution stacks with actual data structures (linked lists), then it's easy to see what's needed: the head of the list represents the current state. As functions exit, they pop things off the list. The reason I brought this up (during a lull!) 
was that Python is already paying all of the cost of heap-allocated frames, and it didn't seem to me too much of a leap from there. > 1. All program state is somehow contained in a single execution stack. Yup. > 2. A continuation does something equivalent to making a copy of the > entire execution stack. Yup. > I.e. are there mutable *objects* in Scheme? > (I know there are mutable and immutable *name bindings* -- I think.) Yes, Scheme is pro-functional... but it has arrays, i/o, and set-cdr!, all the things that make it 'impure'. I think shallow copies are what's expected. In the examples I have, the continuation is kept in a 'register', and call/cc merely packages it up with a little function wrapper. You are allowed to stomp all over lexical variables with "set!". > 3. Calling a continuation probably makes the saved copy of the > execution stack the current execution state; I presume there's also a > way to pass an extra argument. Yup. > 4. Coroutines (which I *do* understand) are probably done by swapping > between two (or more) continuations. Yup. Here's an example in Scheme: http://www.nightmare.com/stuff/samefringe.scm Somewhere I have an example of coroutines being used for parsing, very elegant. Something like one coroutine does lexing, and passes tokens one-by-one to the next level, which passes parsed expressions to a compiler, or whatever. Kinda like pipes. > 5. Other control constructs can be done by various manipulations of > continuations. I presume that in many situations the saved > continuation becomes the main control locus permanently, and the > (previously) current stack is simply garbage-collected. Of course > the lazy copy makes this efficient. Yes... I think backtracking would be an example of this. You're doing a search on a large space (say a chess game). After a certain point you want to try a previous fork, to see if it's promising, but you don't want to throw away your current work. 
Save it, then unwind back to the previous fork, try that option out... if it turns out to be better, then toss the original. > If this all is close enough to the truth, I think that > continuations involving C stack frames are definitely out -- as Tim > Peters mentioned, you don't know what the stuff on the C stack of > extensions refers to. (My guess would be that Scheme > implementations assume that any pointers on the C stack point to > Scheme objects, so that C stack frames can be copied and > conservative GC can be used -- this will never happen in Python.) I think you're probably right here - usually there are heavy restrictions on what kind of data can pass through the C interface. But I know of at least one Scheme (mzscheme/PLT) that uses conservative GC and has C/C++ interfaces. [... dig dig ...] From rushing at nightmare.com Tue May 18 09:17:11 1999 From: rushing at nightmare.com (rushing at nightmare.com) Date: Tue, 18 May 1999 00:17:11 -0700 (PDT) Subject: [Python-Dev] another good motivation Message-ID: <14145.4917.164756.300678@seattle.nightmare.com> "Escaping the event loop: an alternative control structure for multi-threaded GUIs" http://cs.nyu.edu/phd_students/fuchs/ http://cs.nyu.edu/phd_students/fuchs/gui.ps -Sam From tismer at appliedbiometrics.com Tue May 18 15:46:53 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Tue, 18 May 1999 15:46:53 +0200 Subject: [Python-Dev] coroutines vs. continuations vs. threads References: <000901bea0e9$5aa2dec0$829e2299@tim> Message-ID: <37416F4D.8E95D71A@appliedbiometrics.com> Tim Peters wrote: > > [Aaron Watters] > > ... > > I guess the question of interest is why are threads insufficient? I > > guess they have system limitations on the number of threads or other > > limitations that wouldn't be a problem with continuations?
> > Sam is mucking with thousands of simultaneous I/O-bound socket connections, > and makes a good case that threads simply don't fly here (each one consumes > a stack, kernel resources, etc). It's unclear (to me) that thousands of > continuations would be *much* better, though, by the time Christian gets > done making thousands of copies of the Python stack chain. Well, what he needs here are coroutines and just a single frame object for every minithread (I think this is a "fiber"?). If these fibers later do deep function calls before they switch, there will of course be more frames then. -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From tismer at appliedbiometrics.com Tue May 18 16:35:30 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Tue, 18 May 1999 16:35:30 +0200 Subject: [Python-Dev] 'stackless' python? References: <14142.40867.103424.764346@seattle.nightmare.com> <000f01bea028$1f75c360$fb9e2299@tim> <14143.56604.21827.891993@seattle.nightmare.com> <199905180403.AAA04772@eric.cnri.reston.va.us> Message-ID: <37417AB2.80920595@appliedbiometrics.com> Guido van Rossum wrote: > > Sam (& others), > > I thought I understood what continuations were, but the examples of > what you can do with them so far don't clarify the matter at all. > > Perhaps it would help to explain what a continuation actually does > with the run-time environment, instead of giving examples of how to > use them and what the result it? > > Here's a start of my own understanding (brief because I'm on a 28.8k > connection which makes my ordinary typing habits in Emacs very > painful). > > 1. All program state is somehow contained in a single execution stack. 
> This includes globals (which are simply name bindings in the bottom > stack frame). It also includes a code pointer for each stack frame > indicating where the function corresponding to that stack frame is > executing (this is the return address if there is a newer stack frame, > or the current instruction for the newest frame). Right. For now, this information is on the C stack for each called function, although almost completely available in the frame chain. > 2. A continuation does something equivalent to making a copy of the > entire execution stack. This can probably be done lazily. There are > probably lots of details. I also expect that Scheme's semantic model > is different than Python here -- e.g. does it matter whether deep or > shallow copies are made? I.e. are there mutable *objects* in Scheme? > (I know there are mutable and immutable *name bindings* -- I think.) To make it lazy, a gatekeeper must be put on top of the two split frames, which catches the event that one of them returns. It appears to me that this is the same callcc.new() object which catches this, splitting frames when hit by a return. > 3. Calling a continuation probably makes the saved copy of the > execution stack the current execution state; I presume there's also a > way to pass an extra argument. > > 4. Coroutines (which I *do* understand) are probably done by swapping > between two (or more) continuations. Right, which is just two or three assignments. > 5. Other control constructs can be done by various manipulations of > continuations. I presume that in many situations the saved > continuation becomes the main control locus permanently, and the > (previously) current stack is simply garbage-collected. Of course the > lazy copy makes this efficient. Yes, great. It looks like switching continuations is not more expensive than a single Python function call.
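The "two or three assignments" per switch can be illustrated without any interpreter surgery -- a rough sketch in modern Python, with generators standing in for the suspendable frames (this shows only the control flow, not the frame mechanics under discussion):

```python
# Toy round-robin trampoline: each "fiber" suspends by yielding, and the
# scheduler "switches continuations" with nothing more than a next() call.
def counter(name, n):
    for i in range(n):
        yield (name, i)   # suspend here; control returns to the trampoline

def run(*fibers):
    queue = list(fibers)
    trace = []
    while queue:
        fiber = queue.pop(0)
        try:
            trace.append(next(fiber))   # resume the fiber until its next yield
            queue.append(fiber)         # still alive: reschedule it
        except StopIteration:
            pass                        # fiber finished; drop it
    return trace

print(run(counter("a", 2), counter("b", 2)))
# [('a', 0), ('b', 0), ('a', 1), ('b', 1)]
```

The switch itself is just a pop, a next(), and an append -- the frames stay alive on the heap between resumptions.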
> Continuations involving only Python stack frames might be supported, > if we can agree on the sharing / copying semantics. This is where > I don't know enough (see questions at #2 above). This would mean avoiding the creation of incompatible continuations. A continuation may not switch to a frame chain which was created by a different VM incarnation, since this would later on corrupt the machine stack. One way to assure that would be a thread-safe function in sys, similar to sys.exc_info(), which gives an id for the current interpreter. Continuations living somewhere in globals would be marked by the interpreter which created them, and refuse to be thrown if they don't match. The necessary interpreter support appears to be small: Extend the PyFrame structure by two fields:
- interpreter ID (addr of some local variable would do)
- stack pointer at current instruction.
Change the CALL_FUNCTION opcode to avoid calling eval recursively in the case of a Python function/method: instead, save the current frame, build the new one, and start over. RETURN will pop a frame and reload its local variables instead of returning, as long as there is a frame to pop. I'm unclear how exceptions should be handled. Are they currently propagated up across different C calls other than ceval2 recursions? ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From jeremy at cnri.reston.va.us Tue May 18 17:05:39 1999 From: jeremy at cnri.reston.va.us (Jeremy Hylton) Date: Tue, 18 May 1999 11:05:39 -0400 (EDT) Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <14145.927.588572.113256@seattle.nightmare.com> References: <60057226@toto.iv> <14145.927.588572.113256@seattle.nightmare.com> Message-ID: <14145.33150.767551.472591@bitdiddle.cnri.reston.va.us> >>>>> "SR" == rushing writes: SR> Somewhere I have an example of coroutines being used for SR> parsing, very elegant. Something like one coroutine does SR> lexing, and passes tokens one-by-one to the next level, which SR> passes parsed expressions to a compiler, or whatever. Kinda SR> like pipes. This is the first example that's used in Structured Programming (Dahl, Dijkstra, and Hoare). I'd be happy to loan a copy to any of the Python-dev people who sit nearby. Jeremy From tismer at appliedbiometrics.com Tue May 18 17:31:11 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Tue, 18 May 1999 17:31:11 +0200 Subject: [Python-Dev] 'stackless' python? References: <000301bea0e9$4fd473a0$829e2299@tim> Message-ID: <374187BF.36CC65E7@appliedbiometrics.com> Tim Peters wrote: > > [Christian Tismer] > > ... > > Yup. With a little counting, it was easy to survive: > >
> > def main():
> >     global a
> >     a=2
> >     thing (5)
> >     a=a-1
> >     if a:
> >         saved.throw (0)
> > Did "a" really need to be global here? I hope you see the same behavior > without the "global a"; e.g., this Scheme: Actually, I inserted the "global" later. It worked as well with a local variable, but I didn't understand it. Still don't :-) > Or does brute-force frame-copying cause the continuation to set "a" back to > 2 each time? No, it doesn't. Behavior is exactly the same with or without global. I'm not sure whether this is a bug or a feature. I *think* 'a' as a local has a slot in the frame, so it's actually a different 'a' living in both copies. But this would not have worked. Can it be that before a function call, the interpreter turns its locals into a dict, using fast_to_locals? That would explain it. This is not what I think it should be! Locals need to be copied.
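The fast-locals guess can be poked at from pure Python (this sketch uses today's CPython, not the 1.5 internals being patched here): a function's locals live in slots in the frame, and locals() only builds a snapshot dict, so mutating that dict never touches the real variable.

```python
def demo():
    a = 2
    snapshot = locals()   # copies the fast (slot-stored) locals into a dict
    snapshot["a"] = 99    # mutate the snapshot...
    return a              # ...the slot itself is untouched

print(demo())  # 2
```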
> > and needs a much better interface. > > Ya, like screw 'em and use threads . Never liked threads. These fibers are so neat since they need no threads, no locking, and they are available on systems without threads. > > But finally I'm quite happy that it worked so smoothly > > after just a couple of hours (well, about six :) > > Yup! Playing with Python internals is a treat. > > to-be-continued-ly y'rs - tim throw(42) - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From skip at mojam.com Tue May 18 17:49:42 1999 From: skip at mojam.com (Skip Montanaro) Date: Tue, 18 May 1999 11:49:42 -0400 Subject: [Python-Dev] Is there another way to solve the continuation problem? Message-ID: <199905181549.LAA03206@cm-29-94-2.nycap.rr.com> Okay, from my feeble understanding of the problem it appears that coroutines/continuations and threads are going to be problematic at best for Sam's needs. Are there other "solutions"? We know about state machines. They have the problem that the number of states grows exponentially (?) as the number of state variables increases. Can exceptions be coerced into providing the necessary structure without botching up the application too badly? Seems that at some point where you need to do some I/O, you could raise an exception whose second expression contains the necessary state to get back to where you need to be once the I/O is ready to go. The controller that catches the exceptions would use select or poll to prepare for the I/O then dispatch back to the handlers using the information from exceptions.
class IOSetup: pass

class WaveHands:
    """maintains exception raise info and selects one to go to next"""
    def choose_one(r,w,e): pass
    def remember(info): pass

def controller(...):
    waiters = WaveHands()
    while 1:
        r, w, e = select([...], [...], [...])
        # using r,w,e, select a waiter to call
        func, place = waiters.choose_one(r,w,e)
        try:
            func(place)
        except IOSetup, info:
            waiters.remember(info)

def spam_func(place):
    if place == "spam":
        # whatever I/O we needed to do is ready to go
        bytes = read(some_fd)
        process(bytes)
        # need to read some more from some_fd. args are:
        # function, target, fd category (r, w), selectable object,
        raise IOSetup, (spam_func, "eggs" , "r", some_fd)
    elif place == "eggs":
        # that next chunk is ready - get it and proceed...
    elif yadda, yadda, yadda...

One thread, some craftiness needed to construct things. Seems like it might isolate some of the statefulness to smaller functional units than a pure state machine. Clearly not as clean as continuations would be. Totally bogus? Totally inadequate? Maybe Sam already does things this way? Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/ skip at mojam.com | Musi-Cal: http://www.musi-cal.com/ 518-372-5583 From tismer at appliedbiometrics.com Tue May 18 19:23:08 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Tue, 18 May 1999 19:23:08 +0200 Subject: [Python-Dev] 'stackless' python? References: <000301bea0e9$4fd473a0$829e2299@tim> Message-ID: <3741A1FC.E84DC926@appliedbiometrics.com> Tim Peters wrote: > > [Christian Tismer] > > ... > > Yup. With a little counting, it was easy to survive: > >
> > def main():
> >     global a
> >     a=2
> >     thing (5)
> >     a=a-1
> >     if a:
> >         saved.throw (0)
> > Did "a" really need to be global here? I hope you see the same behavior > without the "global a"; e.g., this Scheme: Actually, the frame-copying was not enough to make this all behave correctly.
Since I didn't change the interpreter, the ceval.c incarnations still had copies to the old frames. The only effect which I achieved with frame copying was that the refcounts were increased correctly. I have to remove the hardware stack copying now. Will try to create a non-recursive version of the interpreter. ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From MHammond at skippinet.com.au Wed May 19 01:16:54 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Wed, 19 May 1999 09:16:54 +1000 Subject: [Python-Dev] Is there another way to solve the continuation problem? In-Reply-To: <199905181549.LAA03206@cm-29-94-2.nycap.rr.com> Message-ID: <006d01bea184$869f1480$0801a8c0@bobcat> > Sam's needs. Are there other "solutions"? We know about > state machines. > They have the problem that the number of states grows > exponentially (?) as > the number of state variables increases. Well, I can give you my feeble understanding of "IO Completion Ports", the technique Win32 provides to "solve" this problem. My experience is limited to how we used these in a server product designed to maintain thousands of long-term client connections each spooling large chunks of data (MSOffice docs - yes, that large :-). We too could obviously not afford a thread per connection. Searching through NT's documentation, completion ports are the technique they recommend for high-performance IO, and it appears to deliver. NT has the concept of a completion port, which in many ways is like an "inverted semaphore". You create a completion port with a "max number of threads" value. 
Then, for every IO object you need to use (files, sockets, pipes etc) you "attach" it to the completion port, along with an integer key. This key is (presumably) unique to the file, and usually a pointer to some structure maintaining the state of the file (ie, connection). The general programming model is that you have a small number of threads (possibly 1), and a large number of io objects (eg files). Each of these threads is executing a state machine. When IO is "ready" for a particular file, one of the available threads is woken, and passed the "key" associated with the file. This key identifies the file, and more importantly the state of that file. The thread uses the state to perform the next IO operation, then immediately goes back to sleep. When that IO operation completes, some other thread is woken to handle that state change. What makes this work of course is that _all_ IO is asynch - not a single IO call in this whole model can afford to block. NT provides asynch IO natively. This sounds very similar to what Medusa does internally, although the NT model provides a "thread pooling" scheme built-in. Although our server performed very well with a single thread and hundreds of high-volume connections, we chose to run with a default of 5 threads here. For those still interested, our project has the multi-threaded state machine I described above implemented in C. Most of the work is responsible for spooling the client request data (possibly 100s of kbs) before handing that data off to the real server. When the C code transitions the client through the state of "send/get from the real server", we actually set a different completion port. This other completion port wakes a thread written in Python. So our architecture consists of a C-implemented thread pool managing client connections, and a different Python-implemented thread pool that does the real work for each of these client connections.
(The Python side of the world is bound by the server we are talking to, so Python performance doesn't matter as much - C wouldn't buy enough.) This means that our state machines are not that complex. Each "thread pool" is managing its own, fairly simple state. NT automatically allows you to associate state with the IO object, and as we have multiple thread pools, each one is simple - the one spooling client data is simple, the one doing the actual server work is simple. If we had to have a single, monolithic state machine managing all aspects of the client spooling, _and_ the server work, it would be horrid. This is all in a shrink-wrapped relatively cheap "Document Management" product being targeted (successfully, it appears) at huge NT/Exchange based sites. Australia's largest Telco are implementing it, and indeed the company has VC from Intel! Lots of support from MS, as it helps compete with Domino. Not bad for a little startup - now they are wondering what to do with this Python-thingy they now have in their product that no one else has ever heard of; but they are planning on keeping it for now :-) [Funnily, when they started, they didn't think they even _needed_ a server, so I said "I'll just knock up a little one in Python", and we haven't looked back :-] Mark. From tim_one at email.msn.com Wed May 19 02:48:00 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 18 May 1999 20:48:00 -0400 Subject: [Python-Dev] 'stackless' python? In-Reply-To: <199905180403.AAA04772@eric.cnri.reston.va.us> Message-ID: <000701bea191$3f4d1a20$2e9e2299@tim> [GvR] > ... > Perhaps it would help to explain what a continuation actually does > with the run-time environment, instead of giving examples of how to > use them and what the result it? Paul Wilson (the GC guy) has a very nice-- but incomplete --intro to Scheme and its implementation: ftp://ftp.cs.utexas.edu/pub/garbage/cs345/schintro-v14/schintro_toc.html You can pick up a lot from that fast.
Is Steven (Majewski) on this list? He doped most of this out years ago. > Here's a start of my own understanding (brief because I'm on a 28.8k > connection which makes my ordinary typing habits in Emacs very > painful). > > 1. All program state is somehow contained in a single execution stack. > This includes globals (which are simply name bindings in the bottom > stack frame). Better to think of name resolution following lexical links. Lexical closures with indefinite extent are common in Scheme, so much so that name resolution is (at least conceptually) best viewed as distinct from execution stacks. Here's a key: continuations are entirely about capturing control flow state, and nothing about capturing binding or data state. Indeed, mutating bindings and/or non-local data are the ways distinct invocations of a continuation communicate with each other, and for this reason true functional languages generally don't support continuations of the call/cc flavor. > It also includes a code pointer for each stack frame indicating where > the function corresponding to that stack frame is executing (this is > the return address if there is a newer stack frame, or the current > instruction for the newest frame). Yes, although the return address is one piece of information in the current frame's continuation object -- continuations are used internally for "regular calls" too. When a function returns, it passes control thru its continuation object. That process restores-- from the continuation object --what the caller needs to know (in concept: a pointer to *its* continuation object, its PC, its name-resolution chain pointer, and its local eval stack). Another key point: a continuation object is immutable.
The point of the above is to get across that for Scheme-calling-Scheme, creating a continuation object copies just a small, fixed number of pointers (the current continuation pointer, the current name-resolution chain pointer, the PC), plus the local eval stack. This is for a "stackless" interpreter that heap-allocates name-mapping and execution-frame and continuation objects. Half the literature is devoted to optimizing one or more of those away in special cases (e.g., for continuations provably "up-level", using a stack + setjmp/longjmp instead). > I also expect that Scheme's semantic model is different than Python > here -- e.g. does it matter whether deep or shallow copies are made? > I.e. are there mutable *objects* in Scheme? (I know there are mutable > and immutable *name bindings* -- I think.) Same as Python here; Scheme isn't a functional language; has mutable bindings and mutable objects; any copies needed should be shallow, since it's "a feature" that invoking a continuation doesn't restore bindings or object values (see above re communication). > 3. Calling a continuation probably makes the saved copy of the > execution stack the current execution state; I presume there's also a > way to pass an extra argument. Right, except "stack" is the wrong mental model in the presence of continuations; it's a general rooted graph (A calls B, B saves a continuation pointing back to A, B goes on to call A, A saves a continuation pointing back to B, etc). If the explicitly saved continuations are never *invoked*, control will eventually pop back to the root of the graph, so in that sense there's *a* stack implicit at any given moment. > 4. Coroutines (which I *do* understand) are probably done by swapping > between two (or more) continuations. > > 5. Other control constructs can be done by various manipulations of > continuations. 
I presume that in many situations the saved > continuation becomes the main control locus permanently, and the > (previously) current stack is simply garbage-collected. Of course the > lazy copy makes this efficient. There's much less copying going on in Scheme-to-Scheme than you might think; other than that, right on. > If this all is close enough to the truth, I think that continuations > involving C stack frames are definitely out -- as Tim Peters > mentioned, you don't know what the stuff on the C stack of extensions > refers to. (My guess would be that Scheme implementations assume that > any pointers on the C stack point to Scheme objects, so that C stack > frames can be copied and conservative GC can be used -- this will > never happen in Python.) "Scheme" has become a generic term covering dozens of implementations with varying semantics, and a quick tour of the web suggests that cross-language Schemes generally put severe restrictions on continuations across language boundaries. Most popular seems to be to outlaw them by decree. > Continuations involving only Python stack frames might be supported, > if we can agree on the sharing / copying semantics. This is where > I don't know enough (see questions at #2 above). I'd like to go back to examples of what they'd be used for -- but fully fleshed out. In the absence of Scheme's ubiquitous lexical closures and "lambdaness" and syntax-extension facilities, I'm unsure they're going to work out reasonably in Python practice; it's not enough that they can be very useful in Scheme, and Sam is highly motivated to go to extremes here. give-me-a-womb-and-i-still-won't-give-birth-ly y'rs - tim From tismer at appliedbiometrics.com Wed May 19 03:10:15 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Wed, 19 May 1999 03:10:15 +0200 Subject: [Python-Dev] 'stackless' python? References: <000701bea191$3f4d1a20$2e9e2299@tim> Message-ID: <37420F77.48E9940F@appliedbiometrics.com> Tim Peters wrote: ...
> > Continuations involving only Python stack frames might be supported, > > if we can agree on the sharing / copying semantics. This is where > > I don't know enough (see questions at #2 above). > > I'd like to go back to examples of what they'd be used for -- but > fully fleshed out. In the absence of Scheme's ubiquitous lexical closures > and "lambdaness" and syntax-extension facilities, I'm unsure they're going > to work out reasonably in Python practice; it's not enough that they can be > very useful in Scheme, and Sam is highly motivated to go to extremes here. > > give-me-a-womb-and-i-still-won't-give-birth-ly y'rs - tim I've put quite many hours into a non-recursive ceval.c already. Should I continue? At least this would be a little improvement, also if the continuation thing will not be born. ? - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From rushing at nightmare.com Wed May 19 04:52:04 1999 From: rushing at nightmare.com (rushing at nightmare.com) Date: Tue, 18 May 1999 19:52:04 -0700 (PDT) Subject: [Python-Dev] Is there another way to solve the continuation problem? In-Reply-To: <101382377@toto.iv> Message-ID: <14146.8395.754509.591141@seattle.nightmare.com> Skip Montanaro writes: > Can exceptions be coerced into providing the necessary structure > without botching up the application too badly? Seems that at some > point where you need to do some I/O, you could raise an exception > whose second expression contains the necessary state to get back to > where you need to be once the I/O is ready to go.
The controller > that catches the exceptions would use select or poll to prepare for > the I/O then dispatch back to the handlers using the information > from exceptions. > [... code ...] Well, you just re-invented the 'Reactor' pattern! 8^) http://www.cs.wustl.edu/~schmidt/patterns-ace.html > One thread, some craftiness needed to construct things. Seems like > it might isolate some of the statefulness to smaller functional > units than a pure state machine. Clearly not as clean as > continuations would be. Totally bogus? Totally inadequate? Maybe > Sam already does things this way? What you just described is what Medusa does (well, actually, 'Python' does it now, because the two core libraries that implement this are now in the library - asyncore.py and asynchat.py). asyncore doesn't really use exceptions exactly that way, and asynchat allows you to add another layer of processing (basically, dividing the input into logical 'lines' or 'records' depending on a 'line terminator'). The same technique is at the heart of many well-known network servers, including INND, BIND, X11, Squid, etc.. It's really just a state machine underneath (with python functions or methods implementing the 'states'). As long as things don't get too complex. Python simplifies things enough to allow one to 'push the difficulty envelope' a bit further than one could reasonably tolerate in C. For example, Squid implements async HTTP (server and client, because it's a proxy) - but stops short of trying to implement async FTP. Medusa implements async FTP, but it's the largest file in the Medusa distribution, weighing in at a hefty 32KB. The hard part comes when you want to plug different pieces and protocols together. For example, building a simple HTTP or FTP server is relatively easy, but building an HTTP server *that proxied to an FTP server* is much more difficult. I've done these kinds of things, viewing each as a challenge; but past a certain point it boggles. 
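That select-and-dispatch core can be sketched in a few lines of plain Python (a toy one-pass "reactor", not the actual asyncore API; the socketpair stands in for a client connection):

```python
import select
import socket

def reactor_once(handlers):
    # One pass of a minimal reactor: wait for readable objects, then
    # dispatch each to its registered handler (its current "state").
    readable, _, _ = select.select(list(handlers), [], [], 1.0)
    for sock in readable:
        handlers[sock](sock)

# Demo: a socketpair stands in for a client connection.
a, b = socket.socketpair()
received = []
handlers = {b: lambda s: received.append(s.recv(64))}
a.sendall(b"hello")
reactor_once(handlers)
print(received)  # [b'hello']
a.close()
b.close()
```

A real server loops forever over such a pass, with the handler table mutating as connections change state -- which is exactly where the complexity Sam describes starts to pile up.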
The paper I posted about earlier by Matthew Fuchs has a really good explanation of this, but in the context of GUI event loops... I think it ties in neatly with this discussion because at the heart of any X11 app is a little guy manipulating a file descriptor. -Sam From tim_one at email.msn.com Wed May 19 07:41:39 1999 From: tim_one at email.msn.com (Tim Peters) Date: Wed, 19 May 1999 01:41:39 -0400 Subject: [Python-Dev] 'stackless' python? In-Reply-To: <14144.61765.308962.101884@seattle.nightmare.com> Message-ID: <000b01bea1ba$443a1a00$2e9e2299@tim> [Sam] > ... > Except that since the escape procedure is 'first-class' it can be > stored away and invoked (and reinvoked) later. [that's all that > 'first-class' means: a thing that can be stored in a variable, > returned from a function, used as an argument, etc..] > > I've never seen a let/cc that wasn't full-blown, but it wouldn't > surprise me. The let/cc's in question were specifically defined to create continuations valid only during let/cc's dynamic extent, so that, sure, you could store them away, but trying to invoke one later could be an error. It's in that sense I meant they weren't "first class". Other flavors of Scheme appear to call this concept "weak continuation", and use a different verb to invoke it (like call-with-escaping-continuation, or call/ec). Suspect the let/cc oddballs I found were simply confused implementations (there are a lot of amateur Scheme implementations out there!). >> Would full-blown coroutines be powerful enough for your needs? > Yes, I think they would be. But I think with Python it's going to > be just about as hard, either way. Most people on this list are comfortable with coroutines because they already understand them -- Jeremy can even reach across the hall and hand Guido a helpful book. So pondering coroutines increases the number of brain cells willing to think about the implementation.
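The "weak" escaping continuation -- valid only during its dynamic extent -- maps naturally onto exceptions. A sketch of a call/ec-style helper in Python (call_ec is a made-up name for illustration, not a standard function):

```python
class _Escape(Exception):
    pass

def call_ec(func):
    # "Escaping continuation": the escape procedure is only usable while
    # func's dynamic extent is live (invoking it later is an error), which
    # is exactly what raising an exception can model.
    esc_exc = _Escape()
    def escape(value):
        esc_exc.value = value
        raise esc_exc
    try:
        return func(escape)
    except _Escape as e:
        if e is not esc_exc:
            raise           # someone else's escape; let it keep unwinding
        return e.value

def find_first_even(rows):
    def search(escape):
        for row in rows:
            for x in row:
                if x % 2 == 0:
                    escape(x)   # non-local exit through both loops
        return None
    return call_ec(search)

print(find_first_even([[1, 3], [5, 4, 7]]))  # 4
```

Unlike full call/cc, this escape cannot be stored and re-invoked after call_ec returns -- which is the let/cc restriction described above.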
continuation-examples-leave-people-still-going-"huh?"-after-an-hour-of-explanation-ly y'rs - tim From tim_one at email.msn.com Wed May 19 07:41:45 1999 From: tim_one at email.msn.com (Tim Peters) Date: Wed, 19 May 1999 01:41:45 -0400 Subject: [Python-Dev] 'stackless' python? In-Reply-To: <3741A1FC.E84DC926@appliedbiometrics.com> Message-ID: <000e01bea1ba$47fe7500$2e9e2299@tim> [Christian Tismer] >>> ... >>> Yup. With a little counting, it was easy to survive: >>>
>>> def main():
>>>     global a
>>>     a=2
>>>     thing (5)
>>>     a=a-1
>>>     if a:
>>>         saved.throw (0)
[Tim] >> Did "a" really need to be global here? I hope you see the same behavior >> without the "global a"; [which he does, but for mysterious reasons] [Christian] > Actually, the frame-copying was not enough to make this > all behave correctly. Since I didn't change the interpreter, > the ceval.c incarnations still had copies to the old frames. > The only effect which I achieved with frame copying was > that the refcounts were increased correctly. All right! Now you're closer to the real solution; i.e., copying wasn't really needed here, but keeping stuff alive was. In Scheme terms, when we entered main originally a set of bindings was created for its locals, and it is that very same set of bindings to which the continuation returns. So the continuation *should* reuse them -- making a copy of the locals is semantically hosed. This is clearer in Scheme because its "stack" holds *only* control-flow info (bindings follow a chain of static links, independent of the current "call stack"), so there's no temptation to run off copying bindings too. elegant-and-baffling-for-the-price-of-one-ly y'rs - tim From tim_one at email.msn.com Wed May 19 07:41:56 1999 From: tim_one at email.msn.com (Tim Peters) Date: Wed, 19 May 1999 01:41:56 -0400 Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <37420F77.48E9940F@appliedbiometrics.com>
Message-ID: <001301bea1ba$4eb498c0$2e9e2299@tim>

[Christian Tismer]
> I've put quite many hours into a non-recursive ceval.c
> already.

Does that mean 6 or 600 ?

> Should I continue? At least this would be a little improvement, also
> if the continuation thing will not be born. ?

Guido wanted to move in the "flat interpreter" direction for Python2 anyway, so my belief is it's worth pursuing.

but-then-i-flipped-a-coin-with-two-heads-ly y'rs - tim

From arw at ifu.net Wed May 19 15:04:53 1999
From: arw at ifu.net (Aaron Watters)
Date: Wed, 19 May 1999 09:04:53 -0400
Subject: [Python-Dev] continuations and C extensions?
Message-ID: <3742B6F5.C6CB7313@ifu.net>

the immutable GvR intones:
> Continuations involving only Python stack frames might be supported,
> if we can agree on the sharing / copying semantics. This is where
> I don't know enough (see questions at #2 above).

What if there are native C calls mixed in (e.g., list.sort calls back to myclass.__cmp__, which decides to do a call/cc)?

One of the really big advantages of Python in my book is the relative simplicity of embedding and extensions, and this is generally one of the failings of Lisp implementations. I understand lots of Scheme implementations purport to be extensible and embeddable, but in practice you can't do it with *existing* code -- there is always a show stopper involving having to change the way some Oracle library which you don't have the source for does memory management or something... I've known several grad students who have been bitten by this...

I think having to unroll the C stack safely might be one problem area. With, e.g., a Netscape NSAPI embedding you can actually get into netscape code calls my code calls netscape code calls my code... suspends in a continuation? How would that work? [my ignorance is torment!]

Threading and extensions are probably also problematic, but at least that's better understood, I think.

Just kvetching.
Sorry. -- Aaron Watters ps: Of course there are valid reasons and excellent advantages to having continuations, but it's also interesting to consider the possible cost. There ain't no free lunch. From tismer at appliedbiometrics.com Wed May 19 21:30:18 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Wed, 19 May 1999 21:30:18 +0200 Subject: [Python-Dev] 'stackless' python? References: <000e01bea1ba$47fe7500$2e9e2299@tim> Message-ID: <3743114A.220FFA0B@appliedbiometrics.com> Tim Peters wrote: ... > [Christian] > > Actually, the frame-copying was not enough to make this > > all behave correctly. Since I didn't change the interpreter, > > the ceval.c incarnations still had copies to the old frames. > > The only effect which I achieved with frame copying was > > that the refcounts were increased correctly. > > All right! Now you're closer to the real solution ; i.e., copying > wasn't really needed here, but keeping stuff alive was. In Scheme terms, > when we entered main originally a set of bindings was created for its > locals, and it is that very same set of bindings to which the continuation > returns. So the continuation *should* reuse them -- making a copy of the > locals is semantically hosed. I tried the most simple thing, and this seemed to be duplicating the current state of the machine. The frame holds the stack, and references to all objects. By chance, the locals are not in a dict, but unpacked into the frame. (Sometimes I agree with Guido, that optimization is considered harmful :-) > This is clearer in Scheme because its "stack" holds *only* control-flow info > (bindings follow a chain of static links, independent of the current "call > stack"), so there's no temptation to run off copying bindings too. The Python stack, besides its intermingledness with the machine stack, is basically its chain of frames. The value stack pointer still hides in the machine stack, but that's easy to change. 
So the real Scheme-like part is this chain, methinks, with the current bytecode offset and value stack info.

Making a copy of this in a restartable way means to increase the refcount of all objects in a frame. Would it be correct to undo the effect of fast locals before splitting, and redo it on activation? Or do I need to rethink the whole structure? What should be natural for Python, if at all?

ciao - chris

--
Christian Tismer             :^)
Applied Biometrics GmbH      :    Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :    PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home

From jeremy at cnri.reston.va.us Wed May 19 21:46:49 1999
From: jeremy at cnri.reston.va.us (Jeremy Hylton)
Date: Wed, 19 May 1999 15:46:49 -0400 (EDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <3743114A.220FFA0B@appliedbiometrics.com>
References: <000e01bea1ba$47fe7500$2e9e2299@tim> <3743114A.220FFA0B@appliedbiometrics.com>
Message-ID: <14147.4976.608139.212336@bitdiddle.cnri.reston.va.us>

>>>>> "CT" == Christian Tismer writes:

[Tim Peters]
>> This is clearer in Scheme because its "stack" holds *only*
>> control-flow info (bindings follow a chain of static links,
>> independent of the current "call stack"), so there's no
>> temptation to run off copying bindings too.

CT> The Python stack, besides its intermingledness with the machine
CT> stack, is basically its chain of frames. The value stack pointer
CT> still hides in the machine stack, but that's easy to change. So
CT> the real Scheme-like part is this chain, methinks, with the
CT> current bytecode offset and value stack info.

CT> Making a copy of this in a restartable way means to increase the
CT> refcount of all objects in a frame. Would it be correct to undo
CT> the effect of fast locals before splitting, and redoing it on
CT> activation?
Wouldn't it be easier to increase the refcount on the frame object? Then you wouldn't need to worry about the refcounts on all the objects in the frame, because they would only be decrefed when the frame is deallocated.

It seems like the two other things you would need are some way to get a copy of the current frame and a means to invoke eval_code2 with an already existing stack frame instead of a new one.

(This sounds too simple, so it's obviously wrong. I'm just not sure where. Is the problem that you really need a separate stack/graph to hold the frames? If we leave them on the Python stack, it could be hard to dis-entangle value objects from control objects.)

Jeremy

From tismer at appliedbiometrics.com Wed May 19 22:10:16 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Wed, 19 May 1999 22:10:16 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000e01bea1ba$47fe7500$2e9e2299@tim> <3743114A.220FFA0B@appliedbiometrics.com> <14147.4976.608139.212336@bitdiddle.cnri.reston.va.us>
Message-ID: <37431AA8.BC77C615@appliedbiometrics.com>

Jeremy Hylton wrote:

[TP+CT about frame copies et al]

> Wouldn't it be easier to increase the refcount on the frame object?
> Then you wouldn't need to worry about the refcounts on all the objects
> in the frame, because they would only be decrefed when the frame is
> deallocated.

Well, the frame is supposed to be run twice, since there are two incarnations of interpreters working on it: the original one, and later, when it is thrown, another one (or the same, but in principle). The frame could have been in any state, with a couple of objects on the stack.

My splitting function can be invoked in some nested context, so I have a current opcode position and a current stack position. Running this once leaves the stack empty, since all the objects are decrefed. Running this a second time gives a GPF, since the stack is empty.
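The failure mode Christian describes has a simple model: a run consumes the frame's value stack (releasing its references), so a second activation finds it empty. Duplicating the frame first, with its own shallow copy of the stack, keeps both activations runnable. A toy sketch with plain Python objects standing in for frames (not the actual ceval.c structures):

```python
class ToyFrame:
    # Stand-in for a frame: nothing but an evaluation stack of objects.
    def __init__(self, valuestack):
        self.valuestack = valuestack

    def duplicate(self):
        # The "copy" fix: a second frame with a shallow copy of the
        # stack, which in refcounting terms increfs every object on it.
        return ToyFrame(list(self.valuestack))

    def run(self):
        # A run pops (and conceptually decrefs) everything on the stack.
        total = 0
        while self.valuestack:
            total = total + self.valuestack.pop()
        return total

f = ToyFrame([1, 2, 3])
backup = f.duplicate()   # snapshot taken *before* the first run
first = f.run()          # consumes f's stack
second = backup.run()    # the duplicate still owns its objects
```

Without the duplicate, a second `f.run()` finds the stack already empty, which is the toy version of the GPF described here.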
Therefore, I made a copy, which means creating a duplicate frame with an extra refcount for all the objects. This makes sure that both can be restarted at any time.

> It seems like the two other things you would need are some way to get
> a copy of the current frame and a means to invoke eval_code2 with an
> already existing stack frame instead of a new one.

Well, that's exactly what I'm working on.

> (This sounds too simple, so it's obviously wrong. I'm just not sure
> where. Is the problem that you really need a separate stack/graph to
> hold the frames? If we leave them on the Python stack, it could be
> hard to dis-entangle value objects from control objects.)

Oh, perhaps I should explain it a bit more clearly. What did you mean by the Python stack? The hardware machine stack?

What do we have at the moment: the stack is the linked list of frames. Every frame has a local Python evaluation stack. Calls of Python functions produce a new frame, and the old one is put beneath. This is the control stack. The additional info on the hardware stack happens to be a parallel friend of this chain, and currently holds extra info, but this is an artifact. Adding the current Python stack level to the frame makes the hardware stack totally unnecessary.

There is a possible speed loss, anyway. Today, the recursive call of ceval2 is optimized and quite fast. The non-recursive version will have to copy variables in and out from the frames instead, so there is of course a little speed penalty to pay.

ciao - chris

--
Christian Tismer             :^)
Applied Biometrics GmbH      :    Have a break!
Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :    PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home

From tismer at appliedbiometrics.com Wed May 19 23:38:07 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Wed, 19 May 1999 23:38:07 +0200
Subject: [Python-Dev] 'stackless' python?
References: <001301bea1ba$4eb498c0$2e9e2299@tim>
Message-ID: <37432F3F.2694DA0E@appliedbiometrics.com>

Tim Peters wrote:
>
> [Christian Tismer]
> > I've put quite many hours into a non-recursive ceval.c
> > already.
>
> Does that mean 6 or 600 ?

6, or 10, or 20, if I count the time from the first start with Sam's code, maybe.

> > Should I continue? At least this would be a little improvement, also
> > if the continuation thing will not be born. ?
>
> Guido wanted to move in the "flat interpreter" direction for Python2 anyway,
> so my belief is it's worth pursuing.
>
> but-then-i-flipped-a-coin-with-two-heads-ly y'rs - tim

Right. Whose faces? :-)

On the stackless thing, what should I do? I started to insert minimum patches, but it turns out that I have to change frames a little (extending). I can make quite small changes to the interpreter to replace the recursive calls, but this involves extra flags in some cases, where the interpreter is called the first time and so on.

Which has a better chance of being included in a future Python: tweaking the current thing only minimally, to keep it as similar as possible to the former? Or doing as much redesign as I think is needed to do it in a clean way? The latter would mean splitting eval_code2 into two functions, where one is the interpreter kernel and one is the frame manager.

There are also other places which do quite deep function calls and finally call eval_code2. I think these should return a frame object now.
I could convince them to call or return a frame, depending on a flag, but it would be cleaner to rename the functions, let them always deal with frames, and put the original function on top of that.

In short, I can do larger changes which clean this all up a bit, or I can make small changes which are more tricky to grasp but give just small diffs. How to touch untouchable code the best? :-)

--
Christian Tismer             :^)
Applied Biometrics GmbH      :    Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :    PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home

From jeremy at cnri.reston.va.us Wed May 19 23:49:38 1999
From: jeremy at cnri.reston.va.us (Jeremy Hylton)
Date: Wed, 19 May 1999 17:49:38 -0400 (EDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <37432F3F.2694DA0E@appliedbiometrics.com>
References: <001301bea1ba$4eb498c0$2e9e2299@tim> <37432F3F.2694DA0E@appliedbiometrics.com>
Message-ID: <14147.12613.88669.456608@bitdiddle.cnri.reston.va.us>

I think it makes sense to avoid being obscure or unclear in order to minimize the size of the patch or the diff. Realistically, it's unlikely that anything like your original patch is going to make it into the CVS tree. Its primary value is as proof of concept and as code that the rest of us can try out. If you make large changes, but they are clearer, you'll help us out a lot.

We can worry about minimizing the impact of the changes on the codebase afterward, once everyone has figured out what's going on and agrees that it's worth doing.

feeling-much-more-confident-because-I-didn't-say-continuation-ly yr's,
Jeremy

From tismer at appliedbiometrics.com Thu May 20 00:25:20 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Thu, 20 May 1999 00:25:20 +0200
Subject: [Python-Dev] 'stackless' python?
References: <001301bea1ba$4eb498c0$2e9e2299@tim> <37432F3F.2694DA0E@appliedbiometrics.com> <14147.12613.88669.456608@bitdiddle.cnri.reston.va.us>
Message-ID: <37433A50.31E66CB1@appliedbiometrics.com>

Jeremy Hylton wrote:
>
> I think it makes sense to avoid being obscure or unclear in order to
> minimize the size of the patch or the diff. Realistically, it's
> unlikely that anything like your original patch is going to make it
> into the CVS tree. Its primary value is as proof of concept and as
> code that the rest of us can try out. If you make large changes, but
> they are clearer, you'll help us out a lot.

Many many thanks. This is good advice. I will make absolutely clear what's going on, keep parts as untouched as possible, cut out parts which must change, and I will not look into speed too much. Better to have one function call more and a bit less optimization, but a clear and rock-solid introduction of a concept.

> We can worry about minimizing the impact of the changes on the
> codebase afterward, once everyone has figured out what's going on and
> agrees that it's worth doing.
>
> feeling-much-more-confident-because-I-didn't-say-continuation-ly yr's,
> Jeremy

Hihi - the new little slot with local variables of the interpreter happens to have the name "continuation". Maybe I'd better rename it to "activation record"?

Now there is no longer a recursive call. Instead, a frame object is returned, which is waiting to be activated by a dispatcher.

Some more ideas are popping up. Right now, only the recursive calls can vanish. Callbacks from C code which is called by the interpreter which is called by... are still a problem. But they might perhaps vanish completely. We have to see how much the cost is. But what if I can manage to let the interpreter duck and cover also on every call to a builtin? The interpreter again returns to the dispatcher, which then calls the builtin. Well, if that builtin happens to call the interpreter again, it will be a dispatcher again.
The machine stack grows a little, but since everything is saved in the frames, these stacks are no longer related. This means, the principle works with existing extension modules, since interpreter-world and C-stack world are decoupled. To avoid stack growth, of course a number of builtins would be better changed, but it is no must in the first place. execfile for instance is a candidate which needn't call the interpreter. It could equally parse the file, generate the code object, build a frame and just return it. This is what the dispatcher likes: returned frames are put on the chain and fired. waah, my bus - running - ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From tim_one at email.msn.com Thu May 20 01:56:33 1999 From: tim_one at email.msn.com (Tim Peters) Date: Wed, 19 May 1999 19:56:33 -0400 Subject: [Python-Dev] A "real" continuation example In-Reply-To: <3743114A.220FFA0B@appliedbiometrics.com> Message-ID: <000701bea253$3a182a00$179e2299@tim> I'm home sick today, so tortured myself <0.9 wink>. Sam mentioned using coroutines to compare the fringes of two trees, and I picked a simpler problem: given a nested list structure, generate the leaf elements one at a time, in left-to-right order. A solution to Sam's problem can be built on that, by getting a generator for each tree and comparing the leaves a pair at a time until there's a difference. Attached are solutions in Icon, Python and Scheme. I have the least experience with Scheme, but browsing around didn't find a better Scheme approach than this. 
The Python solution is the least satisfactory, using an explicit stack to simulate recursion by hand; if you didn't know the routine's purpose in advance, you'd have a hard time guessing it. The Icon solution is very short and simple, and I'd guess obvious to an average Icon programmer. It uses the subset of Icon ("generators") that doesn't require any C-stack trickery. However, alone of the three, it doesn't create a function that could be explicitly called from several locations to produce "the next" result; Icon's generators are tied into Icon's unique control structures to work their magic, and breaking that connection requires moving to full-blown Icon coroutines. It doesn't need to be that way, though. The Scheme solution was the hardest to write, but is a largely mechanical transformation of a recursive fringe-lister that constructs the entire fringe in one shot. Continuations are used twice: to enable the recursive routine to resume itself where it left off, and to get each leaf value back to the caller. Getting that to work required rebinding non-local identifiers in delicate ways. I doubt the intent would be clear to an average Scheme programmer. So what would this look like in Continuation Python? Note that each place the Scheme says "lambda" or "letrec", it's creating a new lexical scope, and up-level references are very common. Two functions are defined at top level, but seven more at various levels of nesting; the latter can't be pulled up to the top because they refer to vrbls local to the top-level functions. Another (at least initially) discouraging thing to note is that Scheme schemes for hiding the pain of raw call/cc often use Scheme's macro facilities. 
may-not-be-as-fun-as-it-sounds-ly y'rs - tim

Here's the Icon:

procedure main()
    x := [[1, [[2, 3]]], [4], [], [[[5]], 6]]
    every writes(fringe(x), " ")
    write()
end

procedure fringe(node)
    if type(node) == "list" then
        suspend fringe(!node)
    else
        suspend node
end

Here's the Python:

from types import ListType

class Fringe:
    def __init__(self, value):
        self.stack = [(value, 0)]

    def __getitem__(self, ignored):
        while 1:
            # find topmost pending list with something to do
            while 1:
                if not self.stack:
                    raise IndexError
                v, i = self.stack[-1]
                if i < len(v):
                    break
                self.stack.pop()
            this = v[i]
            self.stack[-1] = (v, i+1)
            if type(this) is ListType:
                self.stack.append((this, 0))
            else:
                break
        return this

testcase = [[1, [[2, 3]]], [4], [], [[[5]], 6]]
for x in Fringe(testcase):
    print x,
print

Here's the Scheme:

(define list->generator
  ; Takes a list as argument.
  ; Returns a generator g such that each call to g returns
  ; the next element in the list's symmetric-order fringe.
  (lambda (x)
    (letrec {(produce-value #f)   ; set to return-to continuation
             (looper
              (lambda (x)
                (cond ((null? x) 'nada)   ; ignore null
                      ((list? x)
                       (looper (car x))
                       (looper (cdr x)))
                      (else
                       ; want to produce this non-list fringe elt,
                       ; and also resume here
                       (call/cc
                        (lambda (here)
                          (set! getnext
                                (lambda () (here 'keep-going)))
                          (produce-value x)))))))
             (getnext
              (lambda ()
                (looper x)
                ; have to signal end of sequence somehow;
                ; assume false isn't a legitimate fringe elt
                (produce-value #f)))}
      ; return niladic function that returns next value
      (lambda ()
        (call/cc
         (lambda (k)
           (set! produce-value k)
           (getnext)))))))

(define display-fringe
  (lambda (x)
    (letrec ((g (list->generator x))
             (thiselt #f)
             (looper
              (lambda ()
                (set! thiselt (g))
                (if thiselt
                    (begin (display thiselt) (display " ") (looper))))))
      (looper))))

(define test-case '((1 ((2 3))) (4) () (((5)) 6)))

(display-fringe test-case)

From MHammond at skippinet.com.au Thu May 20 02:14:24 1999
From: MHammond at skippinet.com.au (Mark Hammond)
Date: Thu, 20 May 1999 10:14:24 +1000
Subject: [Python-Dev] Interactive Debugging of Python
Message-ID: <008b01bea255$b80cf790$0801a8c0@bobcat>

All this talk about stack frames and manipulating them at runtime has reminded me of one of my biggest gripes about Python. When I say "biggest gripe", I really mean "biggest surprise" or "biggest shame".

That is, Python is very interactive and dynamic. However, when I am debugging Python, it seems to lose this. There is no way for me to effectively change a running program. Now with VC6, I can do this with C. Although it is slow and a little dumb, I can change the C side of my Python world while my program is running, but not the Python side of the world.

I'm wondering how feasible it would be to change Python code _while_ running under the debugger. Presumably this would require a way of recompiling the current block of code, patching this code back into the object, and somehow tricking the stack frame to use this new block of code; even if a first cut had to restart the block or somesuch...

Any thoughts on this?

Mark.

From tim_one at email.msn.com Thu May 20 04:41:03 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Wed, 19 May 1999 22:41:03 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <3743114A.220FFA0B@appliedbiometrics.com>
Message-ID: <000901bea26a$34526240$179e2299@tim>

[Christian Tismer]
> I tried the most simple thing, and this seemed to be duplicating
> the current state of the machine. The frame holds the stack,
> and references to all objects.
> By chance, the locals are not in a dict, but unpacked into
> the frame. (Sometimes I agree with Guido, that optimization
> is considered harmful :-)

I don't see that the locals are a problem here -- provided you simply leave them alone.

> The Python stack, besides its intermingledness with the machine
> stack, is basically its chain of frames.

Right.

> The value stack pointer still hides in the machine stack, but
> that's easy to change.

I'm not sure what "value stack" means here, or "machine stack". The latter means the C stack? Then I don't know which values you have in mind that are hiding in it (the locals are, as you say, unpacked in the frame, and the evaluation stack too). By "evaluation stack" I mean specifically f->f_valuestack; the current *top* of stack pointer (specifically stack_pointer) lives in the C stack -- is that what we're talking about?

Whichever, when we're talking about the code, let's use the names the code uses.

> So the real Scheme-like part is this chain, methinks, with
> the current bytecode offset and value stack info.

Curiously, f->f_lasti is already materialized every time we make a call, in order to support tracing. So if capturing a continuation is done via a function call (hard to see any other way it could be done), a bytecode offset is already getting saved in the frame object.

> Making a copy of this in a restartable way means to increase
> the refcount of all objects in a frame.

You later had a vision of splitting the frame into two objects -- I think. Whichever part the locals live in should not be copied at all, but merely have its (single) refcount increased. The other part hinges on details of your approach I don't know. The nastiest part seems to be f->f_valuestack, which conceptually needs to be (shallow) copied in the current frame and in all other frames reachable from the current frame's continuation (the chain rooted at f->f_back today); that's the sum total (along with the same frames' bytecode offsets) of capturing the control flow state.
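Tim's "sum total" of control-flow state can be sketched concretely: walk the f_back chain and record, for each frame, the bytecode offset plus a shallow copy of the value stack. The attribute names below are borrowed from the real frame object, but the classes themselves are only an illustration:

```python
class ToyFrame:
    # Minimal stand-in for PyFrameObject: caller link, bytecode offset,
    # and evaluation stack.
    def __init__(self, back=None):
        self.f_back = back        # caller's frame, or None
        self.f_lasti = 0          # last bytecode offset executed
        self.f_valuestack = []    # evaluation-stack contents

def capture_control_state(frame):
    # One (f_lasti, shallow stack copy) pair per reachable frame;
    # the locals are deliberately left untouched.
    snap = []
    while frame is not None:
        snap.append((frame.f_lasti, list(frame.f_valuestack)))
        frame = frame.f_back
    return snap

outer = ToyFrame()
outer.f_valuestack.append('arg')
inner = ToyFrame(back=outer)
inner.f_lasti = 42
inner.f_valuestack.append('tmp')

saved = capture_control_state(inner)
inner.f_valuestack.pop()      # later execution mutates the live stacks...
outer.f_lasti = 7             # ...but the snapshot is unaffected
```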
> Would it be correct to undo the effect of fast locals before
> splitting, and redoing it on activation?

Unsure what splitting means, but in any case I can't conceive of a reason for doing anything to the locals. Their values aren't *supposed* to get restored upon continuation invocation, so there's no reason to do anything with their values upon continuation creation either. Right? Or are we talking about different things?

almost-as-good-as-pantomime-ly y'rs - tim

From rushing at nightmare.com Thu May 20 06:04:20 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Wed, 19 May 1999 21:04:20 -0700 (PDT)
Subject: [Python-Dev] A "real" continuation example
In-Reply-To: <50692631@toto.iv>
Message-ID: <14147.34175.950743.79464@seattle.nightmare.com>

Tim Peters writes:
> The Scheme solution was the hardest to write, but is a largely
> mechanical transformation of a recursive fringe-lister that
> constructs the entire fringe in one shot. Continuations are used
> twice: to enable the recursive routine to resume itself where it
> left off, and to get each leaf value back to the caller. Getting
> that to work required rebinding non-local identifiers in delicate
> ways. I doubt the intent would be clear to an average Scheme
> programmer.

It's the only way to do it - every example I've seen of using call/cc looks just like it.

I reworked your Scheme a bit. IMHO letrec is for compilers, not for people. The following should be equivalent:

(define (list->generator x)
  (let ((produce-value #f))
    (define (looper x)
      (cond ((null? x) 'nada)
            ((list? x)
             (looper (car x))
             (looper (cdr x)))
            (else
             (call/cc
              (lambda (here)
                (set! getnext (lambda () (here 'keep-going)))
                (produce-value x))))))
    (define (getnext)
      (looper x)
      (produce-value #f))
    (lambda ()
      (call/cc
       (lambda (k)
         (set! produce-value k)
         (getnext))))))

(define (display-fringe x)
  (let ((g (list->generator x)))
    (let loop ((elt (g)))
      (if elt
          (begin (display elt) (display " ") (loop (g)))))))

(define test-case '((1 ((2 3))) (4) () (((5)) 6)))

(display-fringe test-case)

> So what would this look like in Continuation Python?

Here's my first hack at it. Most likely wrong. It is REALLY HARD to do this without having the feature to play with.

This presumes a function "call_cc" that behaves like Scheme's. I believe the extra level of indirection is necessary. (i.e., call_cc takes a function as an argument that takes a continuation function)

class list_generator:
    def __init__ (self, x):
        self.x = x
        self.k_suspend = None
        self.k_produce = None

    def walk (self, x):
        if type(x) == type([]):
            for item in x:
                self.walk (item)
        else:
            self.item = x
            # call self.suspend() with a continuation
            # that will continue walking the tree
            call_cc (self.suspend)

    def __call__ (self):
        # call self.resume() with a continuation
        # that will return the next fringe element
        return call_cc (self.resume)

    def resume (self, k_produce):
        self.k_produce = k_produce
        if self.k_suspend:
            # resume the suspended walk
            self.k_suspend (None)
        else:
            self.walk (self.x)

    def suspend (self, k_suspend):
        self.k_suspend = k_suspend
        # return a value for __call__
        self.k_produce (self.item)

Variables holding continuations have a 'k_' prefix. In real life it might be possible to put the suspend/call/resume machinery in a base class (Generator?), and override 'walk' as you please.

-Sam

From tim_one at email.msn.com Thu May 20 09:21:45 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 20 May 1999 03:21:45 -0400
Subject: [Python-Dev] A "real" continuation example
In-Reply-To: <14147.34175.950743.79464@seattle.nightmare.com>
Message-ID: <001d01bea291$6b3efbc0$179e2299@tim>

[Sam, takes up the Continuation Python Challenge]

Thanks, Sam! I think this is very helpful.

> ...
> It's the only way to do it - every example I've seen of using call/cc > looks just like it. Same here -- alas <0.5 wink>. > I reworked your Scheme a bit. IMHO letrec is for compilers, not for > people. The following should be equivalent: I confess I stopped paying attention to Scheme after R4RS, and largely because the std decreed that *so* many forms were optional. Your rework is certainly nicer, but internal defines and named let are two that R4RS refused to require, so I always avoided them. BTW, I *am* a compiler, so that never bothered me . >> So what would this look like in Continuation Python? > Here's my first hack at it. Most likely wrong. It is REALLY HARD to > do this without having the feature to play with. Fully understood. It's also really hard to implement the feature without knowing how someone who wants it would like it to behave. But I don't think anyone is getting graded on this, so let's have fun . Ack! I have to sleep. Will study the code in detail later, but first impression was it looked good! Especially nice that it appears possible to package up most of the funky call_cc magic in a base class, so that non-wizards could reuse it by following a simple protocol. great-fun-to-come-up-with-one-of-these-but-i'd-hate-to-have-to-redo- from-scratch-every-time-ly y'rs - tim From skip at mojam.com Thu May 20 15:27:59 1999 From: skip at mojam.com (Skip Montanaro) Date: Thu, 20 May 1999 09:27:59 -0400 (EDT) Subject: [Python-Dev] A "real" continuation example In-Reply-To: <14147.34175.950743.79464@seattle.nightmare.com> References: <50692631@toto.iv> <14147.34175.950743.79464@seattle.nightmare.com> Message-ID: <14148.3389.962368.221063@cm-29-94-2.nycap.rr.com> Sam> I reworked your Scheme a bit. IMHO letrec is for compilers, not for Sam> people. Sam, you are aware of course that the timbot *is* a compiler, right? ;-) >> So what would this look like in Continuation Python? Sam> Here's my first hack at it. Most likely wrong. 
It is REALLY HARD to Sam> do this without having the feature to play with. The thought that it's unlikely one could arrive at a reasonable approximation of a correct solution for such a small problem without the ability to "play with" it is sort of scary. Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/ skip at mojam.com | Musi-Cal: http://www.musi-cal.com/ 518-372-5583 From tismer at appliedbiometrics.com Thu May 20 16:10:32 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Thu, 20 May 1999 16:10:32 +0200 Subject: [Python-Dev] Interactive Debugging of Python References: <008b01bea255$b80cf790$0801a8c0@bobcat> Message-ID: <374417D8.8DBCB617@appliedbiometrics.com> Mark Hammond wrote: > > All this talk about stack frames and manipulating them at runtime has > reminded me of one of my biggest gripes about Python. When I say "biggest > gripe", I really mean "biggest surprise" or "biggest shame". > > That is, Python is very interactive and dynamic. However, when I am > debugging Python, it seems to lose this. There is no way for me to > effectively change a running program. Now with VC6, I can do this with C. > Although it is slow and a little dumb, I can change the C side of my Python > world while my program is running, but not the Python side of the world. > > Im wondering how feasable it would be to change Python code _while_ running > under the debugger. Presumably this would require a way of recompiling the > current block of code, patching this code back into the object, and somehow > tricking the stack frame to use this new block of code; even if a first-cut > had to restart the block or somesuch... > > Any thoughts on this? I'm writing a prototype of a stackless Python, which means that you will be able to access the current state of the interpreter completely. The inner interpreter loop will be isolated from the frame dispatcher. It will break whenever the ticker goes zero. 
If you set the ticker to one, you will be able to single-step on every opcode, and have the value stack, the frame chain, everything. I think with this you can do very much. But tell me if you want a callback hook somewhere.

ciao - chris

--
Christian Tismer             :^)
Applied Biometrics GmbH      :    Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :    PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home

From tismer at appliedbiometrics.com Thu May 20 18:52:21 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Thu, 20 May 1999 18:52:21 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000901bea26a$34526240$179e2299@tim>
Message-ID: <37443DC5.1330EAC6@appliedbiometrics.com>

Cleaning up, clarifying, trying to understand...

Tim Peters wrote:
>
> [Christian Tismer]
> > I tried the most simple thing, and this seemed to be duplicating
> > the current state of the machine. The frame holds the stack,
> > and references to all objects.
> > By chance, the locals are not in a dict, but unpacked into
> > the frame. (Sometimes I agree with Guido, that optimization
> > is considered harmful :-)
>
> I don't see that the locals are a problem here -- provided you simply leave
> them alone.

This depends on whether I have to duplicate frames or not. Below...

> > The Python stack, besides its intermingledness with the machine
> > stack, is basically its chain of frames.
>
> Right.
>
> > The value stack pointer still hides in the machine stack, but
> > that's easy to change.
>
> I'm not sure what "value stack" means here, or "machine stack". The latter
> means the C stack? Then I don't know which values you have in mind that are
> hiding in it (the locals are, as you say, unpacked in the frame, and the
> evaluation stack too).
> By "evaluation stack" I mean specifically
> f->f_valuestack; the current *top* of stack pointer (specifically
> stack_pointer) lives in the C stack -- is that what we're talking about?

Exactly!

> Whichever, when we're talking about the code, let's use the names the code
> uses.

The evaluation stack pointer is a local variable in the C stack and must be written to the frame to become independent from the C stack. Sounds better now?

> > So the real Scheme-like part is this chain, methinks, with
> > the current bytecode offset and value stack info.
>
> Curiously, f->f_lasti is already materialized every time we make a call, in
> order to support tracing. So if capturing a continuation is done via a
> function call (hard to see any other way it could be done), a
> bytecode offset is already getting saved in the frame object.

You got me. I'm just completing what is partially there.

> > Making a copy of this in a restartable way means to increase
> > the refcount of all objects in a frame.
>
> You later had a vision of splitting the frame into two objects -- I think.

My wrong wording. Not splitting, but duplicating. If a frame is the current state, I make it two frames to have two current states. One will be saved, the other will be run. This is what I call "splitting". Actually, splitting must occur whenever a frame can be reached twice, in order to keep elements alive.

> Whichever part the locals live in should not be copied at all, but merely
> have its (single) refcount increased. The other part hinges on details of
> your approach I don't know. The nastiest part seems to be f->f_valuestack,
> which conceptually needs to be (shallow) copied in the current frame and in
> all other frames reachable from the current frame's continuation (the chain
> rooted at f->f_back today); that's the sum total (along with the same
> frames' bytecode offsets) of capturing the control flow state.

Well, I see. You want one locals and one globals, shared by two incarnations.
Gets me into trouble.

> > Would it be correct to undo the effect of fast locals before
> > splitting, and redoing it on activation?
>
> Unsure what splitting means, but in any case I can't conceive of a reason
> for doing anything to the locals. Their values aren't *supposed* to get
> restored upon continuation invocation, so there's no reason to do anything
> with their values upon continuation creation either. Right? Or are we
> talking about different things?

Let me explain. What Python does right now is: When a function is invoked, all local variables are copied into fast_locals, well of course just references are copied and counts increased. These fast locals give a lot of speed today, we must have them.

You are saying I have to share locals between frames. Besides that will be a reasonable slowdown, since an extra structure must be built and accessed indirectly (right now, it's all fast, living in the one frame buffer), I cannot say that I'm convinced that this is what we need.

Suppose you have a function

    def f(x):
        # do something
        ...
        # in some context, wanna have a snapshot
        global snapshot  # initialized to None
        if not snapshot:
            snapshot = callcc.new()
        # continue computation
        x = x+1
        ...

What I want to achieve is that I can run this again, from my snapshot. But with shared locals, my parameter x of the snapshot would have changed to x+1, which I don't find useful. I want to fix a state of the current frame and still think it should "own" its locals. Globals are borrowed, anyway. Class instances will anyway do what you want, since the local "self" is a mutable object.

How do you want to keep computations independent when locals are shared? For me it's just easier to implement and also to think with the shallow copy. Otherwise, where is my private place? Open for becoming convinced, of course :-)

ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break!
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From jeremy at cnri.reston.va.us Thu May 20 21:26:30 1999 From: jeremy at cnri.reston.va.us (Jeremy Hylton) Date: Thu, 20 May 1999 15:26:30 -0400 (EDT) Subject: [Python-Dev] 'stackless' python? In-Reply-To: <37443DC5.1330EAC6@appliedbiometrics.com> References: <000901bea26a$34526240$179e2299@tim> <37443DC5.1330EAC6@appliedbiometrics.com> Message-ID: <14148.21750.738559.424456@bitdiddle.cnri.reston.va.us> >>>>> "CT" == Christian Tismer writes: CT> What I want to achieve is that I can run this again, from my CT> snapshot. But with shared locals, my parameter x of the snapshot CT> would have changed to x+1, which I don't find useful. I want to CT> fix a state of the current frame and still think it should "own" CT> its locals. Globals are borrowed, anyway. Class instances will CT> anyway do what you want, since the local "self" is a mutable CT> object. CT> How do you want to keep computations independent when locals are CT> shared? For me it's just easier to implement and also to think CT> with the shallow copy. Otherwise, where is my private place? CT> Open for becoming convinced, of course :-) I think you're making things a lot more complicated by trying to instantiate new variable bindings for locals every time you create a continuation. Can you give an example of why that would be helpful? (Ok. I'm not sure I can offer a good example of why it would be helpful to share them, but it makes intuitive sense to me.) The call_cc mechanism is going to let you capture the current continuation, save it somewhere, and call on it again as often as you like. Would you get a fresh locals each time you used it? or just the first time? If only the first time, it doesn't seem that you've gained a whole lot. 
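[The copy-vs-share question can be made concrete with plain dictionaries. This is an illustration only, with made-up names; there is no real call/cc at work here.]

```python
# Two candidate behaviors for a captured frame's locals.
def snapshot_copy(frame_locals):
    # Christian's proposal: shallow-copy the locals, so a resumed
    # snapshot sees the values as they were at capture time.
    return dict(frame_locals)

def snapshot_share(frame_locals):
    # The Scheme-style alternative: share the bindings, so later
    # rebindings are visible to every resumption.
    return frame_locals

locals_ = {"x": 5}
copied = snapshot_copy(locals_)
shared = snapshot_share(locals_)
locals_["x"] = locals_["x"] + 1   # the running frame keeps computing
print(copied["x"], shared["x"])   # prints: 5 6
```

Only the rebinding of x differs between the two; a mutable object reachable from both dicts would be shared either way.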
Also, all the locals that are references to mutable objects are already effectively shared. So it's only a few oddballs like ints that are an issue.

Jeremy

From tim_one at email.msn.com Fri May 21 00:04:04 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 20 May 1999 18:04:04 -0400 Subject: [Python-Dev] A "real" continuation example In-Reply-To: <14148.3389.962368.221063@cm-29-94-2.nycap.rr.com> Message-ID: <000601bea30c$ad51b220$9d9e2299@tim>

[Tim]
> So what would this look like in Continuation Python?

[Sam]
> Here's my first hack at it. Most likely wrong. It is
> REALLY HARD to do this without having the feature to play with.

[Skip]
> The thought that it's unlikely one could arrive at a reasonable
> approximation of a correct solution for such a small problem without the
> ability to "play with" it is sort of scary.

Yes it is. But while the problem is small, it's not easy, and only the Icon solution wrote itself (not a surprise -- Icon was designed for expressing this kind of algorithm, and the entire language is actually warped towards it). My first stab at the Python stack-fiddling solution had bugs too, but I conveniently didn't post that. After studying Sam's code, I expect it *would* work as written, so it's a decent bet that it's a reasonable approximation to a correct solution as-is.

A different Python approach using threads can be built using Demo/threads/Generator.py from the source distribution. To make that a fair comparison, I would have to post the supporting machinery from Generator.py too -- and we can ask Guido whether Generator.py worked right the first time he tried it.

The continuation solution is subtle, requiring real expertise; but the threads solution doesn't fare any better on that count (building the support machinery with threads is also a baffler if you don't have thread expertise). If we threw Python metaclasses into the pot too, they'd be a third kind of nightmare for the non-expert.
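[The thread-based route Tim mentions can be sketched roughly like so in present-day Python. This is a loose reconstruction of the *idea* behind Demo/threads/Generator.py, not the distribution's actual code; the class and helper names are invented for the example.]

```python
import queue
import threading

_DONE = object()   # sentinel marking producer exhaustion

class ThreadGenerator:
    """Run a producer in its own thread; hand its values to the
    consumer through a rendezvous queue.  put() plays the role that
    'suspend' plays in Icon, and call_cc(self.suspend) in Sam's code."""
    def __init__(self, producer):
        self._q = queue.Queue(maxsize=1)   # producer blocks until consumed
        t = threading.Thread(target=self._run, args=(producer,), daemon=True)
        t.start()

    def _run(self, producer):
        producer(self._q.put)              # producer calls put() to "suspend"
        self._q.put(_DONE)

    def __iter__(self):
        while True:
            item = self._q.get()
            if item is _DONE:
                return
            yield item

def walk(put, x=[1, [2, [3]], 4]):
    # the tree-walker from the example, with put() hiding the machinery
    for item in x:
        if isinstance(item, list):
            walk(put, item)
        else:
            put(item)

print(list(ThreadGenerator(walk)))   # [1, 2, 3, 4]
```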
So, if you're faced with this kind of task, there's simply no easy way to get it done. Thread- and (it appears) continuation-based machinery can be crafted once by an expert, then packaged into an easy-to-use protocol for non-experts.

All in all, I view continuations as a feature most people should actively avoid! I think it has that status in Scheme too (e.g., the famed Schemer's SICP textbook doesn't even mention call/cc). Its real value (if any) is as a Big Invisible Hammer for certified wizards. Where call_cc leaks into the user's view of the world I'd try to hide it; e.g., where Sam has

    def walk (self, x):
        if type(x) == type([]):
            for item in x:
                self.walk (item)
        else:
            self.item = x
            # call self.suspend() with a continuation
            # that will continue walking the tree
            call_cc (self.suspend)

I'd do

    def walk(self, x):
        if type(x) == type([]):
            for item in x:
                self.walk(item)
        else:
            self.put(x)

where "put" is inherited from the base class (part of the protocol) and hides the call_cc business. Do enough of this, and we'll rediscover why Scheme demands that tail calls not push a new stack frame <0.9 wink>.

the-tradeoffs-are-murky-ly y'rs - tim

From tim_one at email.msn.com Fri May 21 00:04:09 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 20 May 1999 18:04:09 -0400 Subject: [Python-Dev] 'stackless' python? In-Reply-To: <37443DC5.1330EAC6@appliedbiometrics.com> Message-ID: <000701bea30c$af7a1060$9d9e2299@tim>

[Christian]
[... clarified stuff ... thanks! ... much clearer ...]
> ...
> If a frame is the current state, I make it two frames to have two
> current states. One will be saved, the other will be run. This is
> what I call "splitting". Actually, splitting must occur whenever
> a frame can be reached twice, in order to keep elements alive.

That part doesn't compute: if a frame can be reached by more than one path, its refcount must be at least equal to the number of its immediate predecessors, and its refcount won't fall to 0 before it becomes unreachable.
So while you may need to split stuff for *some* reasons, I can't see how keeping elements alive could be one of those reasons (unless you're zapping frame contents *before* the frame itself is garbage?).

> ...
> Well, I see. You want one locals and one globals, shared by two
> incarnations. Gets me into trouble.

Just clarifying what Scheme does. Since they've been doing this forever, I don't want to toss their semantics on a whim. It's at least a conceptual thing: why *should* locals follow different rules than globals? If Python2 grows lexical closures, the only thing special about today's "locals" is that they happen to be the first guys found on the search path. Conceptually, that's really all they are today too.

Here's the clearest Scheme example I can dream up:

    (define k #f)

    (define (printi i)
      (display "i is ")
      (display i)
      (newline))

    (define (test n)
      (let ((i n))
        (printi i)
        (set! i (- i 1))
        (printi i)
        (display "saving continuation")
        (newline)
        (call/cc (lambda (here) (set! k here)))
        (set! i (- i 1))
        (printi i)
        (set! i (- i 1))
        (printi i)))

No loops, no recursive calls, just a straight chain of fiddle-a-local ops. Here's some output:

    > (test 5)
    i is 5
    i is 4
    saving continuation
    i is 3
    i is 2
    > (k #f)
    i is 1
    i is 0
    > (k #f)
    i is -1
    i is -2
    > (k #f)
    i is -3
    i is -4

So there's no question about what Scheme thinks is proper behavior here.

> ...
> Let me explain. What Python does right now is:
> When a function is invoked, all local variables are copied
> into fast_locals, well of course just references are copied
> and counts increased. These fast locals give a lot of speed
> today, we must have them.

Scheme (most of 'em, anyway) also resolves locals via straight base + offset indexing.

> You are saying I have to share locals between frames.
> Besides that will be a reasonable slowdown, since an extra structure
> must be built and accessed indirectly (right now, it's all fast,
> living in the one frame buffer),

GETLOCAL and SETLOCAL simply index off of the fastlocals pointer; it doesn't care where that points *to*.

> I cannot say that I'm convinced that this is what we need.
>
> Suppose you have a function
>
>     def f(x):
>         # do something
>         ...
>         # in some context, wanna have a snapshot
>         global snapshot  # initialized to None
>         if not snapshot:
>             snapshot = callcc.new()
>         # continue computation
>         x = x+1
>         ...
>
> What I want to achieve is that I can run this again, from my
> snapshot. But with shared locals, my parameter x of the
> snapshot would have changed to x+1, which I don't find useful.

You need a completely fleshed-out example to score points here: the use of call/cc is subtle, hinging on details, and fragments ignore too much. If you do want the same x,

    commonx = x
    if not snapshot:
        # get the continuation
    # continue computation
    x = commonx
    x = x+1
    ...

That is, it's easy to get it. But if you *do* want to see changes to the locals (which is one way for those distinct continuation invocations to *cooperate* in solving a task -- see below), but the implementation doesn't allow for it, I don't know what you can do to worm around it short of making x global too. But then different *top* level invocations of f will stomp on that shared global, so that's not a solution either. Maybe forget functions entirely and make everything a class method.

> I want to fix a state of the current frame and still think
> it should "own" its locals. Globals are borrowed, anyway.
> Class instances will anyway do what you want, since
> the local "self" is a mutable object.
>
> How do you want to keep computations independent
> when locals are shared? For me it's just easier to
> implement and also to think with the shallow copy.
> Otherwise, where is my private place?
> Open for becoming convinced, of course :-)

I imagine it comes up less often in Scheme because it has no loops: communication among "iterations" is via function arguments or up-level lexical vrbls. So recall your uses of Icon generators instead: like Python, Icon does have loops, and two-level scoping, and I routinely build loopy Icon generators that keep state in locals. Here's a dirt-simple example I emailed to Sam earlier this week:

    procedure main()
        every result := fib(0, 1) \ 10 do
            write(result)
    end

    procedure fib(i, j)
        local temp
        repeat {
            suspend i
            temp := i + j
            i := j
            j := temp
        }
    end

which prints

    0 1 1 2 3 5 8 13 21 34

If Icon restored the locals (i, j, temp) upon each fib resumption, it would generate a zero followed by an infinite sequence of ones(!).

Think of a continuation as a *paused* computation (which it is) rather than an *independent* one (which it isn't), and I think it gets darned hard to argue.

theory-and-practice-agree-here-in-my-experience-ly y'rs - tim

From MHammond at skippinet.com.au Fri May 21 01:01:22 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Fri, 21 May 1999 09:01:22 +1000 Subject: [Python-Dev] Interactive Debugging of Python In-Reply-To: <374417D8.8DBCB617@appliedbiometrics.com> Message-ID: <00c001bea314$aefc5b40$0801a8c0@bobcat>

> I'm writing a prototype of a stackless Python, which means that
> you will be able to access the current state of the interpreter
> completely.
> The inner interpreter loop will be isolated from the frame
> dispatcher. It will break whenever the ticker goes zero.
> If you set the ticker to one, you will be able to single
> step on every opcode, have the value stack, the frame chain,
> everything.

I think the main point is how to change code when a Python frame already references it. I don't think the structure of the frames is as important as the general concept.
But while we were talking frame-fiddling it seemed a good point to try and hijack it a little :-)

Would it be possible to recompile just a block of code (eg, just the current function or method) and patch it back in such a way that the current frame continues execution of the new code?

I feel this is somewhat related to the inability to change class implementation for an existing instance. I know there have been hacks around this before but they aren't completely reliable, and IMO it would be nice if the core Python made it easier to change already running code - whether that code is in an existing stack frame, or just in an already created instance, it is very difficult to do.

This has come to try and deflect some conversation away from changing Python as such towards an attempt at enhancing its _environment_. To paraphrase many people before me, even if we completely froze the language now there would still be plenty of work ahead of us :-)

Mark.

From guido at CNRI.Reston.VA.US Fri May 21 02:06:51 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 20 May 1999 20:06:51 -0400 Subject: [Python-Dev] Interactive Debugging of Python In-Reply-To: Your message of "Fri, 21 May 1999 09:01:22 +1000." <00c001bea314$aefc5b40$0801a8c0@bobcat> References: <00c001bea314$aefc5b40$0801a8c0@bobcat> Message-ID: <199905210006.UAA07900@eric.cnri.reston.va.us>

> I think the main point is how to change code when a Python frame already
> references it. I dont think the structure of the frames is as important as
> the general concept. But while we were talking frame-fiddling it seemed a
> good point to try and hijack it a little :-)
>
> Would it be possible to recompile just a block of code (eg, just the
> current function or method) and patch it back in such a way that the
> current frame continues execution of the new code?
This topic sounds mostly unrelated to the stackless discussion -- in either case you need to be able to fiddle the contents of the frame and the bytecode pointer to reflect the changed function.

Some issues:

- The slots containing local variables may be renumbered after recompilation; fortunately we know the name--number mapping so we can move them to their new location. But it is still tricky.

- Should you be able to edit functions that are present on the call stack below the top? Suppose we have two functions:

    def f():
        return 1 + g()

    def g():
        return 0

Suppose we set a break in g(), and then edit the source of f(). We can do all sorts of evil to f(): e.g. we could change it to

    return g() + 2

which affects the contents of the value stack when g() returns (originally, the value stack contained the value 1, now it is empty). Or we could even change f() to

    return 3

thereby eliminating the call to g() altogether!

What kind of limitations do other systems that support modifying a "live" program being debugged impose? Only allowing modification of the function at the top of the stack might eliminate some problems, although there are still ways to mess up. The value stack is not always empty even when we only stop at statement boundaries -- e.g. it contains 'for' loop indices, and there's also the 'block' stack, which contains try-except information. E.g. what should happen if we change

    def f():
        for i in range(10):
            print 1

stopped at the 'print 1' into

    def f():
        print 1

??? (Ditto for removing or adding a try/except block.)

> I feel this is somewhat related to the inability to change class
> implementation for an existing instance. I know there have been hacks
> around this before but they arent completly reliable and IMO it would be
> nice if the core Python made it easier to change already running code -
> whether that code is in an existing stack frame, or just in an already
> created instance, it is very difficult to do.

I've been thinking a bit about this.
Function objects now have mutable func_code attributes (and also func_defaults), I think we can use this.

The hard part is to do the analysis needed to decide which functions to recompile! Ideally, we would simply edit a file and tell the programming environment "recompile this". The programming environment would compare the changed file with the old version that it had saved for this purpose, and notice (for example) that we changed two methods of class C. It would then recompile those methods only and stuff the new code objects in the corresponding function objects.

But what would it do when we changed a global variable? Say a module originally contains a statement "x = 0". Now we change the source code to say "x = 100". Should we change the variable x? Suppose that x is modified by some of the computations in the module, and that, after some computations, the actual value of x was 50. Should the "recompile" reset x to 100 or leave it alone?

One option would be to actually change the semantics of the class and def statements so that they modify an existing class or function rather than using assignment. Effectively, this proposal would change the semantics of

    class A:
        ...some code...

    class A:
        ...some more code...

to be the same as

    class A:
        ...some code...
        ...some more code...

This is somewhat similar to the way the module or package commands in some other dynamic languages work, I think; and I don't think this would break too much existing code.

The proposal would also change

    def f(): ...some code...

    def f(): ...other code...

but here the equivalence is not so easy to express, since I want different semantics (I don't want the second f's code to be tacked onto the end of the first f's code). If we understand that

    def f(): ...

really does the following:

    f = NewFunctionObject()
    f.func_code = ...code object...

then the construct above (def f():... def f(): ...) would do this:

    f = NewFunctionObject()
    f.func_code = ...some code...
    f.func_code = ...other code...
i.e. there is no assignment of a new function object for the second def. Of course if there is a variable f but it is not a function, it would have to be assigned a new function object first.

But in the case of def, this *does* break existing code. E.g.

    # module A
    from B import f
    .
    .
    .
    if ...some test...:
        def f(): ...some code...

This idiom conditionally redefines a function that was also imported from some other module. The proposed new semantics would change B.f in place!

So perhaps these new semantics should only be invoked when a special "reload-compile" is asked for... Or perhaps the programming environment could do this through source parsing as I proposed before...

> This has come to try and deflect some conversation away from changing
> Python as such towards an attempt at enhancing its _environment_. To
> paraphrase many people before me, even if we completely froze the language
> now there would still plenty of work ahead of us :-)

Please, no more posts about Scheme. Each new post mentioning call/cc makes it *less* likely that something like that will ever be part of Python. "What if Guido's brain exploded?" :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From skip at mojam.com Fri May 21 03:13:28 1999 From: skip at mojam.com (Skip Montanaro) Date: Thu, 20 May 1999 21:13:28 -0400 (EDT) Subject: [Python-Dev] Interactive Debugging of Python In-Reply-To: <199905210006.UAA07900@eric.cnri.reston.va.us> References: <00c001bea314$aefc5b40$0801a8c0@bobcat> <199905210006.UAA07900@eric.cnri.reston.va.us> Message-ID: <14148.45321.204380.19130@cm-29-94-2.nycap.rr.com>

Guido> What kind of limitations do other systems that support modifying
Guido> a "live" program being debugged impose? Only allowing
Guido> modification of the function at the top of the stack might
Guido> eliminate some problems, although there are still ways to mess
Guido> up.
Frame objects maintain pointers to the active code objects, locals and globals, so modifying a function object's code or globals shouldn't have any effect on currently executing frames, right? I assume frame objects do the usual INCREF/DECREF dance, so the old code object won't get deleted before the frame object is tossed. Guido> But what would it do when we changed a global variable? Say a Guido> module originally contains a statement "x = 0". Now we change Guido> the source code to say "x = 100". Should we change the variable Guido> x? Suppose that x is modified by some of the computations in the Guido> module, and the that, after some computations, the actual value Guido> of x was 50. Should the "recompile" reset x to 100 or leave it Guido> alone? I think you should note the change for users and give them some way to easily pick between old initial value, new initial value or current value. Guido> Please, no more posts about Scheme. Each new post mentioning Guido> call/cc makes it *less* likely that something like that will ever Guido> be part of Python. "What if Guido's brain exploded?" :-) I agree. I see call/cc or set! and my eyes just glaze over... Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/ skip at mojam.com | Musi-Cal: http://www.musi-cal.com/ 518-372-5583 From MHammond at skippinet.com.au Fri May 21 03:42:14 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Fri, 21 May 1999 11:42:14 +1000 Subject: [Python-Dev] Interactive Debugging of Python In-Reply-To: <199905210006.UAA07900@eric.cnri.reston.va.us> Message-ID: <00c501bea32b$277ce3d0$0801a8c0@bobcat> [Guido writes...] > This topic sounds mostly unrelated to the stackless discussion -- in Sure is - I just saw that as an excuse to try and hijack it > Some issues: > > - The slots containing local variables may be renumbered after Generally, I think we could make something very useful even with a number of limitations. 
For example, I would find a first cut completely acceptable and a great improvement on today if: * Only the function at the top of the stack can be recompiled and have the code reflected while executing. This function also must be restarted after such an edit. If the function uses global variables or makes calls that restarting will screw-up, then either a) make the code changes _before_ doing this stuff, or b) live with it for now, and help us remove the limitation :-) That may make the locals being renumbered easier to deal with, and also remove some of the problems you discussed about editing functions below the top. > What kind of limitations do other systems that support modifying a > "live" program being debugged impose? Only allowing modification of I can only speak for VC, and from experience at that - I havent attempted to find documentation on it. It accepts most changes while running. The current line is fine. If you create or change the definition of globals (and possibly even the type of locals?), the "incremental compilation" fails, and you are given the option of continuing with the old code, or stopping the process and doing a full build. When the debug session terminates, some link process (and maybe even compilation?) is done to bring the .exe on disk up to date with the changes. If you do wierd stuff like delete the line being executed, it usually gives you some warning message before either restarting the function or trying to pick a line somewhere near the line you deleted. Either way, it can screw up, moving the "current" line somewhere else - it doesnt crash the debugger, but may not do exactly what you expected. It is still a _huge_ win, and a great feature! Ironically, I turn this feature _off_ for Python extensions. Although changing the C code is great, in 99% of the cases I also need to change some .py code, and as existing instances are affected I need to restart the app anyway - so I may as well do a normal build at that time. 
ie, C now lets me debug incrementally, but a far more dynamic language prevents this feature being useful ;-)

> the function at the top of the stack might eliminate some problems,
> although there are still ways to mess up. The value stack is not
> always empty even when we only stop at statement boundaries

If we forced a restart would this be better? Can we reliably reset the stack to the start of the current function?

> I've been thinking a bit about this. Function objects now have
> mutable func_code attributes (and also func_defaults), I think we can
> use this.
>
> The hard part is to do the analysis needed to decide which functions
> to recompile! Ideally, we would simply edit a file and tell the
> programming environment "recompile this". The programming environment
> would compare the changed file with the old version that it had saved
> for this purpose, and notice (for example) that we changed two methods
> of class C. It would then recompile those methods only and stuff the
> new code objects in the corresponding function objects.

If this would work for the few changed functions/methods, what would the impact be of doing it for _every_ function (changed or not)? Then the analysis can drop to the module level, which is much easier. I don't think a slight performance hit is a problem at all when doing this stuff.

> One option would be to actually change the semantics of the class and
> def statements so that they modify an existing class or function
> rather than using assignment. Effectively, this proposal would change
> the semantics of
>
>     class A:
>         ...some code...
>
>     class A:
>         ...some more code...
>
> to be the same as
>
>     class A:
>         ...more code...
>         ...some more code...

Or extending this (didn't this come up at the latest IPC?):

    # .\package\__init__.py
    class BigMutha: pass

    # .\package\something.py
    class package.BigMutha:
        def some_category_of_methods():
            ...

    # .\package\other.py
    class package.BigMutha:
        def other_category_of_methods():
            ...
[Of course, this won't fly as it stands; just a conceptual possibility]

> So perhaps these new semantics should only be invoked when a special
> "reload-compile" is asked for... Or perhaps the programming
> environment could do this through source parsing as I proposed
> before...

From guido at CNRI.Reston.VA.US Fri May 21 05:02:49 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 20 May 1999 23:02:49 -0400 Subject: [Python-Dev] Interactive Debugging of Python In-Reply-To: Your message of "Fri, 21 May 1999 11:42:14 +1000." <00c501bea32b$277ce3d0$0801a8c0@bobcat> References: <00c501bea32b$277ce3d0$0801a8c0@bobcat> Message-ID: <199905210302.XAA08129@eric.cnri.reston.va.us>

> Generally, I think we could make something very useful even with a number
> of limitations. For example, I would find a first cut completely
> acceptable and a great improvement on today if:
>
> * Only the function at the top of the stack can be recompiled and have the
> code reflected while executing. This function also must be restarted after
> such an edit. If the function uses global variables or makes calls that
> restarting will screw-up, then either a) make the code changes _before_
> doing this stuff, or b) live with it for now, and help us remove the
> limitation :-)

OK, restarting the function seems a reasonable compromise and would seem relatively easy to implement. Not *real* easy though: it turns out that eval_code2() is called with a code object as argument, and it's not entirely trivial to figure out the corresponding function object from which to grab the new code object. But it could be done -- give it a try. (Don't wait for me, I'm ducking for cover until at least mid June.)

> Ironically, I turn this feature _off_ for Python extensions. Although
> changing the C code is great, in 99% of the cases I also need to change
> some .py code, and as existing instances are affected I need to restart the
> app anyway - so I may as well do a normal build at that time.
ie, C now > lets me debug incrementally, but a far more dynamic language prevents this > feature being useful ;-) I hear you. > If we forced a restart would this be better? Can we reliably reset the > stack to the start of the current function? Yes, no problem. > If this would work for the few changed functions/methods, what would the > impact be of doing it for _every_ function (changed or not)? Then the > analysis can drop to the module level which is much easier. I dont think a > slight performace hit is a problem at all when doing this stuff. Yes, this would be fine too. > >"What if Guido's brain exploded?" :-) > > At least on that particular topic I didnt even consider I was the only one > in fear of that! But it is good to know that you specifically are too :-) Have no fear. I've learned to say no. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From tim_one at email.msn.com Fri May 21 07:36:44 1999 From: tim_one at email.msn.com (Tim Peters) Date: Fri, 21 May 1999 01:36:44 -0400 Subject: [Python-Dev] Interactive Debugging of Python In-Reply-To: <199905210006.UAA07900@eric.cnri.reston.va.us> Message-ID: <000401bea34b$e93fcda0$d89e2299@tim> [GvR] > ... > What kind of limitations do other systems that support modifying a > "live" program being debugged impose? As an ex-compiler guy, I should have something wise to say about that. Alas, I've never used a system that allowed more than poking new values into vrbls, and the thought of any more than that makes me vaguely ill! Oh, that's right -- I'm vaguely ill anyway today. Still-- oooooh -- the problems. This later got reduced to restarting the topmost function from scratch. That has some attraction, especially on the bang-for-buck-o-meter. > ... > Please, no more posts about Scheme. Each new post mentioning call/cc > makes it *less* likely that something like that will ever be part of > Python. "What if Guido's brain exploded?" :-) What a pussy . 
Really, overall continuations are much less trouble to understand than threads -- there's only one function in the entire interface! OK. So how do you feel about coroutines? Would sure be nice to have *some* way to get pseudo-parallel semantics regardless of OS. changing-code-on-the-fly-==-mutating-the-current-continuation-ly y'rs - tim From tismer at appliedbiometrics.com Fri May 21 09:12:05 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Fri, 21 May 1999 09:12:05 +0200 Subject: [Python-Dev] Interactive Debugging of Python References: <00c001bea314$aefc5b40$0801a8c0@bobcat> Message-ID: <37450745.21D63A5@appliedbiometrics.com> Mark Hammond wrote: > > > I'm writing a prototype of a stackless Python, which means that > > you will be able to access the current state of the interpreter > > completely. > > The inner interpreter loop will be isolated from the frame > > dispatcher. It will break whenever the ticker goes to zero. > > If you set the ticker to one, you will be able to single > > step on every opcode, have the value stack, the frame chain, > > everything. > > I think the main point is how to change code when a Python frame already > references it. I don't think the structure of the frames is as important as > the general concept. But while we were talking frame-fiddling it seemed a > good point to try and hijack it a little :-) > > Would it be possible to recompile just a block of code (e.g., just the > current function or method) and patch it back in such a way that the > current frame continues execution of the new code? Sure. Since the frame holds a pointer to the code, and the current IP and SP, your code can easily change it (with care, or GPF:) . It could even create a fresh code object and let it run only for the running instance. By instance, I mean a frame which is running a code object. > I feel this is somewhat related to the inability to change class > implementation for an existing instance.
I know there have been hacks > around this before but they aren't completely reliable and IMO it would be > nice if the core Python made it easier to change already running code - > whether that code is in an existing stack frame, or just in an already > created instance, it is very difficult to do. I think this has been difficult only because the information was hidden in the inner interpreter loop. Gonna change now. ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From tismer at appliedbiometrics.com Fri May 21 09:21:22 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Fri, 21 May 1999 09:21:22 +0200 Subject: [Python-Dev] 'stackless' python? References: <000901bea26a$34526240$179e2299@tim> <37443DC5.1330EAC6@appliedbiometrics.com> <14148.21750.738559.424456@bitdiddle.cnri.reston.va.us> Message-ID: <37450972.D19E160@appliedbiometrics.com> Jeremy Hylton wrote: > > >>>>> "CT" == Christian Tismer writes: > > CT> What I want to achieve is that I can run this again, from my > CT> snapshot. But with shared locals, my parameter x of the snapshot > CT> would have changed to x+1, which I don't find useful. I want to > CT> fix a state of the current frame and still think it should "own" > CT> its locals. Globals are borrowed, anyway. Class instances will > CT> anyway do what you want, since the local "self" is a mutable > CT> object. > > CT> How do you want to keep computations independent when locals are > CT> shared? For me it's just easier to implement and also to think > CT> with the shallow copy. Otherwise, where is my private place?
> CT> Open for becoming convinced, of course :-) > > I think you're making things a lot more complicated by trying to > instantiate new variable bindings for locals every time you create a > continuation. Can you give an example of why that would be helpful? I'm not sure whether you all understand me, and vice versa. There is no copying at all, except for the frame. I copy the frame, which means I also incref all the objects which it holds. Done. This is the bare minimum which I must do. > (Ok. I'm not sure I can offer a good example of why it would be > helpful to share them, but it makes intuitive sense to me.) > > The call_cc mechanism is going to let you capture the current > continuation, save it somewhere, and call on it again as often as you > like. Would you get a fresh locals each time you used it? or just > the first time? If only the first time, it doesn't seem that you've > gained a whole lot. call_cc makes a copy of the state, which is the frame. This is stored away until it is revived. Nothing else happens. As Guido pointed out, virtually the whole frame chain is duplicated, but only on demand. > Also, all the locals that are references to mutable objects are > already effectively shared. So it's only a few oddballs like ints > that are an issue. Simply look at what a frame is. What do you need to do to run it again with a given state? You have to preserve the stack variables. And you have to preserve the current locals, since some of them might even have a copy on the stack, and we want to stay consistent. I believe it would become obvious if you tried to implement it. Maybe I should close my ears and get something ready to show? ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break!
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From tismer at appliedbiometrics.com Fri May 21 11:00:26 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Fri, 21 May 1999 11:00:26 +0200 Subject: [Python-Dev] 'stackless' python? References: <000701bea30c$af7a1060$9d9e2299@tim> Message-ID: <374520AA.2ADEA687@appliedbiometrics.com> Tim Peters wrote: > > [Christian] > [... clarified stuff ... thanks! ... much clearer ...] But still not clear enough, I fear. > > ... > > If a frame is the current state, I make it two frames to have two > > current states. One will be saved, the other will be run. This is > > what I call "splitting". Actually, splitting must occur whenever > > a frame can be reached twice, in order to keep elements alive. > > That part doesn't compute: if a frame can be reached by more than one path, > its refcount must be at least equal to the number of its immediate > predecessors, and its refcount won't fall to 0 before it becomes > unreachable. So while you may need to split stuff for *some* reasons, I > can't see how keeping elements alive could be one of those reasons (unless > you're zapping frame contents *before* the frame itself is garbage?). I was saying that under the side condition that I didn't want to change frames as they are now. Maybe that's misconceived, but this is what I did: If a frame as we have it today shall be resumed twice, then it has to be copied, since the stack is in it and has some state which will change after resuming. That was the whole problem with my first prototype, which was done hoping that I wouldn't need to change the interpreter at all. Wrong, bad, however. What I actually did was more than seems to be needed: I made a copy of the whole current frame chain.
Later on, Guido said this can be done on demand. He's right. [Scheme sample - understood] > GETLOCAL and SETLOCAL simply index off of the fastlocals pointer; it doesn't > care where that points *to* (it could point into another frame and ceval2 > wouldn't know the difference). Maybe a frame entered > due to continuation needs extra setup work? Scheme saves itself by putting > name-resolution and continuation info into different structures; to mimic > the semantics, Python would need to get the same end effect. Point taken. The pointer doesn't save time of access, it just saves allocating another structure. So we can use something else without speed loss. [have to cut a little] > So recall your uses of Icon generators instead: like Python, Icon does have > loops, and two-level scoping, and I routinely build loopy Icon generators > that keep state in locals. Here's a dirt-simple example I emailed to Sam > earlier this week:
>
> procedure main()
>     every result := fib(0, 1) \ 10 do
>         write(result)
> end
>
> procedure fib(i, j)
>     local temp
>     repeat {
>         suspend i
>         temp := i + j
>         i := j
>         j := temp
>     }
> end
[prints fib series] > If Icon restored the locals (i, j, temp) upon each fib resumption, it would > generate a zero followed by an infinite sequence of ones(!). Now I'm completely missing the point. Why should I want to restore anything? At a suspend, which with continuations is done by temporarily having two identical states, one is saved and the other is continued. The continued one in your example just returns the current value and immediately forgets about the locals. The other one is continued later, and of course with the same locals which were active when it went to sleep. > Think of a continuation as a *paused* computation (which it is) rather than > an *independent* one (which it isn't ), and I think it gets darned > hard to argue. No, you get me wrong. I understand what you mean.
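[Editorial note: Tim's Icon example maps naturally onto the generators Python later grew. A minimal modern-Python sketch of the same fib generator, keeping i and j alive in the generator's own frame between resumptions -- exactly the locals-preserving behavior under discussion:]

```python
import itertools

# The generator's frame keeps i and j alive across suspensions, so each
# resumption continues from the locals as they were when it went to sleep.
def fib(i, j):
    while True:
        yield i            # Icon's "suspend i"
        i, j = j, i + j

for result in itertools.islice(fib(0, 1), 10):
    print(result)          # 0 1 1 2 3 5 8 13 21 34, one per line
```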
It is just the decision whether a frame, which will be reactivated later as a continuation, should use a reference to locals like the reference which it has for the globals. This causes me a major frame redesign.
Current design: A frame is: back chain, state, code, unpacked locals, globals, stack. Code and globals are shared. State, unpacked locals and stack are private.
Possible new design: A frame is: back chain, state, code, variables, globals, stack. variables is: unpacked locals.
This makes the variables into an extra structure which is shared. Probably a list would be the thing, or abusing a tuple as a mutable object. Hmm. I think I should get something ready, and we should keep this thread short, or we will lose the rest of Guido's goodwill (if not already). ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From da at ski.org Fri May 21 18:27:42 1999 From: da at ski.org (David Ascher) Date: Fri, 21 May 1999 09:27:42 -0700 (Pacific Daylight Time) Subject: [Python-Dev] Interactive Debugging of Python In-Reply-To: <000401bea34b$e93fcda0$d89e2299@tim> Message-ID: On Fri, 21 May 1999, Tim Peters wrote: > OK. So how do you feel about coroutines? Would sure be nice to have *some* > way to get pseudo-parallel semantics regardless of OS. I read about coroutines years ago on c.l.py, but I admit I forgot it all. Can you explain them briefly in pseudo-python? --david From tim_one at email.msn.com Sat May 22 06:22:50 1999 From: tim_one at email.msn.com (Tim Peters) Date: Sat, 22 May 1999 00:22:50 -0400 Subject: [Python-Dev] Coroutines In-Reply-To: Message-ID: <000401bea40a$c1d2d2c0$659e2299@tim> [Tim] > OK. So how do you feel about coroutines?
Would sure be nice > to have *some* way to get pseudo-parallel semantics regardless of OS. [David Ascher] > I read about coroutines years ago on c.l.py, but I admit I forgot it all. > Can you explain them briefly in pseudo-python? How about real Python? http://www.python.org/tim_one/000169.html contains a complete coroutine implementation using threads under the covers (& exactly 5 years old tomorrow ). If I were to do it over again, I'd use a different object interface (making coroutines objects in their own right instead of funneling everything through a "coroutine controller" object), but the ideas are the same in every coroutine language. The post contains several executable examples, from simple to "literature standard". I had forgotten all about this: it contains solutions to the same "compare tree fringes" problem Sam mentioned, *and* the generator-based building block I posted three other solutions for in this thread. That last looks like:

# fringe visits a nested list in inorder, and detaches for each non-list
# element; raises EarlyExit after the list is exhausted

def fringe( co, list ):
    for x in list:
        if type(x) is type([]):
            fringe(co, x)
        else:
            co.detach(x)

def printinorder( list ):
    co = Coroutine()
    f = co.create(fringe, co, list)
    try:
        while 1:
            print co.tran(f),
    except EarlyExit:
        pass
    print

printinorder([1,2,3])  # 1 2 3
printinorder([[[[1,[2]]],3]])  # ditto
x = [0, 1, [2, [3]], [4,5], [[[6]]] ]
printinorder(x)  # 0 1 2 3 4 5 6

Generators are really "half a coroutine", so this doesn't show the full power (other examples in the post do). co.detach is a special way to deal with this asymmetry. In the general case you use co.tran all the time, where (see the post for more info) v = co.tran(c [, w]) means "resume coroutine c from the place it last did a co.tran, optionally passing it the value w, and when somebody does a co.tran back to *me*, resume me right here, binding v to the value *they* pass to co.tran".
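[Editorial note: Tim's example depends on his 1994 thread-based Coroutine class. A modern-Python sketch of the same inorder fringe traversal using generators -- each yield plays the role of co.detach, and generator exhaustion replaces the EarlyExit exception:]

```python
# Sketch only: fringe/printinorder here are modern-generator stand-ins for
# the coroutine-based versions in Tim's post, not his actual API.
def fringe(nested):
    for x in nested:
        if isinstance(x, list):
            yield from fringe(x)   # recurse into sublists, forwarding yields
        else:
            yield x                # "detach" one fringe element

def printinorder(nested):
    print(*fringe(nested))

printinorder([1, 2, 3])                          # 1 2 3
printinorder([[[[1, [2]]], 3]])                  # ditto
printinorder([0, 1, [2, [3]], [4, 5], [[[6]]]])  # 0 1 2 3 4 5 6
```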
Knuth complains several times that it's very hard to come up with a coroutine example that's both simple and clear <0.5 wink>. In a nutshell, coroutines don't have a "caller/callee" relationship, they have a "we're all equal partners" relationship, where any coroutine is free to resume any other one where it left off. It's no coincidence that making coroutines easy to use was pioneered by simulation languages! Just try simulating a marriage where one partner is the master and the other a slave . i-may-be-a-bachelor-but-i-have-eyes-ly y'rs - tim From tim_one at email.msn.com Sat May 22 06:22:55 1999 From: tim_one at email.msn.com (Tim Peters) Date: Sat, 22 May 1999 00:22:55 -0400 Subject: [Python-Dev] Re: Coroutines In-Reply-To: Message-ID: <000501bea40a$c3d1fe20$659e2299@tim> Thoughts o' the day:
+ Generators ("semi-coroutines") are wonderful tools and easy to implement without major changes to the PVM. Icon calls 'em generators, Sather calls 'em iterators, and they're exactly what you need to implement "for thing in object:" when object represents a collection that's tricky to materialize. Python needs something like that. OTOH, generators are pretty much limited to that.
+ Coroutines are more general but much harder to implement, because each coroutine needs its own stack (a generator only has one stack *frame*-- its own --to worry about), and C-calling-Python can get into the act. As Sam said, they're probably no easier to implement than call/cc (but trivial to implement given call/cc).
+ What may be most *natural* is to forget all that and think about a variation of Python threads implemented directly via the interpreter, without using OS threads. The PVM already knows how to handle thread-state swapping. Given Christian's stackless interpreter, and barring C->Python cases, I suspect Python can fake threads all by itself, in the sense of interleaving their executions within a single "real" (OS) thread.
Given the global interpreter lock, Python effectively does only-one-at-a-time anyway. Threads are harder than generators or coroutines to learn, but
A) Many more people know how to use them already.
B) Generators and coroutines can be implemented using (real or fake) threads.
C) Python has offered threads since the beginning.
D) Threads offer a powerful mode of control transfer coroutines don't, namely "*anyone* else who can make progress now, feel encouraged to do so at my expense".
E) For whatever reasons, in my experience people find threads much easier to learn than call/cc -- perhaps because threads are *obviously* useful upon first sight, while it takes a real Zen Experience before call/cc begins to make sense.
F) Simulated threads could presumably produce much more informative error msgs (about deadlocks and such) than OS threads, so even people using real threads could find excellent debugging use for them.
Sam doesn't want to use "real threads" because they're pigs; fake threads don't have to be. Perhaps x = y.SOME_ASYNC_CALL(r, s, t) could map to e.g.

import config
if config.USE_REAL_THREADS:
    import threading
else:
    from simulated_threading import threading
from config.shared import msg_queue

class Y:
    def __init__(self, ...):
        self.ready = threading.Event()
        ...
    def SOME_ASYNC_CALL(self, r, s, t):
        result = [None]  # mutable container to hold the result
        msg_queue.put((server_of_the_day, r, s, t, self.ready, result))
        self.ready.wait()
        self.ready.clear()
        return result[0]

where some other simulated thread polls the msg_queue and does ready.set() when it's done processing the msg enqueued by SOME_ASYNC_CALL.
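[Editorial note: the names simulated_threading, config, msg_queue and server_of_the_day in Tim's sketch are hypothetical. With today's standard threading and queue modules, the same request/reply pattern can be made runnable roughly like this; the worker and its placeholder computation are illustrative, not part of the original proposal:]

```python
import queue
import threading

msg_queue = queue.Queue()

def worker():
    # Stands in for Tim's hypothetical "server of the day": pull a request,
    # compute an answer, store it in the shared container, signal completion.
    while True:
        r, s, t, ready, result = msg_queue.get()
        result[0] = r + s + t        # placeholder computation
        ready.set()

threading.Thread(target=worker, daemon=True).start()

class Y:
    def __init__(self):
        self.ready = threading.Event()

    def some_async_call(self, r, s, t):
        result = [None]              # mutable container to hold the result
        msg_queue.put((r, s, t, self.ready, result))
        self.ready.wait()            # block this caller until the worker answers
        self.ready.clear()
        return result[0]

print(Y().some_async_call(1, 2, 3))  # prints 6
```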
For this to scale nicely, it's probably necessary for the PVM to cooperate with the simulated_threading implementation (e.g., a simulated thread that blocks (like on self.ready.wait()) should be taken out of the collection of simulated threads the PVM may attempt to resume -- else in Sam's case the PVM would repeatedly attempt to wake up thousands of blocked threads, and things would slow to a crawl). Of course, simulated_threading could be built on top of call/cc or coroutines too. The point to making threads the core concept is keeping Guido's brain from exploding. Plus, as above, you can switch to "real threads" by changing an import statement. making-sure-the-global-lock-support-hair-stays-around-even-if-greg- renders-it-moot-for-real-threads-ly y'rs - tim From tismer at appliedbiometrics.com Sat May 22 18:20:30 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Sat, 22 May 1999 18:20:30 +0200 Subject: [Python-Dev] Coroutines References: <000401bea40a$c1d2d2c0$659e2299@tim> Message-ID: <3746D94E.239D0B8E@appliedbiometrics.com> Tim Peters wrote: > > [Tim] > > OK. So how do you feel about coroutines? Would sure be nice > > to have *some* way to get pseudo-parallel semantics regardless of OS. > > [David Ascher] > > I read about coroutines years ago on c.l.py, but I admit I forgot it all. > > Can you explain them briefly in pseudo-python? > > How about real Python? http://www.python.org/tim_one/000169.html contains a > complete coroutine implementation using threads under the covers (& exactly > 5 years old tomorrow ). If I were to do it over again, I'd use a > different object interface (making coroutines objects in their own right > instead of funneling everything through a "coroutine controller" object), > but the ideas are the same in every coroutine language. The post contains > several executable examples, from simple to "literature standard". What an interesting thread! 
Unfortunately, all the examples are messed up since some HTML formatter didn't take care of the python code, rendering it unreadable. Is there a different version available? Also, I'd like to read the rest of the threads in http://www.python.org/tim_one/ but it seems that only your messages are archived? Anyway, the citations in http://www.python.org/tim_one/000146.html show me that you have been through all of this five years ago, with a five years younger Guido which sounds a bit different than today. I had understood him better if I had known that this is a re-iteration of a somehow dropped or entombed idea. (If someone has the original archives from that epoche, I'd be happy to get a copy. Actually, I'm missing all upto end of 1996.) A sort snapshot: Stackless Python is meanwhile nearly alive, with recursion avoided in ceval. Of course, some modules are left which still need work, but enough for a prototype. Frames contain now all necessry state and are now prepared for execution and thrown back to the evaluator (elevator?). The key idea was to change the deeply nested functions in a way, that their last eval_code call happens to be tail recursive. In ceval.c (and in other not yet changed places), functions to a lot of preparation, build some parameter, call eval_code and release the parameter. This was the crux, which I solved by a new filed in the frame object, where such references can be stored. The routine can now return with the ready packaged frame, instead of calling it. As a minimum facility for future co-anythings, I provided a hook function for resuming frames, which causes no overhead in the usual case but allows to override what a frame does when someone returns control to it. To implement this is due to some extension module, wether this may be coroutines or your nice nano-threads, it's possible. threadedly yours - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! 
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From tismer at appliedbiometrics.com Sat May 22 21:04:43 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Sat, 22 May 1999 21:04:43 +0200 Subject: [Python-Dev] How stackless can Python be? Message-ID: <3746FFCB.CD506BE4@appliedbiometrics.com> Hi, to make the core interpreter stackless is one thing. Turning functions which call the interpreter from some deep nesting level into versions which instead return a frame object to be called is possible in many cases. Internals like apply are rather uncomplicated to convert. CallObjectWithKeywords is done. What I have *no* good solution for is map. Map does an iteration over evaluations and keeps state while it is running. The same applies to reduce, but it seems not to be used so much. Map is. I don't see at the moment if map could be a killer for Tim's nice mini-thread idea. How must map work if, for instance, a map is done with a function which then begins to switch between threads before map is done? Can one imagine a problem? Maybe it is no issue, but I'd really like to know whether we need a stateless map. (without replacing it by a for loop :-) ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break!
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From tim_one at email.msn.com Sat May 22 21:35:58 1999 From: tim_one at email.msn.com (Tim Peters) Date: Sat, 22 May 1999 15:35:58 -0400 Subject: [Python-Dev] Coroutines In-Reply-To: <3746D94E.239D0B8E@appliedbiometrics.com> Message-ID: <000501bea48a$51563980$119e2299@tim> >> http://www.python.org/tim_one/000169.html [Christian] > What an interesting thread! Unfortunately, all the examples are messed > up since some HTML formatter didn't take care of the python code, > rendering it unreadable. Is there a different version available? > > Also, I'd like to read the rest of the threads in > http://www.python.org/tim_one/ but it seems that only your messages > are archived? Yes, that link is the old Pythonic Award Shrine erected in my memory -- it's all me, all the time, no mercy, no escape . It predates the DejaNews archive, but the context can still be found in http://www.python.org/search/hypermail/python-1994q2/index.html There's a lot in that quarter about continuations & coroutines, most from Steven Majewski, who took a serious shot at implementing all this. Don't have the code in a more usable form; when my then-employer died, most of my files went with it. You can save the file as text, though! The structure of the code is intact, it's simply that your browser squashes out the spaces when displaying it. Nuke the &nbsp; at the start of each code line and what remains is very close to what was originally posted. > Anyway, the citations in http://www.python.org/tim_one/000146.html > show me that you have been through all of this five years > ago, with a five years younger Guido which sounds a bit > different than today. > I would have understood him better if I had known that this > is a re-iteration of a somehow dropped or entombed idea. You *used* to know that ! Thought you even got StevenM's old code from him a year or so ago. He went most of the way, up until hitting the C<->Python stack intertwingling barrier, and then dropped it. Plus Guido wrote generator.py to shut me up, which works, but is about 3x clumsier to use and runs about 50x slower than a generator should . > ... > Stackless Python is meanwhile nearly alive, with recursion > avoided in ceval. Of course, some modules are left which > still need work, but enough for a prototype. Frames contain > now all necessary state and are now prepared for execution > and thrown back to the evaluator (elevator?). > ... Excellent! Running off to a movie & dinner now, but will give a more careful reading tonight. co-dependent-ly y'rs - tim From tismer at appliedbiometrics.com Sun May 23 15:07:44 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Sun, 23 May 1999 15:07:44 +0200 Subject: [Python-Dev] How stackless can Python be? References: <3746FFCB.CD506BE4@appliedbiometrics.com> Message-ID: <3747FDA0.AD3E7095@appliedbiometrics.com> After a good sleep, I can answer this one by myself. I wrote: > to make the core interpreter stackless is one thing. ... > Internals like apply are rather uncomplicated to convert. > CallObjectWithKeywords is done. > > What I have *no* good solution for is map. > Map does an iteration over evaluations and keeps > state while it is running. The same applies to reduce, > but it seems not to be used so much. Map is. ...
About stackless map, and this applies to every extension module which *wants* to be stackless. We don't have to force everybody to be stackless, but there are a couple of modules which would benefit from it. The problem with map is that it needs to keep state while repeatedly calling objects which might call the interpreter. Even if we kept local variables in the caller's frame, this would still not be stateless. The info that a map is running is sitting on the hardware stack, and that's wrong. Now a solution. In my last post, I argued that I don't want to replace map by a slower Python function. But that gave me the key idea to solve this: C functions which cannot be tail-recursively unwound to return an executable frame object must instead return themselves as a frame object. That's it! Frames again need to be extended a little. They have to name their interpreter, which normally is the old eval_code loop. Anatomy of a standard frame invocation: A new frame is created, parameters are inserted, the frame is returned to the frame dispatcher, which runs the inner eval_code loop until it bails out. On return, special cases of control flow are handled, such as exceptions, returning, and now also calling. This is an eval_code frame, since eval_code is its execution handler. Anatomy of a map frame invocation: Map has several phases. The first phase does argument checking and basic setup. The last phase is iteration over function calls and building the result. This phase must be split off as a second function, eval_map. A new frame is created, with all temporary variables placed there. eval_map is inserted as the execution handler. Now, I think the analogy is obvious. By building proper frames, it should be possible to turn any extension function into a stackless function. The overall protocol is: A C function which does a simple computation which cannot cause an interpreter invocation may simply evaluate and return a value.
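[Editorial note: Christian's protocol -- return a frame to a flat dispatcher instead of calling back into the interpreter recursively -- can be sketched as a toy trampoline in modern Python. Frame, dispatch and fact_step below are illustrative names, not the actual Stackless internals:]

```python
# Toy sketch of the frame-dispatcher idea: each step returns either a final
# value or another "frame" for the dispatcher loop to run, so no step ever
# consumes more hardware stack by recursing.
class Frame:
    def __init__(self, handler, *args):
        self.handler = handler   # the frame's "execution handler"
        self.args = args

def dispatch(frame):
    # The flat dispatcher: keep running handlers until one yields a plain value.
    while isinstance(frame, Frame):
        frame = frame.handler(*frame.args)
    return frame

# Deep recursion expressed as frame-returning steps: factorial without a
# recursive call chain.
def fact_step(n, acc):
    if n <= 1:
        return acc
    return Frame(fact_step, n - 1, acc * n)

print(dispatch(Frame(fact_step, 10, 1)))  # prints 3628800
```

The same shape lets an extension-like function plug in its own handler (Christian's eval_map analogy) without the dispatcher caring what runs inside.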
A C function which might cause an interpreter invocation should return a freshly created frame as return value.
- This can be done in a tail-recursive fashion if the last action of the C function would basically be calling the frame.
- If no tail-recursion is possible, the function must return a new frame for itself, with an executor for its purpose.
A good stackless candidate is Fredrik's xmlop, which calls back into the interpreter. If that worked without the hardware stack, then we could build ultra-fast XML processors with co-routines! As a side note: The frame structure which I sketched so far is still made for eval_code in the first place, but it has all necessary flexibility for pluggable interpreters. An extension module can now create its own frame, with its own execution handler, and throw it back to the frame dispatcher. In other words: people can create extensions and test their own VMs if they want. This was not my primary intent, but it comes for free as a consequence of having a stackless map. ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From fredrik at pythonware.com Sun May 23 15:53:19 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sun, 23 May 1999 15:53:19 +0200 Subject: [Python-Dev] Coroutines References: <000401bea40a$c1d2d2c0$659e2299@tim> <3746D94E.239D0B8E@appliedbiometrics.com> Message-ID: <031e01bea524$8db41e70$f29b12c2@pythonware.com> Christian Tismer wrote: > (If someone has the original archives from that epoch, > I'd be happy to get a copy. Actually, I'm missing everything up to > the end of 1996.) http://www.egroups.com/group/python-list/info.html has it all (almost), starting in 1991.